
Selected problems in lattice statistical mechanics

Yao-ban Chan

September 12, 2005

Submitted in total fulfilment of the requirements of the degree of

Doctor of Philosophy

Department of Mathematics and Statistics

The University of Melbourne

ABSTRACT

This thesis consists of an introduction and four chapters, with each chapter covering a different topic. In the introduction, we introduce the models that we study in the later chapters.

In Chapter 2, we study corner transfer matrices as a method for generating series expansions in statistical mechanical models. This is based on the work of Baxter, whose CTM equations we re-derive in detail. We then propose two methods that utilise these CTM equations to derive series expansions. The first one is based on iterating through the equations sequentially. We ran this algorithm on the hard squares model and produced 48 series terms. The second method, based on the corner transfer matrix renormalization group method of Nishino and Okunishi, is much faster (though still exponential in time), but currently works only for numerical calculations.

In Chapter 3, we apply the finite lattice method and our renormalization group corner transfer matrix method to the Ising model with second-nearest-neighbour interactions. In particular, we study the crossover exponent of the critical line near the point J1 = 0, tanh(J2/kT) = √2 − 1. Through a scaling assumption, we predict this exponent to be 4/7, which is supported by our numerical methods. We also estimate the location of the critical lines, and the critical exponent of the magnetization along the lower phase boundary.

In Chapter 4, we study the problem of n-friendly directed lattice walks confined in a horizontal strip. We find the general transfer matrix for two walkers, and use this and a method of recurrences to calculate generating functions for certain specific cases via Padé approximants. Using generating function arguments, we then derive the generating function for the number of walks for two walkers in strips of width 3 and 4, three walkers in a strip of width 4, and p vicious walkers in strips of width 2p−1, 2p, and 2p+1. Finally we generalise the model by introducing another parameter, called bandwidth.

In Chapter 5, we study mean unknotting times, in a problem motivated by DNA entanglement. Firstly, we find the mean unknotting times for minimal embeddings of low-crossing knots. Then we look at generating random embeddings by using self-avoiding polygon trails (SAPTs). After proving Kesten's pattern theorem, we use it and the model of a walk on an n-cube to estimate that the mean unknotting time grows exponentially in the length of the SAPT. We try to find the growth constant by using the pivot algorithm to generate SAPTs, but the low-length behaviour of the mean unknotting time appears to follow a power law, leading us to believe that much longer trails are needed.

Declaration

This is to certify that

1. the thesis comprises only my original work towards the PhD except where indicated in the Preface,

2. due acknowledgement has been made in the text to all other material used,

3. the thesis is less than 100,000 words in length, exclusive of tables, maps, bibliographies and appendices.

Yao-ban Chan

Preface

This thesis was written under the supervision of Prof. Tony Guttmann (AJG), Dr. Andrew Rechnitzer (AR) (The University of Melbourne) and Prof. Ian Enting (IGE) (MASCOS, formerly CSIRO).

• Chapter 1 (Introduction) contains no new research, just introductions to the models that the thesis studies.

• Chapter 2 is joint work with IGE and AR. Sections 2.4 to 2.6 are largely taken from the work of Baxter, while the method in Section 2.10 is based on Nishino and Okunishi's corner transfer matrix renormalization group method.

• Chapter 3 is also joint work with IGE and AR. Section 3.2 is taken from previous works by Enting, while Section 3.4 is based on work by Hankey and Stanley.

• Chapter 4 is joint work with AJG. A shortened version of this chapter has been published in a conference journal volume of Discrete Mathematics and Theoretical Computer Science. Theorems 4.6.1 and 4.6.2 were proved by one of the referees of that paper.

• Chapter 5 is joint work with AR and Aleks Owczarek (The University of Melbourne). Section 5.3 is based on works by Madras and Slade and by van Rensburg et al. Theorem 5.5.1 was given to us by Gordon Slade. Section 5.8 is adapted from the work of Dubins et al.


Acknowledgements

• To my supervisors:

– Tony, for looking after me from beginning to end (and after);

– Ian, for looking after me at conferences among other things (and reading 230 pages while attending one); and

– Andrew, for putting up with me knocking on his door every other day or so.

• To my family:

– My parents, for always supporting, feeding and housing me;

– Yi-Shuen, for having someone to talk to and play games with;

– Mandy, for providing entertainment; and last but not least,

– Bilbo, for looking cute.

• To inanimate objects:

– Computer games, for keeping me sane;

– Table tennis, for keeping me entertained, exercised and friendly at the same time; and

– Piano, for giving me something else to do.

• To the money-providers — the Australian Government, CSIRO, and MASCOS. I liked the hotels.


CONTENTS

1. Introduction
   1.1 In search of simplicity
   1.2 Series and lattice animals
   1.3 Combinatorial objects
   1.4 Polygons and knots
   1.5 In this thesis

2. The corner transfer matrix method
   2.1 Introduction
   2.2 Transfer matrices
   2.3 A variational result
   2.4 The CTM equations
      2.4.1 An expression for the partition function
      2.4.2 Eigenvalue equations
      2.4.3 Stationarity
      2.4.4 The CTM equations
   2.5 The infinite-dimensional solution
   2.6 Calculating quantities
   2.7 The 1×1 solution — an example
   2.8 Matrix algorithms
      2.8.1 Cholesky decomposition
      2.8.2 The Arnoldi method
   2.9 The iterative CTM method
      2.9.1 The hard squares model — an example
      2.9.2 Convergence/results
      2.9.3 Technical notes
      2.9.4 Efficiency
   2.10 The renormalization group method
      2.10.1 Convergence/results
      2.10.2 Technical notes
      2.10.3 Efficiency
   2.11 Conclusion

3. The second-neighbour Ising model
   3.1 Introduction
   3.2 The finite lattice method
      3.2.1 Finite lattice approximation
      3.2.2 Transfer matrix method
      3.2.3 The Ising model — an example
   3.3 Convergence of the CTM method
      3.3.1 Number of iterations
      3.3.2 Matrix size
   3.4 Scaling theory
   3.5 Scaling and the crossover exponent
   3.6 Finding the critical lines
      3.6.1 The upper line
      3.6.2 The lower line
      3.6.3 The disorder point line
   3.7 Results
   3.8 Conclusion

4. Directed lattice walks in a strip
   4.1 Introduction
   4.2 Finding the generating function
   4.3 A transfer matrix algorithm
   4.4 A method of recurrences
   4.5 One walker
   4.6 Results
      4.6.1 Variable friendliness
      4.6.2 Variable number of walkers
   4.7 Growth constants
   4.8 Bandwidth
   4.9 Conclusion

5. Mean unknotting times
   5.1 Introduction
   5.2 Small knots
   5.3 Kesten's pattern theorem
   5.4 Fourier transforms
   5.5 A walk on an n-cube
   5.6 Bounds on the mean unknotting time
   5.7 The pivot algorithm
   5.8 Validity of the pivot algorithm
   5.9 Results
   5.10 Conclusion

Appendix

A. Generating functions for walks in a strip

B. Critical points for walks in a strip

LIST OF FIGURES

1.1 Pictorial representations of spins.
1.2 A two-dimensional square grid. We place our spins at each vertex of this grid.
1.3 Some 2-dimensional grids.
1.4 Each vertex in the square grid is connected to 4 other vertices.
1.5 Only a small number of spins significantly affect any one spin.
1.6 The energy-creating interactions in the model, including the external field.
1.7 Some variations on the Ising model.
1.8 Graphs on the square lattice.
1.9 Small graphs which contribute to the Ising partition function series.
1.10 Some combinatorial objects of interest.
1.11 Some combinatorial objects of interest that can be constructed from a walk.
1.12 Every directed walk that ends at a fixed point has the same number of horizontal steps and vertical steps.
1.13 An example of vicious walks.
1.14 A Dyck path.
1.15 Variations on vicious walks.
1.16 Self-avoiding polygon trails and knots resemble each other.
1.17 Some embeddings of the unknot.
1.18 Reversing a crossing.
1.19 The knot 7_4 can be unknotted with two crossing reversals.

2.1 A 3-dimensional cubic lattice with atoms at each vertex.
2.2 A square lattice.
2.3 V is a matrix which transfers a column of spins.
2.4 Calculating the first few terms of the low-temperature series for the Ising model partition function. Hollow circles denote spins with value -1, and dashed bonds denote unlike bonds.
2.5 A corner transfer matrix incorporates the weight of a quarter of the plane.
2.6 IRF models can be described solely by their effect on a single cell. All of these interactions apply at the same time.
2.7 A one-dimensional lattice.
2.8 One-dimensional transference.
2.9 Two-dimensional transference.
2.10 Toroidal boundary conditions.
2.11 Multiple column transfer matrices.
2.12 ω gives the weight of a single cell.
2.13 Decomposition of a column matrix into single cells.
2.14 Reflection symmetry. The weights of configurations (a) and (b) are identical in undirected models.
2.15 At optimality, ψ is an eigenvector of V.
2.16 Decomposition of ψ into m Fs.
2.17 Full-row transfer matrix interpretation of R.
2.18 Full-row transfer matrix interpretation of S.
2.19 Half-plane transfer matrix interpretation of X.
2.20 Half-plane transfer matrix interpretation of Y.
2.21 Graphical interpretation of Equation 2.68.
2.22 The graphical interpretation of A as a corner transfer matrix gives interpretations of Equations 2.69 and 2.70.
2.23 Graphical interpretation of Equation 2.84.
2.24 Calculating κ. The expression we use is (a) × (c) / (b)².
2.25 Expansion of F matrices in Equation 2.137.
2.26 Expansion of A matrices in Equation 2.140.
2.27 Log-log plot of approximated κ vs. final matrix size.

3.1 The interactions around a cell for the second-neighbour Ising model.
3.2 A lattice with two different types of spins (filled and hollow).
3.3 A typical configuration in the ferromagnetic low-temperature phase of the Ising model.
3.4 A typical configuration in the anti-ferromagnetic low-temperature phase of the Ising model.
3.5 A typical configuration in the high-temperature phase of the Ising model.
3.6 We can divide the lattice into two sets of spins such that every nearest-neighbour bond connects one spin from each set.
3.7 The Ising model is symmetrical in the parameter J.
3.8 With no nearest-neighbour interaction, the lattice decouples into two separate square lattices.
3.9 An approximate phase diagram in the variables u and v.
3.10 A typical configuration in the super anti-ferromagnetic phase of the second-neighbour Ising model.
3.11 Transfer matrices for a finite lattice. Hollow spins are not in the lattice.
3.12 Single-cell transfer matrices. The first moves an n-spin cut to an (n+1)-spin cut; the second moves the cut further to the right and down; the third reduces it to n spins.
3.13 Calculated magnetization vs. number of iterations at the point (0.42, 0) with matrix size 7. The value converges monotonically.
3.14 Magnetization vs. iterations at (0.42, 0) with matrix size 8. The value oscillates, but converges.
3.15 Magnetization vs. iterations at (0.42, 0) with matrix size 10. The value appears to be periodic.
3.16 Magnetization vs. iterations at (0.43, 0) with matrix size 19. The value stays around the same value, but without discernible periodic behaviour.
3.17 Magnetization vs. iterations at (0.42, 0) with matrix size 9. The value oscillates, eventually diverging.
3.18 Magnetization vs. iterations at (0.42, 0) with matrix size 18. The value switches to 1 − m halfway.
3.19 Magnetization vs. iterations at (0.414, 0) with matrix size 8. The value is still increasing significantly after 1000 iterations.
3.20 Calculated magnetization vs. final matrix size at the point (0.42, 0).
3.21 Log-log plot of magnetization vs. size at the point (0.5, 0).
3.22 Magnetization vs. size at the point (0.41, 0). At sizes higher than 2, the calculated magnetization is almost exactly 1/2.
3.23 Calculated magnetization along the u-axis for final sizes 1-10. The leftmost line represents size 1, and the size increases as we move to the right.
3.24 Estimated critical lines for sizes 1-5. The lowest line represents size 1, and the size increases as we move upwards.
3.25 Log-log plot of critical points on the u-axis vs. matrix size.
3.26 Critical exponents near the crossover point.
3.27 We evaluate the magnetization along vertical and horizontal lines to estimate the location of the critical line.
3.28 Calculated (left) and actual (right) magnetization along the u-axis for matrix size 3 × 3.
3.29 Estimating critical points by assuming a constant error on the critical line. For this matrix size (3 × 3), we then estimate the critical points to be where m(u, v) = 0.224.
3.30 (1/2 − m)^8 for calculated (left, size 3) and actual (right) magnetization.
3.31 Calculating critical points by fitting a line to (1/2 − m)^8. Our estimate is the intercept of the line with the x-axis — in this case, 0.411.
3.32 Magnetization vs. inverse size at the point (0.0005, √2 − 1).
3.33 Log-log plot of magnetization vs. u on the line v = √2 − 1.
3.34 Figure 3.33 with fitted lines.
3.35 Estimated critical lines in the uv plane. The disorder point line is also shown.
3.36 Estimated critical exponents along the lower phase boundary.
3.37 Estimated critical exponents vs. −J2/J1.

4.1 General walks.
4.2 The square lattice.
4.3 A self-avoiding walk on the square lattice.
4.4 The stretched and rotated square lattice.
4.5 An example configuration of 4 vicious walkers.
4.6 An example configuration of four 3-friendly walkers. The thicker lines contain more than one walker.
4.7 Example configuration of four 3-friendly walkers in a strip of width 6.
4.8 Some simple transfer graphs.
4.9 The distance between walkers can only change by -2, 0, or 2 for any step.
4.10 Possible first steps for two walkers.
4.11 Dividing a single walk in a strip at the points where the walker returns to height 0.
4.12 Dividing walks in a strip of width 3 when the walkers are 2 units apart.
4.13 Dividing walks in a strip of width 4 whenever the walkers are apart.
4.14 A possible way in which walkers may separate and then join in a single step.
4.15 A configuration showing all possible even-height states for three vicious walkers in a strip of width 7. There is only one state where the lowest walker is at height 2.
4.16 Dividing configurations when the first walker reaches height 2.
4.17 Possible non-trivial paths for the first walker in an end-segment.
4.18 Growth constants vs. friendliness for 2 walkers, width 3 (plus signs) and 4 (crosses).
4.19 Log-plot of Figure 4.18 for width 4, with asymptotic fitted line.
4.20 Growth constant vs. strip width for 2 walkers, for vicious (plus signs) and 1-friendly (crosses) walkers.
4.21 Example configuration for three 4-friendly walks in a strip of width 5, bandwidth 3.

5.1 Electron micrograph of tangled DNA (from [138]).
5.2 Some knots.
5.3 Reidemeister moves.
5.4 Different positions for a single crossing.
5.5 Calculating the Alexander polynomial of an example knot.
5.6 Different types of crossing.
5.7 The action of topoisomerase II (from [118]).
5.8 Reversing a crossing.
5.9 A knot demonstrating why it is not sufficient to consider minimal embeddings to find the unknotting number.
5.10 The knot 5_1 takes two crossing reversals to become the unknot.
5.11 Example of a SAPT.
5.12 Converting a SAPT to a knot.
5.13 Some possible reversals of 5_1.
5.14 Transfer diagram for 5_1.
5.15 Three types of trails. The starting vertices are denoted by hollow circles.
5.16 Decomposition of a half-space trail. Here A1 = 6, A2 = 5, A3 = 3 and A4 = 2. We also have n1 = 8, n2 = 21, n3 = 29, and n4 = n = 33.
5.17 Transformation of a half-space trail.
5.18 Decomposition of a self-avoiding trail into two half-space trails. (a) is the original trail; it decomposes into (b) and (c).
5.19 Joining two SAPTs to form a larger one.
5.20 Transformation of bridges into self-avoiding trails which do not cross a line.
5.21 Joining two trails to make a self-avoiding trail which ends at a neighbour of the origin. We used the transformed trails in Figure 5.20.
5.22 A prime pattern which induces a crossing.
5.23 Trefoil segment; (a) without crossings and (b) with crossings.
5.24 Bijection between paths on the n-cube that pass through certain points.
5.25 All possible crossing reversals of a trefoil segment. The two knots in the centre are knotted; the outer knots are unknotted. The highlighted crossings are the ones which differ from the connected centre knot.
5.26 Terminology for SAPTs.
5.27 Transforming a SAPT into a rectangle through pivots. There are 3 pivots between (c) and (d).
5.28 SAPT with vertical support line and non-intersecting segments. After rotation a diagonal support line can be drawn.
5.29 Comparing the steps p_{i−1}p_i and p_{j−1}p_j in a SAPT with a diagonal support line.
5.30 Transforming one rectangle to another via a rotation and a reflection.
5.31 Mean unknotting time vs. length of generating SAPT.
5.32 Log-log plot of mean unknotting time vs. length.

1. INTRODUCTION

1.1 In search of simplicity

In the physical sciences, it is of great interest to study the properties of a magnet. In general, there are two ways that one can achieve this end — by experimentation or by theoretical means. Considering that this is a theoretical thesis, we choose the latter way. We will start by stating the goal of a fair-sized section of statistical mechanics:

To find and/or estimate the physical behaviour of a magnet via theoretical means.

Since we are taking a theoretical approach, we will approximate the physical magnet with a theoretical model. There are many ways to create such a model, of varying degrees of usefulness. This leads us to the goal of this section:

To create an accurate and useful theoretical model of a magnet.

As one might suspect, there is a distinct trade-off between the accuracy of a model and its complexity — if we make our model too realistic, there are simply too many factors to keep track of, whereas if we simplify too much, our results become more and more inaccurate. The key lies in finding the right balance, so that we can actually get results, while still retaining enough realism to make those results worthwhile.

The next question is: how do we model a magnet? This question has many answers, and various ways of modelling have different levels of usefulness depending on what we wish to know. The approach taken by statistical mechanics is the microscopic approach. The reasoning is that since a magnet is known to be made up of many atoms, each with its own magnetic spin, it should be possible, and reasonable, to model a magnet as a very large number of these atoms.

Interestingly, more than just magnets can be modelled in this way. Since every substance is composed of lots of atoms, a many-atom model is applicable to many materials. However, for the purposes of this thesis, we will concentrate on a magnetic model. As stated, the magnet contains many magnetic atoms. We would like to incorporate these atoms into a model which we can use to estimate or calculate the properties of a magnet.

The first thing we must do is to figure out what factors affect the behaviour of the magnet. These factors include, but are not limited to:

• The nature of the atoms

• The arrangement of the atoms

• The interatomic interactions generated by the presence of many atoms in a small space

and external variables such as

• The surrounding temperature

• The surrounding external magnetic field.

The question then becomes what values to set all of these factors to.

Obviously, there are many possibilities for all of the abovementioned factors. By varying these factors (or, indeed, introducing new ones) many different magnetic models can be made, some of which can give wildly different results from others. The majority of these models are simply too complicated for us to even consider anything more than the simplest calculations. These do not interest us. What we want is a simple model that we can generate results from, even if our assumptions are not quite correct and our results not quite exact. At least this model will give us some results!

To get to such a simple model, we start by specifying our factors. We will consider each factor in turn — but note that there are many possibilities for each factor, of which more than one may be equally interesting (and valid). We will only consider one possibility to start off with.

1. The nature of the atoms.

The first thing we do is to specify the nature of our atoms exactly. While it is certainly possible to take into consideration the mass or physical size of the atoms, or whether we have different types of atoms, we are ultimately searching for simplicity. As such, we model the atom by the simplest possible object — a single point. This point has exactly one property, its magnetic spin, which we represent by a single number. Because of this, we will also refer to the atoms as spins from now on.

In fact we can even do better than this; not only will we set the magnetic spin of an atom to be just one number, but we also force that number to take only two values. These values are more or less arbitrary, as a linear transformation will transform any two values into any other. On the other hand, we would like to have both positive and negative magnetic strengths, and it seems reasonable that the basic unit of strength should be equal in both directions. Therefore we take the possible spin values (also known as states) to be 1 and -1. Sometimes these spins are called up and down spins, mostly due to their possible pictorial representations. We show these in Figure 1.1.

2. The arrangement of the atoms.

Next, we look at the physical arrangement of the atoms. Magnets are always solid objects, and solids have their atoms fixed in a regular pattern. We imitate this by placing each spin on the vertex of a regular grid, for example a two-dimensional square grid as shown in Figure 1.2. The question then becomes what grid we ought to place the spins on. We can separate this into two factors:


(a) Pictorial representation of up and down spins.

(b) We will depict positive and negative spins with filled and hollow circles when we need to differentiate between them.

Fig. 1.1: Pictorial representations of spins.

Fig. 1.2: A two-dimensional square grid. We place our spins at each vertex of this grid.

• The dimension of the grid, and

• The geometry of the grid.

Given that magnets are 3-dimensional objects, we would undoubtedly like to use a 3-dimensional grid for our model. On the other hand, 3-dimensional grids often engender very complex and difficult-to-solve models. The only other realistic alternatives are 1- and 2-dimensional grids. It turns out that a 1-dimensional grid often yields a relatively simple calculation, but is not particularly realistic. However, it is still useful for testing techniques that we would like to extend to higher dimensions.

This leaves the 2-dimensional grids. These possess a surprising degree of complexity compared to their 1-dimensional counterparts. In fact, while many 1-dimensional models are exactly solvable, most 2-dimensional models remain unsolved, even today. Indeed, 2-dimensional models bear much more similarity to 3-dimensional models in terms of solvability and the techniques used on them. However, since they are undoubtedly simpler, it seems best for us to use 2-dimensional grids.

For the geometry of the grid, there are again many possibilities, a lot of which are valid and interesting. We have already shown the 2-dimensional square grid in Figure 1.2; other simple possibilities in two dimensions include triangular and hexagonal lattices, which we show in Figure 1.3. In 3 dimensions, the choice becomes even more complicated, as we encounter lattices like the face-centred cubic and body-centred cubic


(a) Triangular grid. (b) Hexagonal grid.

Fig. 1.3: Some 2-dimensional grids.

Fig. 1.4: Each vertex in the square grid is connected to 4 other vertices.

lattices. Ultimately, we would like to have models for all grids, but for the moment, we will have to just choose one. We choose to work on a 2-dimensional square grid (Z²), placing a spin at every vertex of this grid.

In this grid, each vertex is connected via a bond to 4 other vertices, as shown in Figure 1.4. The number of bonds incident on a single vertex is called the co-ordination number of the grid, denoted by q. Since each of the N vertices has q = 4 incident bonds and each bond connects 2 vertices, there are qN/2 = 2N bonds — twice as many bonds as vertices.

It is worth noting that in other statistical mechanical models, the atoms need not be fixed at all. This occurs most often in models of liquids and/or gases, which makes sense as the atoms of these substances are not actually in fixed positions. On the other hand, magnetic models can also be used to model gases and liquids, with reasonable success.

3. The interatomic interactions.

Considering that we are representing the atoms solely by their magnetic spin, the only atomic interactions that we can count are the magnetic interactions between atoms. But a problem soon arises: in the real world, every magnetic spin would interact with every other spin in the magnet, so we must take into account all possible spin pairs. However, the number of spin pairs is much greater than the number of spins, so this is highly undesirable, especially when we note that the vast majority of spins


Fig. 1.5: Only a small number of spins significantly affect any one spin.

are comparatively far away from any given spin, and therefore would have a very small interaction with that atom. We show this in Figure 1.5.

We can extend this thought to its logical conclusion — given any one spin, the spins with the greatest interaction with that spin are those spins which are nearest to it, i.e. the spins which are connected to it by an edge of the lattice, or the innermost circle in Figure 1.5. These spins are known as the nearest neighbours of the original spin, for obvious reasons. We will make a sweeping simplification and say that these are the only spins which interact with our given spin.

Given that only nearest neighbours interact, we now ask: how do they interact? Because our atoms are fixed in position, they will never move closer to or farther from each other. Instead, the magnetic interaction between two spins produces an internal energy in the system. Because the magnet tries to achieve a state which has low energy, keeping track of this energy enables us to set up a probabilistic model of the magnet. Before we do this, we will quantify the spin interactions and the energy that these interactions produce.

Since the spins are either positive or negative, there are only two possible interactions, which occur if the spins are aligned or not aligned (we say two spins are aligned if they take the same value). We let there be a certain amount of energy in the system when the spins are aligned, and an equal but negative amount otherwise. We denote this energy by −J, keeping it as a parameter of the model. This means that if we have two neighbouring spins with values σi and σj, the energy generated by the interatomic interaction between these two spins will be −Jσiσj.

4. The surrounding temperature.

Considering that the temperature of the magnet can be easily varied, it makes sense to keep this as a parameter of our model. We denote it by T. At higher temperatures, the atoms are more agitated, so configurations with higher energy are more likely to occur than at lower temperatures.


Fig. 1.6: The energy-creating interactions in the model, including the external field.

5. The external magnetic field.

Again, this is a quantity which can be easily varied. The external field will interact with each spin in a very similar manner to the spins interacting with each other. We let the energy generated by the interaction of a positive spin with the external field be −H, and take H to be the energy generated by a negative spin. Then the energy generated by any spin with value σi is equal to −Hσi.

We are now ready to set up our probabilistic model. Suppose that we have N spins in our model, where ideally N is very large, or even infinite. We label these spins arbitrarily from 1 to N, and let spin i have the value σi. The total energy of any configuration resulting from the magnetic interactions, both internal and external, is

$$-J \sum_{\langle i,j \rangle} \sigma_i \sigma_j - H \sum_i \sigma_i \qquad (1.1)$$

where the first sum is over all nearest-neighbour pairs of spins i and j. This quantity is called the Hamiltonian of the system, and is denoted $\mathcal{H}(\sigma_1, \sigma_2, \ldots, \sigma_N)$. Now, we want to set up the probabilities for the model so that configurations of spins with lower energy are more likely to occur, while configurations with higher energy are less likely. We would also like all configurations with the same energy to be equally likely.

To achieve this, we take the probability of any configuration to be proportional to the exponential of a negative multiple of the Hamiltonian. This means that

$$P(\sigma_1, \sigma_2, \ldots, \sigma_N) \propto e^{-\beta \mathcal{H}(\sigma_1, \sigma_2, \ldots, \sigma_N)} = \exp\bigg(-\beta\Big(-J \sum_{\langle i,j \rangle} \sigma_i \sigma_j - H \sum_i \sigma_i\Big)\bigg). \qquad (1.2)$$

Because higher temperatures increase the probability of higher-energy states, we take β = 1/(kT), where k is Boltzmann's constant. This distribution is known as the Gibbs canonical distribution.

From here, it is a simple matter of normalisation to find the probabilities, since the sum of all probabilities must equal 1. The normalising factor is

$$Z_N = \sum_{\sigma_1, \sigma_2, \ldots, \sigma_N} \exp\bigg(\beta J \sum_{\langle i,j \rangle} \sigma_i \sigma_j + \beta H \sum_i \sigma_i\bigg) \qquad (1.3)$$


which we call the partition function. This means that the probability of any configuration is

$$P(\sigma_1, \sigma_2, \ldots, \sigma_N) = \frac{1}{Z_N} \exp\bigg(\beta J \sum_{\langle i,j \rangle} \sigma_i \sigma_j + \beta H \sum_i \sigma_i\bigg). \qquad (1.4)$$

Not by coincidence, we have gradually arrived at the most famous and well-studied model in statistical mechanics, the Ising model (which in this case is more properly called the spin-1/2 two-dimensional square lattice Ising model), which was introduced by Ising in 1925 ([72]). Evaluating the partition function is the most important task in studying these models, as we can derive all of the information that we want to know about the magnet from it. Therefore the aim of every statistical mechanical model is to derive an exact, closed-form expression for the partition function. If we can do this, we call the model solved.

In a famous paper in 1944, Onsager ([114]) managed to solve exactly for the partition function of the Ising model in the case of zero field (H = 0). However, it is a testament to the complexity of the model (despite the simplicity of its definition) that the arbitrary-field case still has not been solved, even 60 years later.
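To make the definitions above concrete, Equation 1.3 can be evaluated by brute force for a handful of spins. The following is a minimal illustrative sketch (not code from the thesis): it sums over all 2^N configurations of a small L × L periodic lattice, and its exponential cost is exactly why the series methods of Chapters 2 and 3 are needed.

```python
# Brute-force evaluation of the Ising partition function (Equation 1.3)
# on a tiny L x L square lattice with periodic boundary conditions.
# Illustrative only: the 2^N sum is hopeless beyond N ~ 25.
from itertools import product
from math import exp

def partition_function(L, beta, J, H):
    N = L * L
    Z = 0.0
    for spins in product((1, -1), repeat=N):
        sigma = lambda x, y: spins[(x % L) * L + (y % L)]
        # Count each nearest-neighbour bond once (right and up neighbours).
        bond_sum = sum(sigma(x, y) * (sigma(x + 1, y) + sigma(x, y + 1))
                       for x in range(L) for y in range(L))
        Z += exp(beta * (J * bond_sum + H * sum(spins)))
    return Z

print(partition_function(3, beta=0.5, J=1.0, H=0.0))
```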

Many of the choices that we have made in setting up this model are quite arbitrary, although they are reasonable. By relaxing or altering our conditions, many other interesting models can be formed. Such models include:

• The hard squares model. In this model, we dispense with the quantification of the spin-pair interaction between the atoms. Instead we let each spin have a base state (1) and an excited state (-1), and we forbid any configuration where two excited states are adjacent to each other. If we draw a square around each excited state with vertices at the nearest neighbours of that spin, then this constraint implies that no two of these squares can intersect — hence the term hard squares. Because we have removed one parameter, this model is usually slightly less complicated than the Ising model. We illustrate this model in Figure 1.7(a), and will study it further in Chapter 2 (mostly as an example); a small enumeration sketch follows this list.

• The q-state Potts model. This model does away with the assumption that the atoms can only take two states. Instead, each spin has q possible states, numbered from 0 to q − 1. This forces us to make some changes to the way the interactions work — only spins in state 0 are affected by the external magnetic field, and the only nearest-neighbour pairs of spins which interact are those for which the spins are in the same state.

• The second-neighbour Ising model. In this model, we remove the restriction that a spin can only interact with its nearest neighbours. Now we also allow spins to interact with spins which are diagonally removed across a square in the grid (their second neighbours). Because the interaction between first neighbours is obviously of different strength to the interaction between second neighbours, we set the strength of the first-neighbour interaction (previously J) to be J1, while the second-neighbour interaction


(a) A configuration in the hard squares model. Hollow circles denote negative spins.

(b) Interactions in the second-neighbour Ising model.

Fig. 1.7: Some variations on the Ising model.

strength is denoted by J2. The partition function of this model then becomes

$$Z_N = \sum_{\sigma_1, \sigma_2, \ldots, \sigma_N} \exp\bigg(\beta J_1 \sum_{\langle i,j \rangle} \sigma_i \sigma_j + \beta J_2 \sum_{\langle i,k \rangle_2} \sigma_i \sigma_k + \beta H \sum_i \sigma_i\bigg) \qquad (1.5)$$

where the second sum in the exponential is over all second-neighbour spin pairs i and k. We will explore this model in detail in Chapter 3, but for the moment, we illustrate the interactions in Figure 1.7(b).
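As promised above, here is a small illustration of the hard squares constraint (and not of the series machinery of Chapter 2): counting the allowed configurations on a small open L × L grid by brute force, which is equivalent to counting the independent sets of the grid graph. The sketch below is hypothetical helper code written for this introduction only.

```python
# Count hard-squares configurations on an L x L grid with free boundaries:
# no two excited spins may be nearest neighbours (independent sets).
from itertools import product

def count_hard_squares(L):
    count = 0
    for cells in product((0, 1), repeat=L * L):  # 1 marks an excited spin
        excited = lambda x, y: cells[x * L + y]
        ok = all(not (excited(x, y) and excited(x + 1, y))
                 for x in range(L - 1) for y in range(L))
        ok = ok and all(not (excited(x, y) and excited(x, y + 1))
                        for x in range(L) for y in range(L - 1))
        if ok:
            count += 1
    return count

print([count_hard_squares(L) for L in (1, 2, 3)])  # [2, 7, 63]
```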

1.2 Series and lattice animals

The Ising model is a statistical mechanical model, but hidden underneath the surface lie some very deep combinatorial aspects. One link comes from the partition function ZN. We can write this as

$$Z_N = \sum_{\sigma_1, \sigma_2, \ldots, \sigma_N} \prod_{\langle i,j \rangle} e^{\beta J \sigma_i \sigma_j} \prod_i e^{\beta H \sigma_i}. \qquad (1.6)$$

Now suppose, for the sake of simplification, that there is no external magnetic field, i.e. H = 0. Then the second exponential in the above expression is always 1. Looking at the first exponential, because the spins can only take the values 1 and -1, we have

$$e^{\beta J \sigma_i \sigma_j} = \begin{cases} e^{\beta J} & \text{if } \sigma_i = \sigma_j \\ e^{-\beta J} & \text{otherwise} \end{cases} \qquad (1.7)$$

for any i and j. Therefore, since e^(±βJ) = cosh βJ (1 ± tanh βJ), we can write both cases at once as

$$e^{\beta J \sigma_i \sigma_j} = \cosh \beta J \, (1 + \sigma_i \sigma_j \tanh \beta J). \qquad (1.8)$$


(a) A graph on the square lattice.

(b) A connected graph on the square lattice. All vertices have even degree.

Fig. 1.8: Graphs on the square lattice.

Since there are 2N nearest-neighbour pairs on the square lattice, we can write the zero-field partition function as

$$Z_N = (\cosh \beta J)^{2N} \sum_{\sigma_1, \sigma_2, \ldots, \sigma_N} \prod_{\langle i,j \rangle} (1 + \sigma_i \sigma_j \tanh \beta J). \qquad (1.9)$$

Now consider the product in the above equation. If we expand this product, we will have a sum of 2^(2N) terms, since there are 2N bonds. Each of these terms is a product of contributions of either 1 or σiσj tanh βJ from each bond. Now select one term (arbitrarily) and envision a square grid. If we say that for every nearest-neighbour pair which contributes σiσj tanh βJ, there is a bond on the grid, then each term will correspond to a graph on the grid. We show some of the possible graphs in Figure 1.8.

Now if we look at each term, we see that the power of σi in any term is equal to the degree of the vertex i in the corresponding graph. However, because σi can only take the values 1 or -1, the sum of σi^n over the two values σi = ±1 is 0 if n is odd and 2 if n is even. Therefore the only possible non-zero sums occur when there are even powers of every σ in the term. This means that every graph which contributes to the partition function must have even degree at all vertices. An example of such a graph is shown in Figure 1.8(b).

For these terms, the sum is taken over all 2^N possible configurations of spins, so each contributes a factor of 2^N. Therefore, if we set u = tanh βJ, we can write ZN as the series

$$Z_N = 2^N (\cosh \beta J)^{2N} \sum_g u^{|g|} \qquad (1.10)$$

where the sum is over all graphs g on the square lattice which have even degree at every vertex, and |g| denotes the number of bonds in g. Therefore ZN is a multiple of the generating function of these graphs, and its first few terms can be calculated by enumerating these graphs. This is called the high temperature expansion of ZN, introduced by van der Waerden in 1941 ([132]). When cast in this fashion, the problem of calculating the partition function of the Ising model is shown to be equivalent to a combinatorial problem.
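The identity behind this expansion can be checked mechanically on a tiny lattice. The sketch below (an independent illustration written under stated assumptions, not code from the thesis) compares the spin sum of Equation 1.9, as a polynomial in u, against a direct enumeration of even-degree bond subsets on a 3 × 3 torus. Note that on such a small torus, cycles can wrap around, so the resulting coefficients differ from the bulk counts that enter Equation 1.11.

```python
# Check van der Waerden's identity on a 3x3 torus: the polynomial
# sum over spins of prod_{<i,j>} (1 + s_i s_j u) equals 2^N times the
# number of even-degree bond subsets with k bonds, as coefficients of u^k.
from itertools import product

L = 3
sites = [(x, y) for x in range(L) for y in range(L)]
idx = {s: i for i, s in enumerate(sites)}
bonds = [(idx[(x, y)], idx[((x + 1) % L, y)]) for (x, y) in sites] + \
        [(idx[(x, y)], idx[(x, (y + 1) % L)]) for (x, y) in sites]

# Left side: accumulate the spin sum as a coefficient list in u.
lhs = [0] * (len(bonds) + 1)
for spins in product((1, -1), repeat=len(sites)):
    poly = [1] + [0] * len(bonds)
    for i, j in bonds:
        c = spins[i] * spins[j]
        poly = [poly[k] + (c * poly[k - 1] if k else 0)
                for k in range(len(poly))]
    lhs = [a + b for a, b in zip(lhs, poly)]

# Right side: 2^N for every bond subset with even degree at each site.
rhs = [0] * (len(bonds) + 1)
for subset in product((0, 1), repeat=len(bonds)):
    deg = [0] * len(sites)
    for used, (i, j) in zip(subset, bonds):
        if used:
            deg[i] += 1
            deg[j] += 1
    if all(d % 2 == 0 for d in deg):
        rhs[sum(subset)] += 2 ** len(sites)

print(lhs == rhs)  # True
```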


(a) All valid graphs, up to translation, with 4 bonds.

(b) All valid graphs, up to translation, with 6 bonds.

(c) All valid connected graphs, up to rotation and translation, with 8 bonds.

Fig. 1.9: Small graphs which contribute to the Ising partition function series.

Now we look at ways that we can solve this combinatorial problem. The most obvious solution is to count the graphs directly, which is known as direct enumeration. For the Ising model, the non-trivial graph with the least number of bonds which has even degree at every vertex is the square shown in Figure 1.9(a). The lower left vertex of the square (say) can be placed at any vertex in the grid, which means that there are N such graphs that can be placed on the square lattice.

The next graphs (with 6 bonds) that satisfy our conditions are the rectangles in Figure 1.9(b). By the same argument, there are 2N such graphs on the grid. However, when we go to 8 bonds, the situation becomes more complicated — in addition to the figures in Figure 1.9(c), which number 9N, we have the possibility of two disconnected squares. There are N ways to place the first square and N − 9 ways to place the second, but this counts each placement twice, so the total number of valid graphs with 8 bonds is (1/2)N² + (9/2)N.

This gives the first few terms of ZN :

$$Z_N = 2^N (\cosh \beta J)^{2N} \Big(1 + N u^4 + 2N u^6 + \big(\tfrac{1}{2} N^2 + \tfrac{9}{2} N\big) u^8 + \ldots\Big) \qquad (1.11)$$

It quickly becomes apparent that direct enumeration rapidly becomes far too complex and time-consuming to use for finding any but the very first few terms of ZN. In fact, for this problem direct enumeration is an exponential-time algorithm, meaning that if we wanted to compute n terms of the partition function series using direct enumeration, the time we would need is approximately proportional to α^n for some constant α. This is very inefficient — for example, if α = 2, which is not an overly large number, finding any one term will take as long as finding every single term before it! For the Ising model, direct enumeration is even more inefficient than this, as α = 2√2.


To find more series terms, we need more efficient algorithms. There are two ways to do this. One is to lower the growth constant α. While this will produce an algorithm that is exponentially quicker than before, it is still an exponential-time algorithm. The 'holy grail' of series enumeration (not only of the Ising model partition function, but of almost all series) is to find a polynomial-time algorithm (or faster, but that would just be unrealistic), which is an algorithm which takes time on the order of n^β to compute n series terms. Unfortunately such algorithms are few and far between. Of course, these are just the two extremes of efficiency; it is entirely possible to have algorithms which are sub-exponential and still not polynomial, for instance taking time on the order of γ^(√n).

In the search for more efficient algorithms to enumerate our series, many sophisticated methods have been devised for finding the partition function of the Ising model. The most notable of these are the well-known finite lattice method or FLM, which we shall describe in Chapter 3, and the lesser-known but potentially more efficient corner transfer matrix method or CTM method, which we study in detail in Chapter 2 (and apply in Chapter 3).

1.3 Combinatorial objects

In the previous section, we saw that it was possible to calculate the Ising model partition function simply by counting graphs on a square grid. In fact this is by no means a unique phenomenon: a whole host of other important quantities can be calculated by enumerating combinatorial objects. Indeed, often it is interesting to count the objects even if they do not have a direct application in statistical mechanics. For these objects, we therefore ask the question:

Given a class of objects with a size measure, how many objects of size n are there?

Some such objects of interest include:

• Bond animals. Very similar to the graphs that are counted by the high temperature expansion of the Ising model partition function, a bond animal is a connected set of bonds.

• Site animals. As with bond animals, a site animal is a connected set of sites. We define two sites as connected if they are nearest neighbours. Interestingly, there is also a series formulation of the Ising model partition function, appropriate for low temperatures, which involves counting sets of site animals.

• Polyominoes. Polyominoes can also be thought of as 'cell animals'. If we define each unit square on the grid as a cell, then polyominoes are connected sets of these cells.

There are also interesting specializations of these objects such as

• Polygons. A polygon (or self-avoiding polygon) is a bond animal where each vertex has either degree 2 or degree 0. We can also think of the interior of a polygon as a special kind of polyomino with no 'holes'.


(a) A bond animal.

(b) A site animal. The sites in the animal are filled.

(c) A polyomino.

(d) A self-avoiding polygon.

(e) A directed bond animal.

(f) A column-convex polygon.

(g) A staircase polygon.

Fig. 1.10: Some combinatorial objects of interest.

• Directed bond and site animals. These are bond and site animals in which every bond or site can be reached from one bond/site (called the root) by a path that takes only steps in certain directions, commonly north and east.

• Column-convex polygons. A polygon is column-convex if the intersection of its interior with any vertical line is connected.

• Staircase polygons. These are polygons which are the union of two walks which use only steps in two directions, commonly north and east.

We show examples of these objects on a square lattice in Figure 1.10.

Many of these objects (in particular polygons and specialized polygons) can be constructed from a walk on the lattice. A walk is just a single path on the lattice, which starts and ends at a vertex. It can also be thought of as the locus of a walker (characterised solely by its location) moving at constant velocity on the lattice. We take the size of the walk to be its length (or, equivalently, the time taken to complete the walk).

Polygons can be constructed directly from a single walk by envisioning a walker which starts at an arbitrary point and must finish at that same point, but otherwise cannot visit a vertex that it has already visited (even the starting point). Many restricted polygons (e.g. column-convex polygons) can be constructed in a similar manner. Some similar objects which can also be constructed from a single walk are:


• Self-avoiding walks. A self-avoiding walk is a walk which cannot revisit any vertex which it has already visited.

• Directed walks. A directed walk can only walk in certain directions, usually north and east.

• Self-avoiding trails. A self-avoiding trail is similar in nature to a self-avoiding walk, but the walker cannot traverse any edge that it has already traversed (visiting already-visited points is allowed as long as edges are not revisited).

• Self-avoiding polygon trails. This is an amalgamation of self-avoiding polygons and self-avoiding trails — the walk cannot revisit edges (points are okay), but must finish at its starting point.

• Spiral walks. These are self-avoiding walks which can only turn in one direction, creating a spiral-like pattern.

• 3-choice walks. These are walks which are forbidden to make clockwise turns after a step in the horizontal direction. So if a walk moves east in one step, it cannot move south in the next; similarly for west and north.

We show examples of these objects in Figure 1.11.

The self-avoiding walk problem in particular — or more precisely, counting the number of self-avoiding walks of length n — is a well-known and much-studied problem in combinatorics. Despite the best efforts of mathematicians over more than 50 years, this still remains unsolved for general n.

For most enumeration problems, if we cannot calculate the exact number of objects directly (which happens often), we look for the asymptotic behaviour. This is a relationship between the number of objects and a measure of size, as that measure grows to ∞. In the case of walks, we try to express the number of walks of length n in terms of n as n → ∞. If the number of walks is approximately α^n for some non-trivial number α, we say that there is an exponential relationship, and call α the growth constant. It is a testament to the difficulty of the self-avoiding walk problem that we do not even have an exact value for the growth constant of the number of walks on the square lattice!
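To make the notion of a growth constant concrete, here is a minimal backtracking sketch (illustrative only, and nowhere near the state of the art in self-avoiding walk enumeration): it counts self-avoiding walks of length n and prints the successive ratios c_n / c_{n−1}, which approach the square-lattice growth constant μ ≈ 2.638 only very slowly.

```python
# Count self-avoiding walks of length n on the square lattice by
# backtracking; c_1 = 4, c_2 = 12, c_3 = 36, c_4 = 100, ...
def count_saws(n, pos=(0, 0), visited=None):
    if visited is None:
        visited = {(0, 0)}
    if n == 0:
        return 1
    x, y = pos
    total = 0
    for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if nxt not in visited:
            visited.add(nxt)
            total += count_saws(n - 1, nxt, visited)
            visited.remove(nxt)
    return total

counts = [count_saws(n) for n in range(1, 11)]
print(counts)
print([c2 / c1 for c1, c2 in zip(counts, counts[1:])])  # ratios -> mu
```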

In contrast, the directed walk problem is almost trivial. At each step, the walker has two possible steps to choose from, north and east. Neither of these steps is ever prohibited, and therefore the number of walks of length n must be 2^n. Even if we wish to keep track of the end-point of the walk, any directed walk of length n from the origin to (m, n − m) must take m horizontal steps out of a total of n, which means that there are $\binom{n}{m}$ such walks, as shown in Figure 1.12.

However, we can insert greater complexity into the problem by considering more than one walk at a time. We notice that a directed walk, by construction, must be self-avoiding. Suppose that we now take several directed walkers, which travel in the same directions. If the walkers do not affect each other, calculating the number of possible walks is easy. However, once we impose an avoidance constraint, the problem becomes much harder. This


(a) A self-avoiding walk.

(b) A directed walk.

(c) A self-avoiding trail.

(d) A self-avoiding polygon trail.

(e) A spiral walk.

(f) A 3-choice walk.

Fig. 1.11: Some combinatorial objects of interest that can be constructed from a walk.

Fig. 1.12: Every directed walk that ends at a fixed point has the same number of horizontal steps and vertical steps.

Fig. 1.13: An example of vicious walks.

Fig. 1.14: A Dyck path.

is the basis of the vicious walks problem. In this problem, there are several walkers, which can only move in directed walks. However, when two walkers meet at the same point, they annihilate each other, so no such configuration is allowed.

We still have the problem that a walker may visit a point which has been previously visited by other walkers that have moved on. To overcome this, we adjust the model so that this cannot possibly happen, by forcing all walkers to have the same x-coordinate at any one time. To achieve this, we rotate the square lattice clockwise by an angle of π/4 and expand it by a factor of √2, so that the walkers still have integer coordinates, but can only take north-east or south-east steps. Since this ensures that the walks will have travelled the same x-distance at any one time, we can then ensure that they stay 'in step' by starting them all at the same x-coordinate, in this case 0. By spacing them 2 units apart on the y-axis, we then arrive at the vicious walks problem. An example of this model is shown in Figure 1.13.

This model is related to the well-known Dyck paths. A Dyck path is a path on the same lattice which starts at the origin, ends at height 0, and does not go below the x-axis. We show a Dyck path in Figure 1.14.

The vicious walks problem was solved exactly by Guttmann, Owczarek and Viennot in 1998 ([65]). They found that if there are p vicious walkers, then the total number of possible walk configurations of length n is

    ∏_{1≤i≤j≤n} (p + i + j − 1)/(i + j − 1).    (1.12)
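Equation 1.12 is straightforward to evaluate; the sketch below (our own transcription of the formula) uses rational arithmetic, since the individual factors are not integers even though the full product is:

    from fractions import Fraction

    def vicious_configurations(p, n):
        """Evaluate Equation 1.12: the number of configurations of
        p vicious walkers of length n."""
        result = Fraction(1)
        for i in range(1, n + 1):
            for j in range(i, n + 1):
                result *= Fraction(p + i + j - 1, i + j - 1)
        return int(result)

    # A single walker is just a directed walk in disguise: 2**n configurations.
    assert all(vicious_configurations(1, n) == 2**n for n in range(1, 8))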

Like the Ising model, we can also make the vicious walks problem a little more interesting by devising several variations. We do this by, again, relaxing or altering the conditions of the model. Some of these variations are:

• Walks with a wall. We apply a Dyck path-like constraint to the walks, preventing them from going below the line y = 0. Alternatively, we could prevent the walks from going above the line y = L.

• Walks in a horizontal strip. We apply two boundary conditions, forcing the walks to remain within the horizontal strip between y = 0 and y = L.

• Friendly walks. We relax the avoidance constraint, so that walks may touch, but not cross each other.

• n-friendly walks. This is a generalisation of the model, rather than a modification. We allow two (but no more than two) walks to touch, and they may continue to touch for up to n vertices, where n is a number called the friendliness. We still do not allow the walks to cross. Under this system, vicious walkers are 0-friendly.

We show some of these variations in Figure 1.15.

The different models that can be made with these variations are quite interesting, and for many cases the models have not been solved exactly, although asymptotic behaviour has been found for many. We are interested in one model in particular, that of n-friendly walks confined in a horizontal strip. This model amalgamates the principles of walks in a strip and n-friendliness in an obvious manner. We study this model in detail in Chapter 4.

1.4 Polygons and knots

If we take the self-avoiding polygon trail described in the previous section, we notice that it bears a striking similarity to the common representation of a knot. Formally, a knot is a simple, closed curve in 3 dimensions. The distinguishing feature of a knot is, informally, how tangled up it is with itself — in other words, how ‘knotted’ it is.

To gain an idea of this concept, consider that there are many closed curves in 3 dimensions which can be transformed continuously into other curves in a physical sense — that is, we can physically move the first curve to match the other curve without breaking the curve or having it intersect itself at any stage. On the other hand, there are some curves which cannot be transformed into each other in this fashion. We say that any knots which can be transformed into each other in this way have the same knot type, although sometimes we use the terms knot type and knot interchangeably. We call such knots equivalent.


Fig. 1.15: Variations on vicious walks: (a) walks with a wall; (b) walks in a horizontal strip; (c) friendly walks; (d) n-friendly walks, here with n = 1.

Fig. 1.16: Self-avoiding polygon trails and knots resemble each other: (a) an embedding of a knot; (b) a corresponding self-avoiding polygon trail.

To simplify our visualization of knots and knot types, we represent knots in a 2-dimensional form called an embedding. To reach an embedding, we project the knot down onto a plane, forming a closed (but not necessarily simple) curve. In most knots the projection will cross itself at least once, and in 3-space one of the ‘strands’ of the knot at the crossing will be at greater height than the other. It is important in finding the type of the knot to know which strand this is, so every crossing is marked so that there is an overpass and an underpass. This is commonly represented by drawing a break in the under-crossing, so the embedding looks like a closed curve with breaks where it crosses itself. It is this embedding that the self-avoiding polygon trail bears a resemblance to. We show an example of such a resemblance in Figure 1.16.

The idea behind this resemblance is that both knot embeddings and self-avoiding polygon trails are closed curves which can cross themselves. Following this idea through, we can see that any self-avoiding polygon trail (abbreviated to SAPT) can in fact be converted to a knot embedding simply by creating a crossing at each place where it crosses itself, and randomly assigning one of the strands to be the overpass and the other to be the underpass.

The advantages of such an association lie in the fact that we can now use what we know about self-avoiding polygon trails to infer information about knots. In particular, knot theory is very interested in the idea of unknotting. The idea is that there is a ‘simplest’ knot, called the unknot, which can be embedded as a simple circle with no crossings, as shown in Figure 1.17(a). Of course, there are many other embeddings which are equivalent to the unknot, but which have crossings, such as Figure 1.17(b).

Unknotting is the act of transforming any knot into the unknot (or more precisely a knot equivalent to the unknot) by reversing crossings. We reverse crossings by switching the two strands in the crossing, so the overpass becomes the underpass and vice versa, but nothing else changes. This is illustrated in Figure 1.18.

In fact, there is a very practical application for the use of knots and unknotting. There are many physical objects which can be modelled by knots; in particular, we are interested in strands of DNA. DNA is well known to have a double helix structure, but if we take each


Fig. 1.17: Some embeddings of the unknot: (a) the unknot; (b) another embedding of the unknot.

Fig. 1.18: Reversing a crossing.

double helix as a single strand, we see that a piece of DNA is essentially one long strand. Often this strand is tangled up with itself, forming a knot.

Now, for the cell containing the DNA to be able to replicate, the DNA needs to be untangled (i.e. unknotted). There is an enzyme called topoisomerase II that acts on a section of the DNA by breaking it up, passing another section of DNA through the break, and then resealing the strand. We see that this corresponds exactly to the crossing reversal that we mentioned above.

The aim of the topoisomerase action is to disentangle the DNA into one long, unknotted strand. This is equivalent to unknotting a knot by means of reversing crossings. We want to find out how long the disentangling will take. Put in terms of knots, this immediately leads to the following question:

Given a knot, what is the smallest number of crossing reversals needed to turn this knot into the unknot?

The answer to this question is called the unknotting number of the knot. This is a well-studied quantity, but determining the unknotting number of a given knot is often very hard — in fact, there exists a knot with 8 crossings whose unknotting number is unknown. This is due to the fact that the unknotting number looks at the smallest number of reversals needed over all possible embeddings of a knot, and every knot can be embedded in an infinite number of ways. Not only do different embeddings need a different minimum number of reversals, but it is possible for embeddings with more crossings to need fewer reversals than other embeddings with fewer crossings. So we cannot just take the simplest embeddings when trying to determine the unknotting number.

In Figure 1.19 we show how the knot known as 7_4 (named in part because it has 7 crossings) can be unknotted in 2 crossing reversals, indicating that its unknotting number is either 1


Fig. 1.19: The knot 7_4 can be unknotted with two crossing reversals.

or 2. It turns out ([94]) that there are no embeddings of 7_4 which can be unknotted by a single reversal, so the unknotting number of 7_4 is 2.

However, there is a difficulty with using the unknotting number as a model for untangling DNA. The topoisomerase enzyme, being an enzyme that acts locally, will almost certainly not know the crossing to reverse to take the shortest route to the unknot; rather, it is much more likely that it simply reverses crossings at random. In terms of knots, this translates into the question:

If we reverse crossings at random, what is the average number of crossing reversals needed to turn a given knot into the unknot?

We call this number the mean unknotting time of the knot. Ironically, this may even be harder to calculate than the unknotting number, because it is not well defined. Using different embeddings for the same knot results in vastly different mean unknotting times, so it makes much more sense to talk of the mean unknotting time of an embedding, rather than a knot.

Ideally, we would like to choose a knot, and then randomly choose an embedding for that knot and find the mean unknotting time for that embedding. This is where the relation between self-avoiding polygon trails and knots comes in handy. One of the problems with embeddings is that they are difficult to generate randomly — after all, there are an infinite number of them, so how can we assign a probability distribution to them? On the other hand, it is (relatively) easy to generate a SAPT randomly, because there are a finite number of SAPTs of any length. All we need to do is choose one at random.

The main problem with this method is that choosing SAPTs randomly in this way does not allow us to choose a knot type. Rather, we observe the relationship between the length of the self-avoiding polygon trail, which corresponds to DNA length, and the mean unknotting time. The general idea is that the longer the length, the greater the chance that the SAPT will fall into a ‘complicated’ knot, and therefore length is directly related to complexity. In this way, we can get an idea of the mean unknotting time of a ‘random knot’. This problem is studied in detail in Chapter 5.
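As a toy illustration of the kind of estimate made in Chapter 5 — and we stress that this is our own caricature, not the model used there — one can represent each crossing of an embedding by a single bit and measure the mean time for random single-crossing reversals to reach a fixed target state. This is simply the mean hitting time of a random walk on an n-cube:

    import random

    def mean_hitting_time(n, trials=2000, seed=0):
        """Monte Carlo estimate of the mean number of random single-bit flips
        taking the all-ones corner of the n-cube to the all-zeros corner.
        Real crossing reversals do not interact this simply, so this is only
        a caricature of unknotting."""
        rng = random.Random(seed)
        total = 0
        for _ in range(trials):
            state, steps = (1 << n) - 1, 0
            while state != 0:
                state ^= 1 << rng.randrange(n)  # 'reverse' a random crossing
                steps += 1
            total += steps
        return total / trials

    for n in (2, 4, 6, 8):
        print(n, mean_hitting_time(n))  # grows rapidly with n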


1.5 In this thesis

This thesis is divided into four chapters, each corresponding to a topic.

• Chapter 2 studies the corner transfer matrix method. In Section 2.1, we introduce the method and look at some historical background. In Section 2.2, we present some more background material in the form of a quick primer on transfer matrices and how they can be used to find series expansions. Section 2.3 gives a variational result that will be used later on. In Section 2.4, we derive (in great detail) the all-important CTM equations, following the original work of Baxter. In Section 2.5, we show how these equations yield the exact partition function in the infinite-dimensional limit. In Section 2.6, we show how to calculate physical quantities of interest from the equations, and we illustrate the equations by solving them for 1 × 1 matrices in Section 2.7. In Section 2.8 we present some matrix methods that will come in handy. In Section 2.9 we present an iterative method for calculating series based on the CTM equations. This method is analysed in Sections 2.9.2 to 2.9.4, where we look at the advantages and flaws in the method, as well as the results we derived from it. We derive another method based on the renormalisation group method of Nishino and Okunishi in Section 2.10, and analyse it in a similar manner. Lastly, in Section 2.11 we discuss possible enhancements to these methods and suggest future directions in which we can take the research.

• Chapter 3 applies the renormalization group corner transfer matrix method to the second-neighbour Ising model. In Section 3.1 we derive the model and give some background. In Section 3.2 we illustrate and discuss the current premier series-derivation method in two dimensions, the finite lattice method. Next, in Section 3.3, we look at how the renormalization group CTM method applies to the model, analysing in detail the convergence of the method. We then analyse the second-neighbour Ising model theoretically by means of scaling methods. Section 3.4 gives a quick primer of scaling techniques and devices, which we then apply to our model in Section 3.5 to estimate the crossover exponent. In Section 3.6 we discuss how to find the critical lines, using both the CTM and FLM methods. Then we apply these methods to the model, producing results in Section 3.7, as well as estimating the critical exponent of the magnetization on the lower phase boundary. Lastly, we consider what we have done and offer possible further avenues of research in Section 3.8.

• Chapter 4 studies the n-friendly directed walks problem when confined in a horizontal strip. Section 4.1 outlines the model and gives some historical background. In Section 4.2 we show how we can guess the generating functions of the walkers from the first few terms, using Pade approximants. Turning to the problem of generating those terms, we give the general transfer matrix for two walkers in Section 4.3 and a method of recurrences for larger numbers of walkers in Section 4.4. In Section 4.5, we give an idea of our general methodology by proving the one-walker generating function. Then in Section 4.6, we present our main results: we prove the generating functions for two


walkers in strips of width 3 and 4, three walkers in a strip of width 4, and p vicious walkers in strips of width 2p − 1, 2p and 2p + 1. In Section 4.7 we take a brief look at the growth constants we derive from these results. Next, in Section 4.8 we speculate about the possibility of another parameter in the model, bandwidth. Lastly, in Section 4.9 we reflect on our results and consider further possibilities for research.

• Chapter 5 looks at the problem of mean unknotting times. In Section 5.1 we introduce the general methods and ideas of knots and unknotting, and define the problem. In Section 5.2 we then look at the unknotting times of knots with small numbers of crossings. Then we move on to SAPTs and knots. For some background, we give a detailed proof of Kesten's pattern theorem in Section 5.3 and a brief primer on Fourier transforms in Section 5.4. Through considering the relationship of the problem with a walk on an n-cube in Section 5.5, we are able to derive some theoretical bounds on the mean unknotting time in Section 5.6. Then, moving to the practical side of the matter, we describe how we generate self-avoiding polygon trails using the pivot method, and justify its validity in Sections 5.7 and 5.8. We use the method to estimate the mean unknotting time of self-avoiding polygon trails of fixed length in Section 5.9. Finally we discuss results and further options in Section 5.10.


2. THE CORNER TRANSFER MATRIX METHOD

2.1 Introduction

In the study of the physical properties of magnets, one can approach the problem from either a macroscopic direction or a microscopic direction. We will use the latter approach; in this case, we think of the magnet as being composed of a large number of magnetic atoms (which it is). These atoms are given fixed positions in space, and each interacts magnetically with every other atom to produce an energy in the system, with this interaction diminishing or even disappearing at long distances. For example, they may be in a 3-dimensional cubic lattice such as that shown in Figure 2.1. This is the basis for the statistical mechanical models of a magnet; using these models, we seek to find physical properties of the entire system in terms of the atomic interactions.

In the most general case, we represent each atom by its magnetic component, or spin. The actual numerical representation of the spin varies depending on the model used — generally it is represented by a single vector. This is often simplified in models to a single number, or even just a number which can only take two values.

Now, given this representation, every possible configuration of spins must have a certain probability of occurring. On the other hand, they may not all be equally likely — indeed, some configurations may well be impossible. To allocate probabilities to each configuration, we look at the various interactions acting in the system. Again the number and nature of these interactions vary according to the model, but they often include interactions such as magnetic field(s) and atom-pair interactions. Taken as a whole, these interactions produce an energy for each possible configuration. We call this energy the Hamiltonian of the configuration.

Suppose there are N spins in our model. We denote the spin values by σ_1, σ_2, ..., σ_N,


Fig. 2.1: A 3-dimensional cubic lattice with atoms at each vertex.

and symbolise the Hamiltonian as H(σ_1, σ_2, ..., σ_N). Under the so-called Gibbs canonical distribution, the probability that the spin values will take the particular values σ_1, σ_2, ..., σ_N is defined to be proportional to the exponential of a multiple of the Hamiltonian:

    P(σ_1, σ_2, ..., σ_N) = (1/Z_N) e^{−βH(σ_1, σ_2, ..., σ_N)}    (2.1)

where β = 1/kT, in which k is Boltzmann's constant and T is the temperature of the system. We call e^{−βH} the Boltzmann weight of the system. The multiple of the Hamiltonian in the Boltzmann weight is negative, so configurations with lower energies are more likely to occur. The factor of 1/Z_N is a normalising factor, which is inserted so that the sum of the probabilities of all configurations is 1. Therefore Z_N has the value

    Z_N = ∑_{σ_1, σ_2, ..., σ_N} e^{−βH(σ_1, σ_2, ..., σ_N)}    (2.2)

where the summation is over all possible values of all spins. We call Z_N the partition function of the model.
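For a system with only a few spins, Equation 2.2 can be evaluated by direct enumeration. The sketch below (a minimal illustration; the Hamiltonian shown is an arbitrary example of ours) also checks that the probabilities of Equation 2.1 sum to 1:

    from itertools import product
    from math import exp

    def partition_function(hamiltonian, n_spins, beta):
        """Z_N: sum of Boltzmann weights over all spin configurations
        (brute force, so feasible only for small n_spins)."""
        return sum(exp(-beta * hamiltonian(s))
                   for s in product((1, -1), repeat=n_spins))

    # Example: three spins, each contributing -spin to the energy.
    h = lambda s: -sum(s)
    beta = 0.5
    z = partition_function(h, 3, beta)
    probs = [exp(-beta * h(s)) / z for s in product((1, -1), repeat=3)]
    assert abs(sum(probs) - 1) < 1e-12  # the Gibbs distribution is normalised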

The partition function is very important for statistical mechanical models, as it contains all the information about the physical properties of the magnet. Because of this, if we can find a closed form expression for the partition function of a model, we call that model solved.

For example, the mean energy of the system is the expected value of the Hamiltonian over all possible spin configurations, under the Gibbs canonical distribution. This means that it can be calculated as

    E = 〈H(σ_1, σ_2, ..., σ_N)〉
      = ∑_{σ_1, σ_2, ..., σ_N} H(σ_1, σ_2, ..., σ_N) P(σ_1, σ_2, ..., σ_N)
      = (1/Z_N) ∑_{σ_1, σ_2, ..., σ_N} H(σ_1, σ_2, ..., σ_N) e^{−βH(σ_1, σ_2, ..., σ_N)}
      = −(1/Z_N) ∂Z_N/∂β
      = −(∂/∂β) ln Z_N.    (2.3)

Here, and from now on, we use angle brackets 〈 〉 to denote expectation over all the spin configurations. The energy of the system can be used to find the thermodynamical quantities; from the equation (taken from [128, Section 2-5])

    E = Ψ − TS    (2.4)

where Ψ is the Helmholtz free energy and S = −∂Ψ/∂T is the entropy of the system, we derive

    Ψ = −kT ln Z_N.    (2.5)

As the number of spins N increases to ∞, this becomes infinite. To have a measure of the free energy in this limit, which is called the thermodynamic limit, we divide the free energy by N, and then take the limit N → ∞. This gives the free energy per site

    ψ = −kT lim_{N→∞} (1/N) ln Z_N.    (2.6)

As another example, suppose we wish to find how strong the magnet is, as a whole. We call this property the magnetization of the magnet. If all the spins are aligned, the magnetization will be maximised; on the other hand, if they all point in random directions, or if half the spins point in an opposite direction to the other half, then the system will not be magnetized at all. To give a value to the strength of the magnet if the spins have the values σ_1, σ_2, ..., σ_N, we sum all the spins in the configuration:

    Magnetic strength = ∑_{i=1}^{N} σ_i.    (2.7)

Of course, the system may be in any configuration with varying probabilities, so we define the magnetization to be the expected value of the magnetic strength of the system:

    M = 〈∑_{i=1}^{N} σ_i〉 = (1/Z_N) ∑_{σ_1, σ_2, ..., σ_N} ∑_{i=1}^{N} σ_i e^{−βH(σ_1, σ_2, ..., σ_N)}.    (2.8)

Again, the magnetization is directly proportional to N, so to find a value for it in the thermodynamic limit, we will divide by N and then take the limit to find the magnetization per site.

For most models, the system is translationally invariant, which means that, taken by itself, each spin in the system is indistinguishable from any other. If this is true, the expected value of any spin is equal to that of any other. Then the magnetization per site can be expressed as

    m = lim_{N→∞} (1/N) ∑_{i=1}^{N} 〈σ_i〉 = 〈σ_i〉    (2.9)

where i is arbitrary.

To express this in terms of the partition function, we look at the effect of an external magnetic field on the system. Such a field would affect indistinguishable spins equally, and in many models, this effect is expressed by a term of −H ∑_i σ_i in the Hamiltonian, where H is the strength of the external magnetic field. If this is the case, it is possible to express



Fig. 2.2: A square lattice.

the magnetization per site in terms of the free energy per site:

    −∂ψ/∂H = (∂/∂H) (kT lim_{N→∞} (1/N) ln Z_N)
           = kT lim_{N→∞} (1/N) (1/Z_N) ∂Z_N/∂H
           = kT lim_{N→∞} (1/(N Z_N)) ∑_{σ_1, σ_2, ..., σ_N} (β ∑_i σ_i) e^{−βH(σ_1, σ_2, ..., σ_N)}
           = kTβ lim_{N→∞} (1/N) 〈∑_i σ_i〉
           = lim_{N→∞} (1/N) N〈σ_i〉 = m    (2.10)

where we have used kTβ = 1 and the translational invariance of the spins,

and therefore m is another quantity which can be found from the partition function. However, in practice we often find it from Equation 2.9.

In general, it is difficult to solve such generalised models as the ones above, especially for systems with a large or infinite number of spins. Even making some simplifying assumptions still results in hard-to-solve models. One famous example of a solved model is the Ising model, or more precisely the square lattice spin-1/2 Ising model, although it is only solved in zero field and in two dimensions or less. In fact, this model, which was introduced by Ising in 1925 ([72]) as a one-dimensional model, is one of the most studied models in statistical mechanics. In this model, each of the N spins is situated on a vertex of a very large square lattice. Each of the vertices must contain a spin, as shown in Figure 2.2.

Each spin is represented by a single variable, which can only take the values 1 and −1. The model takes into account two interactions: an external magnetic field, of strength H, and a spin-spin interaction of strength J. The former interaction acts independently and equally on all spins; each spin that has value 1 contributes −H to the Hamiltonian, while each spin that has value −1 contributes H. Since a lower energy is more likely to occur, this means that the stronger (more positive) H is, the more likely spins are to take the value 1.

The latter interaction acts on all pairs of nearest-neighbour spins, i.e. spins that are



Fig. 2.3: V is a matrix which transfers a column of spins.

immediately adjacent to each other on the lattice. Each nearest-neighbour pair of spins that have the same value contributes −J to the energy, while each nearest-neighbour pair with different-valued spins contributes J. Again, the larger J is, the more likely nearest-neighbour pairs are to have the same spin. As these two interactions are the only ones taken into account in the Ising model, the Hamiltonian is

    H(σ_1, σ_2, ..., σ_N) = −J ∑_{<i,j>} σ_i σ_j − H ∑_i σ_i    (2.11)

where the first sum is over all pairs of spins i and j which are adjacent to each other on the lattice. H defines the model, as the partition function can be expressed in terms of it.

The partition function of the zero-field (H = 0) version of the Ising model was calculated in a landmark paper by Onsager in 1944 ([114]), by using the notion of transfer matrices. The transfer matrix for a model on a square lattice is a matrix which has one row/column corresponding to every possible configuration of spin states along one column of the lattice. Each of its elements is the Boltzmann weight of a single column with its spins fixed. As such, when we multiply a vector containing the weights of part of the lattice by the transfer matrix, the net effect is of ‘adding on’ the weight of a single column, as illustrated in Figure 2.3. It turns out that this ‘reduces’ the problem to that of finding the largest eigenvalue of an infinite-dimensional transfer matrix. We will give more details about transfer matrices in Section 2.2.

Onsager was able to find the eigenvalue of the transfer matrix, resulting in a solution for the zero-field Ising model. He also stated, without proof, an expression for the magnetization in zero field (also called the spontaneous magnetization), but the proof was first published by Yang in 1952 ([144]). However, the more general case of arbitrary field remains unsolved, despite all efforts since that time. So we see that even a relatively simple model, with only two interactions, can be very tricky to solve indeed.

However, even though solving a model exactly is obviously most desirable, it is not the only way to obtain information about the properties of the magnet. Another possible way of obtaining information is by numerical calculation. To do this for the Ising model, we


give numerical values to the interaction strengths J and H, instead of leaving them as variables. Then we use various numerical methods to calculate the partition function, or any other wanted properties, at those values. One such method that can be used is the finite lattice method (which we will discuss later in this section). Another is the corner transfer matrix method, which is the subject of this chapter.

Another, more sophisticated, way that we can gain information about properties of the model, without actually solving it, is to use series expansions. With this technique, we express the partition function (or other quantities of interest) as a series in terms of a variable or variables. These variables depend on the strength of the interactions.

We can usually calculate the first few terms of the series of interest exactly. To illustrate, we define the variables u = e^{−2βJ} and µ = e^{−2βH}. Then we can manipulate the partition function of the Ising model in the following way:

    Z_N = ∑_{σ_1, σ_2, ..., σ_N} exp(βJ ∑_{<i,j>} σ_i σ_j + βH ∑_i σ_i)
        = ∑_{σ_1, σ_2, ..., σ_N} ∏_{<i,j>} e^{βJσ_iσ_j} ∏_i e^{βHσ_i}
        = e^{2NβJ + NβH} ∑_{σ_1, σ_2, ..., σ_N} ∏_{<i,j>} (e^{−βJ})^{1−σ_iσ_j} ∏_i (e^{−βH})^{1−σ_i}
        = e^{2NβJ + NβH} ∑_{σ_1, σ_2, ..., σ_N} ∏_{<i,j>} u^{(1−σ_iσ_j)/2} ∏_i µ^{(1−σ_i)/2}
        = e^{2NβJ + NβH} (1 + N u^4 µ + 2N u^6 µ^2 + (N(N − 5)/2) u^8 µ^2 + ...).    (2.12)

Because the spins can only take the values 1 or −1, the expression (1 − σ_i)/2 will be 0 when σ_i is 1, and 1 when σ_i is −1. So the power of µ in the term arising from a specific configuration counts the number of spins in that configuration which have value −1. Similarly, the power of u in the term arising from a configuration counts the number of nearest-neighbour bonds where the spins are unequal. Since the result is a series about the point u = µ = 0, this series expansion is called the low-temperature expansion — when the temperature is low, β is large and the variables u and µ are small. We can also think of it as a high-field expansion — when the external field strength is high, µ will approach 0.

The first few terms in the series can be derived by inspection: there is only one configuration with no powers of u or µ, i.e. the configuration where the spins are all 1 (and therefore all nearest-neighbour bonds have equal spins — also known as like neighbours). The next term can be achieved by setting all the spins to 1 and then flipping one spin. As there are N spins, there are N ways to do this. In this new arrangement, there is exactly one spin with value −1 (which therefore contributes a factor of µ), and exactly four unlike nearest-neighbour bonds (which each contribute a factor of u), since we are on a square lattice. We show this in Figure 2.4(a).

The next couple of terms come from setting exactly two spins to have value −1. If the


Fig. 2.4: Calculating the first few terms of the low-temperature series for the Ising model partition function: (a) one negative spin; (b) two adjacent negative spins; (c) two separated negative spins. Hollow circles denote spins with value −1, and dashed bonds denote unlike bonds.

two spins are adjacent, which can happen in 2N ways, there will be 6 unlike bonds (the bond between the spins themselves contains two identical spins of −1). This is shown in Figure 2.4(b). On the other hand, if they are not adjacent, there will be 8 unlike bonds. This can happen in N(N − 5)/2 ways, since having chosen the first spin, there are 5 spins which cannot be chosen (four are adjacent to the first spin and one is the spin itself). We can continue in this way to derive the first few terms of Z_N, but the rapid increase in possible configurations means that we cannot derive more than a few terms by hand.
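These counts can be confirmed mechanically: the sketch below (our own verification, feasible only for very small lattices) enumerates every configuration of a small toroidal lattice and tallies it by its powers of u and µ, recovering the low-order coefficients in Equation 2.12:

    from itertools import product
    from collections import Counter

    def low_temp_coefficients(rows, cols):
        """Tally configurations of a toroidal Ising lattice by
        (number of unlike bonds, number of -1 spins) = (power of u, power of mu)."""
        coeffs = Counter()
        for spins in product((1, -1), repeat=rows * cols):
            g = [spins[r * cols:(r + 1) * cols] for r in range(rows)]
            unlike = sum(g[r][c] != g[r][(c + 1) % cols]
                         for r in range(rows) for c in range(cols)) \
                   + sum(g[r][c] != g[(r + 1) % rows][c]
                         for r in range(rows) for c in range(cols))
            coeffs[(unlike, spins.count(-1))] += 1
        return coeffs

    c, N = low_temp_coefficients(4, 4), 16
    assert c[(0, 0)] == 1                 # all spins 1
    assert c[(4, 1)] == N                 # one flipped spin: N u^4 mu
    assert c[(6, 2)] == 2 * N             # adjacent pair: 2N u^6 mu^2
    assert c[(8, 2)] == N * (N - 5) // 2  # separated pair: N(N-5)/2 u^8 mu^2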

As can be seen, the partition function becomes infinite in the thermodynamic limit. So we can take its Nth root in the thermodynamic limit to get the partition function per site, denoted by κ:

    κ = lim_{N→∞} Z_N^{1/N} = e^{2βJ + βH} (1 + u^4 µ + 2u^6 µ^2 − 2u^8 µ^2 + ...).    (2.13)

For small u and µ, the first few terms of the series provide a good approximation to the partition function. Of course, as the variables become larger, this approximation becomes worse and worse. Thus we would like to generate as many terms of the series as possible. Obviously, just counting the numbers of various configurations rapidly becomes infeasible. If we want longer series, we must use more sophisticated methods than direct enumeration. Again, both the finite lattice method and the corner transfer matrix method provide such methods.

Up to now, the most commonly used method for finding either numerics or series for these models has been the finite lattice method (or FLM). This method was developed by de Neef in his PhD thesis in 1975 ([40]) and by de Neef and Enting in 1977 ([41]). At its core, the method exploits the relation between the partition function per site and a number of partition functions of small finite lattices. For instance, from [50] (which is also a review of the FLM), on the square lattice we have

    κ ≃ ∏_{i=1}^{n−1} Z_{i,n−i} ∏_{i=1}^{n−2} Z_{i,n−1−i}^{−3} ∏_{i=1}^{n−3} Z_{i,n−2−i}^{3} ∏_{i=1}^{n−4} Z_{i,n−3−i}^{−1}    (2.14)



Fig. 2.5: A corner transfer matrix incorporates the weight of a quarter of the plane.

where Z_{i,j} is the partition function of a finite rectangular lattice with i rows and j columns. If this relation is used in series calculations, the first few terms of the right-hand side will be exactly the same as the corresponding terms on the left. The number of these terms depends on n — the larger n is, the better the approximation. To find the partition functions for the finite lattices, a method based on transfer matrices is used.

In general, the finite lattice method is an exponential time algorithm, i.e. to generate exactly n terms in a desired series, the finite lattice method takes approximately α^n time for some α. The method has since been optimised by Jensen, Guttmann and Enting ([76] and [77]), who have reduced α by a fair amount. However, the method is unlikely to ever become more efficient than exponential time. We give more details of the finite lattice method in Section 3.2.

The corner transfer matrix method is potentially much more efficient than the finite lattice method, but unfortunately its full potential has never really been fulfilled. It was developed originally by Baxter ([12]) and Baxter and Enting ([19]) in two papers in 1978 and 1979. It is based on the idea of transfer matrices, but with a twist. Whereas a transfer matrix transfers one column of spins to another, while adding the weight of the column in between, the corner transfer matrix transfers ‘half a column’ of spins (which is a column of spins which extends to infinity in only one direction) into half a row of spins, or vice versa, where the column and row are joined at the end, as shown in Figure 2.5. This adds the weight of a quarter of the plane.

Using corner transfer matrices, and a lot of manipulation, Baxter derived a set of equations that he called the CTM equations. The solution of these equations will yield the partition function per site; the main difficulty is that they are equations in infinite-dimensional matrices. However, limiting the matrices to finite size gives an approximate solution which can then be converted into an approximation for κ. Furthermore, if we use series, this approximation gives the first few series terms of κ.

These equations apply to a certain class of models which are known as interaction round a face (IRF) models. These are models where all interactions can be described by their effect on a single cell — in the case of the square lattice, a cell is a single square. The most general IRF model has a different interaction for every different combination of spins in the cell. We show this in Figure 2.6.

For example, the Ising model is an IRF model, because its external field interaction


Fig. 2.6: IRF models can be described solely by their effect on a single cell: (a) one- and two-spin interactions; (b) two three-spin interactions; (c) the other two three-spin interactions; (d) the four-spin interaction. All of these interactions apply at the same time.

applies to single spins (which always lie within a single cell) and its spin-pair interaction applies to nearest-neighbour pairs of spins (which, again, always lie within a single cell). In terms of the most general model, the Ising model can be described by J_1 = J_2, J′_1 = J′_2 = K_1 = K_2 = K_3 = K_4 = L = 0. An example of a model which is not IRF is any model on the square lattice which has an interaction between spins which are two units apart (counting the side of a cell as one unit).

The corner transfer matrix approach was actually used as far back as 1968, when Baxter ([9]) applied a similar method to find numerics for a dimerization problem. In 1976, Kelland ([82]) also used a similar idea to find numerical data for the Potts model. However, it was not until the landmark paper in 1978 ([12]) that the CTM equations began to be used for finding series.

In this paper, Baxter gave an iterative method for solving the CTM equations up to a certain order. Unfortunately this general method (involving, among other things, finding eigenvalues and eigenvectors of matrices with series elements) was not extremely efficient. In [19], a new, faster method was proposed for the square lattice Ising model, which worked very well indeed. This method was used to generate low-temperature expansions in this paper; in a subsequent paper Enting and Baxter ([51]) also found the high-field expansion at certain temperatures up to order 35. However, the method took advantage of Ising model properties that could not be transferred to other models.

In two further papers ([20] and [22]), various algorithms based on the CTM equations were applied to the hard squares model and the hard hexagons model. In the case of the hard squares model, so many series terms were derived for the partition function that to this day, the number of terms has not been bettered, even with access to enormously more powerful computers than those of 25 years ago! For the hard hexagon model, even better was achieved: because of eigenvalue redundancy in the matrices, the model was actually solved exactly (in [13]). In 1999 ([18]), Baxter returned to these models, calculating numerical values for the partition functions at z = 1 (where z is the fugacity). He managed to calculate 43 decimal places for the hard squares model, and 39 for the equivalent model on the hexagonal lattice.

The corner transfer matrix ideas and methods were also applied to a few other models — Baxter applied them to the chiral Potts model in the early 90s ([15], [17], and [16]), the 8-vertex model in 1977 ([10] and [11]), and in 1984 Baxter and Forrester ([21]) applied the idea to the 3-dimensional Ising model for one-dimensional matrices. However, because the method had to be modified in different ways for each different model, it was difficult to apply it to a wide range of models. Furthermore, it was much easier to implement the better-understood finite lattice method. So despite having the potential to generate large numbers of series coefficients, the CTM methods were not widely used.

More recently, in 1996 Nishino and Okunishi ([105], [111] and [106]) combined the corner transfer matrix idea with another numerical matrix method, the density matrix renormalization group method. This method was invented by White in 1992 ([139] and [140]) for one-dimensional quantum lattice models, and applied to two-dimensional models by Nishino in 1995 ([103]). A review can also be found in [121].

These two methods were combined to produce the ‘corner transfer matrix renormalization



Fig. 2.7: A one-dimensional lattice.

group method’, or CTMRG for short. To do this, Nishino and Okunishi stripped away Baxter's CTM equations and worked on the matrices which were in the equations. The result was a method which was very efficient, but was applied only for numerical results (rather than for generating series). In [105], they applied the method to the Ising model as an illustration. In [106], a few calculations were made for q = 2, 3 Potts models.

In further papers, the CTMRG was (numerically) applied to the q = 5 Potts model ([108]), the 3D Ising model ([109]), the spin-3/2 Ising model ([131]), and the two-layer Ising model ([93]). The eigenvalue distribution of the CTM matrices was also studied in [113]. In [107], [110], [104], and [59], the method was also converted to 3-dimensional lattices, by expressing the variational state as a product of tensors.

Foster and Pinettes ([58] and [57]) also used this method in 2003. In these papers, they applied the CTMRG to self-avoiding walks, which can be expressed as IRF models. However, they too did not use the method to generate series.

In this chapter we present our attempts to find a general method, based on the CTM equations, that will yield results for both series and numerical calculations. In Section 2.2, we give some background on transfer matrices. After some background in Section 2.3, we re-derive Baxter's CTM equations in Section 2.4. In Section 2.5 we show why the CTM equations provide the solution to the model. In Section 2.6, we give an alternate method of calculating the partition function and other quantities if not all variables are known. In Section 2.7 we solve the equations in a simple case. Next, in Section 2.8, we give a short background on some matrix algorithms. In Section 2.9, we detail an iterative method of solving the CTM equations, and discuss its convergence properties and the results we have obtained with it. In Section 2.10, we detail another method of solving the CTM equations, based on the CTMRG method. We also discuss convergence properties and the results we have obtained. Finally, in Section 2.11, we look back at what we have done and consider some future avenues of research.

2.2 Transfer matrices

As the corner transfer matrix method is based on (unsurprisingly) corner transfer matrices, it would be handy to have a brief introduction to the theory of (regular) transfer matrices. This grounding will also come in handy when we explain the finite lattice method in Section 3.2.

To illustrate this concept, we will solve the 1-dimensional (spin-1/2) Ising model. This model is identical to the square-lattice Ising model, except that the spins are situated evenly on a single line (hence 1-dimensional). This means that each spin has 2 nearest neighbours. We show part of this lattice in Figure 2.7.

We take the number of spins to be N, and label the spins from 1 to N, starting at the left-most spin and moving from left to right. We denote the value of spin i by σ_i. Again, σ_i can only take the values −1 and 1.

There are still two interactions: an external field interaction of strength H, and a spin-pair interaction which acts on nearest-neighbour pairs with strength J. We will assume a circular boundary condition, so that we identify the spins N + 1 and 1 as the same spin.

Similarly to the 2-dimensional Ising model, the Hamiltonian for a configuration of spins with values σ_1, σ_2, ..., σ_N is

    H(σ_1, σ_2, ..., σ_N) = −H ∑_i σ_i − J ∑_{<i,j>} σ_i σ_j
                          = −∑_{i=1}^{N} (Hσ_i + Jσ_iσ_{i+1}).    (2.15)

Equivalently, the partition function is

    Z_N = ∑_{σ_1, σ_2, ..., σ_N} ∏_{i=1}^{N} e^{βHσ_i} e^{βJσ_iσ_{i+1}}.    (2.16)

Notice that this partition function is broken down into a product of contributions from each spin and its neighbour on the right — the one-dimensional equivalent of a cell. But what is the individual contribution from one cell? Naturally, this depends on what values the spins in the cell take. Thus we can set up a matrix of contributions, such that each element corresponds to a particular set of values of spins in the cell — more precisely, we define the matrix V so that

    V_{σ_i, σ_{i+1}} = e^{(βH/2)(σ_i + σ_{i+1})} e^{βJσ_iσ_{i+1}}    (2.17)

is the contribution to the partition function from a cell with spin values σ_i and σ_{i+1}. Note that we halve the external field interaction, because each single spin belongs to 2 different cells. Splitting the term in this way ensures that the transfer matrix is symmetric. Writing out this matrix, with rows indexed by σ_i and columns by σ_{i+1} in the order (1, −1), gives us

    V = ( e^{βH} e^{βJ}    e^{−βJ}         )
        ( e^{−βJ}          e^{−βH} e^{βJ}  )    (2.18)


Now we can break down Z_N into the individual contributions from each cell:

    Z_N = ∑_{σ_1, σ_2, ..., σ_N} ∏_{i=1}^{N} e^{βHσ_i} e^{βJσ_iσ_{i+1}}
        = ∑_{σ_1, σ_2, ..., σ_N} ∏_{i=1}^{N} e^{(βH/2)(σ_i + σ_{i+1})} e^{βJσ_iσ_{i+1}}
        = ∑_{σ_1, σ_2, ..., σ_N} ∏_{i=1}^{N} V_{σ_i, σ_{i+1}}
        = ∑_{σ_1, σ_2, ..., σ_N} V_{σ_1,σ_2} V_{σ_2,σ_3} ... V_{σ_N,σ_{N+1}}
        = ∑_{σ_1} (V^N)_{σ_1, σ_{N+1}}
        = ∑_{σ_1} (V^N)_{σ_1, σ_1}
        = Tr V^N    (2.19)

since we identify spin N + 1 with spin 1. From this equation, it can be seen why V is called the transfer matrix — if we were to calculate Z_N by multiplying the contributions of each cell in turn from left to right, then at any one time, multiplying by the appropriate element of V would add the contribution of the current cell, thus ‘transferring’ the calculation one cell to the right.
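Equation 2.19 is easy to check numerically for small N; a quick sketch (with arbitrary parameter values of our choosing):

    import numpy as np
    from itertools import product

    beta, J, H = 0.7, 1.0, 0.3
    # The transfer matrix of Equation 2.18.
    V = np.array([[np.exp(beta * (H + J)), np.exp(-beta * J)],
                  [np.exp(-beta * J),      np.exp(beta * (J - H))]])

    def Z_direct(n):
        """Brute-force partition function of the circular n-spin chain."""
        return sum(np.exp(beta * sum(H * s[i] + J * s[i] * s[(i + 1) % n]
                                     for i in range(n)))
                   for s in product((1, -1), repeat=n))

    for n in (3, 5, 8):
        assert np.isclose(Z_direct(n), np.trace(np.linalg.matrix_power(V, n)))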

Now the trace of a matrix is the sum of the eigenvalues of that matrix. If we let the eigenvalues of V be λ_1 and λ_2, where λ_1 ≥ λ_2, then the eigenvalues of V^N are λ_1^N and λ_2^N. Then we can calculate the partition function per site in the thermodynamic limit:

    κ = lim_{N→∞} Z_N^{1/N}
      = lim_{N→∞} (Tr V^N)^{1/N}
      = lim_{N→∞} (λ_1^N + λ_2^N)^{1/N}
      = lim_{N→∞} λ_1 (1 + (λ_2/λ_1)^N)^{1/N}
      = λ_1.    (2.20)

This is why the transfer matrix is so useful — instead of calculating an infinite partition function and then taking the Nth root as N → ∞, all we need to do is merely find the largest eigenvalue of the transfer matrix. For the one-dimensional Ising model, the transfer



Fig. 2.8: One-dimensional transference.

matrix is only 2 × 2, and therefore the eigenvalue is easily found:

    κ = λ_1 = e^{βJ} cosh βH + √(e^{2βJ} sinh^2 βH + e^{−2βJ}).    (2.21)
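The closed form agrees with a direct numerical diagonalization of the 2 × 2 matrix (again a quick check at arbitrary parameter values):

    import numpy as np

    beta, J, H = 0.7, 1.0, 0.3
    V = np.array([[np.exp(beta * (H + J)), np.exp(-beta * J)],
                  [np.exp(-beta * J),      np.exp(beta * (J - H))]])

    lam1 = max(np.linalg.eigvalsh(V))  # V is symmetric, so eigvalsh applies
    closed = (np.exp(beta * J) * np.cosh(beta * H)
              + np.sqrt(np.exp(2 * beta * J) * np.sinh(beta * H)**2
                        + np.exp(-2 * beta * J)))
    assert np.isclose(lam1, closed)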

For other models, the transfer matrix is set up in a similar way. The idea is that the transfer matrix represents a section of the lattice which has the property that by adding that section to itself over and over again (in one way only), eventually the entire lattice will be covered. This is easy to achieve for a one-dimensional lattice, as the transfer matrix represents one cell, and the lattice is composed entirely of cells laid end-to-end in one direction.

Another way we can consider this is to think of a cut in the lattice. By ‘cutting’ the lattice on a line perpendicular to the lattice, we generate a cut which (for this lattice) consists of one spin. The transfer matrix then transfers a cut of one site to an equivalent cut one site to the right while adding the weight of the cell in between, as shown in Figure 2.8.

For the two-dimensional square lattice, we cut the lattice with a vertical line. This means that the transfer matrix represents one column of the lattice, as shown in Figure 2.9. If the lattice consists of m rows and n columns of spins, then the cut will have m spins, rather than one.

Now, each row and column of V will represent one possible set of values for the m spins on the cut. We say that V is indexed by m spins. In particular, this means that V has dimension 2^m × 2^m. For the Ising model, the element of V corresponding to row (σ_1, σ_2, ..., σ_m) and column (σ′_1, σ′_2, ..., σ′_m) is

    V_{(σ_1, σ_2, ..., σ_m), (σ′_1, σ′_2, ..., σ′_m)} = ∏_{i=1}^{m} e^{(βH/2)(σ_i + σ′_i)} e^{(βJ/2)(σ_iσ_{i+1} + σ′_iσ′_{i+1} + 2σ_iσ′_i)}.    (2.22)

This is the contribution to the partition function of a column with the spins σ_1, σ_2, ..., σ_m on the left and σ′_1, σ′_2, ..., σ′_m on the right. Therefore we say that V adds the weight of a column.

Again, by building up one column at a time, we can eventually cover the entire lattice. Then we can proceed as before, except that the transfer matrix V will have dimension 2^m × 2^m, and therefore 2^m eigenvalues. However, the calculations stay mostly the same, with the one exception being that the power of V is the number of columns in the lattice rather



Fig. 2.9: Two-dimensional transference.

than the number of sites:

    Z_{m,n} = Tr V^n.    (2.23)

There is a possibility that there may be no one maximum eigenvalue for V, i.e. the maximum eigenvalue may be degenerate. This does not affect the calculations, however, as Tr V^n would still grow like λ_1^n (albeit with a constant multiple). We can then continue with the calculations as above (with at most 2^m − 1 non-dominant eigenvalues instead of 1) to reach the relation

    κ = λ_1^{1/m}.    (2.24)

So again we only need to find the maximum eigenvalue of the transfer matrix to solve the model.
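To make this concrete, the sketch below builds the 2^m × 2^m column transfer matrix of Equation 2.22 (imposing a cyclic boundary within the column, a choice of ours for simplicity) and watches λ_1^{1/m} settle down as m grows:

    import numpy as np
    from itertools import product

    def column_transfer_matrix(m, beta, J, H):
        """The 2^m x 2^m Ising column transfer matrix of Equation 2.22,
        with spin m + 1 identified with spin 1 inside the column."""
        states = list(product((1, -1), repeat=m))
        V = np.empty((2**m, 2**m))
        for a, s in enumerate(states):
            for b, t in enumerate(states):
                e = sum(beta * H / 2 * (s[i] + t[i])
                        + beta * J / 2 * (s[i] * s[(i + 1) % m]
                                          + t[i] * t[(i + 1) % m]
                                          + 2 * s[i] * t[i])
                        for i in range(m))
                V[a, b] = np.exp(e)
        return V

    for m in (2, 4, 6):
        V = column_transfer_matrix(m, beta=0.3, J=1.0, H=0.0)
        print(m, max(np.linalg.eigvalsh(V)) ** (1 / m))  # approximates kappa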

It is worth noting that although the transfer matrices in this section have been column transfer matrices, it is entirely possible to have transfer matrices which act identically on rows. This makes them row transfer matrices.

2.3 A variational result

An important result in the development of the CTM equations is proved in the following theorem, which gives an expression for the maximum eigenvalue of a symmetric matrix. It is a part of the Courant-Fischer theorem, which provides an expression for all eigenvalues.

Theorem 2.3.1. If A is an n × n real symmetric matrix with maximum eigenvalue λ_1, then

    λ_1 = max_{x≠0, x^T x = 1} x^T A x = max_{x≠0} (x^T A x)/(x^T x).    (2.25)

The value of x which maximises the above expression is the eigenvector of A corresponding to λ_1.

Proof. For the first part, we follow the proof of Wilkinson in [142, pp. 98-99]. Since A is symmetric, it must be orthogonally diagonalizable, so there exists an orthogonal matrix P such that

    P^T A P = diag(λ_1, λ_2, ..., λ_n)    (2.26)

where λ_1, λ_2, ..., λ_n are the eigenvalues of A, arranged in non-increasing order, i.e.

    λ_1 ≥ λ_2 ≥ ··· ≥ λ_n.    (2.27)

Now let x be an arbitrary n-dimensional vector, and let y = P^T x. Since P is orthogonal, we know that x = Py. Then we have

    x^T A x = y^T P^T A P y = y^T diag(λ_1, λ_2, ..., λ_n) y = ∑_{i=1}^{n} λ_i y_i^2    (2.28)

and

    x^T x = y^T P^T P y = ∑_{i=1}^{n} y_i^2.    (2.29)

Now, by construction, y = 0 if and only if x = 0. Since x is arbitrary,

    max_{x≠0, x^T x = 1} x^T A x = max_{y≠0, ∑_i y_i^2 = 1} ∑_{i=1}^{n} λ_i y_i^2 = λ_1    (2.30)

which occurs when y = (±1, 0, 0, ..., 0)^T.

Now if we set x′ = P(1, 0, ..., 0)^T, x′ corresponds to the maximising y in the above equation, so x′ maximises x^T A x when x^T x = 1. Since

    ((x′)^T A x′)/((x′)^T x′) = (x′)^T A x′    (2.31)

we can immediately say that

    max_{x≠0} (x^T A x)/(x^T x) ≥ max_{x≠0, x^T x = 1} x^T A x.    (2.32)

Suppose, however, that the inequality is strict and we have a vector x_2 such that

    (x_2^T A x_2)/(x_2^T x_2) > (x′)^T A x′.    (2.33)


Then this implies

    (x_2/||x_2||)^T A (x_2/||x_2||) > (x′)^T A x′,    (2.34)

contradicting the fact that x′ maximises x^T A x when x^T x = 1. Therefore the inequality is an equality, and

    max_{x≠0, x^T x = 1} x^T A x = max_{x≠0} (x^T A x)/(x^T x)    (2.35)

which proves the remaining part of the equation.

Now, if x_1 is the eigenvector of A corresponding to λ_1, then

    (x_1^T A x_1)/(x_1^T x_1) = (x_1^T λ_1 x_1)/(x_1^T x_1) = λ_1    (2.36)

so x_1 solves the maximisation problem.
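Numerically, the theorem is easy to test: for a random symmetric matrix, the Rayleigh quotient never exceeds λ_1 and attains it at the dominant eigenvector (a quick check of ours with numpy):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((6, 6))
    A = (A + A.T) / 2                     # make A symmetric

    eigvals, eigvecs = np.linalg.eigh(A)  # eigh sorts eigenvalues ascending
    lam1, x1 = eigvals[-1], eigvecs[:, -1]

    rayleigh = lambda x: x @ A @ x / (x @ x)
    assert all(rayleigh(rng.standard_normal(6)) <= lam1 + 1e-12
               for _ in range(1000))
    assert np.isclose(rayleigh(x1), lam1)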

2.4 The CTM equations

The basis of the CTM methods rests on the matrix equations which underlie the method, known as the CTM equations. These equations were first derived by Baxter in [12]. In this section, we will follow in detail Baxter's derivations for the 2-dimensional square lattice.

2.4.1 An expression for the partition function

Firstly, we define the lattice. We assume that we are working on an m × n lattice with toroidal boundary conditions, so that row m + 1 is identified with row 1, and column n + 1 is identified with column 1, as shown in Figure 2.10. Ultimately, we would like to take the thermodynamic limit m, n → ∞, and find the partition function per site κ in this limit.

The CTM equations only apply to interaction round a face (IRF) models, so we will assume from now on that the model that we work on is an IRF model.

Let V be the column transfer matrix of the model, as discussed in Section 2.2. Diagrammatically, V ‘transfers’ a cut along a column by moving it one column to the right, and adding the Boltzmann weight of the column in between. In this manner, multiplying by V repeatedly will eventually give the partition function of the entire lattice, as shown in Figure 2.11.

Each column of the lattice has m spins, so the rows and columns of V are indexed by all possible states of these m spins. Suppose that the largest eigenvalue of V is Λ. Then, as shown in Section 2.2, the limiting partition function per site is related to Λ:

    κ = lim_{m,n→∞} Z_{m,n}^{1/mn} = lim_{m→∞} Λ^{1/m}.    (2.37)



Fig. 2.10: Toroidal boundary conditions.


Fig. 2.11: Multiple column transfer matrices.


Fig. 2.12: ω gives the weight of a single cell.

Thus, to find κ, we must first find Λ.

To find Λ, we break V up into smaller parts. This is possible because the model is an IRF model. We define ω(a, b; c, d) as the Boltzmann weight of the interactions around a single cell, where the spins are fixed to be a in the upper left-hand corner, b in the upper right-hand corner, c in the lower left-hand corner and d in the lower right-hand corner, as shown in Figure 2.12. Importantly, ω is the only variable in the equations which varies when the model changes.

Let us look at an example. In the hard squares model, there are two values a spin can take: 0, which stands for an ‘empty’ site, and 1, which stands for an ‘occupied’ site. Note that this is slightly different from the description in Chapter 1, which used spin values of 1 and −1. In this model, we do not allow two occupied sites to be adjacent to each other. We are generally interested in the number of occupied sites in the lattice, or in the thermodynamic limit, the number of occupied sites per site. Therefore we assign each occupied site a weight z in the partition function. Seeing that on the square lattice, each spin resides in 4 distinct cells, the weight of each spin in any one cell is divided by 4. Thus the weight of a single cell is

    ω(a, b; c, d) = { 0                 if a = b = 1, a = c = 1, b = d = 1 or c = d = 1,
                    { z^{(a+b+c+d)/4}   otherwise.    (2.38)
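Equation 2.38 translates directly into code (a direct transcription of ours, writing the cell spins as ω(a, b; c, d) with a, b on top and c, d below):

    def omega(a, b, c, d, z):
        """Hard squares cell weight, Equation 2.38: zero when two adjacent
        corners of the cell are both occupied, otherwise z raised to the
        quarter-share of each occupied corner."""
        if (a == b == 1) or (a == c == 1) or (b == d == 1) or (c == d == 1):
            return 0.0
        return z ** ((a + b + c + d) / 4)

    assert omega(1, 1, 0, 0, z=2.0) == 0.0       # adjacent occupied sites
    assert omega(1, 0, 0, 1, z=2.0) == 2.0**0.5  # diagonal occupation is fine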

We will use the hard squares model later as an example.

Since ω counts the weight of one face, and each element of V adds the weight of a column



Fig. 2.13: Decomposition of a column matrix into single cells.

of m faces, we can express each element of V as a product of m of these weights:

    V_{(σ_1, σ_2, ..., σ_m), (σ′_1, σ′_2, ..., σ′_m)} = ∏_{i=1}^{m} ω(σ_{i+1}, σ′_{i+1}; σ_i, σ′_i)    (2.39)

where the toroidal boundary conditions identify row m + 1 with row 1. Figure 2.13 shows how V breaks down into m cell weights.
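Equation 2.39 is a recipe for assembling V cell by cell; here is a sketch of ours for the hard squares weight of Equation 2.38 (self-contained, and only practical for small m):

    from itertools import product

    def omega(a, b, c, d, z=1.0):
        # Hard squares cell weight, Equation 2.38.
        if (a == b == 1) or (a == c == 1) or (b == d == 1) or (c == d == 1):
            return 0.0
        return z ** ((a + b + c + d) / 4)

    def column_matrix(m, z=1.0):
        """Assemble V from cell weights as in Equation 2.39,
        identifying row m + 1 with row 1."""
        states = list(product((0, 1), repeat=m))
        V = [[1.0] * len(states) for _ in states]
        for r, s in enumerate(states):
            for c, t in enumerate(states):
                for i in range(m):
                    V[r][c] *= omega(s[(i + 1) % m], t[(i + 1) % m],
                                     s[i], t[i], z)
        return V

    states = list(product((0, 1), repeat=3))
    V = column_matrix(3)
    # A column containing two vertically adjacent occupied sites has weight 0.
    assert all(w == 0.0 for w in V[states.index((1, 1, 0))])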

For many models, the weight of a configuration is invariant under the symmetries of the lattice. In other words, we can reflect or even rotate configurations and they will still contribute the same amount to the partition function (see Figure 2.14). We call such models undirected. In particular, if the model has reflection symmetry about the vertical axis, then the transfer matrix V is symmetric. We assume that this is the case for our model. Because V is symmetric, we can apply Theorem 2.3.1 to derive an expression for the maximum eigenvalue Λ:

    Λ = max_ψ (ψ^T V ψ)/(ψ^T ψ)    (2.40)

where ψ is a non-zero vector of dimension 2^m. The value of ψ which maximises the right-hand side is the eigenvector of V corresponding to Λ. ψ will be indexed in the same way as V, so there will be one element of ψ for each possible configuration of the m states on the cut.

We can graphically interpret the optimal value of ψ in Equation 2.40 as the ‘partition function’ of the half plane. Each element ψ_{σ_1,σ_2,...,σ_m} is the partition function (really the contribution to the partition function) of the left half of the plane, cut along a column, with the spins on the cut fixed at the values σ_1, σ_2, ..., σ_m. By interpreting the optimal ψ in this manner, it is easy to see how it is an eigenvector of V, as V adds one column to the half-plane, resulting in another half-plane. We show this in Figure 2.15.

It is difficult to evaluate the maximum in Equation 2.40 as ψ becomes infinite-dimensional in the thermodynamic limit m → ∞. However, it is possible to approximate Λ by using the above equation. Instead of maximising the expression over all possible ψs, we maximise instead over a subset of all possible ψs that we choose. We take this subset to be all ψ where



Fig. 2.14: Reflection symmetry. The weights of configurations (a) and (b) are identical in undirected models.


Fig. 2.15: At optimality, ψ is an eigenvector of V .

the elements satisfy

    ψ_{σ_1, σ_2, ..., σ_m} = Tr (F(σ_1, σ_2) F(σ_2, σ_3) ... F(σ_m, σ_1))    (2.41)

where F(a, b) is an arbitrary matrix of dimension 2^p, dependent on the two spins a and b and indexed by all possible values of a set of p spins. The only restriction that we place on F is the condition

    F^T(a, b) = F(b, a).    (2.42)

We will need this condition later to introduce a symmetry into our matrices. When p = 1, this expression for ψ is equivalent to the ansatz of Kramers and Wannier in [86] and [87].

If p > 1, then the space for ψ generated by this expression includes the spaces generated by all lower p. This follows because replacing F by the block matrix

    ( F  0 )
    ( 0  0 )

does not change the right-hand side of Equation 2.41.

We can interpret the optimal F(a, b) (i.e. the value of F(a, b) that gives the optimal value of ψ) graphically as a half-column or half-row transfer matrix. F(a, b) will take a cut of p spins, ending in a spin of value a, and transfer it one column (or row) to the right or left (down or up), adding the intermediate weight at the same time. The new cut also has p spins, but ends in a spin of value b. In this way it can be seen how the product of m F s becomes the half-plane partition function as p becomes large. We illustrate this in Figure 2.16.



Fig. 2.16: Decomposition of ψ into m F s.


Fig. 2.17: Full-row transfer matrix interpretation of R.

Now define R to be the 2^{2p+1} × 2^{2p+1} matrix (indexed by 2p + 1 spins) with the elements
\[
R_{(\lambda,a,\mu),(\lambda',b,\mu')} = F_{\lambda,\lambda'}(a,b)\, F_{\mu,\mu'}(a,b) \tag{2.43}
\]

where λ, λ′, µ, and µ′ are sets of p spins. At optimality, each element of R is the product of two half-row weights, so the optimal R has a graphical interpretation as a full-row transfer matrix, transferring 2p + 1 spins at a time, as shown in Figure 2.17.

Then if we let λ_i and µ_i denote sets of p spins, we have
\[
\begin{aligned}
\psi^T\psi &= \sum_{\sigma_1,\ldots,\sigma_m} \left[\operatorname{Tr}\left(F(\sigma_1,\sigma_2)\, F(\sigma_2,\sigma_3) \cdots F(\sigma_m,\sigma_1)\right)\right]^2 \\
&= \sum_{\sigma_1,\ldots,\sigma_m} \left[\sum_{\lambda_1,\ldots,\lambda_m} F_{\lambda_1,\lambda_2}(\sigma_1,\sigma_2)\, F_{\lambda_2,\lambda_3}(\sigma_2,\sigma_3) \cdots F_{\lambda_m,\lambda_1}(\sigma_m,\sigma_1)\right]^2 \\
&= \sum_{\substack{\sigma_1,\ldots,\sigma_m \\ \lambda_1,\ldots,\lambda_m \\ \mu_1,\ldots,\mu_m}} F_{\lambda_1,\lambda_2}(\sigma_1,\sigma_2)\, F_{\mu_1,\mu_2}(\sigma_1,\sigma_2) \cdots F_{\lambda_m,\lambda_1}(\sigma_m,\sigma_1)\, F_{\mu_m,\mu_1}(\sigma_m,\sigma_1) \\
&= \sum_{\substack{\sigma_1,\ldots,\sigma_m \\ \lambda_1,\ldots,\lambda_m \\ \mu_1,\ldots,\mu_m}} R_{(\lambda_1,\sigma_1,\mu_1),(\lambda_2,\sigma_2,\mu_2)}\, R_{(\lambda_2,\sigma_2,\mu_2),(\lambda_3,\sigma_3,\mu_3)} \cdots R_{(\lambda_m,\sigma_m,\mu_m),(\lambda_1,\sigma_1,\mu_1)} \\
&= \operatorname{Tr} R^m.
\end{aligned} \tag{2.44}
\]



Fig. 2.18: Full-row transfer matrix interpretation of S.

If we define ξ to be the dominant eigenvalue of R, then
\[
\lim_{m\to\infty} (\psi^T\psi)^{1/m} = \lim_{m\to\infty} (\operatorname{Tr} R^m)^{1/m} = \xi. \tag{2.45}
\]

Similarly, we define S to be the 2^{2p+2} × 2^{2p+2} matrix with the elements
\[
S_{(\lambda,a,b,\mu),(\lambda',c,d,\mu')} = \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} F_{\lambda,\lambda'}(a,c)\, F_{\mu,\mu'}(b,d). \tag{2.46}
\]

Then it can be proved in an identical way that
\[
\psi^T V \psi = \operatorname{Tr} S^m \tag{2.47}
\]
and if we define η to be the dominant eigenvalue of S, then
\[
\lim_{m\to\infty} (\psi^T V \psi)^{1/m} = \eta. \tag{2.48}
\]

The optimal S also has an interpretation as a full-row transfer matrix, except that it transfers a row of 2p + 2 spins, as shown in Figure 2.18.

Substituting the expressions found above into the formula for Λ,
\[
\Lambda = \max_{\psi} \frac{\psi^T V \psi}{\psi^T \psi} \ge \max_{F} \frac{\psi^T V \psi}{\psi^T \psi} = \max_{F} \frac{\operatorname{Tr} S^m}{\operatorname{Tr} R^m} \tag{2.49}
\]

and therefore
\[
\kappa = \lim_{m\to\infty} \Lambda^{1/m} \ge \max_{F} \frac{\eta}{\xi}. \tag{2.50}
\]

Not only is this a lower bound for κ, but we shall show later that as the dimension of the F matrices increases (as p → ∞), the expression on the right-hand side tends to κ, so at finite p it can be used as an approximation to the partition function per site. In fact, if we are using series, this approximation will also give the first few terms of κ correctly. This is the approximation that the CTM methods use.



Fig. 2.19: Half-plane transfer matrix interpretation of X.

2.4.2 Eigenvalue equations

Let X be the eigenvector of R corresponding to the eigenvalue ξ. X contains 2^{2p+1} elements, which can be rearranged to form a set of matrices X(a), where a is a spin. Each of these matrices has size 2^p × 2^p (indexed by p spins), and their elements are
\[
X_{\lambda,\mu}(a) = X_{(\lambda,a,\mu)}. \tag{2.51}
\]

Then the eigenvalue equation of R and X can be written as
\[
\begin{aligned}
\xi X_{\lambda,\mu}(a) &= [\xi X]_{(\lambda,a,\mu)} = [RX]_{(\lambda,a,\mu)} \\
&= \sum_{\lambda',b,\mu'} R_{(\lambda,a,\mu),(\lambda',b,\mu')}\, X_{(\lambda',b,\mu')} \\
&= \sum_{\lambda',b,\mu'} F_{\lambda,\lambda'}(a,b)\, F_{\mu,\mu'}(a,b)\, X_{\lambda',\mu'}(b) \\
&= \sum_{\lambda',b,\mu'} F_{\lambda,\lambda'}(a,b)\, X_{\lambda',\mu'}(b)\, F_{\mu',\mu}(b,a) \\
&= \left[\sum_b F(a,b)\, X(b)\, F(b,a)\right]_{\lambda,\mu}
\end{aligned} \tag{2.52}
\]

and therefore
\[
\xi X(a) = \sum_b F(a,b)\, X(b)\, F(b,a). \tag{2.53}
\]

This equation holds for all values of the spin a.

Graphically, we can think of the optimal X(a) (resulting from the optimal F(a,b)) as a half-plane transfer matrix, taking a half-row cut of p spins and rotating it around another spin with value a by an angle of π, while adding the Boltzmann weight of all the cells covered. In this way, it can be seen how the above equation holds at optimality: the right-hand side merely moves the half-row cut by one row before and after rotation, which is equivalent to adding one full row of 2p + 1 spins (which is exactly what R does). This still results in a half-plane transfer. We illustrate this in Figure 2.19.




Fig. 2.20: Half-plane transfer matrix interpretation of Y .

Now let Y be the eigenvector of S corresponding to η. As above, Y can be written as a set of 2^p × 2^p matrices Y(a,b). In an identical fashion, but with Y s replacing Xs and Ss replacing Rs, it can be shown that
\[
\eta Y(a,b) = \sum_{c,d} \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} F(a,c)\, Y(c,d)\, F(d,b). \tag{2.54}
\]

This has a nearly identical graphical interpretation at optimality, except that the half-row cut is now rotated around two spins, with fixed values a and b. We show this in Figure 2.20.

We note that taking the transpose of Equation 2.53 gives the (eigenvalue) equation
\[
\xi X^T(a) = \sum_b F(a,b)\, X^T(b)\, F(b,a). \tag{2.55}
\]

This means that the vector corresponding to X^T(a) is also an eigenvector of R corresponding to ξ. Since it contains the same entries as X, it seems reasonable that it is X. Translating this in terms of X(a) gives
\[
X^T(a) = X(a) \tag{2.56}
\]

and, similarly, since we are assuming reflection symmetry in our model so that
\[
\omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \omega\!\begin{pmatrix} b & a \\ d & c \end{pmatrix},
\]
we get
\[
Y^T(a,b) = Y(b,a). \tag{2.57}
\]

For optimal matrices, this again makes sense graphically: X^T(a) transfers the same region of the plane as X(a) does, but moves the cut in the opposite direction. The same argument also applies to Y(a,b).

2.4.3 Stationarity

Equations 2.53 and 2.54 provide expressions that we can use to evaluate ξ and η, given F. However, not only do we want to calculate ξ and η, but we must also maximise η/ξ over all possible Fs. To achieve this, η/ξ must be stationary with respect to F. This means that at the optimal point, we must have
\[
\frac{\partial(\eta/\xi)}{\partial F} = 0. \tag{2.58}
\]

Here we take the derivative of a scalar with respect to a matrix to mean the matrix formed by differentiating the scalar by each of the matrix elements in turn; in other words,
\[
\left[\frac{df(A)}{dA}\right]_{i,j} = \frac{df(A)}{dA_{i,j}}. \tag{2.59}
\]

Then we need
\[
\frac{(\partial\eta/\partial F)\,\xi - \eta\,(\partial\xi/\partial F)}{\xi^2} = 0 \tag{2.60}
\]
which, after rearranging, becomes
\[
\frac{\partial\eta/\partial F_{\lambda,\mu}(a,b)}{\partial\xi/\partial F_{\lambda,\mu}(a,b)} = \frac{\eta}{\xi}. \tag{2.61}
\]

Now since X is the eigenvector of R corresponding to ξ, we know that
\[
\begin{aligned}
\xi &= \frac{X^T R X}{X^T X} = \frac{\sum_{\lambda,a,\mu} X_{(\lambda,a,\mu)}\,[RX]_{(\lambda,a,\mu)}}{\sum_{\lambda,a,\mu} X^2_{\lambda,\mu}(a)} \\
&= \frac{\sum_{\lambda,a,b,\mu} X^T_{\mu,\lambda}(a)\,[F(a,b)\, X(b)\, F(b,a)]_{\lambda,\mu}}{\sum_{\lambda,a,\mu} X^T_{\mu,\lambda}(a)\, X_{\lambda,\mu}(a)} \\
&= \frac{\sum_{a,b} \operatorname{Tr} X^T(a)\, F(a,b)\, X(b)\, F(b,a)}{\sum_a \operatorname{Tr} X^T(a)\, X(a)}.
\end{aligned} \tag{2.62}
\]

Moreover, since
\[
R_{(\lambda,a,\mu),(\lambda',b,\mu')} = F_{\lambda,\lambda'}(a,b)\, F_{\mu,\mu'}(a,b) = F_{\lambda',\lambda}(b,a)\, F_{\mu',\mu}(b,a) = R_{(\lambda',b,\mu'),(\lambda,a,\mu)}, \tag{2.63}
\]

we know that R is symmetric. Therefore, from Theorem 2.3.1, we know that X maximises the expression X^T R X / X^T X, and so ∂ξ/∂X = 0. Therefore we can take all the elements of X as constants when differentiating the above expression for ξ by F. This leads to

\[
\begin{aligned}
\frac{\partial\xi}{\partial F_{\lambda,\mu}(a,b)} &= \frac{1}{\sum_c \operatorname{Tr} X^2(c)}\, \frac{\partial}{\partial F_{\lambda,\mu}(a,b)} \sum_{\substack{a_1,a_2 \\ \lambda_1,\lambda_2,\lambda_3,\lambda_4}} X_{\lambda_1,\lambda_2}(a_1)\, F_{\lambda_2,\lambda_3}(a_1,a_2)\, X_{\lambda_3,\lambda_4}(a_2)\, F_{\lambda_4,\lambda_1}(a_2,a_1) \\
&= \frac{1}{\sum_c \operatorname{Tr} X^2(c)} \sum_{\substack{a_1,a_2 \\ \lambda_1,\lambda_2,\lambda_3,\lambda_4}} X_{\lambda_1,\lambda_2}(a_1)\, X_{\lambda_3,\lambda_4}(a_2) \left( \frac{\partial F_{\lambda_2,\lambda_3}(a_1,a_2)}{\partial F_{\lambda,\mu}(a,b)}\, F_{\lambda_4,\lambda_1}(a_2,a_1) + \frac{\partial F_{\lambda_4,\lambda_1}(a_2,a_1)}{\partial F_{\lambda,\mu}(a,b)}\, F_{\lambda_2,\lambda_3}(a_1,a_2) \right) \\
&= \frac{2}{\sum_c \operatorname{Tr} X^2(c)} \sum_{\substack{a_1,a_2 \\ \lambda_1,\lambda_2,\lambda_3,\lambda_4}} X_{\lambda_1,\lambda_2}(a_1)\, F_{\lambda_2,\lambda_3}(a_1,a_2)\, X_{\lambda_3,\lambda_4}(a_2)\, \frac{\partial F_{\lambda_4,\lambda_1}(a_2,a_1)}{\partial F_{\lambda,\mu}(a,b)} \\
&= \frac{2}{\sum_c \operatorname{Tr} X^2(c)} \sum_{\substack{a_1,a_2 \\ \lambda_1,\lambda_4}} [X(a_1)\, F(a_1,a_2)\, X(a_2)]_{\lambda_1,\lambda_4}\, \frac{\partial F_{\lambda_1,\lambda_4}(a_1,a_2)}{\partial F_{\lambda,\mu}(a,b)} \\
&= \frac{2(2 - \delta_{\lambda,\mu}\delta_{a,b})\,[X(a)\, F(a,b)\, X(b)]_{\lambda,\mu}}{\sum_c \operatorname{Tr} X^2(c)}
\end{aligned} \tag{2.64}
\]

where δ is the Kronecker delta. The last line follows because the only elements in any F which depend on F_{λ,µ}(a,b) are F_{λ,µ}(a,b) itself and F_{µ,λ}(b,a), which is the same element as F_{λ,µ}(a,b) if and only if λ = µ and a = b.

In a similar fashion (albeit with more manipulation), we can apply the same procedure to η to get
\[
\frac{\partial\eta}{\partial F_{\lambda,\mu}(a,b)} = 2(2 - \delta_{\lambda,\mu}\delta_{a,b})\, \frac{\sum_{c,d} \omega\!\begin{pmatrix} a & c \\ b & d \end{pmatrix} [Y(a,c)\, F(c,d)\, Y(d,b)]_{\lambda,\mu}}{\sum_{c,d} \operatorname{Tr} Y(c,d)\, Y(d,c)}. \tag{2.65}
\]

So for η/ξ to be stationary with respect to F, we must have
\[
\frac{\left(\sum_{c,d} \omega\!\begin{pmatrix} a & c \\ b & d \end{pmatrix} [Y(a,c)\, F(c,d)\, Y(d,b)]_{\lambda,\mu}\right) \sum_c \operatorname{Tr} X^2(c)}{[X(a)\, F(a,b)\, X(b)]_{\lambda,\mu}\, \sum_{c,d} \operatorname{Tr} Y(c,d)\, Y(d,c)} = \frac{\eta}{\xi} \tag{2.66}
\]

for all spin values a, b and sets of p spins λ, µ.

If we set
\[
\alpha = \frac{\eta}{\xi} \cdot \frac{\sum_{c,d} \operatorname{Tr} Y(c,d)\, Y(d,c)}{\sum_c \operatorname{Tr} X^2(c)} \tag{2.67}
\]



then this becomes
\[
\alpha\, X(a)\, F(a,b)\, X(b) = \sum_{c,d} \omega\!\begin{pmatrix} a & c \\ b & d \end{pmatrix} Y(a,c)\, F(c,d)\, Y(d,b). \tag{2.68}
\]

Fig. 2.21: Graphical interpretation of Equation 2.68.

Given the graphical interpretations for F, X and Y at optimality, this equation can be interpreted as adding a column to a section of the plane which covers all of the plane but half a row. This is shown in Figure 2.21, remembering that
\[
\omega\!\begin{pmatrix} a & c \\ b & d \end{pmatrix} = \omega\!\begin{pmatrix} c & a \\ d & b \end{pmatrix}.
\]

However, if we define the matrices A(a) to be the ‘square roots’ of the X(a) matrices:
\[
X(a) = A^2(a) \tag{2.69}
\]
then the equation
\[
Y(a,b) = A(a)\, F(a,b)\, A(b) \tag{2.70}
\]

implies that
\[
\alpha = \frac{\eta}{\xi} \cdot \frac{\sum_{a,b} \operatorname{Tr} A(a)\, F(a,b)\, A^2(b)\, F(b,a)\, A(a)}{\sum_a \operatorname{Tr} A^4(a)} = \frac{\eta}{\xi} \cdot \frac{\xi \sum_a \operatorname{Tr} A^4(a)}{\sum_a \operatorname{Tr} A^4(a)} = \eta \tag{2.71}
\]



and therefore
\[
\begin{aligned}
\sum_{c,d} \omega\!\begin{pmatrix} a & c \\ b & d \end{pmatrix} Y(a,c)\, F(c,d)\, Y(d,b) &= \sum_{c,d} \omega\!\begin{pmatrix} a & c \\ b & d \end{pmatrix} A(a)\, F(a,c)\, A(c)\, F(c,d)\, A(d)\, F(d,b)\, A(b) \\
&= A(a) \left( \sum_{c,d} \omega\!\begin{pmatrix} a & c \\ b & d \end{pmatrix} F(a,c)\, Y(c,d)\, F(d,b) \right) A(b) \\
&= A(a)\, \eta Y(a,b)\, A(b) \\
&= \eta\, A^2(a)\, F(a,b)\, A^2(b) \\
&= \alpha\, X(a)\, F(a,b)\, X(b)
\end{aligned} \tag{2.72}
\]

which implies that F maximises η/ξ. Thus Equation 2.70 implies Equation 2.68, which is equivalent to stationarity (i.e. optimality).

The optimal A can be interpreted as a corner transfer matrix, which gives the method its name. Whereas X(a) rotates a half-row cut by an angle of π around a spin with value a, A(a) achieves exactly half that effect by rotating the cut around a spin of value a by an angle of π/2. We show interpretations of Equations 2.69 and 2.70 in Figure 2.22.

Fig. 2.22: The graphical interpretation of A as a corner transfer matrix gives interpretations of (a) Equation 2.69 and (b) Equation 2.70.


2.4.4 The CTM equations

Equations 2.53, 2.54, 2.69 and 2.70 were first derived by Baxter in his original paper [12], and they are called the CTM equations. For convenience we state them together here.

\[
\begin{aligned}
\xi X(a) &= \sum_b F(a,b)\, X(b)\, F(b,a) &(2.73)\\
\eta Y(a,b) &= \sum_{c,d} \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} F(a,c)\, Y(c,d)\, F(d,b) &(2.74)\\
X(a) &= A^2(a) &(2.75)\\
Y(a,b) &= A(a)\, F(a,b)\, A(b) &(2.76)
\end{aligned}
\]

Note that the first two equations basically define X and Y in terms of F; the last two force the matrices to maximise η/ξ.

The solution of these equations for finite matrix sizes will yield an approximation (and lower bound) for the partition function per site, κ, from the inequality
\[
\kappa \ge \frac{\eta}{\xi}. \tag{2.77}
\]

We will show in the next section that if the matrices are infinite-dimensional, this approximation is exact.

Note that these equations do not define the matrices uniquely; for example, they are valid under the similarity transformations
\[
X(a) \to P^T(a)\, X(a)\, P(a), \qquad Y(a,b) \to P^T(a)\, Y(a,b)\, P(b) \tag{2.78}
\]
\[
A(a) \to P^T(a)\, A(a)\, P(a), \qquad F(a,b) \to P^T(a)\, F(a,b)\, P(b) \tag{2.79}
\]

where P(a) is an orthogonal matrix (P^T(a)P(a) = I) of dimension 2^p × 2^p. In particular, we can choose P(a) so that A(a), and hence X(a), is diagonal. Note that this ensures that X^T(a) = X(a) and Y^T(a,b) = Y(b,a).

The equations are also unchanged under the transformations
\[
X(a) \to c^2 X(a), \qquad Y(a,b) \to c^2 Y(a,b), \qquad A(a) \to c A(a) \tag{2.80}
\]

where c is any constant. In other words, we can choose normalising factors for X, Y and A. Since X and Y are defined only as eigenvectors in the equations, and hence can have arbitrary normalisation, this seems reasonable.

2.5 The infinite-dimensional solution

Solving the CTM equations provides a lower bound for κ, which can be used as an approximation for κ. In this section, we will show that the approximation becomes exact as the matrices become infinite-dimensional. To do this, we show that the space for ψ generated by Equation 2.41 contains the optimal ψ for the maximisation problem in Equation 2.40. Again, this section is based on Baxter's workings in [12].

We start off by again considering an m × n square lattice with toroidal boundary conditions. In addition to the weight ω of each cell, we assign a ‘weight’ f(a,b) to each horizontal edge with spins a and b. Note that f(a,b) is not the Boltzmann weight of an edge; at the moment it is just an arbitrary function. We do, however, impose the condition that f(a,b) = f(b,a).

We continue by defining the 2^m-dimensional vector φ₀ by its elements:
\[
[\phi_0]_{\sigma_1,\sigma_2,\ldots,\sigma_m} = f(\sigma_1,\sigma_2)\, f(\sigma_2,\sigma_3) \cdots f(\sigma_m,\sigma_1) \tag{2.81}
\]
and we define φ to be
\[
\phi = V^{n-1} \phi_0. \tag{2.82}
\]

Now we will construct a set of matrices F′(a,b) which, when substituted for F into Equation 2.41, will give the maximum value for ψ. We define F′(a,b) to be a 2^{n-1}-dimensional square matrix with entries
\[
F'_{\lambda,\mu}(a,b) = \omega\!\begin{pmatrix} a & \lambda_1 \\ b & \mu_1 \end{pmatrix} f(\lambda_{n-1}, \mu_{n-1}) \prod_{i=1}^{n-2} \omega\!\begin{pmatrix} \lambda_i & \lambda_{i+1} \\ \mu_i & \mu_{i+1} \end{pmatrix} \tag{2.83}
\]

and define α₁, α₂, ..., α_n to each be sets of m spins (so that, for example, α₁ consists of the spins α_{1,1}, α_{1,2}, ..., α_{1,m}). Then we can derive

\[
\begin{aligned}
\phi_{\alpha_1} &= \left[V^{n-1}\phi_0\right]_{\alpha_1} = \sum_{\alpha_2,\alpha_3,\ldots,\alpha_n} \left(\prod_{i=1}^{n-1} V_{\alpha_i,\alpha_{i+1}}\right) [\phi_0]_{\alpha_n} \\
&= \sum_{\alpha_2,\alpha_3,\ldots,\alpha_n} \left(\prod_{i=1}^{n-1} \prod_{j=1}^{m} \omega\!\begin{pmatrix} \alpha_{i,j+1} & \alpha_{i+1,j+1} \\ \alpha_{i,j} & \alpha_{i+1,j} \end{pmatrix}\right) \prod_{j=1}^{m} f(\alpha_{n,j}, \alpha_{n,j+1}) \\
&= \sum_{\alpha_2,\alpha_3,\ldots,\alpha_n} \prod_{j=1}^{m} \left( f(\alpha_{n,j}, \alpha_{n,j+1}) \prod_{i=1}^{n-1} \omega\!\begin{pmatrix} \alpha_{i,j+1} & \alpha_{i+1,j+1} \\ \alpha_{i,j} & \alpha_{i+1,j} \end{pmatrix}\right) \\
&= \sum_{\alpha_2,\alpha_3,\ldots,\alpha_n} \prod_{j=1}^{m} F'_{(\alpha_{2,j},\alpha_{3,j},\ldots,\alpha_{n,j}),(\alpha_{2,j+1},\alpha_{3,j+1},\ldots,\alpha_{n,j+1})}(\alpha_{1,j}, \alpha_{1,j+1}) \\
&= \operatorname{Tr} F'(\alpha_{1,1},\alpha_{1,2})\, F'(\alpha_{1,2},\alpha_{1,3}) \cdots F'(\alpha_{1,m},\alpha_{1,1}).
\end{aligned} \tag{2.84}
\]

Thus substituting ψ = φ and F(a,b) = F′(a,b) satisfies Equation 2.41. The manipulation we have done above essentially converts a division of a half-plane between rows and columns, as shown in Figure 2.23.

Fig. 2.23: Graphical interpretation of Equation 2.84.

We must also have (F′(a,b))^T = F′(b,a). This follows quite easily from Equation 2.83, the fact that

\[
\omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \omega\!\begin{pmatrix} c & d \\ a & b \end{pmatrix},
\]
and our earlier assumption that f(a,b) = f(b,a).

Now, if we choose f correctly, so that in the basis of eigenvectors of V, φ₀ contains some non-zero multiple of the eigenvector corresponding to Λ, then in the thermodynamic limit n → ∞, φ = V^{n-1}φ₀ will become the partition function of the half-plane, which is the optimal ψ. Therefore as n → ∞ (which also makes the dimension of F′ infinite), the optimal ψ must satisfy Equation 2.41 for F = F′. Thus when the matrices are infinite-dimensional, the ψ-space generated by Equation 2.41 contains the optimal solution, and therefore the approximation is exact.

2.6 Calculating quantities

In our statistical mechanical models, the primary quantity that we are interested in is κ, the partition function per site. If we have solutions, or approximate solutions, for all variables in the CTM equations, then we can easily find κ by means of the equation
\[
\kappa = \frac{\eta}{\xi}. \tag{2.85}
\]

However, as we will see in subsequent sections, not all of our methods calculate all of the variables. In particular, our renormalization group method (which we will describe in Section 2.10) only calculates approximate As and Fs. We would therefore like to find an expression for κ involving only those variables.

We do this by adjusting Equation 2.62. If X(a), A(a) and F(a,b) are solutions of the CTM equations, then we have
\[
\begin{aligned}
\xi &= \frac{\sum_{a,b} \operatorname{Tr} X^T(a)\, F(a,b)\, X(b)\, F(b,a)}{\sum_a \operatorname{Tr} X^T(a)\, X(a)} \\
&= \frac{\sum_{a,b} \operatorname{Tr} (A^T(a))^2\, F(a,b)\, A^2(b)\, F(b,a)}{\sum_a \operatorname{Tr} (A^T(a))^2\, A^2(a)} \\
&= \frac{\sum_{a,b} \operatorname{Tr} A^2(a)\, F(a,b)\, A^2(b)\, F(b,a)}{\sum_a \operatorname{Tr} A^4(a)}
\end{aligned} \tag{2.86}
\]

since A(a) can be taken to be symmetric.

We can find a similar expression for η in the same way. If the matrices solve the CTM equations, we have
\[
\begin{aligned}
\eta &= \frac{\sum_{a,b,c,d} \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} \operatorname{Tr} Y^T(a,b)\, F(a,c)\, Y(c,d)\, F(d,b)}{\sum_{a,b} \operatorname{Tr} Y^T(a,b)\, Y(a,b)} \\
&= \frac{\sum_{a,b,c,d} \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} \operatorname{Tr} A(a)F(a,c)A(c)F(c,d)A(d)F(d,b)A(b)F(b,a)}{\sum_{a,b} \operatorname{Tr} A^2(a)\, F(a,b)\, A^2(b)\, F(b,a)}
\end{aligned} \tag{2.87}
\]

since Tr AB = Tr BA for all matrices A, B.

Since κ = η/ξ, these two expressions allow us to find κ in terms of our solved variables:

\[
\kappa = \frac{\left(\sum_a \operatorname{Tr} A^4(a)\right) \left(\sum_{a,b,c,d} \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} \operatorname{Tr} A(a)F(a,c)A(c)F(c,d)A(d)F(d,b)A(b)F(b,a)\right)}{\left(\sum_{a,b} \operatorname{Tr} A^2(a)\, F(a,b)\, A^2(b)\, F(b,a)\right)^2}. \tag{2.88}
\]
This expression again has a graphical interpretation: each of the sums in the expression represents the partition function of the entire plane, expressed in a different way, as shown in Figure 2.24. The first sum in the numerator just takes four corner transfer matrices and puts them together (taking the trace to ensure that the first and last half-row cuts, which occupy the same sites, are identical). The sum in the denominator adds a row to this, while the second sum in the numerator adds a cross, consisting of a row and an intersecting column.

It can be seen that this cross consists of four half-rows together with a single cell. By dividing by the square of the sum in the denominator, which adds two half-rows, we essentially remove the half-rows and are left with the partition function of a single cell, which is exactly what κ is.

The partition function per site, while very important, is not the only thermodynamic quantity of importance. We will often want to calculate spin expectations: the expected value of a single spin or of products of certain spins. We show next how to calculate some of these.

Fig. 2.24: Calculating κ from (a) the four-CTM partition function, (b) the four-CTM partition function with a row, and (c) the four-CTM partition function with a cross. The expression we use is (a) × (c) / (b)².

The most important spin expectation is the magnetization per site, m = 〈σ_i〉. This is the expected value of an arbitrary site, which does not depend on the site if the model is translationally invariant (which we assume it is). As shown in the Introduction, we can write this as

\[
m = -\frac{\partial\psi}{\partial H} = kT\, \frac{\partial \ln\kappa}{\partial H} = kT\, \frac{\partial \ln(\eta/\xi)}{\partial H} \tag{2.89}
\]

when η and ξ solve the CTM equations.

Now, by construction, η/ξ is stationary with respect to all elements of F:
\[
\frac{\partial(\eta/\xi)}{\partial F(a,b)} = 0. \tag{2.90}
\]

Recall from Section 2.4.3 that
\[
\frac{\partial\xi}{\partial X(a)} = \frac{\partial\eta}{\partial Y(a,b)} = 0. \tag{2.91}
\]

Since η does not depend on X and ξ is independent of Y, we know that η/ξ is stationary with respect to all elements of X and Y.

Therefore, when we differentiate η/ξ with respect to H, the only quantity in the expression that is not stationary with respect to H is ω. Now ω will change with the model, but in all models that we are interested in, the external field interaction affects each spin equally, contributing a weight e^{βHσ_i} for each spin σ_i. As we are working on the square lattice, each spin belongs to four different cells, so this weight will be split over the ωs for those cells. On the other hand, each of the four spins will contribute to the weight. In other words,

\[
\omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} \propto e^{\beta H (a+b+c+d)/4} \tag{2.92}
\]


and this is the only part of ω which includes H.

Using our expression for η/ξ in terms of A and F now enables us to differentiate ln(η/ξ):

\[
\begin{aligned}
\frac{\partial \ln(\eta/\xi)}{\partial H} &= \sum_{a,b,c,d} \frac{\partial \ln(\eta/\xi)}{\partial\, \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix}}\, \frac{\partial\, \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix}}{\partial H} \\
&= \sum_{a,b,c,d} \frac{\partial}{\partial\, \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix}} \Bigg( \ln \sum_{a'} \operatorname{Tr} A^4(a') \\
&\qquad + \ln \sum_{a',b',c',d'} \omega\!\begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix} \operatorname{Tr} A(a')F(a',c')A(c')F(c',d')A(d')F(d',b')A(b')F(b',a') \\
&\qquad - 2 \ln \sum_{a',b'} \operatorname{Tr} A^2(a')\, F(a',b')\, A^2(b')\, F(b',a') \Bigg)\, \frac{\partial\, \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix}}{\partial H} \\
&= \sum_{a,b,c,d} \beta\, \frac{a+b+c+d}{4}\, \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} \frac{\operatorname{Tr} A(a)F(a,c)A(c)F(c,d)A(d)F(d,b)A(b)F(b,a)}{\sum_{a',b',c',d'} \omega\!\begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix} \operatorname{Tr} A(a')F(a',c')A(c')F(c',d')A(d')F(d',b')A(b')F(b',a')} \\
&= \beta\, \frac{\sum_{a,b,c,d} \frac{a+b+c+d}{4}\, \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} \operatorname{Tr} A(a)F(a,c)A(c)F(c,d)A(d)F(d,b)A(b)F(b,a)}{\sum_{a,b,c,d} \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} \operatorname{Tr} A(a)F(a,c)A(c)F(c,d)A(d)F(d,b)A(b)F(b,a)}.
\end{aligned} \tag{2.93}
\]


From the CTM equations, the denominator of the fraction in this expression becomes
\[
\begin{aligned}
&\sum_{a,b,c,d} \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} \operatorname{Tr} A(a)F(a,c)A(c)F(c,d)A(d)F(d,b)A(b)F(b,a) \\
&\quad = \operatorname{Tr} \sum_{a,b} A(a) \left( \sum_{c,d} \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} F(a,c)\, Y(c,d)\, F(d,b) \right) A(b)\, F(b,a) \\
&\quad = \eta \operatorname{Tr} \sum_{a,b} A(a)\, Y(a,b)\, A(b)\, F(b,a) \\
&\quad = \eta \operatorname{Tr} \sum_{a,b} A^2(a)\, F(a,b)\, A^2(b)\, F(b,a) \\
&\quad = \eta \operatorname{Tr} \sum_a X(a) \left( \sum_b F(a,b)\, X(b)\, F(b,a) \right) \\
&\quad = \eta\xi \operatorname{Tr} \sum_a X^2(a) = \eta\xi \sum_a \operatorname{Tr} A^4(a).
\end{aligned} \tag{2.94}
\]

Using exactly the same reasoning we can also derive the equation
\[
\sum_{a,b,c,d} \frac{a}{4}\, \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} \operatorname{Tr} A(a)F(a,c)A(c)F(c,d)A(d)F(d,b)A(b)F(b,a) = \eta\xi \sum_a \frac{a}{4} \operatorname{Tr} A^4(a). \tag{2.95}
\]

By grouping the matrices in a different order, we can derive in similar fashion the equation
\[
\sum_{a,b,c,d} \frac{b}{4}\, \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} \operatorname{Tr} A(a)F(a,c)A(c)F(c,d)A(d)F(d,b)A(b)F(b,a) = \eta\xi \sum_b \frac{b}{4} \operatorname{Tr} A^4(b) = \eta\xi \sum_a \frac{a}{4} \operatorname{Tr} A^4(a). \tag{2.96}
\]

Similarly, if the multiplying coefficient is c or d, we can do similar manipulations and achieve the same result. Therefore the numerator of the fraction in Equation 2.93 is
\[
4\eta\xi \sum_a \frac{a}{4} \operatorname{Tr} A^4(a) = \eta\xi \sum_a a \operatorname{Tr} A^4(a). \tag{2.97}
\]

Putting it all together, we get our (remarkably simple) expression for the magnetization per site:
\[
m = kT\, \frac{\partial \ln\kappa}{\partial H} = kT\beta\, \frac{\eta\xi \sum_a a \operatorname{Tr} A^4(a)}{\eta\xi \sum_a \operatorname{Tr} A^4(a)} = \frac{\sum_a a \operatorname{Tr} A^4(a)}{\sum_a \operatorname{Tr} A^4(a)}. \tag{2.98}
\]

As with many things in the CTM method, there is a graphical interpretation to this as well: A⁴(a) is the contribution to the partition function of the entire lattice if the ‘central’ spin is fixed at value a, as shown previously in Figure 2.24(a). Thus the denominator is the partition function Z_N, and the numerator is the sum over a of a multiplied by the (unnormalised) probability that the central spin takes the value a. This gives the expected value of that spin.

If we wish to calculate other spin expectations, especially for spins which are close together (e.g. in a single cell), we can use a similar method. The result is very much what would be expected: in the most general terms, if σ_i, σ_j, σ_k, and σ_l are spins around a single cell, then

\[
\langle f(\sigma_i,\sigma_j,\sigma_k,\sigma_l)\rangle = \frac{\sum_{a,b,c,d} f(a,b,c,d)\, \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} \operatorname{Tr} A(a)F(a,c)A(c)F(c,d)A(d)F(d,b)A(b)F(b,a)}{\sum_{a,b,c,d} \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} \operatorname{Tr} A(a)F(a,c)A(c)F(c,d)A(d)F(d,b)A(b)F(b,a)}. \tag{2.99}
\]
The interpretation of this quotient is as before, except that we have to express the partition function so that we know the values of the four spins.

2.7 The 1x1 solution — an example

As an example, we solve the CTM equations in the case where p = 0 for the hard squares model. Since p = 0, all the matrices in the equations are now scalars. We have already given the formula for ω in Equation 2.38. This is all we need to solve the CTM equations.

It seems reasonable to assume that F(1,1) = 0, since applying F(1,1) would involve two 1 spins side by side, which is forbidden. Additionally, because the norms of X and Y (as eigenvectors of R and S) are undetermined, we can set X(0) = Y(0,0) = 1. From Equation 2.69 with a = 0, we then get A(0) = ±1. As A(0) represents the weight of some section of the plane, it should be positive, so we take A(0) = 1. From Equation 2.70 with a = b = 0, we then derive F(0,0) = 1.

Let t = z^{1/4}, so that we do not have to deal with fractional powers of z. Substituting what we know into the CTM equations and writing them out in full gives the equations

\[
\xi = 1 + F^2(0,1)\, X(1) = \frac{F^2(0,1)}{X(1)} \tag{2.100}
\]
\[
\eta = 1 + 2t F^2(0,1)\, A(1) = \frac{F(0,1)}{Y(0,1)} \left( t + t^2 F(0,1)\, Y(0,1) \right) \tag{2.101}
\]
\[
X(1) = A^2(1), \qquad Y(0,1) = F(0,1)\, A(1). \tag{2.102}
\]

Substituting the last two equations into the first two lines and removing ξ and η gives us
\[
1 + F^2(0,1)\, A^2(1) = \left(\frac{F(0,1)}{A(1)}\right)^2 \tag{2.103}
\]
\[
1 + 2t F^2(0,1)\, A(1) = \frac{1}{A(1)}\left(t + t^2 F^2(0,1)\, A(1)\right). \tag{2.104}
\]

The first equation gives F²(0,1) in terms of A(1):
\[
F^2(0,1) = \frac{1}{\frac{1}{A^2(1)} - A^2(1)} = \frac{A^2(1)}{1 - A^4(1)} \tag{2.105}
\]

which, when substituted into the second equation, gives an expression for A(1) as the root of a 5th-degree polynomial:
\[
1 + \frac{2t A^3(1)}{1 - A^4(1)} = \frac{t}{A(1)} + t^2\, \frac{A^2(1)}{1 - A^4(1)} \tag{2.106}
\]
\[
\Rightarrow\quad -t + A(1) - t^2 A^3(1) + 3t A^4(1) - A^5(1) = 0. \tag{2.107}
\]

This cannot be solved in closed form (by Maple), but it is possible to generate an expression for A(1) as a series in t. This gives
\[
A(1) = t - t^5 + 4t^9 - 21t^{13} + 125t^{17} - 800t^{21} + 5368t^{25} - 37240t^{29} + 264828t^{33} + \ldots \tag{2.108}
\]
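These coefficients are easy to reproduce: Equation 2.107 rearranges to the fixed-point iteration A(1) = t + t²A³(1) − 3tA⁴(1) + A⁵(1), and iterating on truncated series in t gains roughly four correct powers of t per pass. The following C++ sketch (our own illustrative code, not the thesis program) regenerates the series above.

```cpp
#include <cstdio>
#include <vector>
using std::vector;

// Multiply two truncated power series in t (coefficient vectors of length N).
static vector<long long> mul(const vector<long long>& a,
                             const vector<long long>& b) {
    vector<long long> c(a.size(), 0);
    for (size_t i = 0; i < a.size(); ++i)
        for (size_t j = 0; i + j < a.size(); ++j)
            c[i + j] += a[i] * b[j];
    return c;
}

int main() {
    const int N = 36;                  // keep terms up to t^35
    vector<long long> A(N, 0);
    A[1] = 1;                          // initial guess A(1) = t
    for (int it = 0; it < 10; ++it) {  // A <- t + t^2 A^3 - 3t A^4 + A^5
        vector<long long> A2 = mul(A, A), A3 = mul(A2, A);
        vector<long long> A4 = mul(A2, A2), A5 = mul(A4, A);
        vector<long long> next(N, 0);
        next[1] = 1;                                               // the leading t
        for (int k = 0; k + 2 < N; ++k) next[k + 2] += A3[k];      // + t^2 A^3
        for (int k = 0; k + 1 < N; ++k) next[k + 1] -= 3 * A4[k];  // - 3t A^4
        for (int k = 0; k < N; ++k)     next[k]     += A5[k];      // + A^5
        A = next;
    }
    for (int k = 0; k < N; ++k)        // prints +1 t^1 -1 t^5 +4 t^9 -21 t^13 ...
        if (A[k] != 0) std::printf("%+lld t^%d ", A[k], k);
    std::printf("\n");
}
```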

Substituting this expression back into F(0,1), ξ and η gives a series for the partition function per site:
\[
\kappa = 1 + z - 2z^2 + 8z^3 - 40z^4 + 225z^5 - 1362z^6 + 8670z^7 - 57254z^8 + \ldots \tag{2.109}
\]

This gives κ correctly up to the z^7 term. The actual coefficient of z^8 in κ is −57253, which differs by only 1 from our approximation. This demonstrates how, at low matrix sizes, the CTM method can generate very good series approximations efficiently.


2.8 Matrix algorithms

Now we move on from Baxter's CTM equations to derive our own numerical and series methods based on these equations. In the development of these methods, we have used two matrix decomposition algorithms which we shall outline here.

2.8.1 Cholesky decomposition

The first algorithm is Cholesky decomposition. If we are given an n × n matrix X, then the Cholesky method finds an upper triangular matrix A such that
\[
A^T A = X. \tag{2.110}
\]

It does this by applying the following algorithm (taken from [116, pp. 96-98]):

1. Set i = 1.

2. Calculate \(A_{ii} = \left(X_{ii} - \sum_{k=1}^{i-1} A_{ki}^2\right)^{1/2}\).

3. For j = i+1, i+2, ..., n, calculate \(A_{ij} = \frac{1}{A_{ii}}\left(X_{ij} - \sum_{k=1}^{i-1} A_{ki} A_{kj}\right)\).

4. Increase i by 1.

5. If i > n, stop. Otherwise go to step 2.

The equations used above are rearrangements of the elements of the matrix equation A^T A = X, so the method will produce the correct A.
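A direct C++ transcription of these steps for a matrix of doubles might look as follows (a minimal numerical sketch; in the thesis the entries are truncated series rather than doubles).

```cpp
#include <cmath>
#include <vector>

typedef std::vector<std::vector<double>> Matrix;

// Cholesky factor A (upper triangular) with A^T A = X, for a symmetric
// positive-definite X; a direct transcription of steps 1-5 above.
Matrix cholesky(const Matrix& X) {
    const size_t n = X.size();
    Matrix A(n, std::vector<double>(n, 0.0));
    for (size_t i = 0; i < n; ++i) {
        double s = X[i][i];
        for (size_t k = 0; k < i; ++k) s -= A[k][i] * A[k][i];
        A[i][i] = std::sqrt(s);                    // step 2
        for (size_t j = i + 1; j < n; ++j) {       // step 3
            double t = X[i][j];
            for (size_t k = 0; k < i; ++k) t -= A[k][i] * A[k][j];
            A[i][j] = t / A[i][i];
        }
    }
    return A;
}
```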

2.8.2 The Arnoldi method

The other method that we shall use is the Arnoldi method, invented by Arnoldi in 1951 ([3]). Given an n × n matrix A and a number m ≤ n, the Arnoldi method finds an n × m matrix V_m such that V_m^T A V_m is a Hessenberg matrix (all entries below the first subdiagonal are 0). The method runs as follows (taken from [119, Section 6.3]):

1. Choose an n-dimensional vector v₁ of norm 1.

2. Set j = 1.

3. Compute h_{ij} = (Av_j) · v_i for i = 1, 2, ..., j.

4. Compute \(w_j = Av_j - \sum_{i=1}^{j} h_{ij} v_i\).

5. Set h_{j+1,j} = ||w_j||.

6. If j = m or h_{j+1,j} = 0 then stop.

7. Set \(v_{j+1} = \frac{1}{h_{j+1,j}} w_j\).

8. Increase j by 1.

9. Go to step 3.

If the algorithm finishes with j = m, set V_m to be the matrix whose columns are v₁, v₂, ..., v_m. Otherwise the algorithm fails.
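For concreteness, here is one possible C++ transcription of these steps (a sketch with our own names; it uses the numerically more stable modified Gram-Schmidt update, which produces the same basis in exact arithmetic).

```cpp
#include <cmath>
#include <vector>

typedef std::vector<double> Vec;
typedef std::vector<Vec> Mat;   // dense, row-major: Mat[i][j]

// Returns the orthonormal basis v_1..v_m (the columns of V_m), or an
// empty result on breakdown (h_{j+1,j} = 0 before m vectors are built).
std::vector<Vec> arnoldi(const Mat& A, const Vec& v1, size_t m) {
    std::vector<Vec> v{v1};             // v1 is assumed to have norm 1
    for (size_t j = 0; j < m; ++j) {
        Vec w(A.size(), 0.0);           // w = A v_j
        for (size_t r = 0; r < A.size(); ++r)
            for (size_t c = 0; c < A.size(); ++c)
                w[r] += A[r][c] * v[j][c];
        for (size_t i = 0; i <= j; ++i) {   // steps 3-4: subtract projections
            double h = 0.0;
            for (size_t r = 0; r < w.size(); ++r) h += w[r] * v[i][r];
            for (size_t r = 0; r < w.size(); ++r) w[r] -= h * v[i][r];
        }
        double norm = 0.0;              // step 5: h_{j+1,j} = ||w_j||
        for (double x : w) norm += x * x;
        norm = std::sqrt(norm);
        if (j + 1 == m) break;          // j = m: we have all m vectors
        if (norm == 0.0) return {};     // breakdown: the algorithm fails
        for (double& x : w) x /= norm;  // step 7: v_{j+1}
        v.push_back(w);
    }
    return v;
}
```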

At each step, we are basically applying the Gram-Schmidt orthogonalization process to Av_j, to form the basis {v₁, v₂, ..., v_m}. Therefore all the v_i are orthogonal to each other. By construction, they have norm 1, so we know that
\[
V_m^T V_m = I_m. \tag{2.111}
\]

Lemma 2.8.1. Let H_m be the m × m matrix whose (i,j)th entry is h_{ij} (0 if this is not defined by the algorithm). Then
\[
V_m^T A V_m = H_m. \tag{2.112}
\]

Proof. By the construction of V_m, for all j = 1, 2, ..., m,
\[
A v_j = w_j + \sum_{i=1}^{j} h_{ij} v_i = h_{j+1,j}\, v_{j+1} + \sum_{i=1}^{j} h_{ij} v_i = \sum_{i=1}^{j+1} h_{ij} v_i = \sum_{i=1}^{m} h_{ij} v_i \tag{2.113}
\]

since h_{ij} = 0 when i > j + 1. This implies that
\[
A V_m = V_m H_m, \tag{2.114}
\]
and since V_m^T V_m = I, the result follows.

Later, we will use the Arnoldi method as a ‘quick and easy’ substitute for diagonalizing a matrix.

2.9 The iterative CTM method

Having derived the CTM equations, we would now like to find a numerical value or a series for κ by solving them. One method we can use to do so is to iterate through the equations, solving one at a time. In his papers, Baxter proposed a general iterative method, but overall eschewed it for more optimised (but also more specialised) algorithms, also based on iteration. In this section, we present our own generalised iterative method.

While the CTM equations were generated by assuming that X, Y and F are of dimension 2^p × 2^p, it is apparent that they place no restriction on the matrices to actually have such dimensions. While the power-of-2 dimension makes sense graphically, corresponding to the rows of the matrices being indexed by the values of a half-row/column cut of p spins, the equations can be derived in identical fashion without such an assumption. The only thing we lose is the graphical interpretation, but the equations still hold. As such, we can assume that the matrices are of any size (as long as they are all of the same size).

To construct the iterative method, we start with a set of values for all the matrices. Then we impose an order on the CTM equations, and go through the equations one at a time according to that order. For each equation, we fix some of the variables in the equation, then solve for the unfixed variables. The solutions then become the new values for those variables. Eventually, working through all the equations once should lead to us changing each variable once.

We hope that if we choose the right order for the equations, then after each pass through the equations, the values of the variables will be closer to the solution of the equations than before. Unfortunately, we cannot actually prove that this will be the case; but our empirical testing indicates that the order we have chosen does get us closer. The order that we use is 2.53 → 2.54 → 2.69 → 2.70 → 2.53 → ....

Having set the framework, we must figure out how to solve each of the CTM equations in turn. Firstly, we take Equation 2.53. This equation is merely a (glorified) eigenvalue equation for the matrix R; in fact, it generates the same equations as the eigenvalue equation
\[
\xi X = RX \tag{2.115}
\]
where X is taken as a vector. Therefore it can be solved using the power method. To do this, we fix the value of R, then calculate the right-hand side (which corresponds to the right-hand side of Equation 2.53). To keep the values at a reasonable size, we set a normalisation for X; here we have used X_{0,0}(0) = 1. This allows us to find ξ, and then divide the right-hand side by ξ to get a new X. By repeating this process, eventually ξ will become the maximum eigenvalue of R and X will become the corresponding eigenvector, which is what we want. In terms of the matrices in Equation 2.53, we are applying the equations

\[
\xi' = \left[\sum_b F(0,b)\, X(b)\, F(b,0)\right]_{1,1} \tag{2.116}
\]
\[
X'(a) = \frac{1}{\xi'} \sum_b F(a,b)\, X(b)\, F(b,a) \tag{2.117}
\]

many times. Because this is just a modified power method, we know this method will converge, given enough iterations. In practice, we just iterate a fixed number of times.
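As a numerical illustration, one sweep of Equations 2.116 and 2.117 for a two-state model can be coded as below (a sketch with our own names; the thesis version operates on matrices of truncated series, and the (1,1) entry of the a = 0 sum is the top-left element here).

```cpp
#include <vector>

typedef std::vector<std::vector<double>> Matrix;

static Matrix matmul(const Matrix& A, const Matrix& B) {
    size_t n = A.size();
    Matrix C(n, std::vector<double>(n, 0.0));
    for (size_t i = 0; i < n; ++i)
        for (size_t k = 0; k < n; ++k)
            for (size_t j = 0; j < n; ++j)
                C[i][j] += A[i][k] * B[k][j];
    return C;
}

// One power-method sweep: X(a) <- (1/xi') sum_b F(a,b) X(b) F(b,a),
// with xi' read off the top-left entry of the a = 0 sum so that
// X_{0,0}(0) stays normalised to 1. Returns the current estimate of xi.
double powerStep(Matrix X[2], const Matrix F[2][2]) {
    size_t n = X[0].size();
    Matrix S[2];
    for (int a = 0; a < 2; ++a) {
        S[a] = Matrix(n, std::vector<double>(n, 0.0));
        for (int b = 0; b < 2; ++b) {
            Matrix T = matmul(matmul(F[a][b], X[b]), F[b][a]);
            for (size_t i = 0; i < n; ++i)
                for (size_t j = 0; j < n; ++j)
                    S[a][i][j] += T[i][j];
        }
    }
    double xi = S[0][0][0];                 // Equation 2.116
    for (int a = 0; a < 2; ++a)
        for (auto& row : S[a])
            for (double& x : row) x /= xi;  // Equation 2.117
    X[0] = S[0];
    X[1] = S[1];
    return xi;
}
```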


In an identical way, we recast Equation 2.54 as
\[
\eta' = \left[\sum_{c,d} \omega\!\begin{pmatrix} 0 & 0 \\ c & d \end{pmatrix} F(0,c)\, Y(c,d)\, F(d,0)\right]_{1,1} \tag{2.118}
\]
\[
Y'(a,b) = \frac{1}{\eta'} \sum_{c,d} \omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} F(a,c)\, Y(c,d)\, F(d,b). \tag{2.119}
\]

We can then use a similar method to find η and Y from F. We used the normalisation Y_{0,0}(0,0) = 1.

Now the only variables which we have not recalculated are A and F. To find these, we use the remaining Equations, 2.69 and 2.70. Equation 2.69 expresses X in terms of A only, so we should be able to reverse the equation by setting A(a) to be the ‘square root’ of X(a). However, it is not immediately obvious how we could find such a matrix.

The most obvious route would be to diagonalize X(a), i.e. find a matrix P(a) such that
\[
X(a) = P^{-1}(a)\, D(a)\, P(a). \tag{2.120}
\]

Then, because it is easy to find the square root of a diagonal matrix (just take the square roots of its components), we set
\[
A'(a) = P^{-1}(a) \sqrt{D(a)}\, P(a) \tag{2.121}
\]

so that A(a) satisfies Equation 2.69. This is easy to do for a matrix of scalars, so we use this procedure if we are calculating numerical values.

On the other hand, if we want to evaluate κ as a series in some variables, this procedure would necessitate a diagonalization of a matrix of series, which is much more difficult. The best approach that was suggested to do this was to calculate the eigenvalues as series in z by solving for each coefficient in the series individually. However, to find the first coefficient, we would need to solve an nth-degree polynomial (if A is n × n), which is impossible to do exactly for large n. Even if we used a numerical method to solve for this coefficient, the error this would introduce would quickly escalate in magnitude as we used the result to solve for the remaining coefficients. So we cannot use this approach, and we have not yet managed to come up with a reasonable way to diagonalize a matrix of series.

For an alternate way of updating A, we recall from Section 2.4.4 that A can be taken to be symmetric. So instead of solving Equation 2.69 for A, we solve the equation
\[
A^T(a)\, A(a) = X(a). \tag{2.122}
\]

If A(a) is symmetric, the equations are identical. Moreover, from the remarks in Section 2.4.4, we can choose A to be diagonal. Therefore, given X, we would like to find a value for A in the above equation which is diagonal. Unfortunately, since we cannot guarantee that the current X is diagonal, this will often be impossible. Since we would like A to be ‘as diagonal as possible’, we apply the Cholesky method described in Section 2.8.1. This gives us an upper triangular A(a) which satisfies the above equation.

Having used Equation 2.69 to recalculate A, we can then use the remaining equation to calculate the remaining variable F. We recast Equation 2.70 as
\[
F'(a,b) = A^{-1}(a)\, Y(a,b)\, A^{-1}(b) \tag{2.123}
\]
and substitute our new values for A and Y into the right-hand side to derive a new value for F(a,b).

Now we have gone through all the equations and recalculated each variable exactly once. Therefore we can return to the beginning and repeat this process. Our hope is that by repeating this process enough times, the matrices will eventually converge to a solution of the CTM equations. Our empirical evidence seems to indicate that this is indeed the case.

However, the method so far has some omissions. One is that it assumes that the matrices are of fixed size, and never changes that size. So as it stands, we can only obtain the solution to the CTM equations for finite matrix size. This is undesirable, because it is only in the infinite-dimensional limit that the solutions for the equations give us the actual value of κ. Furthermore, it is not always obvious what matrices we should start our variables at: we would like to start them reasonably close to the solution, so they have a chance of converging, but it is not obvious where that is.

Ideally, we want to start our matrices small (thereby having fewer initial conditions to set), and somehow ‘grow’ the matrices, so that as the algorithm progresses they become larger and larger, and make our approximation of κ more and more accurate. Furthermore, we want to expand the matrices by one row and column at a time, so that we can observe the convergence effects for fixed matrix sizes. Unfortunately there does not seem to be an intuitive way to expand the matrices in this way. We ended up using ad hoc methods of expanding (which really means that we tried something that didn't seem too weird, and if it worked, we kept it!). In general, our expansion methods expanded only the X and Y matrices, so if we wished to expand the matrices, we did it after recalculating X and Y from the first two CTM equations. We could then calculate the expanded A and F from the remaining equations.

For setting initial conditions, we started off with 1 × 1 matrices which were model-dependent. We used the values that the graphical interpretation of the matrices would have had if the cut had 0 length. This usually worked well.

To summarize, the procedure for our iterative CTM method is:

1. Set all the matrices to their initial values of size 1 × 1.

2. Apply the power method to Equation 2.53 to update ξ and X(a) (in practice, we iterated Equation 2.53 eight times).

3. Apply the power method to Equation 2.54 to update η and Y (a, b).

4. If we have iterated sufficiently long at the current matrix size (we used 5 iterations), then expand the matrices by one row and column.


5. Apply the Cholesky method to X(a) to update A(a).

6. Set F(a,b) = A^{-1}(a) Y(a,b) A^{-1}(b).

7. Go back to step 2.

We found that for the models that we applied it to, this process seems to converge reasonably well. We will discuss the convergence of the method in later sections.

2.9.1 The hard squares model — an example

To illustrate the iterative CTM method (in particular the model-specific parts), we apply it to the low-density (where occupied spins are discouraged) expansion of the hard squares model in Section 2.7. For this model, ω is given in Equation 2.38 as

\[
\omega\!\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{cases} 0 & \text{if } a = b = 1,\ a = c = 1,\ b = d = 1 \text{ or } c = d = 1 \\ z^{(a+b+c+d)/4} & \text{otherwise} \end{cases} \tag{2.124}
\]

where z is the weight of an occupied (state 1) spin.

The initial conditions that we used for this model are

\[
F(0,0) = X(0) = X(1) = Y(0,0) = Y(0,1) = Y(1,0) = 1 \tag{2.125}
\]
\[
F(0,1) = F(1,0) = z^{1/4} \tag{2.126}
\]
\[
F(1,1) = Y(1,1) = 0. \tag{2.127}
\]

In particular, both F(1,1) and Y(1,1) are 0 because they would require two occupied spins to be adjacent. Ultimately, it does not really matter what values we use, as long as the process converges (which it seems to do).

We used the following expansion procedure: if the matrices are of dimension n × n, then after recalculating X and Y via the first two CTM equations, we set all matrices to be of dimension (n+1) × (n+1), with their previous values in the top left corner. We then set

\[
X_{n+1,n+1}(a) = z^{1/2}\, X_{n,n}(a), \qquad a = 0, 1 \tag{2.128}
\]
\[
Y_{i,n+1}(0,0) = Y_{n+1,i}(0,0) = Y_{n,n}(0,0) \tag{2.129}
\]
\[
Y_{i,n+1}(0,1) = Y_{n,n}(0,1) \tag{2.130}
\]
\[
Y_{n+1,i}(1,0) = Y_{n,n}(1,0) \tag{2.131}
\]

with all other elements equal to 0.

As we pointed out in Section 2.4.4, if we apply the appropriate transformations, then we can take X(a) to be diagonal. In fact, we can also take the diagonal elements to be in decreasing order, where a series is considered smaller than another if it has a larger leading power of z, or a smaller leading coefficient if the leading powers are equal. This is why we only set X_{n+1,n+1}(a) to be non-zero, and force its smallest power of z to be larger than the previous diagonal term.

Applying the iterative method using these initial conditions and expansion method gives the following sequence of κ approximations:
\[
\begin{aligned}
&1 + z - 2z^2 + 7z^3 - 28z^4 + \ldots \\
&1 + z - 2z^2 + 8z^3 - 40z^4 + 225z^5 - 1362z^6 + 866z^7 - 56996z^8 + \ldots \\
&1 + z - 2z^2 + 8z^3 - 40z^4 + 225z^5 - 1362z^6 + 8670z^7 - 57254z^8 + 388830z^9 - 2699688z^{10} + \ldots \\
&1 + z - 2z^2 + 8z^3 - 40z^4 + 225z^5 - 1362z^6 + 8670z^7 - 57254z^8 + 388830z^9 - 2699688z^{10} + \ldots \\
&1 + z - 2z^2 + 8z^3 - 40z^4 + 225z^5 - 1362z^6 + 8670z^7 - 57254z^8 + 388830z^9 - 2699688z^{10} + \ldots \\
&1 + z - 3z^2 + 2z^{5/2} + 13z^3 - 22z^{7/2} - 40z^4 + 122z^{9/2} + 134z^5 + \ldots \quad \text{(first iteration at size 2)} \\
&1 + z - 2z^2 + 8z^3 - 40z^4 + 224z^5 + 4z^{22} + \ldots \\
&1 + z - 2z^2 + 8z^3 - 40z^4 + 225z^5 - 1362z^6 + 8670z^7 - 57253z^8 + 388802z^9 - 2699202z^{10} + 19076005z^{11} + 8z^{23/2} + \ldots \\
&1 + z - 2z^2 + 8z^3 - 40z^4 + 225z^5 - 1362z^6 + 8670z^7 - 57253z^8 + 388802z^9 - 2699202z^{10} + 19076006z^{11} - 136815282z^{12} \\
&\qquad + 993465248z^{13} - 7290310954z^{14} + 53986385102z^{15} - 272063z^{61/4} - 17858945908z^{31/2} + \ldots \\
&1 + z - 2z^2 + 8z^3 - 40z^4 + 225z^5 - 1362z^6 + 8670z^7 - 57253z^8 + 388802z^9 - 2699202z^{10} + 19076006z^{11} - 136815282z^{12} \\
&\qquad + 993465248z^{13} - 7290310954z^{14} + 53986385102z^{15} - 402957351940z^{16} - 1319450z^{67/4} + 407888312210z^{17} + \ldots
\end{aligned}
\]

which appears to converge reasonably well. The last approximation is correct up to order z^{15}; the z^{16} term is 1 less than the exact value.

2.9.2 Convergence/results

We applied the iterative method to the hard squares model to find series for matrix sizes up to 10 × 10. We discuss our results in this section.

One weak point in the theory of the iterative method is that we cannot actually prove that the values we derive converge to κ. Furthermore, even if it does converge, we cannot theoretically estimate the rate or behaviour of the convergence. Therefore we must rely on empirical evidence.

Fortunately, if we choose the initial conditions and method of expansion properly, the method seems to converge fairly well. Given that we are merely cycling through the equations and solving them one by one, we would expect to either converge to the solution or not converge at all. It seems that if we fix the matrix size and solve the equations exactly, the approximation that results for κ gives the actual value up to some power of z, depending on the size of the matrices. Furthermore, the method is able to attain all the correct terms at every size, after a number of iterations. This leads us to think that the method is converging.

Of great interest is how rapidly the approximations converge. We can measure this by observing how many coefficients our approximation for κ gives exactly. For small matrix sizes (and hence relatively inaccurate approximations) this is easy to tell, as we know the coefficients of κ up to the z^{42} term from [20]. However, as the matrix sizes become larger,


Matrix size    Number of correct terms
     1                  8
     2                 16
     3                 24
     4                 32
     5                 32
     6                 40
     7                 40
     8                 48
     9                 48
    10                 48

Tab. 2.1: Convergence of the iterative method for the hard squares partition function.

we do not have this luxury.

Fortunately, for higher matrix sizes, we noticed a pattern in the calculations. Because our ω returns cell weights as whole powers of t = z^{1/4}, all our calculations were done in series of t. However, κ is a series in whole powers of z. We noticed that our approximations to κ started as a series in whole powers of z, and then after a number of terms, broke down into powers of t that were not whole powers of z. We found that at higher matrix sizes, the terms where the powers of z were whole were generally correct, while any terms after the first fractional power of z were incorrect.

The other way that we used to determine which coefficients were exact was to compare the result with that obtained from a larger matrix dimension. As we observed before, the ψ-space generated by F at a fixed matrix size includes all the ψ-spaces generated by Fs of lower dimension, so solving the CTM equations at a fixed size will always yield a more accurate approximation than solving the system at lower sizes. Therefore a series term is probably accurate if it agrees with the corresponding term in an approximation resulting from a higher dimension.

Using these methods, Table 2.1 shows the number of terms we can get exactly from each fixed matrix size. It is interesting to note that the number of correct terms always seems to be a multiple of 8, and that it does not always increase strictly. Furthermore, it seems possible that at size 2^n, we get 16n correct terms, which would imply that the algorithm is exponential time. At the largest size (10 × 10), we managed to get 48 correct terms. It is a tribute to the efficiency of Baxter's method that in 1979 he managed to reach 43 terms with the computing resources of the time! However, our method appears to be more general.

2.9.3 Technical notes

We programmed the iterative method for series for the hard squares model, which is the main model that we experimented upon, using C++. While doing so, we came across some technical difficulties, which we shall describe here.

Firstly, we had the ubiquitous problem of rounding. The series terms are very large integers, and we also have to manipulate large integers in intermediate steps. Considering that the terms are exact integers, however, we would like to use exact arithmetic. Unfortunately the precision required (the z^{42} term is on the order of 10^{35}) is much larger than that provided by any of the standard data types in C++. Moreover, if we used a floating-point type then we would need an incredibly large precision to eliminate all possible rounding errors. This would slow the computation immensely, so we try to use exact types.

We used two different approaches. The first way we tried was to use rational numbers (type cl_RA) from CLN, an arbitrary precision library written by Haible ([67]). Unfortunately, although the final series terms are integers, it is not true that all terms of all intermediate series are integers. In fact, in our calculations these non-integer terms frequently became very unwieldy fractions, with both numerator and denominator far in excess of the actual number. This slowed the computation significantly.

The other approach we used was the well-known Chinese Remainder Theorem, which we state below (taken from [47, Section 5, Theorem 2]).

Theorem 2.9.1. Let p₁, p₂, ..., p_n be a set of integers which are pairwise relatively prime, i.e. gcd(p_i, p_j) = 1 for i ≠ j. Also let a₁, a₂, ..., a_n be a set of arbitrary integers. Then there exists a unique integer x which satisfies
\[
x \equiv a_i \bmod p_i \quad \forall i = 1, 2, \ldots, n \tag{2.132}
\]
and
\[
0 \le x < p_1 p_2 \cdots p_n. \tag{2.133}
\]

We can use this theorem by performing our computations multiple times: once in integers mod p₁, once in integers mod p₂, and so on. If all the coefficients of the original series lie between 0 and p₁p₂...p_n, then we can reconstitute the series from the results of all the runs. We can only do this because we know that all the coefficients are integers.
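A two-modulus sketch of this reconstruction is below (illustrative code, not the thesis program; extending to more primes just applies the two-modulus step repeatedly).

```cpp
#include <cstdio>
#include <cstdint>

// Modular inverse of a mod m via the extended Euclidean algorithm,
// assuming gcd(a, m) = 1.
static int64_t inv_mod(int64_t a, int64_t m) {
    int64_t t = 0, newt = 1, r = m, newr = a % m;
    while (newr != 0) {
        int64_t q = r / newr;
        int64_t tmp = t - q * newt; t = newt; newt = tmp;
        tmp = r - q * newr; r = newr; newr = tmp;
    }
    return t < 0 ? t + m : t;
}

// Reconstruct the unique x in [0, p1*p2) with x = a1 mod p1, x = a2 mod p2.
int64_t crt2(int64_t a1, int64_t p1, int64_t a2, int64_t p2) {
    // x = a1 + p1 * k, where k = (a2 - a1) * p1^{-1} mod p2
    int64_t k = ((a2 - a1) % p2 + p2) % p2 * inv_mod(p1 % p2, p2) % p2;
    return a1 + p1 * k;
}

int main() {
    // e.g. recover 1000003 from its residues mod 101 and 103
    int64_t x = 1000003;
    std::printf("%lld\n", (long long)crt2(x % 101, 101, x % 103, 103));
}
```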

However, as we mentioned before, not all of the terms of the intermediate sequences are integers, and neither are the terms in the final series that are not exact. Fortunately this does not really matter, as it does not interfere with the calculations, and we really only wish to know the terms which are exact.

The remaining difficulty with using the Chinese Remainder Theorem is that Cholesky decomposition requires us to take the square root of a series, which involves taking the root of the first term. This is not always possible: in the integers modulo p, any number has the same square as its negative, so there are only (p+1)/2 squares among the p residues. So not all of the integers modulo p will have square roots. We basically cannot do anything about this, but given that we end up with integers, it seems unlikely that we will ever take the square root of a number that does not have an integer square root, or at least a rational one. This still leaves us with the problem of how to find a square root modulo p. We use the following lemma, taken from [85, Section II.2].

Lemma 2.9.2. Let p be a prime where p ≡ 3 mod 4, and let a be an integer. Then if a has a square root modulo p, a^{(p+1)/4} is a square root of a modulo p.


Proof. Let x be a square root of a. Then, modulo p,
\[
a^{(p+1)/4} \equiv (x^2)^{(p+1)/4} \equiv x^{(p+1)/2} \equiv x\, x^{(p-1)/2}. \tag{2.134}
\]
Now since
\[
\left(x^{(p-1)/2}\right)^2 \equiv x^{p-1} \equiv 1 \bmod p \tag{2.135}
\]
from Fermat's Little Theorem, we know that x^{(p-1)/2} must be either 1 or −1 mod p. Therefore
\[
a^{(p+1)/4} \equiv \pm x \bmod p \tag{2.136}
\]
which proves the lemma.

So as long as we use primes which are equivalent to 3 mod 4, we can find the square root easily if it exists. Using these methods, we can implement the calculations as integer calculations only, using the C++ integer type int. The compiler that we used allocates 4 bytes to this type, which is 32 bits, so an unsigned value of this size ranges from 0 to 2^{32} − 1. We wanted to be able to add two such numbers, so we limit our variables to half this range. So for our prime moduli we chose the largest possible primes that are equivalent to 3 mod 4 and less than 2^{31}. The largest such prime is 2147483647.
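A sketch of this square-root computation, using fast modular exponentiation, is given below (our own code; it assumes a compiler providing the __uint128_t extension, such as GCC or Clang, although for 32-bit moduli like 2147483647 ordinary 64-bit products already suffice).

```cpp
#include <cstdint>

// Fast modular exponentiation: b^e mod p by repeated squaring.
static uint64_t pow_mod(uint64_t b, uint64_t e, uint64_t p) {
    uint64_t r = 1;
    b %= p;
    while (e) {
        if (e & 1) r = (__uint128_t)r * b % p;
        b = (__uint128_t)b * b % p;
        e >>= 1;
    }
    return r;
}

// Lemma 2.9.2: for prime p with p % 4 == 3, a^((p+1)/4) is a square
// root of a mod p whenever one exists. Returns 0 if no root exists.
uint64_t sqrt_mod(uint64_t a, uint64_t p) {
    uint64_t r = pow_mod(a, (p + 1) / 4, p);
    return ((__uint128_t)r * r % p == a % p) ? r : 0;
}
```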

The other problem that we encountered was one of storage. Obviously it is impossible to keep all terms of the series; in fact, considering that only the first few terms are accurate, we would not even want to. In all the series operations we used, terms of higher power in the operands do not affect terms of lower power in the result, so theoretically we should keep a fixed number of terms in the series, which is equal to the final number of terms that the series should yield.

Unfortunately, for reasons unknown to us, when we attempted to do this, we found that we did not manage to produce all the terms that we should produce at that matrix size. We have no idea why this is so, but to make the method converge for a certain number of terms, we have to keep almost twice as many terms as that. For example, to derive the full number of series terms (48) at size 8, we have to keep more than 80 terms for our intermediate series. This slows our calculation down greatly. It also throws off our convergence estimates: it is possible that if we had kept more terms, we would have derived more series terms from each matrix size.

2.9.4 Efficiency

The efficiency of the iterative method depends largely on the convergence of the series at each matrix size. As we do not have a general formula for how many terms will be accurate for a fixed matrix size, it is difficult to analyse the efficiency of the method. All we can do is observe the efficiency of the method relative to matrix sizes.

Suppose that we wish to run the algorithm until the matrices are of dimension m × m, while keeping s series terms. Each addition of series takes O(s) operations, while each multiplication of series takes O(s²) operations. This is also the cost of a division operation on series.

Now, for m × m matrices, a matrix addition operation takes O(m²) addition operations, while a multiplication operation takes O(m³) operations, since each of the m² elements in the result requires O(m) addition and O(m) multiplication operations. For matrices of series, addition would therefore take O(m²s) operations, and multiplication would require O(m³s²) operations.

Now, each pass through the equations requires a fixed number of matrix multiplications and additions. Also, from the step where we expand the matrices to the step where we expand them again is a fixed number of passes. On the other hand, if we build the matrices up from 1 × 1 matrices, we will need to expand them m − 1 times. While the matrices will not always be m × m in size, since we take the same number of iterations at every matrix size, we can expect the ‘average’ size to be m/2, which is linear in m anyway. Therefore, the iterative method will take O(m⁴s²) operations to produce matrices of dimension m × m with series length s.

Theoretically, since this is polynomial time, the efficiency (or inefficiency) of this method depends strongly on the relation of the series convergence to the matrix size, as shown in Table 2.1. For small sizes (1-4), this relation would appear to be linear, which would make the CTM extraordinarily (indeed implausibly) efficient, but at larger sizes convergence is much slower (and also more unpredictable). If, as suggested earlier, the matrix size needed grows exponentially with the number of correct terms, then the time taken is also exponential in the number of terms.

In practical terms, however, the iterative method as given is simply not particularly efficient. Quite apart from having to run the entire calculation 5 times to make use of the Chinese Remainder Theorem, the high (but fixed, so it does not show up in efficiency analysis) number of calculations required for each iteration, and the requirement to keep unpredictable numbers of extra series terms, make it hard for the method to get past 10 × 10 matrices in a reasonable time.

2.10 The renormalization group method

The iterative method outlined in the above section will eventually reach a solution, given enough time. However, in its current form it has (at least) three disadvantages. Firstly, because at each iteration we must apply the power method (which is itself an iterative method) twice, it isn't very fast. Secondly, our expansion procedure, for moving from one matrix size to the next higher size, is very arbitrary; we use it only because it seems to work! Thirdly, but perhaps most importantly from a theoretical perspective, we do not have any proof that the method actually converges.

The so-called corner transfer matrix renormalization group method (CTMRG), which was first devised by Nishino and Okunishi in [105], does not suffer from any of these problems. In this section we present our version of this method, which differs only slightly from the original.




Fig. 2.25: Expansion of F matrices in Equation 2.137.

In the iterative CTM method, we basically took all the CTM equations and tried to solve them for all the variables in the equations. As the variables are matrices which only become exact when they are infinite-dimensional, this involves a large number of variables and a large amount of computing time. In the renormalization group method, we ‘strip back’ the method by returning to the meaning of the matrices.

Considering that, from Equations 2.69 and 2.70, the X and Y matrices can be derived directly from A and F, it would seem desirable to have a method where we only need to calculate A and F. So we would like to have a procedure which, given values for A and F, will find new values for A and F which are closer to the solution of the CTM equations.

We do this by looking at the graphical interpretation of the matrices. We have interpreted the optimal A(a) as the transfer matrix of a quarter plane around a spin of value a, while F(a,b) is the transfer matrix of half a row with end-spins a and b. We note that only in the infinite-dimensional case (where the matrices transfer an infinite area) do the matrices yield the exact κ. Therefore, to put it imprecisely, we will probably get closer to the solution of the equations by making the finite-dimensional matrices transfer ‘as much as possible’.

This is done by sequentially expanding and reducing the A and F matrices. Every time we expand the matrices, we make them transfer a larger area. Every time we reduce the matrices, we try to reduce them in such a way that as little information as possible is lost, so they are closer to the exact solution of the equations. So now we need to find the procedures for expanding and reducing our matrices.

We expand by doubling the size of our matrices, so that the half-row is extended by 1 spin. For the F matrices, we place this spin immediately next to the end-spin, so that the end-spin remains the same and we add one single cell. We then order the rows and columns of the new F so that (if the possible spin values are 0 and 1) all the rows/columns where the new spin is 0 come first. We also need to add the weight of the extra cell, which results in a new F of

\[
F'(a,b) = \begin{pmatrix} \omega\!\begin{pmatrix} 0 & a \\ 0 & b \end{pmatrix} F(0,0) & \omega\!\begin{pmatrix} 0 & a \\ 1 & b \end{pmatrix} F(0,1) \\[2mm] \omega\!\begin{pmatrix} 1 & a \\ 0 & b \end{pmatrix} F(1,0) & \omega\!\begin{pmatrix} 1 & a \\ 1 & b \end{pmatrix} F(1,1) \end{pmatrix}. \tag{2.137}
\]

We illustrate the expansion of F in Figure 2.25.

If we look at the ψ that is generated by any particular F:

\[
\psi_{\sigma_1,\sigma_2,\ldots,\sigma_m} = \operatorname{Tr}\left(F(\sigma_1,\sigma_2)\, F(\sigma_2,\sigma_3) \cdots F(\sigma_m,\sigma_1)\right) \tag{2.138}
\]




Fig. 2.26: Expansion of A matrices in Equation 2.140.

then it can be seen that F′ corresponds to multiplying the half-plane partition function by the column transfer matrix:
\[
[V\psi]_{\sigma_1,\sigma_2,\ldots,\sigma_m} = \operatorname{Tr}\left(F'(\sigma_1,\sigma_2)\, F'(\sigma_2,\sigma_3) \cdots F'(\sigma_m,\sigma_1)\right). \tag{2.139}
\]

Now we can see that expanding F in this manner is equivalent to applying the power method to V, since ψ is an eigenvector corresponding to the maximum eigenvalue of V. If we were to expand F like this repeatedly, we would therefore converge to the solution of the model.

On the other hand, every time we expand F we double its dimension. Therefore we need to counterbalance the expansion by reducing the matrix size after every expansion. This is where A comes in. We keep it at the same size as F by expanding it in a similar manner, as shown in Figure 2.26. Instead of adding a cell, we add a cell and two half-row transfer matrices, to double the size of A. Formally, we replace A(a) with

\[
A'(a) = \begin{pmatrix} \sum_b \omega\!\begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix} A(b) & \sum_b \omega\!\begin{pmatrix} 1 & b \\ 0 & 0 \end{pmatrix} A(b) \\[2mm] \sum_b \omega\!\begin{pmatrix} 0 & b \\ 1 & 0 \end{pmatrix} A(b) & \sum_b \omega\!\begin{pmatrix} 1 & b \\ 1 & 0 \end{pmatrix} A(b) \end{pmatrix}. \tag{2.140}
\]

To reduce the dimension, we note that finding κ is the result of a maximisation problem in A and F, so we would like it to be as large as we possibly can. Considering that Σ_a Tr A⁴(a) can be thought of as the partition function, our reduction procedure should make this quantity as large as possible. Since the CTM equations are valid under the transformations A(a) → P^T(a)A(a)P(a) and F(a,b) → P^T(a)F(a,b)P(b), it seems logical to try some sort of reduction which involves multiplying by matrices with orthonormal columns. These matrices would have to be non-square to reduce the dimension.

We recall our remark from Section 2.4.4 that the A matrices can be taken to be diagonal. If they are indeed diagonal, then an immediately obvious reduction can be made by retaining the most significant (i.e. largest) elements and throwing away the rest. Therefore we diagonalize A(a):
\[
A(a) = V^T(a)\, D(a)\, V(a) \tag{2.141}
\]
where V(a) is an orthogonal matrix and D(a) is diagonal. We further impose the condition that the elements of D(a) are in decreasing order. We then remove the desired number of rows and columns from the bottom and right of D(a) to generate our new value for A(a).


Notably, we do not reverse the diagonalization. To keep the CTM equations valid, we must apply the same transformation to F. First we calculate V^T(a)F(a,b)V(b) and then remove the same number of rows and columns from the bottom and right as we did from D(a). This results in our matrices being reduced to the desired size.

While this reduction algorithm works well when we deal with numerical values only, if we attempt to apply it when expressing our quantities as series, we then have to diagonalize matrices of series. As discussed in Section 2.9, this is not an easy task. We concluded that it was best to avoid the diagonalization altogether. Instead, we used the Arnoldi method as described in Section 2.8.2. This provides an 'approximate' diagonalization, while being substantially easier to compute. However, in practice we found that the Arnoldi method does not make the matrices converge as well as diagonalization.

We also need to figure out how to jump from one size to a larger one, i.e. how to expand the matrices. Fortunately this is easy, as we can just cut off one less row and column when reducing the matrices.

In summary, the procedure for the renormalization group method is:

1. Start with initial A and F of dimension 1 × 1.

2. Expand the A and F matrices using the formulas we have given above.

3. Diagonalize A, or use the Arnoldi method on A.

4. Reduce the matrices to their original size as described above. If we have iterated sufficiently long at the current matrix size (again we used 5 iterations), then instead reduce the matrices to one row and column more than their original size.

5. Go back to step 2.

After this procedure has been carried out, we can then calculate κ via Equation 2.88. We can also calculate other quantities of interest from the expressions in Section 2.6.
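Assembling the pieces sketched above, the outer loop of the procedure might read as follows (again a sketch: n_passes = 5 mirrors the five iterations per size quoted in step 4, and κ would then be computed from the returned matrices via Equation 2.88):

def ctmrg(omega, max_size, n_passes=5):
    # Run the renormalization group CTM method up to max_size.
    A = {a: np.ones((1, 1)) for a in (0, 1)}
    F = {(a, b): np.ones((1, 1)) for a in (0, 1) for b in (0, 1)}
    for size in range(1, max_size):
        for _ in range(n_passes):    # settle at the current size
            A, F = reduce_matrices(expand_A(A, omega), expand_F(F, omega), size)
        # then cut off one row and column fewer, growing the matrices by 1
        A, F = reduce_matrices(expand_A(A, omega), expand_F(F, omega), size + 1)
    return A, F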

2.10.1 Convergence/results

We first attempted to apply the renormalization group method to the hard squares model to find the partition function per site as a series in z, the fugacity of an occupied site. Unfortunately, while the series converged for 2 × 2 matrices, at higher levels it simply did not converge, even when we tried diagonalizing the matrices outright instead of using the Arnoldi method. We are not sure why this is so, except that, since we are not guaranteed to solve the CTM equations if we iterate at a fixed size, there is no requirement for our intermediate approximations to κ to have their first few terms correct.

Also, when we tried to use the Chinese Remainder Theorem in the same way as we used it for the iterative method, we ended up attempting to take the square root of many numbers which did not have a root. There were a few such cases in the iterative method, but it seemed to be able to 'shrug it off', which the renormalization group method could not.


We did, however, notice that the convergence at a fixed matrix size seems to be slightly worse than that given by the iterative method — for example, we managed 10 correct series terms from 2 × 2 matrices, as opposed to 16 for the iterative method. This implies that we never actually reach the solution of the equations at fixed size, a conclusion which will be supported by our numerical calculations on the second-neighbour Ising model in Chapter 3.

To try our hand at something more tractable, we attempted to use the method to find numerical values, rather than series. Here we were more successful, and managed to apply the method to both the hard squares model and the second-neighbour Ising model. We studied the convergence of the method on the second-neighbour Ising model in detail. The results of that study are given in Section 3.3. In short, the method appears to work well for numerics, but the convergence rate depends on the values of the parameters (the interactions), or more specifically, how close the parameters are to a critical point or line. The closer to criticality, the worse the convergence.

By using numerics, we were able to run the method up to 20 × 20 matrices in a relatively short time. For calculating numerical values, this method is more attractive than the iterative method, because although it is less accurate at any matrix size, it is able to reach larger matrix sizes than the iterative method in the same time. This means that we can get more accurate approximations to κ with the renormalization group method.

We compared our approximations to κ (denoted by $\bar\kappa$) at z = 1 with the value

κ = 1.5030480824753322643220663294755536893857810 (2.142)

found in [18]. A log-log plot of the results is shown in Figure 2.27 (the unscaled plot is not very informative because the convergence is too rapid). It seems that the convergence roughly obeys a power law, which is encouraging. Unfortunately, after this size, the approximations tend to fluctuate without converging, which we think is due to finite-iteration error.

2.10.2 Technical notes

We encountered similar problems to the iterative method when dealing with series. With numerics, we used the arbitrary-precision real type cl_R from the CLN library ([67]) for our data. To diagonalize our matrices, we first found the largest eigenvalue and corresponding eigenvector by means of the power method (iterated 15 times), eventually finding the eigenvalue by using the Rayleigh quotient

$$\frac{v^T A v}{v^T v}. \qquad (2.143)$$

To speed convergence, if we had diagonalized the matrix in a previous iteration, we used the eigenvector from that previous diagonalization as the starting vector for the next diagonalization. After finding the eigenvalue and eigenvector, we then used the result below ([122, Theorem 9.5.1]) to deflate the matrix (i.e. turn it into a matrix with the same eigenvalues except for the largest one).

Fig. 2.27: Log-log plot of approximated κ vs. final matrix size ($\ln(\bar\kappa - \kappa)$ against ln(matrix size)).

Lemma 2.10.1. If a symmetric n × n matrix A has eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, where $\lambda_1$ is the largest eigenvalue, then if $v_i$ is the normalised eigenvector of A corresponding to $\lambda_i$, the symmetric matrix

$$A - \lambda_1 v_1 v_1^T \qquad (2.144)$$

has eigenvalues $\lambda_2, \lambda_3, \ldots, \lambda_n, 0$, with corresponding eigenvectors $v_2, v_3, \ldots, v_n, v_1$ respectively.

Finally we repeated the process (generating the next-largest eigenvalue each time) until the desired number of eigenvalues had been found.
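A compact sketch of this eigenvalue procedure (power method, Rayleigh quotient, and deflation via Lemma 2.10.1), in NumPy for illustration; the 15 iterations match the figure quoted above, and the warm-start vectors from previous diagonalizations are omitted for brevity:

import numpy as np

def leading_eigenpairs(A, k, n_iter=15):
    # Find the k largest-magnitude eigenpairs of a symmetric matrix A.
    A = A.copy()
    pairs = []
    for _ in range(k):
        v = np.random.rand(A.shape[0])
        for _ in range(n_iter):           # power method
            v = A @ v
            v /= np.linalg.norm(v)
        lam = (v @ A @ v) / (v @ v)       # Rayleigh quotient (2.143)
        pairs.append((lam, v))
        A -= lam * np.outer(v, v)         # deflate, as in Lemma 2.10.1
    return pairs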

2.10.3 Efficiency

To find the efficiency of the method relative to the matrix size, suppose we wish to run the algorithm to an m × m matrix size, while keeping s series terms (where s = 1 if we are doing numerics). As above, series of length s require $O(s)$ operations for addition, and $O(s^2)$ operations for multiplication. For m × m matrices, addition of two matrices will require $O(m^2 s)$ operations, and multiplication will take $O(m^3 s^2)$ operations.

Now each expansion of the matrices requires each element to be multiplied by an ω, and in the case of A, added together. This requires $O(m^2 s^2)$ operations. On the other hand, the reduction requires the multiplication of matrices (albeit a fixed number of them), which takes $O(m^3 s^2)$ operations. If we are using diagonalization, we need to find around m eigenvalues (remember we are diagonalizing 2m × 2m matrices), each of which requires a fixed number of matrix-vector multiplications (which take $O(m^2 s^2)$ operations). Therefore diagonalization also takes $O(m^3 s^2)$ operations, which means that this is the number of operations needed for one iteration (expansion and reduction).

Again, we have a fixed number of iterations at each matrix size, and m − 1 matrix-size expansions. As our 'average' matrix size will again be linear in m, the number of operations required for the renormalization group method is $O(m^4 s^2)$, the same as for the iterative method.

For numerical calculations, we have s = 1, so the algorithm takes $O(m^4)$ time. If the error of our approximations does indeed obey a power law in the matrix size, then it also obeys a power law in the time taken. This means that the method is still an exponential-time algorithm (in that it takes $\alpha^n$ time to produce n digits of κ).

On the whole, the renormalization group method was much faster than the iterative method in our testing. One reason is that the iterative method not only iterates within each matrix size, but at every pass (recalculation of every matrix once) it uses the power method, which is itself an iterative matrix method. The renormalization group method, if using the Arnoldi method, does not have this inefficiency. If it uses diagonalization, it does have to use an iterative matrix method, but even so, the number of matrix multiplications needed is far less than that needed for the iterative method.

2.11 Conclusion

In this chapter, we have looked at the problem of corner transfer matrices, and how to generate solutions for statistical mechanical models using Baxter's CTM equations. Firstly, we (rather laboriously) re-derived the CTM equations, paying special attention to the hard squares model. Then we proposed two methods. One was our own invention, based on iterating through the CTM equations; the other was based on the renormalization group method of Nishino and Okunishi. Neither of these methods yielded exactly what we wanted — the iterative method was too slow to improve significantly on Baxter's hard squares series of 25 years ago, while the faster renormalization group method did not work at all for series.

Although we have a fairly good understanding of the CTM equations and how they work, and we have devised some methods to exploit them, the fact remains that we have not achieved the breakthrough in efficiency that Baxter found in 1978, and that the CTM methods have promised ever since. This raises the question: what other avenues can we take to further the CTM idea?

For a start, considering that the renormalization group method works very efficiently for numerics, it seems strange that it should fail so badly for series calculations. It is possible that we are missing some small change, or error, in the method that would allow it to work for series. Certainly this is something to work on, although at the moment we have no idea what that small change or error might be. (It is also possible that the failure of the algorithm is due to programming bugs, although this seems unlikely!)

Other possibilities include attempting to apply both methods to other models. One model that has been frequently mentioned as a possibility is the q-state Potts model. This is an extension of the Ising model where each spin has q possible values, and interacts only with nearest-neighbour spins of the same value. Owing to the great increase in the number of possible configurations, physical quantities for the q-state Potts model have not been calculated as accurately as many people would like.

For the CTM methods, there is an obvious extension which would make the Potts model quantities calculable (simply by not restricting spins to only 2 values, and then adjusting ω). Although this is not very efficient, there is a possibility that there may be a symmetry between the non-zero states that can be exploited to further the method. Some thought has been given in that direction, albeit with little success as yet.

Another, more immediate model to which we can apply the CTM methods is the second-neighbour Ising model, as defined in Chapter 1. In fact, we have done significant studies on this model with the renormalization group CTM method and the previously-used finite lattice method. This is the topic of the next chapter.


3. THE SECOND-NEIGHBOUR ISING MODEL

3.1 Introduction

In Chapter 1, we gave a brief introduction to statistical mechanical models and how they arise. As mentioned, one of the most studied (if not the most studied) of these models is the spin-$\frac{1}{2}$ square lattice Ising model, often referred to simply as the Ising model. This model takes into account two types of magnetic interaction: an external field of strength H (which acts on single spins), and an interaction of strength J (which acts on nearest-neighbour pairs of spins). The Hamiltonian that arises from these two interactions is

$$\mathcal{H}(\sigma_1, \sigma_2, \ldots, \sigma_N) = -J\sum_{\langle i,j\rangle}\sigma_i\sigma_j - H\sum_i \sigma_i, \qquad (3.1)$$

with the partition function defined as

$$Z_N = \sum_{\sigma_1,\sigma_2,\ldots,\sigma_N} e^{-\beta\mathcal{H}(\sigma_1,\sigma_2,\ldots,\sigma_N)}. \qquad (3.2)$$

The Ising model is a very useful model to study, because it is relatively simple, and furthermore a simple case (zero field) has been solved exactly (in fact, it has been solved on a variety of two-dimensional lattices — see [14], [42] and [43]). However, the assumption that the magnetic spin-pair interaction applies only to spins/atoms that are one unit or less apart seems rather unrealistic. In reality, there would be magnetic interactions between all spins, but the strength of such an interaction would decrease very rapidly with distance.

On the other hand, even when the interaction is restricted to nearest-neighbour spins, the model is hard to solve. If we removed this restriction entirely, the model would become much harder. So we compromise, by allowing the interaction to affect nearest-neighbour pairs and second-nearest-neighbour pairs only, with different strengths. We let the original nearest-neighbour interaction have strength $J_1$, and let the second-nearest-neighbour interaction apply in a similar manner, but with strength $J_2$. This is an IRF model — the interactions are shown in Figure 3.1. The Hamiltonian of the system is

$$\mathcal{H}(\sigma_1,\sigma_2,\ldots,\sigma_N) = -J_1\sum_{\langle i,j\rangle}\sigma_i\sigma_j - J_2\sum_{\langle i,j\rangle_2}\sigma_i\sigma_j - H\sum_i\sigma_i \qquad (3.3)$$

where the second sum is over all second-nearest-neighbour pairs of spins. We call this model the second-neighbour Ising model.
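As a concrete check on these definitions, Equation 3.3 can be evaluated by direct enumeration on a small lattice. The sketch below is illustrative only — the ±1 spins, the L × L torus, and the function name are our assumptions — and its cost grows as 2^(L·L), so it is only usable for tiny lattices:

import itertools, math

def partition_function(L, J1, J2, H, beta):
    # Z_N of Equation 3.2 for the Hamiltonian of Equation 3.3,
    # on an L x L torus of +/-1 spins, by direct enumeration.
    Z = 0.0
    for conf in itertools.product((-1, 1), repeat=L * L):
        s = lambda x, y: conf[(x % L) * L + (y % L)]
        E = 0.0
        for x in range(L):
            for y in range(L):
                # count each first- and second-neighbour bond once
                E -= J1 * s(x, y) * (s(x + 1, y) + s(x, y + 1))
                E -= J2 * s(x, y) * (s(x + 1, y + 1) + s(x + 1, y - 1))
                E -= H * s(x, y)
        Z += math.exp(-beta * E)
    return Z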

Fig. 3.1: The interactions around a cell for the second-neighbour Ising model.

Fig. 3.2: A lattice with two different types of spins (filled and hollow).

Physically, the second-neighbour Ising model is also appropriate when there is more than one type of atom in the lattice. For example, if the lattice consisted of two different types of atoms, interleaved with each other as in Figure 3.2, then all nearest-neighbour bonds would link one atom of each type, while all second-neighbour bonds would link two atoms of the same type. The nearest-neighbour bonds would then all have the same strength, as would all the second-neighbour bonds, but we would expect these strengths to be different from each other — they might possibly even have different signs.

The second-neighbour Ising model, although well-studied, has not been studied nearly as much as its simpler first-neighbour counterpart. In part, this is due to the complexity of the model: there are many more possibilities when another variable is added to the mix. In particular, this leads to much more complex phase transitions.

A phase transition occurs when there is a singularity in the free energy (or equivalently, the partition function). Physically, this is indicated by a drastic change in the properties of the magnet when one of the external factors is changed. For example, in our case of a magnet, the interaction strength between particles cannot generally be changed experimentally, but the temperature can. If the temperature is below a certain value, called the critical temperature, then in the absence of an external magnetic field, the magnet will retain its own magnetism. This phenomenon is called spontaneous magnetization. On the other hand, if the temperature is above the critical temperature, then in the absence of the external field, the magnet loses its magnetism, i.e. ceases to be a magnet.

If we denote the critical temperature by $T_c$, then the regime $T < T_c$ is called the low-temperature phase, while the regime $T > T_c$ is the high-temperature phase. At the boundary ($T = T_c$), we therefore have a phase transition. In some other models, even more dramatic behavioural changes are possible; for instance, in the model of a liquid, the liquid will become a gas at the phase transition point.

Fig. 3.3: A typical configuration in the ferromagnetic low-temperature phase of the Ising model.

In the case of the simple (first-neighbour) Ising model, the model has two zero-field phase transitions, at the critical points $\tanh\frac{J_c}{kT_c} = \pm(\sqrt{2}-1)$, $H = 0$. This splits the range of possible temperatures into 3 phases. In Figures 3.3 to 3.5, we show what the model looks like in zero field in these phases. Figure 3.3 shows the ferromagnetic low-temperature phase ($T < T_c$, $J > 0$), where like neighbours are strongly encouraged. This means that most spins take the same value. Figure 3.4 shows the anti-ferromagnetic low-temperature phase ($T < T_c$, $J < 0$), where unlike neighbours are strongly encouraged. This gives an alternating pattern. Figure 3.5 shows the high-temperature phase ($T > T_c$), where the interaction is weak. This makes the spins look random.

Now, we wish to find the zero-field phase transitions of the second-neighbour Ising model. Naturally, the model with $J_2 = 0$ is equivalent to the first-neighbour Ising model, and therefore it contains the same critical points as the nearest-neighbour model. However, when $J_2 \neq 0$, the model also has phase transitions. It turns out that when $-\sqrt{2}+1 < \tanh\frac{J_2}{kT} < \sqrt{2}-1$, the model always has two critical points. These critical points form a line in the $J_1$-$J_2$ plane, which we call the critical line.

Fig. 3.4: A typical configuration in the anti-ferromagnetic low-temperature phase of the Ising model.

Fig. 3.5: A typical configuration in the high-temperature phase of the Ising model.

Fig. 3.6: We can divide the lattice into two sets of spins such that every nearest-neighbour bond connects one spin from each set.

Fig. 3.7: The Ising model is symmetrical in the parameter J. (a) A configuration of spins; there are 24 like bonds and 16 unlike bonds, and the spins about to be reversed are indicated. (b) The same configuration with half the spins reversed; there are now 16 like bonds and 24 unlike bonds.

It is worth noting that both the simple and the second-neighbour Ising model possess a large amount of symmetry with respect to spin values. In the case of the first-neighbour Ising model, there is a natural one-to-one correspondence between configurations in the ferromagnetic phase ($J > 0$) and configurations in the anti-ferromagnetic phase ($J < 0$) if the external field H is 0. This can be seen by observing that we can divide the lattice into two sets of spins, as shown in Figure 3.6. Every nearest-neighbour bond connects one spin from each set.

Now, if we take a particular configuration of spins, and reverse the values of all the spins in one set, we will reverse (i.e. switch like bonds with unlike bonds) every nearest-neighbour bond in the lattice. This creates a correspondence between a configuration in the model with interaction strength J and another configuration in the model with interaction strength −J. The two configurations will have exactly the same weight, since we have no external magnetic field. We illustrate this in Figure 3.7.

This correspondence shows that the partition function of the simple Ising model is unchanged if the interaction strength is reversed. Therefore, any critical point has a corresponding critical point with equal but negative interaction strength, so the critical points always come in pairs.
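This symmetry is easy to verify numerically with the brute-force partition function sketched in Section 3.1 (an illustrative check, not a calculation from the thesis). On an even-sized torus, which admits a proper chequerboard two-colouring, reversing one sublattice maps $J_1$ to $-J_1$ and leaves Z unchanged; running the same check with $J_2 \neq 0$ illustrates the second-neighbour version of the symmetry discussed next:

# Z is invariant under J1 -> -J1 at H = 0 (L even)
Z_plus = partition_function(4, J1=0.3, J2=0.0, H=0.0, beta=1.0)
Z_minus = partition_function(4, J1=-0.3, J2=0.0, H=0.0, beta=1.0)
assert abs(Z_plus - Z_minus) < 1e-9 * Z_plus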

For the second-neighbour Ising model, a similar symmetry applies. We cannot decouple the lattice into two sets of spins for which the bonds only link one spin from each set, but we can take the same sets that we used for the first-neighbour Ising model. When we reverse all the spins in one set, we reverse all first-neighbour bonds, but all second-neighbour bonds stay the same. Thus we can reverse the nearest-neighbour interaction, but keep the second-neighbour interaction the same, and get the same partition function. In particular, this means that the critical line for the second-neighbour model is symmetrical about the $J_2$ axis.

A bit of reflection (no pun intended) shows us that when $J_1 = 0$, there is no interaction between nearest-neighbour sites on the lattice, even indirectly. Thus the lattice decouples into two square lattices, each of which has nothing to do with the other. We show this in Figure 3.8.

Fig. 3.8: With no nearest-neighbour interaction, the lattice decouples into two separate square lattices.

If we do this, the original second-neighbour interaction becomes a first-neighbour interaction on the two separate lattices. Therefore each lattice acts as an independent first-neighbour Ising model, and has the same critical points. Thus we know that the second-neighbour Ising model also has a phase transition when $J_1 = 0$ and $\tanh\frac{J_2}{kT} = \pm(\sqrt{2}-1)$. It turns out that the topmost critical point $(J_1 = 0,\ \tanh\frac{J_2}{kT} = \sqrt{2}-1)$ is connected to both the $J_2 = 0$ critical points by a critical line. On the other hand, the other critical point $(J_1 = 0,\ \tanh\frac{J_2}{kT} = -\sqrt{2}+1)$ is actually on a new critical line altogether.

We view the critical lines by transforming to the variables $u = \tanh\frac{J_1}{kT}$ and $v = \tanh\frac{J_2}{kT}$, which we will use from now on. This enables us to view the entire realm of physical possibilities in four unit squares. In these variables, the critical lines look approximately like those shown in Figure 3.9.

We call the diagram of the critical lines the phase diagram. From this diagram, it can be seen that there are 4 different phases. The top-right phase is the low-temperature ferromagnetic phase, where like nearest-neighbour spins are encouraged and the second-neighbour interaction is either positive or insignificant compared to the nearest-neighbour interaction. An example of a likely configuration in such a phase was shown in Figure 3.3.

The top-left phase is the low-temperature anti-ferromagnetic phase, where unlike nearest-neighbour spins are encouraged, and again the second-neighbour interaction is positive or insignificant. Note that any nearest-neighbour interaction (positive or negative) encourages like second-neighbours. An example of a likely configuration in this phase was shown in Figure 3.4.

The middle phase is the high-temperature phase, in which the interactions are too weak to enforce spontaneous magnetization, or work against each other. A typical configuration in this phase was shown in Figure 3.5. Notably, in these three phases, the second-neighbour interaction never overrides the first-neighbour interaction.

Fig. 3.9: An approximate phase diagram in the variables u and v. The four regions are the low-temperature ferromagnetic, low-temperature anti-ferromagnetic, high-temperature and super anti-ferromagnetic phases; the marked points on the axes are at $\sqrt{2}-1$ and $1-\sqrt{2}$.

Fig. 3.10: A typical configuration in the super anti-ferromagnetic phase of the second-neighbour Ising model.

However, in the lowest phase, the

anti-ferromagnetic second-neighbour interaction does override the first-neighbour interaction, forcing the lattice to decouple into two separate anti-ferromagnetic first-neighbour Ising models. This gives a characteristic 'same row' or 'same column' look to the spins. We show a likely configuration in this phase, called the super anti-ferromagnetic phase, in Figure 3.10.

The free energy and all its derivatives (including most physical quantities of interest) are singular at a phase transition, and they tend to behave according to a power law close to the critical point. For example, in the first-neighbour Ising model, the magnetization per site m obeys such a law in terms of the temperature. In this chapter, we say that $f(x) \sim g(x)$ as $x \to x_0$ if and only if $\lim_{x\to x_0} \frac{\ln f(x)}{\ln g(x)} = 1$. Note that this is a slightly different meaning to that used in other chapters. Using this notation, m obeys the law

$$m(T) \sim (T_c - T)^{1/8} \quad \text{as } T \to T_c^-. \qquad (3.4)$$

Many physical quantities obey power laws at criticality, with varying exponents. These exponents are called critical exponents.

In Figure 3.9 we gave a rough sketch of what the phase diagram is estimated to look like. We would like to locate the critical lines more precisely. The location of the critical lines is a well-studied problem, with many different methods applied to it. These methods include closed-form approximations (starting from 1951 in [45], and continuing with [53], [61], [34], [63] and [23]), series expansions (starting from 1969 in [39], continuing with [112], [123] and [79]), Monte Carlo methods ([92], [25] and [28]), and renormalization-group theory ([101], [102] and [133]). More recently, the cluster variation method ([100]) has also been used to approximate the shape of the critical lines.

Because the second-neighbour Ising model can be expressed as an IRF model, we can use the CTM methods described in Chapter 2 to calculate quantities of the model, and through these, the location of the critical line. We do this later in this chapter. For this purpose, we have used the more efficient renormalization group method. However, we found that for unknown reasons (possibly a breakdown of symmetry), the method breaks down in the super anti-ferromagnetic phase. For this phase, we instead used the finite lattice method (FLM) that we mentioned in Chapter 2. As we approach the critical point, the FLM is not as efficient as the CTM, but in the absence of the latter, we make do with the former.

While we are interested in the location of the critical line everywhere in the phase diagram, we are particularly interested in the crossover point, which is $(0, \sqrt{2}-1)$ — in particular, in the behaviour of the critical line as it approaches this point from below. It has been speculated (in, for example, [23]) that the line has a cusp at that point, and that as it approaches the point, it obeys a power law:

$$v_c(0) - v_c(u) \sim u^{\xi_c} \quad \text{as } u \to 0 \qquad (3.5)$$

where $v_c(u)$ denotes the critical v for a given u (usually pertaining to the higher critical line). The exponent $\xi_c$ is called the crossover exponent, and has been estimated, using a scaling assumption, to be $\frac{4}{7}$ ([48]). We will investigate the behaviour of the critical line at this point later in this chapter.


Another aspect of the phase diagram in which we are also interested is the lower phase boundary. It has been observed that on the upper critical line, the various critical exponents always stay the same, a phenomenon called universality. However, it was suggested by van Leeuwen in 1975 ([133]) that the lower phase boundary is not universal. In 1979, Barber ([8]) confirmed this by showing that the critical exponent for the specific heat is not constant on the lowest critical line. Other studies have also looked at the critical exponents on this line, using Monte Carlo methods ([126], [92], [25] and [2]), series expansions ([112]), renormalisation-group calculations ([102]) and coherent-anomaly methods ([127] and [99]). We apply our methods to study both the location of this line and the critical exponents on it.

In Section 3.2, we outline the finite lattice method and how it works. In Section 3.3, we conduct a detailed analysis of the convergence of the renormalization group CTM method when applied to the second-neighbour Ising model. In Section 3.4, we give a brief primer on scaling theory and generalised homogeneous functions, which we then use in Section 3.5 to derive a scaling estimate of the crossover exponent. Then we turn to our methods to verify our estimate. In Section 3.6, we discuss some ways to estimate the location of the critical lines, using both numerical and series calculations. We do the actual calculations in Section 3.7, as well as estimating the crossover exponent and the critical exponent along the lower critical line. Finally, in Section 3.8, we recap what we have done and look at possible ways to take this research further.

3.2 The finite lattice method

3.2.1 Finite lattice approximation

In our analysis of the second-neighbour Ising model, we use the renormalization group corner transfer matrix method that we described in Chapter 2. However, as the CTM method is still rather experimental, we also analyse the model with the established finite lattice method. In particular, we have not managed to get the CTM method to work in the bottom phase, but the FLM still works there. In this section, we describe the finite lattice method and how it works. The majority of this section is taken from [50].

The finite lattice method, like the CTM method, attempts to calculate various quantities of interest in a statistical mechanical model. As stated previously, the most important of these quantities is the partition function, $Z_N$, and the partition function per site, κ.

The starting point of the finite lattice method comes from the fact that the free energy per site can generally be expressed as a connected graph expansion — that is, as a series where the coefficients depend on the numbers of some type of connected graph. In other words, the FLM assumes the existence of an expansion of the form

$$\psi = \sum_\alpha b_\alpha \phi_\alpha(z) \qquad (3.6)$$

where the sum is over all connected graphs α, $b_\alpha$ is the number of ways α can be embedded


in the lattice (divided by N), and $\phi_\alpha(z)$ is the contribution to the free energy from the graph α. If, instead of the infinite lattice, we apply a similar assumption to a finite lattice γ, then we assume that

$$\psi_\gamma = \sum_{\alpha\subseteq\gamma}\eta(\gamma,\alpha)\,\phi_\alpha(z) \qquad (3.7)$$

where η(γ, α) is the number of ways α can be embedded in γ. Here we have taken $\psi_\gamma$ to be the free energy of the lattice γ (as opposed to the free energy per site).

Now take a set of connected graphs A which contains γ, and which has the property that any subgraph of an element of A must also belong to A. We can then rewrite Equation 3.7 so that the sum is over all elements of A, since $\eta(\gamma,\alpha) \neq 0$ if and only if α ⊆ γ. Since γ is arbitrary, we can then treat the equation as a component of a matrix-vector equation. The matrix in this equation has elements η(γ, α), so we can order the graphs so that it is lower triangular. By construction, its diagonal elements are non-zero, so it is invertible. If the inverse has elements ν(α, γ), then

$$\phi_\alpha(z) = \sum_{\gamma\in A}\nu(\alpha,\gamma)\,\psi_\gamma \qquad (3.8)$$

for all α ∈ A. This then implies that we can approximate the free energy per site of the infinite lattice by

$$\psi \approx \sum_{\alpha\in A} b_\alpha\phi_\alpha(z) = \sum_{\alpha\in A}\sum_{\gamma\in A} b_\alpha\nu(\alpha,\gamma)\,\psi_\gamma = \sum_{\gamma\in A} a_\gamma\psi_\gamma \qquad (3.9)$$

where $a_\gamma = \sum_\alpha b_\alpha\nu(\alpha,\gamma)$. Thus the free energy per site can be approximated by a linear combination of the free energies of finite sublattices. The more graphs we put in A, the better the approximation becomes.

At this point, this approximation is not particularly useful, since A contains all subgraphs of any of its elements, and the number of subgraphs of even one graph grows very quickly with the number of vertices of that graph. However, we can simplify this to get a useful approximation. We define $A_{\max}$ to be a maximal set in which no element is a subgraph of another element. Then, if we take A to be the set consisting of $A_{\max}$ and all subgraphs of all elements of $A_{\max}$, it can be proved ([70]) that the only elements γ ∈ A for which $a_\gamma \neq 0$ are the graphs which are intersections of any number of elements of $A_{\max}$. This drastically reduces the number of graphs which we must sum over.

For the square lattice, we take $A_{\max}$ to be the set of rectangles of fixed perimeter 2k. We define lattice rectangles to include all vertices inside the perimeter, which is rectangular. Then the elements of A for which $a_\gamma \neq 0$ are rectangles of perimeter ≤ 2k. Thus we can rewrite our approximation as

$$\psi \approx \sum_{m,n} a_{m,n}\psi_{m,n} \qquad (3.10)$$

where the sum is over all positive m, n such that m + n ≤ k. We have taken $\psi_{m,n}$ to be the free energy of an m × n rectangular lattice. Multiplying by −β and exponentiating gives us


the equivalent expression for the partition function per site:

$$\kappa \approx \prod_{m,n} Z_{m,n}^{a_{m,n}} \qquad (3.11)$$

where $Z_{m,n}$ is the partition function of an m × n lattice. This approximation becomes more accurate as k becomes larger, and in the limit k → ∞ it gives the exact partition function per site.

In fact, by using rectangles, we save even more computation than the above equation would indicate, because it turns out that the coefficients $a_{m,n}$ vanish if the half-perimeter is less than k − 3. From [50], the required coefficients are

$$a_{m,n} = \begin{cases} 1 & \text{if } m+n = k \\ -3 & \text{if } m+n = k-1 \\ 3 & \text{if } m+n = k-2 \\ -1 & \text{if } m+n = k-3 \\ 0 & \text{otherwise.} \end{cases} \qquad (3.12)$$

If the model has reflection symmetry in the line at an angle of $\frac{\pi}{4}$ to the horizontal, then $Z_{m,n} = Z_{n,m}$ and we need enumerate only the rectangles for which m ≤ n. We then use the coefficients a′, where

$$a'_{m,n} = \begin{cases} a_{m,n} & \text{if } m = n \\ 2a_{m,n} & \text{if } m < n \\ 0 & \text{otherwise.} \end{cases} \qquad (3.13)$$
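The bookkeeping of Equations 3.11-3.13 is easy to mechanise. The following illustrative sketch returns the rectangles and exponents for a given cutoff k, with the m ≤ n symmetry of Equation 3.13; for k = 4 it reproduces exactly the combination used in the worked example of Section 3.2.3 below:

def flm_coefficients(k):
    # Rectangles (m, n) with m <= n and exponents a'_{m,n} for cutoff k.
    weights = {k: 1, k - 1: -3, k - 2: 3, k - 3: -1}     # Equation 3.12
    coeffs = {}
    for m in range(1, k):
        for n in range(m, k):
            if m + n in weights:
                a = weights[m + n]
                coeffs[(m, n)] = a if m == n else 2 * a  # Equation 3.13
    return coeffs

print(flm_coefficients(4))   # {(1, 1): 3, (1, 2): -6, (1, 3): 2, (2, 2): 1}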

3.2.2 Transfer matrix method

The finite lattice approximation gives us an efficient way of calculating the partition function per site from finite lattice partition functions. This still leaves us with the problem of finding those partition functions. The most obvious method, of course, is to simply calculate them via direct enumeration — we need to sum over all possible configurations of spins, but this is a finite sum and relatively easy to calculate. However, although direct enumeration is possible, it quickly becomes infeasible due to the large number of calculations required, as might be expected.

The method that is generally used to calculate the finite lattice partition functions is the transfer matrix method. We described the basics of transfer matrices in the context of an infinite lattice in Section 2.2. The same ideas apply here, except that instead of finding the partition function of an infinite lattice by taking the maximum eigenvalue of an infinite-dimensional matrix, we can express the partition function as the trace of a finite power of a finite transfer matrix:

$$Z_{m,n} = \operatorname{Tr} V^n. \qquad (3.14)$$

The above equation was derived by assuming toroidal boundary conditions. As we now wish to calculate the partition function of finite lattices, this is no longer appropriate. Fortunately, we can apply the same principle by using fixed boundary conditions — we picture the finite lattice as a subset of an infinite lattice, and give all spins outside the finite lattice a fixed value (which may differ for each spin). This value is usually in alignment with the most likely state, which we call the ground state. The spins outside the lattice contribute only through interactions with spins in the lattice; they do not affect the partition function otherwise.

Fig. 3.11: Transfer matrices for a finite lattice. Hollow spins are not in the lattice.

If we modify V so that it includes interactions with the ground state at the top and bottom edges of the finite lattice, and remove the toroidal boundary conditions, then we can still calculate the finite lattice partition function. We set $\psi_1$ and $\psi_2$ to be vectors whose elements are the interactive contributions to the partition function from the left and right edges of the lattice, respectively, given the spin values on those edges. For example, $[\psi_1]_{(1,1,\ldots,1)}$ is the interactive contribution from the left edge of the lattice if all spins on that edge have the value 1. Then we have

$$Z_{m,n} = \psi_1^T V^n \psi_2. \qquad (3.15)$$

We illustrate the new arrangement in Figure 3.11. We would like to calculate the finite lattice partition function by using this equation directly, but the dimension of V is the number of possible spin states on a cut of m spins, which is $2^m$. Thus V contains $4^m$ entries, and in (for example) the Ising model, every one of these elements is non-zero (since no configuration is directly prohibited). The number of calculations required therefore becomes prohibitively large very quickly as the lattice size increases.

To overcome this problem, we break V up, in much the same way that we decomposed it into single cells in Section 2.4.1. If we define $\omega\begin{pmatrix}a & b\\ c & d\end{pmatrix}$ to be the weight of a single cell with spins a, b, c and d, then we have

$$V_{(\sigma_1,\sigma_2,\ldots,\sigma_m),(\sigma'_1,\sigma'_2,\ldots,\sigma'_m)} = e_b(\sigma_1,\sigma'_1)\, e_t(\sigma_m,\sigma'_m) \prod_{i=1}^{m-1}\omega\begin{pmatrix}\sigma_i & \sigma'_i\\ \sigma_{i+1} & \sigma'_{i+1}\end{pmatrix} \qquad (3.16)$$


Fig. 3.12: Single-cell transfer matrices. The first moves an n-spin cut to an (n + 1)-spin cut; the second moves the cut further to the right and down; the third reduces it to n spins.

where $e_b$ and $e_t$ are the edge contributions. However, unlike what we did in Section 2.4.1, we do not split the weight of spins or bonds which lie in more than one cell. Instead, we set ω so that it contains all the weight of its bottom-right spin, and of the edges adjacent to it. This introduces an asymmetry into ω.

For example, take the simple Ising model in the zero-field low-temperature phase. We normalise the Boltzmann weights so that the configuration with all 1 spins has weight 1. Then a like bond has a weight of 1, while an unlike bond has a weight of $z = e^{-2\beta J}$. We set all the boundary sites (not in the finite lattice) to be 1. This gives

$$\omega\begin{pmatrix}a & b\\ c & d\end{pmatrix} = z^{(1-bd)/2} z^{(1-cd)/2}, \qquad e_t(a,b) = z^{(1-ab)/2} z^{(1-b)/2}, \qquad e_b(a,b) = z^{(1-b)/2}. \qquad (3.17)$$

Now we can break V down as a product of transfer matrices:

$$V = W_l\left(\prod_{i=1}^{m} W_i\right) W_r \qquad (3.18)$$

where $W_i$ is a matrix which transfers a single cell, given a 'split' cut of n + 1 sites. $W_l$ adds the weight generated by the interaction of the top edge with the boundary spins, and turns the cut from an n-spin cut into an (n + 1)-spin cut (which means that it is not square). $W_r$ does the reverse, adding the weight of the bottom boundary and reducing the cut to n sites. For example,

$$[W_i]_{(\sigma_1,\ldots,\sigma_{m+1}),(\sigma'_1,\ldots,\sigma'_{m+1})} = \begin{cases} \omega\begin{pmatrix}\sigma_i & \sigma'_i\\ \sigma_{i+1} & \sigma'_{i+1}\end{pmatrix} & \text{if } \sigma_j = \sigma'_j \text{ for all } j \neq i+1 \\ 0 & \text{otherwise} \end{cases} \qquad (3.19)$$

and $W_l$ and $W_r$ are defined similarly, including interactions with the boundary conditions. Each of these matrices transfers a single cell, as illustrated in Figure 3.12. For more details, see [29].

Since each of these matrices has at most two non-zero elements in each row, they are sparse, and we therefore do not have to devote much storage to them. In fact, we need not store them at all, and can multiply by them implicitly. To find the partition function of a finite lattice, we start with $\psi_1$, which is usually easy to find, and then (implicitly) multiply by the transfer matrices in the prescribed order. We can then use the finite lattice approximation to reconstruct an approximation for the infinite lattice partition function per site.
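To illustrate the implicit multiplication (a sketch under our own bit-indexing conventions, not the thesis code): multiplying the running row vector by the $W_i$ of Equation 3.19 only ever redistributes each entry over the two values of the new spin $\sigma'_{i+1}$, so one pass costs $O(2^{m+1})$ cell-weight evaluations and the matrix is never formed.

def apply_Wi(psi, i, omega):
    # Compute psi^T W_i implicitly, for W_i as in Equation 3.19.
    # psi has 2**(m+1) entries; spin sigma_j sits in bit j of the index.
    out = [0.0] * len(psi)
    for idx, val in enumerate(psi):
        if val == 0.0:
            continue
        t = (idx >> i) & 1            # sigma_i = sigma'_i (unchanged)
        s = (idx >> (i + 1)) & 1      # old spin sigma_{i+1}
        for s_new in (0, 1):          # new spin sigma'_{i+1}
            jdx = (idx & ~(1 << (i + 1))) | (s_new << (i + 1))
            out[jdx] += val * omega(t, t, s, s_new)
    return out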

3.2.3 The Ising model — an example

To illustrate the finite lattice method, we will construct an approximation for the partition function per site of the simple Ising model. We first illustrate the TM method by finding the partition function of a 2 × 2 lattice, expanding in the low-temperature variable $z = e^{-2\beta J}$. We start from a ground state of all 1s, to which we assign a normalised weight of 1. Each like bond has a weight of 1, and each unlike bond has a weight of z. We order the possible cuts along a column in decreasing lexicographical order (which means that we order according to the topmost spin, then the next topmost, and so on). We then have the transfer matrices

$$W_l = \begin{pmatrix} 1&0&0&0&z^2&0&0&0 \\ 0&1&0&0&0&z^2&0&0 \\ 0&0&z&0&0&0&z&0 \\ 0&0&0&z&0&0&0&z \end{pmatrix} \qquad (3.20)$$

$$W_1 = \begin{pmatrix} 1&0&z^2&0&0&0&0&0 \\ 0&z&0&z&0&0&0&0 \\ 1&0&z^2&0&0&0&0&0 \\ 0&z&0&z&0&0&0&0 \\ 0&0&0&0&z&0&z&0 \\ 0&0&0&0&0&z^2&0&1 \\ 0&0&0&0&z&0&z&0 \\ 0&0&0&0&0&z^2&0&1 \end{pmatrix} \qquad (3.21)$$

$$W_r = \begin{pmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&z&0 \\ 0&0&0&z \\ 1&0&0&0 \\ 0&1&0&0 \\ 0&0&z&0 \\ 0&0&0&z \end{pmatrix}. \qquad (3.22)$$

Furthermore, the starting vectors are

$$\psi_1^T = \left(1,\; z^3,\; z^3,\; z^4\right) \qquad (3.23)$$

and

$$\psi_2^T = \left(1,\; z,\; z,\; z^2\right). \qquad (3.24)$$


Multiplying these out gives the normalised partition function

$$Z_{2,2} = \psi_1^T W_l W_1 W_r \psi_2 = 1 + 4z^4 + 4z^6 + 7z^8. \qquad (3.25)$$

To get an approximation for κ, we apply this procedure to the 1 × 1, 1 × 2 and 1 × 3 lattices, which (without going into the details) gives

$$Z_{1,1} = 1 + z^4 \qquad (3.26)$$

$$Z_{1,2} = 1 + 2z^4 + z^6 \qquad (3.27)$$

$$Z_{1,3} = 1 + 3z^4 + 2z^6 + 2z^8 \qquad (3.28)$$

and therefore

$$\kappa \approx Z_{1,1}^{3}\, Z_{1,2}^{-6}\, Z_{1,3}^{2}\, Z_{2,2} = 1 + z^4 + 2z^6 + 5z^8 - 14z^{10} + \ldots \qquad (3.29)$$

which is accurate up to the $z^8$ term. The $z^{10}$ term should have coefficient 14.
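These series are easy to check by direct enumeration, since each fixed-boundary partition function is just a sum of z^(number of unlike bonds) over all configurations of the rectangle sitting in a sea of +1 spins. The following illustrative sketch (using sympy for the series arithmetic) reproduces Equations 3.25-3.29; the enumeration is the naive one dismissed in Section 3.2.2, which is perfectly feasible at these sizes:

import itertools
from sympy import symbols, series

z = symbols('z')

def Z_rect(m, n):
    # Normalised fixed-boundary Z of an m x n rectangle in a sea of +1s.
    def spin(conf, x, y):             # spins outside the lattice are +1
        return conf[x * n + y] if 0 <= x < m and 0 <= y < n else 1
    Z = 0
    for conf in itertools.product((1, -1), repeat=m * n):
        unlike = sum(spin(conf, x - 1, y) != spin(conf, x, y)
                     for x in range(m + 1) for y in range(n))
        unlike += sum(spin(conf, x, y - 1) != spin(conf, x, y)
                      for x in range(m) for y in range(n + 1))
        Z += z ** unlike
    return Z

assert Z_rect(2, 2) == 1 + 4*z**4 + 4*z**6 + 7*z**8   # Equation 3.25
kappa = Z_rect(1, 1)**3 * Z_rect(1, 2)**-6 * Z_rect(1, 3)**2 * Z_rect(2, 2)
print(series(kappa, z, 0, 11))  # 1 + z**4 + 2*z**6 + 5*z**8 - 14*z**10 + O(z**11)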

3.3 Convergence of the CTM method

When we use the CTM method to approximate quantities of the second-neighbour Ising model, we would like to know how accurate the approximations are. In this section we study the convergence of the CTM method for this model — in particular, we would like to know not only the rate of convergence, but whether the method converges at all!

As we use the (spontaneous) magnetization to find our phase boundaries, we will look at how our calculated value for this quantity behaves as we execute the algorithm. We would expect other quantities (like the partition function per site, isothermal susceptibility, etc.) to converge in a similar manner. Note that for ease of programming, we have made the spins take the values 0 or 1, rather than −1 or 1. Furthermore, the 0 spin now corresponds to what was the 1 spin. This means that the magnetization m is now what would have been $\frac{1}{2} - m$.

Recalling the renormalization group CTM method from Section 2.10, the method works by finding values for the matrices A(a) and F(a, b), where a and b take all spin values. These matrices are fixed at a specific dimension (starting at 1 × 1). We then execute a fixed number of 'passes', where each pass consists of expanding, and then reducing, the matrices according to the formulas given in Section 2.10. After this is done, we increase the dimension of the matrices by 1, and start again. The process ends when we reach a pre-determined matrix size, upon which we calculate the quantities of interest using the formulas given in Section 2.6.

Theoretically, it is difficult to determine how this process converges. We assume that it does in fact converge, an assumption which is generally borne out in practice. We would also like to know what the rate of convergence depends on. At first glance, there are two factors which prevent us from achieving an exact value. One is that we cannot expand the matrices to infinite size. The other is that we cannot iterate the method at a fixed matrix size for an infinite number of iterations. Therefore, we will analyse the convergence of the renormalization group CTM method as it depends on the final matrix size and the number of iterations calculated at that final matrix size.

It turns out that there is a third factor that affects the convergence — the point in the phase diagram at which we evaluate the magnetization, i.e. the values of the parameters. This does not introduce an error directly, but it affects the convergence of the other two factors.

3.3.1 Number of iterations

Firstly, we look at how the number of iterations at the final size affects the calculations. This effect depends not only on the location of the evaluation point, but also on the final matrix size. To measure the convergence, we fixed the parameters, ran the algorithm up to a fixed size, then iterated the method up to 1000 times, calculating the magnetization at every iteration.

The general pattern seems to be that the calculated magnetization (which we denote by $\bar m$) gradually converges to a final value, as we would hope. The nature of this convergence is variable, however. If the evaluation point is far from the critical line, the convergence is generally monotonic. If the evaluation point is near the critical line, however, all sorts of behaviour can occur. For low final matrix sizes, $\bar m$ again seems to be monotonic, as shown in Figure 3.13. Once we start making the size larger, we may get oscillatory convergence to a final value if we are lucky, as shown in Figure 3.14; or we may get periodic oscillations around a value, without actual convergence, as shown in Figure 3.15; or $\bar m$ may move in the vicinity of a value, without converging or showing any noticeable periodic behaviour, as shown in Figure 3.16. If we are unlucky, we may even encounter divergent behaviour, as shown in Figure 3.17.

It is difficult to say for sure exactly what effect the final matrix size has on the convergence of $\bar m$ in relation to the number of iterations. While it generally seems to be true that oscillations occur at larger sizes, it is not always the case that the amplitude of those oscillations grows with matrix size. However, there does seem to be a negative relationship between the size of the oscillations and the distance of the evaluation point from the critical line.

Unfortunately, for some points and matrix sizes, the calculated magnetization does not converge at all to the value that we want. Because we always calculate in the absence of a magnetic field, the magnetization that we are calculating is actually the spontaneous magnetization — but because there is no external field, both spin values are equally likely. Therefore if m is the spontaneous magnetization, we should be equally likely to calculate a value of 1 − m (remember that our spins now take the values 0 or 1). It sometimes happens that our calculated figure switches between the two values, which throws the convergence off completely. Figure 3.18 shows an example of this happening. Furthermore, we cannot even use our new value to find the original m, because it is possible for the calculated magnetization to switch more than once, which means that we never converge. In particular, this seems to happen at every iteration when we try to calculate the magnetization in the bottom phase.

Finally, when we evaluate on or very near to the critical line, it seems that the number of iterations required to 'settle' the magnetization is very high (larger than 1000). This is shown in Figure 3.19.

Fig. 3.13: Calculated magnetization vs. number of iterations at the point (0.42, 0) with matrix size 7. The value converges monotonically.

Fig. 3.14: Magnetization vs. iterations at (0.42, 0) with matrix size 8. The value oscillates, but converges.

Fig. 3.15: Magnetization vs. iterations at (0.42, 0) with matrix size 10. The value appears to be periodic.

Fig. 3.16: Magnetization vs. iterations at (0.43, 0) with matrix size 19. The value stays near a single value, but without discernible periodic behaviour.

Fig. 3.17: Magnetization vs. iterations at (0.42, 0) with matrix size 9. The value oscillates, eventually diverging.

Fig. 3.18: Magnetization vs. iterations at (0.42, 0) with matrix size 18. The value switches to 1 − m halfway.

Fig. 3.19: Magnetization vs. iterations at (0.414, 0) with matrix size 8. The value is still increasing significantly after 1000 iterations.

Fortunately, most of the non-converging or badly behaved cases tend to occur when the parameters are near the critical line.

Ultimately, we cannot really avoid the high-iteration convergence problems. We would just like to give the algorithm enough time to settle so that it achieves its long-term behaviour (so that we do not also have to deal with the short-term error). To do this we take the following steps:

• We compute a large number of iterations at the final matrix size (500 iterations).

• We do not calculate the magnetization directly on the critical line, and avoid calculating it too near the line.

3.3.2 Matrix size

Now we look at how stopping the algorithm at a finite matrix size affects the calculated magnetization. We know that if we were to solve the equations exactly at each size, the approximation would get better as the matrix size grows. On the other hand, with this method we are not solving the equations exactly, so this does not apply. To measure the error, we fixed the evaluation point, and then ran the algorithm for 1000 iterations at each size, calculating the magnetization after the last iteration at each size.

Even though we cannot prove that the approximation gets more accurate monotonically with size, this does in fact seem to be the case, as shown in Figure 3.20. An exact measure of the nature of this convergence is difficult, owing to the error caused by the finite-iteration oscillations we observed above. However, as noted in Section 2.10.1 for κ, in the low-temperature phase the calculated magnetization seems to depend on size via a power law. We show a log-log plot of magnetization vs. size in Figure 3.21.

Fig. 3.20: Calculated magnetization vs. final matrix size at the point (0.42, 0).

As we approach the critical line, the rate of convergence slows. If we look at the high-temperature phase, then for points which are far from the critical line, $\bar m$ is almost exactly $\frac{1}{2}$ — the exact value — for all matrix sizes (we do not gradually converge to $\frac{1}{2}$). An interesting phenomenon occurs near the critical line in the high-temperature phase. For smaller matrix sizes, $\bar m$ increases monotonically, but then after a certain size, which depends on the location of the evaluation point, it jumps to almost exactly $\frac{1}{2}$. We show this in Figure 3.22.

We calculated some more data along the u-axis (where we know the exact magnetization) to get a better picture. We found that at each matrix size, the line of calculated magnetization has a similar shape to the line of exact magnetization. However, the 'critical point' at each finite size is different from the exact value. We show these lines in Figure 3.23. It appears that the 'critical points' increase monotonically with respect to matrix size, eventually converging to the exact value. Looking at this another way, we can think of each matrix size as having its own critical line, and these lines converge to the true critical line. This gives us another way of estimating the critical line. In Figure 3.24, we show a plot of these critical lines.

We can also look at how the 'critical points' converge to the actual critical line on the u-axis. It appears that they also obey a power law with respect to matrix size. Figure 3.25 shows a log-log plot of critical points against size.

Fig. 3.21: Log-log plot of magnetization vs. size at the point (0.5, 0).

Fig. 3.22: Magnetization vs. size at the point (0.41, 0). At sizes higher than 2, the calculated magnetization is almost exactly 1/2.

Fig. 3.23: Calculated magnetization along the u-axis for final sizes 1-10. The leftmost line represents size 1, and the size increases as we move to the right.

Fig. 3.24: Estimated critical lines for sizes 1-5. The lowest line represents size 1, and the size increases as we move upwards.

Fig. 3.25: Log-log plot of critical points on the u-axis vs. matrix size.

3.4 Scaling theory

For the second-neighbour model, one of the properties we would like to find is the crossover exponent $\xi_c$. We can derive a theoretical estimate of the crossover exponent by means of scaling theory. In this section, we give a quick primer on this theory. A more detailed exposition can be found in [68], from which most of this section is taken, or in [33].

In scaling theory, we assume that when we are very near to a critical point or line, alinear change of scale in the singular part of the free energy is equivalent to a power-lawchange of scale in the parameters. For example, consider the singular part of the free energyof the Ising model, denoted by ψs. If we fix the interaction strength J , then ψs is a functionof τ = T

Tc− 1 and the external field strength H. The assumption can then be expressed

formally asψs(λ

aτ τ, λaHH) = λaψψs(τ,H) (3.30)

for any λ, if we evaluate ψs near the critical point. This assumption is called the scalingassumption.

If ψs satisfies the scaling assumption, we call it a generalised homogeneous function. Notethat by changing λ to λ1/aψ , we can set aψ equal to 1. If we do so, we say that ψ has scalingpowers aτ and aH .


Generalised homogeneous functions (GHFs) have several interesting properties. In the following lemmas, we will use GHFs in 2 variables, but the lemmas also apply for functions with more or fewer variables. Firstly, any derivative of a GHF is itself a GHF.

Lemma 3.4.1. If f(x_1, x_2) is a GHF such that

f(λ^{a_1} x_1, λ^{a_2} x_2) = λ^{a_f} f(x_1, x_2),    (3.31)

then the partial derivative

f^{(i,j)}(x_1, x_2) = (∂^i/∂x_1^i)(∂^j/∂x_2^j) f(x_1, x_2)    (3.32)

is also a GHF, satisfying

f^{(i,j)}(λ^{a_1} x_1, λ^{a_2} x_2) = λ^{a_f − i a_1 − j a_2} f^{(i,j)}(x_1, x_2).    (3.33)

Proof. We prove the lemma for j = 0; the extension to non-zero j is simple. Clearly f^{(0,0)} is a GHF; so let us assume that f^{(n,0)} is a GHF. Then

λ^{a_f − (n+1)a_1} f^{(n+1,0)}(x_1, x_2)
  = λ^{−a_1} (∂/∂x_1) [λ^{a_f − n a_1} f^{(n,0)}(x_1, x_2)]
  = λ^{−a_1} (∂/∂x_1) f^{(n,0)}(λ^{a_1} x_1, λ^{a_2} x_2)
  = λ^{−a_1} [∂/∂(λ^{a_1} x_1)] f^{(n,0)}(λ^{a_1} x_1, λ^{a_2} x_2) · (∂/∂x_1)(λ^{a_1} x_1)
  = f^{(n+1,0)}(λ^{a_1} x_1, λ^{a_2} x_2)    (3.34)

which proves the lemma by induction.

The property that makes GHFs interesting is that they follow a power-law relationship as they approach the origin along one axis.

Lemma 3.4.2. If f(x_1, x_2) is a GHF such that

f(λ^{a_1} x_1, λ^{a_2} x_2) = λ^{a_f} f(x_1, x_2),    (3.35)

then as |x_1| → 0,

f(x_1, 0) ∼ |x_1|^{a_f/a_1}    (3.36)

and a similar expression holds for x_2.

Proof. We prove the lemma for x_1 only. By setting λ = |x_1|^{−1/a_1} in the GHF-defining equation and rearranging, we get

f(x_1, x_2) = |x_1|^{a_f/a_1} f(±1, |x_1|^{−a_2/a_1} x_2).    (3.37)

If we set x_2 = 0, we get a term f(±1, 0) on the right-hand side, which is constant and depends only on the sign of x_1. Therefore f(x_1, 0) is proportional to |x_1|^{a_f/a_1}.
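To see Lemma 3.4.2 in action, here is a tiny numerical check on a toy GHF of our own choosing (the function and values are illustrative only):

    # A toy generalised homogeneous function: f(x1, x2) = (x1**2 + x2)**3
    # satisfies f(l * x1, l**2 * x2) = l**6 * f(x1, x2), i.e. a1 = 1,
    # a2 = 2, af = 6, so Lemma 3.4.2 predicts f(x1, 0) ~ |x1|**(af/a1) = x1**6.
    def f(x1, x2):
        return (x1 ** 2 + x2) ** 3

    l, x1, x2 = 1.7, 0.3, -0.2
    assert abs(f(l * x1, l ** 2 * x2) - l ** 6 * f(x1, x2)) < 1e-12
    print(f(0.01, 0) / 0.01 ** 6)   # constant (here exactly 1), as predicted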

If we now assume that ψ_s is a generalised homogeneous function, then all derivatives of ψ_s are also GHFs. In particular, since the magnetization m is related to the free energy by

m = −∂ψ/∂H,    (3.38)

the singular part of the magnetization must also be a generalised homogeneous function. This helps us to calculate some of the scaling powers of the free energy. For example, the low-temperature spontaneous magnetization of the Ising model is known to be

m = (1 − (sinh 2J/kT)^{−4})^{1/8}    (3.39)

for T < T_c, and therefore as τ → 0^− it obeys the power law

m ∼ |τ|^{1/8}.    (3.40)

However, by Lemma 3.4.1, the scaling assumption that we made above implies that the singular part of the magnetization, denoted by m_s, satisfies

m_s(λ^{a_τ} τ, λ^{a_H} H) = λ^{1−a_τ} m_s(τ, H)    (3.41)

and therefore, by Lemma 3.4.2, we have

m_s(τ, 0) ∼ |τ|^{(1−a_τ)/a_τ}.    (3.42)

We can then say that

(1 − a_τ)/a_τ = 1/8    (3.43)

and therefore a_τ = 8/9.
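Both halves of this example are easy to check numerically. The sketch below (assuming units with k = J = 1, so that the critical temperature is T_c = 2/ln(1 + √2)) estimates the exponent in (3.40) from the exact magnetization (3.39), and verifies the arithmetic of (3.43):

    import math
    from fractions import Fraction

    Tc = 2.0 / math.log(1.0 + math.sqrt(2.0))   # sinh(2/Tc) = 1 when k = J = 1

    def m(T):
        # Equation (3.39) with k = J = 1, valid for T < Tc.
        return (1.0 - math.sinh(2.0 / T) ** -4) ** 0.125

    # The slope of ln m against ln|tau| should approach 1/8 as tau -> 0-.
    taus = [-1e-3, -1e-4, -1e-5, -1e-6]
    pts = [(math.log(-t), math.log(m(Tc * (1 + t)))) for t in taus]
    slopes = [(y2 - y1) / (x2 - x1) for (x1, y1), (x2, y2) in zip(pts, pts[1:])]
    print(slopes)   # each value is close to 0.125

    # Equation (3.43): (1 - a_tau)/a_tau = 1/8 is solved by a_tau = 8/9.
    a = Fraction(8, 9)
    assert (1 - a) / a == Fraction(1, 8)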

3.5 Scaling and the crossover exponent

We now show how scaling theory can be used to estimate the crossover exponent. This has been done before (see [48]). We start off by defining some notation. Here we have chosen to work with the magnetization, but other thermodynamic quantities should work just as well. First we normalise our variables so that a critical point always occurs at the origin, and the function is 0 at that point. This enables us to state our results in simpler terms. Remember that we are using the variables u = tanh(J_1/kT) and v = tanh(J_2/kT), and our spins can take the values 0 or 1.


Definition 3.5.1. We define v′ to be the normalised v,

v′ = v − √2 + 1    (3.44)

with corresponding critical line

v′_c(u) = v_c(u) − √2 + 1.    (3.45)

We also define m_s to be the normalised magnetization per site:

m_s(u, v′, H) = 1/2 − m(u, v, H).    (3.46)

If the last argument of m_s is left out, it is assumed to be 0.

Now we make the scaling assumption. We could assume that the singular part of the free energy is a GHF, but this is not necessary. Instead we make the (weaker) assumption on the magnetization.

Proposition 3.5.1. As (u, v′, H) → (0, 0, 0), m_s(u, v′, H) is a generalised homogeneous function, i.e. there exist scaling powers a_u, a_v, a_H such that

λ m_s(u, v′, H) = m_s(λ^{a_u} u, λ^{a_v} v′, λ^{a_H} H).    (3.47)

An obvious corollary of this assumption is that m_s(u, v′, 0) is a GHF with scaling powers a_u and a_v. Therefore, we shall work in zero field when the field is not needed. One immediate consequence of this is that m_s(u, 0) and m_s(0, v′) behave according to a power law in u and v′, respectively, when they are small.

Definition 3.5.2. We define ξ1 to be the critical exponent of the zero-field normalised magnetization as it approaches the origin horizontally from the right:

m_s(u, 0) ∼ u^{ξ1} as u → 0^+.    (3.48)

Similarly, we define ξ2 to be the critical exponent as the origin is approached vertically:

m_s(0, v′) ∼ (v′)^{ξ2} as v′ → 0^+.    (3.49)

These critical exponents are shown in Figure 3.26. From Lemma 3.4.2, it follows immediately that ξ1 = 1/a_u and ξ2 = 1/a_v. Now we are interested in how the critical line behaves close to the crossover point. It can be shown that, under the scaling assumption, it can indeed be described by a power law.

Proposition 3.5.2. The critical line v′_c(u) obeys a power law as u → 0^+:

v′_c(u) ∼ u^{ξc} as u → 0^+.    (3.50)


Fig. 3.26: Critical exponents near the crossover point.

Furthermore, the crossover exponent ξc can be expressed in terms of the other critical exponents:

ξc = ξ1/ξ2.    (3.51)

Proof. Let ε > 0 be fixed. Substituting λ = (ε/v′)^{1/a_v} into the scaling assumption at zero field gives

m_s(u, v′) = (v′/ε)^{1/a_v} m_s((ε/v′)^{a_u/a_v} u, ε).    (3.52)

Now for any u, the normalised magnetization m_s is 0 at the critical point (u, v′_c(u)) by construction. Since the critical line is not horizontal, we know that v′_c(u) is non-zero, and therefore

m_s((ε/v′_c(u))^{a_u/a_v} u, ε) = 0.    (3.53)

Since u can be any number, but m_s is not trivial, it must be true that (ε/v′_c(u))^{a_u/a_v} u is constant:

u = c (v′_c(u)/ε)^{a_u/a_v}.    (3.54)

Rearranging gives

v′_c(u) = ε c^{−a_v/a_u} u^{a_v/a_u}    (3.55)

which shows that v′_c(u) obeys a power law in u. Furthermore, the crossover exponent is

ξc = a_v/a_u = (1/ξ2)/(1/ξ1) = ξ1/ξ2.    (3.56)

From the above proposition and known exponents, we can conjecture the value of ξc. As stated above, this has been done before (in terms of the temperature) in [48].


Proposition 3.5.3. ξc = 4/7.    (3.57)

Proof. The isothermal susceptibility χ is defined as

χ = ∂m/∂H = −∂m_s/∂H.    (3.58)

For the sake of convenience, we temporarily revert to using spins of value −1 and 1. In the high-temperature regime T > T_c, the magnetization is 0. Partial differentiation then gives us

χ = ∂/∂H [ lim_{N→∞} (1/Z_N) Σ_{σ_1,...,σ_N} σ_i e^{−βH(σ_1,...,σ_N)} ]
  = lim_{N→∞} [ (1/Z_N) Σ_{σ_1,...,σ_N} σ_i β (Σ_{j=1}^{N} σ_j) e^{−βH(σ_1,...,σ_N)} − (1/Z_N²)(∂Z_N/∂H) Σ_{σ_1,...,σ_N} σ_i e^{−βH(σ_1,...,σ_N)} ]
  = β lim_{N→∞} ⟨σ_i Σ_{j=1}^{N} σ_j⟩ − lim_{N→∞} (m/Z_N)(∂Z_N/∂H)
  = β lim_{N→∞} Σ_{j=1}^{N} ⟨σ_i σ_j⟩    (3.59)

for any spin i. Without going into more details, differentiating with respect to J_1 (and simplifying) results in

∂χ/∂J_1 = β² (1/N) lim_{N→∞} Σ_{i,j,⟨k,l⟩} ⟨σ_i σ_j σ_k σ_l⟩    (3.60)

when J_1 = 0.

Let A and B be the two sublattices which divide the lattice so that all bonds contain a spin from A and a spin from B. Then if both i and j belong to the same sublattice, either σ_k or σ_l will be independent of the other three spins when J_1 = 0, which would make the term 0 (as it contains a multiple of m). Therefore (with a factor of 2) we can assume that i is in sublattice A and j is in sublattice B. This gives

∂χ/∂J_1 = β² (1/N) lim_{N→∞} 2 Σ_{i,k∈A} ⟨σ_i σ_k⟩ Σ_{j∈B,δ} ⟨σ_{k+δ} σ_j⟩
        = β² (1/N) lim_{N→∞} 2 (N/2) Σ_{k∈A} ⟨σ_i σ_k⟩ Σ_{j∈B,δ} ⟨σ_{k+δ} σ_j⟩
        = 4χ²    (3.61)

where δ runs over all nearest neighbours of the origin (of which there are 4). Since u = tanh(J_1/kT), converting back into a 0-1 spin system leads to the relation

∂χ/∂u |_{u=0} = −4χ²|_{u=0}.    (3.62)

As a derivative of a GHF (m_s), χ is in fact a GHF itself, from Lemma 3.4.1. Furthermore, its scaling relation (in terms of the scaling powers of m_s) is

λ^{1−a_H} χ(u, v′, H) = χ(λ^{a_u} u, λ^{a_v} v′, λ^{a_H} H).    (3.63)

When squared, this relation gives

λ^{2−2a_H} χ²(u, v′, H) = χ²(λ^{a_u} u, λ^{a_v} v′, λ^{a_H} H).    (3.64)

Now ∂χ/∂u is again the derivative of a GHF, so a similar scaling relation holds for it:

λ^{1−a_H−a_u} (∂χ/∂u)(u, v′, H) = (∂χ/∂u)(λ^{a_u} u, λ^{a_v} v′, λ^{a_H} H).    (3.65)

Since these functions are proportional (by equation (3.62)), they must have the same scaling powers. Therefore

1 − a_H − a_u = 2 − 2a_H    (3.66)

which implies that

a_u = a_H − 1.    (3.67)

We know that when u = 0, the system decouples into two lattices where the original second-neighbour bonds become nearest-neighbour bonds, as was shown in Figure 3.8. In this case, the magnetization is identical to that of a first-neighbour Ising model with nearest-neighbour interaction strength J_2. We know that the spontaneous magnetization of the Ising model has critical exponent 1/8; therefore m_s(0, v′) ∼ (v′)^{1/8} as v′ → 0^+. This implies that a_v = 8, which in turn implies ξ2 = 1/8.

Furthermore, if v′ = 0 as well, the model will behave like a simple Ising model at criticality. Therefore (from [129, p. 144]), we have m_s(0, 0, H) ∼ H^{1/15} as H → 0^+, which means that a_H = 15. This implies that a_u = 14, and therefore ξ1 = 1/14. This gives us

ξc = ξ1/ξ2 = (1/14)/(1/8) = 4/7.    (3.68)
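The chain of exponent relations in this proof reduces to exact rational arithmetic, which we can confirm mechanically (a minimal sketch; the inputs a_v = 8 and a_H = 15 are the known Ising values quoted above):

    from fractions import Fraction

    a_v = Fraction(8)     # from m_s(0, v') ~ (v')^(1/8)
    a_H = Fraction(15)    # from m_s(0, 0, H) ~ H^(1/15)
    a_u = a_H - 1         # equation (3.67)

    xi_1 = 1 / a_u        # = 1/14
    xi_2 = 1 / a_v        # = 1/8
    xi_c = xi_1 / xi_2    # Proposition 3.5.2

    print(a_u, xi_1, xi_2, xi_c)   # 14 1/14 1/8 4/7
    assert xi_c == Fraction(4, 7)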

3.6 Finding the critical lines

Proposition 3.5.3 gives a prediction of the crossover exponent using scaling relations. We would like to verify this theoretical figure using the renormalization group CTM method

Fig. 3.27: We evaluate the magnetization along vertical and horizontal lines to estimate the location of the critical line.

(which we will refer to as the CTM method from now on). There are (at least) two ways to do this.

Probably the easiest way is to make use of Proposition 3.5.2. Since we know that the exact value of ξ2 is 1/8, all we need to do is to calculate ξ1. To do this, we evaluate the magnetization at various points along the line v = √2 − 1. We can then analyse the data directly to find the exponent. We do this in Section 3.7.

However, using Proposition 3.5.2 still means that we have used a scaling assumption. Furthermore, we are interested not only in the crossover exponent, but also in the location of the critical lines. By calculating the location of the lines directly, we can also analyse them (without the need for a scaling assumption) near (0, √2 − 1) to find another estimate for the crossover exponent.

Unfortunately, finding the location of the critical lines itself does present us with some difficulties — in particular, all numerical methods, including the CTM method, become inaccurate near the lines. The exponent estimate is therefore even more inaccurate; generally, if we want a useful estimate of the crossover exponent, we just have to use the scaling assumption. However, finding the critical lines is useful in itself, so we will still do it. We look at each of the lines in turn.

3.6.1 The upper line

Firstly, we estimate the location of the upper critical line using the CTM method. To do this, we use the method to evaluate a quantity of the model (we used the magnetization) along a line in the uv plane that we know intersects the critical line. By observing the behaviour of this quantity along these lines, we can work out where the critical line intersects our evaluation lines. We use either horizontal or vertical lines, depending on which part of the phase diagram we are looking at (near the crossover point, we use vertical lines as they seem to provide more accurate data). This is shown in Figure 3.27.

Now the values that we get along these lines have some error (induced in part by stopping the algorithm at a fixed matrix size), but in terms of general shape they are similar to the actual magnetization. We show an example in Figure 3.28. In particular, the 'critical point' at finite matrix size (which can be located by observing when the magnetization stops being 1/2) is different to the actual critical point. We can solve this problem in several ways:

• Naively, we can simply estimate the critical point by the point where the calculated magnetization departs from 1/2, with a small error (so for example we take the critical point to be the smallest u where m < 0.499). This is the least complicated but also the most inaccurate way.

• We know that at the actual critical line, the magnetization is constant. If we assume that our calculated magnetization is also constant (so that the error remains the same along the critical line), we can use the fact that a critical point lies at (√2 − 1, 0) to calculate the value of m on the critical line. We can then estimate the critical point to occur whenever the calculated magnetization attains that value, as shown in Figure 3.29. Unfortunately we cannot justify the assumption that the calculated magnetization is constant on the critical line, but at least this method provides an estimate of the critical line with one point correct.

• We can make use of the fact that the critical exponent of the magnetization along the top phase boundary is 1/8 along all horizontal and vertical lines, from universality. Then calculating (1/2 − m)^8 along any of these lines will give a curve similar to Figure 3.30. We can then select a few points which lie slightly above the critical point and fit a line to them, taking the critical point as the intercept of that line. We show this in Figure 3.31. The problem with this method is that the power law only holds very near the critical point, so if we take the points too far away, the intercept will be different from the critical point — but if we take it too near, the inaccuracy inherent in the CTM method again ensures an error in the estimation of the critical point!

Theoretically, the third method seems the best, but it is difficult to implement, as we do not have a good idea of where to take the points that we fit the line to. Empirically, we found that fitting many lines along different intervals and taking the largest intercept worked well, but this has its own difficulty — it requires many calculations just to find one critical point. In the end, we used the second method to estimate the location of the critical line, but estimated the error in our method from the known critical point (0, √2 − 1), since we want to observe the behaviour of the line around that point. The results are shown in Section 3.7.
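As an illustration of the third method, the following sketch fits a line to (1/2 − m)^8 and takes its x-intercept; the magnetization values here are synthetic, generated from the assumed power law rather than taken from our calculations:

    import numpy as np

    # Synthetic magnetization data obeying m = 1/2 - A*(u - u_c)**(1/8)
    # just above a pretend critical point u_c = 0.412.
    u_c_true, amp = 0.412, 0.8
    u = np.linspace(0.413, 0.417, 5)
    m = 0.5 - amp * (u - u_c_true) ** 0.125

    # (1/2 - m)**8 = amp**8 * (u - u_c) is linear in u, so fit a line
    # and take its intercept with the x-axis as the estimate of u_c.
    y = (0.5 - m) ** 8
    slope, intercept = np.polyfit(u, y, 1)
    print(-intercept / slope)   # recovers 0.412 (up to rounding)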

3.6.2 The lower line

The CTM method works well when estimating the location of the upper critical line. However, it does not work at all in the lowest phase. Furthermore, we cannot estimate the lower critical line from the high-temperature phase, as we use the magnetization, which is constant in that phase. Therefore, we turn to the finite lattice method in the super anti-ferromagnetic phase.


Fig. 3.28: Calculated (left) and actual (right) magnetization along the u-axis for matrix size 3 × 3.

Fig. 3.29: Estimating critical points by assuming a constant error on the critical line. For this matrix size (3 × 3), we then estimate the critical points to be where m(u, v) = 0.224.

Fig. 3.30: (1/2 − m)^8 for calculated (left, size 3) and actual (right) magnetization.

Fig. 3.31: Calculating critical points by fitting a line to (1/2 − m)^8. Our estimate is the intercept of the line with the x-axis — in this case, 0.411.

For the CTM method, we calculated numerics because the method was unable to produce series. However, the FLM can produce series, which give more accurate approximations of both critical points and critical exponents. For the lowest phase, we found the leading terms of the magnetization as a series in the variable w = e^{4βJ_1}, for several fixed values of J_2. Unfortunately, we were not able to generate very long sequences, as the FLM is much more inefficient with second-neighbour interactions present. This is due to the fact that the error is determined by the smallest connected bond graph that cannot fit inside any of the finite lattices. For models with only first-neighbour interactions, the number of bonds in this graph is approximately equal to the half-perimeter of the lattices, but with second-neighbour interactions, this is roughly halved. Using finite lattices with half-perimeter up to 27, we calculated the magnetization up to order w^{13}.

To analyse these series, we used the technique of Padé approximants. This involves fitting a rational function P(z)/Q(z) to a function whose properties we want to estimate; in this case, we use dlog Padé approximants, which means that we fit the function to the derivative of the logarithm of our series. This is also equivalent to fitting the series to a first-order homogeneous linear differential equation with polynomial coefficients.

To find (regular) Padé approximants, we first set the orders of P(z) and Q(z). If P has order M and Q has order N, then the approximant is called the [M, N] Padé approximant. To calculate an [M, N] approximant, we need M + N + 1 series terms. Although it obviously depends on the function being approximated, often the diagonal or close-to-diagonal approximants (e.g. [N, N] or [N, N + 1]) produce better results.

Now we set the coefficients of P and Q to be variables, except for the constant term of Q which we take to be 1 (since we can divide P and Q by any non-zero number to make this so). By equating with the series and multiplying out, we generate a system of linear equations which can easily be solved.

As a simple example, suppose that we wanted to find the generating function of the Fibonacci series. If we take M = N = 2, we then have 5 unknown coefficients in our approximant. We set

P(z) = a_0 + a_1 z + a_2 z²    (3.69)

and

Q(z) = 1 + b_1 z + b_2 z².    (3.70)

To be able to solve for these coefficients, we need 5 terms from the series — 1, 1, 2, 3, 5, . . . . Then we have

(a_0 + a_1 z + a_2 z²)/(1 + b_1 z + b_2 z²) = 1 + z + 2z² + 3z³ + 5z⁴ + . . .    (3.71)

which gives

a_0 + a_1 z + a_2 z² = (1 + b_1 z + b_2 z²)(1 + z + 2z² + 3z³ + 5z⁴ + . . . ).    (3.72)


Equating the known coefficients of z gives the equations

a_0 = 1    (3.73)
a_1 = 1 + b_1    (3.74)
a_2 = 2 + b_1 + b_2    (3.75)
0 = 3 + 2b_1 + b_2    (3.76)
0 = 5 + 3b_1 + 2b_2    (3.77)

which, when solved, give the expected solution a_0 = 1, a_1 = a_2 = 0, b_1 = b_2 = −1. We will also use Padé approximants (and an extension of them) in later chapters. For more information on Padé approximants, see [6] and [7].
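Setting up and solving this linear system is easily mechanised. The sketch below (plain Python with exact fractions; the function names are ours) computes a general [M, N] Padé approximant this way and reproduces the Fibonacci result:

    from fractions import Fraction

    def pade(coeffs, M, N):
        # [M, N] Pade approximant of a series; needs M + N + 1 coefficients.
        # Returns (P, Q) as ascending coefficient lists with Q[0] = 1,
        # by equating coefficients in P(z) = Q(z) * series(z).
        c = [Fraction(x) for x in coeffs]
        # Equations for z^(M+1), ..., z^(M+N) involve only b_1, ..., b_N:
        #   0 = c_k + b_1 c_{k-1} + ... + b_N c_{k-N}.
        A = [[c[k - j] if k - j >= 0 else Fraction(0) for j in range(1, N + 1)]
             for k in range(M + 1, M + N + 1)]
        rhs = [-c[k] for k in range(M + 1, M + N + 1)]
        Q = [Fraction(1)] + solve(A, rhs)
        P = [sum(Q[j] * c[k - j] for j in range(min(k, N) + 1))
             for k in range(M + 1)]
        return P, Q

    def solve(A, rhs):
        # Gaussian elimination with exact fractions.
        n = len(rhs)
        rows = [row[:] + [r] for row, r in zip(A, rhs)]
        for i in range(n):
            p = next(r for r in range(i, n) if rows[r][i] != 0)
            rows[i], rows[p] = rows[p], rows[i]
            rows[i] = [x / rows[i][i] for x in rows[i]]
            for r in range(n):
                if r != i and rows[r][i] != 0:
                    rows[r] = [x - rows[r][i] * y for x, y in zip(rows[r], rows[i])]
        return [row[-1] for row in rows]

    print(pade([1, 1, 2, 3, 5], 2, 2))
    # ([1, 0, 0], [1, -1, -1]), i.e. 1/(1 - z - z^2), as in the text.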

Now we return to analysing our series with dlog Padé approximants. If a function obeys the relation f(z) ≈ A(z − z_c)^γ when z is near z_c, then we have

(d/dz) ln f(z) ≈ Aγ(z − z_c)^{γ−1} / (A(z − z_c)^γ) = γ/(z − z_c).    (3.78)

Therefore, we can locate the critical point at the smallest zero of the denominator of the Padé approximant (the reciprocal of this zero gives the growth constant of the series). Furthermore, the critical exponent can be calculated from the formula

γ ≈ P(z_c)/Q′(z_c)    (3.79)

which is invariant if P and Q are multiplied by the same function. We do this for our series, and show our results in Section 3.7.
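As a sanity check of the dlog Padé procedure, we can apply it to a series whose singularity is known in closed form. The sketch below uses the mpmath library and the test function f(z) = 1 + (1 − 4z)^{−1/2} (our choice, with z_c = 1/4 and exponent γ = −1/2); it builds the series of (d/dz) ln f from the series of f, fits a [5, 5] approximant, and reads off z_c and γ as described:

    from mpmath import mp, pade, polyroots, polyval, binomial

    mp.dps = 30

    # Taylor coefficients of f(z) = 1 + (1 - 4z)**(-1/2):
    # c_n = binomial(2n, n), with 1 added to the constant term.
    N = 12
    c = [binomial(2 * n, n) for n in range(N + 1)]
    c[0] += 1

    # Series g of (d/dz) ln f = f'/f, from f' = g*f:
    # (n+1) c_{n+1} = sum_{k<=n} g_k c_{n-k}.
    g = []
    for n in range(N):
        g.append(((n + 1) * c[n + 1]
                  - sum(g[k] * c[n - k] for k in range(n))) / c[0])

    p, q = pade(g, 5, 5)   # [5, 5] dlog Pade approximant P/Q

    # Critical point: smallest positive real zero of the denominator Q.
    roots = [complex(r) for r in polyroots(q[::-1])]
    zc = min(r.real for r in roots if abs(r.imag) < 1e-6 and r.real > 0)

    # Critical exponent from equation (3.79): gamma = P(zc)/Q'(zc).
    dq = [k * q[k] for k in range(1, len(q))]   # coefficients of Q'
    gamma = polyval(p[::-1], zc) / polyval(dq[::-1], zc)
    print(zc, gamma)   # close to 0.25 and -0.5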

3.6.3 The disorder point line

As a check on our critical line estimates, we also calculate the disorder point line of the model. Informally, the disorder point line marks the transition between the points where the first-neighbour interaction dominates the second-neighbour interaction, and the points where the reverse applies. We used the formula given for the line in [49, Equations 4.10a-c]. However, there is a slight error in this paper — the right-hand side of Equation 4.10a should be −(x² + y² + 2xyw + 2xyw² + x²y²w). This can be easily checked from Equation 4.9 in the paper, remembering that (with a ±1 spin system) the square of each spin is 1.

This line enables us to check the accuracy of our critical line estimates, because it must always lie in the high-temperature phase. This becomes relevant as u tends to ±1, since both the upper and lower phase boundaries converge to −1 at these points.

3.7 Results

Firstly, we attempt to corroborate our scaling theory estimate of ξc = 4/7 by calculating ξ1. As outlined above, we evaluated the magnetization on the line segment 0 < u < 0.001, v = √2 − 1.

Fig. 3.32: Magnetization vs. inverse size at the point (0.0005, √2 − 1).

To counter finite-size effects, we calculated the magnetization at each size (running the algorithm for a large number of iterations at each size). An example of the results we obtained is in Figure 3.32.

As we found in Section 3.3, the calculated magnetization should have a power-law relationship with the final matrix size, but without a known intercept or exponent, it is difficult to fit one to the data. In the end, we ended up estimating the final value graphically. We repeated this for various u in the line segment, and produced a log-log plot of u against m, which is shown in Figure 3.33.

We fitted a straight line through the points in the plot, as shown in Figure 3.34. The maximum and minimum slopes give the number ξ1 = 0.0726 ± 0.0042. We know that ξ2 = 1/8, so by applying Proposition 3.5.2, this implies that

ξc = 0.5804 ± 0.0332.    (3.80)

This interval includes 4/7 = 0.5714..., which supports Proposition 3.5.3 and gives credence to our scaling assumption.

Now we estimate the location of the critical line, using the methods described in Section 3.6. The results are shown in Figure 3.35. Interestingly, fitting a power law to this curve near the crossover point seems to suggest a crossover exponent close to 2/3, rather than 4/7, but this may be due to the errors in the calculation. We also show the disorder point line, which lies between our critical lines. This plot compares well with the plots in [23].

Lastly, we estimate the critical exponent of the magnetization, also denoted by β, along the lower phase boundary.

Fig. 3.33: Log-log plot of magnetization vs. u on the line v = √2 − 1.

Fig. 3.34: Figure 3.33 with fitted lines.

Fig. 3.35: Estimated critical lines in the uv plane. The disorder point line is also shown.

We show the estimate for each individual approximant in Figure 3.36. It is apparent that the critical exponent changes continuously along the boundary. However, as u becomes larger, the estimates vary wildly, and are very unreliable; this may be due in part to the presence of 'double roots', where the smallest zero of the denominator is very near to another zero, which throws off the estimate. On the other hand, the variation in the estimates is surprisingly small for large u (above 0.9), though we do not know why this is so. The estimates suggest that as u tends to 1, the exponent tends to 0.

In other papers, the critical exponent is often plotted against −J_2/J_1. For comparison, we do the same in Figure 3.37. A plot of the same quantity is given by Alves and de Felício in [2, Figure 8], who produced their plot via Monte Carlo simulations. The two plots look very similar, and it is encouraging to see that two completely different methods produce such similar results.

3.8 Conclusion

In this chapter, we have used the renormalization group CTM method and the finite lattice method to estimate the crossover exponent and the location of the critical lines for the second-neighbour Ising model. After some discussion where we introduced the finite lattice method and analysed the convergence of the CTM method for this particular model, we showed how scaling theory can be used to estimate the crossover exponent at 4/7. We then applied our numerical methods, which produced an estimate which more or less agreed with this number. Using these methods, we also estimated the location of the critical lines, and the critical exponent along the lower phase boundary. While not entirely new results, our calculations provide another perspective on the existing estimations of these quantities, using a new method. They also illustrate one of the uses of the CTM method.

Fig. 3.36: Estimated critical exponents along the lower phase boundary.

Fig. 3.37: Estimated critical exponents vs. −J_2/J_1.

We could extend this research by trying to increase the accuracy of our estimate. In particular, our ways of getting around the natural error of the CTM method are rather ad hoc, and it would be good if we could measure the location of the critical line more accurately. One way to do this would be to increase the final matrix size of the CTM method, so as to produce a more accurate estimate. However, this does not alleviate the finite-iterations error, which becomes increasingly significant at higher sizes. Alternatively, we could try to produce series calculations from the CTM method — we can either use the iterative method, or modify the renormalization group method to produce series.

Another possible area we could look at is to try and force the CTM method to work for the super anti-ferromagnetic phase. We suspect that it may be failing due to a breakdown in the symmetry of the model, but have been unable to modify the CTM method to overcome this problem as yet. However, considering that the CTM method works well for the other phases, it seems reasonable to suppose that there exists some modification which will enable it to work in this phase.


4. DIRECTED LATTICE WALKS IN A STRIP

4.1 Introduction

One of the more common ideas which occurs in statistical physics is the concept of a walk. Simply put, a walk is a single path in a given space. It can also be thought of as the locus of an object moving at constant velocity, called a walker. Despite being a relatively simple concept (or perhaps because it is so simple and unspecific), many varied and interesting problems involve a walk or walks.

The most obvious question to ask about any walk model is: how many possible walks of a certain length are there? If we can find a closed form expression for this number for any model, we call that model solved. The most general walk is unrestricted in the direction that it can take — it simply starts at a point and meanders around. Since, at any time, there is an infinite choice of directions for the walker, there are an infinite number of possible walks of any length. If we have more than one walker, the situation is even worse; we also have to set the starting points of the walkers, which again presents us with an infinite number of choices. We show this situation in Figure 4.1.

We would like to have a walk model where the number of possible walks is finite. To do this, we restrict the possible directions that the walk can take. For our purposes, we use the two-dimensional square lattice Z², shown in Figure 4.2, although this is by no means the only restriction that we can make. Many interesting walk models can be made on different 2-dimensional lattices, or in other dimensions. We will also assume, for the time being, that the walker starts at the origin.

In the simplest model, after every unit of time (or step), the walker reaches a vertex of the square lattice. It then has four possible directions that it can choose from, all of which are equally likely. This model is called the random walk model. It is rather trivial to solve — since the walker has four possible choices after every step, the number of walks of length n is simply 4^n.

The random walk model is very simple, and the solution is rather obvious. However, by just tweaking it slightly, we can create an interesting model which does not have an obvious solution. For example, one of the most famous walk models is the self-avoiding walk, first described by Orr in 1947 ([115]). This has the same restrictions as the random walk model, but with the added restriction that the walker must never visit a vertex that it has visited before (hence the term self-avoiding). We show an example of a self-avoiding walk in Figure 4.3.

Even though this model is only one condition away from the random walk, the self-avoiding walk problem (finding the number of self-avoiding walks of a given length) has not been solved exactly, even through extensive studies. The current most efficient algorithm for enumerating the number of self-avoiding walks of a certain length is a variation of the finite lattice method discussed in Chapter 3, first implemented by Conway, Enting and Guttmann in 1993 ([37]). In [38] and [64] they extended the enumeration to self-avoiding walks of length up to 51, a number which has subsequently been bettered by Jensen in 2004 ([74]), who counted walks up to length 71 (and, in [75], enumerated self-avoiding walks of length up to 40 on the triangular lattice). These studies indicate that the growth constant for self-avoiding walks on the square lattice is 2.63815853034. . . . More details and discussion on the self-avoiding walk can be found in [96] and [71].

Fig. 4.1: General walks. (a) A general walk in 2-space. (b) Multiple walks in 2-space.

Fig. 4.2: The square lattice.

Fig. 4.3: A self-avoiding walk on the square lattice.

Fig. 4.4: The stretched and rotated square lattice.

As interesting as the self-avoiding walk model is, it is still limited by the fact that it describes only one walker. An interesting expansion on the model would be to have two or more walkers, but still with some avoidance constraint. This leads to what we call vicious walkers. Under this idea, if two walkers meet at one vertex, they will annihilate one another. We then ask: what is the probability that all walkers are 'alive' at a given time?

Since it is easy to calculate the number of configurations when the vicious constraint is removed, this question can be answered by finding the number of walk configurations where no annihilations take place. Finding the number of such configurations is the question that the vicious walk model poses.

However, this still leaves a few issues: where will the walkers start, and how do we distinguish between two walkers meeting at one vertex, and two walkers visiting the same vertex, but at different times? We solve these by making the walkers directed. In previous models, the walks could move in any direction on the lattice, provided they did not intersect themselves; in contrast, directed walkers can only move in two of the four possible directions.

In particular, instead of using the lattice Z², we expand this lattice by a factor of √2, and then rotate it by an angle of π/4 clockwise, so we still have integer coordinates. This transformation is shown in Figure 4.4. Then we restrict the walkers so that they may only move in the positive x-direction. Finally, we specify that the walkers must start at the same x-coordinate — in this case we use the points (0, 0), (0, 2), . . . . An example of such walks is shown in Figure 4.5.

Since vicious walks are always moving in the positive x-direction, they will never intersect themselves. Furthermore, at any specified time (or length), they will always have the same x-coordinate, so if two walkers visit the same vertex, it will be at the same time. The vicious walk model counts the configurations of fixed length where this does not happen.

Fig. 4.5: An example configuration of 4 vicious walkers.

As an aside, a walk model related to vicious walks is that of watermelons. This model restricts the endpoints of the walks so that they must end with the same separation as their starting separations. In other words, they must finish at their starting heights, up to a vertical translation. For example, Figure 4.6 shows a watermelon configuration. This model is so named because if the walkers are translated vertically so that they all start at the same point, they will also finish at the same point, giving a watermelon-like shape. To differentiate the standard vicious walk model from the watermelon, a free endpoint model is referred to as a star model, again due to the general shape observed when the walkers are translated so they start together.

An equivalent task to finding the number of walk configurations for all lengths is to find the generating function (abbreviated to g.f.). The generating function of a sequence {a_n}, n = 0, 1, . . . is defined to be the formal power series

Σ_{n≥0} a_n z^n.    (4.1)

It contains all the information that the series contains, so a closed form for the generating function is very valuable. More information on generating functions can be found in [141].

Generally, it is very hard to find an exact expression for the number of walks in a model (we often use the term 'walks' for walk configurations). However, we do not always need to know the exact numbers to work out how they behave. Useful information can be gained by calculating the asymptotics — the behaviour of the number of walks as the length n → ∞. For example, in many walk models, the number of walks of length n grows exponentially, which usually means that

Number of walks = c α^n (n^γ + a_2 n^{γ−1} + a_3 n^{γ−2} + . . . ) as n → ∞.    (4.2)

Using the notation a_n ∼ b_n to denote the fact that lim_{n→∞} a_n/b_n = 1, this can be written as

Number of walks ∼ c α^n n^γ.    (4.3)

Now we return to the directed vicious walk problem. This model was first introduced by Fisher in 1984 ([54]). In this paper, he cast the walkers as short-sighted drunks who shoot each other on sight. He then asked: what is the probability that all walkers survive for n steps? Fisher found that if there are p walkers in total, the probability of this occurrence decreases asymptotically like n^{−p(p−1)/4} as n → ∞. It is a simple matter to convert this to the total number of vicious configurations by observing that the number of unconstrained walks is 2^{np}, since there are np steps taken in total. This means that the number of vicious walks is asymptotically 2^{np} n^{−p²/4+p/4}.

The vicious walk problem was also considered by Forrester ([55] and [56]), but with periodic boundary conditions (so that, in effect, the walks were on a cylinder). The model was also considered for arbitrary dimension by Essam and Guttmann in 1995 ([52]), who expressed the generating function for the number of walks in terms of generalised hypergeometric functions.

While it is good to get an asymptotic expression for the number of vicious walks, it would be preferable to derive an exact expression for this number. Such an expression was conjectured in 1991 ([4]) and proved by Guttmann, Owczarek and Viennot in 1998 ([65]), who proved the formula by relating the walks to Young tableaux. They found that if there are p walkers, then the total number of possible walks of length n is given by

∏_{1≤i≤j≤n} (p + i + j − 1)/(i + j − 1).    (4.4)
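Formula (4.4) is easy to evaluate with exact rational arithmetic, which also provides a quick consistency check: p = 1 must give 2^n, and p = 2, n = 1 gives 3, matching the direct count (4 unconstrained configurations, one of which has a meeting). A minimal sketch:

    from fractions import Fraction
    from math import prod

    def vicious_star_count(p, n):
        # Total number of configurations of p vicious walkers of length n,
        # using the product formula (4.4).
        value = prod(Fraction(p + i + j - 1, i + j - 1)
                     for i in range(1, n + 1)
                     for j in range(i, n + 1))
        assert value.denominator == 1   # the product is always an integer
        return value.numerator

    for p in range(1, 4):
        print(p, [vicious_star_count(p, n) for n in range(1, 7)])
    # p = 1 gives 2, 4, 8, ...; p = 2 gives 3, 10, 35, ...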

In later papers, various modifications, tweaks and generalisations were introduced into the directed vicious walker model. In 2002 Guttmann and Vöge ([66] and [135]) introduced the concept of friendliness. This is a relaxation of the vicious idea: instead of the walkers being prohibited from touching each other at all, they can touch, but only for n vertices at a time (such walkers are called n-friendly). Furthermore, the walkers still cannot cross each other, and 3 walkers may not occupy the same vertex. The extreme, where two walkers may touch for any length of time, is called ∞-friendly walkers, or GV ∞-friendly walkers. Figure 4.6 shows a configuration of friendly walkers. Note that vicious walkers can also be called 0-friendly.

In [66], Guttmann and Vöge analysed the n-friendly walker model and found the generating function for 2 walkers with 0-, 1-, 2- and ∞-friendliness to be

(2 + 2z²(β_n + 2/(1 − 4z² + √(1 − 4z²)))) / (1 − 2z² − 4z²β_n + √(1 − 4z²))    (4.5)

where β_n = (1 − 2^n z^{2n})/(1 − 2z²) and n is the friendliness. In fact they actually derived a generalisation of this formula for the anisotropic case, where different variables are used for up-steps and down-steps.

Fig. 4.6: An example configuration of four 3-friendly walkers. The thicker lines contain more than one walker.

Also in this paper, Guttmann and Vöge conjectured the generating function for three 1-friendly walkers (also called osculating walkers) to be

(3 − 15z − 4z² − 3(1 − z)√(1 − 8z)) / (8z²(1 + z)).    (4.6)

This was recently proved (but not published) by Gessel ([60]); a proof was published by Bousquet-Mélou in 2005 ([30]).

Another similar variation to the vicious walk model was proposed by Tsuchiya and Katori in 1998 ([130]). In this paper they suggested another version of ∞-friendliness, which is similar to the GV model, except that any number of walkers may visit a site at any one time. This became known as the TK ∞-friendly walker model, to distinguish it from the GV model. Using a slightly different model (all the walkers started at the same point), Tsuchiya and Katori related the generating function of the walks to the partition function of a statistical mechanics model known as the chiral Potts model. In fact, vicious and friendly walks can be related to many other models, for example directed percolation ([35]) and random GOE matrices ([5]).

The concept of friendliness was incorporated into the vicious walk model when another restriction was added in a subsequent paper by Guttmann, Krattenthaler and Viennot ([89]). This restriction, called a 'wall' for obvious reasons, prohibits any walker from going below the line y = 0. In this paper, it was found that the number of n-friendly walks with m steps and p walkers without a wall grows asymptotically like 2^{mp} m^{−p²/4+p/4} (we have changed the variable of the number of steps to m to avoid clashing with the friendliness, n). Interestingly, this number does not actually depend on the friendliness of the walkers.

Fig. 4.7: Example configuration of four 3-friendly walkers in a strip of width 6.

They also found in this paper that the number of n-friendly walks with m steps, p walkers and a wall grows like 2^{mp} m^{−p²/2}. Although, as expected, this is different from the number of walks without a wall, it still does not depend on the friendliness.

In the next paper in the series ([90]), Krattenthaler, Guttmann and Viennot added another restriction to the model. Instead of allowing the walkers to go as high as they wanted, they are now constrained from both the bottom and the top, with the lines y = 0 and y = L. This forms a horizontal strip of width L, as shown in Figure 4.7. Of course L must be greater than or equal to 2p − 2, where p is the number of walkers, so that all the walks start within the strip. This problem was also considered (in a different form) by Grabiner ([62]), and also by Brak, Essam and Owczarek ([31] and [32]) for vicious walks.

For this model, it was found that the number of walks for p m-step vicious walkers grows asymptotically like

(4^{p²}/(L+2)^p) (2^p ∏_{s=1}^{p} cos(sπ/(L+2)))^m.

For the finite-width strip, the asymptotic growth does depend on the friendliness, as they also found that the number of TK ∞-friendly walks grows asymptotically like

(4^{p²}/(L+2p)^p) (2^p ∏_{s=1}^{p} cos(sπ/(L+2p)))^m.

Although the TK ∞-friendly model is not actually an extension of the n-friendly model, it is equivalent to the GV ∞-friendly model for 2 walkers. This means that in the case of two walkers, the growth constants of vicious and ∞-friendly walkers are clearly different, and therefore there must be some change as the friendliness increases. It would be reasonable to expect that this also happens for greater numbers of walkers as well.

A little reflection shows that the difference between the half-plane (one wall) and finite strip models is not really surprising. The half-plane is essentially a two-dimensional model, while the finite strip model extends to infinity in only one direction.

The real question we would now like to look at is: how does the number of walks behave for a strip of finite width as we adjust the friendliness? In particular, we know that the growth constant changes as we move from vicious to ∞-friendly walkers, but does the growth constant change with every n, or does it stay the same for some n and change only at certain n?

This chapter, which is a more detailed version of a previous paper ([36]), attempts to answer the above question. In Section 4.2, we show how we can guess the generating function of various models using Padé approximants. In Sections 4.3 and 4.4, we present two different methods that we use to generate the first few terms of the series of the number of walks — Section 4.3 is based on transfer matrices, and also gives the general transfer matrix for 2 walkers, while Section 4.4 is based on recurrences.

Then we move to proofs of generating functions. In Section 4.5, we prove the generating function for one walker in a strip of any width. This generating function is well-known, but it is instructive to compare both the result and the proof technique with later results. In Section 4.6, we prove various formulas for the generating functions when we keep two of the parameters (number of walkers, width, and friendliness) fixed and vary the other parameter. We managed to prove some results for varying friendliness and number of walkers. In Section 4.7, we discuss the behaviour of the growth constants we have obtained, and then propose a possible extension to the model in Section 4.8. Finally, in Section 4.9, we look back on what we have done and propose possible ways to take our research further.

4.2 Finding the generating function

The first step in analysing the walks is to try to find the generating function. To do this, we first gain some idea of what form the generating function can take. This is done in the following lemma.

Lemma 4.2.1. The generating function for p n-friendly walkers in a strip of width L is a rational function.

Proof. Another way that we can look at the n-friendly walk model is by considering paths on a graph. For example, suppose that we have just one walker in a strip of finite width. We consider all the possible y-coordinates of the walker to be states of a graph, and say that at any time, the system is in a certain state if the walker is at that height. We can then connect two states if and only if the walker can go from one to the other in exactly one step. We call this graph the transfer graph of the model, and show an example in Figure 4.8(a). Using this graph, we can now recast the problem of one walker as a path on the graph starting from state 0. We then have to find the total number of possible paths on this graph.

For greater numbers of vicious walkers, this conversion becomes slightly more complicated. The states still contain all possible heights for the walkers at any one time, but since there are now (say) p walkers, this then becomes a p-tuple of heights, so for example if a two-walker system is in state (1, 3), then the walkers are at heights 1 and 3. States are still connected if and only if one state can become the other after one unit of time. We show an example of this in Figure 4.8(b).

Fig. 4.8: Some simple transfer graphs. (a) Transfer graph for one walker in a strip of width 4. (b) Transfer graph for two vicious walkers in a strip of width 5.

This approach also works for 1-friendly walkers; however, when we get to 2-friendly walkers or above, the validity of each state depends not only on the previous state, but also on the states before that. This makes things trickier, but can be overcome by changing the states to ordered sets of n p-tuples (where n is the friendliness). The states represent the heights of the walkers at the current step and previous n − 1 steps, which is all we need to know to verify the validity of the next state. Using this, a representation of the problem as paths on a graph can still be achieved.

The important thing to note about the expression of the problem in this fashion is that (unlike a half-plane model) the transfer graphs all have a finite number of states, since there are only a finite number of ways you can place the walkers in a finite strip at any one time. Therefore, if we construct the adjacency matrix or transfer matrix of the graph (a matrix with one row/column for each state, containing 1s where the row and column states are connected and 0 otherwise), this matrix will also be finite. We can then use the following theorem, which we quote from [124, Theorem 4.7.2].

Theorem 4.2.2. If A is the transfer matrix of a system, then the generating function of paths from state i to state j is given by

(−1)^{i+j} det(I − xA; j, i) / det(I − xA)    (4.7)

where det(I − xA; j, i) is the minor of I − xA obtained by deleting the jth row and ith column and taking the determinant of the resulting matrix.

Now we wish to find the total number of paths starting from a particular state. This is a sum of terms of the form given by Theorem 4.2.2. Since A is finite-dimensional for any finite-width strip, both numerator and denominator are finite polynomials. Therefore the generating function is a sum of rational functions, and therefore is rational itself.

As an aside, the transfer matrix in the above proof has one row/column for each set of n p-tuples of heights. Since there are L + 1 possible heights in a strip of width L, the transfer matrix has dimension (L+1)^{np} × (L+1)^{np}. From Theorem 4.2.2, this means that the numerator of the generating function has order at most (L+1)^{np} − 1 and the denominator has order at most (L+1)^{np}. In practice, however, the orders of the numerator and denominator are generally much lower.
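To make Theorem 4.2.2 concrete, here is a small sketch (using sympy) that builds the transfer matrix of Figure 4.8(a), one walker in a strip of width 4, and sums the path generating functions (4.7) from state 0 to every state; the series coefficients count the walks directly:

    import sympy as sp

    x = sp.symbols('x')
    L = 4  # strip width; the walker's height ranges over 0..L

    # Adjacency (transfer) matrix of Figure 4.8(a): height i connects to i +/- 1.
    A = sp.Matrix(L + 1, L + 1, lambda i, j: 1 if abs(i - j) == 1 else 0)
    M = sp.eye(L + 1) - x * A

    def paths_gf(i, j):
        # Equation (4.7): (-1)**(i+j) det(I - xA; j, i) / det(I - xA),
        # where the minor deletes row j and column i (0-based here).
        return (-1) ** (i + j) * M.minor_submatrix(j, i).det() / M.det()

    # Total number of walks starting at height 0: sum over all end heights.
    gf = sp.cancel(sum(paths_gf(0, j) for j in range(L + 1)))
    print(gf)                       # a rational function of x
    print(sp.series(gf, x, 0, 6))   # 1 + x + 2*x**2 + 3*x**3 + 6*x**4 + ...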

Now that we know that the generating function must be rational, we can use Padé approximants (which we discussed in Section 3.6.2) to guess the generating function. To do this, we generate the first few terms of the series of the total number of walk configurations. Then we approximate the generating function by a rational function.

Although Padé approximants can be used to approximate any generating function, we can go one step further. Because our generating function is rational, we can actually produce the exact generating function if the degree of the approximant is high enough. On the other hand, it is not always obvious when we have the exact function, as opposed to an approximation. However, by looking at the proof of Lemma 4.2.1, we see that both the numerator and denominator of the generating function can be expressed with integer coefficients. So when our approximants give us integer coefficients, it is reasonable to suppose that they are exact.

It is worth noting that because the approximants are exact, when we take an approximant that has higher-order polynomials than the actual generating function, our system of linear equations is redundant. We solve this by setting the redundant higher-order coefficients to 0. This can also tell us when the approximant is exact.

Once we have the generating function, it is a simple matter to calculate the growth constant. The reciprocal of the smallest positive real zero of the denominator gives us this value.

4.3 A transfer matrix algorithm

If we are given enough terms in the series for the number of walks, we can use Padé approximants to find the generating function and growth constant. However we still need to generate the terms first. One way to go about this is to use the adjacency matrix of the transfer graph, as discussed in Lemma 4.2.1. For small parameter values, we can calculate this matrix directly. For this we use the following lemma.

Lemma 4.3.1. For two n-friendly walkers in a strip of width L, the transfer matrix has the block structure

                 L−1     L−3     L−5     ⋯   ║   L+1     L+1     L+1     ⋯
    L−1           T     0 I 0     0      ⋯   ║  0 I 0     0       0      ⋯
    L−3         0 I 0     T     0 I 0    ⋯   ║    0       0       0      ⋯
    L−5           0     0 I 0     T      ⋯   ║    0       0       0      ⋯
     ⋮            ⋮       ⋱       ⋱      ⋱   ║    ⋮       ⋮       ⋮
    ═════════════════════════════════════════╬═══════════════════════════════
    L+1         0 I 0     0       0      ⋯   ║    0       T       0      ⋯
    L+1         0 I 0     0       0      ⋯   ║    0       0       T      ⋯
    L+1         0 I 0     0       0      ⋯   ║    0       0       0      ⋱
  (n times)       ⋮       ⋮       ⋮          ║    ⋮       ⋮       ⋮      ⋱

where T is the tri-diagonal matrix with 0s on the diagonal and 1s on the off-diagonals, I is the identity matrix, and 0 is either a single row of 0s, a single column of 0s, or a matrix of zeros. The first row and column give the widths of each block; the dimensions of the matrices should be clear.

Proof. An important fact used implicitly in this proof is that the walkers must always be an even number of units apart from each other. Also, if two walkers remain a constant distance from each other (i.e. they take the same steps, but at a fixed distance from each other), then they essentially act like one walker in a strip of lesser width. In this case, the transfer matrix must be the same as that for one walker, which is T.

Firstly, let us consider the vicious case. The transfer matrix for this case is the top left section of the above matrix. We set up the states in the manner suggested in the proof of Lemma 4.2.1. The first block of L − 1 states represents the states where the walkers are two units apart - (0, 2), (1, 3), and so on, in that order. The second block of L − 3 states represents the states where the walkers are four units apart - (0, 4), (1, 5), and so on, again in that order. This continues in a similar manner, with the last block representing states where the walkers are as far apart as possible.

Now each step can only change the distance separating the walkers by −2, 0, or 2, as shown in Figure 4.9. Therefore the sections of the transfer matrix which connect blocks that are not adjacent to each other (in the matrix) must all be 0 matrices. For the sections connecting blocks to themselves, the walkers keep the same distance apart, and therefore the sections are T blocks.

Fig. 4.9: The distance between walkers can only change by −2, 0, or 2 for any step.

Finally, for the sections connecting blocks which are adjacent (in the matrix), there is only one way to step so that the distance separating walkers increases or decreases. Since the states in each block are ordered by height, this gives the characteristic '0 I 0' sections of the transfer matrix. Putting these observations together, it can be seen that the transfer matrix for 2 vicious walkers is the top left section of the above matrix.

Now consider the case of arbitrary n (friendliness). Here we set up the states in a slightly different manner to that suggested in Lemma 4.2.1. Instead of keeping track of all the states we have visited in the last n steps, we really only have to know if the two walkers can or cannot be together. So all we need to do is to keep track of how long they have been together, if they are; otherwise we can use the same states as vicious walkers. This is what we do, so the blocks of states before the double line in the transfer matrix are the states where the walkers are apart, and the blocks of states after the line are states where the walkers are together.

However, to keep track of how long the walkers have been together, we attach a subscript to each of the latter states. This subscript gives the number of vertices that the walkers have been together, up to and including the current vertex. So, for example, a state of (2, 2)_3 means that both walkers are at height 2, and they have been together for exactly 3 vertices (2 steps) up to and including this one. Then we make the first block of L + 1 states represent the states (0, 0)_1, (1, 1)_1, and so on, while the second set represents (0, 0)_2, (1, 1)_2, and so on, and this continues until we reach (L, L)_n.

Now if two walkers which are together step apart, they will end up 2 units apart. Again the ordering by height gives the '0 I 0' sections of the lower first column. Conversely, if the walkers are two units apart and step together, they have only been together for the current vertex, and must have a subscript of 1. Since it is impossible to move from a distance of 4 or more apart to the walkers being together in one step or vice versa, we have the top right and bottom left sections of the above transfer matrix.

Finally, if two walkers that are together stay together after one step, they have been together for one vertex more, and the subscript of the state must increase by 1. Apart from that, they act as a single walker in a strip of width L, which has transfer matrix T. This completes the transfer matrix as stated above.

144

By taking powers of this transfer matrix and summing the entries in the first row, wewere able to enumerate the walker series many times faster than by a direct enumerationalgorithm. We can then guess the generating function using Pade approximants.

4.4 A method of recurrences

The transfer matrix method described in the above section works well, and provides extrainformation in the form of the entire transfer matrix (which can often be mined for moredata). However, it does have a few shortcomings: the transfer matrix gets large very quicklywith respect to width and friendliness, and it only works for 2 walkers.

For situations where the transfer matrix method is unwieldy or inappropriate, we useda method of recurrences to generate the terms of the series. This method was suggested tous by a technique that Gessel used in [60], in his proof of the osculating 3-walker generatingfunction. It is extremely efficient, taking only a linear amount of time in the number of stepsto generate terms.

This algorithm starts out by generalising the problem. Suppose we wish to find thenumber of walks for two n-friendly walkers in a strip of width L. We define h(i, j, n′, m) tobe the total number of walks of m steps for two n-friendly walkers in a strip of width L,given that they start at heights i and i + 2j respectively, and have been together (if j = 0)for n′ vertices before the start, not including the current vertex. We wish to find h(0, 1, 0, m)for arbitrary m. We will do this by calculating h for all values of its parameters.

For cases where the walkers are at illegal starting positions, h must be 0. This occurswhen i < 0 (below the lower boundary), j < 0 (walkers cross), i + 2j > L (above the upperboundary) and n′ ≥ n (‘too friendly’). Also, in situations where the walkers are in validpositions but have no more steps to take (m = 0), h must be 1.

We can now divide the remaining parameter values into two cases. In the first case, thewalkers start together, i.e. j = 0. In this case, we look at the possible first steps of thewalks, as shown in Figure 4.10(a). The possibilities are:

• The walkers both step upwards. This increases i and n′ by 1 while still leaving j at0. Also m is reduced by 1 since we have taken a step and therefore have one less stepremaining.

• The walkers both step downwards. This increases n′ by 1 and decreases i and m by 1.j is still 0.

• The walkers step apart. This decreases i and m by 1, increases j by 1, and forces n′

to be 0.

We do not have to worry about going to an invalid state because the value of h will auto-matically be set to 0 for that state. This means that we have the equation

h(i, 0, n′, m) = h(i+1, 0, n′ +1, m−1)+h(i−1, 0, n′ +1, m−1)+h(i−1, 1, 0, m−1). (4.8)

145

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

ln ulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

nL

growth constant

(a) Possible steps if the walkers are together.

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

ln ulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

nL

growth constant

(b) Possible steps if the walkers are apart.

Fig. 4.10: Possible first steps for two walkers.

The other case occurs when the walkers are apart (which means n′ = 0). Again we lookat the possible first steps (shown in Figure 4.10(b)). The possibilities are:

• The walkers both step upwards. This increases i by 1, decreases m by 1, and keeps jthe same.

• The walkers both step downwards. This is similar to the previous possibility butdecreases i by 1.

• The walkers step apart. This decreases i by 1 and increases j by 1. m goes down by 1.

• The walkers step together. This increases i by 1 and decreases j and m by 1. If thewalkers end at the same height, we will still have n′ = 0 because they have not beentogether prior to this vertex.

Putting it all together, this gives the recurrence equation for j > 0:

h(i, j, 0, m) = h(i+1, j, 0, m−1)+h(i−1, j, 0, m−1)+h(i−1, j+1, 0, m−1)+h(i+1, j−1, 0, m−1).(4.9)

Now to find h(0, 1, 0, m), we simply find h for all values of i, j, n′ and m′ for m′ < m andthen apply the above equation. We will be able to find any value of h by adding values fromthe set of values of h with m decreased by 1. After finding the first few terms, we can thenuse Pade approximants to guess the generating function.

For three walkers, we can use the same principle, except that we need more variables— the height of the first (lowest) walker, the height difference between the first and secondwalkers and second and third walkers, the number of vertices that the first two walkers havebeen together, the number of vertices that the second two walkers have been together, and

146

the number of steps remaining. Since we do not have a general formula for the three walkertransfer matrix, this is the method of choice for calculating generating functions for threewalkers.

Theoretically, this method will work for any number of walkers, but obviously it getsmuch more complicated as the number of walkers increases. We have used it for up to 4walkers. The generating functions for some of these cases (which we have not proved) are inAppendix A. We also give their critical points in Appendix B.

4.5 One walker

If there is only one walker, the model is considerably simplified, because the issue of friend-liness does not arise. The generating function for this case is well known (for example, in[88]). We will prove it again here using a generating function argument that is similar toproof techniques that we will use later on in the chapter. Also, it is instructive to comparethis result with similar results for more walkers.

From now on, we will assume that the range of a summation is from −∞ to ∞ unlessspecified otherwise.

Theorem 4.5.1. In a strip of width L, the (isotropic) generating function for one walk thatends on the x-axis is

gL(x) =hL(x)

hL+1(x)(4.10)

and the generating function for one walk with an arbitrary end-point is

fL(x) =1

hL+1(x)

L∑

i=0

xihL−i(x) (4.11)

where

hL(x) =∑

i

(−1)i(

L− i

i

)

x2i (4.12)

is a polynomial of degree 2bL2c. If we define λ± = 1±

√1−4x2

2, then this becomes

gL(x) =λL+1

+ − λL+1−

λL+2+ − λL+2

−(4.13)

and

fL(x) =1

λL+2+ − λL+2

(

λL+1+ − xL+1

1 − x/λ+− λL+1

− − xL+1

1 − x/λ−

)

. (4.14)

Proof. We first consider the case where the walk must end on the x-axis. We know that thewalker returns to the x-axis at least once, so we divide the walk at all points where it returnsto the x-axis, as shown in Figure 4.11. Now we look at the generating function of each ofthese divided segments.

147

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

ln ulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

nL

growth constant

(a) A walk in a strip of width 3.

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

ln ulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

nL

growth constant

(b) The divided walk.

Fig. 4.11: Dividing a single walk in a strip at the points where the walker returns to height 0.

Each of these segments consists of a walk in the strip, starting and ending at height 0,and not touching 0 in between. Therefore it can be described as an up-step, followed by awalk in a strip of width L − 1 starting and ending at the equivalent of height 0 (i.e. theheight of the lower boundary), followed by a down-step. Therefore the generating functionfor each segment is x2gL−1(x). Since our original walk consists of an arbitrary number ofthese segments, we must have

gL(x) =1

1 − x2gL−1(x). (4.15)

We can now use induction on L. Obviously g0(x) = 1, since a walker cannot walk in astrip of no width. Now assume that gL−1(x) has the form given in Equation 4.10. Then

gL(x) =(

1 − x2gL−1(x))−1

=

(

i(−1)i(

L−ii

)

x2i − x2∑

i(−1)i(

L−1−ii

)

x2i

i(−1)i(

L−ii

)

x2i

)−1

=

i(−1)i(

L−ii

)

x2i

i(−1)i((

L−ii

)

+(

L−ii−1

))

x2i

=

i(−1)i(

L−ii

)

x2i

i(−1)i(

L+1−ii

)

x2i. (4.16)

148

By induction, the first part of the theorem is proved.Now consider the case where the walker can have an arbitrary end-point. We can divide

any such walk into two parts, separated by the last return of the walker to the x-axis. Againwe demonstrate this in Figure 4.11. The first part is a walk in the strip which returns to0, and thus has generating function gL(x). The second part is either trivial (if the walkerends at 0), or consists of an up-step followed by a walk in a strip of width L − 1, startingfrom the equivalent of height 0 and with arbitrary end-point. Therefore the second part hasgenerating function 1 + xfL−1(x). This gives us

fL(x) = gL(x)(1 + xfL−1(x))

= gL(x) + xgL(x)gL−1(x)(1 + xfL−2(x))

= ...

= gL(x) + xgL(x)gL−1(x) + · · · + xL−1gL(x) . . . g1(x) + xLgL(x) . . . g1(x)f0(x)

= gL(x) + xgL(x)gL−1(x) + · · · + xL−1gL(x) . . . g1(x) + xLgL(x) . . . g1(x)g0(x)

(4.17)

since f0(x) = g0(x) = 1. Continuing,

fL(x) =hL(x)

hL+1(x)+ x

hL(x)

hL+1(x)

hL−1(x)

hL(x)+ · · ·+ xL

hL(x)

hL+1(x). . .

h0(x)

h1(x)

=1

hL+1(x)

L∑

i=0

xihL−i(x). (4.18)

The alternate forms we have given for gL(x) and fL(x) come from expressing hL(x) inthe form

hL(x) =λL+1

+ − λL+1−√

1 − 4x2(4.19)

which can be derived from [84, Section 1.2.9, Exercise 15]. This then gives

gL(x) =λL+1

+ − λL+1−√

1 − 4x2

√1 − 4x2

λL+2+ − λL+2

−=λL+1

+ − λL+1−

λL+2+ − λL+2

−(4.20)

149

and

fL(x) =1

hL+1(x)√

1 − 4x2

L∑

i=0

xi(λL−i+1+ − λL−i+1

− )

=1

λL+2+ − λL+2

(

λL+1+

L∑

i=0

(

x

λ+

)i

− λL+1−

L∑

i=0

(

x

λ−

)i)

=1

λL+2+ − λL+2

(

λL+1+

1 − (x/λ+)L+1

1 − x/λ+

− λL+1−

1 − (x/λ−)L+1

1 − x/λ−

)

=1

λL+2+ − λL+2

(

λL+1+ − xL+1

1 − x/λ+− λL+1

− − xL+1

1 − x/λ−

)

. (4.21)

4.6 Results

Using the transfer matrix algorithm or method of recurrences and Pade approximants, wecan find the generating function (and hence growth constant) of p n-friendly walkers in astrip of width L, for any given values of p, n and L (if they are small enough). However, ifwe want to get a sense of how the growth constant changes with respect to the parametersof the model, it would be much more useful to have a general formula for the generatingfunction in terms of these parameters. On the other hand, while we can find generatingfunctions for fixed parameter values simply by evaluating series, we have to prove theoreticalresults if we want to find a general formula, which is much harder. Rather unsurprisingly,we were unable to prove a general result in all three parameters.

However, all is not lost, as it is still of some value to prove a general formula in one of theparameters, while fixing the values of the other two. We were able to prove several formulasof this nature simply by calculating the generating functions for several cases, and guessingthe general formula. Then we used generating function arguments similar to those in Section4.5 to prove the actual formula. In this section, we give those results.

4.6.1 Variable friendliness

As we discussed in the introduction, we would like to observe how the growth constantchanges as the model moves from 0-friendliness to ∞-friendliness. Therefore it makes senseto start by fixing the number of walkers and width and changing the friendliness. We dothis in the next theorem.

Theorem 4.6.1. The generating function for two n-friendly walkers in a strip of width 3 is

∑ni=0 Fix

i

1 − x−∑n−1

i=0 Fixi+2

=1 − Fn+1x

n+1 − Fnxn+2

1 − 2x− x2 + x3 + Fnxn+2 + Fn−1xn+3(4.22)

150

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

ln ulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

nL

growth constant

(a) A configuration with dividing lines.

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

ln ulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

nL

growth constant

(b) The divided walks.

Fig. 4.12: Dividing walks in a strip of width 3 when the walkers are 2 units apart.

where Fn is the nth Fibonacci number (F0 = F1 = 1, Fn = Fn−1 + Fn−2 (n ≥ 2)). Thisextends to ∞-friendly walkers which have generating function

1/(1 − x− x2)

1 − x− x2/(1 − x− x2)=

1

1 − 2x− x2 + x3. (4.23)

Proof. Because the walkers must be an even distance apart at any time, and the strip haswidth 3, any configuration of walks must end either with the walkers together or 2 unitsapart. We start by analysing the second case.

For any walk configuration where the walks end 2 units apart, we divide the configurationsat the x-coordinates where the walkers are 2 units apart, as shown in Figure 4.12. This givesus a series of segments where the walkers start and end 2 units apart, but are never 2 unitsapart in between. We will first find the generating function of one of these segments.

Suppose that the walkers in the segment start in the state (0, 2). We can do this withoutloss of generality because this state, when reflected in the line y = 3

2, gives us the only other

possible starting state (1, 3). There are two possibilities for the first step: the walkers caneither step in the same direction to (1, 3), which immediately ends the segment, or they cancome together. In the former case, the segment contributes x to the generating function (onestep). In the latter, the walkers can stay together for up to n vertices, but once they moveapart, they will be 2 units apart and the segment ends.

Let us examine this case in more detail. The initial step, in which the walkers steptogether, contributes x to the generating function, as does the final step apart; the remaining

151

steps are identical to one walk starting at height 1 in a strip of width 3. However, to ensurethat the bounds are not exceeded when the walkers step apart, the walk must end at height1 or 2. We now can use Theorem 4.2.2, and the transfer matrix of one walk in a strip ofwidth 3, which is the 4 × 4-dimensional T , i.e.

0 1 0 01 0 1 00 1 0 10 0 1 0

. (4.24)

From Theorem 4.2.2, the generating function of a walk which starts at height 1 and endsat height 1 is

1 − x2

1 − 3x2 + x4. (4.25)

The generating function of a walk which starts at height 1 and ends at height 2 is

x

1 − 3x2 + x4. (4.26)

Therefore the generating function of one walk in a strip of width 3, starting at 1 and endingat 1 or 2 is

1 + x− x2

1 − 3x2 + x4=

1

1 − x− x2=∑

i≥0

Fixi. (4.27)

However, we have left out one restriction — the walk cannot be longer than n − 1 steps(which is equivalent to visiting n vertices). To add this restriction we simply disallow allwalks that are too long; this gives the generating function

n−1∑

i=0

Fixi. (4.28)

Recalling that we have 2 extra steps for this case, and the other case contributes x, wesee that the generating function for each of the segments is

x + x2

n−1∑

i=0

Fixi. (4.29)

The entire configuration (if the walks end apart) is a sequence of an arbitrary number ofthese segments, and therefore has the generating function

1

1 − x−∑n−1i=0 Fix

i+2. (4.30)

Now suppose that we do not limit the endpoints of the walks. We divide the walks at thelast point where the walkers are 2 units apart. The first segment is exactly the configuration

152

that we have analysed above, since the walkers end apart; the second segment is either empty(i.e. the walks end apart) or contains the walkers coming together and then walking togetherfor up to n − 1 steps. The first case contains no steps and therefore contributes 1 to thegenerating function. The second case consists of one step (contributing x) followed by theequivalent of one walk starting at height 1 in a strip of width 3, taking no more than n− 1steps. As above, we apply Theorem 4.2.2 to the transfer matrix, and find that if the lengthrestriction is removed, the generating function of this walk is

1 + x

1 − x− x2=∑

i≥0

Fi+1xi (4.31)

which means that the generating function for the second section is

1 +

n−1∑

i=0

Fi+1xi+1 =

n∑

i=0

Fixi. (4.32)

Putting it all together, the total generating function for the model that we want is theproduct of the generating functions for the first and second parts, i.e.

∑ni=0 Fix

i

1 − x−∑n−1

i=0 Fixi+2

(4.33)

which is the first expression given above.Now, multiplying a truncated Fibonacci series by 1 − x− x2 gives us

(

n∑

i=0

Fixi

)

(

1 − x− x2)

=

n∑

i=0

Fixi −

n∑

i=0

Fixi+1 −

n∑

i=0

Fixi+2

= 1 + x +n∑

i=2

Fixi − x− Fnx

n+1 −n∑

i=2

Fi−1xi − Fn−1x

n+1 − Fnxn+2 −

n∑

i=2

Fi−2xi

= 1 − (Fn−1 + Fn)xn+1 − Fnx

n+2 +

n∑

i=2

(Fi − Fi−1 − Fi−2)xi

= 1 − Fn+1xn+1 − Fnx

n+2. (4.34)

From this it is easily seen that multiplying the top and bottom of the first expression by1 − x− x2 gives us the second expression above.

Using a similar technique, we were able to extend this result to a strip of width 4.

153

Theorem 4.6.2. The generating function for two n-friendly walkers in a strip of width 4 is

∑ni=0 aix

i − 2(3kxn+2 +∑n

i=k+2 3i−2x2i+1)∑k+1

i=0 bix2i + 2(3kxn+3 +

∑ni=k+2 3i−2x2i+2)

=1 + 2x− x2 − 2x3 + 3kxn+1(−2 − 6x + 4x3) + 2(32k)x2n+3

1 − 8x2 + 8x4 + 3kxn+3(9 − 4x2) − 2(32k)x2n+4(4.35)

if n = 2k + 1 is odd, and

∑ni=0 aix

i + 3k−1xn+1 + 2(−3k−1xn+2 +∑n

i=k+1 3i−2x2i+1)∑k

i=0 bix2i − 2(3k−1xn+2 +

∑ni=k+1 3i−2x2i+2)

=1 + 2x− x2 − 2x3 + 3k−1xn+1(−3 − 8x− x2 + 6x3) − 2(32k−1)x2n+3

1 − 8x2 + 8x4 + 3k−1xn+2(5 + 4x2) + 2(32k−1)x2n+4(4.36)

if n = 2k is even, where ai are coefficients defined by a0 = 1, a1 = 2, a2i = 2(3)i−1 (i ≥ 1),and a2i+1 = 4(3)i−1 (i ≥ 1), and bi are coefficients defined by b0 = 1, b1 = −5, andbi = −7(3)i−2 (i ≥ 2). This also extends to the ∞-friendly case, which has generatingfunction

1 + 2x+ 2x2(1 + 2x)/(1 − 3x2)

1 − 5x2 − 7x4/(1 − 3x2)=

1 + 2x− x2 − 2x3

1 − 8x2 + 8x4. (4.37)

Proof. We work on a similar principle to the proof of Theorem 4.6.1. In that proof, there wereonly 2 possible states (1 up to reflection) where the walkers could be separate, and we dividedthe walks whenever we reached those states. We do the same here, but this time there are 4such states — (0, 2), (1, 3), (2, 4), and (0, 4). As before, we will identify the states (0, 2) and(2, 4) with each other, as they are equivalent under reflection in y = 2, and denote this stateby ‘(0, 2)/(2, 4)’. We give these states the arbitrary ordering of (0, 2)/(2, 4), (1, 3), (0, 4).

Firstly, suppose that the walks end apart. We divide the walks whenever one of thesestates is attained, as before. We can now recast the problem as a path-on-a-graph problem,but in a different way from before. Rather than checking the state of the walkers at everyx-coordinate, we only have states on the graph whenever the walkers are not together. Thismeans that we have exactly 3 states, so the corresponding transfer matrix is 3 × 3.

However, the elements of the transfer matrix must now contain the possibility of thewalkers coming together in between the dividing points. To do this, we modify the transfermatrix — instead of only containing 1s and 0s, the elements are now the generating functionof all possible transitions from state to state. Note that this isn’t a ‘true’ transfer matrix —it corresponds to xA in Theorem 4.2.2 rather than A itself.

We now try to find this modified transfer matrix. For the column and row pertaining to(0, 4), all steps to or from (0, 4) must go to (1, 3), so the corresponding elements are x forthe state (1, 3) and 0 otherwise. For the other states, we must cover 2 possibilities — eitherthe walkers do not come together (in which case the generating function is easy to find) orthey do (in which case we must resort to the model of 1 walker in a strip of width 4).

154

For example, let us find the element of the matrix in the row of (0, 2)/(2, 4) and columnof (1, 3). We will take the starting state to be (0, 2) (since (2, 4) gives the same result). Nowit is possible to go from (0, 2) to (1, 3) in one step, which contributes x to the generatingfunction. Otherwise, any transition between these states must consist of an opening andclosing step (contributing x2) bracketing the equivalent of a single walk in a strip of width4, starting at height 1 and ending at height 2, and comprising no more than n− 1 steps.

Now the transfer matrix for such a walk is the 5 × 5 T matrix, i.e.

0 1 0 0 01 0 1 0 00 1 0 1 00 0 1 0 10 0 0 1 0

. (4.38)

Using Theorem 4.2.2 again, removing the length restriction on the single walk gives us agenerating function of

x− x3

1 − 4x2 + 3x4=

x

1 − 3x2=∑

i≥0

3ix2i+1. (4.39)

To ensure that there are no powers of x exceeding n−1, we must cut off the sum at i = b n−22c,

and therefore the modified transfer matrix entry is

x +

bn−2

2c

i=0

3ix2i+3 = x + x3 1 − (3x2)bn2c

1 − 3x2. (4.40)

Using more or less identical procedures, we can find the remaining elements of the modi-fied transfer matrix. The only point of interest is that when calculating entries for the columncorresponding to (0, 2)/(2, 4) we must allow for the possibility of going to either state. Thatis why, for example, the ((1, 3), (0, 2)/(2, 4)) entry is twice that of the ((0, 2)/(2, 4), (1, 3))entry, although any of the walks can be reversed. From our calculations, we find the modifiedtransfer matrix to be

x2 1−(3x2)bn+1

2c

1−3x2 x + x3 1−(3x2)bn2c

1−3x2 0

2x+ 2x3 1−(3x2)bn2c

1−3x2

13x2 + 2

3x2 1−(3x2)b

n+12

c

1−3x2 x

0 x 0

. (4.41)

We can now apply (the always-useful) Theorem 4.2.2 to this transfer matrix. As we haveincluded the step weights in the matrix, the matrix replaces xA in the theorem. We usethe theorem to find the generating functions of all configurations where the walkers end ina certain state.

For example, suppose we wish to find the g.f. of walks where the walkers end at (0, 2)

155

or (2, 4). If we assume that n = 2k is even (the calculation for n odd is similar), Theorem4.2.2 gives the numerator of the g.f. as

1 − 1

3x2 − 2

3x2 1 − (3x2)k

1 − 3x2− x2 =

1

1 − 3x2

(

1 − 5x2 + 4x4 + 2(3k−1)xn+2)

(4.42)

and the denominator as

(

1 − x2 1 − (3x2)k

1 − 3x2

)

1

1 − 3x2

(

1 − 5x2 + 4x4 + 2(3k−1)xn+2)

−(

x + x3 1 − (3x2)k

1 − 3x2

)(

2 + 2x3 1 − (3x2)k

1 − 3x2

)

=1

1 − 3x2

(

1 − 8x2 + 8x4 + 3k−1xn+2(5 + 4x2) + 2(32k−1)x2n+4)

(4.43)

after some manipulation (we used Maple). We can do the same for all possible ending states,to get the generating function of all walks which end at a given state.

Now we return to the original problem, where the walkers are not constrained to endapart. As in Theorem 4.6.1, we divide the configurations at the last point where the walkersare apart. Figure 4.13 shows the full decomposition of a walk configuration.

From above, we know the generating function of the first section for any given endingstate. The second section is either empty with generating function 1, or contains a stepwhere the walkers come together (g.f. x) followed by the equivalent of one walker in a stripof width 4. The starting height of this walker depends on the final state of the first section,but we can use Theorem 4.2.2 and the transfer matrix of one walker to find the generatingfunction for any starting height.

Without going into the details (which in any case are almost identical to previous calcula-tions involving Theorem 4.2.2), it is simpler to consider the cases n even and odd separately,and then calculate the generating functions to each state (first section) and from each state(second section). These generating functions are shown in Table 4.1.

To find the generating function of the total number of walks with unrestricted endings,we multiply the generating functions to and from each state, and sum the results over allstates. After a lot of algebraic manipulation, we reach the second form in the statement ofthe theorem above. In a similar fashion to Theorem 4.6.1, dividing both sides of the fractionby 1−3x2 results in the first form, which seems to be the form with the lowest order in bothnumerator and denominator.

It is worth noting that it should be possible to use a similar procedure to the above twotheorems to find the generating function in terms of n for two walkers in a strip of any width,but obviously the algebra becomes very much more complicated as the width increases.

Interestingly, there appears to be a similar formula for three n-friendly walkers in a stripof width 4.

156

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

ln ulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

nL

growth constant

(a) Walks in a strip of width 4 with dividing lines.

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

ln ulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

nL

growth constant

(b) The divided walks.

Fig. 4.13: Dividing walks in a strip of width 4 whenever the walkers are apart.

State (0, 2)/(2, 4) (1, 3) (0, 4)

g.f. to state (n = 2k) 1−5x2+4x4+2(3k−1)xn+2

d1(x)x−2x3−3kxn+3

d1(x)x2−2x4−3kxn+4

d1(x)

g.f. to state (n = 2k + 1) 1−5x2+4x4+2(3k)xn+3

d2(x)x−2x3−3kxn+2

d2(x)x2−2x4−3kxn+3

d2(x)

g.f. from state (n = 2k) 1 + x 1−2x−(1+2x)3kxn

1−3x2 1 + x1+3x+x2−(4+6x)3k−1xn

1−3x2 1

g.f. from state (n = 2k + 1) 1 + x 1+2x−(2+3x)3kxn

1−3x2 1 + x1+2x+x2−(2+4x)3kxn

1−3x2 1

where

d1(x) = 1 − 8x2 + 8x4 + 3k−1xn+2(5 + 4x2) + 2(32k−1)x2n+4

d2(x) = 1 − 8x2 + 8x4 + 3kxn+3(9 − 4x2) − 2(32k)x2n+4.

Tab. 4.1: Generating functions to and from all states for two walkers in a strip of width 4.

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

ln ulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

nL

growth constant

Fig. 4.14: A possible way in which walkers may separate and then join in a single step.

Theorem 4.6.3. The generating function for three n-friendly walkers in a strip of width 4is

∑ni=0 aix

i − 5∑n

i=k+1 3i−2x2i

∑ki=0 bix

2i +∑n

i=k+1 3i−2x2i=

1 + 2x+ 2x2 + 3k−1xn+1(−6 − 20x) + 5(32k−1)x2n+2

1 − 8x2 + 16(3k−1)xn+2 − 32k−1x2n+2

(4.44)if n = 2k is even, and

∑ni=0 aix

i + 3kxn+1 + 5∑n

i=k+2 3i−2x2i

∑ki=0 bix

2i − 3k+1xn+1 −∑n

i=k+2 3i−2x2i=

1 + 2x+ 2x2 + 3kxn+1(−4 − 6x+ 2x2) − 5(32k)x2n+2

1 − 8x2 + 3kxn+1(2 + 8x2) + 32kx2n+2

(4.45)if n = 2k + 1 is odd, where ai are coefficients defined by a0 = 1, a2i−1 = 2(3i−1) (i ≥ 1), anda2i = 5(3i−1) (i ≥ 1), and bi are coefficients defined by b0 = 1, bi = −5(3i−1) (i ≥ 1).

Again, this extends to the (GV) ∞-friendly case, which has generating function

1 + (2x+ 5x2)/(1 − 3x2)

1 − 5x2/(1 − 3x2)=

1 + 2x + 2x2

1 − 8x2. (4.46)

Proof. We approach this proof in a similar manner to the proofs of the previous two theorems— an essentially two-layered approach where we divide the configurations at certain points,find generating functions for all possible walks in each segment, and then combine thesefunctions together into one big generating function for the required walks.

In the previous theorems, we chose our division points to be all the points where thewalkers were apart. However, what we really want is for each segment to have an easilycalculable generating function. In the previous theorems, our method of division workedbecause by forcing the walkers to stay together, we essentially created a one-walker model.It is tempting to try the same thing for 3 walkers and divide whenever the walkers areall separate, with the hope of forcing two of the walkers together and creating a vicioustwo-walker model (which is vicious because we cannot have 3 walkers at the same vertex).However, this does not work because it is possible to step from a state where two walkers aretogether to a state where two different walkers are together in a single step, and thereforewithout reaching a state (with integral x-coordinate) where all the walkers are separate, asshown in Figure 4.14.

In order to ensure that walkers which are together stick together within a segment, we

158

divide the walks not only when all walkers are separate (state (0, 2, 4)) but also when twowalkers are together for the first vertex. Extending our notation for two-walker states in anintuitive manner, these states are (0, 2, 2)1, (1, 1, 3)1, (1, 3, 3)1, and (2, 2, 4)1. Note that anystates where two walkers are together and at height 0 or 4 are reachable (for large enoughn), but will not occur with subscript 1, as that would imply that one of the walkers hadexceeded the boundary in the previous step.

As before, we identify the states (0, 2, 2)1 and (2, 2, 4)1 with each other, since they arereflections of each other in y = 2. We also identify the states (1, 1, 3)1 and (1, 3, 3)1. As before,we denote these amalgamations by (0, 2, 2)1/(2, 2, 4)1 and (1, 1, 3)1/(1, 3, 3)1 respectively.

Now, as before, we set up a modified transfer matrix that contains the generating func-tions of all possible transitions between states. The actual calculations of these generatingfunctions are similar to before, although we must of course note the qualitative differencebetween the states.

For the row corresponding to (0, 2, 4), any one step must bring two walkers together, andin the next state all walkers must have odd y-coordinates. Since the walkers can go to either(1, 1, 3)1 or (1, 3, 3)1, the only non-zero entry in this row is in the column corresponding tothe state (1, 1, 3)1/(1, 3, 3)1, where the entry is 2x.

Now let us look at the entry from (0, 2, 2)1/(2, 2, 4)1 to (0, 2, 4). By reflection, we can justcount the paths from (0, 2, 2)1 to (0, 2, 4). These paths will consist of a section where thetwo higher walkers stay together, after which they must separate, since they end separated.However, at the time when the walkers separate, either all walkers are separate, or the lowertwo walkers are at the same height for the first vertex. Therefore they must separate at adividing state, and thus cannot separate before the final state (0, 2, 4).

The only state which can go to (0, 2, 4) in one step, but has the top two walkers joined,is (1, 3, 3) with any subscript. Since the top two walkers must stay together, any set of walksfrom (0, 2, 2)1 to (1, 3, 3) is equivalent to two vicious walkers in a strip of width 4, startingfrom (0, 2) and ending at (1, 3). Since the original walkers are n-friendly, the vicious walkersmust take no more than n− 1 steps, but are unrestricted otherwise.

From Theorem 4.3.1, the transfer matrix for two vicious walkers in a strip of width 4 is

0 1 0 01 0 1 10 1 0 00 1 0 0

(4.47)

where we take the states in the order (0, 2), (1, 3), (2, 4), and (0, 4). Applying Theorem 4.2.2,the g.f. of walks from (0, 2) to (1, 3) is

x

1 − 3x2=∑

i≥0

3ix2i+1. (4.48)

Applying the length restriction and remembering that we still need another step to get from

159

State (0, 2, 4) (0, 2, 2)1/(2, 2, 4)1 (1, 1, 3)1/(1, 3, 3)1

g.f. to state (n = 2k) 1−6x2+10(3k−1)xn+2−32k−1x2n+2

d1(x)2x2−2(3k)xn+2

d1(x)2x−8x3+2(3k)xn+3

d1(x)

g.f. to state (n = 2k + 1) 1−6x2+2(1+x2)3kxn+1+32kx2n+2

d2(x)2x2−2(3k+1)xn+1

d2(x)2x−8x3+2(3k)xn+2

d2(x)

g.f. from state (n = 2k) 1 (1+x)(1−3kxn)1−3x2

1+x−(1+3x)3kxn

1−3x2

g.f. from state (n = 2k + 1) 1 1+x−(1+3x)3kxn

1−3x2

(1+x)(1−3k+1xn)1−3x2

where

d1(x) = 1 − 8x2 + 16(3k−1)xn+2 − 32k−1x2n+2

d2(x) = 1 − 8x2 + 3kxn+1(2 + 8x2) + 32kx2n+2.

Tab. 4.2: Generating functions to and from all states for three walkers in a strip of width 4.

(1, 3, 3) to (0, 2, 4) gives the entry in the modified transfer matrix as

x

bn−2

2c

i=0

3ix2i+1 = x2 1 − (3x2)bn2c

1 − 3x2. (4.49)

The remaining entries are calculated in a similar way. In each case we have set thedivisions up so that the walkers which are together must stay together within a segment,except at the last point. Taking the states in the order (0, 2, 4), (0, 2, 2)1/(2, 2, 4)1 and(1, 1, 3)1/(1, 3, 3)1, the full modified transfer matrix is

0 0 2x

x2 1−(3x2)bn2c

1−3x2 x2 1−(3x2)bn2c

1−3x2 x1−2x2−x2(3x2)bn−1

2c

1−3x2

x1−(3x2)bn+1

2c

1−3x2 x1−(3x2)bn+1

2c

1−3x2 x2 1−(3x2)bn2c

1−3x2

. (4.50)

What remains is for us to construct the possible end-segments. Again, if walkers whichare together step apart, a dividing state must be reached, and therefore walkers which aretogether at the last dividing point must stay together. For (0, 2, 4), this means that there canbe no further steps, but for the other states, we have the equivalent of two vicious walkersstarting at either (0, 2) or (1, 3). Again we can find the required generating functions byapplying Theorem 4.2.2. The results are shown in Table 4.2.

By multiplying the generating functions to and from each state and summing over allstates, we achieve the second form given in the statement of the theorem. Again, the firstform is achieved by dividing both numerator and denominator by 1 − 3x2.

Looking at the first forms given in the above theorems, a pattern emerges. Both numer-

160

ator and denominator have ‘fixed’ coefficients — as n increases, these coefficients stay thesame. It seems that for any particular n, all coefficients up to xn (and sometimes xn+1) arefixed, while higher coefficients (if they exist) vary with n. We also notice that if we extendthese fixed coefficients into an infinite series, we derive the ∞-friendly generating function.Unfortunately, it seems that as we increase the strip width, the order of the numerator anddenominator grows, but the number of fixed coefficients stays the same, so we just get moreunfixed (and unpredictable) coefficients.

4.6.2 Variable number of walkers

Although varying the friendliness (n) and analysing the effect is our main goal, it is stillinteresting to study the effect of other variables on the generating function and growthconstant. Again, general formulas are difficult to prove, but for a very small width tonumber of walkers ratios, the lack of ‘room’ for the walkers to move in can result in verysimple generating functions. We present a few such results in this section.

Theorem 4.6.4. The generating function for p vicious walkers in a strip of width 2p− 1 is

1

1 − x. (4.51)

The generating function for p vicious walkers in a strip of width 2p is

1 + x

1 − (p+ 1)x2. (4.52)

Proof. The first result is obvious, as there is only one possible configuration of walks at anylength.

For the second result, at positions with odd x-coordinate there is only one possible statethat the walkers can be in. At positions with even x-coordinate, there are p + 1 possiblestates. This can be seen by observing that there are p+1 points in the strip with even height,so there must be one ‘empty’ space which can be placed in any of p+ 1 places. All of thesestates can be reached from and can go to the only possible state with odd x-coordinate withone step. Thus the number of configurations is a sequence 1, 1, p+1, p+1, (p+1)2, (p+1)2, . . . .The generating function now follows.

Extending the width by one more unit gives us a much more complicated generatingfunction, but it is still provable.

Theorem 4.6.5. The generating function for p vicious walkers in a strip of width 2p+ 1 is

fp(x) =hp−1(x)

hp+1(x)(4.53)

161

where

hp(x) =∑

i

(−1)bi+1

2c(bp+i

2c

i

)

xi (4.54)

is a polynomial of degree p. Alternatively, if we define λ± = 2−x2±x√x2−4

2, then

hp(x) =

(

1

2− 2x+ x2

2x√x2 − 4

)

λk+ +

(

1

2+

2x+ x2

2x√x2 − 4

)

λk− (4.55)

if p = 2k is even, and

hp(x) =

(

1 − x

2− 2x + x2 − x3

2x√x2 − 4

)

λk+ +

(

1 − x

2+

2x + x2 − x3

2x√x2 − 4

)

λk− (4.56)

if p = 2k + 1 is odd.

Proof. A number of properties of the conjectured generating function are instrumental inproving this theorem. For convenience we state these as a separate lemma.

Lemma 4.6.6. If fp(x) and hp(x) are as stated in Theorem 4.6.5, then

1. hp(x) = hp−2(x) − xhp−1(−x)

2. hp(x) + hp−2(x) = (2 + (−1)px)hp−1(x)

3. hp(x) = (2 − x2)hp−2(x) − hp−4(x)

4. fp(x) = 12−x2−fp−2(x)

5. fp(x) =1+fp−1(x)

3−x2−fp−1(x)

6. fp(−x) =(

fp−1(x)−1

fp+1(x)−1

)

fp+1(x).

Proof. 1.

hp(x) =∑

i

(−1)bi+1

2c(bp+i

2c

i

)

xi

=∑

i

(−1)bi+1

2c((bp−2+i

2c

i

)

+

(bp−1+i−12

ci− 1

))

xi

= hp−2(x) + x∑

i

(−1)bi+2

2c(bp−1+i

2c

i

)

xi

= hp−2(x) − x∑

i

(−1)bi2c+i(bp−1+i

2c

i

)

(−x)i

= hp−2(x) − xhp−1(−x) (4.57)

162

since (−1)bi2c+i = (−1)i−b i

2c = (−1)b

i+1

2c. Note that this implies

hp(−x) = (hp−1(x) − hp+1(x))/x.

2. If p is even, then

(2 + (−1)px)hp−1(x)

= 2∑

i

(−1)bi+1

2c(bp−1+i

2c

i

)

xi + (−1)p∑

i

(−1)bi+1

2c(bp−1+i

2c

i

)

xi+1

=∑

i

(

2(−1)bi+1

2c(bp−1+i

2c

i

)

+ (−1)p+b i2c(bp−2+i

2c

i− 1

))

xi

=∑

i even

(

2(−1)i2

(p2

+ i2− 1

i

)

+ (−1)i2

( p2

+ i2− 1

i− 1

))

xi

+∑

i odd

(

2(−1)i+1

2

( p2

+ i−12

i

)

+ (−1)i−1

2

(p2

+ i−12

− 1

i− 1

))

xi

=∑

i even

(−1)i2

((p2

+ i2

i

)

+

( p2

+ i2− 1

i

))

xi

+∑

i odd

(−1)i+1

2

(

2

(p2

+ i−12

i

)

−( p

2+ i−1

2− 1

i− 1

)

−(p

2+ i−1

2− 1

i

)

+

( p2

+ i−12

− 1

i

))

xi

=∑

i even

(−1)i2

((p2

+ i2

i

)

+

( p2

+ i2− 1

i

))

xi +∑

i odd

(−1)i+1

2

((p2

+ i−12

i

)

+

( p2

+ i−12

− 1

i

))

xi

=∑

i

(−1)bi+1

2c((bp+i

2c

i

)

+

(bp−2+i2

ci

))

xi

= hp(x) + hp−2(x). (4.58)

The proof is similar for the case where p is odd.

3. From (1),

hp(x) = hp−2(x) − xhp−1(−x)= hp−2(x) − x(hp−3(−x) + xhp−2(x))

= (1 − x2)hp−2(x) + hp−2(x) − hp−4(x)

= (2 − x2)hp−2(x) − hp−4(x). (4.59)

163

4. From (3),

fp(x) =hp−1(x)

hp+1(x)

=1

(2−x2)hp−1(x)−hp−3(x)

hp−1(x)

=1

2 − x2 − fp−2(x). (4.60)

Note that this implies fp(x) = 2 − x2 − 1fp+2(x)

.

5. From (2) and (3),

1 + fp−1(x)

3 − x2 − fp−1(x)=

hp(x) + hp−2(x)

(3 − x2)hp(x) − hp−2(x)

=hp(x) + hp−2(x)

hp(x) + hp+2(x)

=(2 + (−1)px)hp−1(x)

(2 + (−1)p+2x)hp+1(x)

=hp−1(x)

hp+1(x)= fp(x). (4.61)

6. From (1),

fp(−x) =hp−1(−x)hp+1(−x)

=(hp−2(x) − hp(x))/x

(hp(x) − hp+2(x))/x

=hp−2(x)/hp+2(x) − hp(x)/hp+2(x)

hp(x)/hp+2(x) − hp+2(x)/hp+2(x)

=fp−1(x)fp+1(x) − fp+1(x)

fp+1(x) − 1

=

(

fp−1(x) − 1

fp+1(x) − 1

)

fp+1(x). (4.62)

We return to the proof of Theorem 4.6.5. To prove the theorem, we will find a recur-rence in p that the actual generating function satisfies, and prove that the stated generatingfunction also satisfies it.

Note that at any particular time, there are exactly p+1 possible positions for the walkersto be in. This can be seen by observing that the walkers must have either all odd or all

164

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

ln ulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

nL

growth constant

Fig. 4.15: A configuration showing all possible even-height states for three vicious walkers in a stripof width 7. There is only one state where the lowest walker is at height 2.

even y-coordinates; in either case, there are p + 1 such y-coordinates in a strip of width2p + 1, which means that there are p + 1 choices for the only unoccupied y-coordinate (or‘hole’), and thus p + 1 choices for the entire set of walkers. Note in particular that if thefirst (lowest) walker has y-coordinate 2 or 3, the no-crossing constraint ensures that the lone‘hole’ is below the first walker, and thus there is exactly 1 possible arrangement for all thewalkers, namely (2, 4, . . . , 2p) if the first walker is at height 2 or (3, 5, . . . , 2p+ 1) otherwise.Here we have again extended the notation of Lemma 4.3.1 in the logical manner. We showthis situation in Figure 4.15.

Let the generating function of the walks be gp(x). We shall show that gp(x) = fp(x).gp(x) is also the g.f. of walks which start from (3, 5, . . . , 2p+1), since that state is a reflectionin y = p+ 1

2of the required starting state (0, 2, . . . , 2p−2). We will construct a recurrence by

considering the generating function of walks which start from (2, 4, . . . , 2p), which we denoteby gp(x). Since this state is the only state that is reachable from (3, 5, . . . , 2p+ 1) after onestep, we know that gp(x) is merely gp(x) with the first step taken off. In other words,

gp(x) =gp(x) − 1

x. (4.63)

Now consider a configuration of walks starting from (2, 4, . . . , 2p). We divide the walksat all points where the first walker has a height of 2. An example is shown in Figure4.16. Looking at the first walker only, there are two possibilities for a divided segment, asdemonstrated in Figure 4.16(b): either the walker steps up, in which case it steps downimmediately and is at height 2, or the walker steps down, spends an unspecified amount oftime oscillating between 0 and 1, and then returns to height 2.

Because the highest p− 1 walkers have only one position to go to when the first walkeris at height 2 or 3, the first possibility has generating function x2.

The second possibility is more complicated. We can say that since the first walker startsat 2, the starting position for all walkers must be (2, 4, . . . , 2p). In particular, the highest

165

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

ln ulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

nL

growth constant

(a) A configuration for three vicious walkers ina strip of width 7.

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

ln ulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

nL

growth constant

(b) The first walk of the configuration. There aretwo possibilities for this walk.

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

ln ulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

nL

growth constant

(c) The remaining walks of the configuration. Depending on the firstwalk, they are either totally constrained or are p − 1 vicious walks in astrip.

Fig. 4.16: Dividing configurations when the first walker reaches height 2.

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

ln ulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

nL

growth constant

Fig. 4.17: Possible non-trivial paths for the first walker in an end-segment.

p−1 walkers start from (4, 6, . . . , 2p). These walkers never go below the line y = 2, since thatwould violate the vicious constraint. However, until the last point of the segment (where thefirst walker returns to 2), the only other restrictions on the walkers are the vicious constraintand the upper wall.

Therefore we can treat the walkers, from the starting point of the segment to one stepbefore the first walker returns to 2, as p − 1 vicious walkers in a strip of width 2p − 1 =2(p − 1) + 1, namely the strip from y = 2 to y = 2p + 1. The walkers start from the state(4, 6, . . . , 2p). This is shown in Figure 4.16(c). Translating this model down by 2 units showsthat its generating function is gp−1(x).

However, we must restrict the lengths of these walks to be odd, because the first walkermust end at height 1 (to take the next step to height 2). To do this we subtract gp−1(−x)from gp−1(x) and divide by 2. We must also multiply by a factor of x for the uncounted laststep, as there is only a single possibility for this step since the first walker goes to height 2.So the generating function for the second possibility is

1

2x (gp−1(x) − gp−1(−x)) =

1

2(gp−1(x) + gp−1(−x)) − 1 (4.64)

which means that the total generating function for each segment is

x2 +1

2(gp−1(x) + gp−1(−x)) − 1. (4.65)

Now we look at the possible end-segments after the last division point. There are 3possible configurations, of which the non-trivial ones are shown in Figure 4.17. In the firstpossibility, the walks end at the last division point, so the end-segment has generating func-tion 1. In the second, the first walker steps up to height 3, and must end there. This leavesexactly one possibility for the remaining walkers, so the segment has generating function x.

The last possibility is that the first walker steps down to 1 and does not return to 2. Wecan use the same argument as above to relate the remaining walkers to p− 1 vicious walkersin a strip of width 2p− 1. However, this time the first walker does not need to end at height1, so the length can be even or odd. Furthermore, the first walker takes at least 1 step, so wemust ensure that the remaining walkers have length at least 1. This is done by subtracting1 from the generating function to give a g.f. of gp−1(x) − 1. Therefore the total generating

167

function for an end-segment is

1 + x + gp−1(x) − 1 = x+gp−1(x) − 1

x. (4.66)

We now put all our generating functions together. The generating function for p viciouswalkers in a strip of width 2p+1 starting from (2, 4, . . . , 2p) is gp(x). But these walks consistof an arbitrary number of divided segments, followed by exactly one end-segment. Therefore

gp(x) − 1

x=

1

1 − (x2 + 12(gp−1(x) + gp−1(−x)) − 1)

(

x+gp−1(x) − 1

x

)

(4.67)

and so

gp(x) = 1 +x2 + gp−1(x) − 1

2 − x2 − 12(gp−1(x) + gp−1(−x))

=1 + 1

2(gp−1(x) − gp−1(−x))

2 − x2 − 12(gp−1(x) + gp−1(−x))

. (4.68)

Since g1(x) = f1(x) = 11−x−x2 , all that remains is to show that fp(x) satisfies the above

recurrence. This is equivalent to showing that

(4 − 2x2)fp(x) − fp(x)fp−1(x) − fp(x)fp−1(−x) − fp−1(x) + fp−1(−x) − 2 = 0. (4.69)

Now we can apply Lemma 4.6.6, (4), (5) and (6):

(4 − 2x2)fp(x) − fp(x)fp−1(x) − fp(x)fp−1(−x) − fp−1(x) + fp−1(−x) − 2

= (4 − 2x2)fp(x) − fp(x)fp−1(x) −fp−2(x) − 1

fp(x) − 1f 2p (x) − fp−1(x) +

fp−2(x) − 1

fp(x) − 1fp(x) − 2

= (4 − 2x2 − fp−1(x))fp(x) −fp−2(x) − 1

fp(x) − 1(fp(x) − 1)fp(x) − fp−1(x) − 2

= (4 − 2x2 − fp−1(x) − fp−2(x) + 1)fp(x) − fp−1(x) − 2

= (5 − 2x2 − fp−1(x) − (2 − x2 − 1

fp(x)))fp(x) − fp−1(x) − 2

= (3 − x2 − fp−1(x))fp(x) − (1 + fp−1(x))

= (3 − x2 − fp−1(x))1 + fp−1(x)

3 − x2 − fp−1(x)− (1 + fp−1(x))

= 0. (4.70)

The alternate form of hp(x) can be derived by calculating the generating function of

168

hp(x). From Lemma 4.6.6 (3), we have

p≥4

hp(x)yp = (2 − x2)

p≥4

hp−2(x)yp −

p≥4

hp−4(x)yp (4.71)

p≥0

hp(x)yp = 1 + (1 − x)y + (1 − x− x2)y2 + (1 − 2x− x2 + x3)y3

+(2 − x2)y2

(

p≥0

hp(x)yp − 1 − (1 − x)y

)

− y4∑

p≥0

hp(x)yp

(4.72)∑

p≥0

hp(x)yp(1 − (2 − x2)y2 + y4) = 1 + (1 − x)y − (1 + x)y2 − y3 (4.73)

p≥0

hp(x)yp =

1 + (1 − x)y − (1 + x)y2 − y3

1 − (2 − x2)y2 + y4. (4.74)

Expanding in partial fractions gives the alternate form.

As an interesting aside, we know the actual growth constant for the vicious walk modelfrom [90] — for p walkers in a strip of width L, it is 2p

∏ps=1 cos sπ

L+2. By equating this with the

growth constants that we derive from our generating functions, we derive the trigonometricidentities

2pp∏

s=1

cossπ

2p+ 1= 1 (4.75)

2pp∏

s=1

cossπ

2p+ 2=√

p+ 1 (4.76)

and since the inverse of the growth constant is a zero of the denominator of the generatingfunction, we also have

hp+1

(

2pp∏

s=1

cossπ

2p+ 3

)−1

= 0. (4.77)

169

In fact, the first identity is fairly simple to derive by standard methods:

2pp∏

i=1

cosiπ

2p+ 1sin

2p+ 1=

p∏

i=1

sin2iπ

2p+ 1

= sin2π

2p+ 1sin

2p+ 1. . . sin

(2p− 2)π

2p+ 1sin

2pπ

2p+ 1

= sin2π

2p+ 1sin

2p+ 1. . . sin

2p+ 1sin

π

2p+ 1

=

p∏

i=1

siniπ

2p+ 1(4.78)

where we use sin(π − x) = sin x on the last d p2e terms. But since this also implies

2p+1

p+1∏

s=1

cossπ

2p+ 3= 1, (4.79)

the third identity can be restated as

hp+1

(

2 cos(p+ 1)π

2p+ 3

)

= 0. (4.80)

Furthermore, using a combination of the first and second identities, we can restate thegrowth constant for vicious walkers. It turns out to be

2pp∏

s=1

cossπ

2p+ 2k= 21−k√p+ k

k−1∏

i=1

sec(p + i)π

2p + 2k(4.81)

if L = 2p+ 2k − 2 is even and

2pp∏

s=1

cossπ

2p+ 2k + 1= 2−k

k∏

i=1

sec(p+ i)π

2p+ 2k + 1(4.82)

if L = 2p+ 2k − 1 is odd.

4.7 Growth constants

Recalling our motivation in the introduction, we wanted to look at the dependence of thegrowth constant on n, the friendliness, for fixed width and number of walkers. We will denotethe growth constant for p n-friendly walkers in a strip of width L by µp,n(L). By analyzing thezeros of the denominators of our calculated generating functions, it is immediately apparentthat the growth constant increases monotonically with n. We show a plot of µ against n in

170

0.8

1

1.2

1.4

1.6

1.8

2

2.2

2.4

2.6

2.8

0 5 10 15 20 25 30

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

lnulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

nL

growth constant

Fig. 4.18: Growth constants vs. friendliness for 2 walkers, width 3 (plus signs) and 4 (crosses).

Figure 4.18.Now, we would like to know the nature of this dependence. By taking a log-plot of our

growth constants, it seems very likely that the relationship is exponential in nature. Byfitting lines to the points, we derive the (approximate) relationships

µ2,∞(3) − µ2,n(3) ∼ 0.415(2) × 0.7192(1)n (4.83)

andµ2,∞(4) − µ2,n(4) ∼ 0.4329(7) × 0.66265(4)n. (4.84)

The standard errors are shown in brackets. We show the log-plot for width 4 in Figure 4.19.In our previous paper ([36]), we noted that the vicious growth constant for two walkers

is 4 cos πL+2

cos 2πL+2

and the ∞-friendly growth constant is 4 cos πL+4

cos 2πL+4

. Because of this,we speculated that the growth constant for finite friendliness might take the similar form4 cos π

L+λ2(n)cos 2π

L+λ2(n). Unfortunately, this is (rather obviously, as it turned out) not the

case, as simply calculating possible values for λ2(1) by finding the growth constants for L = 4and L = 5 produces different values. However, the idea is that the dependence of the growthconstant on the width of the strip is of a similar nature for any friendliness, and this seemsto be true. We show a plot of this in Figure 4.20.

171

-14

-12

-10

-8

-6

-4

-2

0

0 5 10 15 20 25 30

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

ln ulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

n

Lgrowth constant

ln(µ2,∞(4) − µ2,n(4))

Fig. 4.19: Log-plot of Figure 4.18 for width 4, with asymptotic fitted line.

0

0.5

1

1.5

2

2.5

3

3.5

2 3 4 5 6 7 8

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

lnulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

n

L

growth constant

Fig. 4.20: Growth constant vs. strip width for 2 walkers, for vicious (plus signs) and 1-friendly(crosses) walkers.

PSfrag replacements

Vψa

b

c

d

σ1

σ2

σ3

σ4

σ5

FXYAω

F (σ1, σ2)F (σ2, σ3)F (σ3, σ4)F (σ4, σ5)

J1

J2

H

ξ1

ξ2

ξc

u

v

(0,√

2 − 1)

ln ulnm(u, 0)m(0.42, 0)

iterationsmatrix size

uv

m(u, 0)(1

2−m)8

nL

growth constant

Fig. 4.21: Example configuration for three 4-friendly walks in a strip of width 5, bandwidth 3.

4.8 Bandwidth

Extending the model to cases with p > 2 suggests a natural generalization. The restrictionthat at most two walkers can meet at any one point — which is the sole difference betweenthe GV ∞-friendly model and the TK ∞-friendly model — seems to be relatively arbitrary.This suggests that we can create another parameter of the model, which we call bandwidth(denoted by b), which denotes the maximum number of walkers that can meet at any pointor line.

Naturally, the concept of bandwidth is only relevant if p, the number of walkers, is greaterthan 2. The GV ∞-friendly model then has b = 2, the smallest possible value it can takewithout forcing the walkers to be vicious, and the TK model has b = p, the largest possiblevalue it can take. We show an example configuration with higher bandwidth in Figure 4.21.

We were able to adapt the method of recurrences described in Section 4.4 for higherbandwidth. Using this method, we calculated some generating functions for p = b = 3, forsmall widths. The results are shown in Appendix A. The most notable feature of thesegenerating functions is that the degree of both numerator and denominator grows morerapidly with n than in the b = 2 case. Also the pattern of ‘fixed’ coefficients which weobserved above appears to be absent.

Theorem 4.8.1. The number of 1-friendly walks with m steps, for fixed p and L, is inde-pendent of the bandwidth b for b > 1.

Proof. Since there are exactly two bonds that a walk can traverse to reach any point, if werequire more than two walkers to reach the same site, then at least 2 walkers must share abond. However, the walkers are osculating, so this is impossible. Thus the number of walksfor any b is the same as the number of walks with b = 2.

4.9 Conclusion

In this chapter, we have analysed the n-friendly directed walks model in a horizontal strip offinite width. Firstly, we showed how we could guess the generating functions of these walks

173

from the first few terms, using Pade approximants. Then we showed how to generate thoseterms, by investigating the general transfer matrix for 2 walkers, or with a more generalmethod based on recurrences.

By extending patterns in these generating functions, we were then able to prove, bymeans of generating function recurrence arguments, a number of general results in one ofthe parameters of the model — number of walkers, strip width and friendliness. We provedseveral cases where the other parameters were small. We then analysed the growth con-stants for the various models, and found that they depend in an exponential manner on thefriendliness of the walkers. We also proposed a new extension to this model, the bandwidth.

We could extend this research by trying to prove general results for other parametricvalues, for example 2 walkers with width 5 and arbitrary friendliness. Theoretically, itshould at least be possible (if extremely tedious) to calculate the generating function for 2walkers in any fixed strip width using similar techniques to those which we have been using.It would be extremely valuable if we could somehow combine all these results and prove ageneral generating function for 2 walkers with any friendliness or strip width, but that isstill some distance away.

Another direction that we could look at is in further modifications to the model. Band-width has not been fully explored, although its usefulness has not really been established;we could also look, among other options, at watermelons instead of stars, or assign differentweights to steps rather than a standard weight for each step — in particular, we could look atthe anisotropic generating functions. There are many modifications that we could possiblymake, but we think that our current model already displays most of the important featuresof this class of models.


5. MEAN UNKNOTTING TIMES

5.1 Introduction

A very interesting and important area of human biology is the study of the behaviour of DNA strands. DNA famously has a double helix structure, but at a large scale we can think of the double helix as a single strand. The DNA can then be thought of, more or less, as a long line that is tangled with itself. This is shown in Figure 5.1, taken from [138]. For a cell to replicate, the DNA inside it must be untangled. A good way to describe this process is through the use of knots.

A knot is formally defined as a single, simple, and closed curve in 3 dimensions. While DNA is not always closed, this still provides a useful model of entanglement. In the study of knots, we are generally interested in how the knot is tangled, rather than the exact position of the curve. Thus the exact dimensions of the knot are usually unimportant. Instead, the most common representation of a knot is through its embedding, which is a drawing of the knot, projected down into a (usually unspecified) plane. However, where the knot crosses itself on the plane (a crossing), indications are made as to which ‘strand’ of the knot lies higher than the other strand. This is usually done by breaking the lower strand as it approaches the crossing. Some knots are shown via their embeddings in Figure 5.2.

It is immediately obvious that one knot can have many distinct embeddings, depending on the plane onto which it is projected. Not only is this so, but it is possible to have two knots which are different in 3-space, but which have the same embedding. So embeddings are not necessarily unique representations of knots. What we really wish to capture about a knot, which the embedding encompasses, is its ‘entanglement’ — the way in which it is tangled with itself. Informally, when two knots have the same entanglement, we say that they are equivalent. The equivalence class of any knot is known as its knot type.

This leads to the very important question: when are two knots equivalent? The intuitive answer is obvious: two knots are equivalent when one can be transformed into the other by ‘moving the strands’ in a continuous physical fashion, without breaking or cutting any strand. What remains is to make this formal.

This is done by defining the Reidemeister moves. These are operations which we can perform on a knot without changing its type. They were first introduced in 1927 by Reidemeister ([117]). The first move involves untwisting or twisting a loop. The second involves separating two strands which are not tangled with each other, or putting one on top of the other. The third move involves moving a strand underneath a crossing of two strands with which it is not tangled. These three moves are shown in Figure 5.3.

Fig. 5.1: Electron micrograph of tangled DNA (from [138]).

Fig. 5.2: Some knots.

Fig. 5.3: Reidemeister moves: (a) Move 1; (b) Move 2; (c) Move 3.

We define two knots as equivalent if and only if one can be reached from the other through a succession of Reidemeister moves. Intuitively, the Reidemeister moves are all ‘physical’ moves — if the knot were a tangled piece of string, all of these moves would be physically possible simply by moving the string. Note that it is possible to define other sets of moves which will serve the same purpose as the Reidemeister moves, so they are not unique.

Of special importance in knot theory is the concept of the unknot. This is the knot which can be embedded as a single loop which does not cross itself. The unknot is, in a sense, the simplest knot possible. However, it is not always obvious whether a given knot, with a given embedding, is equivalent to the unknot (unknotted) or not. By looking at and mentally ‘manipulating’ the knot via the Reidemeister moves, this can sometimes be worked out, but there is no fixed process of applying the moves to reach the unknot (or prove that the knot is not unknotted). We need other, more efficient ways to calculate whether two knots are equivalent. This gives rise to the concept of knot invariants.

Knot invariants are numerical properties of knots which are the same for any two knots which are equivalent. Ideally, they should be different for any two knots which are not equivalent, although this is not always the case. Therefore knot invariants are generally more useful in telling whether two knots are different, rather than whether they are the same. For a property to be a knot invariant, it must be unchanged under the Reidemeister moves. Some important knot invariants are the Jones polynomial ([78]) and the knot invariant that we will use, the Alexander polynomial. For more information on knot invariants and some links to statistical mechanics, see [143].

The Alexander polynomial of a knot is a polynomial in one variable that can be calculated from the crossings of the knot. It was invented by Alexander in 1928 ([1]). The Alexander polynomial is identical for equivalent knots (and is therefore a true knot invariant), and although there are pairs of nonequivalent knots which have the same Alexander polynomial, the frequency of these occurrences is very small. In fact, the ‘smallest’ (i.e. least number of crossings) knot which has the same Alexander polynomial as the unknot, but is not unknotted, has 11 crossings. So the Alexander polynomial is unambiguous when it comes to identifying knots with fewer than 11 crossings.

Technically, the Alexander polynomial is defined via topological avenues, but an easier way is to define it recursively. Firstly, the notion of an Alexander polynomial is extended so that it is defined for tangled sets of knots (called links). Next, the Alexander polynomial of an unknot is defined to be 1, while that of two or more disjoint unknots (an unlink) is 0. Now suppose that we wish to calculate the Alexander polynomial of a link L, which we call ∆_L(t). If we take one crossing of L and change it to the positions shown in Figure 5.4, we then call the resulting links L+, L0 and L−, one of which will be L. The Alexander polynomial can then be defined by the equation

∆_{L+}(t) − ∆_{L−}(t) + (t^{−1/2} − t^{1/2}) ∆_{L0}(t) = 0.   (5.1)

Although elegant, the recursive definition of the Alexander polynomial does not lend itself well to calculation. By choosing the crossing of L wisely, we can always ensure that both of the remaining links are ‘simpler’ than L. However, it is not always clear which crossing to choose.


Fig. 5.4: Different positions for a single crossing: (a) L+; (b) L0; (c) L−.

The Alexander polynomial may be calculated efficiently in the following way (taken from [98]). Take a knot L with a particular embedding (the Alexander polynomial will be the same for all embeddings). Then orient the knot in one particular direction; that is, give the knot strand a direction that is followed the whole way around the knot, represented by arrows. We show this in Figures 5.5(a) and 5.5(b).

Now suppose that L has n crossings. Start at any arbitrary point on the knot and travel in the direction of the orientation. Whenever an ‘underpass’ (a crossing where the strand we are travelling on goes below the crossing strand) is reached, assign that crossing a number which is one more than that of the previous underpass (starting from 1). Since every crossing has exactly one strand which goes underneath, every crossing will have exactly one number assigned to it. Then divide the knot into segments separated by the underpasses, and assign each segment the number of the next underpass (according to the orientation of the knot). We show this by continuing with our example in Figure 5.5.

Now define an n × n matrix called the knot matrix, where each row is determined by the correspondingly numbered underpass. Suppose that the underpass labelled k goes under a segment labelled i. If i = k or i = k + 1 (where the labellings are circular, so that n + 1 is equivalent to 1), then the elements of the knot matrix in row k are a_{k,k} = −1, a_{k,k+1} = 1, and all other elements are 0. If i ≠ k, k + 1, then we further classify each underpass according to whether the overpass approaches from the right or the left, as shown in Figure 5.6.

If the overpass approaches from the right as we travel along the direction of the underpass, then the elements in row k are a_{k,k} = −t, a_{k,k+1} = 1, and a_{k,i} = t − 1, with all other elements 0. If the overpass approaches from the left, the elements a_{k,k} and a_{k,k+1} are swapped around. In our example, the knot matrix is

[  −t     1     0   t−1 ]
[ t−1     1    −t     0 ]
[   0   t−1    −t     1 ]
[  −t     0   t−1     1 ] .   (5.2)

The Alexander polynomial is the determinant of any (n − 1) × (n − 1) minor of the knot matrix, multiplied by a factor of ±t^m so that its lowest-power term in t is a positive constant.


Fig. 5.5: Calculating the Alexander polynomial of an example knot: (a) an example knot; (b) an orientation for the knot; (c) an underpass numbering for the knot (the starting point is indicated); (d) a segment numbering for the knot.

Fig. 5.6: Different types of crossing: (a) approaching from the left; (b) approaching from the right.

Fig. 5.7: The action of topoisomerase II (from [118]).

Therefore the Alexander polynomial of the knot in our example is

−t^{−1}(−t + 3t^2 − 3t^3) = 1 − 3t + t^2.   (5.3)
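This small calculation is easy to verify with a computer algebra system. The following sketch (our illustration, not code from the thesis) builds the knot matrix (5.2) in sympy, takes the determinant of a 3 × 3 minor, and normalises it as described above, recovering (5.3).

    # Verify the example Alexander polynomial computation with sympy.
    import sympy as sp

    t = sp.symbols('t')

    # The knot matrix of equation (5.2).
    K = sp.Matrix([
        [-t,     1,     0, t - 1],
        [t - 1,  1,    -t,     0],
        [0,  t - 1,    -t,     1],
        [-t,     0, t - 1,     1],
    ])

    # Determinant of an (n-1) x (n-1) minor: drop the last row and column.
    minor_det = sp.expand(K[:-1, :-1].det())     # gives -t**3 + 3*t**2 - t

    # Multiply by +/- t**m so the lowest-power term in t is a positive constant.
    low = min(m[0] for m in sp.Poly(minor_det, t).monoms())
    alexander = sp.expand(minor_det / t**low)
    if sp.Poly(alexander, t).coeffs()[-1] < 0:   # make the constant term positive
        alexander = sp.expand(-alexander)

    print(alexander)                             # t**2 - 3*t + 1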

The Alexander polynomial gives us a way of distinguishing knots, so now we can start doing calculations with them. We return to our original idea of modelling DNA strands by knots. In our model, we hypothesise that the DNA is twisted in a knot-like fashion. Thus we can represent a strand of DNA by a single knot.

Now, in order to replicate, the strand of DNA must be untangled. The shape of DNA is changed by enzymes known as topoisomerases; in particular, it becomes disentangled by the enzyme topoisomerase II, which acts on the DNA by breaking it at one point, passing another section of the DNA through the break, and then rejoining the broken ends. This process is well known; for a more detailed account, see [118], [137] or [136]. The process is illustrated in Figure 5.7 (taken from [118]).

We model this physical process by means of reversing crossings. To reverse a crossing in a knot, we swap the strands of a crossing so that the strand which was previously on top now lies on the bottom, and vice versa. The remainder of the knot is unchanged. We show this process in Figure 5.8.

Fig. 5.8: Reversing a crossing.

The action of reversing a crossing produces an identical effect to the topoisomerase enzyme. To continue the analogy, we will then need to transform a knot into the unknot by means of crossing reversals (we often say the unknot instead of ‘all knots which are equivalent to the unknot’). Certainly a crossing reversal will almost always produce a knot with a different knot type to the original, and it seems reasonable that, given enough time, we could transform any knot into the unknot by means of this operation. In fact we can always do this, so the next issue becomes the time needed to effect this transformation. In terms of our knot model, the question becomes: given a particular knot, what is the minimum number of crossing reversals needed to transform that knot into the unknot?

This number is called the unknotting number of the knot. This is a question which has been well studied (see for example [81], [27], [24], [125] or [80]), but only the unknotting numbers of the smallest knots are known (there is a knot with a minimal embedding of 8 crossings for which the unknotting number is not yet known). This reflects the fact that the unknotting number is the minimum number of crossing reversals needed over all embeddings of the knot, so we must consider all possible embeddings of a particular knot before we can be certain that we know the unknotting number.

In particular, a minimal embedding (an embedding with the least number of crossings) of a knot may need more crossing reversals to reach the unknot than another embedding of that same knot with more crossings, so it is not sufficient to consider only minimal embeddings. We show how this is possible in Figure 5.9 (taken from [27]).

As an example, in Figure 5.10 we show how the knot which is known as 51 can be unknotted in two crossing reversals. It can be shown that it is impossible for 51 to be unknotted with just one reversal, so the knot 51 has unknotting number 2.

For our DNA model, it seems unlikely that the topoisomerase enzyme, being an enzyme which acts locally, will know exactly which crossings to reverse to unknot the DNA in the minimum time possible. As such, the unknotting number may not be a relevant statistic to calculate in this circumstance. A more likely scenario is that the enzyme acts randomly on the DNA. In this case, a more relevant question to investigate would be: if we reverse crossings at random, what is the average number of reversals needed to unknot a given knot? We call this number the mean unknotting time of the knot, and this is the question that we will be studying in this chapter. As far as we are aware, the mean unknotting time has not been studied previously.

One small variation that we will also consider is the use of immediate reversals. Under the above model, it is possible to reverse a crossing, and then immediately reverse it again, resulting in the original knot. Since this entire operation does nothing to unknot the DNA, we also look at cases where we do not allow the reversing of a crossing which has just been reversed.


Fig. 5.9: A knot demonstrating why it is not sufficient to consider minimal embeddings to find the unknotting number. (a) A knot with 10 crossings; this embedding is minimal and cannot be unknotted with 2 reversals. (b) An equivalent knot which can be unknotted by reversing the 2 marked crossings.

Fig. 5.10: The knot 51 takes two crossing reversals to become the unknot, passing through 31.

Unfortunately, while the unknotting number depends only on the knot type, and not the embedding of the knot, the mean unknotting time of a knot does depend strongly on the embedding used. To see this, just observe that adding a simple twist to any knot via a type I Reidemeister move will not change the type of the knot, but will create an extra crossing that does not change the knot type when reversed. So it makes much more sense to talk about the mean unknotting time of a knot embedding, rather than a knot. But, given a knot, what embedding should we then use?

We can go in two directions: either we can always use the minimal embedding, or we can use a random embedding. Both these ways present some difficulties. If we use a minimal embedding, can we be sure that there might not be some other, also minimal, embedding of the same knot which has a different mean unknotting time? And if we wish to use a random embedding, how can we generate such embeddings? And how would we define a distribution of embeddings which would be random anyway? In fact, is there even such a thing as a random embedding of a knot?

There are no easy answers to these questions — we really need to derive our knots by looking back at the DNA which we are modelling. Nevertheless, we have used both these approaches. For small knots, we have chosen a particular minimal embedding, and calculated the mean unknotting time for that embedding. For larger knots, we approximate a random knot and embedding by using self-avoiding polygon trails.

The idea of a self-avoiding polygon trail is an extension of the concept of a self-avoiding walk. We have already discussed self-avoiding walks in Chapter 4. Briefly speaking, this is a walk on the integer square lattice Z^2 that never touches itself. We can extend the idea of a self-avoiding walk by imposing the condition that the end-point must be adjacent to the starting point. Adding the bond between the start and end points then results in a closed walk which does not intersect itself, called a self-avoiding polygon. Self-avoiding polygons have also been studied extensively (see for example [96]). For our purposes, it is an interesting construct because the closed loop imitates the shape of a knot.

However, for self-avoiding walks and polygons the walk cannot cross itself, which gives us no analogues for the crossings of a knot. To solve this problem, we relax the self-avoidance constraint — whereas a self-avoiding walk must never occupy the same point twice, we now only require that the walker never occupy the same bond twice, so it may visit a previously visited site. In this manner, the walk may cross itself at a vertex without breaking the avoidance constraint. A walk with this constraint is called a self-avoiding trail, and again has been studied widely (see for example [73], [145] or [96]).

We amalgamate these ideas by imposing both the polygonal and the trail conditions at the same time. Thus a self-avoiding polygon trail (which we often shorten to SAPT) is a path which cannot occupy the same bond twice, but which must end at the point from which it started. Formally, we define a SAPT of length n to be a path p = p_0, p_1, . . . , p_n where:

• p_i ∈ Z × Z for all i = 0, 1, . . . , n;

• p_0 = p_n = (0, 0);

• |p_{i+1} − p_i| = 1 for all i = 0, 1, . . . , n − 1; and

• if p_i = p_j and i ≠ j, then p_{i+1} ≠ p_{j−1}, p_{j+1} and p_{i−1} ≠ p_{j−1}, p_{j+1}.

Fig. 5.11: Example of a SAPT.

Fig. 5.12: Converting a SAPT to a knot.

Note that SAPTs are not rooted, which means that a SAPT is equal to its (translated) cyclic permutations, or directed, which means that it is also equal to the SAPT which traverses the same bonds in reverse. We show an example of a self-avoiding polygon trail in Figure 5.11. At the points where the trail ‘veers off’ from a vertex, it actually touches that vertex and moves off at a right angle, but we draw it in this way to differentiate it from the case where it crosses itself. For the remainder of this chapter, we will not use this configuration in our examples, so if a trail visits a vertex twice, it is crossing itself.
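The four defining conditions above are easy to test mechanically. Below is a minimal checker (our illustration; the thesis gives no code) for a path supplied as a list of integer points. The fourth condition is equivalent to requiring that no bond is traversed twice in either direction, which is the form the checker uses.

    # Check whether a list of lattice points is a SAPT.
    def is_sapt(path):
        # Closed at the origin (conditions on p_0 and p_n).
        if path[0] != (0, 0) or path[-1] != (0, 0):
            return False
        # Unit steps on the square lattice.
        for (x0, y0), (x1, y1) in zip(path, path[1:]):
            if abs(x1 - x0) + abs(y1 - y0) != 1:
                return False
        # Bond-avoidance: no bond occupied twice (equivalent to condition four).
        bonds = set()
        for a, b in zip(path, path[1:]):
            bond = frozenset((a, b))
            if bond in bonds:
                return False
            bonds.add(bond)
        return True

    # The unit square is the shortest SAPT.
    print(is_sapt([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]))   # True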

Now we can use self-avoiding polygon trails to model knot embeddings. The places where the SAPT crosses itself will become the crossings (we do not use the vertices where the path touches itself without crossing). At a crossing, there is no indicator of which strand should pass over the other, so we assign a random strand to be the overpass. This yields an embedding of a knot. We illustrate this in Figure 5.12. Note that due to the necessity of choosing overpasses, a SAPT with m crossings can generate 2^m distinct embeddings.

Self-avoiding polygon trails now help us solve our problem of precisely defining a random knot embedding. We define a random knot embedding as an embedding generated from a randomly generated SAPT of fixed length. Then our random embedding depends only on the length of the SAPT used to generate it. The only disadvantage of this is that we can no longer specify the exact knot that we wish to use. However, since a longer SAPT has more chance of generating a more complex knot, the length of the SAPT is in some way a measure of the complexity of the knot which we attempt to unknot. It also corresponds to the length of the DNA.


To generate the SAPT, we must choose an element from the set of all SAPTs of fixed length, where all elements of the set are equally likely to be chosen. To achieve this we adapted the pivot method for self-avoiding walks. The pivot method, which we shall formalise later, was invented by Lal in 1969 ([91]), but only really came to prominence when it was studied in detail by Madras and Sokal in 1988 ([97]). Essentially, in every iteration of the method, we take a section of the walk, apply a length-preserving transformation to it, and then join it back to the remainder.

The pivot method can be adapted to self-avoiding polygon trails efficiently, and by converting our generated SAPTs into knots and studying the time taken to unknot these knots, we can derive a relation between the length of the SAPT and the mean unknotting time.
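To make the idea concrete, here is a sketch of a single pivot iteration for an ordinary self-avoiding walk on Z^2 (our illustration under simplifying assumptions; the SAPT version, formalised in Section 5.7, differs in detail).

    import random

    # The eight lattice symmetries of Z^2 (rotations and reflections).
    SYMMETRIES = [
        lambda x, y: (x, y),   lambda x, y: (-y, x),
        lambda x, y: (-x, -y), lambda x, y: (y, -x),
        lambda x, y: (x, -y),  lambda x, y: (-x, y),
        lambda x, y: (y, x),   lambda x, y: (-y, -x),
    ]

    def pivot_once(walk):
        """Attempt one pivot move; return the new walk, or the old one if rejected."""
        k = random.randrange(1, len(walk) - 1)      # choose a pivot site
        g = random.choice(SYMMETRIES)               # length-preserving transformation
        px, py = walk[k]
        # Transform the tail about the pivot site and rejoin it to the head.
        tail = [(px + gx, py + gy)
                for gx, gy in (g(x - px, y - py) for x, y in walk[k + 1:])]
        candidate = walk[:k + 1] + tail
        # Accept only if the result is still self-avoiding.
        if len(set(candidate)) == len(candidate):
            return candidate
        return walk

    walk = [(i, 0) for i in range(20)]              # start from a straight rod
    for _ in range(1000):
        walk = pivot_once(walk)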

In Section 5.2 we show the results we obtained for mean unknotting times when we used the minimal embeddings of small knots. In Section 5.3, we present a proof of Kesten's pattern theorem, which is instrumental in our analysis of self-avoiding polygon trails, and we give a quick primer on Fourier transforms in Section 5.4. In Sections 5.5 and 5.6, we present rigorous bounds on the mean unknotting time, by looking at a seemingly unrelated problem. In Sections 5.7 and 5.8, we define the pivot algorithm that we used to generate SAPTs and show that it is indeed a valid algorithm. In Section 5.9, we detail our experimental results from generating SAPTs, and in Section 5.10, we look back at what we have done and suggest future avenues of research.

5.2 Small knots

To start our analysis of mean unknotting times, we calculated the mean unknotting times for some small knots — that is, all knots with a minimal embedding of up to 7 crossings. As we pointed out before, using different embeddings of the same knot (or even different minimal embeddings) may change the mean unknotting time. We have used the knots from Table 1.1 of [95].

For very small knots, it is possible to calculate the mean unknotting time exactly by hand, simply by considering all possible crossing reversals until all possibilities reach the unknot, or a previously considered knot. For example, take the knot 51. Since it has 5-fold symmetry, reversing any crossing will result in the same knot up to rotation, which is 31 as shown in Figure 5.13. Suppose then that we allow immediate reversals. If we then reverse the crossing that was first reversed, which happens with probability 1/5 if we reverse random crossings, we will obviously get back the knot 51. On the other hand, if we reverse any other crossing (with probability 4/5), we will get the unknot, some cases of which are shown in Figure 5.13.

Fig. 5.13: Some possible reversals of 51.

Therefore we can draw the transfer graph of the resulting Markov chain in Figure 5.14, where each transition is labelled by the probability of its occurrence. We can then use this to generate the probability of any sequence of reversals.

Fig. 5.14: Transfer diagram for 51.

To arrive at the mean unknotting time, we average the unknotting time over all possible unknotting sequences:

mean unknotting time = 2 · 1 · (4/5) + 4 · 1 · (1/5) · 1 · (4/5) + 6 · 1 · (1/5) · 1 · (1/5) · 1 · (4/5) + · · ·
                     = (8/5) (1 + 2 · (1/5) + 3 · (1/5)^2 + · · ·)
                     = (8/5) · 1/(1 − 1/5)^2
                     = (8/5) · (25/16)
                     = 5/2.   (5.4)
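Equivalently, the mean unknotting time is the expected absorption time of the Markov chain in Figure 5.14, which can be checked with exact arithmetic. The snippet below (our illustration) solves the two linear equations for the expected times to reach the unknot from 51 and from 31.

    from fractions import Fraction

    # E51 = 1 + E31           (from 51, every reversal gives 31)
    # E31 = 1 + (1/5) * E51   (from 31: back to 51 w.p. 1/5, unknot w.p. 4/5)
    # Substituting the first equation into the second and solving:
    E31 = (1 + Fraction(1, 5)) / (1 - Fraction(1, 5))
    E51 = 1 + E31
    print(E51)   # 5/2, in agreement with (5.4)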

If we disallow immediate reversals, the procedure is similar. We were able to apply this procedure to all knots with 5 crossings or fewer, and the knots 61 and 31#31. The results that we derived are in Table 5.1.

As the knots become more complicated, more and more possible states occur, so that it becomes very tedious to enumerate all possibilities by hand. We can still estimate the mean unknotting time of a knot by simply taking the knot, fixing the embedding, and randomly unknotting it many times, taking the average of the number of reversals required. We did this for knots of up to 7 crossings, calculating the Alexander polynomial in the manner described in the introduction to find out if a given knot was unknotted. Note in particular that since we did not use knots of more than 7 crossings, the Alexander polynomial is totally accurate at detecting this.


Knot     Unknotting   Immediate reversals      No immediate reversals
         number       µ         Var            µ        Var
31       1            1         0              1        0
41       1            1         0              1        0
51       2            5/2       5/4            2        0
52       1            20/11     105/121        8/5      6/25
61       1            15/8      51/64          5/3      2/9
31#31    2            10/3      34/9           25/9     145/81

(The Embedding column of the original table, which showed the chosen minimal embedding of each knot as a diagram, cannot be reproduced here.)

Tab. 5.1: Exact mean unknotting times.

For each knot, we ran the unknotting procedure 1,000,000 times, averaging the unknotting time over all the runs. We also calculated the variance of the runs, in order to observe how the unknotting time varies. We did this twice, once with immediate reversals allowed and once without. The results from this process are in Table 5.2.
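Schematically, the estimation loop looks as follows. This is our illustration only: is_unknotted, num_crossings and reverse_crossing are hypothetical helpers standing in for the Alexander polynomial test and the crossing-reversal operation described above, so the sketch is not runnable as-is.

    import random

    def mean_unknotting_time(embedding, runs=1_000_000, allow_immediate=True):
        total = 0
        for _ in range(runs):
            knot, steps, last = embedding, 0, None
            while not is_unknotted(knot):            # hypothetical Alexander test
                choices = list(range(num_crossings(knot)))
                if not allow_immediate and last is not None:
                    choices.remove(last)             # forbid immediate re-reversal
                last = random.choice(choices)
                knot = reverse_crossing(knot, last)  # hypothetical reversal
                steps += 1
            total += steps
        return total / runs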

It is worth noting that we did not perform any Reidemeister moves to reduce the number of crossings once we started reversing. This can, and did, result in occurrences such as a knot becoming a more complicated knot when we reversed a crossing.

As might be expected, there appears to be some relation between unknotting number and mean unknotting time. Also, the mean unknotting time is always smaller if we disallow immediate reversals, which again is not unexpected since immediate reversals never make the knot ‘more unknotted’.

5.3 Kesten’s pattern theorem

Now we progress from observing small knots to modelling knots with self-avoiding polygon trails. Firstly we attempt to find some theoretical bounds on the mean unknotting time. To do this, we need a few theoretical results. One important theorem that we will use is a version of Kesten's pattern theorem, applied to self-avoiding polygon trails. The original pattern theorem, for self-avoiding walks, was proved by Kesten in 1963 ([83]). Loosely speaking, this theorem tells us how many times a random SAPT of fixed length will contain a given pattern (a smaller self-avoiding trail).

We will use this theorem in two ways, to find both a lower and an upper bound on the mean unknotting time. It will tell us that both the average number of crossings and the average number of ‘knotted’ segments (segments that would be knotted if their endpoints were joined) in a random SAPT are proportional to its length.

Our proof is based on the corresponding proof for lattice ribbons given in [134], and the corresponding proof for self-avoiding walks in [96]. We will start off with self-avoiding trails, and then move to the closed polygon case. First we make some definitions.

Definition 5.3.1. Let t = (x_0, y_0), (x_1, y_1), . . . , (x_n, y_n) be a self-avoiding trail in Z^2. We call t a half-space trail if for all i > 0, x_0 < x_i. We call t a bridge if for all i > 0, x_0 < x_i ≤ x_n. We call the line x = x′ a cutting line of the bridge t if there exists an index j, where 0 < j < n and x_j = x′, such that the x-coordinates satisfy the conditions

x_0 < x_i ≤ x_j   ∀ 0 < i < j   (5.5)

and

x_j < x_i ≤ x_n   ∀ j < i < n.   (5.6)

A cutting line divides a bridge into two separate non-empty bridges. If a non-empty bridge has no cutting line, we call it prime. Note that bridges can be trivial, but prime bridges cannot.


Knot     Unknotting   Immediate reversals         No immediate reversals
         number       µ           Var             µ           Var
62       1            2.4204(14)  1.8573(57)      2.0719(8)   0.6905(21)
63       1            2.4273(17)  2.8560(85)      2.1563(13)  1.6288(45)
71       3            4.2644(20)  4.1330(124)     3.4672(12)  1.4209(54)
72       1            2.6649(16)  2.6763(66)      2.3534(12)  1.3966(33)
73       2            3.6825(19)  3.4952(99)      3.0598(12)  1.3608(45)
74       2            3.0230(14)  2.0674(60)      2.5617(8)   0.7129(26)
75       2            3.8202(20)  4.0246(113)     3.1884(18)  1.7776(58)
76       1            3.0256(19)  3.4376(93)      2.6294(14)  1.8312(51)
77       1            2.4066(16)  2.4339(70)      2.1590(12)  1.4218(39)
31#41    2            3.445(21)   4.2895(127)     2.9460(15)  2.3340(74)

(As in Table 5.1, the Embedding column of the original showed the embedding used for each knot as a diagram.)

Tab. 5.2: Estimated mean unknotting times. Numbers in brackets are the standard errors.

Fig. 5.15: Three types of trails: (a) a half-space trail; (b) a bridge, with a cutting line; (c) a prime bridge. The starting vertices are denoted by hollow circles.

Examples of these terminologies are given in Figure 5.15, where the first vertex is denoted by a hollow circle.

Now let t(n) be the number of self-avoiding trails of length n, and let T(x) be its generating function, i.e.

T(x) = Σ_{n≥0} t(n) x^n.   (5.7)

We define b(n) and B(x), h(n) and H(x), and p(n) and P(x) correspondingly for bridges, half-space trails and prime bridges respectively.

To prove that trails and bridges grow exponentially, we first quote the following lemma, which is part of [96, Lemma 1.2.2].

Lemma 5.3.1. Let a_n (n ≥ 1) be a sequence of real numbers which is sub-additive, i.e. a_{n+m} ≤ a_n + a_m for all n, m. Then the limit lim_{n→∞} (1/n) a_n exists.

Now for any n and m, splitting a self-avoiding trail of length n + m by cutting it after the nth step results in two self-avoiding trails of length n and m respectively. Furthermore, splitting two different trails this way will never produce the same component trails. Therefore

t(n + m) ≤ t(n) t(m)   (5.8)

and so

ln t(n + m) ≤ ln t(n) + ln t(m).   (5.9)

Thus ln t(n) is sub-additive, and lim_{n→∞} (1/n) ln t(n) exists. We set µ_t to be the growth constant for self-avoiding trails, so that ln µ_t = lim_{n→∞} (1/n) ln t(n).
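For small n, t(n) can be computed by exhaustive enumeration, which gives a rough numerical feel for µ_t; the sketch below (our illustration, far from the asymptotic regime) counts bond-avoiding walks from the origin.

    from math import log

    STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

    def count_trails(n, pos=(0, 0), used=frozenset()):
        """Count n-step walks from `pos` that never reuse a bond."""
        if n == 0:
            return 1
        total = 0
        for dx, dy in STEPS:
            nxt = (pos[0] + dx, pos[1] + dy)
            bond = frozenset((pos, nxt))
            if bond not in used:
                total += count_trails(n - 1, nxt, used | {bond})
        return total

    for n in range(1, 13):
        t_n = count_trails(n)
        print(n, t_n, round(log(t_n) / n, 4))   # third column approximates ln mu_t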


Fig. 5.16: Decomposition of a half-space trail. Here A_1 = 6, A_2 = 5, A_3 = 3 and A_4 = 2. We also have n_1 = 8, n_2 = 21, n_3 = 29, and n_4 = n = 33.

Unfortunately, splitting a bridge does not always result in two bridges, so we cannot use the same trick. However, joining two bridges of length n and m respectively always results in a bridge of length n + m, which is unique given n and m. So we have

b(n + m) ≥ b(n) b(m)   (5.10)

and therefore

− ln b(n + m) ≤ − ln b(n) − ln b(m).   (5.11)

Then − ln b(n) is sub-additive, so again lim_{n→∞} (1/n) ln b(n) exists. We set µ_b to be the growth constant for bridges.

Definition 5.3.2. If A is an integer, we define PD(A) to be the number of partitions of A into distinct integers, i.e. the number of ways to write A = A_1 + A_2 + · · · + A_k, where A_1 > A_2 > · · · > A_k.

From [69], we know that

ln PD(A) ∼ π √(A/3)   as A → ∞.   (5.12)
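PD(A) is cheap to compute exactly by dynamic programming, so the asymptotic form (5.12) can be checked numerically (our illustration):

    from math import log, pi, sqrt

    def distinct_partition_counts(a_max):
        """counts[a] = PD(a), computed by 0/1 knapsack-style counting."""
        counts = [1] + [0] * a_max
        for part in range(1, a_max + 1):          # each distinct part used at most once
            for a in range(a_max, part - 1, -1):
                counts[a] += counts[a - part]
        return counts

    counts = distinct_partition_counts(400)
    for a in (50, 100, 200, 400):
        # The ratio approaches 1 only slowly (lower-order corrections are large).
        print(a, log(counts[a]) / (pi * sqrt(a / 3)))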

In order to prove our main result, we need a few preliminary theorems.

Theorem 5.3.2. h(n) ≤ PD(n) b(n).   (5.13)

Proof. We essentially ‘unfold’ the half-space trail until it becomes a bridge. We start with a half-space trail h of length n, which consists of the points h_0, h_1, . . . , h_n, where h_i = (x_i, y_i). Let n_1 be the index of the rightmost point in h — if this is not unique, we take the point with the largest index. Let A_1 = x_{n_1} − x_0. Then the trail with the points h_{n_1}, h_{n_1+1}, . . . , h_n is the reflection in a vertical line of a (shorter, and more horizontally narrow) half-space trail. We can then define A_2 and n_2 similarly for this trail, and continue recursively until we reach the end. We give an example in Figure 5.16.


Fig. 5.17: Transformation of a half-space trail: (a) a half-space trail in H_22[5, 2], with n_1 = 16; (b) the half-space trail after one reflection, now in H_22[7].

By construction, each of the subtrails h_{n_j}, h_{n_j+1}, . . . , h_{n_{j+1}} is either a bridge or the reflection of one. Furthermore, if there are k such subtrails, we must have A_1 > A_2 > · · · > A_k > 0 since the original trail is a half-space trail. Let H_n[a_1, a_2, . . . , a_k] be the set of n-step half-space trails which, when decomposed by this method, have A_i = a_i for all 1 ≤ i ≤ k. Note that, as k = 1 for bridges, H_n[a] is the set of bridges where x_n − x_0 = a.

Now, given a half-space trail h ∈ H_n[a_1, a_2, . . . , a_k], we transform it to a new half-space trail h′ as follows: for 1 ≤ i ≤ n_1, we set h′_i = h_i, and for n_1 < i ≤ n, we set h′_i to be the reflection of h_i in the line x = x_{n_1}. We illustrate this in Figure 5.17. As all the points that we reflect are on the left of this line, h′ is still a half-space trail, and as the points we reflect become another half-space trail, we can say that h′ ∈ H_n[a_1 + a_2, a_3, . . . , a_k]. Furthermore, the reflected half-space trail never touches the reflecting line apart from the first point, so given a transformed half-space trail and a_1, we can reverse the transformation. Therefore this transformation is one-to-one, and we can say

|H_n[a_1, a_2, . . . , a_k]| ≤ |H_n[a_1 + a_2, a_3, . . . , a_k]|.   (5.14)

Since any half-space trail can be decomposed in this manner, we have

h(n) = Σ_k Σ_{a_1,a_2,...,a_k} |H_n[a_1, a_2, . . . , a_k]|
     ≤ Σ_k Σ_{a_1,a_2,...,a_k} |H_n[a_1 + a_2, a_3, . . . , a_k]|
     ≤ · · ·
     ≤ Σ_k Σ_{a_1,a_2,...,a_k} |H_n[a_1 + a_2 + · · · + a_k]|.   (5.15)

If we let a = a_1 + a_2 + · · · + a_k, and let b(n, a) be the number of bridges where x_n − x_0 = a,


while noting that for all bridges a ≤ n, we get

h(n) ≤ Σ_k Σ_{a_1,a_2,...,a_k} |H_n[a_1 + a_2 + · · · + a_k]|
     = Σ_k Σ_{a_1,a_2,...,a_k} b(n, a_1 + a_2 + · · · + a_k)
     = Σ_{a=1}^{n} PD(a) b(n, a)
     ≤ Σ_{a=1}^{n} PD(n) b(n, a)
     = PD(n) b(n).   (5.16)

Theorem 5.3.3. µ_t = µ_b.   (5.17)

Proof. Obviously, every bridge is a self-avoiding trail, so for all n we have b(n) ≤ t(n), and therefore µ_t ≥ µ_b. We need merely show that the reverse inequality also holds.

Firstly, given that

ln PD(A) ∼ π √(A/3)   as A → ∞,   (5.18)

we can say that for any arbitrary ε > 0, there exists a constant c(ε) such that for all A,

PD(A) ≤ c(ε) exp((1 + ε) π √(A/3)),   (5.19)

since if this were not true, PD(A) would have to be greater than or equal to exp((1 + ε) π √(A/3)) for infinitely many A, which contradicts the first statement.

Now suppose that we are given an arbitrary trail t, with points t_i = (x_i, y_i). Let x′ = min_{0≤i≤n} x_i, and let m be the largest index such that x_m = x′. Then the trail t_m, t_{m+1}, . . . , t_n is a (possibly trivial) half-space trail of length n − m, and the trail t_m − (1, 0), t_m, t_{m−1}, . . . , t_0 is a half-space trail of length m + 1. An example of this decomposition is shown in Figure 5.18. This decomposition can be reversed to produce the original trail, so it is one-to-one. Therefore for any ε > 0,


Fig. 5.18: Decomposition of a self-avoiding trail into two half-space trails. (a) is the original trail; it decomposes into (b) and (c).

t(n) ≤ Σ_{m=0}^{n} h(n − m) h(m + 1)
     ≤ Σ_{m=0}^{n} PD(n − m) b(n − m) PD(m + 1) b(m + 1)
     ≤ b(n + 1) Σ_{m=0}^{n} PD(n − m) PD(m + 1)
     ≤ b(n + 1) Σ_{m=0}^{n} c(ε)^2 exp((1 + ε) π (√((n − m)/3) + √((m + 1)/3)))
     ≤ c(ε)^2 b(n + 1) Σ_{m=0}^{n} exp((1 + ε) π √(2(n − m)/3 + 2(m + 1)/3))
     = c(ε)^2 (n + 1) b(n + 1) exp((1 + ε) π √(2(n + 1)/3)).   (5.20)

Here we have used the inequality √x + √y ≤ √(2x + 2y) for non-negative x and y. Since none of the terms on the right-hand side of the above inequality are exponential in n apart from b(n + 1), and ε can be made arbitrarily small, we can say that µ_t ≤ µ_b. This proves the theorem.

Lemma 5.3.4.

B(x) = 1 / (1 − P(x)).   (5.21)

Proof. Every bridge is either trivial or non-trivial. If it is non-trivial, it is either prime or has a cutting line. In the first case, it is then a prime bridge followed by an empty bridge. In the latter case, we cut it at the last point (index-wise) on its leftmost cutting line. This results in a prime bridge followed by a non-empty bridge. This decomposition is unique. Conversely, joining any prime bridge and another bridge will result in a unique non-trivial bridge. Therefore

B(x) = 1 + P(x) B(x),   (5.22)

which, when rearranged, gives the lemma.
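As a small illustration of the lemma (with made-up prime-bridge counts, not data from the thesis), expanding B(x) = 1/(1 − P(x)) as a power series shows how bridge counts are assembled from concatenations of prime bridges.

    import sympy as sp

    x = sp.symbols('x')
    P = x + 2*x**2 + 5*x**3          # hypothetical p(1) = 1, p(2) = 2, p(3) = 5
    B = sp.series(1 / (1 - P), x, 0, 4)
    print(B)                         # 1 + x + 3*x**2 + 10*x**3 + O(x**4)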

Now let p be a prime bridge of length m. We say that another self-avoiding trail t contains p if t contains some translate of p, in other words if there exists a j such that t_{i+j} = p_i + t_j for all 0 ≤ i ≤ m. When used in this context, p is often called a pattern. Let T(x; p) be the generating function for all self-avoiding trails which do not contain p, and let H(x; p), B(x; p) and P(x; p) be defined likewise. Furthermore, we define µ_t(p) and µ_b(p) as the growth constants of the number of trails and bridges which do not contain p — it can be proved that these exist in a manner similar to that which we used to prove the existence of µ_t and µ_b. With this notation, we are ready to prove an important theorem.

Theorem 5.3.5. For any prime bridge p,

µ_t(p) < µ_t.   (5.23)

In other words, the proportion of self-avoiding trails of length n that do not contain p becomes exponentially small as n → ∞.

Proof. After all our preliminary work, the proof is actually quite simple. Firstly, from Theorem 5.3.3, we know that

µ_t = µ_b.   (5.24)

From Lemma 5.3.4 we have

B(x) = 1 / (1 − P(x)).   (5.25)

Using similar arguments (and the property that you cannot divide a prime bridge into two prime bridges), it can be shown that these equations hold for trails and bridges that do not contain p, i.e.

µ_t(p) = µ_b(p)   (5.26)

and

B(x; p) = 1 / (1 − P(x; p)).   (5.27)

Now, because the radius of convergence of B(x) is 1/µ_b, we must have P(1/µ_b) = 1. Since at least one prime bridge contains p (which is p itself), we know that P(x; p) < P(x) and therefore P(1/µ_b; p) < 1. This implies that B(1/µ_b; p) is finite, and therefore that its radius of convergence is greater than 1/µ_b, i.e.

1/µ_b(p) > 1/µ_b.   (5.28)

Then we have

µ_t(p) = µ_b(p) < µ_b = µ_t,   (5.29)

which proves the theorem.

Now we finally come to the pattern theorem for self-avoiding trails.

Theorem 5.3.6. Let t(n; p, m) be the number of self-avoiding trails of length n where the prime pattern p occurs at most m times. Then there exists a number a(p) > 0 such that

lim_{n→∞} (1/n) ln t(n; p, a(p)n) < ln µ_t.   (5.30)

Proof. Since µ_t is the growth constant of t(n) and µ_t(p) is the growth constant of t(n; p, 0), and µ_t(p) < µ_t, then given a small enough ε > 0, we can find an m such that for all n ≥ m,

t(n) ≤ (µ_t(1 + ε))^n   (5.31)

and

t(n; p, 0) ≤ (µ_t(1 − ε))^n.   (5.32)

Furthermore, as explained in Theorem 5.3.3, there exists a constant c so that for all n,

t(n) ≤ c (µ_t(1 + ε))^n.   (5.33)

Let ε be fixed. Now consider an arbitrary self-avoiding trail t of length n. Divide t into m-step subtrails by taking the ith segment to be the self-avoiding trail t_{im}, t_{im+1}, . . . , t_{(i+1)m}. We will have M = ⌊n/m⌋ such subtrails. This decomposition is reversible, so (taking into account the n − mM remaining steps) we have

t(n) ≤ (t(m))^M t(n − mM).   (5.34)

Now suppose that the original t contains the prime pattern p at most k times. Then the number of subtrails which contain any copies of p at all cannot exceed k. By summing over


the actual number of these subtrails, we derive

t(n; p, k) ≤ Σ_{j=0}^{k} C(M, j) (t(m))^j (t(m; p, 0))^{M−j} t(n − mM)
           ≤ Σ_{j=0}^{k} C(M, j) (µ_t(1 + ε))^{mj} (µ_t(1 − ε))^{m(M−j)} t(n − mM)
           ≤ c µ_t^{mM} (µ_t(1 + ε))^{n−mM} Σ_{j=0}^{k} C(M, j) (1 + ε)^{mj} (1 − ε)^{mM−mj}
           = c µ_t^{mM} µ_t^{n−mM} (1 + ε)^{n−mM} Σ_{j=0}^{k} C(M, j) ((1 + ε)/(1 − ε))^{mj} (1 − ε)^{mM}
           ≤ c µ_t^n (1 + ε)^m (k + 1) C(M, k) ((1 + ε)/(1 − ε))^{mk} (1 − ε)^{mM},   (5.35)

where C(M, j) denotes the binomial coefficient, assuming that M > 2k, since (1 + ε)/(1 − ε) > 1 and n − mM < m by construction.

Now choose a number ρ. We have

(1/M) ln C(M, ρM) = (1/M) ln [ M! / ((M − ρM)! (ρM)!) ]
                  = (1/M) (ln M! − ln(M − ρM)! − ln(ρM)!)
                  → (1/M) (M ln M − M − (M − ρM) ln(M − ρM) + M − ρM − ρM ln(ρM) + ρM)
                  = (1 − ρ) ln [ M / (M − ρM) ] − ρ ln ρ
                  = −(1 − ρ) ln(1 − ρ) − ρ ln ρ   (5.36)

as M → ∞, from Stirling's approximation. Now as ρ → 0+, 1 − ρ → 1−, so the first term will tend to 0 from above and the second term will also tend to 0 from above. Since we have not placed any restrictions on ρ, we can therefore make this expression as small as we want by setting ρ close to 0. We will use this in the above expression by setting k = ρM. The only restriction on k is that M > 2k, which requires ρ < 1/2 — but this merely places an upper, not a lower, bound on ρ.
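The limit in (5.36) is the familiar entropy function, and its approach can be checked numerically (our illustration):

    from math import comb, log

    rho = 0.1
    entropy = -(1 - rho) * log(1 - rho) - rho * log(rho)
    for M in (100, 1000, 10000):
        k = round(rho * M)
        # The middle value tends to `entropy` as M grows.
        print(M, log(comb(M, k)) / M, entropy)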


Returning to the original quantity of interest, we note that m is fixed, so as n → ∞, M → ∞. We also have n > M by construction. This gives us

(1/n) ln t(n; p, ρM) ≤ (1/n) ln(c(1 + ε)^m) + ln µ_t + (1/n) ln(ρM + 1)
                         + (1/n) ln [ C(M, ρM) ((1 + ε)/(1 − ε))^{ρmM} (1 − ε)^{mM} ]
                     < (1/n) ln(c(1 + ε)^m) + ln µ_t + (1/n) ln(ρM + 1)
                         + (1/M) ln [ C(M, ρM) ((1 + ε)/(1 − ε))^{ρmM} (1 − ε)^{mM} ]
                     = (1/n) ln(c(1 + ε)^m) + ln µ_t + (1/n) ln(ρM + 1)
                         + (1/M) ln C(M, ρM) + ρm ln((1 + ε)/(1 − ε)) + m ln(1 − ε).   (5.37)

Now choose ρ to be small enough that

−m ln(1 − ε) > lim_{M→∞} (1/M) ln C(M, ρM) + ρm ln((1 + ε)/(1 − ε)).   (5.38)

As both m and ε are fixed, and 1 − ε < 1, the left-hand side is positive, so this is possible. This gives

lim_{n→∞} (1/n) ln t(n; p, ρM) ≤ lim_{n→∞} [ (1/n) ln(c(1 + ε)^m) + ln µ_t + (1/n) ln(ρM + 1) ]
                                   + lim_{M→∞} [ (1/M) ln C(M, ρM) + ρm ln((1 + ε)/(1 − ε)) + m ln(1 − ε) ]
                                < ln µ_t.   (5.39)

If we take a(p) to be small enough to satisfy a(p)n < ρM for all n, this proves the theorem.

The result that we really want is the corresponding result for self-avoiding polygon trails, which we will now prove. Firstly, we need to prove that the number of self-avoiding polygon trails grows like the number of self-avoiding trails. Let t^o(n) be the number of self-avoiding polygon trails of length n. Obviously this is only non-zero when n is even.

Lemma 5.3.7. The limit

lim_{n→∞} (1/2n) ln t^o(2n)   (5.40)

exists.

Proof. Suppose that we are given two self-avoiding polygon trails, one of length n and one of length m. The first SAPT must contain a bond whose centre is at least as far right (i.e. has a maximal x-coordinate) as the centre of any other bond in the SAPT. If this bond were horizontal, its rightmost vertex would have to be incident on another bond whose centre has an even greater x-coordinate; so this bond must be vertical. Similarly, the second SAPT must contain a vertical bond whose centre is at least as far left as any other bond.


Fig. 5.19: Joining two SAPTs to form a larger one: (a) a SAPT with its rightmost bond marked; (b) a SAPT with its leftmost bond marked; (c) the two SAPTs are joined.

Suppose that the rightmost bond of the first SAPT occupies the points (x′, y′) and (x′, y′ + 1). Now translate the second SAPT so that its leftmost bond occupies the points (x′ + 1, y′) and (x′ + 1, y′ + 1). Then form a new SAPT which consists of all bonds in the two SAPTs except for the rightmost bond of the first and the leftmost bond of the second, and which contains the bonds {(x′, y′), (x′ + 1, y′)} and {(x′, y′ + 1), (x′ + 1, y′ + 1)}. As all the bonds of the first SAPT lie in the half-plane x ≤ x′, and the bonds of the second SAPT lie in x ≥ x′ + 1, this construct is indeed a SAPT. By construction, it has length n + m. We demonstrate this construction in Figure 5.19.

Conversely, given a SAPT of length n + m that has been constructed in this way, and the length of the component SAPTs, we can uniquely deconstruct the SAPT by taking the first SAPT to contain the leftmost n points, and the second to contain the rightmost m points, with bonds determined in an obvious way. Therefore we have

t^o(n) t^o(m) ≤ t^o(n + m).   (5.41)

From Lemma 5.3.1, and given that n and m must be even, this implies that

lim_{n→∞} (1/2n) ln t^o(2n)   (5.42)


exists.

We define the growth constant of t^o(n) to be µ_t^o.

Theorem 5.3.8. µ_t^o = µ_t.   (5.43)

Proof. As removing any bond from a SAPT and then assigning a direction to the remainder gives a unique self-avoiding trail, we can say that t^o(n) ≤ t(n − 1), and therefore µ_t^o ≤ µ_t.

To prove the reverse inequality, we take advantage of Theorem 5.3.3, and prove that µ_t^o ≥ µ_b. For any point a, we define the set B_n[a] to be the set of bridges of length n that end at a (and, by definition, start at the origin). Now choose an a which makes this set non-empty, and let α and β be two bridges from the set. Let v be a vector that is perpendicular to the line joining the origin to a.

For the remainder of this proof, we will identify each point with the vector that joins the origin to that point. So, for example, α_i could mean either the ith point of α (starting from 0) or the vector joining the origin to that point. It should be clear which is meant from the context.

We can now choose an index j which maximises α_j · v, and we define the trail α′ to be

α′ = {α_j − α_j, α_{j+1} − α_j, . . . , α_n − α_j, α_1 + α_n − α_j, . . . , α_j + α_n − α_j},   (5.44)

where addition and subtraction are taken component-wise. This is just a translate of a cyclic permutation of α, and since α is a bridge, the x-coordinates of all the points after α_n − α_j are greater than the x-coordinates of all the points before. Therefore α′ is a self-avoiding trail.

Furthermore, because α_n · v = a · v = 0 and j maximises α_j · v, we have

α′_i · v ≤ α_j · v − α_j · v = 0   (5.45)

for all i. Finally we observe that α′_n = a. What we have essentially done here is to cyclically permute the trail α so that it lies entirely on one side of the line joining the origin to a. We demonstrate this in Figure 5.20.

We apply the same transformation to β to get β′, but reverse the inequality, choosing j so that β_j · v is minimised. This leads to β′_i · v ≥ 0 for all i. This ensures that β′ lies on the opposite side of the line joining the origin to a from α′.

Now choose a point e that is a neighbour of the origin such that e · v > 0. We then construct a self-avoiding trail by starting with α′, continuing to a, then moving to a + e, and finishing with the reverse of β′, translated so that it starts at a + e and ends at e. Since all points coming from α′ have nonpositive dot product with v, and all points originally from β′ will now have strictly positive dot product with v, the two component trails will not intersect. As they are themselves trails, the resulting construct is indeed a self-avoiding trail. By construction, it has length 2n + 1. We give an example of this construction in Figure 5.21.


Fig. 5.20: Transformation of bridges into self-avoiding trails which do not cross a line: (a) α; (b) β; (c) α′; (d) β′.

Fig. 5.21: Joining two trails to make a self-avoiding trail which ends at a neighbour of the origin. We used the transformed trails in Figure 5.20.

If n > 0, then a cannot be the origin. By construction, the only bond in the new trail that has one vertex lying on the line joining the origin to a, with the other vertex having a positive dot product with v, is the bond {a, a + e}. In particular, the bond {0, e} is unoccupied.

If we are given the cyclic permutation of a bridge that came from B_n[a], there are n + 1 possible choices for the original vertex, and therefore at most n + 1 bridges from B_n[a] could possibly permute into the given bridge. Furthermore, joining two permuted bridges in the manner described above gives a self-avoiding trail which is unique even if we do not know the endpoints of the component bridges, because we know that they have equal length. Therefore, if we let t′(n) be the number of self-avoiding trails of length n which end at a neighbour of the origin, and do not contain the bond joining the origin to the endpoint, our construction gives us the inequality

(1/(n + 1)^2) Σ_a |B_n[a]|^2 ≤ t′(2n + 1). (5.46)

For |B_n[a]| to be positive, we must have the x-coordinate of a greater than or equal to 0, but a cannot be farther than n steps from the origin. Therefore it is certainly enclosed in the rectangular box 0 ≤ x ≤ n, −n ≤ y ≤ n, which contains (n + 1)(2n + 1) points. From the Cauchy-Schwarz inequality ([26, Example 1.11]), we have

(b(n))^2 = (Σ_a |B_n[a]|)^2 = (Σ_a |B_n[a]| · 1)^2
≤ Σ_a |B_n[a]|^2 · Σ_a 1
≤ (n + 1)(2n + 1) Σ_a |B_n[a]|^2
≤ (n + 1)^3 (2n + 1) t′(2n + 1). (5.47)

We can complete a self-avoiding trail of length n that ends at a neighbour of the origin, but does not contain the bond linking the origin to that neighbour, by adding that bond to the end of the trail. This gives us a SAPT of length n + 1, which is not unique. If we want to reverse this completion, we can do it in 2(n + 1) different ways (since there are n + 1 possible bonds to remove and 2 ways to direct the remaining trail). Therefore

t′(n) ≤ 2(n + 1) t^o(n + 1) (5.48)

and we can say

(1/(2n)) ln t^o(2n) ≥ (1/(2n)) ln [ (1/(2(2n + 1))) t′(2n − 1) ]
≥ (1/(2n)) ln [ (1/(2(2n + 1) n^3 (2n − 1))) (b(n − 1))^2 ]
= (1/n) ln b(n − 1) + (1/(2n)) ln [ 1/(2(2n + 1) n^3 (2n − 1)) ]. (5.49)

Therefore µ^o_t ≥ µ_b and the theorem is proved.

The most important result is the actual pattern theorem for SAPTs.

Theorem 5.3.9. Let t^o(n; p, m) be the number of self-avoiding polygon trails where the prime pattern p occurs at most m times. Then there exists a number a(p) > 0 such that

lim_{n→∞} (1/(2n)) ln t^o(2n; p, 2a(p)n) < ln µ^o_t. (5.50)

Proof. Given any self-avoiding polygon trail containing at most m occurrences of the pattern p, removing any step from this trail and adding a direction will result in a self-avoiding trail of length n − 1. As we have not added any steps, this trail contains at most m occurrences of p. Therefore

t^o(n; p, m) ≤ t(n − 1; p, m). (5.51)

Choose a′(p) according to Theorem 5.3.6 so that

lim_{n→∞} (1/n) ln t(n; p, a′(p)n) < ln µ_t, (5.52)

and choose a(p) such that for all positive n,

a(p)n + 2a(p) < a′(p)n. (5.53)

Then

lim_{n→∞} (1/(2n)) ln t^o(2n; p, 2a(p)n) ≤ lim_{n→∞} ((2n)/(2n − 1)) (1/(2n)) ln t(2n − 1; p, 2a(p)n)
= lim_{n→∞} (1/n) ln t(n; p, 2a(p)(n/2 + 1))
≤ lim_{n→∞} (1/n) ln t(n; p, a′(p)n)
< ln µ_t = ln µ^o_t. (5.54)

Fig. 5.22: A prime pattern which induces a crossing.

Corollary 5.3.10. On average, a prime pattern p will occur O(n) times in a random SAPT of length n as n → ∞.

Proof. Theorem 5.3.9 shows that p occurs at least a linear number of times in a random SAPT. However it cannot occur more than n times in any SAPT of length n, so it occurs a linear number of times.

We first use the pattern theorem to tell us the number of crossings in a random SAPT.

Lemma 5.3.11. The expected number of crossings in a random SAPT of length n grows like O(n).

Proof. Firstly we note that the number of crossings in an SAPT must be less than or equal to n/2. This is obvious because the trail must visit each crossing twice, and it visits n (possibly non-distinct) points altogether.

Now consider the trail segment in Figure 5.22. It is a bridge, and since it has no cutting line, it is a prime bridge. Therefore we can apply the pattern theorem, and deduce that the expected number of times this pattern will occur in a random SAPT of length n is linear in n. But every time this pattern occurs, it includes one crossing in the corresponding knot. Therefore the expected number of crossings must be linear in n.

The pattern theorem can also be applied to a different trail segment to gain another view of how the SAPT is knotted. Firstly consider the trail segment in Figure 5.23(a). If we allocate the crossings in the right way (as in Figure 5.23(b), for example), then this generates a trefoil in the knot. There are exactly 2 ways (out of 8 possibilities) that the crossings can create a trefoil, so the probability of this occurring is 1/4. We call such a segment a trefoil segment.

Lemma 5.3.12. The number of trefoil segments in a random SAPT of length n grows like O(n).

Proof. The proof is identical to that of Lemma 5.3.11; it is a prime bridge (and therefore usable for the pattern theorem), so it must occur linearly in n.

Fig. 5.23: Trefoil segment; (a) without crossings and (b) with crossings.

5.4 Fourier transforms

We continue in our quest to find bounds for the mean unknotting time. In addition to Kesten's pattern theorem, we also present a bit of background on Fourier transforms in this section.

Definition 5.4.1. Let f be a discrete function defined at the points ∆k, k = 0, 1, . . . , N − 1. If we let f_k = f(∆k), then the Fourier transform of f is a function defined on the points ∆l, l = 0, 1, . . . , N − 1, with the values

f̂(∆l) = f̂_l = Σ_{k=0}^{N−1} f_k e^{−2πilk/N}. (5.55)

The inverse transform has a very similar form:

f_k = (1/N) Σ_{l=0}^{N−1} f̂_l e^{2πilk/N}. (5.56)

For our purposes, we can make a few simplifications to the Fourier transform. The function that we will be transforming will be multivariate with n variables, so we will apply the transform in all the variables consecutively. Each variable will be either 0 or 1, so N = 2 and ∆ = 1. Therefore f_k = f(k). The Fourier transform will then also be multivariate in n variables:

f̂_l = Σ_{k_n=0}^{1} Σ_{k_{n−1}=0}^{1} · · · Σ_{k_1=0}^{1} f_k e^{−πil_1k_1} e^{−πil_2k_2} . . . e^{−πil_nk_n}
= Σ_k f_k e^{−πil·k} = Σ_k f_k (−1)^{l·k}. (5.57)

The inverse Fourier transform is

f_k = (1/2^n) Σ_l f̂_l (−1)^{l·k}. (5.58)


Now suppose we have a function that has two n-variable arguments, x and y, which have components of 0 or 1. Suppose further that f depends only on the positions of x and y relative to each other. We can then redefine f as a function of an n-variable vector:

f(x,y) = f(|y − x|), (5.59)

where we take |y − x| to be the vector whose components are the absolute values of the corresponding components of y − x. Now if we fix x on an n-dimensional unit cube and let y run over all possible points on that cube, we see that |y − x| also runs over all possible points on that cube. Therefore we can take the multivariate Fourier transform of this new function to get

f̂_k = Σ_y f(|y − x|)(−1)^{k·|y−x|} = Σ_y f(x,y)(−1)^{k·(y−x)} (5.60)

since the elements of y − x are integral. Note in particular that although this expression for f̂_k contains x, it is independent of x. The inverse of this transformation is

f(x,y) = f(|y − x|) = (1/2^n) Σ_k f̂_k (−1)^{k·|y−x|} = (1/2^n) Σ_k f̂_k (−1)^{k·(y−x)}. (5.61)

5.5 A walk on an n-cube

The pattern theorem tells us the number of times a random SAPT will contain crossings or trefoil segments. Now, the crossings can be arranged in both a crossing-inducing segment and a trefoil segment in such a manner as to induce two natural states, which can be derived from each other by reversing crossings. For the crossing-inducing segment, the two states occur when each strand in the crossing is the overpass. For the trefoil segment, the two states occur when the trefoil is knotted or unknotted.

If we call these two states ‘0’ and ‘1’, then since we have a number of these segments, we can take the tuple of all the states as a point on a hyper-dimensional unit cube. By reversing crossings, we move between various points on this cube, but only by changing one coordinate at a time.

To model this behaviour, we consider the model of a random walk on an n-dimensional unit cube (or n-cube). This walk starts at the origin, and when it reaches a vertex, it chooses a new direction at random. In particular, this means that it can double back immediately.

We define the points 0 and 1 to be the origin and the point (1, 1, . . . , 1) on the cube, respectively. With this model, we are firstly interested in how long, on average, it takes the walker to reach the point 1 for the first time. We call walks that do not touch their endpoint before the last step first-passage walks. Firstly, we find the generating function of the number of these walks (this result was given to us by Gordon Slade).

Theorem 5.5.1. The generating function for first-passage walks from 0 to 1 on the unit n-cube (without probability weighting) is

W(z) = [ Σ_{j=0}^{n} (−1)^j (n choose j) / (1 − zn + 2zj) ] / [ Σ_{j=0}^{n} (n choose j) / (1 − zn + 2zj) ]. (5.62)

Proof. We define the function C_{x,y}(z) to be the generating function of all possible paths on the n-cube that start from point x and end at point y:

C_{x,y}(z) = Σ_{w: x→y} z^{|w|} (5.63)

where w is summed over all possible paths from x to y, and |w| is the length of w. Note that w may touch y more than once, so it is not necessarily a first-passage walk (although the sum does include all first-passage walks).

Now the n-cube possesses a great deal of symmetry, which we can exploit to our advantage. It is clear that the number of paths from x to y is the same as the number of paths on an appropriately translated n-cube from 0 to y − x. If we now reflect this translated n-cube in the appropriate hyperplanes, we see that this is in turn equal to the number of paths on a regular n-cube from the origin to |y − x|. Therefore C satisfies the conditions that we outlined above for our Fourier transforms, and it has the transform

Ĉ_k(z) = Σ_y (−1)^{k·(y−x)} C_{x,y}(z) = Σ_y (−1)^{k·(y−x)} [ δ_{x,y} + z Σ_{u: ⟨x,u⟩} C_{u,y}(z) ] (5.64)

where the sum over u runs over all neighbours of x. This is because the walk is either empty, which is only possible if x = y, or it must first step to a neighbour of x.

If we define P(x,y) to be the transition probability from x to y (which is 1/n if x and y are neighbours, and 0 otherwise), then

Ĉ_k(z) = Σ_y (−1)^{k·(y−x)} ( δ_{x,y} + zn Σ_u P(x,u) C_{u,y}(z) ). (5.65)

Since the only value of y where δ_{x,y} is non-zero is x,

Ĉ_k(z) = (−1)^{k·(x−x)} + zn Σ_y Σ_u (−1)^{k·(u−x)} P(x,u) (−1)^{k·(y−u)} C_{u,y}(z)
= 1 + zn Σ_u (−1)^{k·(x−u)} P(x,u) Ĉ_k(z)
= 1 + zn P̂_k Ĉ_k(z) (5.66)

since P(x,u) also depends only on |x − u|. Rearranging, we get

Ĉ_k(z) = 1/(1 − zn P̂_k). (5.67)

Now since P̂_k does not depend on x, we can set x to be 0 for the purposes of calculating it. From the definition,

P̂_k = Σ_u P(0,u)(−1)^{k·u}
= Σ_{j=1}^{n} (1/n)(−1)^{k·e_j} = Σ_{j=1}^{n} (1/n)(−1)^{k_j}
= (1/n)(n − 2|k|_1) = 1 − (2/n)|k|_1 (5.68)

and so

Ĉ_k(z) = 1/(1 − zn(1 − (2/n)|k|_1)) = 1/(1 − zn + 2z|k|_1). (5.69)

We can now apply our inverse transformation to derive the original C; setting the first argument to 0 gives us

C_{0,y}(z) = (1/2^n) Σ_k (−1)^{k·y} / (1 − zn + 2z|k|_1). (5.70)

This gives us the generating function of all paths that start at 0 and end at 1, whether they are first-passage or not. What we would like to find is the generating function of just first-passage paths. Since each walk counted in C consists of a first-passage walk, followed by a (possibly trivial) loop centred at 1, we have

C_{0,1}(z) = W(z) C_{1,1}(z) = W(z) C_{0,0}(z) (5.71)

and therefore

W(z) = C_{0,1}(z)/C_{0,0}(z) = [ (1/2^n) Σ_{j=0}^{n} (−1)^j (n choose j) / (1 − zn + 2zj) ] / [ (1/2^n) Σ_{j=0}^{n} (n choose j) / (1 − zn + 2zj) ]. (5.72)

Here we have replaced the sum over k by a sum over j, where j = |k|_1. For any value of j, there are (n choose j) values of k which have that value of j. Simplifying the above expression gives us the desired result.
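The theorem can be checked directly for small n by enumerating first-passage walks; the following Python sketch (function names ours; sympy is used only for the series expansion) compares the series coefficients of (5.62) with a brute-force count on the 3-cube:

    import sympy as sp
    from collections import defaultdict

    def W(n, z):
        # eq. (5.62)
        num = sum((-1) ** j * sp.binomial(n, j) / (1 - z * n + 2 * z * j)
                  for j in range(n + 1))
        den = sum(sp.binomial(n, j) / (1 - z * n + 2 * z * j)
                  for j in range(n + 1))
        return sp.cancel(num / den)

    def first_passage_counts(n, max_len):
        # dynamic programming over walks that avoid the vertex 1 until the end
        target = (1,) * n
        state = {(0,) * n: 1}
        counts = []
        for _ in range(max_len):
            new, arrived = defaultdict(int), 0
            for v, c in state.items():
                for i in range(n):
                    w = v[:i] + (1 - v[i],) + v[i + 1:]
                    if w == target:
                        arrived += c
                    else:
                        new[w] += c
            counts.append(arrived)
            state = dict(new)
        return counts   # counts[m-1] = first-passage walks of length m

    z = sp.symbols('z')
    n, M = 3, 8
    ser = sp.series(W(n, z), z, 0, M + 1).removeO()
    assert [ser.coeff(z, m) for m in range(1, M + 1)] == first_passage_counts(n, M)

For n = 2, for instance, (5.62) simplifies to 2z^2/(1 − 2z^2), whose coefficients 2, 4, 8, . . . count the first-passage walks of lengths 2, 4, 6, . . . .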

We can now find the expected time that it will take the walker to first reach 1 from the origin; we call this the mean first-passage time.

Theorem 5.5.2. On the n-cube, the mean first-passage time from 0 to 1 is

n Σ_{i=1}^{n} [(1 − (−1)^i)/(2i)] (n choose i) = n Σ_{i=1, i odd}^{n} (1/i) (n choose i). (5.73)

Proof. All paths of length m have probability (1/n)^m, so W(z/n) gives us the probability-weighted generating function of the first-passage paths that we want. Therefore the expected length of such paths is

[z (d/dz) W(z/n)]_{z=1} = [z W′(z)]_{z=1/n}. (5.74)

Differentiating our expression for W(z) from Theorem 5.5.1 gives

z W′(z) = z { [ Σ_{j=0}^{n} (−1)^j (n choose j)(n − 2j)/(1 − (n − 2j)z)^2 ][ Σ_{j=0}^{n} (n choose j)/(1 − (n − 2j)z) ]
− [ Σ_{j=0}^{n} (n choose j)(n − 2j)/(1 − (n − 2j)z)^2 ][ Σ_{j=0}^{n} (−1)^j (n choose j)/(1 − (n − 2j)z) ] } / [ Σ_{j=0}^{n} (n choose j)/(1 − (n − 2j)z) ]^2

= z Σ_{i,j=0}^{n} ((−1)^i − (−1)^j) [(n − 2i)/((1 − (n − 2i)z)^2 (1 − (n − 2j)z))] (n choose i)(n choose j) / [ Σ_{j=0}^{n} (n choose j)/(1 − (n − 2j)z) ]^2

= z Σ_{i,j=0}^{n} ((−1)^i − (−1)^j) [(n − 2i)(1 − nz)^2/((1 − (n − 2i)z)^2 (1 − (n − 2j)z))] (n choose i)(n choose j) / [ Σ_{j=0}^{n} (1 − nz)(n choose j)/(1 − (n − 2j)z) ]^2. (5.75)

Now when z = 1/n, the only non-zero terms in the numerator occur when i = 0. For the denominator, the only non-zero term in the sum occurs when j = 0. Furthermore, this term is 1. We now get

[z W′(z)]_{z=1/n} = z Σ_{j=0}^{n} [(1 − (−1)^j) n (1 − nz)^2 / ((1 − nz)^2 (1 − (n − 2j)z))] (n choose j) |_{z=1/n}
= z Σ_{i=1}^{n} [(1 − (−1)^i) n / (1 − (n − 2i)z)] (n choose i) |_{z=1/n}
= (1/n) Σ_{i=1}^{n} [(1 − (−1)^i) n / (2i/n)] (n choose i)
= n Σ_{i=1}^{n} [(1 − (−1)^i)/(2i)] (n choose i). (5.76)

To find how the mean first-passage time grows asymptotically with n, we construct its generating function, weighted according to n.

Theorem 5.5.3. The generating function for the mean first-passage time from 0 to 1 on an n-cube is

1/(1 − 2z) − 1/(1 − z) − z ln(1 − 2z)/(2(1 − z)^2). (5.77)

Therefore the mean first-passage time grows asymptotically like 2^n.

Proof. Let the mean first-passage time from 0 to 1 on an n-cube be f(n). We will first prove the recurrence

n(n + 1)f(n + 2) − n(3n + 4)f(n + 1) + 2(n + 1)^2 f(n) = 0 (5.78)

for n ≥ 0. Substituting the formula we derived for f(n) in Theorem 5.5.2 on the left-hand side gives

for n ≥ 0. Substituting the formula we derived for f(n) in Theorem 5.5.2 on the left-handside gives

n(n+ 1)f(n+ 2) − n(3n+ 4)f(n+ 1) + 2(n+ 1)2f(n)

= n(n+ 1)(n+ 2)n+2∑

i=1

1 − (−1)i

2i

(

n + 2

i

)

− n(n+ 1)(3n+ 4)n+1∑

i=1

1 − (−1)i

2i

(

n+ 1

i

)

+2n(n+ 1)2

n∑

i=1

1 − (−1)i

2i

(

n

i

)

= n(n+ 1)

[

n+2∑

i=1

1 − (−1)i

2i

(

n + 2

i

)(

(n+ 2) − (3n+ 4)(n+ 2 − i)

n+ 2+

2(n+ 1 − i)(n + 2 − i)

n + 2

)

]

= n(n+ 1)

[

n+2∑

i=1

1 − (−1)i

2i

(

n + 2

i

)

(n+ 2)2 − (3n+ 4)(n+ 2 − i) + 2(n+ 1 − i)(n+ 2 − i)

n+ 2

]

= n(n+ 1)

[

n+2∑

i=1

−1 − (−1)i

2

n− 2i+ 2

n + 2

(

n+ 2

i

)

]

= −n(n + 1)

2(n+ 2)

[

n+2∑

i=1

(1 − (−1)i)(n− 2i+ 2)

(

n+ 2

i

)

]

(5.79)

To prove the recurrence, we need merely show that

Σ_{i=1}^{n} (1 − (−1)^i)(n − 2i) (n choose i) = Σ_{i=0}^{n} (1 − (−1)^i)(n − 2i) (n choose i) = 0 (5.80)

for n ≥ 2. Expanding the left hand side,

Σ_{i=0}^{n} (1 − (−1)^i)(n − 2i) (n choose i)
= n Σ_{i=0}^{n} (n choose i) − 2 Σ_{i=0}^{n} i (n choose i) − n Σ_{i=0}^{n} (−1)^i (n choose i) + 2 Σ_{i=0}^{n} (−1)^i i (n choose i)
= n (1 + x)^n |_{x=1} − 2 [x (d/dx)(1 + x)^n]_{x=1} − n (1 − x)^n |_{x=1} + 2 [x (d/dx)(1 − x)^n]_{x=1}
= n 2^n − 2n 2^{n−1} = 0. (5.81)

Next we define the generating function of f:

F(x) = Σ_{n=0}^{∞} f(n) x^n. (5.82)

Multiplying both sides of the recurrence by x^n and summing over n gives the equation

x (d^2/dx^2)[(F(x) − x)/x] − 3 (x d/dx)^2 [F(x)/x] − 4 x (d/dx)[F(x)/x] + (2/x)(x d/dx)^2 [x F(x)] = 0. (5.83)

A quick check with Maple shows that 1/(1 − 2z) − 1/(1 − z) − z ln(1 − 2z)/(2(1 − z)^2) does indeed satisfy this equation. As the first two coefficients of our proposed generating function are identical to the mean first-passage times for n = 0 and 1, F(x) is the proposed function.

In some cases, we may want the walk to end at a point other than 1; as might be expected, it turns out that out of all the points on the n-cube, 1 has the longest mean first-passage time from 0.

Theorem 5.5.4. Let x be a vertex of the n-cube. The mean first-passage time from 0 to x is less than or equal to the mean first-passage time from 0 to 1.

Proof. Because x lies at a vertex of the n-cube, it must contain |x|_1 1s. There are (n choose |x|_1) points on the n-cube with the same number of 1s, including x itself. Any first-passage path from 0 to 1 must pass through at least one of these points.

Firstly, we separate all such walks into sets according to which of these points it touches first. It can be seen that there exists a bijection between any one of these sets and any other by permuting the coordinates so that the defining point of the first set is transformed into the defining point of the second. Since any permutation of coordinates leaves 0 and 1 unaltered, the result is still a first-passage path from 0 to 1.

As an illustration, suppose that n = 3 and x = (1, 1, 0). There are 2 other points with two 1s, which are x_1 = (1, 0, 1) and x_2 = (0, 1, 1). Then a bijection between all first-passage walks that pass through x (before passing through x_1 and x_2, if at all) and all

[Figure: panels (a) Path passing through (1, 1, 0) first; (b) Transformed path passing through (1, 0, 1) first.]

Fig. 5.24: Bijection between paths on the n-cube that pass through certain points.

first-passage walks that first pass through x_1 can be constructed by interchanging the last two coordinates of all the points in the path. For instance, as shown in Figure 5.24, the path (0, 0, 0), (0, 1, 0), (1, 1, 0), (1, 0, 0), (1, 0, 1), (1, 1, 1) becomes the path (0, 0, 0), (0, 0, 1), (1, 0, 1), (1, 0, 0), (1, 1, 0), (1, 1, 1).

It is easy to see that this bijection is length-preserving (and therefore also preserves the probability of each path occurring), so the expected length of the walks in each set is equal to the expected length of all the walks, i.e. all first-passage walks from 0 to 1, which is the quantity that we calculated above. Now since the n-cube is directionless, we can apply the same reasoning to first-passage walks from 1 to 0 and get the same result. Thus the expected length of first-passage walks from 1 to 0 which first touch x out of all points with |x|_1 1s is equal to the mean first-passage time from 0 to 1.

Now we divide all such walks at the point where they first touch x. Every such division results in a first-passage walk from 1 to x that does not touch any other point with |x|_1 1s, followed by a first-passage walk from x to 0. Conversely, if we join any first-passage walk from 1 to x that does not touch any other point with |x|_1 1s with any first-passage walk from x to 0, we will always get a first-passage walk from 1 to 0 that touches x before touching any point with the same number of 1s.

Therefore the expected length of first-passage walks from 0 to 1 is equal to the sum of the expected lengths of first-passage walks from 1 to x (that do not touch the appropriate points) and first-passage walks from x to 0. Since all walks have non-negative length, this means that the expected length of first-passage walks from x to 0 is less than or equal to the expected length of first-passage walks from 0 to 1. By merely observing that we can again reverse the direction and say that the length of first-passage walks from x to 0 is equal to the length of first-passage walks from 0 to x, the result is proved.


5.6 Bounds on the mean unknotting time

In this section, we use the results found in Sections 5.3 and 5.5 to construct bounds on the mean unknotting time of a knot, in particular as it relates to the length of the self-avoiding polygon trail used to generate the knot. Firstly, we find an upper bound for the mean unknotting time.

Theorem 5.6.1. Asymptotically, the mean unknotting time for a random SAPT of length n grows no faster than (√2)^n.

Proof. Suppose that we are given a knot with m crossings. As mentioned in Section 5.5, we can relate the unknotting process to the model of a walk on an m-cube in the following fashion. Order the crossings on the knot in an arbitrary manner. Each crossing has two states, which correspond to having a particular strand on top of the other or vice versa. Label the state that each crossing is originally in as ‘0’, and label the opposite state as ‘1’. Reversing a crossing will take it from state 0 to state 1 or vice versa.

Now, looking at the states as an m-tuple, we see that the set of all possible points forms an m-cube. We start at the origin by construction, and every random reversal is equivalent to walking in a random direction on the cube. Now, at least one way of allocating strands in the crossings must result in the unknot, and so at least one point on the m-cube is equivalent to the unknot. The mean unknotting time must therefore be less than or equal to the mean first-passage time from the origin to that point. From Theorem 5.5.4, this time grows no faster than 2^m.

However, as we noted in the proof of Lemma 5.3.11, any SAPT of length n cannot have more than n/2 crossings, and so the mean unknotting time of such a self-avoiding polygon trail cannot grow faster than (√2)^n.

To find a lower bound on the mean unknotting time, we can use our model of a walk on an m-cube again, this time with trefoil segments. Suppose that we are given a knot with m trefoil segments whose crossings are arranged so as to form trefoils in the knot (i.e. knotted). For each of these segments, we assign the state ‘0’ if the segment is knotted, and ‘1’ otherwise. A careful analysis of the possible crossing reversals of a trefoil, as shown in Figure 5.25, shows that if a trefoil segment is in state 0, a crossing reversal has probability 1 of unknotting it (and therefore sending it to state 1), but if the segment is unknotted, only one of the three crossings knots the segment when reversed.

If we again construct an m-tuple according to the state of the trefoil segments, we arrive at a very similar model of a walker which starts from the origin and travels in a random direction at every vertex. However, if the direction it chooses to travel in is towards the origin (i.e. has more 0s than before), then it only has a 1/3 chance of succeeding — if it fails, it stays at the same vertex that it started off at.

The entire knot cannot possibly be unknotted if any of the trefoil segments are knotted, so we must reach the vertex 1 on the m-cube. On the other hand, even if all the trefoil

Fig. 5.25: All possible crossing reversals of a trefoil segment. The two knots in the centre are knotted; the outer knots are unknotted. The highlighted crossings are the ones which differ from the connected centre knot.

segments are in state 1, the entire knot may still be knotted, so this finds a lower bound on the mean unknotting time to go with our upper bound derived above.

To do this, we try to find the mean first-passage time from 0 to 1. We were unable to find this quantity for this model with the same method that we used for a directionless n-cube, because that method only works when all the vertices are functionally identical. In fact, we were unable to prove a formula for the generating function at all, although we were able to produce an expression which is almost surely the generating function.

To guess the generating function, we first calculated the mean first-passage times for the first few dimensions. We did this by calculating the mean first-passage time from any point with k 1s to 1, which we denote by E(n, k) for an n-dimensional cube. To do this, we observed that if the walk starts at a point with k 1s, the first step has a probability of k/(3n) of moving to a point with k − 1 1s, a probability of 2k/(3n) of staying at the same point, and a probability of (n − k)/n of moving to a point with k + 1 1s. The first step takes 1 unit of time, and if we move to a point with k′ 1s, the walk will then take (on average) E(n, k′) more steps to first reach 1. This gives the equation

E(n, k) = 1 + [k/(3n)] E(n, k − 1) + [2k/(3n)] E(n, k) + [(n − k)/n] E(n, k + 1) (5.84)

which applies for k ≤ n − 1. Naturally E(n, n) = 0, so this gives us a linear system of equations that we can use to find E(n, 0) for any fixed n.

Next, we used the Maple package ‘GFUN’ (see [120]) to guess the generating function. Using this package, we analysed the series using the method of differential approximants. An extension of the Pade approximants we described in Section 3.6.2, this method involves fitting the generating function to a linear differential equation with polynomial coefficients — so if the generating function is f(z), we fit the first few series terms to the equation

Σ_{i=0}^{K} Q_i(z) (z d/dz)^i f(z) = P(z) (5.85)

where P(z) and Q_i(z) are polynomials, and solve for P(z) and Q_i(z). For more information on differential approximants, see [44].

This analysis indicated that the generating function for the mean first-passage time from 0 to 1 in this model is

[3z/(4(1 − z)^2)] ( 1/(3 − 4z) − ln(3 − 4z) + ln 3 + 1 ). (5.86)
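As a sanity check on this guessed form, the following sympy sketch (ours) expands (5.86) and compares its coefficients with the exact values E(n, 0) = 1, 10/3, 61/9, 308/27 obtained from the linear system above:

    import sympy as sp

    z = sp.symbols('z')
    G = 3*z/(4*(1 - z)**2) * (1/(3 - 4*z) - sp.log(3 - 4*z) + sp.log(3) + 1)
    ser = sp.series(G, z, 0, 5).removeO()
    exact = [1, sp.Rational(10, 3), sp.Rational(61, 9), sp.Rational(308, 27)]
    assert [sp.simplify(ser.coeff(z, n)) for n in range(1, 5)] == exact

The agreement of these low-order coefficients is of course consistent with, but does not prove, the conjectured generating function.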

The fact that this function is similar in form to the generating function of the model we analysed in Section 5.5 suggests that it is indeed the correct generating function. This function shows that the mean first-passage time grows asymptotically like (4/3)^n. We use this in our following result.

Proposition 5.6.2. Asymptotically, the mean unknotting time for a random SAPT of length n must grow at least exponentially in n.

Proof. Since the number of trefoil segments in the SAPT is O(n), of which an average of 1/4 form trefoils, we model the unknotting process by a walk on an O(n)-cube as described above. If the analysis of the model above is correct, the mean unknotting time must be at least (4/3)^{O(n)}, which is exponential in n.

As an interesting aside, in the course of our attempts to prove the generating function for the mean first-passage time from 0 to 1 in this model, we noticed that the average time taken to return to 1 (for the first time) if the walker starts at 1 appears to be exactly (4/3)^n, although we could not prove this. In fact, if we look at a more general model which has a 1/p chance of turning a 1 state into a 0 state, and also has a 1/q chance of turning a 0 state into a 1 state (it stays at 0 otherwise), then the mean first-passage time from 0 to 1 appears to satisfy the recurrence

f(n + 1) − 2f(n) + f(n − 1) = q (q/p + 1/n)(1 + q/p)^{n−1}. (5.87)

Furthermore, the mean first-passage time from 1 to 1, not counting the starting point, appears to be exactly (1 + q/p)^n. Unfortunately, we could not prove either of these formulas.

Returning to the bounds on the mean unknotting time, we can put Theorem 5.6.1 and Proposition 5.6.2 together to arrive at the following result.

Proposition 5.6.3. The mean unknotting time for a random SAPT of length n must be exponential in n, with the growth constant being less than or equal to √2.

5.7 The pivot algorithm

Proposition 5.6.3 shows that the unknotting time should be exponential in the length of the self-avoiding polygon trail, but we have very little idea of the value of the growth constant. In the next few sections, we will estimate this value through simulation. To do this, we generate random SAPTs, and then observe how long it takes for them to unknot.

The first thing to do is to find a way of generating SAPTs uniformly at random, i.e. so that given a fixed length n, every SAPT of length n has an identical probability of being generated. For this purpose, we have adapted an algorithm that is used to generate random self-avoiding walks and self-avoiding polygons called the pivot algorithm.

Informally, the pivot algorithm works by taking an arbitrary SAPT of the length that we want and making length-preserving changes to it. If we make enough of these changes, it turns out (as we will show in the next section) that we will end up with a random SAPT. For the pivot algorithm, the changes take the form of removing a segment of the polygon, and then either reflecting or rotating it. Sometimes this transformation produces a result which is not a SAPT; if so, the original polygon does not change.


More precisely, suppose that we start off with a SAPT p of length n, so that the points on the polygon are p_0, p_1, . . . , p_{n−1}, p_n, with p_n = p_0. Then at each step we perform the following operations:

1. Choose randomly two points on p, say p_i and p_j, with 0 ≤ i, j ≤ n − 1 and i ≠ j.

2. We transform the smaller part of p that lies between p_i and p_j, or a random part if both segments have equal length. There are two possible transformations, each with probability 1/2:

(a) Rotation through an angle of π, centred at the midpoint of the line connecting p_i and p_j.

(b) Reflection through the perpendicular bisector of the line connecting p_i and p_j. (This only works if this line is vertical or has a slope of 0 or ±1.)

3. If the pivot results in a SAPT, we keep the polygon as the next iteration. Otherwise, we set the next iteration of the polygon to be p.

To start off, we choose an arbitrary SAPT of length n — in our case, we have used the polygon which is as close to a square as possible, though this is by no means necessary. We then apply the above procedure n times, so as to make sure that the new polygon is sufficiently distinct from the original polygon. The result is then taken as one sample point. Now, instead of going back to an arbitrary polygon, we use this sample point as a starting point for the next iteration, applying the transformation n times again to it to generate another sample point, and so on. We repeated this process until we generated 1,000,000 samples.
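A minimal Python sketch of one attempted pivot move is given below (ours, and deliberately simplified: it implements only the rotation case and always transforms the arc between the two chosen indices, whereas the procedure above also uses reflections and transforms the smaller part):

    import random

    def try_pivot(p, rng=random):
        # p is a SAPT as a list of lattice points p[0..n] with p[n] == p[0]
        n = len(p) - 1
        i, j = sorted(rng.sample(range(n), 2))
        a, b = p[i], p[j]
        q = list(p)
        for k in range(i, j + 1):
            # rotate the arc p[i..j] by pi about the midpoint of p[i]p[j]
            # (x -> p[i] + p[j] - x), re-traversed in reverse so that the
            # endpoints of the arc reconnect to the rest of the polygon
            src = p[i + j - k]
            q[k] = (a[0] + b[0] - src[0], a[1] + b[1] - src[1])
        q[n] = q[0]
        bonds = set()
        for u, v in zip(q, q[1:]):
            e = frozenset((u, v))
            if e in bonds:            # a trail may revisit a point, not a bond
                return p              # reject the pivot: keep the old SAPT
            bonds.add(e)
        return q

On the square lattice a vertex can be visited at most twice by a trail, so checking that no bond repeats is exactly the SAPT validity condition.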

We would like to ensure that the correlation between sample points has as little effect as possible on our estimate of the error. When the pivot algorithm has been used for self-avoiding walks and polygons ([96]), it was usually found that taking n transformations, where n is the length of the walk/polygon, generally sufficed to give an independent sample. This is what we did; in addition, we also took into account autocorrelation times when computing the error bars on our estimate. Again from [96], the (integrated) autocorrelation time is

τ_{int,g} = 1/2 + Σ_{k=1}^{∞} C_g(k)/C_g(0) (5.88)

where C_g(k) is the covariance of sample points that are k observations apart. Loosely speaking, 2τ_{int,g} n is the number of sample points needed to get the equivalent of n independent sample points. Thus, we multiplied the variance by 2τ_{int,g} when calculating our error bars (which we took to be twice the standard error on either side of the estimate). We used a program made available to us by E. J. Janse van Rensburg.
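In practice the infinite sum in (5.88) has to be truncated, since the empirical covariances are pure noise at large lag; a minimal Python sketch of such an estimator, with a simple fixed-window cutoff of our own choosing, is:

    import numpy as np

    def tau_int(samples, window=None):
        # eq. (5.88), truncated at a finite window
        x = np.asarray(samples, dtype=float)
        x = x - x.mean()
        n = len(x)
        if window is None:
            window = max(2, n // 100)
        c0 = np.dot(x, x) / n
        tau = 0.5
        for k in range(1, window):
            tau += np.dot(x[:-k], x[k:]) / (n - k) / c0
        return tau

    # inflate the naive variance by 2*tau_int when forming error bars:
    # stderr = sqrt(2 * tau_int(obs) * np.var(obs) / len(obs))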


5.8 Validity of the pivot algorithm

As for any Monte Carlo method, we need to be sure that the pivot algorithm produces a random SAPT. In this context, this means that for any length n, the stationary distribution of the pivot algorithm must be uniform, i.e. all SAPTs of length n must have an equal stationary-state probability.

To show that this is indeed the case, we use Theorem 4 of [46]. In Markov chain terms, this states that the pivot algorithm has a uniform stationary distribution if it satisfies three conditions. The first is that there must be a polygon which can be transformed into itself, which is trivial since many of the transformations do not work, resulting in no change to the SAPT. The second is that for all states, the number of possible transformations must be equal, and transformations must be chosen equally at random from among these choices. This is also true, as the pivot algorithm is length preserving, so at any stage there are (n choose 2) ways of choosing the vertices to operate on, and 2 ways to choose a transformation. Therefore there are 2(n choose 2) equally likely possible transformations at each stage, which is independent of the original SAPT.

The last condition is that the algorithm is ergodic, which means that any state is reachable from any other in a finite number of steps. Formally, this means that for any two states p and q, there must exist a finite m such that

P^m(p, q) > 0 (5.89)

where P^m is the probability of being in state q after m steps, given that the process is currently in state p. We will show that this is indeed the case; but first, we need to define some notation.

Definition 5.8.1. Suppose that we are given a self-avoiding polygon trail p. We define a straight line in the plane to be a support line for p if it touches p at more than one point, but does not divide p (i.e. p lies totally on one side of the line). A support line is external if it contains three points a, b, c in that order, such that a and c lie on p but b does not. Support lines are called horizontal or vertical if they are horizontal or vertical as a line; otherwise, they are called diagonal. Diagonal support lines are always external.

We define p_i to be the ith point of p, counting from 0. Using this, we call an index i ∈ {0, 1, . . . , n − 1} a turn of p if the vector from p_{i−1} to p_i is not equal to the vector from p_i to p_{i+1} — essentially, the two steps adjacent to p_i must go in different directions. Since the endpoints of the polygon are identical, we also identify the indices −1 with n − 1 and 0 with n for this purpose. Because a polygon must meet its own starting point, every non-trivial SAPT must have at least 4 turns. We call SAPTs that have exactly 4 turns rectangles.

We call an index i ∈ {0, 1, . . . , n − 1} a self-intersection of p if there exists another index j such that j ≠ i and p_j = p_i.

These terms are illustrated in Figure 5.26.

We can now use these definitions to prove that the pivot algorithm is ergodic.

[Figure labels: External vertical support line; Horizontal support line; Diagonal support line; Self-intersection; Turn.]

Fig. 5.26: Terminology for SAPTs.

Theorem 5.8.1. Given two polygons p and q, we can apply transformations used in the pivot algorithm to turn p into q.

Proof. We use the same idea as the proof of Dubins et al. ([46]) for the pivot algorithm for self-avoiding polygons.

From the definitions, we observe that if p is not a rectangle, then it must have an external support line, which is either horizontal, vertical or diagonal. Therefore there exist two indices i and j such that p_i and p_j lie on the support line, but the segment between them does not touch the line.

We assume (without loss of generality) that i < j, and that this segment is the segment p_i, p_{i+1}, . . . , p_j. To arrive at a new SAPT, we will rotate this segment through an angle of π. Because the segment is rotated by π, the result will lie in Z^2. Since the rotated segment will always lie on the other side of the support line from p, and never touch it except at p_i and p_j, the transformed polygon is indeed a SAPT. This process is illustrated in Figure 5.27.

Firstly, we look at the case where the support line is horizontal or vertical. Because the segment between p_i and p_j does not touch the support line, the steps from p_i to p_{i+1} and from p_{j−1} to p_j are at right angles to the line. Since the SAPT can only lie on one side of the support line, i and j must both be turns of p. If we rotate the segment, they are still turns. Therefore the result is a SAPT with the same number of turns as p. The rotated segment is on the other side of the support line to the rest of p, so if it intersected the rest of p before the rotation, the new SAPT has fewer self-intersections than p. Otherwise it has the same number of self-intersections as p.

We claim that in the latter case, the new SAPT has a diagonal support line. To show this, first assume without loss of generality that the support line is vertical and that p lies to the right of it. Since the two segments of p separated by i and j do not touch, one must completely enclose the other with respect to the support line, as shown in Figure 5.28. The enclosed segment must therefore have less ‘height’ (the vertical length from highest vertex to the lowest vertex). Therefore, when one of the segments is rotated, the heights of the

[Figure: panels (a) Turns = 10, self-intersections = 1, no diagonal support line; (b) Turns = 10, self-intersections = 1, diagonal support line; (c) Turns = 10, self-intersections = 0; (d) Turns = 4, self-intersections = 0.]

Fig. 5.27: Transforming a SAPT into a rectangle through pivots. There are 3 pivots between (c) and (d).

[Figure: panels (a) Before rotation with vertical support line; (b) After rotation with diagonal support line.]

Fig. 5.28: SAPT with vertical support line and non-intersecting segments. After rotation a diagonal support line can be drawn.

segments on either side of the support line are unequal. Thus either the lowest points of the SAPT on either side of the support line have different y-coordinate, or the highest points have different y-coordinate. In either case, there must exist a diagonal support line, so our claim is proved.

Next, we look at the case where the support line of p is diagonal. Again, both i and j must be turns of p, and we rotate the segment between them through an angle of π. If the rotated segment previously intersected the rest of p, we can use the same argument as above to show that the new SAPT resulting from the rotation has at most the same number of turns as p, but has fewer self-intersections than p.

The other possibility is that the segment does not intersect the rest of p. In this case, joining the rotated segment with the support line completely encloses the rest of p, or vice versa. Since both i and j are turns, the steps from p_{i−1} to p_i and from p_{j−1} to p_j must be identical, as must the steps from p_i to p_{i+1} and from p_j to p_{j+1}. This is illustrated in Figure 5.29. Therefore, when we rotate the segment, i and j will no longer be turns of p. So rotating the segment through π produces a SAPT with fewer turns than p and the same number of self-intersections as p.

So to reach q from p, we start by performing the following steps on p:

1. If p is a rectangle, stop.

2. If p has no diagonal support line, it must have a horizontal or vertical external support line. Then we can repeatedly pivot until p has a diagonal support line. This will eventually happen because each pivot either strictly reduces the number of self-intersections, or produces a diagonal support line, and p has a finite number of self-intersections.

3. Now p must have a diagonal support line, so we pivot so that p has either fewer turns or fewer self-intersections.

[Figure: panels (a) If one segment encloses the other, the steps must be equal; (b) If the segments cross each other, the steps may be unequal.]

Fig. 5.29: Comparing the steps from p_{i−1} to p_i and from p_{j−1} to p_j in a SAPT with a diagonal support line.

4. Return to step 1.

Because at each step p loses either turns or self-intersections, and does not gain either, this process will eventually end. At this point p must be a rectangle. We illustrated the entire process in Figure 5.27. Using the same steps, we can transform q into a rectangle. However, each transformation is its own inverse, so we can also transform that rectangle into q. So all we need to show now is that any rectangle can be transformed into any other.

This turns out to be fairly simple: since the rectangles must have the same length, either the width or the height of the desired rectangle must be smaller than that of the original. Suppose (without loss of generality) that it is the height. Then we rotate a segment containing the lower left corner so that the left-hand side now has the desired height, and so that the bottom side also has the same length as the left-hand side. By reflecting the segment from the top right corner to the vertex that was the bottom left corner, we achieve the rectangle of desired dimension. This process is shown in Figure 5.30.

5.9 Results

Having ensured that the pivot algorithm is valid, we can now actually calculate our estimates of the mean unknotting time. Using the algorithm above, we calculated data for polygon lengths up to 3000. The results are shown in Figure 5.31.

Interestingly, although our theoretical results indicate that the mean unknotting time is exponential in the length, our results do not seem to show this. In fact there seems to be more of a power law relationship. Our best guess as to why this is so is that the exponential relationship must only apply for extremely large SAPT lengths, and up to that point there is a power law. Since all our theoretical results are asymptotic in nature, this may well be the case.

[Figure: panels (a) Original rectangle; (b) Rectangle with rotated segment; (c) Final rectangle.]

Fig. 5.30: Transforming one rectangle to another via a rotation and a reflection.

[Figure: plot of mean unknotting time against SAPT length, for lengths up to 3000.]

Fig. 5.31: Mean unknotting time vs. length of generating SAPT.

[Figure: plot of ln(mean unknotting time) against ln n.]

Fig. 5.32: Log-log plot of mean unknotting time vs. length.

In Figure 5.32 we show a log-log plot of the data. By fitting lines to the data, we get the estimate

mean unknotting time ∼ α n^γ (5.90)

where n is the length of the SAPT and

α = 3.423 × 10^{−7} ± 7.14 × 10^{−8}, (5.91)
γ = 1.908 ± 0.027. (5.92)

However, it must be noted that these estimates are probably not very accurate, as the slope of the line seems to be increasing slightly as n becomes larger.
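A fit of this form takes only a few lines; in the following Python sketch the arrays lengths and times are hypothetical placeholders (the raw simulation data is not reproduced here):

    import numpy as np

    def power_law_fit(lengths, times):
        # least-squares fit of times ~ alpha * lengths**gamma on log-log axes
        gamma, ln_alpha = np.polyfit(np.log(lengths), np.log(times), 1)
        return np.exp(ln_alpha), gamma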

5.10 Conclusion

In this section, we have looked at the problem of mean unknotting times using a variety of methods. Firstly we looked at small (with few crossings), fixed embeddings, and generated some data for that. Then, using a more complicated approach, we used self-avoiding polygon trails to generate random knot embeddings. This also allows us to ‘measure’ the complexity of a knot. We proved theoretically that the mean unknotting time must be exponential in the length of the self-avoiding polygon trail, by relating the mean unknotting time to the model of a walk on an n-cube. However, when we used the pivot algorithm to generate SAPTs of length up to 3000, we found that the mean unknotting time up to that length seems to obey a power law.

To extend our research, it would be interesting to extend our generated numbers to see if an exponential relationship really does hold at higher lengths. We would also like to see if we can find better theoretical (or even practical) bounds on the growth constant (assuming it exists): at the moment all we know is that it is less than √2 (and more than 1).

A couple of other directions that we could take this line of enquiry along are to look at other possible ways of generating knots. Certainly using 3-dimensional self-avoiding polygons seems quite feasible, and would be a better (though more computationally intense) model of DNA untangling since DNA is a 3-dimensional object. Alternatively, we could look at other ways of unknotting the knot rather than just crossing reversals — perhaps we could use a combination of reversals and Reidemeister moves. It also seems possible that the DNA may have a separating force that partially untangles it, so the topoisomerase would be less likely to convert it into a more complex knot. Finally, if we wanted to be really experimental, it may be interesting to study actual DNA and see how closely our model relates to the real object, and what physical usefulness the time to unknot may have.


APPENDIX

A. GENERATING FUNCTIONS FOR WALKS IN A STRIP

In this appendix, we give the generating functions for p walkers (which are n-friendly) in a strip of width L, ordered by increasing L. We do not include generating functions which we have proved. A friendliness of ∞ denotes TK ∞-friendly walkers. If the polynomials become too long, we have expressed them as a sequence of numbers, the numbers being coefficients of increasing powers of x, with the last power of x written out for reference. If they become even longer, we have written the numerator and denominator out as a sequence — N stands for numerator and D for denominator.

p = 2

L = 5:
n  Generating function
0  (1 − x)/(1 − 2x − x^2 + x^3)
1  (1 − 2x^2 − x^3 + x^4)/(1 − 2x − 3x^2 + 4x^3 + x^4 − x^5)
2  (1 − 3x^3 − 2x^4 + 2x^5)/(1 − 2x − 3x^2 + 3x^3 + 2x^5 − 2x^6 + x^7)
3  (1 − 5x^4 − 4x^5 + 3x^6 − x^7 + x^8)/(1 − 2x − 3x^2 + 3x^3 − 4x^4 + 6x^5 + 4x^6 − 3x^7 + 3x^8 − 3x^9)
4  (1 + x^4 − 9x^5 − 7x^6 + gx^7 − x^8 + 5x^9 − 3x^10)/(1 − 2x − 3x^2 + 3x^3 − 4x^4 + 2x^5 + 3x^6 + 7x^7 − 6x^8 + 4x^9 − 8x^10 + 6x^11)
5  (1,0,0,0,1,1,−16,−13,10,−3,7,−9,8x^12)/(1,−2,−3,3,−4,2,−8,15,13,−10,9,−11,17,−14x^13)
6  (1,0,0,0,1,1,3,−29,−23,19,−4,15,−12,22,−17x^14)/(1,−2,−3,3,−4,2,−8,0,15,23,−19,14,−24,23,−39,31x^15)
7  (1,0,0,0,1,1,3,4,−52,−42,33,−9,24,−27,30,−48,39x^16)/(1,−2,−3,3,−4,2,−8,0,−18,42,42,−33,28,−38,51,−53,87,−70x^17)
∞  (1 − x − 2x^2 + x^3 + x^4)/(1 − 3x − 3x^2 + 11x^3 − 3x^4 − 3x^5 + x^6)

L = 6:
n  Generating function
0  (1 + x − 5x^2 − 2x^3 + 4x^4)/(1 − 8x^2 + 8x^4)
1  (1 + 2x − 6x^2 − 11x^3 + 7x^4 + 13x^5 − x^6 − 4x^7)/(1 − 11x^2 + 27x^4 − 20x^6 + 4x^8)
2  (1 + 2x − 4x^2 − 8x^3 − 9x^4 + x^5 + 18x^6 − 3x^7 + 5x^8 + 9x^9 + 6x^10)/(1 − 11x^2 + 21x^4 − x^6 + 3x^8 − 9x^10)
3  (1,2,−3,−3,−12,−28,5,5,30,50,18,18x^11)/(1,0,−10,0,4,0,49,0,4,0,−50,0,−18x^12)
4  (1,2,−4,−5,−3,−14,−26,−1,58,26,24,−96,46,100,60,60,36x^16)/(1,0,−11,0,14,0,25,0,11,0,−36,0,108,0,−100,0,−60x^16)
∞  (1 + 2x − 8x^2 − 13x^3 + 15x^4 + 20x^5 − 5x^7)/(1 − 15x^2 + 60x^4 − 75x^6 + 25x^8)

L = 7:
n  Generating function
0  (1 − 2x − 3x^2 + 5x^3 + x^4 − x^5)/(1 − 3x − 3x^2 + 11x^3 − 3x^4 − 3x^5 + x^6)
1  (1,−1,−7,3,15,−2,−9,0,x^8)/(1,−3,−6,19,5,−26,0,10,0,−x^9)
2  (1,−1,−5,−1,2,19,2,−8,1,3,1,−2x^11)/(1,−3,−6,18,3,−11,−7,−2,6,−7,−2,1,x^12)
3  (1,−1,−5,2,−6,3,40,2,4,−2,−36,−1,−7,−1,5x^14)/(1,−3,−6,18,−3,3,17,−54,0,−1,−13,39,1,2,2x^15)
4  (1,−1,−5,2,0,−8,3,61,13,4,−7,−21,36,17,14,22,2,−14x^17)/(1,−3,−6,18,−3,−1,14,−9,−30,−15,−10,9,40,−70,−12,0,−20,0,10x^18)
∞  (1,−2,−7,12,12,−15,−3,1,x^8)/(1,4,6,−35,8,67,−37,−28,13,3,−x^10)

L = 8:
n  Generating function
0  (1 + x − 12x^2 − 9x^3 + 35x^4 + 20x^5 − 20x^6 − 10x^7)/(1 − 15x^2 + 60x^4 − 75x^6 + 25x^8)
1  (1,1,−15,−12,67,44,−108,−61,75,19,−17x^10)/(1,−1,−18,18,97,−97,−186,186,105,−105,−17,17x^11)
2  (1,2,−12,−24,21,71,90,−32,−148,27,−2,−86,−40,20,36,−16x^15)/(1,0,−19,0,107,0,−195,0,54,0,−44,0,104,0,−16,0,16x^16)
∞  (1 + 2x − 15x^2 − 27x^3 + 60x^4 + 99x^5 − 43x^6 − 75x^7 + 3x^8 + 9x^9)/(1 − 22x^2 + 154x^4 − 396x^6 + 297x^8 − 54x^10)

p = 3

L = 4:
n  Generating function
1  (1 + 2x + x^2)/(1 − 3x^2)
2  (1 + 2x + 5x^2 − 5x^4)/(1 − 5x^2 + x^4)
3  (1 + 2x + 5x^2 + 6x^3 + 3x^4 + 15x^6)/(1 − 5x^2 − 9x^4 − 3x^6)
4  (1 + 2x + 5x^2 + 6x^3 + 15x^4 − 15x^6 − 45x^8)/(1 − 5x^2 − 15x^4 + 3x^6 + 9x^8)
5  (1,2,5,6,15,18,9,0,45,0,135x^10)/(1,0,−5,0,−15,0,−27,0,−9,0,−27x^10)
6  (1,2,5,6,15,18,45,0,−45,0,−135,0,−405x^12)/(1,0,−5,0,−15,0,−45,0,9,0,27,0,81x^12)
7  (1,2,5,6,15,18,45,54,27,0,135,0,405,0,1215x^14)/(1,0,−5,0,−15,0,−45,0,−81,0,−27,0,−81,0,−243x^14)
∞  (1 + 2x + 2x^2)/(1 − 8x^2)

L = 5:
n  Generating function
0  1/(1 − x)
1  (1 + 3x + x^2 − x^3)/(1 − x − 4x^2 + x^3)
2  (1 + 2x + 6x^2 − 2x^3 + 6x^4 + 6x^5 − 5x^6)/(1 − 2x − 3x^2 − 5x^5 + 3x^7)
3  (1,2,6,10,−9,−11,−29,−35,34,−29,15,0,3x^12)/(1,−2,−3,−5,−12,21,4,35,5,−13,7,−15,0,−3x^13)
4  (1,2,6,10,30,−10,30,28,158,154,−169,122,−112,15,−14,3x^15)/(1,−2,−3,−5,−12,−1,−20,−17,−39,−153,1,58,−28,70,0,14x^15)
5  (1,2,6,10,30,51,−45,−56,−142,−179,−796,−794,882,−700,616,−451,266,70,0,14x^20)/(1,−2,−3,−5,−12,−27,−58,106,14,182,172,802,19,−285,173,−309,141,−266,0,−70,0,−14x^21)
∞  (1 + 2x − 2x^3 − x^4 + x^5)/(1 − 2x − 9x^2 + 7x^3 + 11x^4 − 7x^5 − 2x^6 + x^7)


n = 6 n = 7

Power N D N D

0 1 1 1 1

1 2 −2 2 −2

2 6 −3 6 −3

3 10 −5 10 −5

4 30 −12 30 −12

5 51 −27 51 −27

6 151 −58 151 −58

7 −50 −6 258 −138

8 152 −104 −227 −291

9 142 −82 −283 535

10 803 −210 −714 67

11 780 −754 −905 922

12 4097 −877 −4004 847

13 3920 −4008 −4015 4067

14 −4438 5 −20681 4369

15 3513 1410 −19848 20327

16 −3220 −839 22498 66

17 2328 1542 −18018 −7089

18 −2002 −700 16412 4359

19 323 1330 −13014 −7635

20 −350 0 10697 4065

21 85 350 −7842 −5951

22 −70 0 4620 2661

23 17 70 0 −4620

24 1330 0

25 0 −1330

26 350 0

27 0 −350

28 70 0

29 −70

L = 6:
n  Generating function
0  (1 + x)/(1 − 4x^2)
1  (1 + 4x + x^2 − 17x^3 − 8x^4 + 9x^5 + 2x^6)/(1 − 13x^2 + 26x^4 − 10x^6)
2  (1,4,8,−6,−54,29,58,−65,−5,−23,19,−37,−14,2,2x^14)/(1,0,−16,0,20,0,−26,0,81,0,24,0,38,0,−4x^14)
3  (1,4,8,14,−19,−144,−32,−254,−168,796,32,1310,768,−480,408,−1296,−328,−512,−224,−32,−32x^20)/(1,0,−16,0,−32,0,242,0,406,0,−932,0,−1652,0,320,0,1360,0,608,0,64x^20)
∞  (1 + 4x − 2x^2 − 26x^3 − 21x^4 + 46x^5 + 63x^6 − 17x^7 − 32x^8)/(1 − 26x^2 + 138x^4 − 235x^6 + 104x^8)

n = 4 n = 5

Power N D N D

0 1 1 1 1

1 4 0 4 0

2 8 −16 8 −16

3 14 0 14 0

4 35 −46 35 −46

5 −46 0 90 0

6 −390 134 −127 −195

7 135 0 −965 0

8 −254 71 −228 1532

9 1394 0 −1764 0

10 2182 −1542 1366 2514

11 −3245 0 −9408 0

12 994 3582 −9676 12812

13 −5984 0 27966 0

14 −1104 7112 −1676 −30556

15 −6304 0 51064 0

16 −2128 7280 −2048 −56352

17 −6448 0 120288 0

18 −336 6464 76032 −143424

19 −11872 0 −4720 0

20 −2848 12032 59904 −12256

21 −7872 0 −87648 0

22 −2752 8064 55040 72512

23 768 0 −244608 0

24 768 −1536 −55808 253184

25 512 0 −177664 0

26 512 −1024 −51200 190464

27 −108544 0

28 −54272 129024

29 −12288 0

30 −12288 24576

31 −8192 0

32 −8192 16384


L = 7:
n  Generating function
0  (1 − x − x^2)/(1 − 2x − 3x^2 + x^3 + x^4)
1  (1 + 3x − 6x^2 − 31x^3 − 16x^4 + 53x^5 + 61x^6 + 9x^7 − 8x^8 − 2x^9)/(1 − x − 16x^2 + x^3 + 67x^4 + 26x^5 − 89x^6 − 59x^7 + 18x^8 + 15x^9 − x^10 − x^11)
2  (1,1,−1,−26,−48,87,3,−57,299,110,205,−204,71,255,−37,−95,−105,37,8,4x^19)/(1,−3,−13,26,19,−17,16,11,116,−308,115,−27,141,−199,−126,124,40,−56,−24,16x^19)
∞  (1,1,−16,−23,69,159,−69,−353,−57,258,85,−61,−21,4,x^14)/(1,−3,−28,66,237,−470,−743,1229,942,−1220,−576,510,175,−84,−24,4,x^16)

n = 3 n = 4

Power N D N D

0 1 1 1 1

1 1 −3 1 −3

2 −1 −13 −1 −13

3 −8 21 −8 21

4 −89 −27 −15 −27

5 −135 148 −183 94

6 178 342 −406 148

7 224 −559 630 −5

8 1469 606 35 488

9 2483 −2546 2629 −1422

10 −1961 −2611 −2142 822

11 2324 3034 −3771 2506

12 −9618 −6143 21727 8495

13 −10239 16062 −11215 −17599

14 −1172 5434 76829 28979

15 −20513 3120 24161 −85876

16 13100 20078 116254 54073

17 2089 −28881 −2887 −105037

18 11027 2497 81115 73192

19 30472 −22810 −233760 −63844

20 8569 −20264 49640 271383

21 20175 6510 −79170 −225145

22 4853 −10389 35275 179897

23 −371 10588 268826 −321016

24 −154 1579 −220229 −166512

25 −3328 1418 −206543 268740

26 −60 1256 −550056 49684

27 −544 −840 −554926 217263

28 0 184 −654009 352099

29 −16 −240 −29376 −311864

30 −100020 −115723

31 111555 −61200

32 15692 −70650

33 46597 44940

34 −1800 −10650

35 6750 13500

36 0 0

37 450 900

L = 8:
n  Generating function
0  (1 + x − 11x^2 − 5x^3 + 15x^4)/(1 − 15x^2 + 25x^4)
1  (1,4,−21,−94,114,688,−156,−2223,−214,3572,675,−2910,−534,1140,154,−169,−11x^16)/(1,0,−35,0,386,0,−1916,0,4832,0,−6517,0,4692,0,−1688,0,237x^16)
∞  (1,4,−34,−137,352,1623,−1077,−8561,−1668,22057,14120,−28171,−25166,16568,17524,−3346,−4004x^16)/(1,0,−58,0,1159,0,−10799,0,51796,0,−132128,0,176025,0,−110782,0,24206x^16)


n = 2 n = 3

Power N D N D

0 1 1 1 1

1 4 0 4 0

2 −13 −37 −13 −37

3 −76 0 −53 0

4 −133 351 −142 241

5 274 0 −498 0

6 1389 −955 1044 1572

7 −349 0 4809 0

8 −1487 771 5688 −11604

9 3913 0 16327 0

10 3006 −5477 −7248 −34037

11 −3630 0 −92937 0

12 −5140 7958 −78003 172666

13 546 0 −242832 0

14 5021 −3186 −93126 414538

15 −7725 0 648205 0

16 −7891 10643 328970 −1029892

17 8944 0 1746475 0

18 9691 −16151 986158 −2639353

19 −4807 0 −1465314 0

20 −7413 11301 273635 2115818

21 3680 0 −5428040 0

22 2771 −7072 −2114300 7829198

23 −883 0 −386603 0

24 −730 1608 −3289378 812984

25 21 0 6307857 0

26 66 −63 −919168 −9170849

27 3537715 0

28 2944189 −5804794

29 −1640925 0

30 2988105 2121920

31 −2104845 0

32 331875 3320610

33 −708475 0

34 −507325 1236275

35 −90125 0

36 −159375 181125

37 −2625 0

38 −10500 7875

L = 9:
n = 0 n = 1 n = 2 n = ∞

Power N D N D N D N D

0 1 1 1 1 1 1 1 1

1 −2 −3 3 −1 0 −4 0 −4

2 −12 −13 −30 −40 −26 −34 −54 −62

3 13 28 −116 12 −38 122 −20 229

4 33 37 239 579 128 340 1118 1485

5 −23 −67 1372 102 772 −1109 750 −5118

6 −17 −8 −202 −3999 445 −996 −11634 −18070

7 10 35 −6842 −2078 −4685 3104 −10838 58340

8 2 −6 −4367 14331 −346 223 67236 124283

9 −1 −4 15455 11082 9636 −3448 79393 −375807

10 1 17372 −27003 −13315 −3953 −224419 −507751

11 −14296 −26466 −12000 24317 −322499 1434717

12 −24557 25581 16859 8996 435069 1252071

13 3026 30602 54844 −49788 745989 −3307957

14 14539 −11009 −31388 12605 −487813 −1842362

15 1328 −16980 −50641 38255 −983666 4625949

16 −4167 2107 95041 −6600 330328 1547921

17 −620 4720 81520 −162781 738659 −3921431

18 597 −173 −53011 −18756 −155469 −675061

19 79 −653 −190090 146018 −313901 2009527

20 −40 5 245865 −68466 54901 107658

21 −3 42 99987 −184456 73732 −614077

22 1 0 −125590 4143 −13232 19391

23 −1 −278821 401532 −8993 107839

24 262295 −138464 1820 −9704

25 76748 −145821 486 −10111

26 −251842 94478 −114 1267

27 −6268 190185 −7 446

28 100611 −137123 2 −63

29 30793 −72001 0 −7

30 −90981 76206 0 1

31 10179 26358

32 28836 −32158

33 −7082 −6321

34 −5494 7560

35 1309 510

36 745 −1385

37 −325 −125

38 50 125


p = 3, b = 3

L = 4:

n Generating function

1 (1+2x+x^2) / (1−3x^2)

2 (1+2x+7x^2+4x^3−4x^4+2x^5+2x^6) / (1−5x^2−2x^4)

3 (1,2,6,18,14,−8,27,−32,−31,8,−22,12,19,−4,−3x^14) / (1,0,−6,0,−18,0,9,0,19,0,−2,0,−5,0,x^14)

4 (1,2,5,16,53,32,−53,66,−116,160,268,−8,−95,64,15,8,−16,6,−24,−22x^20) / (1,0,−7,0,−23,0,−17,0,−32,0,−84,0,21,0,11,0,8,0,−2,0,2x^20)

∞ (1+2x−3x^2+5x^4) / (1−15x^2+25x^4)

n = 5

Power N D

1 1 1

2 2 0

3 6 −6

4 18 0

5 52 −36

6 148 0

7 159 −165

8 64 0

9 367 −67

10 −438 0

11 1118 304

12 −2386 0

13 −1221 1621

14 −2168 0

15 −2898 1500

16 −116 0

17 −6727 777

18 6412 0

19 1982 −2514

20 5088 0

21 3795 −2021

22 1584 0

23 8511 −1525

24 −6716 0

25 −2797 1879

26 −3692 0

27 −2157 1007

28 −1056 0

29 −3724 740

30 2928 0

31 1764 −732

32 816 0

33 480 −192

34 144 0

35 480 −96

36 −432 0

37 −324 108

L = 5:

n Generating function

1 (1+3x+x^2−x^3) / (1−x−4x^2+x^4)

2 (1+2x+8x^2+10x^4+16x^5+3x^6+x^7+5x^8) / (1−2x−3x^2−5x^4−7x^5−6x^6+3x^7+2x^8+x^9)

3 (1,2,8,22,−4,−5,−76,−251,−222,108,276,19,244,39,−96,−30,−69,−11,15,5,6,0,−x^23) / (1,−2,−3,−6,−28,34,18,106,149,−65,−19,−258,−227,24,−12,151,123,4,14,−23,−20,−1,−2,1,x^24)

∞ (1+x−6x^2−3x^3+14x^4+2x^5−8x^6+x^8) / (1−3x−13x^2+28x^3+37x^4−67x^5−8x^6+35x^7−6x^8−4x^9+x^10)


n = 4

Power N D

1 1 1

2 2 −2

3 8 −3

4 20 −8

5 77 −24

6 36 10

7 145 −78

8 136 −36

9 646 −152

10 2043 −481

11 464 −845

12 2918 191

13 1442 −508

14 606 902

15 1414 1650

16 4630 813

17 4723 −709

18 2902 −233

19 516 167

20 −1174 1882

21 −1743 2057

22 −839 590

23 −95 −574

24 −605 −604

25 −1041 −340

26 −626 −34

27 −288 15

28 −188 −111

29 −40 −28

30 11 −182

31 217 −155

32 32 −9

33 −27 35

34 25 64

35 19 31

36 9 6

37 −18 0

38 −6 −6

L = 6:

n Generating function

1 (1+4x+x^2−17x^3−8x^4+9x^5+2x^6) / (1−13x^2+26x^4−10x^6)

2 (1,4,10,0,−65,17,41,−93,−16,−67,82,−122,73,−67,28,−20,−2,−2,−2x^18) / (1,0,−16,0,15,0,−12,0,112,0,52,0,106,0,36,0,6x^16)

∞ (1+4x−10x^2−49x^3+7x^4+188x^5+156x^6−252x^7−324x^8+96x^9+144x^10) / (1−36x^2+363x^4−1408x^6+2376x^8−1728x^10+432x^12)


n = 3

Power N D

1 1 1

2 4 0

3 9 −17

4 27 0

5 −20 −43

6 −251 0

7 −155 451

8 −740 0

9 −418 909

10 2946 0

11 147 −3623

12 7260 0

13 4303 −7442

14 −10952 0

15 4581 10201

16 −25557 0

17 −8416 21562

18 13025 0

19 −8808 −10439

20 28504 0

21 8055 −20564

22 −8756 0

23 5215 6433

24 −12753 0

25 −4015 7487

26 3182 0

27 −671 −1878

28 2372 0

29 726 −1180

30 −522 0

31 −116 272

32 −151 0

33 6 58

34 30 0

35 −12

p = 4

L = 6:

n Generating function

1 (1+4x+4x^2−5x^3−2x^4+2x^5) / (1−9x^2+9x^4)

2 (1,4,24,19,−18,79,−114,45,−559,185,240,−82,250,−90,12,−4x^15) / (1,0,−13,0,−36,0,−68,0,151,0,−213,0,80,0,62,0,4x^16)

∞ (1+4x−x^2−25x^3−31x^4+33x^5+80x^6−9x^7−60x^8) / (1−38x^2+282x^4−695x^6+476x^8)

n = 3

Power N D

0 1 1

1 4 0

2 21 −16

3 63 0

4 59 −134

5 −157 0

6 34 321

7 −1917 0

8 −158 3178

9 64 0

10 −9539 −91

11 20018 0

12 −8395 −24328

13 6941 0

14 72552 −22765

15 −78053 0

16 7880 83020

17 −41030 0

18 −178078 82740

19 138930 0

20 59274 −141724

21 55378 0

22 141576 −95744

23 −106088 0

24 −77128 113752

25 −25976 0

26 −25064 37320

27 30976 0

28 19200 −34304

29 5056 0

30 −1856 −5056

31 −3072 0

32 3072


L = 7:

Power  N(n=1)  D(n=1)  N(n=2)  D(n=2)

0 1 1 1 1

1 7 −1 5 −3

2 6 −13 24 −13

3 −18 −1 −28 6

4 −15 27 −43 −2

5 7 10 86 62

6 3 −9 −3 −115

7 −1 −2 363 37

8 1 −1701 572

9 275 828

10 2154 −400

11 −999 −513

12 −408 1169

13 −593 34

14 2320 −1215

15 835 −1071

16 −1065 550

17 −1103 1211

18 −279 339

19 515 −207

20 153 −226

21 −14 17

22 −19 34

23 3 −10

24 7 −8

25 −5 −1

26 −1 4


B. CRITICAL POINTS FOR WALKS IN A STRIP

In this appendix, we present estimates for the smallest zeros of the denominators of the generating functions (i.e. the inverses of the growth constants) for p walkers which are n-friendly in a strip of width L. For the generating functions given in Appendix A, the zeros can of course be calculated exactly, so we do not cover those cases.
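Each entry below is the smallest positive real zero of the corresponding denominator polynomial. As an illustration of the computation (a minimal sketch assuming only Python and numpy, not the code used to produce these tables), the following finds the smallest such zero of the denominator 1 − 15x^2 + 25x^4 appearing in Appendix A above, for which the zero is also known in closed form, x_c^2 = (3 − √5)/10:

    import numpy as np

    # Denominator 1 - 15x^2 + 25x^4, with coefficients listed from the
    # constant term upwards, as in the Appendix A tables.
    denom = [1, 0, -15, 0, 25]

    # numpy.roots expects coefficients from the highest power downwards.
    roots = np.roots(denom[::-1])

    # Keep the numerically real roots and take the smallest positive one;
    # its reciprocal is the growth constant of the corresponding model.
    real = roots[abs(roots.imag) < 1e-12].real
    x_c = min(r for r in real if r > 0)

    print("smallest zero      x_c =", x_c)       # 0.27639320225...
    print("growth constant  1/x_c =", 1 / x_c)   # 3.61803398874...

    # Exact check for this particular denominator: x_c^2 = (3 - sqrt(5))/10.
    assert abs(x_c**2 - (3 - np.sqrt(5)) / 10) < 1e-12

Double precision reproduces only the first fifteen or so digits; the twenty-digit entries below would require repeating the computation in arbitrary-precision arithmetic (for example with mpmath's polyroots), given the exact denominator.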

p = 2:

L\n 1 2 3 4

8 0.30347319989618537624 0.30142185274497455475

9 0.30192618936654659137 0.29702055450997588652 0.29422341381276333485 0.29267918993779732224

10 0.29312083987297982411 0.28940194905165598034 0.28724719368839763074 0.28605255682254553489

11 0.28642991712507078632 0.28353808916517951117 0.28184030751956644744 0.28089542725627648441

12 0.28121549892077712641 0.27891908095845673192 0.27755597350158580627 0.27679476283940804327

13 0.27706625397786022061 0.27521019621608846973 0.27409821694919907048

14 0.27370642126032907132 0.27218354806165794208 0.27126392933441121365

15 0.27094504064109384015 0.26967921573126446065 0.26890957774642922121

16 0.26864620637999163090 0.26758205900680259802

17 0.26580733411018022250

18 0.26429146550349245687

p = 3:

L\n 1 2

10 0.18903356290805598415 0.17817366056868953666

11 0.17826434208586277046 0.16995470630314957058

12 0.17010060090463156316 0.16358627002722157364

13 0.16374544783192505097

14 0.15868919795929727940

15 0.15459250988416533961

16 0.15122172621583856124
