1 Factorisation of Integers


Definition N = {1, 2, 3, . . .} are the natural numbers.

Definition Z = {. . . , −2, −1, 0, 1, 2, . . .} are the integers.

Closed under the binary operations +, ×, −

Definition If α ∈ R then ⌊α⌋ is the greatest integer which is less than or equal to α.

Ex ⌊3⌋ = 3, ⌊√2⌋ = 1, ⌊−π⌋ = −4

Then ⌊α⌋ ≤ α < ⌊α⌋ + 1

Proposition 1 If a and b are two integers with b > 0 then there are integers q and r with 0 ≤ r < b and a = qb + r.

Proof. Let α = a/b.

⇒ 0 ≤ a/b − ⌊a/b⌋ < 1

⇒ 0 ≤ a − b⌊a/b⌋ < b

so if r = a − b⌊a/b⌋ then a = qb + r with q = ⌊a/b⌋. □
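As a quick check of Proposition 1, the sketch below computes q = ⌊a/b⌋ and r = a − bq exactly as in the proof. The function name `divide` is illustrative and not part of the notes; it relies on Python's `//` being floor division, which matters when a is negative.

```python
def divide(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a = q*b + r and 0 <= r < b, assuming b > 0."""
    if b <= 0:
        raise ValueError("b must be positive")
    q = a // b      # floor division: q = floor(a / b)
    r = a - b * q   # remainder as in the proof: r = a - b*floor(a/b)
    return q, r


print(divide(17, 5))    # (3, 2)
print(divide(-17, 5))   # (-4, 3)
```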

Definition If a = cb (a, b, c ∈ Z) we say a is a multiple of b, or b divides a and write b | a.

Proposition 2 If b ≠ 0, c ≠ 0 then

(a) b | a and c | b ⇒ c | a
(b) b | a ⇒ bc | ac
(c) c | d and c | e ⇒ ∀m, n ∈ Z, c | dm + en.

Proposition 3 Let a, b > 0. If b | a and b ≠ a then b < a.

Definition If b | a and b ≠ 1 or a then we say b is a proper divisor of a. If b does not divide a write b ∤ a.

Definition P = {p ∈ N : p > 1 and the only divisors of p are 1 and p} are the prime numbers. Then N \ (P ∪ {1}) are the composite numbers.

P = {2, 3, 5, 7, 11, 13, 17, 19, 23, . . .}.

Theorem 1 Every n > 1, n ∈ N, is a product of prime numbers.

Proof. If n ∈ P we are done. If n is not prime, let q_1 be the least proper divisor of n. Then q_1 is prime (since otherwise, by Prop 3, it would have a smaller proper divisor). Let n = q_1 n_1, 1 < n_1 < n. If n_1 is prime we are done. If not n_1 = q_2 n_2, 1 < n_2 < n_1 < n. This process must terminate in less than n steps. Hence n = q_1 q_2 . . . q_s with s < n. □

Ex 10725 = 3 · 5 · 5 · 11 · 13

In a prime factorization of n arrange the primes so that p_1 < p_2 < · · · < p_k, with exponents α_i ∈ N, 1 ≤ i ≤ k, so

n = p_1^{α_1} p_2^{α_2} · · · p_k^{α_k} = ∏_{j=1}^{k} p_j^{α_j}

is the standard factorisation of n.
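The standard factorisation can be computed by repeatedly splitting off the least divisor, as in the proof of Theorem 1. The following is a minimal trial-division sketch; the name `standard_factorisation` is an assumption made for illustration, and the method is only practical for small n.

```python
def standard_factorisation(n: int) -> list[tuple[int, int]]:
    """Return [(p_1, a_1), ..., (p_k, a_k)] with p_1 < ... < p_k and n = prod p_i**a_i."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            exponent = 0
            while n % p == 0:   # strip every copy of p
                n //= p
                exponent += 1
            factors.append((p, exponent))
        p += 1
    if n > 1:                   # whatever is left is a prime larger than sqrt(original n)
        factors.append((n, 1))
    return factors


print(standard_factorisation(10725))  # [(3, 1), (5, 2), (11, 1), (13, 1)]
```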

Prime Numbers

We can use the sieve of Eratosthenes to list the primes 2 ≤ p ≤ N.

If n ≤ N and n is not prime, then n must be divisible by a prime p ≤ √N (if p_1 > √N and p_2 > √N ⇒ p_1 p_2 > N).

List all of the integers between 2 and N

2, 3, 4, 5, . . . , N

successively remove

(i) 4, 6, 8, 10, . . . even integers from 2^2 on
(ii) 9, 15, 21, 27, . . . multiples of 3 from 3^2 on
(iii) 25, 35, 55, 65, . . . multiples of 5 from 5^2 on

etc.

i.e. remove all integers which are proper multiples of a prime p ≤ √N. We are left with all primes up to N.

Ex N = 16, √N = 4

{2, 3, 5, 7, 11, 13} remain after 4, 6, 8, 9, 10, 12, 14, 15, 16 are struck out.
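The sieve described above can be sketched as follows. The function name `primes_up_to` is illustrative; for each prime p ≤ √N the striking-out starts at p^2, as in steps (i) to (iii).

```python
from math import isqrt


def primes_up_to(N: int) -> list[int]:
    """List the primes 2 <= p <= N with the sieve of Eratosthenes."""
    if N < 2:
        return []
    is_prime = [True] * (N + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, isqrt(N) + 1):
        if is_prime[p]:
            # strike out p^2, p^2 + p, ...; smaller multiples were removed earlier
            for multiple in range(p * p, N + 1, p):
                is_prime[multiple] = False
    return [n for n in range(2, N + 1) if is_prime[n]]


print(primes_up_to(16))  # [2, 3, 5, 7, 11, 13]
```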

Theorem 2 |P| = ∞, i.e. there are an infinite number of primes.

Proof. Suppose P is finite (???). Let P = {p_1, p_2, . . . , p_n} with p_1 < p_2 < · · · < p_n and let q = ∏_{j=1}^{n} p_j + 1. Then q > p_j ∀j ⇒ q ∉ P so q is composite and, by Theorem 1, some p_i | q. But p_i | q ⇒ p_i | q − ∏_{j=1}^{n} p_j = 1 ⇒ p_i = 1 which is false (!!!). Hence |P| = ∞. □

How many primes are there?

Note:

∑_{n=1}^{∞} 1/n = ∞

∑_{n=1}^{∞} 1/n^2 = π^2/6 < ∞.

We can show

∑_{j=1}^{∞} 1/p_j = ∞

so the primes are denser than the squares.

If x > 0, let S(x) = #{n ∈ N : n^2 ≤ x}. Then S(x) = ⌊√x⌋. We can show

π(x) = #{p ∈ P : p ≤ x} ∼ x / log(x).

Definition A modulus is a set of integers closed under ±. The zero modulus is just {0}. If a ∈ Z then M = {na : n ∈ Z} is a modulus.

Proposition 4 If M is a modulus with a, b ∈ M and m, n ∈ Z then ma + nb ∈ M.

Proof. a ∈ M ⇒ a + a = 2a ∈ M ⇒ 2a + a = 3a ∈ M etc. so ma ∈ M and so is nb, thus ma + nb ∈ M. □

Proposition 5 If M ≠ {0} is a modulus, it is the set of multiples of a fixed positive integer.

Proof. Let d be the least positive integer in M.

Claim: every element of M is a multiple of d. If not (???) let n ∈ M have d ∤ n. Then n = dq + r with 1 ≤ r < d. But r = n − dq ∈ M, contradicting the minimality of d (!!!). □

Definition Let a, b ∈ Z and let M = {ma + nb : m, n ∈ Z}. Then M is generated by d in that M = {nd : n ∈ Z}. We call d the greatest common divisor or GCD of a and b, and write (a, b) = d.

Proposition 6
(i) ∃x, y ∈ Z so ax + by = (a, b)
(ii) ∀x, y ∈ Z, (a, b) | ax + by
(iii) If e | a and e | b then e | (a, b)

Definition If (a, b) = 1 we say a and b are coprime.

Ex The GCD (greatest common divisor) is normally computed using the Euclidean Algorithm. From Proposition 5: (a = 323, b = 221)

323 = 221 · 1 + 102 so 102 ∈ M
221 = 102 · 2 + 17 so 17 ∈ M
102 = 17 · 6 + 0

so 17 is the least positive integer in M ⇒ (323, 221) = 17. Reading back:

17 = 221 − 2 · 102
   = 221 − 2 · (323 − 221)
   = 3 · 221 − 2 · 323

so (a, b) = xa + yb ⇒ x = −2, y = 3.
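The "reading back" step can be carried along with the divisions instead of being done afterwards. The sketch below is one standard way to do this (the extended Euclidean algorithm); the name `ext_gcd` and the variable names are illustrative, not taken from the notes.

```python
def ext_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, x, y) with a*x + b*y = g = (a, b), for a, b >= 0."""
    old_r, r = a, b      # successive remainders
    old_x, x = 1, 0      # invariant: a*old_x + b*old_y == old_r
    old_y, y = 0, 1      # invariant: a*x + b*y == r
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


print(ext_gcd(323, 221))  # (17, -2, 3)
```

For a = 323, b = 221 this reproduces x = −2, y = 3 from the worked example.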

Proposition 7 If p ∈ P and p | ab then p | a or p | b.

Proof. If p ∤ a then (a, p) = 1. By Prop 6(i) ∃x, y ∈ Z so

xa + yp = 1 ⇒ xab + ybp = b

But p | ab so ab = qp. Hence (xq + yb)p = b so p | b. □

Proposition 8 If c > 0 and (a, b) = d then (ac, bc) = dc.

Proof. ∃x, y ∈ Z so

xa + yb = d

⇒ x(ac) + y(bc) = dc

⇒ (ac, bc) | dc. Also d | a ⇒ cd | ca (and similarly cd | cb) ⇒ dc | (ac, bc). Hence dc = (ac, bc). □

Theorem 3 (Fundamental Theorem of Arithmetic) The standard factorisation of a number n ∈ N is unique.

Proof. If p | ab · · · m, by Proposition 7, p must divide one of the factors. If each of these is prime, then p must be one of them. If n = p_1^{α_1} · · · p_i^{α_i} = q_1^{β_1} · · · q_j^{β_j} are two standard factorizations of n, each p must be a q and each q a p. Hence i = j = k, say. Since p_1 < p_2 < · · · < p_k and q_1 < q_2 < · · · < q_k, p_ℓ = q_ℓ for 1 ≤ ℓ ≤ k. If β_1 < α_1, divide n by p_1^{β_1} to get p_1^{α_1−β_1} p_2^{α_2} · · · = p_2^{β_2} · · ·, which is impossible since p_1 divides the left side but not the right ⇒ α_1 = β_1 etc. □

Proposition 9 Let a, b ∈ N have non-standard factorisations

a = ∏_{j=1}^{m} p_j^{α_j} and b = ∏_{j=1}^{m} p_j^{β_j}

with α_j ≥ 0, β_j ≥ 0 then

(a, b) = ∏_{j=1}^{m} p_j^{min(α_j, β_j)}.

Ex
a = 2^2 3^4 5^1
b = 2^1 3^0 5^1
⇒ (a, b) = 2^1 3^0 5^1

Definition Let a, b ∈ Z+ = {0, 1, 2, . . .} = N ∪ {0}. The least common multiple or LCM of a and b is the smallest common multiple of a and b and is written {a, b}.

Ex {3, 4} = 12

Proposition 10 With the same notation as for Proposition 9,

{a, b} = ∏_{j=1}^{m} p_j^{max(α_j, β_j)}.

Proposition 11 Any common multiple of a and b is a multiple of the least common multiple.

Proposition 12 {a, b} (a, b) = ab

Proof.

LHS = ∏_{j=1}^{m} p_j^{max(α_j, β_j) + min(α_j, β_j)}.

But ∀x, y max(x, y) + min(x, y) = x + y. Hence

LHS = ∏_{j=1}^{m} p_j^{α_j + β_j} = ∏_{j=1}^{m} p_j^{α_j} · ∏_{j=1}^{m} p_j^{β_j} = ab.

Alternative Characterisation of the GCD

By Proposition 6 (ii), (a, b) | ax + by.

Let x = 1, y = 0 ⇒ (a, b) | a.
Let x = 0, y = 1 ⇒ (a, b) | b.

So g = (a, b) is a common divisor of a and b. By Proposition 6 (iii), if e | a and e | b then e | g i.e. g is divisible by every common divisor. Hence it is the greatest. This property: “being a common divisor divisible by every common divisor” characterises the GCD up to sign.

Proof. If g_1 and g_2 satisfy this property, then g_1 and g_2 are both common divisors with g_1 | g_2 and g_2 | g_1. Hence g_2 = αg_1 = αβg_2 ⇒ αβ = 1 if g_2 ≠ 0. Hence α = ±1. So g_1 = ±g_2. The GCD, so defined by the above property, is made unique by fixing the sign, g > 0. □

Ex
divisors of 12 = {±1, ±2, ±3, ±4, ±6, ±12} = D_12
divisors of 18 = {±1, ±2, ±3, ±6, ±9, ±18} = D_18
common divisors = {±1, ±2, ±3, ±6} = D_12 ∩ D_18

So ±6 satisfies the property. Hence, fixing the sign, 6 = (12, 18).

Linear Equations in Z

Proposition 13 Given a, b, n ∈ Z, the equation ax + by = n has an integer solution x, y ⇔ (a, b) | n.

Proof. (⇐) By Proposition 6 (i) ∃x, y such that ax + by = (a, b). Since (a, b) | n, ∃c such that (a, b) c = n. Hence a(xc) + b(yc) = (a, b) c = n and xc, yc is a solution.
(⇒) By Proposition 6 (ii), (a, b) | ax + by = n. □

Proposition 14 Let (a, b) = 1 and let x_0, y_0 be a solution to ax + by = n (a solution exists by Proposition 13). Then all solutions are given by

x = x_0 + bt
y = y_0 − at,  t ∈ Z.

Proof.

a(x_0 + bt) + b(y_0 − at) = ax_0 + abt + by_0 − bat = n

so each such x and y is a solution. If ax_0 + by_0 = n and ax + by = n also, then a(x − x_0) + b(y − y_0) = n − n = 0, so b | a(x − x_0). But (a, b) = 1. Hence b | x − x_0 ⇒ bt = x − x_0 so x = x_0 + bt ⇒ abt + b(y − y_0) = 0 ⇒ y − y_0 = −at if b ≠ 0. □
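Propositions 13 and 14 translate directly into a procedure: find one solution by scaling the coefficients from the Euclidean algorithm, then shift by multiples of (b, −a). The sketch below assumes an extended-Euclid helper; `ext_gcd`, `solve_linear` and `family` are illustrative names, not part of the notes.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g, where |g| = (a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def solve_linear(a, b, n):
    """Return one integer solution (x0, y0) of a*x + b*y = n, or None if there is none."""
    g, x, y = ext_gcd(a, b)
    if g == 0:                       # a = b = 0
        return (0, 0) if n == 0 else None
    if n % g != 0:                   # Proposition 13: solvable iff (a, b) | n
        return None
    c = n // g
    return x * c, y * c


def family(a, b, x0, y0, ts):
    """Solutions (x0 + b*t, y0 - a*t) of Proposition 14; assumes (a, b) = 1."""
    return [(x0 + b * t, y0 - a * t) for t in ts]


x0, y0 = solve_linear(3, 5, 7)
print([(x, y) for x, y in family(3, 5, x0, y0, range(-2, 3))])
```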

Theorem 4 If (a, b) = 1, a > 0, b > 0 then every integer n > ab − a − b is representable as n = ax + by, x ≥ 0, y ≥ 0 and ab − a − b is not.

Proof. By Proposition 14,

x = x_0 + bt
y = y_0 − at

Choose t so that 0 ≤ y_0 − at < a ⇒ 0 ≤ y_0 − at ≤ a − 1. But

(x_0 + bt)a = n − (y_0 − at)b > ab − a − b − (a − 1)b = −a

⇒ (x_0 + bt) > −1

⇒ (x_0 + bt) ≥ 0.

Hence n is representable. Finally suppose ax + by = ab − a − b (???) with x ≥ 0, y ≥ 0.

⇒ a(x + 1) + b(y + 1) = ab.

But (a, b) = 1, hence a | y + 1 (a(x + 1 − b) = b(−y − 1)) and b | x + 1. ⇒ a ≤ y + 1 and b ≤ x + 1 so ab = (x + 1)a + (y + 1)b ≥ ba + ab = 2ab (!!!). □
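Theorem 4 is easy to test by brute force for small coprime a and b. In the sketch below, `representable` is a hypothetical helper that simply tries every x ≥ 0 with ax ≤ n.

```python
from math import gcd


def representable(n: int, a: int, b: int) -> bool:
    """True if n = a*x + b*y for some integers x >= 0, y >= 0 (a, b > 0)."""
    return any((n - a * x) % b == 0 for x in range(n // a + 1))


a, b = 3, 5
bound = a * b - a - b                      # 7
print(representable(bound, a, b))          # the bound itself is not representable
print(all(representable(n, a, b) for n in range(bound + 1, bound + 50)))
```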

Definition n ∈ N

σ(n) = sum of the divisors of n = ∑_{d | n} d

Ex σ(12) = 1 + 2 + 3 + 4 + 6 + 12 = 28
σ(6) = 1 + 2 + 3 + 6 = 12 = 2(6).

Perfect Numbers

Definition A perfect number is equal to the sum of its proper divisors

n = ∑_{d | n, 1 ≤ d < n} d

or σ(n) = 2n.

Ex 6, 28

Proposition 15 If n = ∏_{j=1}^{m} p_j^{α_j} then

σ(n) = ∏_{j=1}^{m} (p_j^{α_j + 1} − 1) / (p_j − 1)

Proof. All divisors of n have the form d = p_1^{x_1} · · · p_m^{x_m} with 0 ≤ x_j ≤ α_j. Hence

σ(n) = ∑_{x_1=0}^{α_1} · · · ∑_{x_m=0}^{α_m} p_1^{x_1} · · · p_m^{x_m}

     = (∑_{x_1=0}^{α_1} p_1^{x_1}) · · · (∑_{x_m=0}^{α_m} p_m^{x_m}) = RHS above.
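Proposition 15 can be compared against the definition of σ numerically. The sketch below uses two illustrative functions: `sigma_bruteforce` sums the divisors directly, and `sigma_formula` multiplies the geometric-series factors (p^{α+1} − 1)/(p − 1) found by trial division.

```python
def sigma_bruteforce(n: int) -> int:
    """Sum of all positive divisors of n, straight from the definition."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def sigma_formula(n: int) -> int:
    """sigma(n) via Proposition 15: product of (p**(a+1) - 1) // (p - 1)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            power = p                 # becomes p**(a+1) once all copies of p are removed
            while n % p == 0:
                n //= p
                power *= p
            result *= (power - 1) // (p - 1)
        p += 1
    if n > 1:                         # one remaining prime q with exponent 1
        result *= n + 1               # (q**2 - 1) // (q - 1) = q + 1
    return result


print(sigma_formula(12), sigma_formula(6))  # 28 12
```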

Definition A function f : N → N is called multiplicative if a, b ∈ N and (a, b) = 1 ⇒ f(ab) = f(a)f(b)

Proposition 16 (a, b) = 1 ⇒ σ(ab) = σ(a)σ(b) i.e. σ is a multiplicative function.

Proof. This follows from Proposition 15. □

Theorem 5 Let p = 2^n − 1 be prime. Then m = ½ p(p + 1) = 2^{n−1}(2^n − 1) is perfect. Every even perfect number has this form.

Proof. m = ½ p(p + 1) = 2^{n−1} p^1 and p is odd. By Proposition 15

σ(m) = (2^n − 1)/(2 − 1) · (p^2 − 1)/(p − 1)
     = (2^n − 1)(p + 1)
     = p(p + 1)
     = 2m

so m is perfect. Let a be an even perfect number. a = 2^{n−1} u, u > 1, 2 ∤ u. (Note that σ(2^α) = 2^{α+1} − 1 ≠ 2 · 2^α, so no power of 2 is perfect.) Since σ is multiplicative,

σ(a) = (2^n − 1)/(2 − 1) · σ(u) = 2a = 2^n u

since a is perfect. Hence

σ(u) = 2^n u / (2^n − 1) = u + u/(2^n − 1).

Since σ(u) and u are integers, so is u/(2^n − 1). But u | u and u/(2^n − 1) | u, and σ(u) is the sum of just these two divisors, so u has just two divisors, hence u ∈ P and u/(2^n − 1) = 1 ⇒ u = 2^n − 1. □
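Theorem 5 gives a recipe for even perfect numbers: find n with 2^n − 1 prime and form 2^{n−1}(2^n − 1). The following sketch lists the first few and confirms σ(m) = 2m; the helper names and the trial-division primality test are assumptions made for illustration.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test (fine for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def sigma(n: int) -> int:
    """Sum of divisors, pairing each divisor d <= sqrt(n) with n // d."""
    total, d = 0, 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total


def even_perfect_numbers(max_exponent: int) -> list[int]:
    """2**(n-1) * (2**n - 1) for each n <= max_exponent with 2**n - 1 prime."""
    return [2 ** (n - 1) * (2 ** n - 1)
            for n in range(2, max_exponent + 1) if is_prime(2 ** n - 1)]


for m in even_perfect_numbers(13):
    print(m, sigma(m) == 2 * m)
```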

Conjecture There are no odd perfect numbers.

Definition If p = 2^n − 1 ∈ P we say p is a Mersenne Prime.

Theorem 6 If n > 1 and a^n − 1 is prime then a = 2 and n is prime.

Proof. If a > 2 then a − 1 | a^n − 1 (a^n − 1 = (a − 1)(a^{n−1} + a^{n−2} + · · · + 1)) so a^n − 1 ∉ P. If a = 2 and n = jℓ, where j is a proper divisor of n, then 2^n − 1 = (2^j)^ℓ − 1 is divisible by 2^j − 1 (a = 2^j in the equation above). Hence n ∈ P. □

web: http://www.utm.edu/research/primes/mersenne.shtml

Theorem 7 If 2^m + 1 ∈ P then m = 2^n.

Proof. If m = qr, where q > 1 is odd, then
2^{qr} + 1 = (2^r)^q + 1 = (2^r + 1)(2^{r(q−1)} − 2^{r(q−2)} + · · · + 1) and 1 < 2^r + 1 < 2^{qr} + 1 so 2^{qr} + 1 cannot be prime. Hence m has no odd prime factor. Hence m = 2^n, n ≥ 0. □

Note The factorization

a^n − b^n = (a − b)(a^{n−1} + a^{n−2} b + a^{n−3} b^2 + · · · + b^{n−1})

works here for odd n since

a^n + 1 = a^n − (−1)^n
        = (a − (−1))(a^{n−1} + a^{n−2}(−1) + a^{n−3}(−1)^2 + · · · + (−1)^{n−1})
        = (a + 1)(a^{n−1} − a^{n−2} + a^{n−3} − · · · + 1)

Fermat Numbers

Definition The nth Fermat number, F_n = 2^{2^n} + 1

F_0 = 3, F_1 = 5, F_2 = 17, F_3 = 257, F_4 = 65537.

F_i ∈ P for 0 ≤ i ≤ 4. No other Fermat prime is known.

F_5 ∉ P.

(Euler, 1732): 641 | 2^{2^5} + 1 = 641 · 6700417.

Proof. Let

a = 2^7
b = 5

so that

a − b^3 = 3
1 + ab − b^4 = 1 + 5 · 3 = 2^4

Therefore

2^{2^5} + 1 = (2^8)^4 + 1
            = (2a)^4 + 1
            = 2^4 a^4 + 1
            = (1 + ab − b^4)a^4 + 1
            = (1 + ab)a^4 + 1 − a^4 b^4
            = (1 + ab)a^4 + (1 − a^2 b^2)(1 + a^2 b^2)
            = (1 + ab)[a^4 + (1 − ab)(1 + a^2 b^2)]

and 1 + ab = 641. □
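Each identity in Euler's argument can be checked with exact integer arithmetic. The short script below only restates the steps of the proof; the variable names `F5` and `cofactor` are illustrative.

```python
a, b = 2 ** 7, 5
F5 = 2 ** (2 ** 5) + 1

# the two auxiliary identities used in the proof
assert a - b ** 3 == 3
assert 1 + a * b - b ** 4 == 2 ** 4

# the bracketed factor [a^4 + (1 - ab)(1 + a^2 b^2)]
cofactor = a ** 4 + (1 - a * b) * (1 + a ** 2 * b ** 2)

assert 1 + a * b == 641
assert F5 == (1 + a * b) * cofactor
print(F5, "=", 1 + a * b, "*", cofactor)   # 4294967297 = 641 * 6700417
```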

Theorem 8 (Legendre) If p ∈ P, the exact power α of p dividing n! (p^α ‖ n!) is

α = ⌊n/p⌋ + ⌊n/p^2⌋ + ⌊n/p^3⌋ + · · ·

Proof.

n! = 1 · 2 · · · (p − 1)
     · p (p + 1) · · · 2p · · · (p − 1)p
     · p^2
     · · ·

There are ⌊n/p⌋ multiples of p, ⌊n/p^2⌋ multiples of p^2, etc. among 1, 2, . . . , n.

Each multiple of p contributes 1 to α. Each multiple of p^2 has already contributed 1, being a multiple of p, so contributes 1 more to α leading to ⌊n/p⌋ + ⌊n/p^2⌋ etc. Hence

α = ⌊n/p⌋ + ⌊n/p^2⌋ + ⌊n/p^3⌋ + · · · + ⌊n/p^r⌋

where r is the first natural number such that p^{r+1} > n. So ⌊n/p^β⌋ = 0 ∀β ≥ r + 1. □

Ex n = 12, p = 3 so

α = ⌊12/3⌋ + ⌊12/9⌋ + ⌊12/27⌋ = 4 + 1 + 0 = 5.

12! = 12 · 11 · 10 · 9 · 8 · 7 · 6 · 5 · 4 · 3 · 2 · 1

with the multiples of 3 contributing 12 ↦ 1, 9 ↦ 2, 6 ↦ 1, 3 ↦ 1, and 3^5 ‖ 12!.
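The sum in Theorem 8 has only finitely many non-zero terms, so it is a short loop. The sketch below (function name `factorial_prime_exponent` is illustrative) also compares the result with the exponent found by dividing n! directly.

```python
from math import factorial


def factorial_prime_exponent(n: int, p: int) -> int:
    """Exact power of the prime p dividing n!: floor(n/p) + floor(n/p^2) + ..."""
    alpha, power = 0, p
    while power <= n:
        alpha += n // power
        power *= p
    return alpha


def exponent_by_division(m: int, p: int) -> int:
    """Largest e with p**e dividing m (used only as a cross-check)."""
    e = 0
    while m % p == 0:
        m //= p
        e += 1
    return e


print(factorial_prime_exponent(12, 3))                 # 5
print(exponent_by_division(factorial(12), 3))          # 5
```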

2 Congruences

Definition a ≡ b (mod m) if m | a − b, m ≠ 0, a, b, m ∈ Z. If so we say a is congruent to b modulo m. We call m the modulus.

Proposition 17 ≡ is an equivalence relation on Z and the set of equivalence classes forms a ring (Z_m, +, ·, [1]_m) where

[a]_m + [b]_m = [a + b]_m
[a]_m · [b]_m = [a · b]_m

Proposition 18

a_1 ≡ b_1 (mod m) and a_2 ≡ b_2 (mod m) ⇒ a_1 · a_2 ≡ b_1 · b_2 (mod m) and a_1 + a_2 ≡ b_1 + b_2 (mod m)

Proposition 19

ac ≡ bd (mod m), c ≡ d (mod m) and (c, m) = 1 ⇒ a ≡ b (mod m)

Proof. (a − b)c + b(c − d) = ac − bd ≡ 0 (mod m) and m | c − d ⇒ m | (a − b)c ⇒ m | a − b, since (c, m) = 1, so a ≡ b (mod m). □

If m ∈ P then (c, m) = 1 ∀c ∈ Z with m ∤ c, c ≠ 0 and ∃x, y ∈ Z so that cx + my = (c, m) = 1 so cx ≡ 1 (mod m). Hence [c]_m has a multiplicative inverse class [x]_m and (Z_m, +, ·, [1]_m) is a field GF(m) called a Galois field.

Note [c]_m is called a residue class with representative c. Each class has a smallest non-negative representative.

Ex m = 5
GF(5) = {[0]_5, [1]_5, [2]_5, [3]_5, [4]_5}

Proof. If c ∈ Z and m > 0, ∃q, r so that c = mq + r, 0 ≤ r < m and c ≡ r (mod m) ⇒ [c]_m = [r]_m □

Euler’s Phi Function φ

Definition φ(n) = #{i ≤ n : 1 ≤ i and (i, n) = 1} is the number of natural numbers not exceeding n and coprime to n.

Ex φ(1) = 1, φ(2) = 1, φ(4) = 2 since (1, 4) = 1, (2, 4) = 2, (3, 4) = 1, (4, 4) = 4.

Ex p ∈ P ⇒ φ(p) = p− 1 since (p, 1) = 1, (p, p) = p and (p, j) = 1, 1 < j < p.

Consider m > 1. In Z_m, [c]_m will have an inverse class ⇔ (c, m) = 1.
(⇐) cx + my = (c, m) = 1 ⇒ cx ≡ 1 (mod m).
(⇒) cx ≡ 1 (mod m) ⇒ cx + my = 1 for some y ∈ Z ⇒ (c, m) | 1.
Hence the number of classes which have inverses is φ(m).

Definition A reduced residue system is a complete set of representatives for those classes with inverses.

Ex {1, 3} is such a system for Z_4.

Proposition 20 If a_1, . . . , a_{φ(m)} is a reduced residue system and (m, k) = 1 then ka_1, . . . , ka_{φ(m)} is also a reduced residue system.

Proof. (a_i, m) = 1 ⇒ (ka_i, m) = 1. If ka_i ≡ ka_j (mod m) ⇒ a_i ≡ a_j (mod m) ⇒ i = j. Hence the ka_i represent distinct residue classes, and each is coprime with m. □

Theorem 9 (Euler) (a, m) = 1 ⇒ a^{φ(m)} ≡ 1 (mod m).

Proof. The {aa_i : 1 ≤ i ≤ φ(m)} and {a_i : 1 ≤ i ≤ φ(m)} represent the same classes (albeit in a different order). Hence

∏_{j=1}^{φ(m)} (aa_j) ≡ ∏_{j=1}^{φ(m)} a_j (mod m)

⇒ a^{φ(m)} ∏_{j=1}^{φ(m)} a_j ≡ ∏_{j=1}^{φ(m)} a_j (mod m)

and so a^{φ(m)} ≡ 1 (mod m) since (a_j, m) = 1 means we can cancel. □

Corollary (Fermat’s Little Theorem) (a, p) = 1 ⇒ a^p ≡ a (mod p).

Proof. φ(p) = p − 1 so a^{p−1} ≡ 1 (mod p) ⇒ a^p ≡ a (mod p). □

Note Simple probabilistic primality test: Check q ∈ N through considering a^q ≡ a (mod q) for random a with (a, q) = 1.

Note Euler’s a^{φ(m)} ≡ 1 (mod m) is the basis of RSA public key cryptography.
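The primality test in the first Note can be sketched with Python's three-argument `pow`, which computes a^q mod q efficiently. The function name `fermat_test` is an assumption, and fixed bases are used instead of random ones so that the example is reproducible. A composite q can pass for every base (561 = 3 · 11 · 17 is such a number), which is why the test is only probabilistic.

```python
def fermat_test(q: int, bases) -> bool:
    """Return True if a**q ≡ a (mod q) for every base a in `bases`."""
    return all(pow(a, q, q) == a % q for a in bases)


print(fermat_test(97, [2, 3, 5]))    # 97 is prime, so every base passes
print(fermat_test(91, [2]))          # 91 = 7 * 13 is exposed by the base 2
print(fermat_test(561, [2, 5, 7]))   # 561 is composite but still passes
```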

Proposition 21 Let (m, m′) = 1, let x run over a complete residue system (mod m) and x′ over a complete system (mod m′). Then mx′ + m′x runs over a complete system (mod mm′).

Proof. Consider the mm′ numbers mx′ + m′x. If mx′ + m′x ≡ my′ + m′y (mod mm′) then

mx′ ≡ my′ (mod m′) and m′x ≡ m′y (mod m)

⇒ x′ ≡ y′ (mod m′) and x ≡ y (mod m)

since (m, m′) = 1. So each class is distinct. The result follows since there are mm′ classes (mod mm′). □

Proposition 22 Same as before but ‘complete’ → ‘reduced’.

Proof. Claim: (mx′ + m′x, mm′) = 1. If not (???) let p ∈ P have p | (mx′ + m′x, mm′). If p | m then p | m′x. But (m, m′) = 1 so p ∤ m′ hence p | x and p | (m, x) which is false (!!!). The case p | m′ is similar. This proves the claim.

Claim: Every a ∈ Z with (a, mm′) = 1 satisfies a ≡ mx′ + m′x (mod mm′) for x, x′ with (x, m) = (x′, m′) = 1. By the above ∃x, x′ so a ≡ mx′ + m′x (mod mm′). If (x, m) = d ≠ 1 then (a, m) = (mx′ + m′x, m) = (m′x, m) = (x, m) = d ≠ 1 which is false. Similarly (x′, m′) = 1.

By the above, the numbers mx′ + m′x are incongruent. Hence we have a reduced residue system of this form. □

Theorem 10 φ is a multiplicative function.

Proof. If (m, m′) = 1,

φ(mm′) = #{RRS(mm′)}
       = #{RRS(m)} · #{RRS(m′)}
       = φ(m) · φ(m′)

Since φ is multiplicative, if n = ∏_{j=1}^{m} p_j^{α_j} is the standard factorisation,

φ(n) = ∏_{j=1}^{m} φ(p_j^{α_j}).

Theorem 11

φ(p^α) = p^α (1 − 1/p)

so

φ(n) = n ∏_{p | n} (1 − 1/p).

Proof. Consider the natural numbers in the interval 1 ≤ j ≤ p^α. There are

⌊p^α / p⌋ = p^{α−1}

multiples of p and the rest are coprime with p, (j, p) = 1, hence (j, p^α) = 1. Therefore φ(p^α) = p^α − p^{α−1} = p^α (1 − 1/p). □

Ex

φ(100) = φ(2^2 · 5^2)
       = 100 (1 − 1/2)(1 − 1/5)
       = 100 (1/2)(4/5)
       = 40 ⇒ 40% are coprime with 100
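The product formula of Theorem 11 can be checked against the definition of φ. In the sketch below, `phi_bruteforce` counts coprime residues directly and `phi_formula` applies the factor (1 − 1/p) for each prime divisor using integer arithmetic; both names are illustrative.

```python
from math import gcd


def phi_bruteforce(n: int) -> int:
    """Count 1 <= i <= n with (i, n) = 1."""
    return sum(1 for i in range(1, n + 1) if gcd(i, n) == 1)


def phi_formula(n: int) -> int:
    """phi(n) = n * prod over primes p | n of (1 - 1/p), in integers."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p     # multiply by (1 - 1/p)
        p += 1
    if m > 1:                         # remaining prime factor
        result -= result // m
    return result


print(phi_formula(100))  # 40
```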

Theorem 12 (Wilson) If p ∈ P, (p − 1)! ≡ −1 (mod p).

Proof. In Z_p, f(x) = x^{p−1} − 1 has degree p − 1 and roots [1]_p, [2]_p, . . . , [p − 1]_p since a^{p−1} ≡ 1 (mod p). Hence x^{p−1} − 1 ≡ (x − 1)(x − 2) · · · (x − (p − 1)) (mod p). x = 0 ⇒ −1 ≡ (−1)^{p−1}(p − 1)! (mod p) ⇒ (p − 1)! ≡ −1 (mod p) for p odd and for p = 2, (2 − 1)! = 1! = 1 ≡ −1 (mod 2). □

Note The converse also holds.

Note on Fermat Numbers

These can be defined as

F_0 = 3
F_{n+1} = F_n^2 − 2F_n + 2, n ≥ 0

since then

F_n − 1 = (F_{n−1} − 1)^2
        = (F_{n−2} − 1)^{2^2}
        ...
        = (F_0 − 1)^{2^n}
        = 2^{2^n}

so F_n = 2^{2^n} + 1 ∀n ≥ 0.

Proposition 23 (F_n, F_m) = 1 ∀n ≠ m.

3 Mobius Function and Mobius Inversion

(Mathematica: MoebiusMu[n])

Definition

µ(n) = 1 if n = 1
µ(n) = (−1)^m if n is a product of m distinct p ∈ P
µ(n) = 0 if ∃p ∈ P with p^2 | n

Ex µ(1) = 1, µ(2) = −1, µ(6) = (−1)^2 = 1, µ(p) = −1, µ(4) = 0, µ(12) = µ(2^2 · 3) = 0

Proposition 24 µ is multiplicative.

Proof. Let (a, b) = 1, a = ∏_{i=1}^{m} p_i^{α_i}, b = ∏_{j=1}^{n} q_j^{β_j}. If ∃α_i or β_j ≥ 2 then µ(ab) = 0 and µ(a) or µ(b) = 0 so µ(ab) = 0 = µ(a)µ(b). If not

µ(ab) = (−1)^{n+m} = (−1)^m (−1)^n = µ(a)µ(b)

so µ is multiplicative. □

Definition

I(n) = 1 if n = 1, 0 if n > 1.

Proposition 25 If f(n) is multiplicative and not identically zero, then f(1) = 1.

Proof. (1, a) = 1 ⇒ f(1 · a) = f(1)f(a) so f(a) = f(1)f(a). If we choose a so f(a) ≠ 0 then 1 = f(1). □

Theorem 13 Let g(n) and h(n) be multiplicative. Then the function

f(n) = ∑_{d | n} g(d) h(n/d)

is also multiplicative.

Proof. Let (a, b) = 1. Then

f(ab) = ∑_{d | ab} g(d) h(ab/d)
      = ∑_{d = uv, u | a, v | b} g(d) h(ab/d)
      = ∑_{u | a} ∑_{v | b} g(uv) h(ab/(uv))
      = ∑_{u | a} ∑_{v | b} g(u) g(v) h(a/u) h(b/v)

since (u, v) = (a/u, b/v) = 1.

∴ f(ab) = ∑_{u | a} ∑_{v | b} g(u) h(a/u) g(v) h(b/v)
        = (∑_{u | a} g(u) h(a/u)) (∑_{v | b} g(v) h(b/v))
        = f(a)f(b).

Proposition 26 Let f be multiplicative and not identically zero. Then

∑_{d | n} µ(d)f(d) = ∏_{p | n} (1 − f(p))    (1)

where the product includes one term for each prime divisor of n.

Proof. Let g(n) = µ(n)f(n) and h(n) = 1 in Theorem 13. Then the LHS of equation (1) is ∑_{d | n} g(d)h(n/d) so is multiplicative. The RHS of (1) is also multiplicative since if n = ab with (a, b) = 1 then p | n ⇔ p | a or p | b.

At n = 1

LHS = µ(1)f(1) = 1
RHS = empty product = 1 (by definition).

At n = p^α

LHS = ∑_{d | p^α} µ(d)f(d)
    = µ(1)f(1) + µ(p)f(p) + µ(p^2)f(p^2) + · · ·
    = 1 + (−1)f(p) + 0 + 0 + · · ·
    = 1 − f(p)

RHS = ∏_{p | p^α} (1 − f(p))
    = 1 − f(p)
    = LHS

Hence they are equal, since multiplicative functions are determined by their values at 1 and prime powers. □

Proposition 27 If n > 0,

∑_{d | n} µ(d) = I(n) = 1 if n = 1, 0 if n > 1.

Proof. Let f(d) = 1 in Proposition 26 and note ∑_{d | 1} µ(d) = µ(1) = 1.
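Propositions 26 and 27 are easy to test numerically once µ is available. The sketch below computes µ by trial division and checks both identities for small n; the helper names are illustrative, and f(n) = n is used as one sample multiplicative function in Proposition 26.

```python
def mobius(n: int) -> int:
    """Moebius function: 0 if a square divides n, else (-1)**(number of prime factors)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:        # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:                     # one remaining prime factor
        result = -result
    return result


def divisors(n: int) -> list[int]:
    return [d for d in range(1, n + 1) if n % d == 0]


def prime_divisors(n: int) -> list[int]:
    return [d for d in divisors(n) if d > 1 and len(divisors(d)) == 2]


for n in (1, 6, 12, 30):
    lhs = sum(mobius(d) * d for d in divisors(n))      # Proposition 26 with f(n) = n
    rhs = 1
    for p in prime_divisors(n):
        rhs *= 1 - p
    print(n, sum(mobius(d) for d in divisors(n)), lhs, rhs)
```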

Theorem 14

∑_{d | n} φ(d) = n

Proof. Let S = {1, 2, · · · , n}. If d | n let A(d) = {k : (k, n) = d, 1 ≤ k ≤ n}. Then S = ⊔_{d | n} A(d) (where ⊔ denotes a disjoint union) ⇒ #S = ∑_{d | n} #A(d) or n = ∑_{d | n} f(d) where f(d) = #A(d). But

(k, n) = d ⇔ (k/d, n/d) = 1 and 0 < k ≤ n ⇔ 0 < k/d ≤ n/d

so if q = k/d there is a 1-1 correspondence between A(d) and the q ∈ N satisfying 0 < q ≤ n/d and (q, n/d) = 1,

i.e. f(d) = φ(n/d)

Hence n = ∑_{d | n} φ(n/d)

But as d runs through the divisors of n, so does n/d. Hence

n = ∑_{d | n} φ(d).

Ex Divisors of 6 are {1, 2, 3, 6} and

φ(1) + φ(2) + φ(3) + φ(6) = 1 + (2 − 1) + (3 − 1) + 6 (1 − 1/2)(1 − 1/3)
                          = 1 + 1 + 2 + 2
                          = 6

Dirichlet Multiplication

Definition If f and g are two real functions on N then define their Dirichlet product (or convolution) h(n) as

h(n) = ∑_{d | n} f(d) g(n/d) = (f ∗ g)(n).

Proposition 28 I ∗ f = f ∗ I = f where

I(n) = 1 if n = 1, 0 if n > 1.

Proposition 29

f ∗ g = g ∗ f (commutative law)
(f ∗ g) ∗ k = f ∗ (g ∗ k) (associative law)

Definition The function u(n) = 1 ∀n ∈ N.

Then for Proposition 27:

∑_{d | n} µ(d) = I(n) is µ ∗ u = I    (2)

and for Theorem 14:

∑_{d | n} φ(d) = n is φ ∗ u = N

where N(n) = n is the identity function. If f(1) ≠ 0 there is a unique function f^{−1} with f ∗ f^{−1} = f^{−1} ∗ f = I.

Ex By (2) u = µ^{−1}, u^{−1} = µ.

Theorem 13 says if f and g are multiplicative then so is f ∗ g, their Dirichlet product.

Theorem 15 (Mobius Inversion Formula)

f(n) = ∑_{d | n} g(d) ⇔ g(n) = ∑_{d | n} µ(d) f(n/d)

Proof. (⇒) f = g ∗ u ⇒

f ∗ µ = (g ∗ u) ∗ µ
      = g ∗ (u ∗ µ)
      = g ∗ I
      = g.

(⇐) g = f ∗ µ ⇒

g ∗ u = (f ∗ µ) ∗ u
      = f ∗ (µ ∗ u)
      = f ∗ I
      = f.

Ex Theorem 14:

∑_{d | n} φ(d) = n, i.e. φ ∗ u = N

⇒ φ(n) = (µ ∗ N)(n)
       = ∑_{d | n} µ(d) (n/d)
       = n ∑_{d | n} µ(d)/d
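The identities µ ∗ u = I, φ ∗ u = N and φ = µ ∗ N can all be checked with a small Dirichlet-product helper. This is a sketch for small n only; `dirichlet` and the other function names are illustrative, and φ and µ are computed naively.

```python
from math import gcd


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def dirichlet(f, g):
    """Return the Dirichlet product f * g as a new function of n."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))


def mobius(n):
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def phi(n):
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def I(n): return 1 if n == 1 else 0      # identity for the Dirichlet product
def u(n): return 1                       # constant function 1
def N(n): return n                       # N(n) = n

mu_u = dirichlet(mobius, u)              # should equal I   (Proposition 27)
phi_u = dirichlet(phi, u)                # should equal N   (Theorem 14)
mu_N = dirichlet(mobius, N)              # should equal phi (Moebius inversion)
print([mu_N(n) for n in range(1, 11)])
```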

Liouville’s Function

Definition

n = ∏_{i=1}^{m} p_i^{α_i} ⇒ λ(n) = (−1)^{α_1 + · · · + α_m}

Then λ is completely multiplicative.

Theorem 16 ∀n ≥ 1,

∑_{d | n} λ(d) = 1 if n is a square, 0 otherwise.

Proof. Let g(n) = ∑_{d | n} λ(d). Then g = λ ∗ u is multiplicative as the Dirichlet product of multiplicative functions. So we need to compute g(p^α) for p ∈ P and α = 1, 2, 3, . . .

g(p^α) = ∑_{d | p^α} λ(d)
       = λ(1) + λ(p) + λ(p^2) + · · · + λ(p^α)
       = 1 − 1 + 1 − 1 + · · · + (−1)^α
       = 0 if α is odd, 1 if α is even.

If n = ∏_{i=1}^{m} p_i^{α_i} and n is not a square, then ∃j so α_j is odd, hence g(n) = ∏_{i=1}^{m} g(p_i^{α_i}) = 0 since the jth term is zero. If n is a square each α_i is even, hence g(p_i^{α_i}) = 1 ∀i ⇒ g(n) = 1. □

4 Averages of Arithmetic Functions

Definition

d(n) = ∑_{d | n, 1 ≤ d ≤ n} 1 = # of divisors of n ∈ N

is the “divisor function”.

Then, as a function of n, d is very irregular. d(p) = 2 ∀p ∈ P but d(n) can be very large. Averages are smoother

d̄(n) = (1/n) ∑_{j=1}^{n} d(j)

indeed (later)

lim_{n→∞} d̄(n) / log(n) = 1.

Need the partial sums

D(x) = ∑_{1 ≤ j ≤ x} d(j)

where we define D(x) = 0 for 0 < x < 1. So D(x) = d(1) + d(2) + · · · + d(⌊x⌋), x ≥ 1.

Later we prove Dirichlet’s theorem:

x ≥ 1 ⇒ D(x) = x log(x) + (2γ − 1)x + O(√x)

(γ is Euler’s constant) where f(x) = O(g(x)) if ∃x_0, ∃M > 0 such that ∀x ≥ x_0, |f(x)| ≤ M g(x) defines the ‘big-Oh’ notation, and f(x) = h(x) + O(g(x)) ⇔ f(x) − h(x) = O(g(x)).

Ex x = O(x^2), x^2 + 7x + 20 = O(x^2)

Normally, f(x) is number theoretic, like D(x), h(x) is ‘nice’ and smooth, g(x) is a nice power, or other simple ‘mop up’ for the ‘random variation’ in f(x) e.g. D(x) = x log(x) + O(x).

Definition We say f(x) is asymptotic to g(x) as x → ∞ if

lim_{x→∞} f(x) / g(x) = 1

and write f(x) ∼ g(x), x → ∞.

So D(x) ∼ x log(x) as x → ∞ since

D(x) / (x log(x)) = x log(x) / (x log(x)) + (2γ − 1)x / (x log(x)) + O(√x / (x log(x)))

⇒ D(x) / (x log(x)) → 1. From this it follows that d̄(n) ∼ log(n).

Theorem 17 (Euler Summation) If f has a continuous derivative f′ on [y, x] ⊂ R where 0 < y < x, then

S = ∑_{y<n≤x} f(n) = ∫_y^x f(t) dt + ∫_y^x (t − ⌊t⌋) f′(t) dt + f(x)(⌊x⌋ − x) − f(y)(⌊y⌋ − y).    (3)

Proof. Let m = ⌊y⌋, k = ⌊x⌋. If n, n − 1 ∈ [y, x]:

∫_{n−1}^{n} ⌊t⌋ f′(t) dt = ∫_{n−1}^{n} (n − 1) f′(t) dt
                        = (n − 1)(f(n) − f(n − 1))
                        = {nf(n) − (n − 1)f(n − 1)} − f(n)

Summing from n = m + 2 to n = k, the sum in braces ({· · ·}) telescopes to give

∫_{m+1}^{k} ⌊t⌋ f′(t) dt = kf(k) − (m + 1)f(m + 1) − ∑_{n=m+2}^{k} f(n)
                        = kf(k) − mf(m + 1) − ∑_{y<n≤x} f(n)

Hence

S = −∫_{m+1}^{k} ⌊t⌋ f′(t) dt + kf(k) − mf(m + 1)
  = −∫_y^x ⌊t⌋ f′(t) dt + kf(x) − mf(y).    (4)

Integrating ∫_y^x f(t) dt (by parts) gives

∫_y^x f(t) dt = xf(x) − yf(y) − ∫_y^x t f′(t) dt.    (5)

Then (4) − (5) ⇒ (3). □

Theorem 18

∑_{n≤x} 1/n = log(x) + γ + O(1/x)

where

γ = 1 − ∫_1^∞ (t − ⌊t⌋)/t^2 dt = lim_{x→∞} (∑_{n≤x} 1/n − log(x)).

Proof. Let f(t) = 1/t in Theorem 17 with y = 1 so f′(t) = −1/t^2 and

∑_{0<n≤x} 1/n = 1 + ∑_{1<n≤x} 1/n
 = 1 + ∫_1^x dt/t − ∫_1^x (t − ⌊t⌋)/t^2 dt + (⌊x⌋ − x)/x − f(1)(⌊1⌋ − 1)
 = ∫_1^x dt/t − ∫_1^x (t − ⌊t⌋)/t^2 dt + 1 − (x − ⌊x⌋)/x
 = log(x) − ∫_1^x (t − ⌊t⌋)/t^2 dt + 1 + O(1/x)
 = log(x) + {1 − ∫_1^∞ (t − ⌊t⌋)/t^2 dt} + ∫_x^∞ (t − ⌊t⌋)/t^2 dt + O(1/x)

Now

0 ≤ ∫_x^∞ (t − ⌊t⌋)/t^2 dt ≤ ∫_x^∞ dt/t^2 = 1/x.

So

∑_{1≤n≤x} 1/n = log(x) + γ + O(1/x)

where γ = {· · ·}. □

Note: γ = 0.5772156649 . . . is Euler’s constant (EulerGamma in Mathematica). It could be rational, but probably is not. By Theorem 18,

lim_{x→∞} (∑_{1≤n≤x} 1/n − log(x)) = γ + 0 = γ.

Since log(x) → ∞ as x → ∞, ∑_{n=1}^{∞} 1/n = ∞, quoted earlier.
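Theorem 18 can be observed numerically: the difference between the harmonic sum and log(x) + γ should shrink like 1/x. The sketch below hard-codes the value of γ to double precision as an assumption (Python's standard library does not provide it); `harmonic` and `remainder` are illustrative names.

```python
from math import log

EULER_GAMMA = 0.5772156649015329   # assumed value of Euler's constant


def harmonic(x: int) -> float:
    """Sum of 1/n for 1 <= n <= x."""
    return sum(1.0 / n for n in range(1, x + 1))


def remainder(x: int) -> float:
    """harmonic(x) - log(x) - gamma, which Theorem 18 says is O(1/x)."""
    return harmonic(x) - log(x) - EULER_GAMMA


for x in (10, 100, 1000, 10000):
    print(x, remainder(x), x * remainder(x))   # the last column stays bounded
```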

Theorem 19 (Dirichlet)

D(x) = ∑_{1≤n≤x} d(n) = x log(x) + (2γ − 1)x + O(√x)    (6)

Proof.

d(n) = ∑_{d | n} 1 ⇒ D(x) = ∑_{n≤x} d(n) = ∑_{n≤x} ∑_{d | n} 1

Now d | n ⇒ n = qd so we can express the double sum as

D(x) = ∑_{q, d : qd ≤ x} 1

This is a sum over a set of lattice points in the q-d plane with (q, d) such that qd = n and n = 1, 2, 3, . . . , ⌊x⌋. We sum these horizontally:

D(x) = ∑_{d≤x} ∑_{q≤x/d} 1

But

∑_{1≤i≤x} 1 = x + O(1)    (Ex)

so

D(x) = ∑_{d≤x} {x/d + O(1)}
     = x ∑_{d≤x} 1/d + O(x)
     = x (log(x) + γ + O(1/x)) + O(x)
     = x log(x) + O(x)    (7)

This is weaker than (6). To prove (6) we use the symmetry of the set of points:

D(x) = 2 ∑_{d≤√x} {⌊x/d⌋ − d} + ⌊√x⌋
     = 2 #(below line q = d) + #(on q = d)    (8)

But ∀y ∈ R, ⌊y⌋ = y + O(1) so (8) ⇒

D(x) = 2 ∑_{d≤√x} {x/d − d + O(1)} + O(√x)
     = 2x ∑_{d≤√x} 1/d − 2 ∑_{d≤√x} d + O(√x)
     = 2x (log(√x) + γ + O(1/√x)) − 2 (x/2 + O(√x)) + O(√x)
     = x log(x) + (2γ − 1)x + O(√x)

where we have used Lemma 1 below for the middle sum. □
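Both counting formulas from the proof can be evaluated exactly for integer x: the row-by-row sum ∑_{d≤x} ⌊x/d⌋ and the symmetric form (8). The sketch below compares them and reports the error against the main terms of (6); the function names and the hard-coded value of γ are assumptions.

```python
from math import isqrt, log, sqrt

EULER_GAMMA = 0.5772156649015329   # assumed value of Euler's constant


def D_direct(x: int) -> int:
    """D(x) as the number of lattice points (q, d) with q*d <= x, summed row by row."""
    return sum(x // d for d in range(1, x + 1))


def D_hyperbola(x: int) -> int:
    """D(x) via equation (8): twice the points below q = d plus those on q = d."""
    s = isqrt(x)
    return 2 * sum(x // d - d for d in range(1, s + 1)) + s


def dirichlet_error(x: int) -> float:
    """D(x) minus the main terms x log x + (2*gamma - 1) x."""
    return D_hyperbola(x) - (x * log(x) + (2 * EULER_GAMMA - 1) * x)


print(D_direct(10), D_hyperbola(10))   # 27 27
for x in (100, 1000, 10000):
    print(x, dirichlet_error(x) / sqrt(x))
```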

Lemma 1 If α ≥ 0,

∑_{n≤x} n^α = x^{α+1}/(α + 1) + O(x^α).

Proof. In Theorem 17 (Euler Summation), let f(t) = t^α, f′(t) = αt^{α−1} ⇒

∑_{0<n≤x} n^α = 1 + ∑_{1<n≤x} n^α
 = ∫_1^x t^α dt + α ∫_1^x t^{α−1}(t − ⌊t⌋) dt + 1 − (x − ⌊x⌋)x^α
 = x^{α+1}/(α + 1) − 1/(α + 1) + O(α ∫_1^x t^{α−1} dt) + O(x^α)
 = x^{α+1}/(α + 1) + O(x^α)

Note: Improvements in the error term O(√x) in Dirichlet’s theorem for d(n) have come at great cost:

1903 Voronoi O(x^{1/3} log(x))
1922 van der Corput O(x^{33/100})
1969 Kolesnik O(x^{ε + 12/37}) ∀ε > 0
1915 Hardy and Landau O(x^θ) ⇒ θ ≥ 1/4

The Distribution of Primes

Let

Li(x) = ∫_2^x dt / log(t)

for x ≥ 2 be the “logarithmic integral” and π(x) = #{p ∈ P : 2 ≤ p ≤ x}. Consider the following data:

[Data table missing from this transcript.]

So π(x) ≈ x/log(x) but π(x) ≈ Li(x) is better, and π(x)/x → 0 as x → ∞ apparently. Indeed

π(x) ∼ x/log(x) ∼ Li(x).

This distribution is the subject of the famous Prime Number Theorem, which took all of the 19th century to prove.
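A comparison of π(x), x/log(x) and Li(x) can be generated with a few lines of code. This is a sketch and not the notes' own table: `prime_count` uses a sieve, and `Li` approximates the integral with Simpson's rule after the substitution t = e^u (an implementation choice made here to keep the integrand smooth).

```python
from math import exp, log


def prime_count(x: int) -> int:
    """pi(x), the number of primes p <= x, by a sieve of Eratosthenes."""
    if x < 2:
        return 0
    sieve = bytearray([1]) * (x + 1)
    sieve[0] = sieve[1] = 0
    p = 2
    while p * p <= x:
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
        p += 1
    return sum(sieve)


def Li(x: float, steps: int = 2000) -> float:
    """Integral of dt/log(t) from 2 to x, as the integral of e^u/u du (steps must be even)."""
    a, b = log(2.0), log(x)
    h = (b - a) / steps
    f = lambda v: exp(v) / v
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


for x in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6):
    print(x, prime_count(x), round(x / log(x), 1), round(Li(x), 1))
```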

Because log(10^n) = n log(10)

in [2, 100] : about 1/4 of the numbers are prime (π(100) = 25)
in [2, 1000] : about 1/6 (π(1000) = 168)
in [2, 1,000,000] : about 1/13 (π(10^6) = 78498)
etc.

so they progressively thin out with a local density 1/log(t) since if a < b

#{p ∈ P : a < p ≤ b} = π(b) − π(a) ∼ ∫_2^b dt/log(t) − ∫_2^a dt/log(t) = ∫_a^b dt/log(t).

Theorem 20 For n ≥ 2,

1/8 ≤ π(n) / (n / log n) ≤ 12.

Note: This is as close as we will get to proving the Prime Number Theorem.

Lemma 2 (Chebyshev) If H(n) = ∑_{j=2}^{n} 1/j then

1/8 ≤ π(n) H(n) / n ≤ 6.

Proof of Theorem 20 assuming Chebyshev’s Lemma:

For n ≥ 2,

log(n/2) = ∫_2^n dt/t < 1/2 + 1/3 + · · · + 1/n < ∫_1^n dt/t = log(n).

For n ≥ 4,

½ log(n) ≤ log(n/2).

Hence ½ log(n) ≤ H(n) ≤ log(n) so, by the RHS of Chebyshev’s Lemma,

π(n) log(n) / n ≤ π(n) 2H(n) / n ≤ 12

and by the LHS of Chebyshev’s Lemma

1/8 ≤ π(n)H(n) / n ≤ π(n) log(n) / n

using Lemma 2 when n ≥ 4.

If n = 2, π(2) = 1 and

1/8 ≤ 1/(2/log(2)) ≈ 0.35 ≤ 6.

If n = 3, π(3) = 2 and

1/8 ≤ 2/(3/log(3)) ≈ 0.73 ≤ 6.

This completes the proof of the theorem.

Proof of Lemma 2:

Claim: ∀k ≥ 0, π(2^{k+1}) ≤ 2^k    (9)

Proof: If x ≥ 9, π(x) ≤ x/2 since all even numbers greater than 2 are composite, while 1 and 9 are odd and not prime. Since π(2^1) = 1 = 2^0, π(4) = 2 = 2^1 and π(8) = 4 = 2^2, (9) is true ∀k ≥ 0.

Claim: ½ ℓ ≤ H(2^ℓ) ≤ ℓ    (10)

where H(n) = 1/2 + · · · + 1/n.

H(2^ℓ) = 1/2 + (1/3 + 1/4) + (1/5 + 1/6 + 1/7 + 1/8) + · · · + (1/(2^{ℓ−1} + 1) + · · · + 1/2^ℓ)
       ≥ 1/2 + (1/4 + 1/4) + (1/8 + 1/8 + 1/8 + 1/8) + · · · + (1/2^ℓ + · · · + 1/2^ℓ)
       = ℓ/2

and

H(2^ℓ) = (1/2 + 1/3) + (1/4 + 1/5 + 1/6 + 1/7) + · · · + 1/2^ℓ
       ≤ (1/2 + 1/2) + (1/4 + 1/4 + 1/4 + 1/4) + · · · + (1/2^{ℓ−1} + · · · + 1/2^{ℓ−1}) + 1/2^ℓ
       ≤ ℓ

This proves the claim.

Write binom(2n, n) = (2n)! / (n! n!) for the central binomial coefficient. If p ∈ P has n < p < 2n ⇒ p | (2n)! and p ∤ n! ⇒

p | binom(2n, n) ⇒ ∏_{n<p<2n} p | binom(2n, n)    (11)

By Theorem 8 (Legendre), the power of p in binom(2n, n) is

∑_{m=1}^{r} (⌊2n/p^m⌋ − 2⌊n/p^m⌋)    (12)

where p^r ≤ 2n < p^{r+1} and the sum is ≤ r since ∀x, ⌊2x⌋ − 2⌊x⌋ ≤ 1 (See below). Hence

binom(2n, n) | ∏_{p^r ≤ 2n < p^{r+1}} p^r

By (11) and (12)

n^{π(2n)−π(n)} ≤ ∏_{n<p<2n} p ≤ binom(2n, n) ≤ ∏_{p^r ≤ 2n < p^{r+1}} p^r ≤ (2n)^{π(2n)}    (13)

Now

binom(2n, n) ≤ (1 + 1)^{2n} = 2^{2n}

and

binom(2n, n) = 2n(2n − 1) · · · (n + 1) / (n(n − 1) · · · 1) = 2 (2 + 1/(n − 1)) (2 + 2/(n − 2)) · · · (2 + (n − 1)/1) ≥ 2^n

so

2^n ≤ binom(2n, n) ≤ 4^n    (14)

Using the LHS of (13) we get n^{π(2n)−π(n)} ≤ 2^{2n} and the RHS gives 2^n ≤ (2n)^{π(2n)}, n ≥ 1. Now let n = 2^k, k = 0, 1, 2, . . . so these two inequalities translate to

2^{k(π(2^{k+1}) − π(2^k))} ≤ 2^{2^{k+1}},  2^{2^k} ≤ 2^{(k+1)π(2^{k+1})},  k ≥ 0

or

k(π(2^{k+1}) − π(2^k)) ≤ 2^{k+1},  2^k ≤ (k + 1)π(2^{k+1}).    (15)

Hence

(k + 1)π(2^{k+1}) − kπ(2^k) = k(π(2^{k+1}) − π(2^k)) + π(2^{k+1})
                            ≤ 2^{k+1} + π(2^{k+1})
                            ≤ 2^{k+1} + 2^k by (9)
                            = 3 · 2^k

Apply this for 0, 1, 2, . . . , k and add (the left sides telescope, with π(2^0) = π(1) = 0):

⇒ (k + 1)π(2^{k+1}) ≤ 3(2^0 + 2^1 + · · · + 2^k) < 3 · 2^{k+1}.    (16)

By (15) and (16), ∀k ≥ 0

½ · 2^{k+1}/(k + 1) ≤ π(2^{k+1}) < 3 · 2^{k+1}/(k + 1).

If n ∈ N, n > 1 choose k so 2^{k+1} ≤ n < 2^{k+2}. By (10) (π is increasing)

π(n) ≤ π(2^{k+2}) < 3 · 2^{k+2}/(k + 2) ≤ 6 · 2^{k+1}/H(2^{k+2}) ≤ 6n/H(n)

(H is increasing) and

π(n) ≥ π(2^{k+1}) ≥ ½ · 2^{k+1}/(k + 1)
     = (1/8) · 2^{k+2}/(½(k + 1))
     ≥ (1/8) · 2^{k+2}/H(2^{k+1})
     > (1/8) · n/H(n)

⇒ 1/8 ≤ π(n)/(n/H(n)) ≤ 6

as claimed. □
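The bounds of Chebyshev's Lemma are far from sharp, which a direct computation makes visible. The sketch below tabulates π(n)H(n)/n for 2 ≤ n ≤ N, with H(n) = 1/2 + · · · + 1/n as in the lemma; the function name `chebyshev_ratios` is illustrative.

```python
from math import isqrt


def chebyshev_ratios(N: int) -> list[float]:
    """Return pi(n) * H(n) / n for n = 2, ..., N (N >= 2), where H(n) = 1/2 + ... + 1/n."""
    sieve = bytearray([1]) * (N + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, isqrt(N) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, N + 1, p)))
    ratios, pi, H = [], 0, 0.0
    for n in range(2, N + 1):
        pi += sieve[n]          # running prime count pi(n)
        H += 1.0 / n            # running sum H(n)
        ratios.append(pi * H / n)
    return ratios


ratios = chebyshev_ratios(10000)
print(min(ratios), max(ratios))   # both lie well inside [1/8, 6]
```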

Ex ∀x ∈ R, 0 ≤ ⌊2x⌋ − 2⌊x⌋ ≤ 1

Proof. ⌊x⌋ ≤ x ⇒ 2⌊x⌋ ≤ 2x and 2⌊x⌋ ∈ Z ∴ 2⌊x⌋ ≤ ⌊2x⌋ so 0 ≤ ⌊2x⌋ − 2⌊x⌋.

If x ∈ Z, then ⌊2x⌋ − 2⌊x⌋ = 2x − 2 · x = 0 ≤ 1.

If x ∉ Z, ∃n ∈ Z so n < x < n + 1 and x = n + ½ + ε where |ε| < ½. Then 2x = 2n + 1 + 2ε with −1 < 2ε < 1, so

⌊2x⌋ = ⌊2n + 1 + 2ε⌋ = 2n + 1 + ⌊2ε⌋ ≤ 2n + 1
⌊x⌋ = n

Hence ⌊2x⌋ − 2⌊x⌋ ≤ (2n + 1) − 2n = 1. □

Note we have used several times the result

⌊y + n⌋ = ⌊y⌋ + n ∀n ∈ Z. (Ex)

5 Primes in Gaps

• primes can be close together: {11, 13}, {29, 31}, {101, 103}, . . .

• there can be long stretches of N with no primes:

a_1 = n! + 2
a_2 = n! + 3
...
a_{n−1} = n! + n

are n − 1 consecutive composite numbers, so none are prime, and n can be as large as you like.

• we will prove the celebrated Bertrand’s Hypothesis: ∀n ∈ N, ∃p ∈ P with n < p ≤ 2n.

• ∀n ∈ N does there exist a p ∈ P with n^2 < p < (n + 1)^2?


Aron, Potter, Young

lim sup_{n→∞} [size of gap n] = ∞.

If n ∈ N

a_1 = (n + 1)! + 2
a_2 = (n + 1)! + 3
a_3 = (n + 1)! + 4
...
a_n = (n + 1)! + (n + 1)

Then {a_1, . . . , a_n} are consecutive and i + 1 | a_i ⇒ composite.

But [1993] best gap length = 804 at p ≈ 10^15.

n = 804 ⇒ n! ≈ 3.2 × 10^1988.

Proof of Bertrand’s Postulate

Proof. Claim:

x ≥ 2 ⇒ ∏_{p ≤ x} p ≤ 4^{x−1}    (17)

If q is the largest prime less than or equal to x

∏_{p ≤ x} p = ∏_{p ≤ q} p and 4^{q−1} ≤ 4^{x−1}

so we can assume x = q is prime. If q = 2, 2 ≤ 4^{2−1} so let q = 2m + 1 be odd. Then

∏_{p ≤ 2m+1} p = (∏_{p ≤ m+1} p)(∏_{m+1 < p ≤ 2m+1} p) = A · B

By induction A ≤ 4^m. Also the binomial coefficient

binom(2m + 1, m) = (2m + 1)! / (m! (m + 1)!)

so all primes in B divide the numerator and are not cancelled so

B ≤ binom(2m + 1, m) = binom(2m + 1, m + 1) ≤ ½ (1 + 1)^{2m+1}

Hence A · B ≤ 4^m 2^{2m} = 4^{2m+1−1} = 4^{x−1}, which proves the claim.

63

Legendre’s Theorem Implications

n! contains the prime factor p exactly∑

j>1

⌊npα

⌋times.

Ex 24! = 222 · 310 · 54 · 73 · 112 · 13 · 17 · 19 · 23

p = 23 :

⌊24

23

⌋= 1,

⌊24

232

⌋= 0, · · ·

p = 7 :

⌊24

7

⌋= 3,

⌊24

72

⌋= 0, · · ·

Claim:

pr∣∣∣∣(2n

n

)⇒ pr 6 2n.(

2nn

)contains p

∑j>1

(⌊2npj

⌋− 2

⌊npj

⌋)times. But

0 6

⌊2n

pj

⌋− 2

⌊n

pj

⌋<

2n

pj− 2

(n

pj− 1

)= 2

⇒ each summand is 0 or 1 and is 0 for pj > 2n

⇒∑j>1

(⌊2n

pj

⌋− 2

⌊n

pj

⌋)6 max{j : pj 6 2n}

64

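⇒ if $p^2 > 2n$, $p$ occurs at most once in $\binom{2n}{n}$.

If $\frac{2}{3}n < p \le n$ ⇒ $p$ does not appear in $\binom{2n}{n}$.

$\frac{2}{3}n < p$ ⇒ $2n < 3p$ ⇒ $p$, $2p$ are the only multiples of $p$ in the numerator of $\frac{(2n)!}{n!\,n!}$.

$p \le n$ ⇒ there are two in the denominator. So they cancel.

Ex $n = 24$: $\binom{48}{24} = 2^2 \cdot 3^2 \cdot 5^2 \cdot 13 \cdot 29 \cdot 31 \cdot 37 \cdot 41 \cdot 43 \cdot 47$

Here $\sqrt{2 \times 24} \approx 6.9$, and no prime with $16 < p \le 24$ appears.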
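The three facts about prime factors of $\binom{2n}{n}$ can be checked on the example $n = 24$. This is a small sketch using trial division; `factorise` is an illustrative helper, not something defined in the notes.

```python
from math import comb

def factorise(m):
    """Prime factorisation of m as {prime: exponent}, by trial division."""
    factors, d = {}, 2
    while d * d <= m:
        while m % d == 0:
            factors[d] = factors.get(d, 0) + 1
            m //= d
        d += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors

n = 24
factors = factorise(comb(2 * n, n))
print(factors)
```

Every prime power $p^r$ in the result is at most $2n = 48$, primes above $\sqrt{48}$ occur at most once, and the primes 17, 19, 23 in $(\frac{2}{3}n, n]$ are absent.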

Grand Finale

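Assume that for some $n \in \mathbb{N}$ there is no $p$ in $n < p < 2n$ (???). Now

$$\sum_{j=0}^{2n} \binom{2n}{j} = 2^{2n} \;\Rightarrow\; \binom{2n}{n} \ge \frac{4^n}{2n}$$

Hence

$$\frac{4^n}{2n} \le \binom{2n}{n} \le \overbrace{\prod_{p \le \sqrt{2n}} 2n}^{A} \cdot \overbrace{\prod_{\sqrt{2n} < p \le \frac{2}{3}n} p}^{B} \cdot \overbrace{\Bigl(\prod_{n < p \le 2n} p\Bigr)}^{C}$$

$A \le (2n)^{\sqrt{2n}}$, $C = 1$ by (???), $B \le 4^{\frac{2}{3}n}$ by (17). Thus

$$4^n \le (2n)^{1+\sqrt{2n}}\, 4^{\frac{2n}{3}} \;\Rightarrow\; \frac{n}{3}\log(4) \le (1+\sqrt{2n})\log(2n) \;\Rightarrow\; n < 468.$$

But $n < p < 2n$ ⇔ $p_{n+1} < 2p_n$.

Consider the primes $q_j$

$$\{2, 3, 5, 7, 13, 23, 43, 83, 163, 317, 631\}$$

$q_{j+1} < 2q_j$ so Bertrand’s Postulate is true for $n < 468$ (!!!), hence it is true $\forall n \ge 2$. ∎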
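The finite check at the end of the proof can be reproduced in two ways: by verifying the chain property $q_{j+1} < 2q_j$, and by searching directly for a prime in $(n, 2n)$ for each $2 \le n < 468$. A minimal sketch, with illustrative helper names:

```python
def is_prime(m):
    """Trial-division primality test, adequate for small inputs."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

chain = [2, 3, 5, 7, 13, 23, 43, 83, 163, 317, 631]

# Every entry is prime and each is less than twice its predecessor.
chain_ok = all(is_prime(q) for q in chain) and all(
    b < 2 * a for a, b in zip(chain, chain[1:])
)

def has_prime_between(n):
    """True if some prime p satisfies n < p < 2n."""
    return any(is_prime(p) for p in range(n + 1, 2 * n))

direct_ok = all(has_prime_between(n) for n in range(2, 468))
print(chain_ok, direct_ok)
```

The chain argument works because if $q_j \le n < q_{j+1}$ then $n < q_{j+1} < 2q_j \le 2n$.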

Prime Number Theorem Implications

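$$\lim_{x \to \infty} \frac{\pi(x)\log(x)}{x} = 1$$

or

$$\pi(x) = \frac{x}{\log(x)} + o\!\left(\frac{x}{\log(x)}\right)$$

Number of primes in $(x, x(1+\varepsilon)]$, $\varepsilon > 0$, is

$$\pi(x + \varepsilon x) - \pi(x) = \frac{\varepsilon x}{\log(x)} + o\!\left(\frac{x}{\log(x)}\right) > 0 \quad\text{for } x > x_\varepsilon$$

⇒ $\exists p$ with $x < p \le (1+\varepsilon)x$. Let $\varepsilon = 1$ ⇒ $\forall n > N_2$ $\exists p$

$$n < p < 2n$$

Bertrand’s Postulate
Chebyshev [1850]
Ramanujan
Erdős at age 19 years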

Progress beyond Bertrand

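$\exists \theta < 1$ with

$$\pi(x + x^\theta) - \pi(x) \sim \frac{x^\theta}{\log(x)}$$

1930 Hoheisel $\theta = 1 - \frac{1}{33{,}000} + \varepsilon$ ($\forall \varepsilon > 0$)

1937 Ingham $\theta = \frac{5}{8} + \varepsilon$

1961 Montgomery $\theta = \frac{3}{5} + \varepsilon$

1972 Huxley $\theta = \frac{7}{12} + \varepsilon$

1979 Iwaniec, Jutila $\theta = \frac{13}{23} + \varepsilon$

1984 Iwaniec, Pintz $\theta = \frac{1}{2} + \frac{1}{21} + \varepsilon = 0.547\ldots + \varepsilon$

1994 Lou and Yao $\theta = \frac{7}{13} + \varepsilon = 0.538\ldots + \varepsilon$

1998 Baker and Harman $\theta = 0.535\ldots + \varepsilon$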

Exercise for John in his Retirement

[Hardy and Wright, 1979]: There is a prime $p$ with $n^2 < p < (n+1)^2$

Note: $\theta = \frac{1}{2}$ ⇒ $\exists p$ with $x < p < x + \sqrt{x}$.

$x = n^2$ ⇒ $n^2 < p < n^2 + n < (n+1)^2$.

Degree of difficulty for the student:

Other Results on the Distribution of Primes


Theorem 21 (Bertrand’s Postulate) $\forall n \in \mathbb{N}$ with $n \ge 2$, $\exists p \in \mathbb{P}$ with $n \le p < 2n$.

Theorem 22 There are infinitely many primes of the form $4n - 1$.

Proof. Assume there are only a finite number and let $p$ be the largest. Let

$$N = 2^2 \cdot \overbrace{3 \cdot 5 \cdots p}^{n} - 1$$

The product $n = 3 \cdot 5 \cdots p$ contains all the odd primes less than or equal to $p$ as factors. Since $N > p$ and $N = 4n - 1$, it cannot be prime. No prime less than or equal to $p$ divides $N$ (since it would divide 1). Thus all the prime factors of $N$ must exceed $p$.

If $x = 4m + 1$ and $y = 4\ell + 1$ then

$$xy = 16m\ell + 4m + 4\ell + 1 = 4(4m\ell + m + \ell) + 1 = 4k + 1$$

If two factors of $N$ are of the form $4n+1$, so is their product. But $N$ has the form $4n - 1$, so at least one prime factor must be of the form $4m - 1$, and it exceeds $p$. This contradiction proves the theorem. ∎
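The key step of the proof of Theorem 22, that $N = 4 \cdot (3 \cdot 5 \cdots p) - 1$ always has a prime factor of the form $4m - 1$ larger than $p$, can be illustrated numerically. This sketch uses an illustrative helper `factor_3_mod_4` and an arbitrarily chosen short list of odd primes.

```python
def factor_3_mod_4(odd_primes):
    """Build N = 4 * (product of odd_primes) - 1 and return (N, q),
    where q is a prime factor of N with q % 4 == 3."""
    n = 1
    for p in odd_primes:
        n *= p
    N = 4 * n - 1
    m, d = N, 3
    while d * d <= m:
        if m % d == 0:
            # d is prime here, because all smaller factors were stripped
            if d % 4 == 3:
                return N, d
            while m % d == 0:
                m //= d
        d += 2
    # All stripped factors were 1 mod 4, so the remaining prime m is 3 mod 4.
    return N, m

odd_primes = [3, 5, 7, 11, 13]
results = [factor_3_mod_4(odd_primes[:i]) for i in range(1, len(odd_primes) + 1)]
print(results)
```

In each case the factor found is congruent to 3 modulo 4 and exceeds every prime used to build $N$.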

Can also show there are an infinite number of primes of each of the forms $4n+1$, $5n-1$, $8n-1$, $8n-3$ and $8n+3$.

Note All numbers of the form $4n$ or $4n + 2$ greater than 2 are composite. Every odd prime $p \in \mathbb{P}$ is of the form $4n+1$ or $4n+3$.


Theorem (Dirichlet) If $k > 0$ and $(h, k) = 1$ then $\forall x > 1$,

$$\sum_{\substack{p \le x \\ p \equiv h \pmod k}} \frac{\log(p)}{p} = \frac{1}{\phi(k)}\log(x) + O(1)$$

Corollary Since $x \to \infty$ ⇒ $\log(x) \to \infty$, there are an infinite number of primes in every arithmetic progression $nk + h$, $n = 0, 1, 2, 3, \ldots$ since $p = nk + h$ for some $n$ ⇔ $p \equiv h \pmod k$.

Theorem (Dirichlet) Let

$$\pi_h(x) = \sum_{\substack{p \le x \\ p \equiv h \pmod k}} 1.$$

Then $\pi_h(x)$ counts the number of primes in $nk + h$, $n = 0, 1, 2, 3, \ldots$.

$$\pi_h(x) \sim \frac{\pi(x)}{\phi(k)} \sim \frac{1}{\phi(k)} \frac{x}{\log(x)} \quad\text{as } x \to \infty$$

Corollary For each $h \pmod k$, $\pi_h(x)$ has the same asymptotic value i.e. the number of primes in each class $[h]_k$ is asymptotically the same.

Note All attempts to extend this result to more complex subsets of $\mathbb{N}$ than arithmetic progressions have failed.
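The counting function $\pi_h(x)$ can be tabulated for small $x$ to see the classes coprime to $k$ filling up at similar rates. This is a rough sketch with a simple sieve; the function names are illustrative.

```python
from math import gcd, isqrt

def primes_up_to(x):
    """Sieve of Eratosthenes for x >= 1."""
    sieve = [True] * (x + 1)
    sieve[0:2] = [False, False]
    for i in range(2, isqrt(x) + 1):
        if sieve[i]:
            for j in range(i * i, x + 1, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]

def primes_by_residue(x, k):
    """Count primes p <= x in each residue class h mod k with gcd(h, k) = 1."""
    counts = {h: 0 for h in range(k) if gcd(h, k) == 1}
    for p in primes_up_to(x):
        if p % k in counts:
            counts[p % k] += 1
    return counts

print(primes_by_residue(100, 4))
print(primes_by_residue(10_000, 8))
```

For finite $x$ the counts are only roughly equal; the theorem describes the limit $x \to \infty$.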


1. Are there an infinite number of primes of the form $p = n^2 + 1$?

Ex There are an infinite number of composites $xy = n^2 + 1$.

2. Are there an infinite number of primes $p$ such that $q = 2p + 1$ is also prime? (Sophie Germain primes.)

3. Are there an infinite number of primes $p$ such that $q = p + 2$ is also prime? (Twin primes conjecture.)

Ex If $n > 3$ one of $\{n, n+2, n+4\}$ is divisible by 3, and is hence composite. (No triple primes conjecture.)

Proof.

$$n \equiv 0 \pmod 3 \;\Rightarrow\; 3 \mid n$$

$$n \equiv 1 \pmod 3 \;\Rightarrow\; n + 2 \equiv 3 \equiv 0 \pmod 3 \;\Rightarrow\; 3 \mid n + 2$$

$$n \equiv 2 \pmod 3 \;\Rightarrow\; n + 4 \equiv 6 \equiv 0 \pmod 3 \;\Rightarrow\; 3 \mid n + 4$$

4. Find a quadratic polynomial $f(n) = an^2 + bn + c$ with an infinite number of prime values.


6 Sums of Squares

Sums of Two Squares

Which $n$ can be expressed as $n = x^2 + y^2$?

$$1 = 1^2 + 0^2$$

$$2 = 1^2 + 1^2$$

$$4 = 2^2 + 0^2$$

$$5 = 2^2 + 1^2$$

$$8 = 2^2 + 2^2$$

But 3, 6, 7 cannot be written in this form.

Proposition If $n \equiv 3 \pmod 4$ then $n = x^2 + y^2$ is impossible.

Proof. $x^2 \equiv 0$ or $1 \pmod 4$ only ⇒ $x^2 + y^2 \equiv 0$, $1$ or $2 \pmod 4$ only ⇒ $x^2 + y^2 \equiv 3 \pmod 4$ is impossible. ∎

Ex

$$3 \ne x^2 + y^2$$

$$7 \ne x^2 + y^2$$

$$15 \ne x^2 + y^2$$


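Proposition If $n$ is representable (as the sum of two squares) so is $k^2 n$ $\forall k \in \mathbb{N}$.

Proof. $x^2 + y^2 = n$ ⇒ $k^2x^2 + k^2y^2 = k^2 n$ ⇒ $k^2 n = (kx)^2 + (ky)^2$ so $k^2 n$ is representable. ∎

Theorem $n$ is not representable ⇔ $\exists p^\alpha \,\|\, n$ where $\alpha$ is odd and $p \equiv 3 \pmod 4$.

Proof. (⇐) Let $n = x^2 + y^2$ and $d = (x, y)$ (the GCD), $x_1 = \frac{x}{d}$, $y_1 = \frac{y}{d}$, $n_1 = \frac{n}{d^2}$, then

$$\left(\frac{x}{d}\right)^2 + \left(\frac{y}{d}\right)^2 = \frac{n}{d^2}$$

⇒ $d^2 \mid n$ and $x_1^2 + y_1^2 = n_1$ and $(x_1, y_1) = 1$.

If $p^\beta \,\|\, d$ ⇒ $p^{\alpha - 2\beta} \mid n_1$ and $\alpha - 2\beta \ge 1$ since $\alpha$ is odd. Hence $p \mid n_1$. But $(x_1, y_1) = 1$ so $p \nmid x_1$ and there is a $u \in \mathbb{Z}$ so $u x_1 \equiv y_1 \pmod p$.

Hence $0 \equiv n_1 \equiv x_1^2 + y_1^2 \equiv x_1^2 + (u x_1)^2 \equiv x_1^2 (1 + u^2) \pmod p$. But $(p, x_1) = 1$ also so $1 + u^2 \equiv 0 \pmod p$. But $p \equiv 3 \pmod 4$ ⇒ $\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}} = (-1)^{\frac{2 + 4\ell}{2}} = -1$ so $u^2 \equiv -1 \pmod p$ is impossible. This contradiction shows $n \ne x^2 + y^2$. ∎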
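The representability criterion can be compared against a brute-force search for small $n$. The sketch below is only a numerical check, not a proof; both function names are illustrative.

```python
from math import isqrt

def is_sum_of_two_squares(n):
    """Brute-force search for x, y >= 0 with x^2 + y^2 = n."""
    for x in range(isqrt(n) + 1):
        y2 = n - x * x
        y = isqrt(y2)
        if y * y == y2:
            return True
    return False

def blocked_by_criterion(n):
    """True if some prime p = 3 (mod 4) divides n to an odd power."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            if p % 4 == 3 and e % 2 == 1:
                return True
        p += 1
    # Whatever remains is 1 or a single prime (exponent 1).
    return n > 1 and n % 4 == 3

mismatches = [n for n in range(1, 501)
              if is_sum_of_two_squares(n) == blocked_by_criterion(n)]
print(mismatches)
```

An empty list of mismatches means the two tests agree on every $n \le 500$.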

Proposition $a, b, c, d \in \mathbb{Z}$ ⇒ $(a^2 + b^2)(c^2 + d^2) = (ac + bd)^2 + (ad - bc)^2$.

Proof. LHS $= a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2$

RHS $= a^2c^2 + b^2d^2 + 2acbd + a^2d^2 + b^2c^2 - 2adbc =$ LHS.

OR

$z = a - ib$, $w = c + id$, $|z|^2 \cdot |w|^2 = |zw|^2$. ∎

Note: If $n_1 = x_1^2 + y_1^2$ and $n_2 = x_2^2 + y_2^2$ then $n_1 n_2 = z_1^2 + z_2^2$ where $z_1 = x_1x_2 + y_1y_2$ and $z_2 = x_1y_2 - x_2y_1$. Hence the product of any two representable numbers is representable.

Ex $5 = 2^2 + 1^2$, $13 = 3^2 + 2^2$ ⇒ $65 = 5 \cdot 13 = (2 \cdot 3 + 1 \cdot 2)^2 + (2 \cdot 2 - 3 \cdot 1)^2$.

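Theorem Every prime $p \equiv 1 \pmod 4$ can be written as the sum of two squares.

Proof. Outline: Show $x^2 + y^2 = kp$. Then if $1 < k$ there is a $k_1 < k$.

$p \equiv 1 \pmod 4$ ⇒ $\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}} = 1$ so $u^2 \equiv -1 \pmod p$ has a solution. Hence $u^2 + 1 = kp$ for some $k \in \mathbb{N}$. Let $x = u$, $y = 1$. So $x^2 + y^2 = kp$.

Define $r$, $s$ by

$$r \equiv x \pmod k, \quad -\frac{k}{2} < r \le \frac{k}{2}; \qquad s \equiv y \pmod k, \quad -\frac{k}{2} < s \le \frac{k}{2}.$$

Then $r^2 + s^2 \equiv x^2 + y^2 \equiv 0 \pmod k$ ⇒ $r^2 + s^2 = k_1 k$ for some $k_1 \ge 1$ ⇒ $(rx + sy)^2 + (ry - sx)^2 = (r^2 + s^2)(x^2 + y^2) = (k_1 k)(kp) = k_1 k^2 p$ from the Proposition above.

But $rx + sy \equiv r^2 + s^2 \equiv 0 \pmod k$ and $ry - sx \equiv rs - sr \equiv 0 \pmod k$. So $k^2 \mid (rx + sy)^2$ and $k^2 \mid (ry - sx)^2$ and we can write

$$\left(\frac{rx + sy}{k}\right)^2 + \left(\frac{ry - sx}{k}\right)^2 = k_1 p \;\Rightarrow\; x_1^2 + y_1^2 = k_1 p.$$

$r^2 + s^2 \le \frac{k^2}{4} + \frac{k^2}{4} = \frac{k^2}{2}$ but $r^2 + s^2 = k_1 k$ ⇒ $k_1 k \le \frac{k^2}{2}$ ⇒ $k_1 \le \frac{k}{2}$ ⇒ $k_1 < k$ and we are done. ∎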
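The descent in this proof is constructive, so it can be run as an algorithm. The following sketch follows the steps above; the way $u$ is found (raising a quadratic non-residue to the power $(p-1)/4$) is an assumption added for the example, since the notes only assert that $u$ exists.

```python
def centred_residue(x, k):
    """Residue r of x mod k with -k/2 < r <= k/2."""
    r = x % k
    return r - k if 2 * r > k else r

def prime_as_two_squares(p):
    """Write a prime p = 1 (mod 4) as x^2 + y^2 by descent on k in x^2 + y^2 = k*p."""
    # Assumed helper step: if a is a non-residue then u = a^((p-1)/4) has u^2 = -1 (mod p).
    u = next(
        pow(a, (p - 1) // 4, p)
        for a in range(2, p)
        if pow(a, (p - 1) // 2, p) == p - 1
    )
    x, y = u, 1
    k = (x * x + y * y) // p
    while k > 1:
        r, s = centred_residue(x, k), centred_residue(y, k)
        # Both numerators are divisible by k, as shown in the proof.
        x, y = (r * x + s * y) // k, (r * y - s * x) // k
        k = (x * x + y * y) // p
    return abs(x), abs(y)

for p in (5, 13, 17, 29, 97, 1009):
    print(p, prime_as_two_squares(p))
```

Since $k_1 \le k/2$ at every step, the loop runs at most about $\log_2 p$ times.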

Notes:

1. $n = x^2 + y^2 + z^2$ ⇔ $n \ne 4^e(8k + 7)$ and only 15 numbers less than 100 cannot be written as the sum of three squares

$$\{7, 15, 23, 28, 31, 39, 47, 55, 60, 63, 71, 79, 87, 92, 95\}$$

2. Every integer can be written as the sum of four squares.

3. 3 6= x3+3 but every integer can be written as the sum of 9 cubes (of positive integers).

4. Let $g(k)$ be the smallest value of $s \in \mathbb{N}$ such that every integer can be written as the sum of $s$ $k$th powers. $g(2) = 4$, $g(3) = 9$

• 1770 Waring guessed $\{g(2), g(3), g(4) = 19\}$

• 1909 Proved $g(k)$ exists for all $k$.

• Much later

$$g(k) = 2^k + \left\lfloor \left(\frac{3}{2}\right)^k \right\rfloor - 2$$

for $6 \le k \le 200{,}000$ and thought to be true for all $k$.
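The closed form for $g(k)$ is simple to evaluate exactly with integer arithmetic. A tiny sketch (the name `waring_g` is illustrative) shows that the same expression also reproduces the small values $g(2)$, $g(3)$, $g(4)$ listed above.

```python
def waring_g(k):
    """Evaluate 2^k + floor((3/2)^k) - 2 without floating point."""
    return 2 ** k + (3 ** k // 2 ** k) - 2

print([waring_g(k) for k in range(2, 8)])
```

Using `3 ** k // 2 ** k` avoids rounding errors that a floating-point `(3 / 2) ** k` would introduce for large $k$.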


5. Goldbach’s Conjecture (1742): Every even integer $n > 2$ can be written as the sum of two primes.

4 = 2 + 2

6 = 3 + 3

8 = 5 + 3

10 = 5 + 5

12 = 7 + 5

...

100 = 97 + 3

• Known $2n = p_1 + p_2 + \cdots + p_k$, $k \le 2 \times 10^{10}$.

• Vinogradov $n > n_0$, $2n = p_1 + p_2 + p_3 + p_4$.

• Chen (1966) $n > n_0$, $2n = p_1 + p_2 p_3$.

Sums of Four Squares

Bachet (1621): Stated $\forall n \in \mathbb{N}$, $n = x^2 + y^2 + z^2 + w^2$ where $x, y, z, w \ge 0$. Verified up to $n = 325$.

Fermat claimed he had a proof.

Descartes: “The theorem is true, but so difficult I dare not undertake it.”

Euler (1743): Product of two sums of four squares is again a sum of four squares.

(1751): $1 + x^2 + y^2 \equiv 0 \pmod p$ $\forall p \in \mathbb{P}$.

Lagrange (1770): Proof.

Euler (1773): Simpler proof, after 43 years!

Proposition The product of two sums of four squares is a sum of four squares.

Proof.

$$(a^2 + b^2 + c^2 + d^2)(r^2 + s^2 + t^2 + u^2)$$

$$= (ar + bs + ct + du)^2 + (as - br + cu - dt)^2$$

$$+ (at - bu - cr + ds)^2 + (au + bt - cs - dr)^2$$

Check by multiplying out each side. ∎
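Multiplying out by hand is tedious, so a numerical spot check of the four-square identity may be reassuring. This sketch tests it on every 8-tuple with entries in $\{-1, 0, 1\}$ and on one larger example; it is a sanity check, not a proof, and the function name is illustrative.

```python
from itertools import product

def euler_four_square(a, b, c, d, r, s, t, u):
    """The four bracketed terms on the right-hand side of the identity."""
    return (
        a * r + b * s + c * t + d * u,
        a * s - b * r + c * u - d * t,
        a * t - b * u - c * r + d * s,
        a * u + b * t - c * s - d * r,
    )

def identity_holds(a, b, c, d, r, s, t, u):
    lhs = (a * a + b * b + c * c + d * d) * (r * r + s * s + t * t + u * u)
    rhs = sum(v * v for v in euler_four_square(a, b, c, d, r, s, t, u))
    return lhs == rhs

all_small_ok = all(identity_holds(*v) for v in product((-1, 0, 1), repeat=8))
print(all_small_ok, identity_holds(1, 2, 3, 4, 5, 6, 7, 8))
```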

We now need only show every prime $p$ is the sum of four squares.

Proposition If $p$ is an odd prime then

$$1 + x^2 + y^2 \equiv 0 \pmod p$$

has a solution with $0 \le x, y < \frac{p}{2}$.


Proof. Let

$$S_1 = \left\{0^2, 1^2, \ldots, \left(\frac{p-1}{2}\right)^2\right\}.$$

Then $x^2 \equiv y^2 \pmod p$ ⇒ $(x + y)(x - y) \equiv 0 \pmod p$ ⇒ $p \mid x + y$ or $p \mid x - y$ ⇒ $x = y$ if $0 \le x, y \le \frac{p-1}{2}$, so the numbers in $S_1$ are distinct mod $p$. So are the numbers in

$$S_2 = \left\{-1 - 0^2,\ -1 - 1^2,\ -1 - 2^2,\ \ldots,\ -1 - \left(\frac{p-1}{2}\right)^2\right\}.$$

$S_1 \cup S_2$ contains $\left(\frac{p-1}{2} + 1 + \frac{p-1}{2} + 1\right)$ numbers i.e. $p + 1$ numbers. Since there are only $p$ residues mod $p$, one number in $S_1$ is congruent to one number in $S_2$, or $x^2 \equiv -1 - y^2 \pmod p$ and $0 \le x, y \le \frac{p-1}{2}$ ⇒ $1 + x^2 + y^2 \equiv 0 \pmod p$. ∎

Approach to Lagrange’s theorem: Express some multiple of $p$ as the sum of four squares, then prove there is a smaller multiple. The Proposition above implies $kp = x^2 + y^2 + 1^2 + 0^2$.

Proposition If $p$ is an odd prime, there is an odd integer $m < p$ such that

$$mp = x^2 + y^2 + z^2 + w^2.$$

Proof. By the above

$$kp = x^2 + y^2 + 1^2 + 0^2$$

where $0 \le x, y < \frac{p}{2}$ ⇒ $kp = x^2 + y^2 + 1 < \frac{p^2}{4} + \frac{p^2}{4} + 1 < p^2$ ⇒ $k < p$.


Claim: We can choose $k$ odd. If $k$ is even let

$$kp = x^2 + y^2 + z^2 + w^2$$

then all of $x, y, z, w$ are odd, all are even, or two are odd and two are even. So arrange terms so $x \equiv y \pmod 2$ and $z \equiv w \pmod 2$.

$$\Rightarrow\; \frac{kp}{2} = \left(\frac{x - y}{2}\right)^2 + \left(\frac{x + y}{2}\right)^2 + \left(\frac{z - w}{2}\right)^2 + \left(\frac{z + w}{2}\right)^2$$

If $\frac{k}{2}$ is even repeat this process, until eventually, we obtain an odd multiple of $p$ expressed as the sum of four squares. ∎

Proposition If $m$, $p$ are odd, $1 < m < p$ and $mp = x^2 + y^2 + z^2 + w^2$ then there is a positive integer $m_1$ with $m_1 < m$ and

$$m_1 p = x_1^2 + y_1^2 + z_1^2 + w_1^2$$

Proof. Choose $A, B, C, D$ in $-\frac{m}{2} < A, B, C, D < \frac{m}{2}$ with $A \equiv x$, $B \equiv y$, $C \equiv z$, $D \equiv w \pmod m$.

⇒ $A^2 + B^2 + C^2 + D^2 \equiv x^2 + y^2 + z^2 + w^2 \equiv 0 \pmod m$

⇒ $A^2 + B^2 + C^2 + D^2 = km$ for some $k$. But $A^2 + B^2 + C^2 + D^2 < \frac{m^2}{4} + \frac{m^2}{4} + \frac{m^2}{4} + \frac{m^2}{4} = m^2$ ⇒ $0 < k < m$. ($k \ne 0$ since $k = 0$ ⇒ $A = B = C = D = 0$ so $m \mid x, y, z, w$ ⇒ $m^2 \mid mp$ ⇒ $m \mid p$, which is impossible for a prime $p$ with $1 < m < p$.)

Hence $m^2 kp = (x^2 + y^2 + z^2 + w^2)(A^2 + B^2 + C^2 + D^2)$

$$\Rightarrow\; m^2 kp = (xA + yB + zC + wD)^2 + (xB - yA + zD - wC)^2 + (xC - yD - zA + wB)^2 + (xD + yC - zB - wA)^2$$

Each term in parentheses on the RHS is divisible by $m$:

$$xA + yB + zC + wD \equiv x^2 + y^2 + z^2 + w^2 \equiv 0 \pmod m$$

$$xB - yA + zD - wC \equiv xy - yx + zw - wz \equiv 0 \pmod m$$

$$xC - yD - zA + wB \equiv xz - yw - zx + wy \equiv 0 \pmod m$$

$$xD + yC - zB - wA \equiv xw + yz - zy - wx \equiv 0 \pmod m$$

So put

$$x_1 = \frac{xA + yB + zC + wD}{m}, \qquad y_1 = \frac{xB - yA + zD - wC}{m},$$

$$z_1 = \frac{xC - yD - zA + wB}{m}, \qquad w_1 = \frac{xD + yC - zB - wA}{m}$$

⇒ $x_1^2 + y_1^2 + z_1^2 + w_1^2 = \frac{m^2 kp}{m^2} = kp$ and $k < m$ (from above) so the proposition is proved.


Theorem Every positive integer can be written as the sum of four integer squares.

Proof. $2 = 1^2 + 1^2$ and $p_i = x_i^2 + y_i^2 + z_i^2 + w_i^2$ for odd $p_i \in \mathbb{P}$. So given

$$n = 2^{\alpha_0} \prod_{i=1}^{m} p_i^{\alpha_i}$$

apply the above results as many times as necessary to show that $n$ can be expressed as the sum of four squares. ∎
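For small $n$ the four-square theorem can be confirmed by exhaustive search. This sketch does not follow the descent proof; it simply looks for a representation with $x \ge y \ge z \ge w \ge 0$, and the function name is an illustrative choice.

```python
from math import isqrt

def four_squares(n):
    """Return (x, y, z, w) with x >= y >= z >= w >= 0 and
    x^2 + y^2 + z^2 + w^2 = n, found by brute force."""
    for x in range(isqrt(n), -1, -1):
        for y in range(min(x, isqrt(n - x * x)), -1, -1):
            for z in range(min(y, isqrt(n - x * x - y * y)), -1, -1):
                w2 = n - x * x - y * y - z * z
                w = isqrt(w2)
                if w * w == w2 and w <= z:
                    return x, y, z, w
    return None

print(four_squares(7), four_squares(310))
```

The number 7 needs all four squares to be non-zero, which is consistent with the three-squares note earlier in this section.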

Euler’s Conjecture (1769): $\forall k \ge 3$ a non-zero $k$th power is not equal to the sum of $k - 1$ non-zero $k$th powers.

(1966) Lander and Parkin $k = 5$

$$144^5 = 27^5 + 84^5 + 110^5 + 133^5$$

(1988) Elkies (using elliptic curves)

$$20{,}615{,}673^4 = 2{,}682{,}440^4 + 15{,}365{,}639^4 + 18{,}796{,}760^4$$
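Both counterexamples to Euler's conjecture can be verified exactly, since Python integers have arbitrary precision. A short check:

```python
# Lander and Parkin (1966): four fifth powers summing to a fifth power.
lander_parkin = 27**5 + 84**5 + 110**5 + 133**5 == 144**5

# Elkies (1988): three fourth powers summing to a fourth power.
elkies = 2_682_440**4 + 15_365_639**4 + 18_796_760**4 == 20_615_673**4

print(lander_parkin, elkies)
```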

Notes:

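1. Every integer can be expressed as the “algebraic” sum of three squares i.e. $n = \pm x^2 \pm y^2 \pm z^2$

$$2n + 1 = (n+1)^2 - n^2 + 0^2 \qquad (2n + 1 \text{ odd})$$

$$2n = (n+1)^2 - n^2 - 1^2 \qquad (2n \text{ even})$$

2. Every integer can be expressed as the sum of five cubes: If $6 \mid n$ then

$$(x+1)^3 + (x-1)^3 - 2x^3 = 6x = n$$

$\forall n$, $6 \mid n - n^3$ so $x \to \frac{n - n^3}{6}$

$$\Rightarrow\; \left(\frac{n - n^3}{6} + 1\right)^3 + \left(\frac{n - n^3}{6} - 1\right)^3 - 2\left(\frac{n - n^3}{6}\right)^3 = n - n^3$$

$$\Rightarrow\; n = x_1^3 + x_2^3 + x_3^3 + x_4^3 + x_5^3$$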
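The five-cubes construction gives explicit (possibly negative) integers for every $n$. A minimal sketch, with `five_cubes` as an illustrative name:

```python
def five_cubes(n):
    """Integers x1..x5 (negatives allowed) whose cubes sum to n."""
    x = (n - n ** 3) // 6  # exact, because 6 divides n - n^3
    return [n, x + 1, x - 1, -x, -x]

for n in (1, 2, 7, 23, -5):
    xs = five_cubes(n)
    print(n, xs, sum(v ** 3 for v in xs))
```

The cube $n^3$ supplies the first term, and $(x+1)^3 + (x-1)^3 - 2x^3 = 6x = n - n^3$ supplies the remaining four.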

7 Diophantine Equations

Ex The Pythagorean equation $x^2 + y^2 = z^2$, $(x, y, z) = 1$ has a solution

$$x = p^2 - q^2, \quad y = 2pq, \quad z = p^2 + q^2$$

for $p \in \mathbb{N}$, $q \in \mathbb{N}$ with $p > q$, $(p, q) = 1$, one being even and one being odd. Conversely every solution of the Pythagorean equation in coprime positive integers has this form. Ex $p = 2$, $q = 1$ gives $(3, 4, 5)$; $p = 3$, $q = 2$ gives $(5, 12, 13)$.
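The parametrisation can be turned into a generator of primitive Pythagorean triples. This is a small sketch; the function name and the bound on $z$ are illustrative choices.

```python
from math import gcd, isqrt

def primitive_triples(z_max):
    """Primitive triples (p^2 - q^2, 2pq, p^2 + q^2) with hypotenuse <= z_max."""
    triples = []
    for p in range(2, isqrt(z_max) + 1):
        for q in range(1, p):
            # p, q coprime and of opposite parity
            if gcd(p, q) == 1 and (p - q) % 2 == 1:
                x, y, z = p * p - q * q, 2 * p * q, p * p + q * q
                if z <= z_max:
                    triples.append((x, y, z))
    return sorted(triples, key=lambda t: t[2])

print(primitive_triples(30))
```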


Ex The equation $2x^2 + 3y^2 = z^2$ is insoluble: Assume $(x, y, z) = 1$ ⇒ $3 \nmid x$. But $2x^2 \equiv z^2 \pmod 3$ ⇒ $(zx^{-1})^2 \equiv 2 \pmod 3$. But $0^2 \equiv 0$, $1^2 \equiv 1$, $2^2 \equiv 1 \pmod 3$ so this equation has no solution.

Fermat’s Last Theorem (FLT, 1994, Wiles, Ribet & Taylor) For $n \ge 3$ the equation

$$x^n + y^n = z^n$$

has no solution in (strictly) positive integers.

Theorem 23 There are no non-trivial solutions to $x^4 + y^4 = z^2$.

Corollary FLT is true for $n = 4$: If false there is a solution $a^4 + b^4 = c^4 = (c^2)^2$ so there would be a solution $x = a$, $y = b$, $z = c^2$.

Proof. (of Theorem 23) (“method of infinite descent”) Suppose $a^4 + b^4 = c^2$ is a solution with $c^2$ as small as possible. Then

Claim: $(a, b) = 1$. If not $\exists p \in \mathbb{P}$, $p \mid a$ and $p \mid b$ ⇒ $p^4 \mid c^2$ so $p^2 \mid c$ and

$$\left(\frac{a}{p}\right)^4 + \left(\frac{b}{p}\right)^4 = \left(\frac{c}{p^2}\right)^2$$

would be a solution with a smaller value for $c^2$.

Claim: $a$ and $b$ cannot both be odd. If so $a = 2n + 1$, $b = 2m + 1$ ⇒ $a^4 + b^4 \equiv 2 \pmod 4$ but $0^2 \equiv 0$, $1^2 \equiv 1$, $2^2 \equiv 0$, $3^2 \equiv 1 \pmod 4$ so $c^2 \equiv 2 \pmod 4$ is impossible.

Claim: $a$ and $b$ cannot both be even. Since $2 \mid a$, $2 \mid b$ ⇒ $2 \mid (a, b) = 1$.

Hence one is even and one is odd. Call the even one $a$. Now we have a solution to $x^2 + y^2 = z^2$, $(x, y) = 1$.

$$(a^2)^2 + (b^2)^2 = c^2$$

$(a^2, b^2) = 1$, $a^2$ is even and $b^2$ is odd.

Hence, $\exists m, n \in \mathbb{Z}$, $(m, n) = 1$ not both odd so

$$a^2 = 2mn, \qquad b^2 = m^2 - n^2, \qquad c = m^2 + n^2 \tag{18}$$

Claim: $n$ is even. If $n$ is odd and $m$ even, $b^2 \equiv m^2 - n^2 \pmod 4$ ⇒ $b^2 \equiv -n^2 \pmod 4$ ⇒ $x^2 \equiv -1 \pmod 4$ but this is impossible so $n$ is even, hence $m$ is odd by (18).

Say $n = 2q$.

$$(18) \;\Rightarrow\; a^2 = 4mq \;\Rightarrow\; \left(\frac{a}{2}\right)^2 = mq. \tag{19}$$

Claim $(m, q) = 1$: If not, $\exists p \mid m$ and $p \mid q$ ⇒ $p \mid m$ and $p \mid n$ ⇒ $(m, n) \ne 1$. By (19) $\exists t, v \in \mathbb{N}$ with $m = t^2$, $q = v^2$ and $(t, v) = 1$. Now

$$n^2 + (m^2 - n^2) = m^2 \tag{20}$$

we know

$$n = 2q = 2v^2, \qquad m^2 - n^2 = b^2, \qquad m = t^2$$

so (20) ⇒ $(2v^2)^2 + b^2 = (t^2)^2$ and no two of $2v^2$, $b$ and $t^2$ have a common factor. Therefore, by the Pythagorean theorem again $2v^2 = 2AB$, $t^2 = A^2 + B^2$, $A > 0$, $B > 0$, $(A, B) = 1$. $v^2 = AB$ and $(A, B) = 1$ ⇒ $A = r^2$, $B = s^2$, $r > 0$, $s > 0$ and so $r^4 + s^4 = t^2$.

But $t \le t^2 = m \le m^2 < m^2 + n^2 = c$ by (18) and so $c$ is not the least member of a solution (!!!). ∎

Corollary $x^{4n} + y^{4n} = z^{4n}$, $n = 1, 2, 3, \ldots$ has no non-trivial solution.

Proof. If it did, say $a^{4n} + b^{4n} = c^{4n}$, $a, b, c \ge 1$ ⇒ $(a^n)^4 + (b^n)^4 = (c^{2n})^2$ would give a solution to $x^4 + y^4 = z^2$, which is impossible. ∎

Theorem 24 FLT($p$) for any odd prime $p$ ⇒ FLT($n$) $\forall n \ge 3$.

Proof. Let $n \ge 3$ and $a^n + b^n = c^n$, $a, b, c \ge 1$. Let $n$ not be divisible by any odd prime. Then $n = 2^m$, $m \ge 2$ so $a^{2^m} + b^{2^m} = c^{2^m}$

$$\Rightarrow\; \left(a^{2^{m-2}}\right)^4 + \left(b^{2^{m-2}}\right)^4 = \left(c^{2^{m-1}}\right)^2$$

solves $x^4 + y^4 = z^2$ which is impossible.

Let $n$ be divisible by some odd prime $p \ge 3$ so $n = pm$. Then

$$a^n + b^n = c^n \;\Rightarrow\; a^{mp} + b^{mp} = c^{mp} \;\Rightarrow\; (a^m)^p + (b^m)^p = (c^m)^p$$

which is impossible by FLT($p$). ∎

Further reading: Fermat’s Last Theorem, by Simon Singh (4th Estate)
Video: Fermat’s Last Theorem, BBC Horizon

Proof based on the Frey curve $y^2 = x(x - A^p)(x - B^p)$, which, if $A^p + B^p = C^p$, has a “discriminant” $(ABC)^p$, should not, and does not exist.


Ex Equations like $y^2 = x^3 + 7$ are called “elliptic curves”. They arise in solving integrals for, say, the period of a body in a planetary orbit.

(Lebesgue, 1869) The equation $y^2 = x^3 + 7$ is insoluble over $\mathbb{Z}$.

Proof. If $x$ is even, $x = 2\alpha$ ⇒ RHS $= 8\alpha^3 + 7 = 8\beta + 7$, where $\beta = \alpha^3$. But $0^2 \equiv 0$, $1^2 \equiv 1$, $2^2 \equiv 4$, $3^2 \equiv 1$, $4^2 \equiv 0$, $5^2 \equiv 1$, $6^2 \equiv 4$ and $7^2 \equiv 1 \pmod 8$ so $y^2 \equiv 7 \pmod 8$ has no solution. Hence $x$ is odd. Write

$$y^2 + 1 = x^3 + 8 = (x + 2)(x^2 - 2x + 4) = (x + 2)((x - 1)^2 + 3)$$

If $x = 2n + 1$ (odd) then $(x - 1)^2 + 3 = 4n^2 + 3 = 4m + 3$, $m = n^2$ so (see back) must have a prime factor of the form $p = 4\ell + 3$. But then $y^2 + 1 \equiv qp \equiv 0 \pmod p$. But (lemma later) $p \equiv 3 \pmod 4$ ⇒ $y^2 \equiv -1 \pmod p$ has no solution. ∎

We frequently need to know the answer to the following: When does $x^2 \equiv r \pmod p$ have a solution $x$? Or, more generally, $x^2 \equiv \alpha \pmod m$. The answer is given by the theory of quadratic reciprocity due to Gauss. This will be developed later.


8 Pell’s Equation

$$x^2 - Ny^2 = 1$$

Trivial solution $x = 1$, $y = 0$, $x, y \ge 0$.

$N = -1$ ⇒ $(x, y) = (1, 0)$ or $(0, 1)$ are trivial solutions only.

$N \le -2$ ⇒ $(x, y) = (1, 0)$.

Let $N > 0$ and not a square: If $N = M^2$, $M \ge 1$, $x^2 - Ny^2 = x^2 - (My)^2 = (x - My)(x + My) = 1$ ⇒ $x - My = 1$ and $x + My = 1$ so we can get all solutions. Indeed $(x, y) = (1, 0)$ for $x, y \ge 0$. So we always assume $N \ge 2$.

Note: Solutions to Pell’s equation provide good rational approximations for square roots, since $x^2 = Ny^2 + 1$

$$\Rightarrow\; \left(\frac{x}{y}\right)^2 = N + \frac{1}{y^2} \;\Rightarrow\; \frac{x}{y} \approx \sqrt{N} \text{ if } y \text{ is large.}$$

Note: This type of equation has a long and interesting history, and has lots of applications, especially to fields $F = \mathbb{Q}(\sqrt{N})$.

Ex (Euler, 1770) A triangular number has the form $\frac{n(n+1)}{2}$. Which numbers are both triangular and square?

$$m^2 = \frac{n(n+1)}{2} \;\Rightarrow\; 8m^2 + 1 = 4n^2 + 4n + 1 = (2n+1)^2$$

⇒ $x^2 - 2y^2 = 1$ where $x = 2n + 1$, $y = 2m$.

So solutions to this Pellian equation produce (all) square triangular numbers.
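Square triangular numbers can be generated from solutions of $x^2 - 2y^2 = 1$. The sketch below starts from the solution $(3, 2)$ and repeatedly multiplies $x + y\sqrt{2}$ by $3 + 2\sqrt{2}$; the function name is illustrative.

```python
def square_triangular(count):
    """First `count` numbers that are both square (m^2) and triangular (n(n+1)/2)."""
    x, y = 3, 2  # least positive solution of x^2 - 2*y^2 = 1
    found = []
    for _ in range(count):
        n, m = (x - 1) // 2, y // 2  # from x = 2n + 1, y = 2m
        found.append((n, m, m * m))
        # (x + y*sqrt(2)) * (3 + 2*sqrt(2)) = (3x + 4y) + (2x + 3y)*sqrt(2)
        x, y = 3 * x + 4 * y, 2 * x + 3 * y
    return found

for n, m, value in square_triangular(4):
    print(n, m, value)
```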

Definition A fundamental solution to $x^2 - dy^2 = 1$ is $(r, s)$ where any other positive solution satisfies $r < x$ and $s < y$.

Theorem 25 (Lagrange) Let $(r, s)$ be the least positive (or fundamental) solution to $x^2 - dy^2 = 1$, where $d$ is not a square. Then every solution to this equation is given by $(x_n, y_n)$ where

$$x_n + \sqrt{d}\,y_n = (r + s\sqrt{d})^n$$

for $n = 1, 2, 3, \ldots$

Proof.

$$x_n^2 - dy_n^2 = (x_n + y_n\sqrt{d})(x_n - y_n\sqrt{d}) = (r + s\sqrt{d})^n (r - s\sqrt{d})^n = (r^2 - s^2 d)^n = 1^n = 1$$

Hence $(x_n, y_n)$ is a solution.

Let $(a, b)$ be a solution. Suppose $\forall n = 1, 2, 3, \ldots$, $(a, b) \ne (x_n, y_n)$. Then there is a positive integer $m$ with

$$(r + s\sqrt{d})^m < a + b\sqrt{d} < (r + s\sqrt{d})^{m+1} \tag{21}$$

But $(r + s\sqrt{d})^{-m} = (r - s\sqrt{d})^m$ so (21) ⇒

$$1 < (a + b\sqrt{d})(r - s\sqrt{d})^m < (r + s\sqrt{d}) \tag{22}$$

Let $u + v\sqrt{d} = (a + b\sqrt{d})(r - s\sqrt{d})^m$ so

$$u^2 - v^2 d = (u + v\sqrt{d})(u - v\sqrt{d}) = (a + b\sqrt{d})(r - s\sqrt{d})^m (a - b\sqrt{d})(r + s\sqrt{d})^m = (a^2 - b^2 d)(r^2 - s^2 d)^m = 1 \cdot 1^m = 1$$

Thus $(u, v)$ is a solution.

But $1 < u + v\sqrt{d}$ ⇒ $0 < u - v\sqrt{d} < 1$ so

$$2u = (u + v\sqrt{d}) + (u - v\sqrt{d}) > 1 + 0 > 0$$

And $2v\sqrt{d} = (u + v\sqrt{d}) - (u - v\sqrt{d}) > 1 - 1 = 0$ so $u > 0$, $v > 0$ and $u + v\sqrt{d} < r + s\sqrt{d}$ by (22), contradicting the assumption that $(r, s)$ is the fundamental solution. Hence $(a, b) = (x_n, y_n)$ for some $n$. ∎

Finding the least positive solution is not easy however and requires the theory of continued fractions of J. L. Lagrange. Frenicle’s table for non-square $d$ up to 50 is given below.


Pell's equation

Euler, after a cursory reading of Wallis's Opera Mathematica, mistakenly attributed the first serious study of nontrivial solutions to equations of the form $x^2 - dy^2 = 1$, where $x \ne 1$ and $y \ne 0$, to Cromwell's mathematician John Pell. However, there is no evidence that Pell, who taught at the University of Amsterdam, had ever considered solving such equations. They would be more aptly called Fermat's equations, since Fermat first investigated properties of nontrivial solutions of such equations. Nevertheless, Pellian equations have a long history and can be traced back to the Greeks. Theon of Smyrna used $x/y$ to approximate $\sqrt{2}$, where $x$ and $y$ were integral solutions to $x^2 - 2y^2 = 1$. In general, if $x^2 = dy^2 + 1$, then $x^2/y^2 = d + 1/y^2$. Hence, for $y$ large, $x/y$ is a good approximation of $\sqrt{d}$, a fact well known to Archimedes.

Archimedes's problema bovinum took two thousand years to solve. According to a manuscript discovered in the Wolfenbüttel library in 1773 by Gotthold Ephraim Lessing, the German critic and dramatist, Archimedes became upset with Apollonius of Perga for criticizing one of his works. He devised a cattle problem that would involve immense calculation to solve and sent it off to Apollonius. In the accompanying correspondence, Archimedes asked Apollonius to compute, if he thought he was smart enough, the number of the oxen of the sun that grazed once upon the plains of the Sicilian isle Trinacria and that were divided according to color into four herds, one milk white, one black, one yellow and one dappled, with the following constraints:

white bulls = yellow bulls + $\left(\frac{1}{2} + \frac{1}{3}\right)$ black bulls,

black bulls = yellow bulls + $\left(\frac{1}{4} + \frac{1}{5}\right)$ dappled bulls,

dappled bulls = yellow bulls + $\left(\frac{1}{6} + \frac{1}{7}\right)$ white bulls,

white cows = $\left(\frac{1}{3} + \frac{1}{4}\right)$ black herd,

black cows = $\left(\frac{1}{4} + \frac{1}{5}\right)$ dappled herd,

dappled cows = $\left(\frac{1}{5} + \frac{1}{6}\right)$ yellow herd, and

yellow cows = $\left(\frac{1}{6} + \frac{1}{7}\right)$ white herd.

Archimedes added, if you find this number, you are pretty good at numbers, but do not pat yourself on the back too quickly for there are two more conditions, namely:

white bulls plus black bulls is square and

dappled bulls plus yellow bulls is triangular.

Archimedes concluded, if you solve the whole problem then you may 'go forth as conqueror and rest assured that thou art proved most skillful in the science of numbers'.

The smallest herd satisfying the first seven conditions in eight unknowns, after some simplifications, leads to the Pellian equation $x^2 - 4729494\,y^2 = 1$. The least positive solution, for which $y$ has 41 digits, was discovered by Carl Amthor in 1880. His solution implies that the number of white bulls has over $2 \times 10^5$ digits. The problem becomes much more difficult when the eighth and ninth conditions are added and the first complete solution was given in 1965 by H.C. Williams, R.A. German, and C.R. Zarnke of the University of Waterloo.

In Arithmetica, Diophantus asks for rational solutions to equations of the type $x^2 - dy^2 = 1$. In the case where $d = m^2 + 1$, Diophantus offered the integral solution $x = 2m^2 + 1$ and $y = 2m$. Pellian equations are found in Hindu mathematics. In the fourth century, the Indian mathematician


9 Continued Fractions

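Ex

$$1 + \cfrac{1}{2 + \cfrac{1}{3 + \cfrac{1}{4}}} = 1 + \cfrac{1}{2 + \cfrac{1}{13/4}} = 1 + \cfrac{1}{2 + \frac{4}{13}} = 1 + \frac{1}{30/13} = 1 + \frac{13}{30} = \frac{43}{30}$$

looks silly until we consider some interesting continued fraction expansions

$\pi$ : $[3, 7, 15, 1, 292, 1, 1, 1, \ldots]$ i.e.

$$3 + \cfrac{1}{7 + \cfrac{1}{15 + \cfrac{1}{1 + \cfrac{1}{292 + \cdots}}}}$$

$e$ : $[2, 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, \ldots]$

$\sqrt{2}$ : $[1, 2, 2, 2, 2, \ldots]$

$\sqrt{3}$ : $[1, 1, 2, 1, 2, 1, 2, 1, 2, \ldots]$

$\sqrt{5}$ : $[2, 4, 4, 4, \ldots]$

$\sqrt{n^2 + 1}$ : $[n, 2n, 2n, \ldots]$ (Euler)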

Definition By a simple continued fraction (or C.F.) we mean an expression

$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}} = [a_0, a_1, a_2, \ldots]$$

where $a_0 \in \mathbb{Z}$ and $a_i \in \mathbb{N}$ for $i \ge 1$.

Note: $[a_0] = \frac{a_0}{1}$, $[a_0, a_1] = \frac{a_0a_1 + 1}{a_1}$, $[a_0, a_1, a_2] = \frac{a_2a_1a_0 + a_2 + a_0}{a_2a_1 + 1}$

Generally, $[a_0, \ldots, a_n] = \frac{p_n}{q_n}$ where $p_n$ and $q_n$ are polynomials in the $a_i$, linear in any given $a_j$, and $a_0$ does not occur in the denominator $q_n$. $(p_n, q_n)$ are called the $n$th convergents.

Note: $[a_0, \ldots, a_n] = [a_0, \ldots, a_{n-1} + \frac{1}{a_n}]$

Proposition If $[a_0, \ldots, a_m] = [b_0, \ldots, b_n]$, $a_i, b_i \in \mathbb{N}$, $a_m, b_n > 1$ then $m = n$ and $a_i = b_i$ $\forall i$.

Proof. This follows by induction from

$$[a_0, \ldots, a_m] = a_0 + \frac{1}{[a_1, \ldots, a_m]} = b_0 + \frac{1}{[b_1, \ldots, b_n]}$$

if we can show $[a_1, \ldots, a_m] > 1$ when $a_1, \ldots, a_m \ge 1$ and $a_m > 1$. But this is so since $[a_1, \ldots, a_m] = a_1 + \frac{1}{a_2 + \cdots}$. ∎

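Let $a_i > 0$ and $\forall n$ let $\tau_n = [a_0, \ldots, a_n]$ then $\tau_n$ can be computed using the recursive formulas, for $n \ge 2$:

$$p_0 = a_0, \qquad p_1 = a_0a_1 + 1, \qquad p_n = a_np_{n-1} + p_{n-2}$$

$$q_0 = 1, \qquad q_1 = a_1, \qquad q_n = a_nq_{n-1} + q_{n-2}$$

so $\tau_0 = \frac{p_0}{q_0}$, $\tau_1 = \frac{p_1}{q_1}$ and $\tau_n = \frac{p_n}{q_n}$.

Proof.

$$\tau_n = [a_0, \ldots, a_n] = \left[a_0, \ldots, a_{n-1} + \frac{1}{a_n}\right] = \frac{p'_{n-1}}{q'_{n-1}}$$

where these belong to $a_0, \ldots, a_{n-2}, a_{n-1} + \frac{1}{a_n}$ i.e. (induction)

$$\frac{p'_{n-1}}{q'_{n-1}} = \frac{\left(a_{n-1} + \frac{1}{a_n}\right)p_{n-2} + p_{n-3}}{\left(a_{n-1} + \frac{1}{a_n}\right)q_{n-2} + q_{n-3}} = \frac{a_n(a_{n-1}p_{n-2} + p_{n-3}) + p_{n-2}}{a_n(a_{n-1}q_{n-2} + q_{n-3}) + q_{n-2}} = \frac{a_np_{n-1} + p_{n-2}}{a_nq_{n-1} + q_{n-2}}$$

(induction again!)

Hence $p_n = a_np_{n-1} + p_{n-2}$ and $q_n = a_nq_{n-1} + q_{n-2}$. ∎

$(p_n, q_n)$ are called the $n$th convergents of the C.F.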
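The recursion for $p_n$ and $q_n$ translates directly into a loop. In this sketch the starting values $p_{-1} = 1$, $q_{-1} = 0$ are a common convenience (an assumption of the example, not stated in the notes) that makes the same update rule produce $p_1 = a_0a_1 + 1$ and $q_1 = a_1$.

```python
from fractions import Fraction

def convergents(a):
    """Convergents p_n / q_n of the continued fraction [a[0], a[1], ...]."""
    p_prev, p = 1, a[0]  # p_{-1}, p_0
    q_prev, q = 0, 1     # q_{-1}, q_0
    out = [Fraction(p, q)]
    for an in a[1:]:
        p_prev, p = p, an * p + p_prev
        q_prev, q = q, an * q + q_prev
        out.append(Fraction(p, q))
    return out

print(convergents([1, 2, 3, 4]))
print(convergents([1, 2, 2, 2, 2]))
```

The last convergent of $[1, 2, 3, 4]$ is the value of the finite continued fraction itself.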

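Let $\theta \in \mathbb{R} \setminus \mathbb{Z}$, $\theta > 1$. $a_0 = \lfloor \theta \rfloor$ so $\theta = a_0 + \frac{1}{\theta_1}$, $\theta_1 > 1$ defines $\theta_1$. Continue with $\theta_1 = a_1 + \frac{1}{\theta_2}$ so $a_1 = \lfloor \theta_1 \rfloor$, $\theta_2 > 1$ if $\theta_1 \notin \mathbb{Z}$ etc. $\theta_n = a_n + \frac{1}{\theta_{n+1}}$, $a_n = \lfloor \theta_n \rfloor$, $\theta_{n+1} > 1$ if $\theta_n \notin \mathbb{Z}$. We get

$$\theta = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_n + \cfrac{1}{\theta_{n+1}}}}}}$$

so $\theta = [a_0, a_1, \ldots, a_n + \frac{1}{\theta_{n+1}}]$

Proposition The expansion stops if $\theta_n = a_n$ is in $\mathbb{N}$ and then $\theta \in \mathbb{Q}^+$ i.e. is a positive rational number. Conversely, if $\theta \in \mathbb{Q}^+$, the C.F. expansion is finite.

Proof. Let $\theta = \frac{u}{v} \in \mathbb{Q}^+$, $u, v \in \mathbb{N}$. Use division

$$u = a_0 v + r_1, \qquad 0 < r_1 < v$$

$$v = a_1 r_1 + r_2, \qquad 0 < r_2 < r_1$$

$$r_1 = a_2 r_2 + r_3, \qquad 0 < r_3 < r_2$$

$$\vdots$$

$$r_{n-1} = a_n r_n + 0$$

as if we were doing the Euclidean algorithm. These equations give

$$\theta = \theta_0 = \frac{u}{v} = a_0 + \frac{r_1}{v} = a_0 + \frac{1}{v/r_1} = a_0 + \frac{1}{\theta_1}$$

$$\theta_1 = a_1 + \frac{r_2}{r_1} = a_1 + \frac{1}{r_1/r_2} = a_1 + \frac{1}{\theta_2}$$

$$\vdots$$

$$\theta_n = \frac{r_{n-1}}{r_n} \in \mathbb{N}$$

so the C.F. expansion is finite. ∎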
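The proof shows that the partial quotients of a rational number are exactly the quotients produced by the Euclidean algorithm. A minimal sketch (the function name is illustrative):

```python
def cf_rational(u, v):
    """Partial quotients [a0, a1, ..., an] of u/v via repeated division."""
    terms = []
    while v:
        a, r = divmod(u, v)  # u = a*v + r with 0 <= r < v
        terms.append(a)
        u, v = v, r
    return terms

print(cf_rational(43, 30))
print(cf_rational(355, 113))
```

The loop terminates because the remainders strictly decrease, which is the same reason the expansion is finite.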

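Proposition $\forall n \ge 2$

$$\theta = \frac{\theta_np_{n-1} + p_{n-2}}{\theta_nq_{n-1} + q_{n-2}}$$

Proof. The definition of $\theta_n$ is $\theta = [a_0, \ldots, a_{n-1}, \theta_n]$ so $\theta = \tau_n = \frac{p_n}{q_n} = \frac{\theta_np_{n-1} + p_{n-2}}{\theta_nq_{n-1} + q_{n-2}}$ using $\theta_n$ in place of $a_n$ for this particular C.F. ∎

Ex $\sqrt{2} = [1, 2, 2, \ldots]$

$(\sqrt{2} - 1)(\sqrt{2} + 1) = 2 - 1 = 1$ ⇒ $\sqrt{2} - 1 = \frac{1}{1 + \sqrt{2}}$ so $\sqrt{2} = 1 + \frac{1}{1 + \sqrt{2}}$.

We now copy the expression for $\sqrt{2}$ in the RHS into the $\sqrt{2}$ on the RHS successively (photocopy model for recursion).

$$\sqrt{2} = 1 + \cfrac{1}{1 + 1 + \cfrac{1}{1 + \sqrt{2}}} = 1 + \cfrac{1}{2 + \cfrac{1}{1 + \sqrt{2}}} = 1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{1 + \sqrt{2}}}}$$

etc. leading to $\sqrt{2} = [1, 2, 2, 2, 2, \ldots, 2, 1 + \sqrt{2}]$. If we continue indefinitely we obtain $\sqrt{2} = [1, 2, 2, \ldots] = [1, \overline{2}]$.

Every quadratic irrational has a periodic continued fraction; this characterises quadratic irrationals.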

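Ex $\sqrt{2} = [1, 2, \ldots, 2, 1 + \sqrt{2}]$ so $a_0 = 1$, $a_1 = 2, \ldots$

$$p_0 = a_0 = 1, \quad q_0 = 1 \quad\Rightarrow\quad \tau_0 = \frac{p_0}{q_0} = \frac{1}{1} = 1$$

$$p_1 = a_0a_1 + 1 = 3, \quad q_1 = a_1 = 2 \quad\Rightarrow\quad \tau_1 = \frac{p_1}{q_1} = \frac{3}{2} = 1.5$$

$$p_2 = a_2p_1 + p_0 = 7, \quad q_2 = a_2q_1 + q_0 = 5 \quad\Rightarrow\quad \tau_2 = \frac{p_2}{q_2} = \frac{7}{5} = 1.4$$

and the approximation $\tau_n \approx \sqrt{2}$ gets better.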

Theorem 26 Let $a_0 \in \mathbb{Z}$, $a_i \in \mathbb{N}$, $i \ge 1$. Then $(\tau_n)$ converges to an irrational number $\theta$. The $a_i$ are uniquely determined by the C.F. expansion of $\theta$. Conversely, if $\theta$ is an irrational number, and $\tau_n = [a_0, \ldots, a_n]$ are obtained by expanding $\theta$ as a C.F. then

$$\theta = \lim_{n \to \infty} \tau_n.$$

Proof. The sequences $(p_n)$ and $(q_n)$ are both strictly monotonically increasing sequences of natural numbers.

Claim:

$$p_nq_{n-1} - p_{n-1}q_n = (-1)^{n-1} \tag{23}$$

$\forall n \ge 1$. If $n = 1$ this is $p_1q_0 - p_0q_1 = (a_0a_1 + 1) \cdot 1 - a_0a_1 = 1 = (-1)^{1-1}$ which is true. Assume it is true for $n = m$. Then

$$p_{m+1}q_m - p_mq_{m+1} = (a_{m+1}p_m + p_{m-1})q_m - p_m(a_{m+1}q_m + q_{m-1})$$

$$= p_{m-1}q_m - p_mq_{m-1} = -(p_mq_{m-1} - p_{m-1}q_m) = -(-1)^{m-1} = (-1)^m$$

Hence, by induction, the claim is true $\forall n \ge 1$.

Divide (23) by $q_nq_{n-1}$ to obtain

$$\frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}} = \frac{(-1)^{n-1}}{q_nq_{n-1}}$$

or

$$\tau_n - \tau_{n-1} = \frac{(-1)^{n-1}}{q_nq_{n-1}} \tag{24}$$

Apply this to $\theta = [a_0, \ldots, a_{n-1}, \theta_n]$ to get

$$\theta - \tau_{n-1} = \frac{(-1)^{n-1}}{q_{n-1}(\theta_nq_{n-1} + q_{n-2})} \tag{25}$$

But $\theta_i > 0$ and $q_i \to \infty$

$$\therefore\; \lim_{n \to \infty} \tau_n = \theta$$

since RHS of (25) → 0. The proof of uniqueness is similar to that given above when $\theta \in \mathbb{Q}^+$. ∎

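Aside Numbers of the form $\alpha + \beta\sqrt{d}$, $d \in \mathbb{N}$, $d \ne m^2$ are a field, $F = \mathbb{Q}(\sqrt{d})$, the “extension” of $\mathbb{Q}$ by $\sqrt{d}$:

$$\frac{1}{\alpha + \beta\sqrt{d}} = \frac{\alpha - \beta\sqrt{d}}{\alpha^2 - \beta^2 d} = \left(\frac{\alpha}{\alpha^2 - \beta^2 d}\right) - \left(\frac{\beta}{\alpha^2 - \beta^2 d}\right)\sqrt{d} \in \{\alpha_1 + \beta_1\sqrt{d}\}$$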


Diophantine Approximation

Equation (25) implies

$$\left|\theta - \frac{p_n}{q_n}\right| = \frac{1}{q_n(\theta_{n+1}q_n + q_{n-1})} < \frac{1}{q_nq_{n+1}} \tag{26}$$

The numbers $q_0, q_1, \ldots$ are strictly increasing in $\mathbb{N}$. The continued fraction process provides us with an infinite sequence of rational approximations to an irrational number, $\theta$, namely the convergents $\frac{p_n}{q_n} \in \mathbb{Q}$. How rapidly do they approach $\theta$?

By (26), if $\frac{x}{y}$ is a convergent,

$$\left|\theta - \frac{x}{y}\right| < \frac{1}{y^2}$$

It is possible to prove that [Hurwitz, 1891] any irrational number $\theta$ has an infinite number of rational approximations which satisfy

$$\left|\theta - \frac{x}{y}\right| < \frac{1}{\sqrt{5}\,y^2} \tag{27}$$

This is the best possible: If we choose $\beta > \sqrt{5}$ then there are numbers $\eta \in \mathbb{R} \setminus \mathbb{Q}$ for which there are only a finite number of rationals $\frac{x}{y}$ with $\left|\eta - \frac{x}{y}\right| < \frac{1}{\beta y^2}$.

e.g. the golden ratio

$$g = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}} = 1 + \frac{1}{g}$$

so $g^2 - g - 1 = 0$ ⇒ $g = \frac{1 + \sqrt{5}}{2}$.

Inequalities of the form (27) will be very important later when we study rational, algebraic, irrational and transcendental numbers such as $\frac{401}{403}$, $\frac{1 + \sqrt{5}}{2}$ and $e$ or $\pi$.

Quadratic Irrationals

• solutions to quadratic equations with $\mathbb{Z}$ coefficients e.g. $x^2 - 2 = 0$ ⇒ $x = \sqrt{2}$.

• simplest type of irrational e.g. $(\sqrt{4} + 7^{1/3})^{1/5}$ is ‘more’ irrational as is $\pi$ (see later)

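Ex $\theta = \frac{24 - \sqrt{15}}{17}$: $3 < \sqrt{15} < 4$ ⇒ $\lfloor \theta \rfloor = 1$ and

$$\theta = 1 + \frac{1}{\theta_1}$$

$$\theta_1 = \frac{1}{\theta - 1} = \frac{17}{7 - \sqrt{15}} = \frac{7 + \sqrt{15}}{2}, \qquad \lfloor \theta_1 \rfloor = 5 \;\Rightarrow\; \theta_1 = 5 + \frac{1}{\theta_2}$$

$$\theta_2 = \frac{1}{\theta_1 - 5} = \frac{2}{\sqrt{15} - 3} = \frac{\sqrt{15} + 3}{3}, \qquad \lfloor \theta_2 \rfloor = 2 \;\Rightarrow\; \theta_2 = 2 + \frac{1}{\theta_3}$$

$$\theta_3 = \frac{1}{\theta_2 - 2} = \frac{3}{\sqrt{15} - 3} = \frac{\sqrt{15} + 3}{2}, \qquad \lfloor \theta_3 \rfloor = 3 \;\Rightarrow\; \theta_3 = 3 + \frac{1}{\theta_4}$$

$$\theta_4 = \frac{1}{\theta_3 - 3} = \frac{2}{\sqrt{15} - 3} = \frac{\sqrt{15} + 3}{3} \quad\text{so } \theta_4 = \theta_2$$

$$\Rightarrow\; \frac{24 - \sqrt{15}}{17} = 1 + \cfrac{1}{5 + \cfrac{1}{2 + \cfrac{1}{3 + \cfrac{1}{2 + \cfrac{1}{3 + \cdots}}}}} = [1, 5, \overline{2, 3}]$$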
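The expansion of a quadratic irrational can be carried out in exact integer arithmetic. The sketch below writes the number as $(P + \sqrt{D})/Q$ and assumes (this is a requirement of the sketch, not a statement from the notes) that $D$ is a positive non-square and that $Q$ divides $D - P^2$; the example above fits this form as $(-24 + \sqrt{15})/(-17)$.

```python
from math import isqrt

def cf_quadratic(P, D, Q, terms):
    """First `terms` partial quotients of (P + sqrt(D)) / Q.

    Assumes D > 0 is not a perfect square and Q divides D - P*P.
    """
    s = isqrt(D)  # floor of sqrt(D)
    out = []
    for _ in range(terms):
        if Q > 0:
            a = (P + s) // Q
        else:
            # floor of a negative irrational: -(floor of its absolute value) - 1
            a = -((P + s) // (-Q)) - 1
        out.append(a)
        # Subtract a and invert: the new complete quotient is (P' + sqrt(D)) / Q'.
        P = a * Q - P
        Q = (D - P * P) // Q
    return out

print(cf_quadratic(-24, 15, -17, 8))  # (24 - sqrt(15)) / 17
print(cf_quadratic(0, 2, 1, 6))       # sqrt(2)
```

Because only integers are involved, the periodic pattern is reproduced exactly, with no floating-point drift.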


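Ex

$$\sqrt{2} = [1, \overline{2}], \qquad \sqrt{3} = [1, \overline{1, 2}], \qquad \sqrt{5} = [2, \overline{4}], \qquad \sqrt{6} = [2, \overline{2, 4}]$$

H. Davenport, The Higher Arithmetic

Ex $\sqrt{50} = [7, \overline{14}]$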


Purely periodic fractions

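Ex

$$\sqrt{2} + 1 = 2 + \cfrac{1}{2 + \cfrac{1}{2 + \cdots}} = [\overline{2}]$$

$$\sqrt{6} + 2 = [\overline{4, 2}]$$

These numbers are easier to deal with than those with a ‘preperiod’.

Ex

$$\alpha = 4 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{4 + \cfrac{1}{1 + \cfrac{1}{3 + \cdots}}}}} = [4, 1, 3, \alpha]$$

using the recursive equations we get convergents $\frac{4}{1}, \frac{5}{1}, \frac{19}{4}, \ldots$

$$\alpha = \frac{19\alpha + 5}{4\alpha + 1} \quad\Leftarrow\quad \alpha = \frac{\alpha p_{n-1} + p_{n-2}}{\alpha q_{n-1} + q_{n-2}}$$

Hence $4\alpha^2 - 18\alpha - 5 = 0$ and $\alpha$ is a quadratic irrational.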


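Now consider the number $\beta$ which has the period of $\alpha$ reversed:

$$\beta = [\overline{3, 1, 4}] \;\Rightarrow\; \beta = \frac{19\beta + 4}{5\beta + 1} \;\Rightarrow\; 5\beta^2 - 18\beta - 4 = 0$$

The equations are the same if $-\frac{1}{\beta} = \alpha$ ⇒ $-\frac{1}{\beta}$ is the second root of the equation for $\alpha$, called the (algebraic) conjugate of $\alpha$ or $\bar{\alpha}$.

In general let $\alpha = [a_0, \ldots, a_n, \alpha]$ be purely periodic, then

$$\alpha = \frac{p_n\alpha + p_{n-1}}{q_n\alpha + q_{n-1}}$$

Let $\beta = [\overline{a_n, \ldots, a_0}] = [a_n, \ldots, a_0, \beta]$ then (Ex)

$$\beta = \frac{p_n\beta + q_n}{p_{n-1}\beta + q_{n-1}}$$

As before $-\frac{1}{\beta}$ is the conjugate of the root $\alpha$.

Note: If $\beta > 1$ then $-1 < -\frac{1}{\beta} < 0$.

Theorem 27 Any purely periodic continued fraction represents a quadratic irrational number $\alpha > 1$ with a conjugate $\bar{\alpha}$ satisfying $-1 < \bar{\alpha} < 0$. This conjugate is $\bar{\alpha} = -\frac{1}{\beta}$ where $\beta$ is defined by the C.F. of $\alpha$ with the period reversed.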


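Remark (Galois, 1828) This property characterises numbers with purely periodic continued fractions.

Definition A quadratic irrational $\alpha$ is reduced if $\alpha > 1$ and $-1 < \bar{\alpha} < 0$.

Theorem 28 If $\alpha$ is reduced, its C.F. expansion is purely periodic.

Proof. There are integers $a, b, c$ such that $a\alpha^2 + b\alpha + c = 0$. Solving for $\alpha$:

$$\alpha = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{P \pm \sqrt{D}}{Q}$$

where $P, Q \in \mathbb{Z}$, $D \in \mathbb{N}$, $D \ne m^2$. Assume the sign is positive, else multiply by $\frac{-1}{-1}$, so

$$\alpha = \frac{P + \sqrt{D}}{Q}$$

so $\bar{\alpha}$, the other root, is

$$\bar{\alpha} = \frac{P - \sqrt{D}}{Q}.$$

Note that

$$\frac{P^2 - D}{Q} = \frac{b^2 - (b^2 - 4ac)}{2a} = 2c \;\Rightarrow\; Q \mid P^2 - D$$

But $1 < \alpha$ and $-1 < \bar{\alpha} < 0$ so

(i) $\alpha - \bar{\alpha} > 0$ ⇒ $\frac{\sqrt{D}}{Q} > 0$ ⇒ $Q > 0$

(ii) $\alpha + \bar{\alpha} > 0$ ⇒ $\frac{P}{Q} > 0$ ⇒ $P > 0$

(iii) $\bar{\alpha} < 0$ ⇒ $P < \sqrt{D}$

(iv) $1 < \alpha$ ⇒ $Q < P + \sqrt{D} < 2\sqrt{D}$

$$\Rightarrow\; P, Q \in \mathbb{N},\ P < \sqrt{D},\ Q < 2\sqrt{D} \text{ and } Q \mid P^2 - D. \tag{28}$$

Now expand $\alpha$ as a C.F.

$$\alpha = a_0 + \frac{1}{\alpha_1}, \qquad a_0 = \lfloor \alpha \rfloor, \qquad \alpha_1 > 1$$

$$\Rightarrow\; \bar{\alpha} = a_0 + \frac{1}{\bar{\alpha}_1} \;\Rightarrow\; \bar{\alpha}_1 = -\frac{1}{a_0 - \bar{\alpha}} \;\Rightarrow\; -1 < \bar{\alpha}_1 < 0$$

Hence $\alpha_1$ is reduced also. Similarly $\alpha_2, \alpha_3, \ldots$ are reduced.

Now

$$\frac{1}{\alpha_1} = \alpha - a_0 = \frac{P + \sqrt{D}}{Q} - a_0 = \frac{P - Qa_0 + \sqrt{D}}{Q}$$

so let $P_1 = -P + Qa_0$ so

$$\alpha_1 = \frac{Q}{-P_1 + \sqrt{D}} = \frac{P_1 + \sqrt{D}}{Q_1} \tag{29}$$

where $Q_1Q = D - P_1^2$ and $Q_1 \in \mathbb{Z}$ since $Q \mid D - P^2$ and $P_1 \equiv -P \pmod Q$.

Then

$$\alpha_1 = \frac{P_1 + \sqrt{D}}{Q_1}$$

and since $\alpha_1$ is reduced, $P_1 > 0$, $Q_1 > 0$ and we get the conditions (28) above using (29). We carry on with the C.F. process, using $\alpha_1$ instead of $\alpha$, $\ldots$. Each complete quotient $\alpha_n$ has the form

$$\alpha_n = \frac{P_n + \sqrt{D}}{Q_n}$$

where $P_n$, $Q_n$ satisfy (28). There are only a finite set of possibilities for the pairs $(P_n, Q_n)$ so eventually we come to a pair $(P_m, Q_m) = (P_n, Q_n)$, $m > n$ so $\alpha_m = \alpha_n$ and so the C.F. is periodic from this point on.

Claim: The C.F. is purely periodic.

Subclaim: $\alpha_{n-1} = \alpha_{m-1}$. If this were so we would be able to work back to get, eventually, $\alpha_0 = \alpha_{m-n}$ proving pure periodicity. Proof of the subclaim: $\alpha_n = a_n + \frac{1}{\alpha_{n+1}}$ ⇒ $\bar{\alpha}_n = a_n + \frac{1}{\bar{\alpha}_{n+1}}$. Let $\beta_n = -\frac{1}{\bar{\alpha}_n}$ then $-1 < \bar{\alpha}_n < 0$ ⇒ $1 < \beta_n$ and $-\frac{1}{\beta_n} = a_n - \beta_{n+1}$ or $\beta_{n+1} = a_n + \frac{1}{\beta_n}$ so $a_n = \lfloor \alpha_n \rfloor = \lfloor \beta_{n+1} \rfloor$. Now let $n < m$ and $\alpha_n = \alpha_m$ so $\bar{\alpha}_n = \bar{\alpha}_m$ ⇒ $\beta_n = \beta_m$ and $a_{n-1} = \lfloor \beta_n \rfloor = \lfloor \beta_m \rfloor = a_{m-1}$. But $\alpha_{n-1} = a_{n-1} + \frac{1}{\alpha_n}$, $\alpha_{m-1} = a_{m-1} + \frac{1}{\alpha_m}$ ⇒ $\alpha_{n-1} = \alpha_{m-1}$.

Applying this again successively $\alpha = \alpha_0 = \alpha_{m-n} = \alpha_r$ say, and

$$\alpha = [a_0, a_1, \ldots, a_{r-1}, \alpha_r] = [a_0, a_1, \ldots, a_{r-1}, \alpha] = [\overline{a_0, a_1, \ldots, a_{r-1}}]$$

purely periodic with period length $r$. ∎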


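Now consider the table given above: $N \in \mathbb{N}$, $N \neq m^2$. All the continued fractions are of a special form:

(1) None are purely periodic: the conjugate of $\sqrt{N}$ is $-\sqrt{N} < -1$, so $\sqrt{N}$ is not reduced. But with $a_0 = \lfloor \sqrt{N} \rfloor$, $\alpha = a_0 + \sqrt{N}$ has $1 < \alpha$ and $-1 < \bar{\alpha} < 0$, and the continued fraction of $\alpha$ begins with $2a_0$ since $\lfloor a_0 + \sqrt{N} \rfloor = a_0 + \lfloor \sqrt{N} \rfloor = 2a_0$. Hence $a_0 + \sqrt{N} = [2a_0, a_1, \dots, a_n, 2a_0, \dots]$, with $2a_0$ eventually recurring since the expansion is purely periodic.

(2) There is one preperiod term for $\sqrt{N}$ (with a periodic part consisting of symmetric terms), followed by $2a_0$:
$$\sqrt{N} = [a_0, \overline{a_1, a_2, \dots, a_2, a_1, 2a_0}]$$

Ex $\sqrt{53} = [7, \overline{3, 1, 1, 3, 14}]$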

Theorem (Lagrange) A continued fraction is periodic ⇔ it is the continued fraction of a quadratic irrational.

Ex

$2^{1/3} = [1, 3, 1, 5, 1, 1, 4, 1, \dots]$

$\dfrac{e-1}{e+1} = [0, 2, 6, 10, 14, \dots]$

$e = [2, 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, \dots]$


Return to the Fundamental Solution to Pell’s Equation

$$\sqrt{N} = [a_0, \underbrace{a_1, \dots, a_n, 2a_0}_{\text{periodic part}}]$$


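Let $\frac{p_{n-1}}{q_{n-1}}$, $\frac{p_n}{q_n}$ be the two convergents just preceding $2a_0$, i.e.
$$\frac{p_{n-1}}{q_{n-1}} = [a_0, \dots, a_{n-1}], \qquad \frac{p_n}{q_n} = [a_0, \dots, a_{n-1}, a_n]$$

Now
$$\alpha = \sqrt{N} = \frac{\alpha_{n+1} p_n + p_{n-1}}{\alpha_{n+1} q_n + q_{n-1}} \qquad (1)$$
where
$$\alpha_{n+1} = 2a_0 + \cfrac{1}{a_1 + \dots} = [2a_0, a_1, a_2, \dots] = a_0 + \sqrt{N}$$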

Substituting this value for αn+1 in (1) and simplifying gives

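$$\sqrt{N}(\sqrt{N} + a_0) q_n + \sqrt{N} q_{n-1} = (\sqrt{N} + a_0) p_n + p_{n-1}$$

equating rational and irrational parts:

$$\Rightarrow N q_n = a_0 p_n + p_{n-1}, \qquad a_0 q_n + q_{n-1} = p_n$$

$$\Rightarrow p_{n-1} = N q_n - a_0 p_n, \qquad q_{n-1} = p_n - a_0 q_n$$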


Substitute in equation (1) on page 74 to get

$$p_n(p_n - a_0 q_n) - q_n(N q_n - a_0 p_n) = (-1)^{n-1} \Rightarrow p_n^2 - N q_n^2 = (-1)^{n-1}$$

Hence $x = p_n$, $y = q_n$ is a solution to $x^2 - Ny^2 = 1$ if $n$ is odd and to $x^2 - Ny^2 = -1$ if $n$ is even. In the latter case apply the same argument to the convergents at the end of the second period: $[a_0, a_1, \dots, a_n, 2a_0, a_1, \dots, a_{2n+1}]$, so $x = p_{2n+1}$, $y = q_{2n+1}$ solve

$$x^2 - Ny^2 = (-1)^{2n+1-1} = (-1)^{2n} = 1$$

Ex $N = 21$: $\sqrt{21} = [4, \overline{1, 1, 2, 1, 1, 8}]$ so $n = 5$, odd.

Convergents: $\frac{4}{1}, \frac{5}{1}, \frac{9}{2}, \frac{23}{5}, \frac{32}{7}, \frac{55}{12} = \frac{p_5}{q_5}, \dots$

So $x_1 = p_5 = 55$, $y_1 = q_5 = 12$ solves $x_1^2 - 21y_1^2 = 1$.

This gives the fundamental solution. Other solutions are

$$x_m + y_m\sqrt{21} = (x_1 + y_1\sqrt{21})^m, \quad m = 2, 3, 4, \dots$$

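Ex $N = 29$: $\sqrt{29} = [5, \overline{2, 1, 1, 2, 10}]$ so $n = 4$, even.

Convergents: $\frac{5}{1}, \frac{11}{2}, \frac{16}{3}, \frac{27}{5}, \frac{70}{13}, \frac{727}{135}, \frac{1524}{283}, \frac{2251}{418}, \frac{3775}{701}, \frac{9801}{1820} = \frac{p_9}{q_9}$

So $x_1 = 9801$, $y_1 = 1820$ gives the fundamental solution.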
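The two worked examples can be reproduced mechanically. The following is a minimal sketch, assuming the standard integer recurrence $P_{k+1} = a_kQ_k - P_k$, $Q_{k+1} = (N - P_{k+1}^2)/Q_k$ for the complete quotients of $\sqrt{N}$ and assuming the period ends at the first term equal to $2a_0$; the function names `sqrt_cf` and `pell_fundamental` are illustrative only and do not come from the notes.

```python
from math import isqrt

def sqrt_cf(N):
    """Return (a0, period) for the continued fraction of sqrt(N), N not a square."""
    a0 = isqrt(N)
    if a0 * a0 == N:
        raise ValueError("N must not be a perfect square")
    P, Q, a = 0, 1, a0
    period = []
    while a != 2 * a0:             # the period closes with the term 2*a0
        P = a * Q - P              # P_{k+1} = a_k Q_k - P_k
        Q = (N - P * P) // Q       # Q_{k+1} = (N - P_{k+1}^2) / Q_k (exact division)
        a = (a0 + P) // Q          # a_{k+1} = floor((P_{k+1} + sqrt(N)) / Q_{k+1})
        period.append(a)
    return a0, period

def pell_fundamental(N):
    """Return (x, y) with x^2 - N y^2 = 1 read off from the convergents of sqrt(N)."""
    a0, period = sqrt_cf(N)
    if len(period) % 2 == 0:       # n = len(period) - 1 is odd: stop before the first 2*a0
        terms = [a0] + period[:-1]
    else:                          # n is even: run on to the end of the second period
        terms = [a0] + period + period[:-1]
    p_prev, p = 1, terms[0]
    q_prev, q = 0, 1
    for a in terms[1:]:            # p_k = a_k p_{k-1} + p_{k-2}, same for q_k
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
    return p, q

print(sqrt_cf(53))
print(pell_fundamental(21), pell_fundamental(29))
```

Working with the integers $P_k$, $Q_k$ rather than floating-point square roots keeps every step exact, so the sketch does not suffer from rounding for larger $N$.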


Note

1. Not all details have been proved, e.g. that the C.F. expansion gives all of the solutions, including the fundamental one.

2. There are deep mysteries tied up in C.F. expansions, e.g. why are they so closely related to quadratic irrationals?


10 Quadratic Reciprocity

Let $p$ be an odd prime, $p \in \{3, 5, 7, \dots\}$, $n \in \mathbb{Z}$, $p \nmid n$.

• If $x^2 \equiv n \pmod p$ has a solution $x \in \mathbb{Z}$ let $(n \mid p) = 1$.

• If $x^2 \equiv n \pmod p$ has no solution let $(n \mid p) = -1$.

• If $p \mid n$ let $(n \mid p) = 0$.

This defines the Legendre symbol, (n | p).


Ex p = 11:

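$1^2 \equiv 1$

$2^2 \equiv 4$

$3^2 \equiv 9$

$4^2 = 16 = 11 + 5 \equiv 5$

$5^2 = 25 = 22 + 3 \equiv 3$

and $6^2 \equiv (11 - 5)^2 \equiv (-5)^2 \equiv 5^2 \equiv 3$

$7^2 \equiv (-4)^2 \equiv 5$

$8^2 \equiv (-3)^2 \equiv 9$

$9^2 \equiv (-2)^2 \equiv 4$

$10^2 \equiv (-1)^2 \equiv 1$

and $11^2 \equiv 0 \pmod{11}$

So the quadratic residues are 1, 3, 4, 5, 9 ⇒ $(n \mid 11) = 1$,
the non-residues are 2, 6, 7, 8, 10 ⇒ $(n \mid 11) = -1$,
and $(11 \mid 11) = 0$.

Proposition $a \equiv b \pmod p \Rightarrow (a \mid p) = (b \mid p)$

Proof. $a = b + lp$, so $p \mid a \Leftrightarrow p \mid b$, ∴ $(a \mid p) = 0 \Leftrightarrow (b \mid p) = 0$. Also $(a \mid p) = 1 \Leftrightarrow x^2 \equiv a \pmod p$ has a solution ⇔ $x^2 \equiv a \equiv b$ has a solution. □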


Proposition Let $p \in \mathbb{P}$ be odd. Every (reduced) residue system mod $p$ contains exactly $\frac{p-1}{2}$ quadratic residues and $\frac{p-1}{2}$ quadratic non-residues mod $p$. The quadratic residues belong to the residue classes containing the numbers:

$$R = \left\{1^2, 2^2, 3^2, \dots, \left(\frac{p-1}{2}\right)^2\right\}$$

Proof. Claim: the numbers in $R$ are distinct mod $p$. If $x^2, y^2 \in R$ with $1 \leq x, y \leq \frac{p-1}{2}$ and $x^2 \equiv y^2$, then $(x - y)(x + y) \equiv 0 \pmod p \Rightarrow p \mid (x - y)(x + y)$. But $0 < x + y \leq \frac{p-1}{2} + \frac{p-1}{2} < p$ so $p \mid x - y \Rightarrow x \equiv y \pmod p \Rightarrow x = y$.
Since $(p - k)^2 = p^2 - 2pk + k^2 \equiv k^2 \pmod p$ and $\{1, 2, \dots, p - 1\}$ is a complete set of representatives, every quadratic residue is congruent to one of the numbers in $R$. □

Ex p = 7

$1^2 \equiv 1$

$2^2 \equiv 4$

$3^2 \equiv 2$

The number of residues is $\frac{7-1}{2} = 3$. So
1, 4, 2 are residues,
3, 5, 6 are non-residues.

Ex For all odd p, (1 | p) = 1.


Ex For all odd $p$ and $m \in \mathbb{Z}$ with $p \nmid m$, $(m^2 \mid p) = 1$.

Ex Fix $p$ and let $f(n) = (n \mid p)$ so $f : \mathbb{Z} \to \{-1, 0, 1\}$. Then $f(n + p) = (n + p \mid p) = (n \mid p) = f(n)$ so $f$ is periodic with period $p$. It is also completely multiplicative: $f(ab) = f(a)f(b)\ \forall a, b \in \mathbb{Z}$.

Theorem 29 (Euler) Let p ∈ P be odd. Then ∀n ∈ Z,

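$$(n \mid p) \equiv n^{\frac{p-1}{2}} \pmod p.$$

Proof. Note $\frac{p-1}{2} \in \mathbb{N}$. If $p \mid n$ both sides are zero so suppose $p \nmid n$.

Let $(n \mid p) = 1$. Then ∃$x$ so $x^2 \equiv n \pmod p$

$$\Rightarrow n^{\frac{p-1}{2}} \equiv (x^2)^{\frac{p-1}{2}} = x^{p-1} \equiv 1 = (n \mid p)$$

by Fermat’s little theorem. Hence

$$n^{\frac{p-1}{2}} \equiv (n \mid p) \quad \text{if } (n \mid p) = 1$$

Let $(n \mid p) = -1$. Consider the polynomial

$$f(x) = x^{\frac{p-1}{2}} - 1, \qquad \partial f = \text{degree of } f = \frac{p-1}{2}$$

So, over any field, $f$ has at most $\frac{p-1}{2}$ roots, hence the congruence $f(x) \equiv 0 \pmod p$ has at most $\frac{p-1}{2}$ solutions. But the $\frac{p-1}{2}$ quadratic residues mod $p$ are solutions (the case $(n \mid p) = 1$) so the non-residues are not. Hence

$$n^{\frac{p-1}{2}} \not\equiv 1 \pmod p \quad \text{if } (n \mid p) = -1.$$

But
$$n^{p-1} - 1 = \left(n^{\frac{p-1}{2}} - 1\right)\left(n^{\frac{p-1}{2}} + 1\right)$$
and $p \mid n^{p-1} - 1$ so

$$n^{\frac{p-1}{2}} \equiv \pm 1 \pmod p$$

Hence
$$n^{\frac{p-1}{2}} \equiv -1 = (n \mid p) \pmod p$$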
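Theorem 29 translates directly into a few lines of code. This is a sketch only, assuming $p$ is an odd prime (the function does not check this) and using Python's built-in three-argument `pow` for the modular power; the name `legendre_euler` is ours.

```python
def legendre_euler(n, p):
    """Legendre symbol (n | p) for an odd prime p, via n^((p-1)/2) mod p."""
    r = pow(n, (p - 1) // 2, p)    # lies in {0, 1, p - 1}
    return -1 if r == p - 1 else r

# Reproduce the p = 11 example: list the quadratic residues among 1..10.
residues_11 = [n for n in range(1, 11) if legendre_euler(n, 11) == 1]
print(residues_11)
```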

Proposition f(n) = (n | p) is completely multiplicative.

Proof. $p \mid m$ or $p \mid n \Rightarrow p \mid mn \Rightarrow (mn \mid p) = 0$, hence $f(mn) = f(m) \cdot f(n)$ if $p \mid m$ or $p \mid n$. Let $p \nmid m$ and $p \nmid n$. Then $p \nmid mn$ so

$$(mn \mid p) \equiv (mn)^{\frac{p-1}{2}} = m^{\frac{p-1}{2}} \cdot n^{\frac{p-1}{2}} \equiv (m \mid p)(n \mid p) \pmod p$$

But each of $(mn \mid p)$, $(m \mid p)$ and $(n \mid p)$ is ±1, so the difference $(mn \mid p) - (m \mid p)(n \mid p)$ is 0 or ±2. But this difference is divisible by $p \geq 3$, hence it is 0, and therefore $(mn \mid p) = (m \mid p)(n \mid p) \Rightarrow f(mn) = f(m) \cdot f(n)$, so $f$ is completely multiplicative. □

Proposition

$$(-1 \mid p) = (-1)^{\frac{p-1}{2}} = \begin{cases} 1 & p \equiv 1 \pmod 4 \\ -1 & p \equiv 3 \pmod 4 \end{cases}$$

Proof. By Euler, Theorem 29,

$$(-1 \mid p) \equiv (-1)^{\frac{p-1}{2}} \pmod p$$

since each side is ±1, they must be equal. □

Ex $p = 5$, $\frac{p-1}{2} = 2$:

n          0   1   2   3   4   5
(n | 5)    0   1           1   0

$p = 7$:

n          0   1   2   3   4   5   6   7
(n | 7)    0   1                  -1   0

Proposition

$$(2 \mid p) = (-1)^{\frac{p^2-1}{8}} = \begin{cases} 1 & p \equiv \pm 1 \pmod 8 \\ -1 & p \equiv \pm 3 \pmod 8 \end{cases}$$

p mod 8    0   1   2   3   4   5   6   7   8
           ×   R   ×   N   ×   N   ×   R   ×


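Proof. Consider the $\frac{p-1}{2}$ congruences mod $p$:

$$\begin{aligned} p - 1 &\equiv 1(-1)^1 \\ 2 &\equiv 2(-1)^2 \\ p - 3 &\equiv 3(-1)^3 \\ &\ \ \vdots \\ r &\equiv \frac{p-1}{2}(-1)^{\frac{p-1}{2}} \end{aligned}$$

Multiply these together and note that each integer on the LHS is even, since $p$ is odd.

$$\Rightarrow 2 \cdot 4 \cdot 6 \cdots (p - 1) \equiv \left(\frac{p-1}{2}\right)!\,(-1)^{1 + 2 + 3 + \dots + \frac{p-1}{2}} \pmod p$$

$$\Rightarrow 2^{\frac{p-1}{2}}\left(\frac{p-1}{2}\right)! \equiv \left(\frac{p-1}{2}\right)!\,(-1)^{\frac{p^2-1}{8}} \pmod p$$

But $p \nmid \left(\frac{p-1}{2}\right)!$, hence, by Euler, Theorem 29,

$$(2 \mid p) \equiv 2^{\frac{p-1}{2}} \equiv (-1)^{\frac{p^2-1}{8}} \pmod p$$

and since LHS and RHS are ±1 we have

$$(2 \mid p) = (-1)^{\frac{p^2-1}{8}}$$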


Euler’s theorem is normally too computationally expensive to compute $(n \mid p)$. Gauss’ lemma and the Reciprocity theorem, proved below, both give better ways to evaluate this function.

Note: $f : \mathbb{Z} \to \{-1, 0, 1\} \subset S' \cup \{0\} = \{z \in \mathbb{C} : |z| = 1\} \cup \{0\}$ is an example of a so-called character, an extension of the group character $\chi : (\mathbb{Z}/p\mathbb{Z})^{*} \to S'$, $\chi([n]_p) = (n \mid p)$.

Theorem 30 (Gauss) Let $p \nmid n$ and consider the $\frac{p-1}{2}$ multiples of $n$, $M = \{n, 2n, 3n, \dots, \frac{p-1}{2}n\}$, each replaced by its least positive residue mod $p$, i.e. the representative which lies in $\{1, \dots, p\}$. If $m$ is the number of these residues which exceed $\frac{p}{2}$, then $(n \mid p) = (-1)^m$.

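Proof. If $p \mid in - jn$ with $1 \leq i, j \leq \frac{p-1}{2}$, then $p \mid (i - j)n$ but $p \nmid n \Rightarrow i = j$. Hence the numbers in $M$ are incongruent mod $p$.
Consider their least positive residues and put them in two disjoint sets $A = \{a_1, \dots, a_k\}$ and $B = \{b_1, \dots, b_m\}$ where $a_i \equiv tn \pmod p$, $1 \leq t \leq \frac{p-1}{2}$, $0 < a_i < \frac{p}{2}$ and $b_i \equiv sn \pmod p$, $1 \leq s \leq \frac{p-1}{2}$, $\frac{p}{2} < b_i < p$ (1).
Since $A \cap B = \emptyset$, $m + k = \frac{p-1}{2}$. Let $c_i = p - b_i$, $1 \leq i \leq m$ and $C = \{c_1, \dots, c_m\}$. Now $0 < c_i < \frac{p}{2}$ by (1).
Claim $A \cap C = \emptyset$: If $c_i = a_j \Rightarrow p - b_i = a_j \Rightarrow a_j + b_i = p \equiv 0 \pmod p$ ∴ $tn + sn = (t + s)n \equiv 0 \pmod p$ for some $s$ and $t$ with $1 \leq s, t < \frac{p}{2}$. But this is impossible since $2 \leq s + t < p \Rightarrow p \nmid s + t$. Hence $A \cap C = \emptyset$.
Hence $A \cup C$ consists of $m + k = \frac{p-1}{2}$ distinct integers in $[1, \frac{p-1}{2}]$. Hence

$$A \cup C = \{a_1, \dots, a_k, c_1, \dots, c_m\} = \left\{1, 2, \dots, \frac{p-1}{2}\right\}$$

Now form the product of all of the elements in $A \cup C$:

$$a_1 a_2 \cdots a_k c_1 c_2 \cdots c_m = \left(\frac{p-1}{2}\right)!$$

But $c_i = p - b_i$ so

$$\begin{aligned} \left(\frac{p-1}{2}\right)! &= a_1 \cdots a_k (p - b_1) \cdots (p - b_m) \\ &\equiv (-1)^m a_1 \cdots a_k b_1 \cdots b_m \pmod p \\ &\equiv (-1)^m n(2n)(3n) \cdots \left(\left(\frac{p-1}{2}\right)n\right) \pmod p \\ &\equiv (-1)^m n^{\frac{p-1}{2}} \left(\frac{p-1}{2}\right)! \pmod p \end{aligned}$$

$\Rightarrow n^{\frac{p-1}{2}} \equiv (-1)^m \pmod p$ and $(n \mid p) = (-1)^m$ follows by Theorem 29. □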
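Gauss' count $m$ is easy to compute directly, which gives a quick numerical check of Theorem 30 against Euler's criterion. The sketch below assumes $p$ is an odd prime with $p \nmid n$; the names `gauss_lemma_m`, `legendre_gauss` and `floor_sum` are illustrative, and `floor_sum` computes the sum $\sum_j \lfloor jn/p \rfloor$ whose parity Theorem 31 (below) relates to $m$ when $n$ is odd.

```python
def gauss_lemma_m(n, p):
    """Number of j in 1..(p-1)/2 whose residue (j*n mod p) exceeds p/2."""
    half = (p - 1) // 2
    return sum(1 for j in range(1, half + 1) if 2 * ((j * n) % p) > p)

def legendre_gauss(n, p):
    """(n | p) = (-1)^m by Gauss' lemma, assuming p is an odd prime and p does not divide n."""
    return -1 if gauss_lemma_m(n, p) % 2 else 1

def floor_sum(n, p):
    """Sum of floor(j*n/p) for j = 1..(p-1)/2."""
    return sum(j * n // p for j in range(1, (p - 1) // 2 + 1))

print(gauss_lemma_m(2, 7), legendre_gauss(2, 7))
```

For $n = 2$, $p = 7$ the multiples 2, 4, 6 have residues 2, 4, 6, of which two exceed $7/2$, so $m = 2$ and $(2 \mid 7) = 1$, consistent with $3^2 \equiv 2 \pmod 7$.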

Theorem 31 If m is defined as in the above theorem,

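$$m \equiv \sum_{j=1}^{\frac{p-1}{2}} \left\lfloor \frac{jn}{p} \right\rfloor + (n - 1)\left(\frac{p^2 - 1}{8}\right) \pmod 2$$

so if $n$ is odd:

$$m \equiv \sum_{j=1}^{\frac{p-1}{2}} \left\lfloor \frac{jn}{p} \right\rfloor \pmod 2$$

Note (1) If $n \in \mathbb{N}$, $n = 2^a m$, $a \geq 0$, $m$ odd, then

$$(n \mid p) = (2 \mid p)^a (m \mid p) = (-1)^{\frac{a(p^2-1)}{8}} (m \mid p) \quad \text{where } m \text{ is odd}$$

(2) $(n \mid p) = (-1)^m$ so only the value of $m \pmod 2$ (its parity) is needed to compute the Legendre symbol.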

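Proof. The number $m$ is the number of least positive residues of $n, 2n, \dots, \frac{p-1}{2}n$ exceeding $\frac{p}{2}$. Let $jn$ be one of these multiples:

$$\frac{jn}{p} = \left\lfloor \frac{jn}{p} \right\rfloor + \left\{\frac{jn}{p}\right\} \quad \text{where } 0 < \left\{\frac{jn}{p}\right\} < 1$$

$$\text{so } jn = p\left\lfloor \frac{jn}{p} \right\rfloor + p\left\{\frac{jn}{p}\right\} = p\left\lfloor \frac{jn}{p} \right\rfloor + r_j \quad \text{where } 0 < r_j < p$$

The number $r_j$ is the least positive residue of $jn$: $r_j = jn - p\left\lfloor \frac{jn}{p} \right\rfloor$ (1). Using the same notation as in the previous theorem,

$$\{r_1, \dots, r_{\frac{p-1}{2}}\} = \{a_1, \dots, a_k, b_1, \dots, b_m\}$$

$$\left\{1, 2, \dots, \frac{p-1}{2}\right\} = \{a_1, \dots, a_k, c_1, \dots, c_m\}, \qquad c_i = p - b_i$$

Add all of the elements in each set:

$$\sum_{j=1}^{\frac{p-1}{2}} r_j = \sum_{i=1}^{k} a_i + \sum_{j=1}^{m} b_j \qquad (2)$$

$$\sum_{j=1}^{\frac{p-1}{2}} j = \sum_{i=1}^{k} a_i + \sum_{j=1}^{m} c_j = \sum_{i=1}^{k} a_i + mp - \sum_{j=1}^{m} b_j \qquad (3)$$

In (2) use (1) for $r_j$:

$$\sum_{i=1}^{k} a_i + \sum_{j=1}^{m} b_j = n\sum_{j=1}^{\frac{p-1}{2}} j - p\sum_{j=1}^{\frac{p-1}{2}} \left\lfloor \frac{jn}{p} \right\rfloor \qquad (4)$$

(3) is

$$mp + \sum_{i=1}^{k} a_i - \sum_{j=1}^{m} b_j = \sum_{j=1}^{\frac{p-1}{2}} j \qquad (5)$$

Add (4) and (5), using $\sum_{j=1}^{\frac{p-1}{2}} j = \frac{p^2-1}{8}$, to get

$$2\left(\sum_{i=1}^{k} a_i\right) + mp = (n + 1)\left(\frac{p^2 - 1}{8}\right) - p\sum_{j=1}^{\frac{p-1}{2}} \left\lfloor \frac{jn}{p} \right\rfloor$$

But $mp \equiv m$, $-p \equiv 1 \pmod 2$ and $n + 1 \equiv n - 1 \pmod 2$, hence

$$m \equiv (n - 1)\left(\frac{p^2 - 1}{8}\right) + \sum_{j=1}^{\frac{p-1}{2}} \left\lfloor \frac{jn}{p} \right\rfloor \pmod 2$$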

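Theorem 32 (Quadratic Reciprocity Law, Gauss, 1796) If $p$ and $q$ are distinct odd primes, then

$$(p \mid q)(q \mid p) = (-1)^{\frac{(p-1)(q-1)}{4}} \qquad (1)$$

Proof. $(q \mid p) = (-1)^m$ where

$$m \equiv \sum_{j=1}^{\frac{p-1}{2}} \left\lfloor \frac{jq}{p} \right\rfloor \pmod 2$$

Similarly $(p \mid q) = (-1)^n$ where

$$n \equiv \sum_{i=1}^{\frac{q-1}{2}} \left\lfloor \frac{ip}{q} \right\rfloor \pmod 2$$

Hence $(p \mid q)(q \mid p) = (-1)^{m+n}$ and (1) follows from the claimed identity:

$$\sum_{j=1}^{\frac{p-1}{2}} \left\lfloor \frac{jq}{p} \right\rfloor + \sum_{i=1}^{\frac{q-1}{2}} \left\lfloor \frac{ip}{q} \right\rfloor = \left(\frac{p-1}{2}\right)\left(\frac{q-1}{2}\right) \qquad (2)$$

Consider the rectangle with given vertices. (In the illustration, $p = 7$, $q = 5$.)

The diagonal does not pass through any lattice point, because if so, $y = \frac{q}{p}x$ at the lattice point $(x, y)$ $\Rightarrow xq = yp \Rightarrow p \mid x$ and $q \mid y$ so $x \geq p$, $y \geq q$ and the point $(x, y)$ must be outside the rectangle.
The total number of lattice points inside the rectangle is

$$\left(\frac{p-1}{2}\right)\left(\frac{q-1}{2}\right) = c$$

The total number of points in the triangle below the diagonal is

$$b = \sum_{j=1}^{\frac{p-1}{2}} \left\lfloor \frac{jq}{p} \right\rfloor$$

The number above is

$$a = \sum_{i=1}^{\frac{q-1}{2}} \left\lfloor \frac{ip}{q} \right\rfloor$$

So $a + b = c$ and so (2) follows, hence (1). □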
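The lattice-point identity (2) can be checked numerically for particular primes. A small sketch, with the hypothetical helper names `half_floor_sum` and `lattice_identity_holds`; it assumes only that $p$ and $q$ are distinct odd primes.

```python
def half_floor_sum(a, b):
    """Sum of floor(j*a/b) for j = 1..(b-1)/2."""
    return sum(j * a // b for j in range(1, (b - 1) // 2 + 1))

def lattice_identity_holds(p, q):
    """Check identity (2): the two floor sums add up to ((p-1)/2)*((q-1)/2)."""
    total = half_floor_sum(q, p) + half_floor_sum(p, q)
    return total == ((p - 1) // 2) * ((q - 1) // 2)

# The illustrated case p = 7, q = 5: three points below the diagonal, three above.
print(half_floor_sum(5, 7), half_floor_sum(7, 5), lattice_identity_holds(7, 5))
```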

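Ex $(219 \mid 383)$. Note $383 \in \mathbb{P}$. Now $219 = 3 \cdot 73$ ($73 \in \mathbb{P}$) so, by multiplicativity,

$$(219 \mid 383) = (3 \mid 383)(73 \mid 383).$$

Reciprocity implies that

$$(3 \mid 383)(383 \mid 3) = (-1)^{\frac{(383-1)(3-1)}{4}} = -1$$

so

$$(3 \mid 383) = -(383 \mid 3) = -(-1 \mid 3) = 1 \quad \text{using periodicity mod 3.}$$

Also

$$\begin{aligned} (73 \mid 383) &= (383 \mid 73)(-1)^{\frac{(383-1)(73-1)}{4}} \\ &= (18 \mid 73) \\ &= (2 \mid 73)(3 \mid 73)^2 \\ &= (-1)^{\frac{73^2-1}{8}} \\ &= 1 \end{aligned}$$

Hence $(219 \mid 383) = 1 \cdot 1 = 1$ and $x^2 \equiv 219 \pmod{383}$ has a solution.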
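The hand computation above (pull out factors of 2, flip with reciprocity, reduce) can be automated. The sketch below is the standard Jacobi-symbol algorithm, which is not developed in these notes but agrees with the Legendre symbol whenever the modulus is an odd prime; it uses only the formula for $(2 \mid p)$ and the reciprocity law. The function name `jacobi` is ours.

```python
def jacobi(n, m):
    """Jacobi symbol (n | m) for odd m > 0; equals the Legendre symbol when m is prime."""
    if m <= 0 or m % 2 == 0:
        raise ValueError("m must be a positive odd integer")
    n %= m
    result = 1
    while n != 0:
        while n % 2 == 0:              # remove factors of 2 using (2 | m) = (-1)^((m^2-1)/8)
            n //= 2
            if m % 8 in (3, 5):
                result = -result
        n, m = m, n                    # reciprocity: the sign flips iff both are 3 mod 4
        if n % 4 == 3 and m % 4 == 3:
            result = -result
        n %= m                         # periodicity in the top argument
    return result if m == 1 else 0

print(jacobi(3, 383), jacobi(73, 383), jacobi(219, 383))
```

No factorisation of 219 is needed here, which is the practical advantage of this variant over applying Theorem 32 to each prime factor separately.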


11 Elliptic Equations and Curves

• Diophantine family with interesting properties.

• Used in factoring and encryption.

• Curves have their own intrinsic arithmetic.

Ex (see above) $y^2 = x^3 + 7$ has no $\mathbb{Z}$ solutions.

General (Weierstrass) form:

$$y^2 + a_1xy + a_3y = x^3 + a_2x^2 + a_4x + a_6$$

$y \to \frac{1}{2}(y - a_1x - a_3)$ and multiplication by 4 gives

$$y^2 = 4x^3 + (a_1^2 + 4a_2)x^2 + 2(2a_4 + a_1a_3)x + (a_3^2 + 4a_6)$$

$x \to \dfrac{x - 3(a_1^2 + 4a_2)}{36}$, $y \to \dfrac{y}{108}$ and multiplying by $108^2$ we get

$$y^2 = x^3 - Ax - B$$

and

Proposition If the $a_i \in \mathbb{Z}$ so are $A$ and $B$.

Discriminant

Definition $D = 4A^3 - 27B^2$

If $y^2 = F(x)$ is an elliptic curve where $F(x)$ is a cubic polynomial with integer coefficients with roots $r_1, r_2, r_3 \in \mathbb{C}$ then the discriminant

$$D := \prod_{i<j} (r_i - r_j)^2 \in \mathbb{Z}$$

which is nonzero if and only if the roots are distinct.

• If $D = 0$ then $x^3 - Ax - B = (x - 2\alpha)(x + \alpha)^2$, where $A = 3\alpha^2$, $B = 2\alpha^3$, so $\alpha = \frac{3B}{2A}$ when $A \neq 0$.

1. If $D = 0$ and $A \neq 0$ then the curve $y^2 = x^3 - Ax - B$ crosses itself, known as a node.

2. If $D = 0$ and $A = B = 0$ we have a cusp.

• If $D \neq 0$ the curve is non-singular, i.e. ‘interesting’.

Can use Mathematica to plot elliptic curves, e.g.
ContourPlot[y^2 - x^3 + x, {x, -4, 4}, {y, -4, 4}, PlotPoints->200, Contours->{0}, ContourShading->False]

Line intersection property: each non-vertical line meeting the curve $E(\mathbb{R})$ (the points on the curve with coordinates in $\mathbb{R}$) in two points $P, Q$ meets it in a third point $R$.

Proof. If the line is $y = mx + c$, solve with $y^2 = x^3 - Ax - B$, so $(mx + c)^2 = x^3 - Ax - B$ has two solutions $x_1, x_2$ if $P = (x_1, y_1)$, $Q = (x_2, y_2)$, and therefore a third, $x_3$, so let $R = (x_3, mx_3 + c)$. □

Now if $P, Q$ have rational coordinates, then $m, c \in \mathbb{Q}$. The cubic is $x^3 - m^2x^2 - (A + 2mc)x - (B + c^2) = 0$, so $x_1 + x_2 + x_3 = m^2$. Hence if $x_1, y_1 \in \mathbb{Q}$ and $x_2, y_2 \in \mathbb{Q}$ then $x_3 \in \mathbb{Q}$ and hence $y_3 \in \mathbb{Q}$.

This simple observation enables us to generate new $\mathbb{Q}$ solutions, or points on $E(\mathbb{R})$, out of old.


Vertical lines: We say each vertical line meets the curve again “at ∞” and give this point a label 0 or zero.

Note:

1. $P' = Q'$ is possible but we still get a third point $R$.

2. If $P''Q''$ is vertical, then their x-coordinates are the same, so $P'' = (x_1, y_1)$, $Q'' = (x_2, y_2) \Rightarrow x_1 = x_2$ so $y_1^2 = x_1^3 - Ax_1 - B = x_2^3 - Ax_2 - B = y_2^2$. Hence $y_2 = -y_1$ and the curve is symmetric about OX.

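Definition of the group law: definition of +

If $P, Q \in E(\mathbb{R})$ and $R'$ has the same x-coordinate as $R$, the third point on the line through $P$ and $Q$, but with y-coordinate negated, let $R' = P + Q$. This defines +.

• $P = (x_1, y_1)$, $Q = (x_2, y_2)$, $R' = (x_3, y_3)$; then $x_1 \neq x_2 \Rightarrow$

$$x_3 = \left(\frac{y_2 - y_1}{x_2 - x_1}\right)^2 - x_1 - x_2, \qquad y_3 = -\frac{y_2 - y_1}{x_2 - x_1}x_3 - \frac{y_1x_2 - y_2x_1}{x_2 - x_1} \qquad \text{(A)}$$

• $P = Q \Rightarrow$

$$x_3 = \left(\frac{3x_1^2 - A}{2y_1}\right)^2 - 2x_1, \qquad y_3 = \frac{3x_1^2 - A}{2y_1}(x_1 - x_3) - y_1 \qquad \text{(B)}$$

• $P = Q'$ ($Q$ with its y-coordinate negated) $\Rightarrow 0 = P + Q$.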

Notes:

1. The proof of these formulas is an exercise in coordinate geometry.

2. P ′′ = P since (x1, −(−y1)) = (x1, y1).

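3.

$$\begin{aligned} Q + P &= \left(\left(\frac{y_1 - y_2}{x_1 - x_2}\right)^2 - x_1 - x_2,\ -\frac{y_1 - y_2}{x_1 - x_2}x_3 - \frac{y_2x_1 - y_1x_2}{x_1 - x_2}\right) \\ &= \left(\left(\frac{y_2 - y_1}{x_2 - x_1}\right)^2 - x_1 - x_2,\ -\frac{y_2 - y_1}{x_2 - x_1}x_3 - \frac{y_1x_2 - y_2x_1}{x_2 - x_1}\right) = P + Q \end{aligned}$$

so + is commutative.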

4. + is also associative (P +Q) +R = P + (Q+R).

5. + takes a point with Q coordinates to a point with Q coordinates i.e. + : E(Q)×E(Q)→ E(Q).


Write 2P instead of P + P and nP for P + (n− 1)P .

Note: We could have $P \neq 0$ but $nP = 0$ for some $n > 1$.

Ex $y^2 = x^3 - 63x - 162$: $P_1 = (-6, 0)$, $P_2 = (-3, 0)$, $P_3 = (9, 0)$ all satisfy $2P_i = 0$.

Definition If $nP = 0$ with $P \neq 0$ we say $P$ is a torsion point.

Ex $y^2 = x^3 - 2$, $5^2 = 3^3 - 2 \Rightarrow$

$$P = (3, 5) \in E(\mathbb{Z}) \subset E(\mathbb{Q})$$

$$2P = \left(\frac{129}{100},\ -\frac{383}{1000}\right)$$

$$3P = \left(\frac{164{,}323}{29{,}241},\ -\frac{66{,}234{,}835}{5{,}000{,}211}\right)$$

etc. and $nP \neq 0\ \forall n \in \mathbb{N}$.
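The multiples of $P$ in these examples can be checked with exact rational arithmetic. The following is a sketch of formulas (A) and (B) for a curve $y^2 = x^3 - Ax - B$, assuming points are given as pairs of `Fraction`s and representing the point at infinity by `None`; the names `ec_add` and `ec_mul` are illustrative, and `ec_mul` simply adds repeatedly, which is adequate only for small $n$.

```python
from fractions import Fraction

O = None  # the point at infinity (written 0 in the text)

def ec_add(P, Q, A):
    """P + Q on y^2 = x^3 - A x - B via formulas (A) and (B); B itself is not needed."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:         # vertical line, including doubling a point with y = 0
        return O
    if x1 == x2:                       # P = Q: slope of the tangent
        m = (3 * x1 * x1 - A) / (2 * y1)
    else:                              # slope of the chord through P and Q
        m = (y2 - y1) / (x2 - x1)
    x3 = m * m - x1 - x2
    y3 = m * (x1 - x3) - y1            # third intersection point, reflected in the x-axis
    return (x3, y3)

def ec_mul(n, P, A):
    """nP by repeated addition (a sketch; fine for small n)."""
    R = O
    for _ in range(n):
        R = ec_add(R, P, A)
    return R

P = (Fraction(3), Fraction(5))         # on y^2 = x^3 - 2, i.e. A = 0, B = 2
print(ec_mul(2, P, 0))
print(ec_mul(3, P, 0))
```

Using `Fraction` rather than floats matters here: the numerators and denominators of $nP$ grow quickly, and only exact arithmetic shows that the results stay in $E(\mathbb{Q})$.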


Ex $y^2 = x^3 - 11$, $P = (3, 4)$, $Q = (15, 58)$ generate an ‘independent’ set of two dimensions, i.e. $nP + mQ = 0 \Rightarrow n = m = 0$, and $E(\mathbb{Q})$ has no torsion points.

Ex (Mestre) y2 − 246xy + 36, 599, 029y = x3 − 19, 339, 780x − 36, 239, 244 has at least12 independent points.

Conjecture ∀n ∈ N ∃ an elliptic curve with at least n independent points.

Note: Finding points can be difficult: (Bremner, Cassels) $y^2 = x^3 + 877x$; $P = (0, 0)$, $2P = 0$; the next simplest point is

$$\left(\frac{375494528127162193105504069942092792346201}{6215987776871505425463220780697238044100},\ \frac{256256267988926809388776834045513089648669153204356603464786949}{490078023219787588959802933995928925096061616470779979261000}\right)$$

Theorem (Mazur) The number, $t$, of torsion points for an elliptic curve $E(\mathbb{Q})$ is one of $\{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 16\}$. Also, if $A, B \in \mathbb{Z}$, a torsion point has integral coordinates and either $y = 0$ (so $2(x, y) = 0$) or $y^2 \mid \Delta := 4D$.

There are infinitely many non-torsion points if there are any at all.


Definition The rank of a curve $E(\mathbb{Q})$ is an integer $r$ such that there are $r$ independent points on the curve and every rational point can be expressed as a sum of multiples of these points and some torsion point.

Theorem (Mordell) $0 \leq r < \infty$ so each curve has finite rank.

Note:

1. The structure of $E(\mathbb{Q})$ is that of a finitely generated abelian group, $G \cong T \oplus \mathbb{Z}^r$, where the torsion elements $T$ form a subgroup.

2. Finding $r$ is a very difficult problem.

Elliptic curves mod p

We can often get insight into points on a curve with rational coordinates, $E(\mathbb{Q})$, by considering them modulo $p$, where $p$ is a prime. If $p \neq 2$ or 3, formulas (A) and (B) still work so they define + for $E(\mathbb{Z}_p) \times E(\mathbb{Z}_p) \to E(\mathbb{Z}_p)$ with $\mathbb{Z}_p = \{[0], \dots, [p-1]\} = GF(p)$ the finite field of order $p$.

Ex

$$y^2 \equiv x^3 - Ax - B \pmod p$$

$$y^2 \equiv x^3 + x + 2 \pmod{11}$$

The solutions are $S = \{(1, \pm 2), (2, \pm 1), (4, \pm 2), (5, 0), (6, \pm 2), (7, 0), (8, \pm 4), (9, \pm 5), (10, 0)\}$, 15 points, and $0 = \infty$ making 16. All points are torsion since they are finite in number.

How many points are there in $E(\mathbb{Z}_p)$? $x$ can have $p$ values, so $x^3 - Ax - B$ takes at most $p$ values. If these were random, we would expect about half to be quadratic residues and half non-residues, the residues giving two possible values of $y$.

Theorem (Hasse) $|\#E(\mathbb{Z}_p) - (p + 1)| < 2\sqrt{p}$

Proposition $p \equiv 3 \pmod 4 \Rightarrow$ the number of points on $y^2 \equiv x^3 - x \pmod p$ is exactly $p + 1$ (including ∞).

Any integral solution of $y^2 = x^3 - Ax - B$ becomes a modular solution of the congruence $y^2 \equiv x^3 - Ax - B \pmod p$.

Warning: Over $\mathbb{Z}$, we may have $D \neq 0$ (a requirement), but $D \equiv 0 \pmod p$ when $p \mid \Delta$, and such a curve would not be elliptic mod $p$. If $\Delta \not\equiv 0 \pmod p$ we have good reduction, so we assume this is so.


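$E : y^2 = x^3 - Ax - B$, $A, B \in \mathbb{Z}$, $\Delta = 16(4A^3 - 27B^2) \neq 0$, $D = 4A^3 - 27B^2$

Map

$$\theta : \mathbb{Z} \to \mathbb{Z}/p\mathbb{Z} = \mathbb{F}_p = GF(p) \ (\mathbb{Z}_p \text{ before}), \qquad n \mapsto [n]_p$$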

Theorem (Nagell-Lutz) Points P = (x, y) of E(Q) of finite order, other than 0, i.e.the torsion points, have integer coordinates, i.e. are in E(Z), and either y = 0 or y | D.

We map $E(\mathbb{Q})$ to $E(\mathbb{Z}_p)$ through considering $y^2 \equiv x^3 - Ax - B \pmod p$.

Theorem (Reduction Theorem) Let $T \subset E(\mathbb{Q})$ be the subgroup of all points of finite order (the torsion subgroup). If $p \nmid 2D$, $p \in \mathbb{P}$, then reduction mod $p$ is an isomorphism of $T$ onto a subgroup of $E(\mathbb{Z}_p)$.

Theorem (Lagrange) If $S \subset G$ and $S$ is a subgroup of the finite group $G$, then $\#(S) \mid \#(G)$, i.e. the order of $S$ divides the order of $G$.

Corollary If $P \in T$ and the order of $P$ in $E(\mathbb{Q})$ is $m \in \mathbb{N}$ then $m \mid \#E(\mathbb{Z}_p)\ \forall p \nmid 2D$.

These theorems can be used to determine the points of finite order of an elliptic curve $E(\mathbb{Q})$.


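Ex $E : y^2 = x^3 + 3$, $D = -3^5$ so let $p \geq 5$, $p \in \mathbb{P}$. Then $\#E(\mathbb{Z}_5) = 6$, $\#E(\mathbb{Z}_7) = 13 \Rightarrow \#T \mid 6$ and $\#T \mid 13 \Rightarrow \#T = 1 \Rightarrow T = \{0\}$ so $E(\mathbb{Q})$ has no (finite) points of finite order. Note $(1, 2) \in E(\mathbb{Q})$ since $2^2 = 1^3 + 3$, so $(1, 2)$ has infinite order and $E(\mathbb{Q})$ has an infinite number of points.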
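Point counts such as $\#E(\mathbb{Z}_5)$ and $\#E(\mathbb{Z}_7)$ can be obtained by brute force for small $p$. A minimal sketch, assuming the curve is given as $y^2 = x^3 - Ax - B$ (so $y^2 = x^3 + 3$ has $A = 0$, $B = -3$) and counting the point at infinity as one extra point; `count_points` is an illustrative name.

```python
def count_points(A, B, p):
    """#E(Z_p) for y^2 = x^3 - A x - B, including the point at infinity."""
    square_counts = [0] * p
    for y in range(p):
        square_counts[y * y % p] += 1          # how many y give each value of y^2
    affine = sum(square_counts[(x**3 - A * x - B) % p] for x in range(p))
    return affine + 1                          # add the point at infinity

print(count_points(0, -3, 5), count_points(0, -3, 7))
```

This takes about $2p$ steps, which is fine for the small primes used with the Reduction Theorem but not for cryptographic sizes.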

Ex $E : y^2 = x^3 - 43x + 166$, $D = -2^{15} \cdot 13$. Exploring small integers $(x, y) \in \mathbb{Z}^2$ we find $P = (3, 8) \in E$. Using the point doubling formula above, the x-coordinates of $2P, 4P, \dots$ are $x(P) = 3$, $x(2P) = -5$, $x(4P) = 11$, $x(8P) = 3$, so $x(P) = x(8P) \Rightarrow 8P = \pm P$ so $P$ is a point of finite order.

Since $3 \nmid 2D$, by the Reduction Theorem, $T$ is isomorphic to a subgroup of $E(\mathbb{Z}_3)$. $\#E(\mathbb{Z}_3) = 7$ so $\#(T) = 1$ or 7. But $0 \in T$ and so is $P = (3, 8)$, so $\#(T) = 7$. The only abelian group of order 7 is $\mathbb{Z}_7$, a cyclic group generated by $P$ (which must be of order 7 since its order divides 7). Computing $\{0, P, 2P, 3P, 4P, 5P, 6P\}$ we get

$$T = \{0, (3, \pm 8), (-5, \pm 16), (11, \pm 32)\}.$$

Congruent Number Problem

Find a simple test to determine whether or not $n \in \mathbb{N}$ is the area of a right triangle, all of whose sides are of rational length. Such an $n$ is called congruent.

Ex $6 = \frac{1}{2} \cdot 3 \cdot 4$ so 6 is congruent.

Ex (Fermat) $n = 1$ is not congruent. ($X^4 + Y^4 \neq Z^4\ \forall X, Y, Z \in \mathbb{Z}$.)

Ex (Euler) $n = 7$ is congruent.

$\{1, 2, 3, 4\}$ are not congruent but $\{5, 6, 7\}$ are congruent.

Problem: Find a nice criterion to check $n$.

Theorem (Tunnell, 1983) Let $n$ be an odd square-free natural number. Then if $n$ is congruent, the number of triples $(x, y, z)$ satisfying $2x^2 + y^2 + 8z^2 = n$ is twice the number of triples $(x, y, z)$ satisfying $2x^2 + y^2 + 32z^2 = n$.

Let $n$ be square-free and let $X, Y, Z$ ($X < Y < Z$) be sides of a right triangle with area $n$. The number $n \in \mathbb{N}$ is fixed.

So $n = \frac{1}{2}XY$, $X^2 + Y^2 = Z^2$.

Proposition There is a 1-1 correspondence between the right triangles given above and rational numbers $x$ for which $x$, $x + n$, $x - n$ are each the square of a rational number. The correspondence is

$$(X, Y, Z) \mapsto x = \left(\frac{Z}{2}\right)^2$$

$$x \mapsto X = \sqrt{x + n} - \sqrt{x - n}, \quad Y = \sqrt{x + n} + \sqrt{x - n}, \quad Z = 2\sqrt{x}$$

In particular, $n$ is congruent ⇔ ∃$x \in \mathbb{Q}^+$ such that $x$, $x + n$, $x - n$ are squares of rational numbers.

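Proof. (⇒) Let $X, Y, Z \in \mathbb{Q}^+$ be a triple with $n = \frac{1}{2}XY$, $X^2 + Y^2 = Z^2$. Then

$$X^2 + Y^2 = Z^2 \text{ and } 2XY = 4n \Rightarrow (X \pm Y)^2 = Z^2 \pm 4n \Rightarrow \left(\frac{X \pm Y}{2}\right)^2 = \left(\frac{Z}{2}\right)^2 \pm n = x \pm n \qquad (1)$$

if $x = \left(\frac{Z}{2}\right)^2$. So $x$, $x \pm n$ are squares of rational numbers.

(⇐) Given $x$, $x \pm n$ being squares, then

$$X = \sqrt{x + n} - \sqrt{x - n}, \quad Y = \sqrt{x + n} + \sqrt{x - n}, \quad Z = 2\sqrt{x}$$

satisfy $X < Y < Z$ and $X, Y, Z \in \mathbb{Q}^+$. Finally

$$XY = (\sqrt{x + n} - \sqrt{x - n})(\sqrt{x + n} + \sqrt{x - n}) = (x + n) - (x - n) = 2n$$

and

$$\begin{aligned} X^2 + Y^2 &= (x + n) + (x - n) - 2\sqrt{(x + n)(x - n)} \\ &\quad + (x + n) + (x - n) + 2\sqrt{(x + n)(x - n)} \\ &= 4x \\ &= Z^2. \end{aligned}$$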

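Let $n$ be a congruent number. By the above equation (1),

$$\left(\frac{X \pm Y}{2}\right)^2 = \left(\frac{Z}{2}\right)^2 \pm n \quad \text{if } n = \frac{1}{2}XY$$

Multiply these two equations together:

$$\left(\frac{X^2 - Y^2}{4}\right)^2 = \left(\frac{Z}{2}\right)^4 - n^2$$

so $v^2 = u^4 - n^2$ has a rational solution $v = \frac{X^2 - Y^2}{4}$, $u = \frac{Z}{2}$. Now multiply by $u^2$: $u^6 - n^2u^2 = (uv)^2$. Let $x = u^2 = \left(\frac{Z}{2}\right)^2$ (as before) and $y = uv = (X^2 - Y^2)Z/8$ ⇒ a pair $(x, y) \in \mathbb{Q}^2$ satisfying $y^2 = x^3 - n^2x$, an elliptic equation $y^2 = x(x - n)(x + n)$.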
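As a concrete instance of this map, take the $(3, 4, 5)$ triangle with area $n = 6$. The sketch below, with the hypothetical helper name `triangle_to_point`, applies $x = (Z/2)^2$, $y = (X^2 - Y^2)Z/8$ exactly and lets one confirm that the result lies on $y^2 = x^3 - n^2x$.

```python
from fractions import Fraction

def triangle_to_point(X, Y, Z):
    """Map a rational right triangle (X, Y, Z) to (n, (x, y)) with y^2 = x^3 - n^2 x."""
    X, Y, Z = Fraction(X), Fraction(Y), Fraction(Z)
    if X * X + Y * Y != Z * Z:
        raise ValueError("not a right triangle")
    n = X * Y / 2                       # the area
    x = (Z / 2) ** 2
    y = (X * X - Y * Y) * Z / 8
    return n, (x, y)

n, (x, y) = triangle_to_point(3, 4, 5)
print(n, x, y, y * y == x**3 - n * n * x)
```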

Hence if $n$ is congruent, the curve $y^2 = x^3 - n^2x$ has a nontrivial rational point. The converse, that any point $(x, y) \in \mathbb{Q}^2$ must come from such a triangle, is false in general. We need extra conditions equivalent to ∃$Q \in E_n(\mathbb{Q})$ such that $(x, y) = P = 2Q$, i.e. $P$ is a (rational) point which is double a rational point.

Theorem 32B Let $(x, y) \in \mathbb{Q}^2$ be on $y^2 = x^3 - n^2x$. Let $x$ satisfy

(i) it is the square of a rational number,
(ii) its denominator is even,
(iii) its numerator is coprime with $n$.

Then there is a right triangle with rational sides and area n under the correspondence ofthe above Proposition.

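Proof. Let $u = \sqrt{x} \in \mathbb{Q}^+$ (i) and let $v = \frac{y}{u} \in \mathbb{Q}$. Since $(x, y)$ is on $E_n(\mathbb{Q})$: $v^2 = \frac{y^2}{x} = x^2 - n^2 \Rightarrow v^2 + n^2 = x^2$ (1). Let $t \in \mathbb{N}$ be the denominator of $u$, i.e. the smallest natural number so $tu \in \mathbb{Z}$. By (ii) $t$ is even.
Because $n \in \mathbb{N}$, the denominators of $v^2$ and $x^2$ are the same by (1), namely $t^4$.
Hence $(t^2v)^2 + (t^2n)^2 = (t^2x)^2$ is a primitive Pythagorean triple with $t^2n$ even. (Primitive through (iii).) Hence ∃$a, b \in \mathbb{Z}$ such that $t^2n = 2ab$, $t^2v = a^2 - b^2$, $t^2x = a^2 + b^2$. Then the right triangle with sides $\frac{2a}{t}, \frac{2b}{t}, 2u$ has area $\frac{1}{2} \cdot \frac{2a}{t} \cdot \frac{2b}{t} = \frac{2ab}{t^2} = n$. Finally, the image of this triangle with $X = \frac{2a}{t}$, $Y = \frac{2b}{t}$, $Z = 2u$ is $x = \left(\frac{Z}{2}\right)^2$ as required. □

Ex (i) and (ii) alone are not sufficient: $n = 5$, $x = \frac{25}{4}$, $y = \frac{75}{8} \Rightarrow X = \sqrt{x + n} - \sqrt{x - n} = \sqrt{5} \notin \mathbb{Q}$.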

Back to Tunnell’s theorem: $n$ odd and square-free,

$$n \text{ congruent (A)} \Rightarrow \#\{(x, y, z) \in \mathbb{Z}^3 : 2x^2 + y^2 + 8z^2 = n\} = 2\,\#\{(x, y, z) \in \mathbb{Z}^3 : 2x^2 + y^2 + 32z^2 = n\} \text{ (B)}$$

i.e. (A) ⇒ (B).

Then, subject to an unproved conjecture, (B) ⇒ (A). We can confidently use not(B) ⇒ not(A).

Ex $\#\{(x, y, z) : 2x^2 + y^2 + 8z^2 = n\} = \#\{(x, y, z) : 2x^2 + y^2 + 32z^2 = n\}$ if $n < 8$. So none of the odd $n < 8$, i.e. $\{1, 3, 5, 7\}$, can be congruent unless the size of each set is 0 ($2 \cdot 0 = 0$). But $x^2, y^2 \equiv 0, 1$ or $4 \pmod 8 \Rightarrow 2x^2 + y^2 + 8z^2 \not\equiv 5, 7 \pmod 8$. So e.g. if $n = 5$ or 7 both of the sets of triples are ∅.


Ex The first congruent number n ≡ 1, 3 (mod 8) is n = 41:

If (B) ⇒ (A) is true, the above argument would imply all of the following (odd, square-free) numbers are congruent, through 2 · 0 = 0: {5, 7, 13, 15, 21, 23, 29, 31, 37, 39, 47}.
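Condition (B) is a finite count, so it is easy to test by brute force. The sketch below assumes $n$ is an odd square-free natural number (this is not checked) and uses the illustrative names `count_triples` and `passes_tunnell`. Note that passing the test shows congruence only subject to the unproved conjecture mentioned above; failing it shows that $n$ is not congruent.

```python
from math import isqrt

def count_triples(n, c):
    """Number of integer triples (x, y, z) with 2x^2 + y^2 + c*z^2 = n."""
    total = 0
    zmax, xmax = isqrt(n // c), isqrt(n // 2)
    for z in range(-zmax, zmax + 1):
        for x in range(-xmax, xmax + 1):
            rest = n - 2 * x * x - c * z * z   # must equal y^2
            if rest < 0:
                continue
            y = isqrt(rest)
            if y * y == rest:
                total += 1 if y == 0 else 2    # count both y and -y
    return total

def passes_tunnell(n):
    """Condition (B) for odd square-free n: necessary for n to be congruent."""
    return count_triples(n, 8) == 2 * count_triples(n, 32)

for n in (1, 3, 5, 7, 41):
    print(n, count_triples(n, 8), count_triples(n, 32), passes_tunnell(n))
```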


12 Numbers Rational and Irrational

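If $\alpha \in \mathbb{R}$ we say $\alpha \in \mathbb{Q}$ if $\alpha = \frac{m}{n}$, $m, n \in \mathbb{Z}$, $n \neq 0$.

Proposition $\sqrt{2} \notin \mathbb{Q}$.

Proof. Assume $\sqrt{2} \in \mathbb{Q}$

$$\Rightarrow \sqrt{2} = \frac{a}{b}, \quad (a, b) = 1 \quad (???)$$

$$\Rightarrow a = \sqrt{2}\,b \Rightarrow a^2 = 2b^2 \Rightarrow 2 \mid a^2 \Rightarrow 2 \mid a$$

So $a = 2c$ and $4c^2 = 2b^2$

$$\Rightarrow 2c^2 = b^2 \Rightarrow 2 \mid b^2 \Rightarrow 2 \mid b$$

Hence $2 \mid a$ and $2 \mid b$ so $2 \mid (a, b)$ so $(a, b) \neq 1$ (!!!). □

We say $\sqrt{2}$ is irrational or $\sqrt{2} \in \mathbb{I} = \mathbb{R} \setminus \mathbb{Q}$.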

We can generalise the above proposition to get a much wider family of irrational numbers:

Theorem 33 If x ∈ R satisfies the equation

$$x^n + c_1x^{n-1} + \dots + c_n = 0$$

where $c_i \in \mathbb{Z}$, then $x$ is either an integer or an irrational number.

Proof. Let $x \in \mathbb{Q}$, i.e. $x = \frac{a}{b}$, $b > 0$, $(a, b) = 1$. Then

$$a^n = -b(c_1a^{n-1} + c_2a^{n-2}b + \dots + c_nb^{n-1})$$

If $b > 1$, then there is a prime $p \mid b \Rightarrow p \mid a^n \Rightarrow p \mid a$, but then $p \mid (a, b)$ (!!!). Hence $b$ has no prime divisors, so $b = 1$. □

Corollary If $m \in \mathbb{N}$ is not an $n$th power then $m^{1/n} = \sqrt[n]{m} \in \mathbb{I}$ since $\alpha = m^{1/n}$ satisfies $x^n - m = 0$.

Trigonometric function values and π

Lemma 1 Let $g \in \mathbb{Z}[x]$ (i.e. a polynomial with integral coefficients). Let $h(x) = \frac{x^ng(x)}{n!}$. If $j \neq n$, $h^{(j)}(0)$ is an integer divisible by $(n + 1)$. If $g(0) = 0$, $h^{(n)}(0)$ is an integer divisible by $(n + 1)$.

Proof. Let

$$h(x) = \frac{x^ng(x)}{n!} = \frac{1}{n!}(c_nx^n + c_{n+1}x^{n+1} + \dots + c_jx^j + \dots)$$

where $c_0, \dots, c_{n-1} = 0$ and the $c_i$ are integers.

Then the $j$’th derivative

$$h^{(j)}(0) = \frac{c_j\,j!}{n!}.$$

If $j < n$, $c_j = 0$.

If $j > n$, $n + 1 \mid h^{(j)}(0)$ since $\frac{j!}{n!} = (n + 1)(n + 2) \cdots (j)$.

If $j = n \Rightarrow h^{(n)}(0) = c_n$, but $g(0) = 0 \Rightarrow x^ng(x) = x^n[g_1x + g_2x^2 + \dots] = c_{n+1}x^{n+1} + \dots \Rightarrow c_n = 0 \Rightarrow n + 1 \mid h^{(n)}(0)$. □

Lemma 2 If $f(x)$ is a polynomial in $(r - x)^2$, then, for any odd positive integer $j$, $f^{(j)}(r) = 0$, i.e. $f'(r) = 0$, $f'''(r) = 0, \dots$.

Proof. Let $j \geq 0$ be an integer and $n \in \mathbb{N}$ a positive integer.

$$f(x) = a_0 \implies f^{(2j+1)}(x) = 0 \implies f^{(2j+1)}(r) = 0$$

$$f(x) = (r - x)^2 \implies f'(x) = -2(r - x) \implies f'(r) = 0$$

$$f(x) = (r - x)^{2n},\ 2j + 1 > 2n \implies f^{(2j+1)}(x) = 0 \implies f^{(2j+1)}(r) = 0$$

$$f(x) = (r - x)^{2n},\ 2j + 1 < 2n \implies f^{(2j+1)}(x) = (-1)^{2j+1}\frac{(2n)!}{(2n - 2j - 1)!}(r - x)^{2n-2j-1} \implies f^{(2j+1)}(r) = 0.$$

Therefore if $f(x)$ is a sum of even powers of $(r - x)$ all of its odd derivatives vanish at $x = r$. □

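Theorem 34 $\pi$ is irrational, i.e. $\pi \in \mathbb{I}$.

Proof. Let $f(x) = \frac{x^n(1 - x)^n}{n!}$ where $n \in \mathbb{N}$.

By Lemma 1 above, ∀$j$, $f^{(j)}(0) \in \mathbb{Z}$ and $f(x) = f(1 - x) \Rightarrow f^{(j)}(1) \in \mathbb{Z}$. Since $0 < x < 1 \Rightarrow 0 < x^n < 1$ and $0 < 1 - x < 1 \Rightarrow 0 < (1 - x)^n < 1$ we have $0 < f(x) < \frac{1}{n!}$ (1).

Let $\pi^2 = \frac{a}{b}$, $a \geq 1$, $b \geq 1$, $a, b \in \mathbb{N}$ (???). Let

$$F(x) = b^n\left[\pi^{2n}f^{(0)}(x) - \pi^{2n-2}f^{(2)}(x) + \pi^{2n-4}f^{(4)}(x) - \dots + (-1)^nf^{(2n)}(x)\right].$$

So $F(0) \in \mathbb{Z}$, $F(1) \in \mathbb{Z}$. Now

$$\begin{aligned} \frac{d}{dx}\left\{F'(x)\sin \pi x - \pi F(x)\cos \pi x\right\} &= \left\{F^{(2)}(x) + \pi^2F(x)\right\}\sin \pi x \\ &= b^n\pi^{2n+2}f(x)\sin \pi x \\ &= \pi^2a^nf(x)\sin \pi x \end{aligned}$$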

So

$$\pi a^n\int_0^1 f(x)\sin \pi x\,dx = \left[\frac{F'(x)\sin \pi x}{\pi} - F(x)\cos \pi x\right]_0^1 = F(1) + F(0) \in \mathbb{Z}.$$

But by (1),

$$0 < \pi a^n\int_0^1 f(x)\sin \pi x\,dx < \frac{\pi a^n}{n!} < 1 \quad \text{for } n > n_0$$

which is a contradiction. Hence $\pi^2$ is irrational. □

Corollary $\pi$ is irrational: If not, $\pi^2$ would be rational.

Note: With a similar, but more complex proof, we can show $r \in \mathbb{Q} \setminus \{0\} \Rightarrow \cos r$ is irrational.

Corollary 1 to the note $\pi$ is irrational, since if $\pi \in \mathbb{Q}$, $\cos \pi \in \mathbb{I}$ but $\cos \pi = -1$.

Corollary 2 All trigonometric functions are irrational at non-zero rational values of their arguments.

Proof. $r \in \mathbb{Q} \setminus \{0\}$ and $\sin r \in \mathbb{Q} \Rightarrow \cos 2r = 1 - 2\sin^2 r \in \mathbb{Q}$, which is false. Similarly, $\tan r \in \mathbb{Q} \Rightarrow \cos 2r = \frac{1 - \tan^2 r}{1 + \tan^2 r} \in \mathbb{Q}$. □

Corollary 3 Any non-zero value of an inverse trigonometric function is irrational at rational values of the argument.

Proof. Let $r \in \mathbb{Q}$ and $\arccos r = \cos^{-1} r = s \neq 0$. Suppose $s \in \mathbb{Q} \Rightarrow \cos s = r \in \mathbb{Q}$, which is false. □

Exponential, hyperbolic and logarithmic functions

Note: $e^0 = 1 \in \mathbb{Q}$ and $\sinh 0 = 0$, $\cosh 0 = 1$ but these are the only rational values at rational arguments. The proof is similar to Theorem 34, based on cosh:

Corollary 4 $e^r \in \mathbb{Q} \Rightarrow e^{-r} = \frac{1}{e^r} \in \mathbb{Q} \Rightarrow \frac{e^r + e^{-r}}{2} \in \mathbb{Q}$ but this is not possible if $r \in \mathbb{Q} \setminus \{0\}$.

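Theorem 35 $e$ is irrational, $e \in \mathbb{I}$.

Proof. Claim: ∀$n \in \mathbb{N}$

$$0 < e - \sum_{j=0}^{n}\frac{1}{j!} < \frac{1}{n \cdot n!} \qquad (1)$$

Represent $e$ by an infinite series $e = 1 + \frac{1}{1!} + \frac{1}{2!} + \dots + \frac{1}{j!} + \dots$ so

$$e - \sum_{j=0}^{n}\frac{1}{j!} = \sum_{j=n+1}^{\infty}\frac{1}{j!} > 0$$

Also

$$\begin{aligned} e - \sum_{j=0}^{n}\frac{1}{j!} &= \frac{1}{(n+1)!} + \frac{1}{(n+2)!} + \dots \\ &= \frac{1}{n!}\left[\frac{1}{n+1} + \frac{1}{(n+1)(n+2)} + \frac{1}{(n+1)(n+2)(n+3)} + \dots\right] \\ &< \frac{1}{n!}\left[\frac{1}{n+1} + \frac{1}{(n+1)^2} + \frac{1}{(n+1)^3} + \dots\right] \\ &= \frac{1}{n!}\left[\frac{1/(n+1)}{1 - 1/(n+1)}\right] \quad \text{(sum of a geometric series, ratio } \tfrac{1}{n+1}\text{)} \\ &= \frac{1}{n!} \cdot \frac{1}{n} \end{aligned}$$

which proves the claim.

Now let $e = \frac{m}{n}$, $m, n \in \mathbb{N}$, $(m, n) = 1$ (???), and assume $n \neq 1$. Let

$$\eta = n!\left(e - \sum_{j=0}^{n}\frac{1}{j!}\right)$$

By (1)

$$0 < \eta < n! \cdot \frac{1}{n \cdot n!} = \frac{1}{n}$$

But

$$\eta = n!\left(\frac{m}{n} - 1 - \frac{1}{1!} - \frac{1}{2!} - \dots - \frac{1}{n!}\right) \in \mathbb{Z} \quad (!!!)$$

Hence $e$ is irrational. □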

Corollary $\sqrt{e}$ is irrational, since otherwise $e = (\sqrt{e})^2$ would be in $\mathbb{Q}$.

Question: $e$ seems to be ‘more’ irrational than $\sqrt{2}$. We will explore families of irrational numbers below.

Let $S \subset \mathbb{R}$ be a subset. We say $S$ has measure zero if it is possible to cover $S$ with a finite or countable set of intervals of arbitrarily small total length. Write $\mu(S) = 0$.

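Ex $S = \mathbb{N}$:

$$1 \in \left(1 - \frac{\varepsilon}{2},\ 1 + \frac{\varepsilon}{2}\right), \quad 2 \in \left(2 - \frac{\varepsilon}{2^2},\ 2 + \frac{\varepsilon}{2^2}\right), \quad j \in \left(j - \frac{\varepsilon}{2^j},\ j + \frac{\varepsilon}{2^j}\right) = I_j$$

So

$$\mathbb{N} \subset \bigcup_{j=1}^{\infty} I_j$$

and

$$\ell(I_j) = \text{length of } I_j = j + \frac{\varepsilon}{2^j} - \left(j - \frac{\varepsilon}{2^j}\right) = \frac{2\varepsilon}{2^j}$$

Then

$$\sum_{j=1}^{\infty}\ell(I_j) = 2\varepsilon\sum_{j=1}^{\infty}\frac{1}{2^j} = 2\varepsilon$$

which can be made arbitrarily small by choice of $\varepsilon > 0$. Hence $\mu(\mathbb{N}) = 0$. We can replace $\mathbb{N}$ by any countable set $A = \{a_n : n \in \mathbb{N}\} \subset \mathbb{R}$ by defining $I_j = \left(a_j - \frac{\varepsilon}{2^j},\ a_j + \frac{\varepsilon}{2^j}\right)$ since $\ell(I_j) = \frac{2\varepsilon}{2^j}$.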


Definition A property of real numbers is said to hold “almost everywhere”, or for almost all numbers, if the set of numbers which do not have the property has measure zero.

Ex µ(Q) = 0 since Q is countable. Hence almost all numbers are irrational.

Note We can count the numbers in $\mathbb{Q}^+$ via listing them in an array and then counting along the diagonals, skipping any already counted:

1/1    1/2 →  1/3   · · ·
 ↓   ↗     ↙
2/1    2/2    2/3   · · ·
     ↙
3/1    3/2    3/3   · · ·
 ↓

so $r_1 = \frac{1}{1}$, $r_2 = \frac{2}{1}$, $r_3 = \frac{1}{2}$, $r_4 = \frac{1}{3}$, ($\frac{2}{2}$ is skipped), $r_5 = \frac{3}{1}, \dots$

$\Rightarrow \mathbb{Q}^+ = \{r_n : n \in \mathbb{N}\}$.
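The diagonal count can be written as a generator. A sketch under the assumption that the array has $i/j$ in row $i$, column $j$ and that diagonals $i + j = d$ are walked alternately upwards and downwards, as the arrows in the diagram suggest; the name `positive_rationals` is ours.

```python
from fractions import Fraction
from itertools import islice

def positive_rationals():
    """Yield each positive rational exactly once, zig-zagging along diagonals i + j = d."""
    seen = set()
    d = 2
    while True:
        # odd d: walk up and to the right (row i decreasing); even d: down and to the left
        rows = range(d - 1, 0, -1) if d % 2 else range(1, d)
        for i in rows:
            r = Fraction(i, d - i)     # entry in row i, column d - i
            if r not in seen:          # skip values already counted, e.g. 2/2
                seen.add(r)
                yield r
        d += 1

print(list(islice(positive_rationals(), 10)))
```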

Since $\ell([0, 1]) = 1 > 0$, $[0, 1]$ and (hence) $\mathbb{R}$ are not of measure 0. Hence, since the union of any two countable sets is countable, the irrational numbers $\mathbb{I}$ are not countable.

Proof. $A = \{a_n : n \in \mathbb{N}\}$, $B = \{b_n : n \in \mathbb{N}\} \Rightarrow A \cup B = \{c_n : c_{2n} = a_n,\ c_{2n-1} = b_n,\ n = 1, 2, 3, \dots\}$ so $A \cup B$ is countable. □

Now let S ⊂ R. We say S is dense in R if ∀α < β ∃x ∈ S with α < x < β.

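Archimedean Axiom (AA) ∀$\varepsilon > 0$ ∃$n \in \mathbb{N}$ such that $0 < \frac{1}{n} < \varepsilon$.

Proposition $\mathbb{Q}$ is dense in $\mathbb{R}$.

Proof. Let $\alpha < \beta$. By AA ∃$n \in \mathbb{N}$ such that $0 < \frac{1}{n} < \beta - \alpha$. Let $m \in \mathbb{Z}$ satisfy $m < n\beta \leq m + 1$. Then $\alpha < \beta - \frac{1}{n} \leq \frac{m+1}{n} - \frac{1}{n} = \frac{m}{n}$ and $\frac{m}{n} < \beta$. Hence $\alpha < \frac{m}{n} < \beta$ and we can let $x = \frac{m}{n}$. □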


Proposition I is dense in R.

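Proof. Let $\alpha, \beta \in \mathbb{R}$ have $\alpha < \beta$. Let $\alpha < \frac{m}{n} < \beta$ as above, and using AA choose $k \in \mathbb{N}$ so

$$0 < \frac{1}{k} < \frac{\beta - \frac{m}{n}}{\sqrt{2}}.$$

Then $\alpha < \frac{m}{n} < \frac{m}{n} + \frac{\sqrt{2}}{k} < \beta$ and $x = \frac{m}{n} + \frac{\sqrt{2}}{k} \in \mathbb{I}$. □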

Definition A number is algebraic if it satisfies an equation

$$x^n + a_1x^{n-1} + a_2x^{n-2} + \dots + a_n = 0$$

with $a_i \in \mathbb{Q}$.

Ex $\sqrt{2}$ satisfies $x^2 - 2 = 0$.

The unique polynomial with leading coefficient 1 (called monic) in $\mathbb{Q}[x]$ of minimal degree which has a given algebraic number $\alpha$ as a root is called the minimal polynomial of $\alpha$, and the degree of this polynomial is called the degree of $\alpha$.

The set of all algebraic numbers is called $\mathbb{A} \subset \mathbb{R}$.

Proposition $\mathbb{A}$ is countable.

Proof. If $\mathbb{A}_n$ is the set of algebraic numbers of degree $n$ for $n = 1, 2, 3, \dots$ then

$$\mathbb{A} = \bigcup_{n=1}^{\infty}\mathbb{A}_n$$

There are a countable number of polynomials of degree $n$ with $\mathbb{Q}$ coefficients since $p(x) = x^n + a_1x^{n-1} + \dots + a_n \leftrightarrow (a_1, \dots, a_n) \in \mathbb{Q}^n$ and the latter is a countable set.

But each polynomial has at most $n$ roots in $\mathbb{R}$ ⇒ $\mathbb{A}_n$ is countable. To complete the proof we need that a countable union of countable sets is countable. To see this, use the diagonal counting trick:

A1 = { a11,   a12 →  a13,  . . . }
        ↓   ↗     ↙
A2 = { a21,   a22,   a23,  . . . }
            ↙
A3 = { a31,   a32,   a33,  . . . }
        ↓
        ⋮

Since $\mu(\mathbb{A}) = 0$, almost all numbers are not algebraic. We call these numbers transcendental and the set of all such numbers $\mathbb{T} = \mathbb{R} \setminus \mathbb{A}$.

Ex $\dfrac{2^{1/3} + \sqrt{2}}{\sqrt{3}} \in \mathbb{A}$, $\pi$ and $e \in \mathbb{T}$

The former is not difficult, but $\pi$ and $e$ are both very difficult.


Both $\mathbb{Q}$ and $\mathbb{I}$ (and $\mathbb{T}$) are dense in $\mathbb{R}$. This implies each real number can be expressed as the limit of rational numbers: Let $\alpha \in \mathbb{R}$; then given $n \in \mathbb{N}$ ∃$r_n \in \mathbb{Q}$ with $\alpha - \frac{1}{n} < r_n < \alpha + \frac{1}{n}$, so $|\alpha - r_n| < \frac{1}{n} \Rightarrow \alpha = \lim_{n \to \infty} r_n$. But this universal fact gives little insight into the difference between $\mathbb{Q}$, $\mathbb{A}$ and $\mathbb{T}$.

Definition A real number $\alpha$ is said to be approximable by rationals to order $n \in \mathbb{N}$ if ∃ a constant $C = C(\alpha) > 0$ such that the inequality

$$\left|\alpha - \frac{h}{k}\right| < \frac{C}{k^n}$$

has infinitely many rational solutions $\frac{h}{k}$ where $k > 0$, $(h, k) = 1$.

Note Approximable to order 3 ⇒ Approximable to order 2 and 1.

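Theorem 36 If $\alpha \in \mathbb{I}$, ∃ infinitely many $\frac{h}{k} \in \mathbb{Q}$ with

$$\left|\alpha - \frac{h}{k}\right| < \frac{1}{k^2}$$

i.e. $\alpha$ is approximable to order 2.

Proof. See page 75. If $\alpha \in \mathbb{I}$ its continued fraction expansion is infinite so the set of convergents $\frac{p_n}{q_n}$ is infinite and

$$\left|\alpha - \frac{p_n}{q_n}\right| < \frac{1}{q_nq_{n+1}} < \frac{1}{q_n^2}$$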

171

so we can let hk

= pnqn

.OR Let n ∈ N. Consider the n+ 1 real numbers

S = {0, α− bαc , 2α− b2αc , . . . , nα− bnαc}

and their distribution in the intervals jn6 x < j+1

n, j = 0, . . . , n − 1 which cover [0, 1),

so contain all of the numbers in S. Hence (by the Dirichlet pigeon-hole principle) twonumbers lie in the same interval, say 0 6 n1 < n2 6 n, n1α − bn1αc , n2α − bn2αc ∈[jn, j+1

n

). The length of this interval is 1

nso

|(n2α− bn2αc)− (n1α− bn1αc)| <1

n

Let k = n2 − n1 and h = bn2αc − bn1αc , k ∈ N, h ∈ Z. so |kα− h| < 1n

andk 6 n (1) ⇒

∣∣α− hk

∣∣ < 1nk6 1

k2 . Suppose there were only a finite number of such pairs(h, k) : (h1, k1) , . . . , (hr, kr). Let

ε = min

{∣∣∣∣α− h1

k1

∣∣∣∣ , . . . , ∣∣∣∣α− hrkr

∣∣∣∣} > 0.

Use AA to find n ∈ N with 0 < 1n< ε so ∃h, k by (1) so

∣∣α− hk

∣∣ < 1nk6 1

n< ε so h

k6= hi

ki(!!!). �
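The pigeon-hole argument is constructive, so it can be tried numerically. The following is a minimal sketch, assuming floating-point arithmetic is accurate enough for small n; the function name `dirichlet_approx` is illustrative and not part of the notes.

```python
import math

def dirichlet_approx(alpha, n):
    """Return (h, k) with 1 <= k <= n and |k*alpha - h| < 1/n.

    Follows the pigeon-hole proof: the n + 1 fractional parts of
    0*alpha, 1*alpha, ..., n*alpha fall into n intervals [j/n, (j+1)/n).
    """
    seen = {}  # interval index j -> multiplier whose fractional part lies there
    for m in range(n + 1):
        frac = m * alpha - math.floor(m * alpha)
        j = min(int(frac * n), n - 1)  # guard against float rounding at the top edge
        if j in seen:
            n1, n2 = seen[j], m
            k = n2 - n1
            h = math.floor(n2 * alpha) - math.floor(n1 * alpha)
            return h, k
        seen[j] = m
    raise AssertionError("unreachable: n + 1 numbers in n intervals must collide")

alpha = math.sqrt(2)
for n in (5, 10, 50, 500):
    h, k = dirichlet_approx(alpha, n)
    print(n, (h, k), abs(alpha - h / k) < 1 / k**2)
```

For n = 5 the collision is between the fractional parts of 0·√2 and 5·√2, giving h/k = 7/5.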

Theorem 37 Any rational number is approximable to order 1, but not to any higher order.

Proof. Let $\alpha = \frac{a}{b}$, (a, b) = 1, $b \ge 1$ be rational. Then there are infinitely many solutions (x, y) to ax − by = 1 ($x = x_0 + bt$, $y = y_0 + at$, t ∈ Z if $(x_0, y_0)$ is one solution) and infinitely many with x > 0. Then

$$ax - by = 1 \ \Rightarrow\ \left|\frac{a}{b} - \frac{y}{x}\right| = \frac{1}{bx} < \frac{2}{x}.$$

Hence α is approximable to order 1. If $\frac{y}{x} \in Q$ and $\frac{y}{x} \ne \frac{a}{b}$ then

$$\left|\frac{a}{b} - \frac{y}{x}\right| = \left|\frac{ax - by}{bx}\right| \ge \frac{1}{bx}$$

and there is no constant C such that $\frac{1}{bx} < \frac{C}{x^2}$ for infinitely many x ∈ N. Hence $\frac{a}{b} = \alpha$ is not approximable to any order higher than 1. □

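Ex $\xi = \frac{1}{10} + \frac{1}{10^2} + \frac{1}{10^4} + \cdots + \frac{1}{10^{2^m}} + \cdots \in I$. Let $r_m$ = (m + 1)th partial sum of ξ, $r_m \in Q$. Then $|\xi - r_m| = 10^{-2^{m+1}} + 10^{-2^{m+2}} + \cdots < 2 \cdot 10^{-2^{m+1}} = 2\,(10^{-2^m})^2$, and $r_m = \frac{a_m}{10^{2^m}}$, $a_m \in N$. So this inequality shows we can approximate ξ to order 2 at least. Hence (by Theorem 37) ξ ∉ Q ⇒ ξ ∈ I.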


Theorem 38 A real algebraic number α of degree n is not approximable to order n + 1 or higher.

Proof. Theorem 37 is the case n = 1. Let α satisfy the equation $f(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_n = 0$ where $a_i \in Z$, $n \ge 2$. Since the degree of α is n, this polynomial must be irreducible (since otherwise f(x) = g(x) · h(x) ⇒ 0 = f(α) = g(α) · h(α) so g(α) = 0 or h(α) = 0 and each would have lower degree than n).

If x ∈ (α − 1, α + 1), then |x| < |α| + 1 so

$$|f'(x)| = \left|n a_0 x^{n-1} + (n-1) a_1 x^{n-2} + \cdots + a_{n-1}\right|$$
$$\le \left|n a_0 x^{n-1}\right| + \left|(n-1) a_1 x^{n-2}\right| + \cdots + |a_{n-1}|$$
$$< n|a_0|\,(|\alpha| + 1)^{n-1} + (n-1)|a_1|\,(|\alpha| + 1)^{n-2} + \cdots + |a_{n-1}| = A.$$

Now if $\frac{h}{k}$ is a rational approximation to α with $\alpha - 1 < \frac{h}{k} < \alpha + 1$ we must have $f\!\left(\frac{h}{k}\right) \ne 0$, since otherwise f(x) would have a factor $x - \frac{h}{k}$ over Q, but it is of degree $n \ge 2$ and irreducible. Hence

$$\left|f\!\left(\frac{h}{k}\right)\right| = \frac{\left|a_0 h^n + a_1 h^{n-1} k + \cdots + a_n k^n\right|}{k^n} \ge \frac{1}{k^n}.$$

By the Mean Value Theorem $f\!\left(\frac{h}{k}\right) = f\!\left(\frac{h}{k}\right) - f(\alpha) = \left(\frac{h}{k} - \alpha\right) f'(\xi)$ for some ξ between $\frac{h}{k}$ and α. Hence

$$\left|\frac{h}{k} - \alpha\right| = \frac{\left|f\!\left(\frac{h}{k}\right)\right|}{|f'(\xi)|} > \frac{1}{A k^n}.$$

There is no constant C so that $\frac{1}{A k^n} < \frac{C}{k^{n+1}}$ for infinitely many k, hence α is not approximable to order n + 1 or higher. □


Definition A Liouville Number η ∈ R satisfies ∀m ∈ N ∃ $\frac{h_m}{k_m} \in Q$ such that

$$\left|\eta - \frac{h_m}{k_m}\right| < \frac{1}{k_m^m}.$$

Proposition Any Liouville Number is transcendental.

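Proof. ∀ $m \ge n + 1$, $\left|\eta - \frac{h_m}{k_m}\right| < \frac{1}{k_m^m} \le \frac{1}{k_m^{n+1}}$, so η is approximable to order n + 1. By Theorem 38, η cannot be algebraic of degree n, and this holds ∀n ∈ N.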

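Ex $\eta_1 = 10^{-1!} + 10^{-2!} + \cdots + 10^{-m!} + \cdots = 0.1100010\ldots$ and $\frac{h_m}{k_m}$ is the mth partial sum so $k_m = 10^{m!}$ and

$$\left|\eta_1 - \frac{h_m}{k_m}\right| = 10^{-(m+1)!} + 10^{-(m+2)!} + \cdots < 2 \cdot 10^{-(m+1)!} < \left(10^{m!}\right)^{-m} = \frac{1}{k_m^m}.$$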

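Ex (Baker)(Alladi, 1979) $\left|\log(2) - \frac{a}{b}\right| > \frac{1}{10^{10}\, b^{5.8}}$ ∀ $\frac{a}{b} \in Q$.

Ex (Baker, 1964) $\left|\sqrt[3]{2} - \frac{a}{b}\right| > \frac{C}{b^{2.96}}$ ∀ $\frac{a}{b} \in Q$.

Ex (Mahler, 1953) $\left|\pi - \frac{a}{b}\right| > \frac{1}{b^{42}}$ ∀ $\frac{a}{b} \in Q$.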


Ex (Gelfond, Schneider, 1934) α ∈ A, β ∈ A \ Q, α ≠ 0, 1 ⇒ $\alpha^\beta \in T$. e.g. $2^{\sqrt{2}} \in T$, $\sqrt{2}^{\sqrt{2}} \in T$.

Ex (Hermite, 1873) e ∈ T.

Ex (Lindemann, 1882) π ∈ T via α ∈ A \ {0} ⇒ $e^\alpha \in I$ since $e^{i\pi} = -1$.

RSA Public Key Cryptograms

1. Let p, q ∈ P be large and distinct primes known only to Alice.

2. Alice selects a (large) random integer e relatively prime to r := (p− 1)(q − 1).

3. Alice publishes n = pq and e (say in a newspaper—the public key).

4. The sender Bob has a message and encodes this in an integer 1 < m < n.

5. He uses the public key (n, e) to compute the least positive residue $c \equiv m^e \pmod{n}$. The encrypted message is c.

6. He sends c to Alice using any (public) transmission process.


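7. Alice recovers m from c using

$$\varphi(n) = n \prod_{p \mid n}\left(1 - \frac{1}{p}\right) = pq\left(1 - \frac{1}{p}\right)\left(1 - \frac{1}{q}\right) = (p-1)(q-1) = r$$

as follows:

8. Compute $d \equiv e^{-1} \pmod{r}$ using xe + yr = 1 so $de \equiv 1 \pmod{r}$.

9. Then $c^d \equiv (m^e)^d \equiv m^{ed} \equiv m^{1 + \ell\varphi(n)} \equiv m\,(m^{\varphi(n)})^{\ell} \equiv m \cdot 1^{\ell} \pmod{n}$ (by Euler's theorem, when (m, n) = 1) ⇒ $c^d \equiv m \pmod{n}$ and 1 < m < n ⇒ we have recovered m.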
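Steps 1 to 9 can be traced with a toy key. This is only a sketch: the primes 61 and 53, the exponent 17 and the message 65 are illustrative choices, far too small for real use, and `pow(e, -1, r)` assumes Python 3.8 or later.

```python
from math import gcd

# Steps 1-3: Alice's key material (toy-sized, hypothetical values).
p, q = 61, 53
n = p * q                  # public modulus
r = (p - 1) * (q - 1)      # r = phi(n), kept secret
e = 17                     # public exponent, must satisfy gcd(e, r) = 1
assert gcd(e, r) == 1

# Step 8: d is the inverse of e modulo r.
d = pow(e, -1, r)

def encrypt(m):
    """Step 5: c = m^e mod n, computed from the public key (n, e)."""
    return pow(m, e, n)

def decrypt(c):
    """Step 9: m = c^d mod n, computed from the private exponent d."""
    return pow(c, d, n)

m = 65
c = encrypt(m)
print(n, r, d, c, decrypt(c))
```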

Code Cracking

Given n = pq where p, q ∈ P and are large, find p and q.

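Method 1 Either $p \le \sqrt{n}$ or $q \le \sqrt{n}$ so for $2 \le j \le \lfloor\sqrt{n}\rfloor$ try j | n until it succeeds.

• If $n \sim 10^{100}$, $\sqrt{n} \sim 10^{50}$.

• If we can check 1 million divisors per second on average, then we need $10^{50}/10^{6} = 10^{44}$ seconds, i.e. $T \approx 3.2 \times 10^{36}$ years.

• If we speed this up by 1 million times (i.e. $10^{12}$ per second), $T \approx 3.2 \times 10^{30}$ years.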

Method 2 (Pollard's p − 1, 1974) Let n have a prime factor p such that p − 1 is a product of (high powers) of small primes. By Fermat's little theorem, if p ∤ a, $a^{p-1} \equiv 1 \pmod{p}$ ⇒ $p \mid (a^{p-1} - 1, n)$. The prime p is unknown. So first let $k = 2^{\alpha_1} 3^{\alpha_2} \cdots r^{\alpha_s}$ where 2, 3, . . . , r are the first s primes and $\alpha_i \in N$ are "small".

Compute $(a^k - 1, n) = ((a^k - 1) \bmod n,\ n)$, which can be done in $O(\log_2(2kn))$ operations. If ∃ p | n with p − 1 | k and (a, n) = 1 then $p \mid a^k - 1$ since $a^k = a^{(p-1)\ell} = (a^{p-1})^{\ell} \equiv 1^{\ell} \equiv 1 \pmod{p}$ and so $(a^k - 1, n) \ge p > 1$.

If $(a^k - 1, n) \ne n$ we have a non-trivial factor of n, so we can divide by it, and repeat this process on each factor.

If $(a^k - 1, n) = n$ choose a new a.

If $(a^k - 1, n) = 1$ choose a larger k.

Ex n = 246,082,373

1. Compute $2^{n-1} \not\equiv 1 \pmod{n}$ (Mathematica: PowerMod[2, n-1, n], very slow). Hence n is not prime, by Fermat's little theorem. ⇒ n is composite.

2. Let a = 2, $k = 2^2 \cdot 3^2 \cdot 5 = 180$ then $k = 180 = 2^2 + 2^4 + 2^5 + 2^7$ in base 2. We need to compute $2^{2^i} \pmod{n}$ for $0 \le i \le 7$; we can do a few extras by successive squaring mod n:

i     $2^{2^i} \pmod{n}$
0     2
1     4
2     16
3     256
4     65,536
5     111,566,955
6     166,204,404
7     214,344,997
8     111,354,998
9     82,087,367
10    7,262,569
11    104,815,687

Using the table

$$2^{180} = 2^{2^2} \cdot 2^{2^4} \cdot 2^{2^5} \cdot 2^{2^7} \equiv 16 \cdot 65{,}536 \cdot 111{,}566{,}955 \cdot 214{,}344{,}997 \equiv 2{,}921{,}261 \pmod{246{,}082{,}373}$$

then, using the Euclidean algorithm: $(2^{180} - 1, n) = \gcd(2{,}921{,}260,\ 246{,}082{,}373) = 1$, so the test fails because n has no factor p with p − 1 dividing 180.

Choose a new $k = \mathrm{lcm}\{2, 3, \ldots, 9\} = 2^3 \cdot 3^2 \cdot 5 \cdot 7 = 2520$. In base 2, $2520 = 2^3 + 2^4 + 2^6 + 2^7 + 2^8 + 2^{11}$ so $2^{2520} \equiv 2^{2^3} \cdot 2^{2^4} \cdots 2^{2^{11}} \equiv 101{,}220{,}672 \pmod{n}$ and $(2^{2520} - 1, n) = \gcd(101{,}220{,}671,\ 246{,}082{,}373) = 2521$ so 2521 | n and we have a factor. Indeed n = 2521 · 97613 and each of these factors is prime.
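The successive-squaring table translates directly into a loop over the binary digits of k. The sketch below is illustrative (the name `power_mod` is not from the notes; Python's built-in three-argument `pow` does the same job) and reuses the modulus from the example.

```python
from math import gcd

def power_mod(a, k, n):
    """Compute a^k mod n by successive squaring (binary expansion of k)."""
    result = 1
    square = a % n              # holds a^(2^i) mod n, starting with i = 0
    while k > 0:
        if k & 1:               # binary digit k_i = 1: multiply this power in
            result = (result * square) % n
        square = (square * square) % n
        k >>= 1
    return result

n = 246_082_373
print(power_mod(2, 2**5, n))                 # row i = 5 of the table
print(gcd(power_mod(2, 180, n) - 1, n))      # first attempt, k = 180
print(gcd(power_mod(2, 2520, n) - 1, n))     # second attempt, k = 2520
```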

Summary (Pollard p − 1) n ≥ 2 composite given.

1. $k = \mathrm{lcm}\{1, 2, 3, \ldots, K\}$, a product of small primes to small powers.

2. Choose arbitrary a in 1 < a < n, say a = 2.

3. Calculate (a, n). If more than 1 then (a, n) is a factor of n so return it.

4. Let $d = (a^k - 1, n)$.
If 1 < d < n return d.
If d = 1 go to 1. and choose K → K + 1.
If d = n go to 2. and choose another a.

Pollard's algorithm eventually returns a proper factor since we will reach $K = \frac{1}{2}(p - 1)$ so $k = 2^{\alpha_1} \cdots \frac{1}{2}(p - 1)$ and (p − 1) | k.

Note: The algorithm is fast only when n has a prime factor p such that p − 1 is the product of small primes to small powers, i.e. K is reasonably small.
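The summary can be sketched in a few lines. One deliberate substitution is made here: instead of k = lcm{1, ..., K}, the exponent is built up as K! (raise the running value to the power K at each step), which is a multiple of the lcm and so also works; the function name and the cut-off `max_K` are illustrative assumptions.

```python
from math import gcd

def pollard_p_minus_1(n, a=2, max_K=10_000):
    """Try to split n with Pollard's p - 1 method; return a proper factor or None.

    Uses the exponent k = K! (a multiple of lcm{1..K}) for K = 2, 3, ...
    """
    d = gcd(a, n)
    if d > 1:
        return d                    # step 3: a shares a factor with n
    x = a % n
    for K in range(2, max_K + 1):
        x = pow(x, K, n)            # now x = a^(K!) mod n
        d = gcd(x - 1, n)
        if 1 < d < n:
            return d                # step 4: proper factor found
        if d == n:
            return None             # caller should retry with another a
    return None                     # no factor p with p - 1 smooth enough

print(pollard_p_minus_1(246_082_373))
```

On the example n = 246,082,373 this finds 2521, since 2520 divides 7!.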

Method 3 (Lenstra, ECM, 1987) Non-zero elements of $Z/pZ = F_p$ form a multiplicative group of order p − 1 so p − 1 | k ⇒ $a^k = 1$ in the group. This makes Pollard's p − 1 work. Here the group $F_p^*$ (so-called multiplicative group) is replaced by the group of points on an elliptic curve $E(F_p)$ and a by a point $P \in E(F_p)$.
As before, choose k a product of small primes. If $\#E(F_p) \mid k$ ⇒ kP = 0 in $E(F_p)$ and this will often allow us to find a non-trivial factor of n.

Lenstra is good if for some curve E(Q) and some p ∈ P, p | n and $\#E(F_p)$ is a product of small primes. If we lose with Pollard, the game is over, e.g. n = pq and p − 1, q − 1 both have large prime factors. If we lose with Lenstra, we simply choose a new curve.

Note: (Subject to a conjecture but crucial underpinning) $\#E(F_p) = p + 1 - \varepsilon_p$, $|\varepsilon_p| \le 2\sqrt{p}$ and for fixed p, as we pass over all such curves, the numbers $\varepsilon_p$ are well spread in the interval $[-2\sqrt{p},\ 2\sqrt{p}]$ so it is likely we will find a curve E with $\#E(F_p)$ = product of small primes.

Summary (Lenstra, ECM) n ≥ 2 a composite integer.

1. Check (n, 6) = 1 and $n \ne m^r$ for any r ≥ 2.

2. Choose random integers $b, x_1, y_1$ with $1 < b, x_1, y_1 < n$.

3. Let $c = y_1^2 - x_1^3 - b x_1 \pmod{n}$.
Let $E : y^2 = x^3 + bx + c$.
Let $P = (x_1, y_1) \in E$.

4. Compute $g = (4b^3 + 27c^2, n)$ and check that g = 1. If g = n we have 'bad reduction' so go back and get a new b. If 1 < g < n we have a non-trivial factor, so return g.

5. Let $k = \mathrm{lcm}\{1, 2, \ldots, K\}$ for some K ∈ N.

6. Compute

$$kP = \left(\frac{a_k}{d_k^2},\ \frac{b_k}{d_k^3}\right)$$

7. Calculate $d = (d_k, n)$.
If 1 < d < n, return d.
If d = 1 go to 2. and choose a new curve.
If d = n go to 5. and decrease k.

So how/why does it work? What is step 6.?

Suppose we eventually found a curve E such that for p | n, $\#E(F_p) \mid k$, then each $P \in E(F_p)$ has an order $o(P) \mid \#E(F_p) \mid k$ so o(P) | k hence kP = 0 i.e. kP is the point at ∞, 0. Then $p \mid d_k$ (see � below). Hence $p \mid (d_k, n)$ and normally $n \nmid d_k$.

How to compute kP efficiently? ((P + P) + P + · · · is too slow.) Write $k = k_0 + k_1 \cdot 2 + k_2 \cdot 2^2 + \cdots + k_r \cdot 2^r$ in binary, $k_i \in \{0, 1\}$.

$P_0 = P$
$P_1 = 2P_0 = 2P$
$P_2 = 2P_1 = 2^2 P$
...
$P_r = 2P_{r-1} = 2^r P$

$$kP = \sum_{k_i = 1} P_i$$

($2\log_2(k)$ steps).

All computations are done mod n.
$Q_1 = (x_1, y_1)$, $Q_2 = (x_2, y_2)$, $x_i, y_i \in Z_n$ (integers mod n).
$Q_3 = Q_1 + Q_2 = (x_3, y_3)$

$$x_3 = \lambda^2 - x_1 - x_2$$
$$y_3 = -\lambda x_3 - (y_1 - \lambda x_1)$$

where $\lambda = \frac{y_2 - y_1}{x_2 - x_1}$ and the division is carried out mod n. Note that in $Z/nZ = Z_n$, $x_2 - x_1$ may not have an inverse. Then

1. If $(x_2 - x_1, n) = 1$ ⇒ inverse exists.
2. If $1 < (x_2 - x_1, n) < n$ return this.
3. If $(x_2 - x_1, n) = n$ go back to 2. or 5. in Lenstra.

To double a point Q = (x, y) (mod n) we need

$$\lambda = \frac{f'(x)}{2y} = \frac{3x^2 + 2ax + b}{2y} \pmod{n}$$

(with a = 0 for the curve $y^2 = x^3 + bx + c$) and the same choices 1., 2. or 3. apply based on (2y, n).
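The addition and doubling rules, together with the "return the gcd when the inverse fails" idea, can be put together as follows. This is a sketch under stated assumptions rather than the procedure of the notes: the starting point is fixed at P = (2, 1), the curves are taken as b = 1, 2, 3, ... with c = −7 − 2b, the multiplier is built up as B! instead of lcm{1, ..., K}, and all names (`FactorFound`, `ec_add`, `ec_mul`, `ecm`, `bound`, `curves`) are illustrative.

```python
from math import gcd

class FactorFound(Exception):
    """Raised when a denominator is not invertible mod n; carries the gcd."""
    def __init__(self, d):
        self.d = d

def inv_mod(v, n):
    g = gcd(v % n, n)
    if g != 1:
        raise FactorFound(g)        # g may be a proper factor, or n itself
    return pow(v, -1, n)

def ec_add(P, Q, b, n):
    """Add points of y^2 = x^3 + b x + c mod n; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % n == 0:
        return None                 # P + (-P) = 0
    if P == Q:
        lam = (3 * x1 * x1 + b) * inv_mod(2 * y1, n) % n
    else:
        lam = (y2 - y1) * inv_mod(x2 - x1, n) % n
    x3 = (lam * lam - x1 - x2) % n
    y3 = (-lam * x3 - (y1 - lam * x1)) % n
    return x3, y3

def ec_mul(k, P, b, n):
    """Compute kP by successive doubling (binary expansion of k)."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P, b, n)
        k >>= 1
        if k:
            P = ec_add(P, P, b, n)
    return R

def ecm(n, bound=100, curves=200):
    """Return a proper factor of n, or None if no curve succeeds."""
    for b in range(1, curves + 1):
        P = (2, 1)
        c = (1 - 8 - 2 * b) % n     # forces (2, 1) onto the curve
        g = gcd(4 * b**3 + 27 * c**2, n)
        if 1 < g < n:
            return g
        if g == n:
            continue                # singular curve mod n, try the next b
        try:
            for m in range(2, bound + 1):
                P = ec_mul(m, P, b, n)      # after the loop: (bound!) * P
                if P is None:
                    break
        except FactorFound as f:
            if f.d < n:
                return f.d
    return None

print(ecm(1_715_761_513))
```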

Ex n = 1,715,761,513, $2^{n-1} \equiv 93{,}082{,}891 \pmod{n}$ ⇒ n ∉ P.

1. n is not a power: $\sqrt{n}, \sqrt[3]{n}, \ldots, \sqrt[31]{n} = 1.9855$ are not integers (Mathematica: check n == Floor[n^(1/j)]^j, for j = 2, . . . , 31). (n, 6) = 1.

2. $\sqrt{n} \approx 41{,}422$ so ∃ p | n, p < 41,422. We want k so that some integer close to p divides k. Try $k = \mathrm{lcm}\{1, 2, \ldots, 17\} = 12{,}252{,}240$ with lots of factors less than 41,422.

Choosing an elliptic curve: Choose a point P and one coefficient for E, then the other so the point is on the curve.

Ex P = (2, 1), c = −7 − 2b e.g. b = 1 ⇒ c = −9 so $E : y^2 = x^3 + x - 9$ and (2, 1) ∈ E.

Now compute kP using successive doubling

$$k = 12{,}252{,}240 = 2^4 + 2^6 + 2^{10} + 2^{12} + 2^{13} + 2^{14} + 2^{15} + 2^{17} + 2^{19} + 2^{20} + 2^{21} + 2^{23}$$

so we need $2^i P \pmod{n}$ for $0 \le i \le 23$.

So

kP ≡ (1, 225, 303, 014, 142, 796, 033)

≡ (421, 401044, 664, 333, 727)

This tells us nothing about the factors of n. It is when the addition law breaks that we get a factor. So we need a new k, a new P or a new curve.


k = 12,252,240 as before, P = (2, 1) as before. b = 2 ⇒ c = −7 − 2b = −11 so $E : y^2 = x^3 + 2x - 11$ and P ∈ E and kP (mod n) is still okay. b = 42 ⇒ c = −91 so $E : y^2 = x^3 + 42x - 91$, P ∈ E. The addition law breaks and a factor is delivered. Table (A), $2^i P \pmod{n}$, is okay; then we start adding up the points to produce (B).

At the penultimate step

$$(2^4 + 2^6 + \cdots + 2^{21})P = 3{,}863{,}632\,P \equiv (1{,}115{,}004{,}543,\ 1{,}676{,}196{,}055) \pmod{n}$$

Then from the new (A), $2^{23}P \equiv (1{,}267{,}572{,}925,\ 848{,}156{,}341) \pmod{n}$ and we try to add these points. We need the inverse mod n of the difference of their x-coordinates, but gcd(1,115,004,543 − 1,267,572,925, n) = 26,927 ≠ 1, which gives us the factorisation n = 26,927 · 63,719.

Note: � Now we see what this means. 0 is not a finite point on the curve, so kP = 0 means we are not able to compute mod p coordinates for kP. The only way that can happen is if $p \mid x_2 - x_1$, a denominator, i.e. kP = 0 means at some stage the process of building kP (using the appropriate versions of table (A) and (B) above) breaks down.

Mathematica: << NumberTheory`FactorIntegerECM` may be bad. $a^b \pmod{n}$ is PowerMod[a, b, n] and appears bad.

Method 4 Search for solutions to $x^2 \equiv y^2 \pmod{N}$ since then $x^2 - y^2 = (x - y)(x + y) = kN$ and we might get a factor of N.

13 ABC Conjecture

A simple but powerful relation between the additive and multiplicative properties of numbers.

Definition The radical

$$N(n) = \prod_{p \mid n} p$$

is the largest square-free divisor of n, or the core of n.

Ex $n = 2^2 3^5 5^3$ ⇒ N(n) = 2 · 3 · 5
$N(p^\alpha) = p$ ∀p ∈ P

Proposition N is multiplicative and N(n) · N(m) = N(nm) · N((n, m)).
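The radical and the identity in the Proposition are easy to check by machine for small arguments. A minimal sketch, using plain trial division (the function name `radical` is illustrative):

```python
from math import gcd

def radical(n):
    """Return N(n), the product of the distinct primes dividing n."""
    n = abs(n)
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            result *= p
            while n % p == 0:   # strip every power of p
                n //= p
        p += 1
    if n > 1:                   # leftover prime factor larger than sqrt
        result *= n
    return result

print(radical(2**2 * 3**5 * 5**3))   # the worked example
print(all(radical(a) * radical(b) == radical(a * b) * radical(gcd(a, b))
          for a in range(1, 60) for b in range(1, 60)))
```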

ABC Conjecture: ∀ε > 0 ∃ $K_\varepsilon > 0$ such that if a, b, c are relatively prime integers and a + b = c then

$$\max(|a|, |b|, |c|) \le K_\varepsilon\, N(abc)^{1+\varepsilon}$$

This is an important unsolved problem. It is so deep it implies the asymptotic Fermat's Last Theorem.

Ex $a = 3^k$, $b = 2 \cdot 3^k$, $c = 3^{k+1}$ so a + b = c and $\max(|a|, |b|, |c|) = c = 3^{k+1}$. Here $abc = 2 \cdot 3^{3k+1}$ so N(abc) = 2 · 3, and "$3^{k+1} \le K_\varepsilon (2 \cdot 3)^{1+\varepsilon}$ ∀k with some fixed $K_\varepsilon$" is false for every ε > 0, i.e. (a, b, c) = 1 is essential.

Theorem (Asymptotic Fermat Theorem) ABC ⇒ ∃ $n_0 \in N$ so that the Fermat equation $x^n + y^n = z^n$ has no solution in relatively prime positive integers ∀ $n \ge n_0$.

Proof. Let x, y, z be relatively prime (i.e. each has different prime factors). Note that $N(x^n y^n z^n) = N(xyz) \le xyz \le z^3$ since x < z, y < z. Apply ABC with ε = 1 so $z^n = \max(x^n, y^n, z^n) \le K_1 N(x^n y^n z^n)^2 \le K_1 z^6$ so $n\log(z) \le \log(K_1) + 6\log(z)$

$$\Rightarrow\ n \le 6 + \frac{\log(K_1)}{\log(z)} \le 6 + \frac{\log(K_1)}{\log(3)}$$

so let

$$n_0 = 7 + \frac{\log(K_1)}{\log(3)}.$$

Conjecture (Catalan Conjecture) 8 and 9 are the only consecutive powers i.e. the only solution to Catalan's equation

$$x^m - y^n = 1 \qquad (\ast)$$

in positive integers x, y, n, m > 1 is $3^2 - 2^3 = 1$.

Many special cases of this conjecture have been proved: e.g.

1. $x^2 - y^n = 1$ has one solution x = n = 3, y = 2.

2. $x^m - y^2 = 1$ has no solutions.

So we need only consider n, m ≥ 3.

Theorem (Asymptotic Catalan Conjecture) ABC ⇒ the Catalan equation has only a finite number of solutions.

Proof. Let (x, y, m, n) be a solution with m, n ≥ 3. Then (x, y) = 1 since otherwise p | x, p | y ⇒ p | 1. The ABC Conjecture with $\varepsilon = \frac{1}{4}$ ⇒ ∃ $K_{1/4} = K$ such that $\max(|a|, |b|, |c|) \le K N(abc)^{5/4}$. But $(\ast)$ ⇒ $x^m = 1 + y^n$ so

$$y^n < x^m \le K N(1 \cdot x^m \cdot y^n)^{5/4} = K N(xy)^{5/4} \le K (xy)^{5/4}$$

⇒ $m\log(x) \le \log(K) + \frac{5}{4}(\log(x) + \log(y))$

⇒ $n\log(y) < \log(K) + \frac{5}{4}(\log(x) + \log(y))$

⇒ $m\log(x) + n\log(y) < 2\log(K) + \frac{5}{2}(\log(x) + \log(y))$

⇒ $\left(m - \frac{5}{2}\right)\log(x) + \left(n - \frac{5}{2}\right)\log(y) < 2\log(K)$

But 2 ≤ x, 2 ≤ y so

$$\left(m - \tfrac{5}{2}\right)\log(2) + \left(n - \tfrac{5}{2}\right)\log(2) \le \left(m - \tfrac{5}{2}\right)\log(x) + \left(n - \tfrac{5}{2}\right)\log(y) < 2\log(K)$$

$$\Rightarrow\ m + n < \frac{2\log(K)}{\log(2)} + 5$$

Thus there are only finitely many exponents m and n for which $(\ast)$ has a solution. It is known that, for fixed m, n, $(\ast)$ has only a finite set of solutions x, y. Hence $(\ast)$ has only a finite set of solutions x, y, m, n. □

Wieferich Primes

If p ∈ P is odd, $2^{p-1} \equiv 1 \pmod{p}$.
If p is such that $2^{p-1} \equiv 1 \pmod{p^2}$, p is called a Wieferich prime and we write p ∈ W ⊂ P.

Ex $9 \nmid 2^2 - 1$ so 3 is not one.
$25 \nmid 2^4 - 1$ so 5 is not one.
$49 \nmid 2^6 - 1$ so 7 is not one.
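The defining congruence is cheap to test with modular exponentiation, so a small search is easy. A minimal sketch (helper names are illustrative; the search bound 4000 is an arbitrary choice):

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def is_wieferich(p):
    """True if the odd prime p satisfies 2^(p-1) = 1 (mod p^2)."""
    return pow(2, p - 1, p * p) == 1

found = [p for p in range(3, 4000) if is_prime(p) and is_wieferich(p)]
print(found)
```

Below 4000 the search returns exactly two primes, 1093 and 3511.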

Problem: Are there an infinite number of Wieferich (or non-Wieferich) primes?

Lemma Let p ∈ P be odd. If ∃ n ∈ N so $2^n \equiv 1 \pmod{p}$ but $2^n \not\equiv 1 \pmod{p^2}$ then p ∉ W.


Proof. Let $2^d \equiv 1 \pmod{p}$ with d minimal (d > 0). Then d | n (else n = ed + r, 0 < r < d and $1 \equiv 2^n = 2^{ed} \cdot 2^r = (2^d)^e \cdot 2^r \equiv 2^r \pmod{p}$, contradicting d being minimal).

Now $2^n \not\equiv 1 \pmod{p^2}$ ⇒ $2^d \not\equiv 1 \pmod{p^2}$ (else, writing n = de, $(2^d)^e \equiv 1^e$ ⇒ $2^n \equiv 1 \pmod{p^2}$). Now $2^d \equiv 1 \pmod{p}$ ⇒ $2^d = 1 + kp$ and (k, p) = 1 (else $2^d = 1 + k'p^2$ and $2^d \equiv 1 \pmod{p^2}$).
Also $2^{p-1} \equiv 1 \pmod{p}$ ⇒ d | p − 1 ⇒ p − 1 = de for some e with 1 ≤ e ≤ p − 1. Then (ek, p) = 1 and $2^{p-1} = (2^d)^e = (1 + kp)^e \equiv 1 + ekp \not\equiv 1 \pmod{p^2}$ so p ∉ W. □

Definition A powerful number is a v ∈ N such that p | v ⇒ $p^2 \mid v$. For example $72 = 2 \cdot 6^2 = 2^3 \cdot 3^2$ is powerful but $192 = 2 \cdot 96 = 2^2 \cdot 48 = 2^2 \cdot 4^2 \cdot 3 = 2^6 \cdot 3$ is not.


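Theorem ABC ⇒ $|W^c| = \infty$.

Proof. ∀n ∈ N let $2^n - 1 = u_n v_n$ ⊗ where $v_n$ is the maximal powerful divisor of $2^n - 1$, so $u_n$ is the product of just those primes which appear to power 1, and is square free.

If $p \mid u_n$ then $2^n \equiv 1 \pmod{p}$ by ⊗ but $2^n \not\equiv 1 \pmod{p^2}$ since 1 is the power of p appearing in $2^n - 1$. Hence $p \in W^c$ by the Lemma, so all the prime divisors of $u_n$ are in $W^c$. If $|W^c| < \infty$, ∃ only finitely many square-free integers with prime divisors all in $W^c$ ⇒ $\#\{u_n : n = 1, 2, \ldots\} < \infty$ ⇒ $\#\{v_n : n = 1, 2, \ldots\} = \infty$ and so $\{v_n\}$ is unbounded in size.
Since $v_n$ is powerful, $N(v_n) \le \sqrt{v_n}$. Let 0 < ε < 1 in ABC and consider $(2^n - 1) + 1 = a + b = c = 2^n$.

$$v_n \le u_n v_n = 2^n - 1 < 2^n = \max(|a|, |b|, |c|) \le K_\varepsilon N(2^n(2^n - 1) \cdot 1)^{1+\varepsilon} = K_\varepsilon N(2 u_n v_n)^{1+\varepsilon} \le K_\varepsilon (2u_n)^{1+\varepsilon} N(v_n)^{1+\varepsilon} \le K'_\varepsilon\, v_n^{(1+\varepsilon)/2}$$

so $v_n < K'_\varepsilon\, v_n^{(1+\varepsilon)/2}$ ⇒ $v_n^{1 - \frac{1}{2} - \frac{\varepsilon}{2}} < K'_\varepsilon$ ⇒ $v_n < (K'_\varepsilon)^{\frac{1}{1/2 - \varepsilon/2}} = B_\varepsilon$, contradicting the unbounded nature of the $\{v_n\}$. □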

Theorem (LeVeque, 1952) If a, b ≥ 2 are given, the equation $a^x - b^y = 1$ has at most one solution in positive integers x, y unless a = 3, b = 2 where there are two solutions: (x, y) = (1, 1) and (x, y) = (2, 3).

Proof. Assume that (u, v) and (x, y) are solutions with u < x. So $a^u - b^v = a^x - b^y = 1$. Then v < y since $0 < a^x - a^u = b^y - b^v$ and $a^u(a^{x-u} - 1) = b^v(b^{y-v} - 1)$. Now $a^u - b^v = 1$ ⇒ (a, b) = 1 ⇒ $b^v = a^u - 1 = a^{x-u} - 1$ and $a^u = b^v + 1 = b^{y-v} - 1$.

Hence $a^u = a^{x-u}$ so u = x − u ⇒ 2u = x and also $b^{y-v} - b^v = 2$ so y − v > v and $b^v(b^{y-2v} - 1) = 2$. Hence v = 1, b = 2, $b^{y-2v} - 1 = 1$ so y − 2v = 1 ⇒ y = 1 + 2v = 3. Thus $a^u = 1 + b^v = 3$ so u = 1 and a = 3. □


14 Formulas for Primes

A function is easy: Let

$$f(n) := \max\{p \in P : p \mid n\}$$

indeed

$$f(n) = \lim_{r\to\infty}\ \lim_{s\to\infty}\ \lim_{t\to\infty}\ \sum_{u=0}^{s}\left(1 - \cos^{2t}\left[\frac{(u!)^r \pi}{n}\right]\right)$$

Both are impractical.

A formula for $p_n$, the nth prime, would be nice, but probably impossible to find, the elements of P being scattered in such an irregular manner.

Easier aim: find a formula which produces only primes. Will show below that no polynomial will work but

$$f(n) = \left\lfloor \theta^{3^n} \right\rfloor$$

does work for some θ ∈ R.

Ex f(n) = an + b. Let f(n) = p and f(m) = q, p, q ∈ P, p ≠ q. Then (a, b) = 1 since d = (a, b) ⇒ d | a and d | b so d | an + b = p. Similarly, d | q, but p ≠ q so (p, q) = 1 ⇒ d = 1. So if f has more than one prime value, (a, b) = 1.

Tables of primes reveal arithmetic progressions of various lengths:


1. 3, 5, 7.

2. 7, 37, 67, 97, 127, 157.

3. 199, 409, 619, 829, 1039, 1249, 1459, 1669, 1879, 2089.

Proposition No arithmetic progression of N, of infinite length, can yield only primes.

Proof. Let an + b = p ∈ P, n, p fixed, and $n_k = n + kp$, k = 0, 1, 2, . . .. Then the $n_k$th term of the progression is

$$a n_k + b = a(n + kp) + b = an + b + akp = p(1 + ak)$$

so $p \mid a n_k + b$ ∀k ≥ 1. Thus, since the $n_k$ come at intervals of p terms, every pth term of the original progression is divisible by p. Hence the progression contains infinitely many composite numbers. □

Note: Dirichlet's Theorem (see back) says {an + b : n ∈ N} contains an infinite number of primes if (a, b) = 1.

Proposition If p ∤ a then every pth term of {an + b}, starting somewhere, is divisible by p.

Proof. p ∤ a ⇒ (p, a) = 1 ⇒ ∃ r, s so pr + as = 1. Let $n_k = kp - bs$, k = 1, 2, 3, . . .. Then

$$f(n_k) = a n_k + b = a(kp - bs) + b = akp - abs + b = akp - b(1 - pr) + b = p(ak + br).$$

Thus $p \mid a n_k + b$. Since $n_{k+1} - n_k = p$, the terms $a n_k + b$ occur p terms apart. □

Ex 2 ∤ a ⇒ every second term is divisible by 2. So {an + b} cannot have more than 1 consecutive prime values.

Ex {30,030n − 6887 : n = 1, 2, 3, . . .} has 12 consecutive terms which are prime. 30,030 = 2 · 3 · 5 · 7 · 11 · 13. This is a curiosity: Linear formulas fail.

Quadratic Formulas

f(n) = an2 + bn+ c

Ex $f(n) = n^2 + 21n + 1$ is not composite for n = −38, −37, . . . , 0, 1, 2, . . . , 17: 56 values. f(0) = 1 ∉ P of course. However, f(18) = 703 = 37 · 19.

Ex 19 | f(n) if $n \equiv -1 \pmod{19}$: since n = −1 + 19ℓ ⇒

$$f(n) = (-1 + 19\ell)^2 + 21(-1 + 19\ell) + 1 = 19(-1 + 19\ell + 19\ell^2).$$

Ex $f(n) = n^2 + n + 41 \in P$ for n ∈ {−40, −39, . . . , 39} i.e. for 80 consecutive values.
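The claim about $n^2 + n + 41$ can be confirmed directly. A minimal sketch with a trial-division primality test (helper names are illustrative):

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def euler(n):
    return n * n + n + 41

values = [euler(n) for n in range(-40, 40)]   # n = -40, ..., 39
print(len(values), all(is_prime(v) for v in values))
print(euler(40), is_prime(euler(40)))         # 1681 = 41 * 41
```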

Conjecture 80 is the best possible for any quadratic.

Known (1967): No f(n) = n2 + n+ A (A > 41) gives primes for n = 0, 1, . . . , A− 2.

Proposition No quadratic can always be prime.

Proof. $f(n) = an^2 + bn + c = p \in P$ ⇒ $f(n_k) \equiv an^2 + bn + c \equiv 0 \pmod{p}$, where $n_k = n + kp$, so every pth term of {f(n)} is divisible by p. □

Does $\{an^2 + bn + c\}$ contain an infinite number of prime values? Unknown.
Does $\{n^2 + 1\}$ have an infinite number of prime values? Also unknown.

If f ∈ Z[x] is a polynomial and f(n) = p ∈ P, then p | f(n + kp) for k = 0, 1, 2, . . ., so no such f has an infinite set of consecutive prime values.

Ex Given d ∈ N, ∃ a polynomial f ∈ Q[x] of degree at most d, taking on d + 1 arbitrarily assigned values, which could be prime:

$$60 f(x) = 7x^5 - 85x^4 + 355x^3 - 575x^2 + 418x + 180$$

has

n      0   1   2   3    4    5
f(n)   3   5   7   11   13   17

Method: Use Lagrange interpolation with $x_i = \{0, 1, \ldots, 5\}$ and $y_i = \{3, 5, 7, \ldots, 17\}$ and

$$f(x) = \sum_{i=1}^{6}\left(\prod_{j=1,\ j \ne i}^{6} \frac{x - x_j}{x_i - x_j}\right) y_i$$

so $f(x_i) = y_i$, 1 ≤ i ≤ 6.
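The interpolation formula can be evaluated exactly with rational arithmetic. A minimal sketch (the function name `lagrange` is illustrative), checked against the quintic given above:

```python
from fractions import Fraction

def lagrange(xs, ys):
    """Return the interpolating polynomial through (xs[i], ys[i]) as a function."""
    xs, ys = list(xs), list(ys)
    def f(x):
        total = Fraction(0)
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = Fraction(yi)
            for j, xj in enumerate(xs):
                if j != i:
                    term *= Fraction(x - xj, xi - xj)   # factor (x - x_j)/(x_i - x_j)
            total += term
        return total
    return f

f = lagrange(range(6), [3, 5, 7, 11, 13, 17])
print([int(f(n)) for n in range(6)])

def quintic(x):
    """60 f(x) as stated in the example."""
    return 7*x**5 - 85*x**4 + 355*x**3 - 575*x**2 + 418*x + 180

print(all(60 * f(x) == quintic(x) for x in range(-5, 12)))
```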

By these examples, we see the functions must be more complex than linear or polynomial, if they are to have all prime values.

Theorem There is a number θ ∈ R such that

$$f(n) = \left\lfloor \theta^{3^n} \right\rfloor$$

is prime for all n ∈ N.

Note: This formula is not effective, since to know θ exactly, we would need to be able to recognise arbitrarily large primes (θ ≈ 1.3064 . . .).


Lemma If $u_1 \le u_2 \le \cdots \le u_n \le \cdots \le B$ is a bounded increasing real sequence, then

$$\lim_{n\to\infty} u_n = \theta$$

exists.

Lemma If $A \le \cdots \le v_n \le \cdots \le v_2 \le v_1$ is a decreasing real sequence which is bounded below, then

$$\lim_{n\to\infty} v_n = \alpha$$

also exists.

Assumption: ∃A ∈ N such that if n > A, ∃p ∈ P with $n^3 < p < (n + 1)^3 - 1$.

"The proof of this assumption is very difficult" (Elementary Number Theory by Underwood Dudley). Indeed, but easier than ∃p so $n^2 < p < (n + 1)^2$, I would say.

Proof of the Theorem. Let $p_1$ be any prime with $A < p_1$ and for n = 1, 2, 3, . . . let $p_{n+1}$ be a prime with

$$p_n^3 < p_{n+1} < (p_n + 1)^3 - 1$$

Let $u_n = p_n^{3^{-n}}$ and $v_n = (p_n + 1)^{3^{-n}}$, n = 1, 2, . . .. Then $u_{n+1} = p_{n+1}^{3^{-n-1}} > (p_n^3)^{3^{-n-1}} = p_n^{3^{-n}} = u_n$. So $\{u_n\}$ is increasing. Also $\{v_n\}$ is decreasing since $v_{n+1} = (p_{n+1} + 1)^{3^{-n-1}} < ((p_n + 1)^3 - 1 + 1)^{3^{-n-1}} = (p_n + 1)^{3^{-n}} = v_n$.
From their definitions above, $u_n < v_n$ ∀n ∈ N. Hence, by the two Lemmas above

$$\lim_{n\to\infty} u_n = \theta \quad\text{and}\quad \lim_{n\to\infty} v_n = \alpha$$

Since $u_n < v_n$ we have θ ≤ α, indeed $u_n < \theta \le \alpha < v_n$ because $\{u_n\}$ is strictly increasing and $\{v_n\}$ strictly decreasing.
Therefore $u_n^{3^n} < \theta^{3^n} \le \alpha^{3^n} < v_n^{3^n}$ ∀n ∈ N. But, from their definitions, $u_n^{3^n} = p_n$ and $v_n^{3^n} = p_n + 1$ so $p_n < \theta^{3^n} < p_n + 1$ ⇒ $p_n = \lfloor \theta^{3^n} \rfloor$ ∀n ∈ N so $\lfloor \theta^{3^n} \rfloor$ is prime. □

This Theorem would be valuable if we could work out the value of θ without reference to primes.
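The construction in the proof can be run for a few steps. The sketch below makes two assumptions not in the proof: it starts from $p_1 = 2$ (rather than a prime beyond the unspecified bound A) and always takes the least prime above $p_n^3$; the helper names are illustrative.

```python
def is_prime(n):
    """Trial-division primality test; slow but fine up to about 10**10."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def prime_chain(count, start=2):
    """Build p_1, p_2, ... with p_n^3 < p_(n+1) < (p_n + 1)^3 - 1."""
    chain = [start]
    while len(chain) < count:
        p = chain[-1]
        q = p**3 + 1
        while not is_prime(q):
            q += 1
        assert q < (p + 1)**3 - 1   # the Assumption, checked for this step
        chain.append(q)
    return chain

chain = prime_chain(4)
print(chain)
# u_n = p_n^(3^-n) increases towards theta; three terms already give 1.306...
print([p ** (3.0 ** -(i + 1)) for i, p in enumerate(chain)])
```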

Ex

$$f(n) = \sin\left(\frac{\pi\,(1 + (n - 1)!)}{n}\right)$$

then f(n) = 0 ⇔ n ∈ P.

Another Catalan Conjecture (1876) Let $p_0 = 2 \in P$ and $p_{n+1} = 2^{p_n} - 1$ for n = 0, 1, 2, . . . then

$p_1 = 2^2 - 1 = 3 \in P$
$p_2 = 2^3 - 1 = 7 \in P$
$p_3 = 2^7 - 1 = 127 \in P$

and $p_4 \in P$. Is $p_j \in P$ ∀j? $\{p_j\}$ increases very rapidly.

Theorem (Matijasevic, 1971) There exists a multinomial p (p ∈ Z[a, b, c, . . . , z] of degree 25 (Jones, Sato, Wada, 1975)) such that the set of prime numbers coincides with the set of positive values assumed by this multinomial, as the variables range in the set of non-negative integers Z+ = N ∪ {0}.

Proposition (Dixon) If p is the multinomial, $r = 2 + \frac{1}{2}(p - 2 + |p - 2|)$ is a function (not a multinomial) with range exactly the set of primes.

Proof. p(x) ≤ 0 ⇒ p − 2 < 0 ⇒ |p − 2| = 2 − p so $r = 2 + \frac{1}{2} \cdot 0 = 2 \in P$, and p(x) > 0 ⇒ p(x) ≥ 2 ⇒ $r = 2 + \frac{1}{2}(p - 2 + p - 2) = 2 + p - 2 = p$ so r(x) = p(x) ∈ P.

(Jones, 1979) $F_0 = 1$, $F_1 = 1$ and $F_{n+1} = F_n + F_{n-1}$, n ≥ 1, the Fibonacci numbers, are the set of positive values at non-negative integers of

$$p(x, y) = 2xy^4 + x^2y^3 - 2x^3y^2 - y^5 - x^4y + 2y$$

Hilbert’s 10th Problem: There is no algorithm which is good enough to decide whetherany given diophantine equation has a solution in positive integers. (Matijasevic).

(Siegel, 1972) Every quadratic diophantine equation is decidable.

Unknown: Is every multinomial in 2 variables decidable?

These results and questions relate to the axiomatic and logical foundation of arithmetic. e.g. are some problems simply impossible to solve because we do not have an appropriate set of properties of numbers to begin with?

Resistant problems:


1. Twin primes conjecture: ∃ an infinite set of pn ∈ P so that pn + 2 ∈ P also.

2. There exist infinitely many Sophie Germain primes i.e. pn ∈ P and 2pn + 1 ∈ P.

3. $M_p = 2^p - 1$ is a Mersenne number. Are infinitely many composite? We believe so.

A simple new “axiom”/“conjecture” will resolve each of these questions.


15 Axiom D

(Dirichlet) (a, b) = 1, a ≠ 0, b ≥ 1, f(x) = bx + a then ∃ an infinite number of integers m > 0 with f(m) ∈ P.

Conjecture/Axiom (Dickson, 1904) Let s ≥ 1 and $f_j(x) = b_j x + a_j$ with $b_j \ge 1$ and $a_j, b_j \in Z$. If ∄ n > 1 with

$$n \mid f_1(k) f_2(k) \cdots f_s(k) \quad \forall k \in Z$$

OR (equivalently)

$$\forall n > 1,\ \exists k \in Z \text{ so } n \nmid f_1(k) f_2(k) \cdots f_s(k) \qquad �$$

Then there exist infinitely many m ∈ N with $\{f_1(m), \ldots, f_s(m)\}$ all primes.

This is Axiom D, the weakest form of a more general Axiom H where the linear polynomials $f_j$ are replaced with polynomials of arbitrary degree.

Proposition Axiom D ⇔ (� ⇒ ∃m ∈ N so $\{f_1(m), \ldots, f_s(m)\}$ are primes).

Proof. (⇒) Follows directly.
(⇐) ∃ $m_1 \ge 1$ so $f_1(m_1), \ldots, f_s(m_1)$ are primes. Let $g_j(x) = f_j(x + 1 + m_1)$ for 1 ≤ j ≤ s. Then � is satisfied by the $\{g_j\}$, hence ∃ $k_1 \ge 1$ so $g_1(k_1), \ldots, g_s(k_1)$ are primes. Let $m_2 = k_1 + 1 + m_1 > m_1 + 1$ so $f_1(m_2), \ldots, f_s(m_2)$ are primes. Repetition of this procedure generates infinitely many $m_j \in N$. □


Theorem Axiom D ⇒ ∀m ≥ 1 there exist infinitely many arithmetic progressions consisting of m Sophie Germain primes.

Proof. Let d = (2m + 2)! ≥ 4! (even). Consider the 2m polynomials:

$f_1(x) = x + d$
$f_2(x) = x + 2d$
...
$f_m(x) = x + md$
$f_{m+1}(x) = 2x + 2d + 1$
$f_{m+2}(x) = 2x + 4d + 1$
...
$f_{2m}(x) = 2x + 2md + 1$

so $f_{m+j}(x) = 2f_j(x) + 1$, 1 ≤ j ≤ m. These polynomials satisfy �: Let $f(x) = \prod_{j=1}^{2m} f_j(x)$ with degree 2m and leading coefficient $2^m$. Let p ∈ P divide f(k) for k = −1, 0, 1, . . . , p − 2 (1). Now $f(-1) = \prod_{j=1}^{2m}(\text{odd}) \equiv 1 \pmod{2}$ ⇒ p ≠ 2 so $f(x) \equiv 0 \pmod{p}$ has 2m roots (in a field extension of $F_p$), but it has p roots from (1). Hence p ≤ 2m and p | d = (2m + 2)!. But we CLAIM $f(-1) \equiv 1 \pmod{3}$:

$$f(-1) = (d - 1)(2d - 1) \cdots (2d - 1)(4d - 1) \cdots \equiv (-1)^{2m} \equiv 1 \pmod{3}.$$

Hence p ≠ 3. But

$$f(1) = \underbrace{(1 + d)(1 + 2d) \cdots}_{m \text{ factors}}\ \underbrace{(3 + 2d)(3 + 4d) \cdots}_{m \text{ factors}} \equiv 3^m \pmod{p}$$

since p | d. Hence p ∤ f(1) which is a contradiction. Hence the $\{f_j\}$ satisfy �. By Axiom D there exist infinitely many k so $f_i(k) = p_i$ and $f_{m+i}(k) = 2p_i + 1$ are primes for i = 1, . . . , m. Moreover $p_1 < p_2 < \cdots < p_m$ are in progression with difference d. □

Corollary Axiom D ⇒ there exist infinitely many Sophie Germain primes.

Proposition Let a, b, c be pairwise relatively prime non-zero integers (i.e. each consists of products of different primes). There exist infinitely many pairs of primes (p, q) so ap − bq = c assuming Axiom D.

Proposition (Schinzel and Sierpinski, 1958) Axiom D ⇒ there exist infinitely many n with $\frac{1}{2}\varphi(n) \in P$ where φ is Euler's phi function.

Proposition There exist infinitely many triples of consecutive integers, each being the product of two distinct primes, assuming Axiom D.

Proof.
$f_1(x) = 10x + 1$
$f_2(x) = 15x + 2$
$f_3(x) = 6x + 1$

$f_1(0) f_2(0) f_3(0) = 2$
$f_1(1) f_2(1) f_3(1) = 11 \cdot 17 \cdot 7$

⇒ � is satisfied. Thus there exist infinitely many integers m ≥ 1 such that:

p = 10m + 1
q = 15m + 2
r = 6m + 1

are primes.

Then
3p = 30m + 3 = 3p
3p + 1 = 30m + 4 = 2q
3p + 2 = 30m + 5 = 5r

so {3p, 3p + 1, 3p + 2} are products of two distinct primes. □
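The construction in the proof can be searched directly for small m. A minimal sketch (helper names and the search bound are illustrative):

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def prime_factors(n):
    """Prime factors of n with multiplicity, in increasing order."""
    out, p = [], 2
    while p * p <= n:
        while n % p == 0:
            out.append(p)
            n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out

def triples(limit):
    """Triples (3p, 3p+1, 3p+2) with p = 10m+1, q = 15m+2, r = 6m+1 all prime."""
    found = []
    for m in range(1, limit):
        p, q, r = 10 * m + 1, 15 * m + 2, 6 * m + 1
        if is_prime(p) and is_prime(q) and is_prime(r):
            found.append((3 * p, 3 * p + 1, 3 * p + 2))
    return found

result = triples(300)
print(result[:3])
print([prime_factors(v) for v in result[0]])
```

The first hit is m = 1, giving 33 = 3 · 11, 34 = 2 · 17, 35 = 5 · 7.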

Theorem Axiom D ⇒ there exist infinitely many composite Mersenne numbers ($M_p = 2^p - 1$).

Proof. Let

$f_1(x) = 4x - 1$
$f_2(x) = 8x - 1$

so that $f_1(0) f_2(0) = 1$ ⇒ �. Hence there exist infinitely many m ≥ 1 such that

p = 4m − 1
q = 8m − 1

are primes. But then q = 2p + 1 and $p \equiv 3 \pmod{4}$.

Claim $q \mid 2^p - 1$: Consider the Legendre symbol $(2 \mid q) = (-1)^{\frac{q^2 - 1}{8}}$. Since q = 8m − 1, $q \equiv -1 \pmod{8}$ ⇒ $\frac{q^2 - 1}{8} = 8m^2 - 2m$ is even. So $(2 \mid q) = 1 \equiv 2^{\frac{q-1}{2}} = 2^p \pmod{q}$ by Euler's criterion. Hence $q \mid 2^p - 1$.

Now, if m > 1 the corresponding primes p, q satisfy $2^p - 1 = 2^{4m-1} - 1 > 8m - 1 = q$ ⇔ $16^m - 2 > 16m - 2$ ⇔ $16^m > 16m$ which is true. So q | $2^p - 1$ is a proper divisor and $M_p = 2^p - 1$ is composite. □
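The claim $q \mid 2^p - 1$ can be checked for every pair p = 4m − 1, q = 8m − 1 of primes in a small range. A minimal sketch (helper names and the range are illustrative):

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

# Pairs (p, q) = (4m - 1, 8m - 1) with both prime, for m > 1.
pairs = [(4 * m - 1, 8 * m - 1) for m in range(2, 200)
         if is_prime(4 * m - 1) and is_prime(8 * m - 1)]

print(pairs[:5])
# q divides 2^p - 1 exactly when 2^p = 1 (mod q).
print(all(pow(2, p, q) == 1 for p, q in pairs))
```

The first pair is (11, 23), matching $2^{11} - 1 = 2047 = 23 \cdot 89$.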

Theorem Let $a_1 < a_2 < \cdots < a_s$ be non-zero integers and assume $f_1(x) = x + a_1, \ldots, f_s(x) = x + a_s$ satisfy �. Then there exist infinitely many integers m ≥ 1 so $\{m + a_1, m + a_2, \ldots, m + a_s\}$ are consecutive primes.

Theorem Axiom D ⇒ ∀k ∈ N there exist infinitely many pairs of consecutive primes with difference 2k. In particular, there exist infinitely many pairs of twin primes.

Proof. Let

$f_1(x) = x + 1$
$f_2(x) = x + 2k + 1$

then

$f_1(0) f_2(0) = 1 + 2k = a$
$f_1(1) f_2(1) = 2(2 + 2k) = b$

and

$$(a, b) = (1 + 2k,\ 4(1 + k)) = (1 + 2k,\ 1 + k) = 1$$

since 2(1 + k) − (1 + 2k) = 1. Hence $\{f_1, f_2\}$ satisfy �.

By the previous Theorem, since 1 < 2k + 1, there exist infinitely many integers m ≥ 1 so $f_1(m), f_2(m)$ are consecutive primes, i.e. p = m + 1 and q = m + 2k + 1 = p + 2k are consecutive primes, so {p, p + 2k} are twins with the given property. □

Axiom D has a number of other consequences e.g. on the existence of primes in arithmetic progressions.

Let 1 < n and d a multiple of $\prod_{p \le n} p$. Then there exist infinitely many arithmetic progressions with difference d, each consisting of n consecutive primes.

Proving Axiom D: First try s = 2, generalising s = 1. Showing Axiom D is independent: very difficult and unlikely.


16 Partitions

Generating functions arise because if n, m ∈ Z

addition of integers, n + m  ↔  multiplication of polynomials/series, $z^n \cdot z^m$

Ex (Lagrange) ∀n ∃ $x_i$, 1 ≤ i ≤ 4, $n = x_1^2 + x_2^2 + x_3^2 + x_4^2$ is equivalent to: if $(1 + z^{1^2} + z^{2^2} + \cdots + z^{n^2} + \cdots)^4 = f(z) = a_0 + a_1 z + a_2 z^2 + \cdots$ then $a_i > 0$ ∀i = 0, 1, 2, . . ..

Change Making How many ways can we make change for N ∈ N if the coins are of denomination 1, 2 and 3 i.e. given N how many different solutions are there to

$$N = 1x + 2y + 3z$$

in x, y, z ≥ 0 all integers?

Let |z| < 1 and write, using the sum to ∞ of a geometric series:

$$\frac{1}{1 - z} = 1 + z + z^2 + z^3 + \cdots = 1 + z^1 + z^{1+1} + z^{1+1+1} + \cdots$$
$$\frac{1}{1 - z^2} = 1 + z^2 + z^{2+2} + z^{2+2+2} + \cdots$$
$$\frac{1}{1 - z^3} = 1 + z^3 + z^{3+3} + z^{3+3+3} + \cdots$$

so

$$\frac{1}{(1 - z)(1 - z^2)(1 - z^3)} = (1 + z^1 + z^{1+1} + \cdots)(1 + z^2 + z^{2+2} + \cdots)(1 + z^3 + z^{3+3} + \cdots) = \sum_{n=0}^{\infty} c(n) z^n, \qquad c(0) = 1.$$

What happens when we multiply out the RHS? We get terms like $z^{1+1+1+1} \cdot z^2 \cdot z^{3+3} = z^{12}$ but this is z to the power four '1's + one '2' + two '3's i.e. a method of changing 12 into '1's, '2's and '3's. Every way of changing 12 will appear so c(12) is exactly the number of ways 12 can be 'changed'. Similarly, c(n) is the number of ways n can be changed:

$$\sum_{n=0}^{\infty} c(n) z^n = \frac{1}{(1 - z)(1 - z^2)(1 - z^3)}$$

so our number theory counting problem has been transformed into an analytic problem, i.e. finding coefficients of a Taylor series. Don't do the series multiplication to get the 'c(n)'s. Use partial fractions (check by simplifying the RHS):

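$$\frac{1}{(1 - z)(1 - z^2)(1 - z^3)} = \frac{1}{6}\,\frac{1}{(1 - z)^3} + \frac{1}{4}\,\frac{1}{(1 - z)^2} + \frac{1}{4}\,\frac{1}{(1 - z^2)} + \frac{1}{3}\,\frac{1}{(1 - z^3)}$$

Then

$$\frac{d}{dz}\left(\frac{1}{1 - z}\right) = \frac{1}{(1 - z)^2} = \frac{d}{dz}\left(\sum_{n=0}^{\infty} z^n\right) = \sum_{n=0}^{\infty} (n + 1) z^n$$

and

$$\frac{d}{dz}\left(\frac{1}{2(1 - z)^2}\right) = \frac{1}{(1 - z)^3} = \frac{d}{dz}\left(\sum_{n=0}^{\infty} \frac{n + 1}{2} z^n\right) = \sum_{n=0}^{\infty} \frac{(n + 1)(n + 2)}{2} z^n$$

$$\Rightarrow\ c(n) = \frac{1}{6} \cdot \frac{(n + 1)(n + 2)}{2} + \frac{1}{4}(n + 1) + \left\{\tfrac{1}{4} \text{ if } n \text{ even}\right\} + \left\{\tfrac{1}{3} \text{ if } 3 \mid n\right\} = \left\lfloor \frac{n^2}{12} + \frac{n}{2} + 1 \right\rfloor \quad \text{(see below)}$$

If 2 | n and 3 | n get

$$c(n) = \frac{n^2}{12} + \left(\frac{3}{12} + \frac{1}{4}\right) n + \frac{2}{12} + \frac{1}{4} + \frac{1}{4} + \frac{1}{3} = \frac{n^2}{12} + \frac{n}{2} + 1$$

But c(n) ∈ N so

$$c(n) = \left\lfloor \frac{n^2}{12} + \frac{n}{2} + 1 \right\rfloor$$

If 2 | n and 3 ∤ n

$$c(n) = \frac{n^2}{12} + \frac{n}{2} + 1 - \frac{1}{3} = \left\lfloor \frac{n^2}{12} + \frac{n}{2} + 1 \right\rfloor$$

since c(n) ∈ N. Similarly, if 2 ∤ n and 3 | n:

$$c(n) = \frac{n^2}{12} + \frac{n}{2} + 1 - \frac{1}{4} = \left\lfloor \frac{n^2}{12} + \frac{n}{2} + 1 \right\rfloor$$

and if 2 ∤ n and 3 ∤ n the correction is $-\frac{1}{4} - \frac{1}{3} = -\frac{7}{12}$, which again leaves the floor unchanged.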
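The closed form can be compared with a direct count. The sketch below multiplies out the three geometric series one coin at a time (the usual coin-change recurrence) and checks the floor formula in exact integer arithmetic; function names are illustrative.

```python
def change_count(n, coins=(1, 2, 3)):
    """Number of ways to write n as a sum of coins, order ignored.

    Equivalent to reading off the coefficient of z^n in the product of
    the series 1/(1 - z^coin), built up one coin at a time.
    """
    ways = [1] + [0] * n
    for coin in coins:
        for total in range(coin, n + 1):
            ways[total] += ways[total - coin]
    return ways[n]

def closed_form(n):
    """floor(n^2/12 + n/2 + 1), computed with integers only."""
    return (n * n + 6 * n + 12) // 12

print([change_count(n) for n in range(13)])
print(all(change_count(n) == closed_form(n) for n in range(200)))
```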

Crazy Dice

Normal dice have faces labelled 1–6. When two are tossed there exist 6 × 6 = 36 equally likely outcomes e.g. the probability of (6, 6) is $\frac{1}{36}$. What are the probabilities for sums? s = x + y, 2 ≤ s ≤ 12.

$$p(z) = z^1 + z^2 + z^3 + z^4 + z^5 + z^6$$

Combined possibilities for sums are encoded in

$$(z^1 + z^2 + z^3 + z^4 + z^5 + z^6)(z^1 + z^2 + z^3 + z^4 + z^5 + z^6)$$
$$= z^2 + 2z^3 + 3z^4 + 4z^5 + 5z^6 + 6z^7 + 5z^8 + 4z^9 + 3z^{10} + 2z^{11} + z^{12}$$

so there are 3 ways in which we can achieve s = 10:

5 + 5 = 10
6 + 4 = 10
4 + 6 = 10

215

Question: Can we label the two cubes with other positive integers and obtain the same frequencies for sums? I.e. do there exist $a_1, \dots, a_6;\ b_1, \dots, b_6 \in \mathbb{N}$ so that

$$p_a(z) \cdot p_b(z) = (z^{a_1} + z^{a_2} + z^{a_3} + z^{a_4} + z^{a_5} + z^{a_6})(z^{b_1} + z^{b_2} + z^{b_3} + z^{b_4} + z^{b_5} + z^{b_6}) = z^2 + 2z^3 + 3z^4 + 4z^5 + 5z^6 + 6z^7 + 5z^8 + 4z^9 + 3z^{10} + 2z^{11} + z^{12}\,?$$

Call these Crazy Dice. The polynomial that $p_a(z) \cdot p_b(z)$ must equal factorises as

$$(z + z^2 + z^3 + z^4 + z^5 + z^6)^2 = \left[\frac{z(1-z^6)}{1-z}\right]^2 = \left[\frac{z(1-z^2)(1+z^2+z^4)}{1-z}\right]^2 = \left[z(1+z)(1+z+z^2)(1-z+z^2)\right]^2$$

Since Z[x] is a unique factorisation domain, the polynomials $p_a$ and $p_b$ must be products of these factors. Since $a_i \ge 1$, $b_i \ge 1$ for $1 \le i \le 6$, a factor z must occur in both. Also $p_a(1) = 1^{a_1} + 1^{a_2} + 1^{a_3} + 1^{a_4} + 1^{a_5} + 1^{a_6} = 6$, while at z = 1 the factors $z$, $1+z$, $1+z+z^2$, $1-z+z^2$ take the values 1, 2, 3, 1, so $(1+z+z^2)(1+z)$ must appear exactly once in the factorisation of $p_a$. The same applies to $p_b$. This leaves the two factors $(1-z+z^2)$ to distribute. One to each → normal dice. Both to $p_a$ → crazy dice.

$$p_a(z) = z(1+z)(1+z+z^2)(1-z+z^2)^2 = z + z^3 + z^4 + z^5 + z^6 + z^8$$

$$p_b(z) = z(1+z+z^2)(1+z) = z + 2z^2 + 2z^3 + z^4$$

so {1, 3, 4, 5, 6, 8} and {1, 2, 2, 3, 3, 4} are the labels.
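As a quick sanity check of these labels, the sketch below (an illustration added here, with the hypothetical helper name `sum_frequencies`) tabulates how often each sum occurs for a pair of dice and compares the crazy pair with two normal dice.

```python
from collections import Counter
from itertools import product


def sum_frequencies(die_a, die_b):
    """Map each possible sum to the number of (face_a, face_b) pairs producing it."""
    return Counter(a + b for a, b in product(die_a, die_b))


normal = [1, 2, 3, 4, 5, 6]
crazy_a = [1, 3, 4, 5, 6, 8]
crazy_b = [1, 2, 2, 3, 3, 4]

# The two pairs of dice give identical frequency tables for the sums 2..12.
print(sum_frequencies(normal, normal) == sum_frequencies(crazy_a, crazy_b))  # True
```

The `Counter` plays the role of the coefficient list of the product polynomial: the count stored under s is the coefficient of $z^s$.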

Representation Function

Let $A \subset \mathbb{Z}^+ = \mathbb{N} \cup \{0\}$ be a subset of the non-negative integers.

How many ways can a given n ∈ N be written as the sum of two elements of A?

• Order counts and the summands can be equal:

$$r(n) = \#\{(a, b) \in A \times A : n = a + b\}$$

• Order does not count, but they can be equal:

$$r_+(n) = \#\{(a, b) \in A \times A : a \le b,\ n = a + b\}$$

• Order does not count, and they cannot be equal:

$$r_-(n) = \#\{(a, b) \in A \times A : a < b,\ n = a + b\}$$

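Let A(z) be the generating function for the set A, i.e.

$$A(z) = \sum_{n \in A} z^n$$

Then

$$\sum_{n=0}^{\infty} r(n) z^n = (A(z))^2 = A^2(z)$$

and

$$\sum_{n=0}^{\infty} r_-(n) z^n = \frac{1}{2}\left[A^2(z) - A(z^2)\right] = B(z) \text{ say}$$

and finally

$$\sum_{n=0}^{\infty} r_+(n) z^n = B(z) + A(z^2) = \frac{1}{2}\left[A^2(z) + A(z^2)\right]$$

Question: Is there an infinite set $A \subset \mathbb{Z}^+$ for which $r_+(n) = C$ = constant ∀n or $\forall n \ge n_0$?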

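Then $\frac{1}{2}\left(A^2(z) + A(z^2)\right) = \frac{C}{1-z} + P(z)$ where P is a polynomial with $\partial P < n_0$.

Let $z \to -1^+$. Then $|P(z)| \le B_1$, a bound, and $\left|\frac{C}{1-z}\right| \le B_2$, a bound, so the right-hand side stays bounded. But $A^2(z) \ge 0$ and $A(z^2) \to \infty$, because $z^2 \to 1^-$ and A is infinite, so the left-hand side is unbounded. Hence the answer to the question is no.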
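The three representation functions and their generating-function identities can be checked by brute force on a small set. This is a hedged sketch: the helper name `representation_counts` and the sample set are assumptions made for illustration. The coefficient of $z^n$ in $A(z^2)$ is 1 exactly when n is even and n/2 ∈ A, which is the `diagonal` term below.

```python
def representation_counts(A, n):
    """Return (r(n), r_plus(n), r_minus(n)) for a finite set A of non-negative integers."""
    members = set(A)
    r = sum(1 for a in members if (n - a) in members)                      # ordered pairs
    r_plus = sum(1 for a in members if (n - a) in members and a <= n - a)  # a <= b
    r_minus = sum(1 for a in members if (n - a) in members and a < n - a)  # a < b
    return r, r_plus, r_minus


sample = {0, 1, 3, 4, 9, 10}
for n in range(25):
    r, r_plus, r_minus = representation_counts(sample, n)
    diagonal = 1 if n % 2 == 0 and n // 2 in sample else 0  # coefficient of z^n in A(z^2)
    # Coefficient form of (1/2)[A^2(z) - A(z^2)] and (1/2)[A^2(z) + A(z^2)]
    assert 2 * r_minus == r - diagonal
    assert 2 * r_plus == r + diagonal
```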

Question: Can we split Z+ into two disjoint sets A and B so every non-negative integeris expressible in the same number of ways as the sum of two distinct members of A as itis the sum of two distinct members of B?

Trial-and-error: Let 0 ∈ A. Then 1 ∈ B, else 1 = 1 + 0 = a + a′ but 1 ≠ b + b′. Then 2 ∈ B, else 2 = 2 + 0 = a + a′ ≠ b + b′ (1 + 1 is not a sum of distinct members). Then 3 ∈ A, else 3 ≠ a + a′ whereas 3 = 1 + 2 = b + b′, etc. Then

A = {0, 3, 5, 6, 9, . . .}
B = {1, 2, 4, 7, 8, . . .}

What is the pattern? Are A and B unique?

Use generating functions A(z) for A and B(z) for B so

$$\frac{1}{2}\left[A^2(z) - A(z^2)\right] = \frac{1}{2}\left[B^2(z) - B(z^2)\right] \tag{1}$$

Also, because $A \sqcup B = \mathbb{Z}^+$ is a splitting,

$$A(z) + B(z) = \frac{1}{1-z} = 1 + z + z^2 + z^3 + \cdots \tag{2}$$

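(1) ⇒ $A^2(z) - B^2(z) = A(z^2) - B(z^2)$ so $(A(z) - B(z))(A(z) + B(z)) = A(z^2) - B(z^2)$.

$$(2) \Rightarrow \frac{A(z) - B(z)}{1-z} = A(z^2) - B(z^2)$$

$$\Rightarrow A(z) - B(z) = (1-z)\left[A(z^2) - B(z^2)\right] \quad \forall z,\ |z| < 1$$

$$z \to z^2 \Rightarrow A(z^2) - B(z^2) = (1-z^2)\left[A(z^4) - B(z^4)\right]$$

$$\Rightarrow A(z) - B(z) = (1-z)(1-z^2)\left[A(z^4) - B(z^4)\right]$$

Iterating this gives:

$$A(z) - B(z) = (1-z)(1-z^2)(1-z^4) \cdots (1-z^{2^{n-1}})\left[A(z^{2^n}) - B(z^{2^n})\right]$$

But A(0) = 1, B(0) = 0 and $z^{2^n} \to 0$ as n → ∞ since |z| < 1, so

$$A(z) - B(z) = \prod_{j=0}^{\infty} \left(1 - z^{2^j}\right)\left[A(0) - B(0)\right] = \prod_{j=0}^{\infty} \left(1 - z^{2^j}\right) \tag{3}$$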

We can easily multiply this out!

Every $n \in \mathbb{Z}^+$ has a unique binary representation, i.e. an expression as a sum of distinct powers of 2, $2^j$. Indeed,

• n = sum of an even number of powers of 2 ⇒ $z^n$ has coefficient +1.

• n = sum of an odd number of powers of 2 ⇒ $z^n$ has coefficient −1.

so A = {n : n is an even . . .}, B = {n : n is an odd . . .}

This is not trivial, not something we might have guessed.

$$A:\quad 0 = 0,\quad 3 = 2^0 + 2^1,\quad 5 = 2^0 + 2^2,\quad 6 = 2^1 + 2^2,\quad 9 = 2^0 + 2^3,\ \dots$$

$$B:\quad 1 = 2^0,\quad 2 = 2^1,\quad 4 = 2^2,\quad 7 = 2^0 + 2^1 + 2^2,\quad 8 = 2^3,\ \dots$$
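The splitting by parity of the number of binary digits equal to 1 can be tested numerically. The sketch below is illustrative only (the names `split_by_bit_parity` and `distinct_pair_counts` are assumptions): it builds A and B below a limit and checks that every n below that limit has the same number of representations as a sum of two distinct members of A as of B.

```python
def split_by_bit_parity(limit):
    """A: even number of 1-bits, B: odd number of 1-bits, for 0 <= n < limit."""
    A = [n for n in range(limit) if bin(n).count("1") % 2 == 0]
    B = [n for n in range(limit) if bin(n).count("1") % 2 == 1]
    return A, B


def distinct_pair_counts(S, limit):
    """r_minus(n) for n < limit: pairs a < b in S with a + b = n."""
    members = set(S)
    return [sum(1 for a in members if a < n - a and (n - a) in members)
            for n in range(limit)]


A, B = split_by_bit_parity(256)
print(A[:5], B[:5])  # [0, 3, 5, 6, 9] [1, 2, 4, 7, 8]
print(distinct_pair_counts(A, 256) == distinct_pair_counts(B, 256))  # True
```

Truncating at a limit is harmless here, because a representation of n < limit only uses summands smaller than the limit.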

Euler’s Identity

Consider the number of ways, p(n), of expressing n as the sum of (any number of) distinct positive integers:

6 = 1 + 2 + 3

= 2 + 4

= 1 + 5

= 6

so p(6) = 4. Also let q(n) be the number of ways of expressing n as a sum of positive odd numbers, allowing repeats, so:

6 = 1 + 5

= 3 + 3

= 1 + 1 + 1 + 3

= 1 + 1 + 1 + 1 + 1 + 1

so q(6) = 4 and p(6) = q(6). This is not a coincidence!

Theorem (Euler) The number of ways of expressing n as a sum of distinct positive integers equals the number of ways of expressing n as a sum of (not necessarily distinct) odd positive integers.

Proof. We need to prove $\sum_{n=0}^{\infty} p(n) z^n = \sum_{n=0}^{\infty} q(n) z^n$, i.e.

$$(1 + z^1)(1 + z^2)(1 + z^3) \cdots = \frac{1}{(1-z)(1-z^3)(1-z^5) \cdots}$$

This is Euler’s identity.

Consider $\frac{1}{\text{RHS}} \times \text{LHS} = (1-z)(1-z^3)(1-z^5) \cdots (1+z)(1+z^2)(1+z^3) \cdots$. This is

$$\begin{aligned} &(1-z)(1+z)(1-z^3)(1+z^3) \cdots (1+z^2)(1+z^4)(1+z^6) \cdots \\ &= (1-z^2)(1-z^6)(1-z^{10}) \cdots (1+z^2)(1+z^4)(1+z^6) \cdots = P(z), \text{ say} \end{aligned}$$

But then

$$\begin{aligned} P(z^2) &= (1-z^4)(1-z^{12}) \cdots (1+z^4)(1+z^8) \cdots \\ &= (1-z^2)(1+z^2)(1-z^6)(1+z^6) \cdots (1+z^4)(1+z^8) \cdots \\ &= (1-z^2)(1-z^6) \cdots (1+z^2)(1+z^4) \cdots \\ &= P(z) \end{aligned}$$

So $P(z) = P(z^2)$, so $P'(z) = 2zP'(z^2)$ and $P'(0) = 0$; similarly $P''(0) = 0$, $P'''(0) = 0, \dots, P^{(n)}(0) = 0$ ∀n ∈ N, so (Taylor expansion) P(z) = P(0) + 0 = P(0) ∀ |z| < 1. But $P(0) = (1-0)(1-0) \cdots (1+0)(1+0) \cdots = 1$. Hence P(z) = 1 ∀ |z| < 1, so LHS = RHS and we have proved Euler’s mysterious identity. □
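The theorem can be illustrated numerically by expanding both products up to a fixed degree. The following sketch is an addition for illustration, and the function names `count_distinct` and `count_odd` are assumptions. It computes p(n) from $\prod (1 + z^m)$ and q(n) from $\prod 1/(1 - z^{2i-1})$.

```python
def count_distinct(n):
    """Partitions of n into distinct parts: coefficient of z^n in prod (1 + z^m)."""
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        # Going downwards ensures each part is used at most once.
        for total in range(n, part - 1, -1):
            ways[total] += ways[total - part]
    return ways[n]


def count_odd(n):
    """Partitions of n into odd parts, repeats allowed: prod 1/(1 - z^(2i-1))."""
    ways = [1] + [0] * n
    for part in range(1, n + 1, 2):
        # Going upwards allows each odd part to be reused any number of times.
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]


print(count_distinct(6), count_odd(6))  # 4 4
```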

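Ex Euler’s identity at $z = \frac{1}{2}$ is

$$\prod_{j=1}^{\infty} \left(1 + \frac{1}{2^j}\right) = \prod_{i=1}^{\infty} \left(1 - \frac{1}{2^{2i-1}}\right)^{-1}$$

i.e. $\frac{3}{2} \cdot \frac{5}{4} \cdot \frac{9}{8} \cdots = \frac{2}{1} \cdot \frac{8}{7} \cdot \frac{32}{31} \cdots$, or take logs and use $\log(1+z) = z - \frac{z^2}{2} + \frac{z^3}{3} - \frac{z^4}{4} + \cdots$ (|z| < 1):

$$\sum_{j=1}^{\infty} \sum_{n=1}^{\infty} \frac{(-1)^{n+1} z^{jn}}{n} = \sum_{i=1}^{\infty} \sum_{n=1}^{\infty} \frac{z^{(2i-1)n}}{n}$$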

Partition Function p(n)

Question: In how many ways can n ∈ N be expressed as a sum of natural numbers?

First let order count :

Ex

4 = 1 + 3 = 3 + 1
= 2 + 2
= 4
= 2 + 1 + 1 = 1 + 2 + 1 = 1 + 1 + 2
= 1 + 1 + 1 + 1

which is $8 = 2^{4-1}$ ways.

Proposition If order counts, n can be expressed as a sum in $2^{n-1} = q(n)$ ways.

Proof. n = 1: $q(1) = 1 = 2^{1-1}$ so the result is true. Given n > 1, assume it is true for all smaller values and write

$$n = (n-1) + 1 = (n-2) + 2 = \cdots = 1 + (n-1)$$

By the induction hypothesis these bracketed numbers can be expressed in a total of

$$2^{n-2} + 2^{n-3} + \cdots + 1 = 2^{n-1} - 1$$

ways, and this accounts for the sums for n with 2 or more terms with order counting. The only remaining sum is n = n, so we get $q(n) = 2^{n-1}$ ∀n ∈ N. □
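The induction can be mirrored by a short recursion. This is a sketch with an assumed helper name, `compositions`: it fixes the first summand and counts the ordered sums of what remains, treating the empty sum of 0 as one way so that the single-term sum n = n is included.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def compositions(n):
    """Number of ordered sums of positive integers adding up to n."""
    if n == 0:
        return 1  # the empty sum, used when the first summand is n itself
    # Choose the first summand, then count ordered sums of the remainder.
    return sum(compositions(n - first) for first in range(1, n + 1))


print([compositions(n) for n in range(1, 8)])  # [1, 2, 4, 8, 16, 32, 64]
```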

If order does not count then the counting is much more complex: p(1) = 1, p(2) =2, p(3) = 3,

4 = 1 + 1 + 1 + 1

= 1 + 1 + 2

= 1 + 3

= 2 + 2

= 4

and p(4) = 5. Similarly p(5) = 7. There is no obvious pattern.

Major MacMahon computed hundreds of values of p(n) by hand and it suddenly occurred to him that, from a distance, the outline of the digits formed a parabola!

⇒ # of digits ∼ $C\sqrt{n}$ so $p(n) \sim e^{\alpha\sqrt{n}}$. Later work showed

$$p(n) \sim \frac{e^{\pi\sqrt{2n/3}}}{4\sqrt{3}\, n} \quad \text{(Rademacher)}$$

At n = 200, RHS ≃ $4 \times 10^{12}$ ≃ p(200). The proof uses elliptic modular functions. We will derive an upper bound for p(n). p(n) is called the (unrestricted) partition function.

Geometric Representation

Ex 15 = 6 + 3 + 3 + 2 + 1

• • • • • •
• • •
• • •
• •
•

Reading vertically, 15 = 5 + 4 + 3 + 1 + 1 + 1 is another, “conjugate” partition. The number of parts in the first equals the size of the largest part in the second, and vice-versa.

Proposition The number of partitions of n into m parts is equal to the number of partitions of n into parts, the largest of which is m.
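Reading the dot diagram by columns is easy to mechanise. The sketch below uses an assumed helper name, `conjugate`, and is only an illustration of the geometric argument: column c of the diagram contains one dot for every part of size at least c.

```python
def conjugate(partition):
    """Conjugate partition: read the dot diagram column by column."""
    parts = sorted(partition, reverse=True)
    if not parts:
        return []
    # Column c holds one dot for each part whose size is at least c.
    return [sum(1 for p in parts if p >= column)
            for column in range(1, parts[0] + 1)]


example = [6, 3, 3, 2, 1]
print(conjugate(example))  # [5, 4, 3, 1, 1, 1]
```

Conjugating twice returns the original partition, and the number of parts of one equals the largest part of the other, which is the bijection behind the proposition.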

Theorem (Euler)

$$\prod_{m=1}^{\infty} \frac{1}{1 - x^m} = \sum_{n=0}^{\infty} p(n) x^n$$

for |x| < 1, with p(0) = 1. So the LHS is a generating function for p(n).

Proof. Expand each factor on the LHS as a power series using the sum to ∞ of a geometric series:

$$\text{LHS} = (1 + x + x^2 + x^3 + \cdots)(1 + x^2 + x^4 + \cdots)(1 + x^3 + x^6 + \cdots) \cdots$$

Now multiply out and collect like powers of x so

$$\text{LHS} = 1 + \sum_{j=1}^{\infty} a(j) x^j$$

We need to prove a(j) = p(j). If we take a term $x^{k_1}$ from the first factor, $x^{2k_2}$ from the second, $x^{3k_3}$ from the third, . . . , $x^{mk_m}$ from the mth, where each $k_i \ge 0$, their product is

$$x^{k_1} \cdot x^{2k_2} \cdots x^{mk_m} = x^k$$

say, where $k = k_1 + 2k_2 + 3k_3 + \cdots + mk_m$ or

$$k = \underbrace{(1 + 1 + \cdots + 1)}_{k_1} + \underbrace{(2 + 2 + \cdots)}_{k_2} + \underbrace{(3 + 3 + \cdots)}_{k_3} + \cdots + \underbrace{(m + m + \cdots)}_{k_m}$$

so this is a partition of k into positive summands. Conversely each term $x^k$ comes from such a partition. Hence a(k) = p(k). (This can be made into a more rigorous proof.) □
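The product expansion in this proof translates directly into a way of computing p(n). The following minimal sketch (the function name `partition_numbers` is an assumption) truncates every series at a chosen degree and multiplies in the factors $1/(1 - x^m)$ one at a time.

```python
def partition_numbers(limit):
    """Coefficients p(0..limit) of prod_{m>=1} 1/(1 - x^m), truncated at x^limit."""
    p = [1] + [0] * limit
    for m in range(1, limit + 1):
        # Multiply the running series by 1 + x^m + x^(2m) + ...
        for k in range(m, limit + 1):
            p[k] += p[k - m]
    return p


p = partition_numbers(200)
print(p[:6])   # [1, 1, 2, 3, 5, 7]
print(p[200])  # 3972999029388
```

Factors with m greater than the limit cannot affect coefficients up to $x^{\text{limit}}$, so stopping the product there is exact rather than approximate.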

Similarly other types of partitions can be described using generating functions:

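| Generating function | for the number of partitions of n into parts which are |
|---|---|
| $\prod_{m=1}^{\infty} \frac{1}{1 - x^{2m}}$ | even |
| $\prod_{p} \frac{1}{1 - x^{p}}$ | prime |
| $\prod_{m=1}^{\infty} (1 + x^{m})$ | unequal |
| $\prod_{m=1}^{\infty} (1 + x^{m^2})$ | distinct squares |
| $\prod_{m=1}^{\infty} \frac{1}{1 - x^{m^2}}$ | squares |
| $\prod_{p} (1 + x^{p})$ | distinct primes |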

Pentagonal Numbers

These belong to the family of polygonal numbers, beloved by the Greek Pythagoreans.

1 = 1
1 + 4 = 5
1 + 4 + 7 = 12
1 + 4 + 7 + 10 = 22

In general, the nth pentagonal number is the nth partial sum of the arithmetic progression 1, 4, 7, 10, 13, . . . , 3n + 1, . . . (n = 0, 1, 2, . . .). Let

$$\omega(n) = \sum_{j=0}^{n-1} (3j + 1) = 3\sum_{j=0}^{n-1} j + \sum_{j=0}^{n-1} 1 = \frac{3}{2} n(n-1) + n = \frac{3n^2 - n}{2}$$

Then, normally, $\omega(n) = \frac{3n^2 - n}{2}$ and $\omega(-n) = \frac{3n^2 + n}{2}$ are called pentagonal numbers: ω(1) = 1, ω(2) = 5, ω(3) = 12, . . ..

Theorem (Euler’s Pentagonal Number Theorem) Let |x| < 1, then

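$$\prod_{m=1}^{\infty} (1 - x^m) = \sum_{n=-\infty}^{\infty} (-1)^n x^{\omega(n)}$$

So, surprisingly, the LHS is a sort of generating function for the ω(n). Note also the surprising relationship between the p(n) and ω(n):

$$1 = \left(\sum_{n=0}^{\infty} p(n) x^n\right)\left(\sum_{n=-\infty}^{\infty} (-1)^n x^{\omega(n)}\right)$$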

Proof. (Euler by induction 1750, Legendre 1830, Jacobi 1846; Franklin 1881 gave this remarkable “combinatorial” proof.) Let

$$\prod_{m=1}^{\infty} (1 - x^m) = 1 + \sum_{n=1}^{\infty} a(n) x^n$$

Now every partition of n into unequal parts contributes to the coefficient a(n) on the right with

• +1 if $x^n$ is the product of an even number of terms.

• −1 if $x^n$ is the product of an odd number of terms.

Hence

$$\prod_{m=1}^{\infty} (1 - x^m) = 1 + \sum_{n=1}^{\infty} \left(p_e(n) - p_o(n)\right) x^n \tag{1}$$

where $p_e(n)$ and $p_o(n)$ denote the number of partitions of n into an even and an odd number of unequal parts respectively. Franklin showed that there is a 1-1 correspondence between even and odd partitions, so $p_e(n) = p_o(n)$, except when n is pentagonal.

Consider the graph of a partition. It is in standard form if the parts are in strictlydecreasing order going down the page.

Definition The base of the graph is the longest line segment connecting points in the last row. Let b be the number of points in the base.

Definition The slope of the graph is the longest 45° segment joining the last point in the first row with the last point in successive rows. Let s be the number of points in the slope.

[Figure: graph of a partition in standard form, with the slope marked (s = 4).]

Definition Operation A: Move the points on the base so they all lie on a line parallel to the slope. It is permissible if the resulting graph is in standard form.

Definition Operation B: Move all points on the slope so they lie on a line below the base. Again it is permissible if the resulting graph is in standard form.

If A is permissible we get a new partition of n into unequal parts with 1 less term. If B is permissible we get a new partition of n into unequal parts with 1 more term.

If, for a given n and every partition of n into unequal parts, exactly one of A or B is permissible, there will be a 1-1 correspondence between partitions of n into an even and an odd number of terms ⇒ $p_e(n) = p_o(n)$ for these n.

Determining whether A or B is permissible:

• Case 1 b < s: b ≤ s − 1 ⇒ A is okay but B is not.

• Case 2 b = s: B is not okay. A is okay except when the base and slope intersect (exception (a)).

• Case 3 s < b: A is not permissible. B is okay except when b = s + 1 and the base and slope intersect (exception (b)).

∴ there are just two exceptions, (a) and (b) above.

• Consider (a): Let there be k rows so b = k and, counting ‘•’s:

$$n = k + (k+1) + \cdots + (2k-1) = \frac{3k^2 - k}{2} = \omega(k)$$

So if k is even we get an extra partition into an even number (k) of parts. If k is odd we get an extra odd partition. ∴ $p_e(n) - p_o(n) = (-1)^k$.

• In (b):

$$n = \frac{3k^2 - k}{2} + k = \frac{3k^2 + k}{2} = \omega(-k)$$

because there is an extra point on each row, and again $p_e(n) - p_o(n) = (-1)^k$.

Hence, by (1),

$$\prod_{m=1}^{\infty} (1 - x^m) = 1 + \sum_{k=1}^{\infty} (-1)^k x^{\omega(k)} + \sum_{k=1}^{\infty} (-1)^k x^{\omega(-k)}$$
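The pentagonal number theorem can be checked coefficient by coefficient up to any fixed degree. The sketch below is illustrative and its function names (`omega`, `euler_product_coeffs`, `pentagonal_series_coeffs`) are assumptions: one function expands the truncated product, the other writes down the sparse series with exponents ω(k) and ω(−k).

```python
def omega(n):
    """Generalised pentagonal number (3n^2 - n)/2, valid for negative n too."""
    return (3 * n * n - n) // 2


def euler_product_coeffs(limit):
    """Coefficients of prod_{m>=1} (1 - x^m), truncated at x^limit."""
    coeffs = [1] + [0] * limit
    for m in range(1, limit + 1):
        # Multiply by (1 - x^m); going downwards keeps the old values intact.
        for k in range(limit, m - 1, -1):
            coeffs[k] -= coeffs[k - m]
    return coeffs


def pentagonal_series_coeffs(limit):
    """Coefficients of 1 + sum_{k>=1} (-1)^k (x^omega(k) + x^omega(-k))."""
    coeffs = [1] + [0] * limit
    k = 1
    while omega(k) <= limit:
        for exponent in (omega(k), omega(-k)):
            if exponent <= limit:
                coeffs[exponent] += (-1) ** k
        k += 1
    return coeffs


print(euler_product_coeffs(12))
print(euler_product_coeffs(200) == pentagonal_series_coeffs(200))  # True
```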

Theorem (Euler) Let p(0) = 1 and p(n) = 0 for n < 0. Then for n ≥ 1:

$$p(n) = \sum_{k=1}^{\infty} (-1)^{k+1}\left\{p(n - \omega(k)) + p(n - \omega(-k))\right\}$$

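Proof. By the above two theorems

$$\left(1 + \sum_{k=1}^{\infty} (-1)^k\left\{x^{\omega(k)} + x^{\omega(-k)}\right\}\right)\left(\sum_{m=0}^{\infty} p(m) x^m\right) = 1$$

Multiplying out, and using the convention p(n) = 0 for n < 0 to re-index:

$$\sum_{n=0}^{\infty} p(n) x^n + \sum_{m=0}^{\infty} \sum_{k=1}^{\infty} (-1)^k p(m) x^{m + \omega(k)} + \sum_{m=0}^{\infty} \sum_{k=1}^{\infty} (-1)^k p(m) x^{m + \omega(-k)} = 1$$

$$\sum_{n=0}^{\infty} p(n) x^n + \sum_{n=0}^{\infty} \left[\sum_{k=1}^{\infty} (-1)^k p(n - \omega(k))\right] x^n + \sum_{n=0}^{\infty} \left[\sum_{k=1}^{\infty} (-1)^k p(n - \omega(-k))\right] x^n = 1$$

For n ≥ 1 the coefficient of $x^n$ on the RHS is zero, so equating coefficients of $x^n$ on each side:

$$\Rightarrow p(n) = \sum_{k=1}^{\infty} (-1)^{k+1}\left\{p(n - \omega(k)) + p(n - \omega(-k))\right\}$$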

Ex $p(5) = \sum_{k=1}^{\infty} (-1)^{k+1}\{p(5 - \omega(k)) + p(5 - \omega(-k))\}$. Using ω(0) = 0, ω(1) = 1, ω(2) = 5, ω(3) = 12, ω(−1) = 2, ω(−2) = 7, ω(−3) = 15 we get:

$$\begin{aligned} p(5) &= (-1)^2\{p(5 - \omega(1)) + p(5 - \omega(-1))\} + (-1)^3\{p(5 - \omega(2)) + p(5 - \omega(-2))\} \\ &= 1 \cdot \{p(4) + p(3)\} - \{p(0) + p(-2)\} + 0 \\ &= \{5 + 3\} - \{1 + 0\} \\ &= 7 \end{aligned}$$

as before.
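The recurrence is a practical way to compute p(n), since only about $\sqrt{n}$ terms are non-zero for each n. The following is a sketch under the stated conventions p(0) = 1 and p(n) = 0 for n < 0; the function name `partitions_by_recurrence` is an assumption for illustration.

```python
def partitions_by_recurrence(limit):
    """p(0..limit) from p(n) = sum_{k>=1} (-1)^(k+1) {p(n - w(k)) + p(n - w(-k))}."""
    p = [1] + [0] * limit
    for n in range(1, limit + 1):
        total = 0
        k = 1
        while (3 * k * k - k) // 2 <= n:           # w(k) <= n
            sign = 1 if k % 2 == 1 else -1         # (-1)^(k+1)
            total += sign * p[n - (3 * k * k - k) // 2]
            second = (3 * k * k + k) // 2          # w(-k)
            if second <= n:                        # p of a negative argument is 0
                total += sign * p[n - second]
            k += 1
        p[n] = total
    return p


p = partitions_by_recurrence(200)
print(p[5], p[200])  # 7 3972999029388
```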

An upper bound for p(n)

Theorem ∀n ≥ 1, $p(n) < e^{K\sqrt{n}}$ where $K = \pi\sqrt{\frac{2}{3}}$.

Proof. Let $F(x) = \prod_{m=1}^{\infty} (1 - x^m)^{-1} = 1 + \sum_{k=1}^{\infty} p(k) x^k$ and restrict x to lie in 0 < x < 1. Then $p(n) x^n < F(x)$, each term being positive. So

$$\log(p(n)) < \log(F(x)) + n \log\left(\frac{1}{x}\right) = A + B$$

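First estimate A, then B:

$$\begin{aligned} A &= \log(F(x)) \\ &= -\log\left(\prod_{n=1}^{\infty} (1 - x^n)\right) \\ &= -\sum_{n=1}^{\infty} \log(1 - x^n) \\ &= \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \frac{x^{mn}}{m} \\ &= \sum_{m=1}^{\infty} \frac{1}{m} \sum_{n=1}^{\infty} (x^m)^n \\ &= \sum_{m=1}^{\infty} \frac{1}{m} \cdot \frac{x^m}{1 - x^m} \end{aligned}$$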

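Now $\frac{1 - x^m}{1 - x} = 1 + x + x^2 + \cdots + x^{m-1}$ and 0 < x < 1 so

$$m x^{m-1} \le \frac{1 - x^m}{1 - x} \le m$$

$$\Rightarrow \frac{m(1 - x)}{x} \le \frac{1 - x^m}{x^m} \le \frac{m(1 - x)}{x^m}$$

with all terms positive, so inverting and dividing by m gives:

$$\frac{x^m}{m^2(1 - x)} \le \frac{1}{m} \cdot \frac{x^m}{1 - x^m} \le \frac{1}{m^2} \cdot \frac{x}{1 - x}$$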

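Sum on m:

$$A = \sum_{m=1}^{\infty} \frac{1}{m} \cdot \frac{x^m}{1 - x^m} \le \frac{x}{1 - x} \sum_{m=1}^{\infty} \frac{1}{m^2} = \frac{\pi^2}{6} \cdot \frac{x}{1 - x}$$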

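Let $t = \frac{1 - x}{x}$ so $1 + t = 1 + \frac{1 - x}{x} = \frac{1}{x}$, so $A \le \frac{\pi^2}{6} \cdot \frac{1}{t}$ and $\log\left(\frac{1}{x}\right) = \log(1 + t) < t$.

Hence $\log(p(n)) < \log(F(x)) + n \log\left(\frac{1}{x}\right) < \frac{\pi^2}{6t} + nt$.

Now the minimum value of $\theta(t) = \frac{\pi^2}{6t} + nt$ occurs when $t_0 = \frac{\pi}{\sqrt{6n}}$.

For this value of t

$$\theta(t_0) = 2nt_0 = \frac{2n\pi}{\sqrt{6n}} = K\sqrt{n}$$

Hence $\log(p(n)) < K\sqrt{n} \Rightarrow p(n) < e^{K\sqrt{n}}$. □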
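To get a feel for how much room the bound leaves, one can tabulate p(n) next to $e^{K\sqrt{n}}$. This is a rough numerical sketch, with assumed names `partition_table` and `upper_bound`; it uses floating point for the bound only, and exact integers for p(n).

```python
import math


def partition_table(limit):
    """Exact p(0..limit) by expanding prod 1/(1 - x^m) up to x^limit."""
    p = [1] + [0] * limit
    for m in range(1, limit + 1):
        for k in range(m, limit + 1):
            p[k] += p[k - m]
    return p


K = math.pi * math.sqrt(2.0 / 3.0)


def upper_bound(n):
    """The bound e^(K sqrt(n)) from the theorem."""
    return math.exp(K * math.sqrt(n))


p = partition_table(200)
for n in (1, 10, 100, 200):
    print(n, p[n], upper_bound(n))
```

The gap grows roughly like the factor $4\sqrt{3}\,n$ in the asymptotic formula quoted earlier, which the elementary argument does not capture.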

We can use generating functions and logarithmic differentiation to devise recursionformulas for arithmetical functions:

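Let A ⊂ N be a subset. Let f(n) be an arithmetical function. Let the product

$$F_A(x) = \prod_{n \in A} (1 - x^n)^{-f(n)/n}$$

and the series

$$G_A(x) = \sum_{n \in A} \frac{f(n)}{n} x^n$$

converge absolutely for |x| < 1. Then

$$\log(F_A(x)) = -\sum_{n \in A} \frac{f(n)}{n} \log(1 - x^n) = \sum_{n \in A} \frac{f(n)}{n} \sum_{m=1}^{\infty} \frac{x^{mn}}{m} = \sum_{m=1}^{\infty} \frac{1}{m} G_A(x^m)$$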

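Then differentiate and multiply by x to obtain:

$$\frac{x F_A'(x)}{F_A(x)} = \sum_{m=1}^{\infty} G_A'(x^m) x^m = \sum_{m=1}^{\infty} \sum_{n \in A} f(n) x^{mn} = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \chi_A(n) f(n) x^{mn} = \text{RHS}$$

where

$$\chi_A(n) = \begin{cases} 1 & n \in A \\ 0 & n \notin A \end{cases}$$

is the so-called characteristic function of A.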

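Now collect terms with mn = k to get

$$\text{RHS} = \sum_{k=1}^{\infty} f_A(k) x^k \quad \text{where} \quad f_A(k) = \sum_{d \mid k} \chi_A(d) f(d) = \sum_{d \mid k,\ d \in A} f(d)$$

Hence

$$x F_A'(x) = F_A(x) \sum_{k=1}^{\infty} f_A(k) x^k \tag{1}$$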

Now write $F_A(x)$ as a power series in x. The coefficients will depend on A and f of course, so call them $p_{A,f}(n)$:

$$F_A(x) = \sum_{n=0}^{\infty} p_{A,f}(n) x^n, \qquad p_{A,f}(0) = F_A(0) = \prod_{n \in A} 1 = 1$$

Finally, equate the coefficients of $x^n$ on both sides of (1) to obtain

$$n\, p_{A,f}(n) = \sum_{k=1}^{n} f_A(k)\, p_{A,f}(n - k)$$

with $p_{A,f}(0) = 1$ and $f_A(k) = \sum_{d \mid k,\ d \in A} f(d)$.

Ex A = N, f(n) = n ⇒ $p_{A,f}(n) = p(n)$, the (unrestricted) partition function, and $f_A(k) = \sum_{d \mid k} d = \sigma(k)$, the divisor sum function, so:

$$n\, p(n) = \sum_{k=1}^{n} \sigma(k)\, p(n - k)$$

Check for n = 5: p(1) = 1, p(2) = 2, p(3) = 3, p(4) = 5, p(5) = 7, so LHS = 5 · 7 = 35 and

$$\text{RHS} = \sigma(1)p(4) + \sigma(2)p(3) + \sigma(3)p(2) + \sigma(4)p(1) + \sigma(5)p(0) = 1 \cdot 5 + 3 \cdot 3 + 4 \cdot 2 + 7 \cdot 1 + 6 \cdot 1 = 35$$
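The general recursion can be sketched in a few lines. In the code below the names `restricted_divisor_sum` and `coefficients`, and the way A is passed as a membership test `in_A`, are assumptions made for illustration. Exact fractions are used because $p_{A,f}(n)$ need not be an integer for arbitrary f.

```python
from fractions import Fraction


def restricted_divisor_sum(k, in_A, f):
    """f_A(k): sum of f(d) over divisors d of k that belong to A."""
    return sum(f(d) for d in range(1, k + 1) if k % d == 0 and in_A(d))


def coefficients(limit, in_A, f):
    """p_{A,f}(0..limit) from n p(n) = sum_{k=1}^{n} f_A(k) p(n - k), with p(0) = 1."""
    f_A = [0] + [restricted_divisor_sum(k, in_A, f) for k in range(1, limit + 1)]
    p = [Fraction(1)]
    for n in range(1, limit + 1):
        p.append(sum(f_A[k] * p[n - k] for k in range(1, n + 1)) / n)
    return p


# A = N, f(n) = n: the unrestricted partition function.
print([int(c) for c in coefficients(5, lambda d: True, lambda d: d)])  # [1, 1, 2, 3, 5, 7]

# A = odd numbers, f(n) = n: partitions into odd parts, as in Euler's identity.
print(int(coefficients(6, lambda d: d % 2 == 1, lambda d: d)[6]))  # 4
```

Choosing A to be the odd numbers with f(n) = n recovers q(6) = 4 from the earlier section, since then $F_A(x) = \prod_{n \text{ odd}} (1 - x^n)^{-1}$.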