Leibnitz's Formula

November 3, 2009 at 11:42 pm (Mathematics)

Leibnitz’s Formula for π

one of the peculiarities of π, the ratio of circumference of a circle to its diameter and thus a strictly geometric entity, is that it comes up in all sorts of unexpected places, thus giving rise to the belief, common amongst pure mathematicians, that Nature has a sort of basic kit of numbers, including notably π, e, i and Γ that She applies here, there and everywhere. Buffon, the eighteenth-century French naturalist, worked out a formula giving the probability of a needle of length l dropped at random onto a floor ruled with parallel lines at unit intervals cutting at least one line. If l is less than a unit in length, the formula turns out to be 2l/π and this result has even been tested experimentally by a modern scientist, Kahan. Actually, in this case and very many others, there is a perfectly rational connection between the formula and the properties of circles, but I must admit that I am floored by the connection between π and the Gamma Function in the weird and rather beautiful result Γ(1/2) = √π

π also turns up as the limit to various numerical series, a matter which in the past was of considerable importance as manufacturing methods required better and better estimates of the value of π. Today, computers have calculated the value of π to over a million decimal places so the question of exactitude has become academic — although computers still use formulae originally discovered by pure mathematicians such as Euler or Ramanujan.

Leibnitz, co-inventor of the Calculus, produced several centuries ago, somewhat out of a hat, the remarkable series

π/4 = 1 – 1/3 + 1/5 – 1/7 + ……

British mathematicians, eager to give as much credit as possible to Newton, pointed out that a Scot, Gregory, had already derived, using Newton’s version of the calculus, the formula

tan^-1 x = x – 1/3 x² + 1/5 x² ……

and that you obtain Leibnitz’s formula by setting x = 1.

However, apart from the question of priority, one might reasonably wonder why it should be necessary to bring in calculus to validate such a simple-looking series. A problem in so-called elementary number theory should, so I feel at any rate, make no appeal to the methods of analysis or any other ‘higher’ mathematics but rely uniquely on the properties of the natural numbers. I feel so strongly about this that I had at one point even thought of offering a small money reward for a strictly numerical proof of Leibnitz’s famous series, but I am glad I did not do so, since I have subsequently come across one in Hilbert’s excellent book, Geometry and the Imagination.

The complete proof is not at all easy — ‘elementary’ proofs in Number Theory are not necessarily simple, far from it — but the general drift of the argument is straightforward enough.

Consider a circle whose centre is at the origin with radius r , r a positive integer (> 0). The formula for the circle is thus x²+ y² = r² .

We mark off lattice points to make a network of squares (or use squared paper), and take each lattice as having a side of unit length.

For any given choice of circle (with r > 1), there will be squares which ‘overlap’, part of the square falling within the circumference and part falling outside the circumference and a single point counts as ‘part’ of a square.

We define a function f(r) with r a positive integer to be the sum total of all lattices where the bottom left hand corner of the lattice is either inside or on the circumference of a circle radius r. (Any other criterion, such as counting a square ‘when there is more than half its area inside the circle’, would do so long as we stick to it, but there are good reasons for choosing this ‘left hand corner’ criterion, as will shortly be apparent.)

It is not clear at a glance whether the lattice area, evaluated according to our left hand corner criterion, is larger or smaller than the true area of the circle. However, as we make the lattices smaller and smaller, i.e. increase r, we expect the difference to diminish progressively.

f(1) = 5 — remember we are counting the squares where only the left hand corner point lies on the circumference. I make f(2) come to 13 and f(3) come to 29, while the two higher values given below are taken from Hilbert’s book Geometry and the Imagination :

f(2) = 13

f(3) = 29

f(10) = 317

f(100) = 31417

The absolute value of the difference between the lattice area, f(r), evaluated simply by counting the relevant lattices, and the area of the circle, π r², is |f(r) – π r²| If we use f(r) as a rough and ready estimate of the area of the circle and divide by r² we thus get an estimate of the value of π obtaining

π ≈ 13/4 = 3.125

π ≈ 29/9 = 3.222222…

π ≈ 317/100 = 3.17

π ≈ 31417/1000 = 3.1417

Now, since the diagonal of a unit square lattice is √2, all the ‘borderline cases’ will be included within a circular annulus bounded within by a circle of radius of (r – Ö2) and without by a circle of radius (r + √2).

The area of this annulus is the difference between the larger and smaller circles, i.e.

[(r + √2)² π – (r + √2)² π] = 4 √2 π r

|f(r) – π r²|, the discrepancy between the lattice area and the area of the circle, is bound to be less than the annulus area since some lattices falling within the annulus area get counted in f(r), and certainly f(r) cannot be greater than the annulus area.

Thus

|f(r) – π r²| ≤ 4 √2 π r

which, dividing right through by r² gives

|f(r)/r²– π| ≤ 4 √2 π/r …………………..(i)

Now, assuming Cartesian coordinates with 0 as the centre of the circle, for any value of r there will be a certain number of points which lie on the circumference of the circle, those points (x, y) which satisfy the equation

(x² + y²) = r² where r is a positive integer (> 0).

But we must count all the negative values of x and y as well. For example, with r = 2, the circumference will pass through the lattice points (2, 0), (–2, 0), (0, 2) and (0, –2) and no others.

We now introduce a new variable n = r² making the radius Ön and the equation of the circle becomes x² + y² = n Although n must be an integer, we lift the restriction on r so that the radius is not necessarily an integer, e.g. r = √7, r = √13 and so on.

Now, the number of lattice points on the circumference of a circle with radius √n is equivalent to four times the number of ways that an integer n can be expressed as the sum of two squares — four times because we allow x and y to take minus values. This is strictly a problem in Number Theory and an important theorem states that

The number of ways in which an integer can be expressed as the sum of the squares of two integers is equal to four times the excess of the number of factors of n having the form 4k + 1 over the number of factors having the form 4k + 3.

Take 35 = 5 × 7. We have as factors of 35 : 1, 5, 7 and 35 which are respectively

1 (mod 4)

3 (mod 4)

Since there are two of each type and 2 –- 2 = 0 there is no excess of the (4k+ 1) type and so, if the theorem is correct, 35 cannot be represented as the sum of two squares, which is the case.

The proof of the theorem is quite complicated and will not be attempted here. What we can show at once is that

No prime p which is 3 (mod 4) can be represented as the sum of two (integer) squares.

This is so because any odd number, whether it be 1 or 3 (mod 4), will be 1 (mod 4) when squared. And every even number, whether 2 or 0 (mod 4) will be 0 (mod 4) when squared. So if p happens to be 3 (mod 4) like 7 or 11, it will have no representation as the sum of two squares, i.e. the equation a² + b² = 3 (mod m) is insoluble in integers.

However, if p prime is 1 (mod 4) it may be possible to find a representation in two squares since (4k+1)² + even² = 1 (mod 4) is possible. A theorem given by Fermat, which goes some way towards establishing the principal theorem, states that

An odd prime p is expressible as the sum of two squares if and only if p = 1 (mod 4)

The ‘if’ part means that every odd prime p such as 5, 13, 17 and so on can be expressed as the sum of two squares. 13 = 3² + 2² for example and 17 = 4² + 1².

From our point of view, any representation such as 5 = 1² + 2² gives us eight lattice points, four for the different ways of forming (1² + 2²) and four for the different ways of forming (2² + 1²) i.e. the lattice points with coordinates

(1, 2), (1, –2), (–1, 2), (–1, –2)

and those with coordinates

(2, 1), (2, –1), (–2, 1), (–2, –1)

65 = 5 ´ 13 has factors, 1, 5, 13 and 65 all of which are positive integers which are 1 (mod 4). There should, then, be four different ways of representing 65 as the sum of two squares, where the order in which we write the two squares matters. And in effect we have

65 = (1² + 8²) = (8² + 1²) = (4² + 7²) = (7² + 4²)

We end up with eight lattice points for each combination, namely

(1, 8), (–1, 8), (1, –8), (–1, –8),

(8, 1), (8, –1), (–8, 1), (–8, –1),

and

(4, 7), (–4, 7), (4, –7), (–4, –7),

(7, 4), (7, –4), (–7, 4), (–7, –4),

The idea now is that, by considering every number n £ r² , working out how many times it can be expressed as a sum of two squares and adding the results, we will obtain f(r) on multiplying by 4 . Actually, this would include the origin, the point (0, 0), which we do not want to consider, so, excluding this, we have

(f(r) – 1) = 4 ∑ representations of n ≤ r² as two squares.

Now 1 has a representation since 1² = 1² + 0² giving the four points (1, 0), (0, 1), (–1, 0) and (0, –1), 2 = 1² + 1² has a representation giving four points, 3 none and 4 = 2² + 2² gives four points giving twelve in all. I made f(2) = 13 which checks out with the above since (f(2) – 1) = 12.

Actually, rather than work out the excess for each number n individually, it is much more convenient to add up the number of factors of all numbers of the form (4k+1) and then subtract the number of factors of all numbers of the form (4k+3). In the first list we have

1, 5, 9…. (4k+1) ≤ r² and in the second

3, 7, 11…… (4k+3) ≤ r²

Each of the numbers above must appear in the total for its class as many times as there are multiples of it that are at most r². 1 will obviously appear r² times, but 5 will only appear [r²/5] where the square brackets indicate the nearest integer ≤ r²/5

Finally, since we are not removing or adding anything, we can subtract the first term in the (4k+3) category from the first term in the (4k+1) category, the second term from the second and so on. We end up with the open-ended series, depending on the choice of r

(f(r) – 1) = 4 ∑ representations of n ≤ r² as two squares.

= 4 { [r²] – [r²/3] + [r²/5] – [r²/7] + [r²/9] ……..} ..(ii)

Now the ‘least integer’ series [r²] – [r²/3] + [r²/5] – [r²/7] + [r²/9] …

unlike the series r² – r²/3 + r²/5 –r²/7 + …… is not an infinite series since it terminates as soon as we reach the point where r²/(4k+1) < 1 making all subsequent terms = 0.

We assume for simplicity that r is odd and of the type 4k+1 so that r–1 is a multiple of 4. Since all the terms with 4k+1 as denominator are positive, we can split the series into two, and then add up the pairs, where the first member of a pair is taken from the + and the second from the – Series. The ‘+ Series’ contains [r²/r] and the final non-zero term is [r²/(r²].

[r²/1] + [r²/5] + [r²/9] …+ [r²/r] + [r²/(r+4] + ……..[r²/r²]

0 + [r²/3] + [r²/7] …+ [r²/(r–2)] + [r²/(r+2)] +…[r²/(r²–2)]

If we cut off the series at [r²/r] the error involved, namely the rest of the original series, will be less than r, or a r where a is some proper fraction, i.e.

[r²/(r+4] – [r²/(r+2)] …………… – [r²/(r² – 2)] + [r²/r²] < r

To see this, we write all terms after [r²/r] as [r²/(r+k)] where k is even and ranges from 2 to (r² – r) since (r+ (r² – r)) is the denominator of the final non-zero term. The absolute values of all these terms are less than [r²/r] = r and they come in pairs which alternate in sign.

Also, all terms where 2(r+k) > r² or k > (r²/2 – r) = r(r – 2)/2 will make [r²/(r+k)] = 1 The first such term comes when k = r(r – 2)/2 + 1/2 (since k is even) i.e. when k = (r–1)²/2 From this point on all pairs will sum to zero so we can ignore them and only need consider the pairs between [r²/r] and ending [r²/(r+(r–1)²/2)]. There will be (r–1)²/8 such pairs with a maximum difference of 1 in each case, and so the sum total of the error cannot exceed (r–1)²/8 < r since (r–1)² < 8r for r ≥ 2

An example may make this more intelligible. Take r = 9 which is a number of the form 4k+1. Then [9²/9] = 9 and all terms from then on have their absolute values < 9 while the final last term is [9²/9²] = [9²/(9+72)] The last term where [9²/(9+k)] ≥ 2 comes when k = 30 and we can neglect all pairs where k has values > 32 (we make the last value k = 32 to make up the pair). k even thus ranges from 2 to 32

2 4

6 8

10 12

…………

30 32

The maximum absolute amount possible will thus be 32/4 = 8 (in this case (–8)) and 8 < 9 = r

A similar argument can be used to establish the case where r is odd and of the form 4k+3 and any even value of r will be sandwiched between the two cases.

We thus have, returning to (ii)

¼ (f(r) – 1) = [r²] – [r²/3] + [r²/5] – [r²/7] + [r²/9] ± [r²/r] ± a r

where a < 1

To lift the square brackets, we note that the error in each term is less than 1 and that there will be, for r odd, (r+1)/2 terms if we cut the series off at [r²/r]. The total possible error is thus < (r+1)/2 < r for r ³ 2 and can be written as ± b r where b < 1

We can thus write

¼ (f(r) – 1) = r² – r²/3 + r²/5 – r²/7 ……. ± a r ± b r ………..(iii)

Dividing right through by r² we obtain

1/4r² (f(r) – 1) = 1– 1/3 + 1/5 – 1/7 ……. ± a/r ± b/r

1/4 (f(r)/r² – 1/r²) = 1 – 1/3 + 1/5 – 1/7 ……. ± a/r ± b/r

which has limit as r → ∞ f(r)/4r² = 1 – 1/3 + 1/5 – 1/7 …….

Finally, we note that the discrepancy between the area of the circle and the lattice representation is

|f(r)/r²– π| ≤ 4 √2 π/r with limit 0 as r → ∞ giving us the desired

limit r → ∞ 1 – 1/3 + 1/5 – 1/7 + 1/9 ……. ± 1/(2r+1) = π/4

Sebastian Hayes

Sebastian Hayes