Differences as precursors to differentials

When Leibniz came to Paris in 1672, his mathematical knowledge was rather scant. It was through his acquaintance with Ch. Huygens, one of the leading mathematicians in Europe at the time, that Leibniz’s interest in mathematics was sparked. One of the problems Huygens proposed was that of finding the sum of the series of reciprocals of the triangular numbers

\displaystyle{\frac 1{1}+\frac 1{3}+\frac 1{6}+\frac 1{10}+\dots.}

Realizing that the terms in the series were the successive differences between the terms of the sequence

\displaystyle{\frac 2{1}, \frac 2{2}, \frac 2{3},\dots,}

Leibniz concluded that the nth partial sum of the former series equals the difference between the first term and the (n+1)-th term of the latter, that is, 2-\frac 2{n+1} (this device is what we call telescoping). He developed this idea, considering sequences of differences of differences (second differences), third differences, etc. Thus, one can move “up” and “down” along the sequence of successive differences, establishing relations between the sums of differences and the net change of the generating sequence.
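Explicitly, writing each term as a difference makes the cancellation visible:

\displaystyle{1+\frac 13+\frac 16+\dots+\frac{2}{n(n+1)}=\left(\frac 21-\frac 22\right)+\left(\frac 22-\frac 23\right)+\dots+\left(\frac 2n-\frac 2{n+1}\right)=2-\frac 2{n+1},}

and, letting n grow beyond all bounds, the sum of the whole series is 2, which answers Huygens’ question.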

Perhaps the simplest example is that of arithmetic sequences \{a_n\}. In this case, the sequence of differences is constant, b_n=a_{n+1}-a_n=d. The sum of the first n differences is just nd, hence a_{n+1}-a_1=nd, or a_{n+1}=a_1+nd. Arithmetic sequences are just linear functions of n.
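For instance, the sequence 5, 8, 11, 14,\dots has first term a_1=5 and constant difference d=3, so a_{n+1}=5+3n.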

Now, what if the differences of the original sequence form an arithmetic sequence and the second differences are constant?

Suppose the original sequence is a_1,a_2,\dots, a_{n}, the sequence of first differences is b_1,b_2,\dots, b_{n-1}, where b_i=a_{i+1}-a_i, and the sequence of second differences is c_1,c_2,\dots, c_{n-2}, with c_i=b_{i+1}-b_i=d, a constant. We know that \{b_i\} is an arithmetic sequence with difference d, so b_i=b_1+(i-1)d. Hence

a_{n+1}-a_n=b_n=b_1+(n-1)d

a_{n}-a_{n-1}=b_{n-1}=b_1+(n-2)d,

\dots\dots

a_{2}-a_{1}=b_{1}

Adding all the relations above and taking into account the cancellations on the left-hand side (telescoping effect), we obtain

a_{n+1}=a_1+nb_1+d(1+2+\dots +(n-1))

or

\displaystyle{a_{n+1}=a_1+nb_1+d\frac{n(n-1)}{2}}\qquad\qquad (!)

Observe the following: a) a_{n+1} is given by a second-degree polynomial in n; b) the free term is the first element of the sequence, a_1, the coefficient of the linear term is b_1, the first “first difference”, and the coefficient of the quadratic part \frac{n(n-1)}{2} is d, the constant value of the second differences. At this point it should be clear that we can extend this procedure to the case when the third differences or, more generally, the differences of a certain order k are constant. Unsurprisingly, we get polynomials of degree equal to the order of the constant difference, whose coefficients depend only on the first values of the consecutive differences. This is what we could call a “discrete” (and finite) Taylor series.
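As a quick check, take the sequence of perfect squares 1, 4, 9, 16,\dots. Its first differences are 3, 5, 7,\dots and its second differences are constantly 2, so a_1=1, b_1=3, d=2, and formula (!) gives

\displaystyle{a_{n+1}=1+3n+2\cdot\frac{n(n-1)}{2}=n^2+2n+1=(n+1)^2,}

as it should. The same verification can be carried out mechanically; here is a minimal sketch in Python (the helper differences is only for illustration):

```python
# Minimal sketch: check formula (!) on a sequence with constant second differences.

def differences(seq):
    """First differences of a sequence: seq[i+1] - seq[i]."""
    return [b - a for a, b in zip(seq, seq[1:])]

a = [(i + 1) ** 2 for i in range(12)]   # a_1, ..., a_12 = 1, 4, 9, ..., 144
b = differences(a)                      # first differences: 3, 5, 7, ...
c = differences(b)                      # second differences: 2, 2, 2, ...

a1, b1, d = a[0], b[0], c[0]
for n in range(1, len(a)):
    # formula (!): a_{n+1} = a_1 + n*b_1 + d*n*(n-1)/2
    assert a[n] == a1 + n * b1 + d * n * (n - 1) // 2
```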

Leibniz was, above all, a philosopher. He realized that he could extend this method to functions of a continuous variable. But then he would have to replace the differences by infinitesimal differences between two “successive” values of the variable. He had been thinking about the concept of infinitesimal for years, particularly through correspondence with Hobbes and his concept of “conatus”. All the pieces came together in his mind, leading to the creation of infinitesimal Calculus within a few years. Simultaneously, he introduced the notation for successive “differentials”: dy, d^2y, etc., and integrals (“summa omnia”) \int, \iint,\dots for the above processes of moving “down” and “up”, but this time applied to functions of a continuous variable. The passage n\to (n+1) becomes x\to x+dx. We deal with the continuous case in the next post.

Infinitesimals and their orders

Nowadays, we conceive of Calculus as the study of functions. The concept of function, which originated with Leibniz, was not defined in its full generality until the mid-1800s, by Dirichlet. But in the early development of Calculus, the main objects of study were variable quantities, notably quantities that varied together. For example, Euler considered quantities that vanish or increase infinitely in his famous Introductio in analysin infinitorum, from 1748. Such a dynamic view of quantities is one of the features that have been lost or, at least, obscured by the modern approach, based on limits. The \epsilon-\delta definition is a bit too “algebraic” and static: the dynamics is encoded in the conditional “\forall\epsilon\,\,\exists\delta such that…”. Euler’s “quantities”, in contrast, had a variable nature.

A quantity that we can consider as becoming ever smaller is called an infinitesimal. A possible mental image is that of a segment which increasingly diminishes until it becomes a single point, or an angle whose sides become closer and closer until they coincide. In modern terminology, an infinitesimal is a variable whose limit is zero. If a variable quantity x is approaching some (finite) value L, the difference x-L is an infinitesimal.

Calculus is concerned with simultaneous variations of related quantities. Thus, for example, we can consider the variation of the area of a square of side l=1 when its side is increased or decreased by an infinitesimal amount. Such an infinitesimal amount was denoted by Leibniz, the Bernoullis, Euler, etc., and even today by physicists, engineers, etc., by dl, and is called a differential. A simple computation gives the new area

A=(1+dl)^2=1+2dl+(dl)^2

The corresponding change of area is A-1, which is another infinitesimal. In modern terminology, we say that the area is a continuous function of the side: if the side is infinitesimally increased/decreased, the area changes infinitesimally. In most applications to Physics and Engineering, we deal with continuous functions.
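To fix ideas with ordinary numbers: if the side is increased by dl=0.01, the new area is (1.01)^2=1.0201, so the change of area is 0.0201=2(0.01)+(0.01)^2, and the term (dl)^2 is already negligible compared with 2\,dl.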

A central idea of Calculus is that infinitesimals can be classified according to the relative speed with which they approach zero. It does not make sense to talk about the speed of a single infinitesimal, but it does make sense to ask whether two related infinitesimals (such as the ones considered above) approach zero at a comparable speed. Thus, if h is an infinitesimal quantity, the related infinitesimal h^2 approaches zero much faster, since their ratio h^2/h=h is itself infinitesimal. We all know that if we look at a sequence of values like 1,0.1,0.01,\dots, their squares form a sequence that approaches zero much faster: 1,0.01,0.0001,\dots. The infinitesimal h^3 approaches zero even faster, since the ratio h^3/h^2=h is again an infinitesimal.

These considerations lead to the concept of order of infinitesimals. Namely, given two related infinitesimals h and k, we say that k is of a higher order (than h) if their ratio k/h is infinitesimal. A nice notation introduced by E. Landau and widely used in Computer Science is that of “little o”. Using the “little o” notation, we write

k=o(h)

In modern terminology, we would say: given two functions f and g with \lim_{x\to a}f(x)=\lim_{x\to a}g(x)=0, we say that f=o(g) if \lim_{x\to a}f(x)/g(x)=0. This is nice and clean, but one has the impression that the dynamics is somehow lost.
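For example, taking a=0, f(x)=1-\cos x and g(x)=x, we have \lim_{x\to 0}\frac{1-\cos x}{x}=0, so 1-\cos x=o(x): the infinitesimal 1-\cos x is of higher order than x.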

In many instances, two related infinitesimals are “comparable”. Back to the previous example of the area as a function of the side, the quotient

\frac{A-1}{dl}=2+dl

is not an infinitesimal; instead, it takes values closer and closer to 2. In such cases we use the “big O” notation. For comparable related infinitesimals k and h as above, we would write

k=O(h)

Thus, A-1=O(dl).

In the particular case when the ratio approaches 1, we say that the infinitesimals are equivalent. That means, of course, that the two infinitesimals take ever closer values as they vanish. Equivalence is denoted by the symbol “\sim”.
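In the running example of the square, \frac{A-1}{2\,dl}=1+\frac{dl}{2}, which approaches 1, so A-1\sim 2\,dl: the change of area is equivalent to twice the change of the side.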

It seems to me that the qualitative aspect of the concept of order is not sufficiently stressed. Even the suggestive “little o” notation is absent from many of the standard current undergraduate Calculus textbooks, including J. Stewart’s, Edwards & Penney, Larson & Edwards, and many others. I think this is very unfortunate, and is part of a general tendency to avoid “qualitative” and “synthetic” reasoning, favoring quantitative and procedural aspects instead.

An interesting infinitesimal is y=\sin x, where x is an infinitesimal angle. In the figure below, the radius is chosen to be one for simplicity. The following inequalities are obvious.

Area(OAB)<Area(ODB)<Area(ODC)

or, equivalently,

\frac{OA\cdot AB}{2}<\frac{x}{2}<\frac{CD}{2}

which, after dividing throughout by AB/2, becomes

OA<\frac{x}{AB}<\frac{CD}{AB}=\frac 1{OA},

the latter equality being a consequence of similarity of triangles OAB and ODC.

Clearly, as x approaches zero, OA approaches one. As a consequence, the ratio \frac x{AB}, being trapped between two quantities approaching one, also approaches one. Using the above terminology, AB=\sin x and x are equivalent infinitesimals, \sin x\sim x. In the language of limits,

\lim\limits_{x\to 0}\frac{\sin x}{x}=1.
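Numerically, \frac{\sin 0.1}{0.1}\approx 0.99833 and \frac{\sin 0.01}{0.01}\approx 0.99998, in agreement with the limit.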

A vivid example of infinitesimals of different orders is given by the different segments, areas, etc. determined on a circle by an infinitesimal angle \theta.

When the angle \theta is infinitesimal, the following quantities (functions of the angle) are related infinitesimals: the chord a, the arc s, the height h of the circular segment, and the area of the yellow circular segment. The arc s=R\theta is just a linear function of \theta, so s/\theta=R, which is a non-zero constant. Therefore s=O(\theta). Next, we see that the chord is a=2R\sin(\theta/2) which, according to the previous example, is equivalent to 2R\theta/2=R\theta. Therefore, a=O(\theta). As for h, we have

h=R\left(1-\cos(\theta/2)\right)=2R\sin^2(\theta/4).

Since \sin(\theta/4)\sim\theta/4, we have \sin^2(\theta/4)\sim\theta^2/16, so h\sim R\theta^2/8 and therefore h=o(\theta).
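For instance, with R=1 and \theta=0.1, we get s=0.1, a=2\sin(0.05)\approx 0.09996 and h=2\sin^2(0.025)\approx 0.00125\approx\theta^2/8: the chord and the arc are practically indistinguishable, while h is much smaller, being of higher order.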

Finally, the area of the segment is

A=\frac{R^2\theta}{2}-\frac{R^2\sin\theta}{2}=R^2(\theta-\sin\theta)/2

On the right-hand side we have, up to the constant factor R^2/2, the difference of two equivalent infinitesimals, \theta and \sin\theta. What is its order? I will leave the question open at this point, and will come back to it in my next post, where we will deal with successive differences and differentials.