Differentials in action

Before we move on to concrete examples, let us quickly review some basic differentiation rules. In order to differentiate equations of the form

F(u,v,w,\dots)=0

we need to replace all the variables u,v,w,\dots by their perturbed values u+du,v+dv,w+dw,\dots and disregard infinitesimals which are of order higher than one with respect to du, dv,dw, \dots in the resulting expression.

For example, if F(u,v)=Au+Bv for some constants A,B, we have

F(u+du,v+dv)=A(u+du)+B(v+dv)=(Au+Bv)+Adu+Bdv=Adu+Bdv,

since Au+Bv=0. In this case, the expression is linear in du and dv and, therefore

d(Au+Bv)=Adu+Bdv.
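This expand-and-discard recipe is easy to verify with a computer algebra system. Here is a minimal sketch using sympy (our choice of tool, not part of the original argument):

```python
import sympy as sp

# Symbols standing for the variables and their infinitesimal increments.
A, B, u, v, du, dv = sp.symbols('A B u v du dv')

# Perturb u and v in F = A*u + B*v and subtract the unperturbed value:
increment = sp.expand(A*(u + du) + B*(v + dv) - (A*u + B*v))
print(increment)  # the linear part A*du + B*dv
```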

To differentiate an equation involving a product, F(u,v)=uv, we notice that

F(u+du,v+dv)=uv+udv+vdu+dudv

and, since uv=0, we get

d(uv)=udv+vdu.
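The product rule can be checked the same way; the sketch below (again using sympy, purely as an illustration) expands the perturbed product and discards the single second-order term:

```python
import sympy as sp

u, v, du, dv = sp.symbols('u v du dv')

# Expand the perturbed product:
perturbed = sp.expand((u + du) * (v + dv))

# Remove the unperturbed value u*v and the second-order term du*dv;
# what remains is the differential d(uv).
d_uv = perturbed - u*v - du*dv
assert d_uv == u*dv + v*du
print(d_uv)
```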

Let us next find dF for F(u,v)=\frac {u}{v}:

\displaystyle{F(u+du,v+dv)=\frac{u+du}{v+dv}=\frac{u}{v}\frac{1+\frac{du}{u}}{1+\frac{dv}{v}}}

Now, we observe that

\displaystyle{\left(1+\frac{dv}{v}\right)\left(1-\frac{dv}{v}\right)=1-\frac{(dv)^2}{v^2}}.

Hence, up to linear terms,

\displaystyle{\frac 1{1+\frac{dv}{v}}=1-\frac{dv}{v}}

and therefore

\displaystyle{F(u+du,v+dv)=\frac{u}{v}\left(1+\frac{du}{u}\right)\left(1-\frac{dv}{v}\right)=\frac{u}{v}\left(1+\frac{du}{u}-\frac{dv}{v}\right)},

where we have disregarded the quadratic term \frac{dudv}{uv}. Finally, since F(u,v)=\frac{u}{v}=0, we get

\displaystyle{dF=\frac {vdu-udv}{v^2}}.
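To confirm the quotient rule mechanically, one can scale the increments by an auxiliary parameter t and read off the first-order coefficient of a series expansion. A sympy sketch of this check (our construction, not from the original):

```python
import sympy as sp

u, v, du, dv, t = sp.symbols('u v du dv t', positive=True)

# Perturb along the direction (du, dv), scaled by t:
F = (u + t*du) / (v + t*dv)

# Expand to first order in t; the coefficient of t is the differential dF.
linear_term = sp.expand(sp.series(F, t, 0, 2).removeO()).coeff(t)
assert sp.simplify(linear_term - (v*du - u*dv)/v**2) == 0
```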

Successive applications of the product rule give the important power rule:

d(u^2)=udu+udu=2udu;\qquad d(u^3)=ud(u^2)+u^2du=3u^2du

and, in general, d(u^n)=nu^{n-1}du. For negative integer powers the same rule applies, as follows from the quotient and power rules. Combining the above rules, we can differentiate any equation involving polynomial or rational functions.

What about irrational/transcendental functions? For n-th roots, write w=u^{1/n}, so that w^n=u. The power rule gives nw^{n-1}dw=du and, solving for dw,

\displaystyle{d(u^{1/n})=\frac{1}{n}u^{1/n-1}du}.

Therefore, the power rule extends to rational powers.
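The power rule, for integer as well as rational exponents, is also easy to sanity-check symbolically; a short sympy sketch of ours:

```python
import sympy as sp

u = sp.symbols('u', positive=True)

# Integer powers, positive and negative:
for n in [-3, -1, 2, 5]:
    assert sp.diff(u**n, u) == n * u**(n - 1)

# A rational power (the cube root):
r = sp.Rational(1, 3)
assert sp.simplify(sp.diff(u**r, u) - r * u**(r - 1)) == 0
```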

Let’s move now to the fun part.

Reasoning with infinitesimals

Suppose we want to compute d(\sin x). Modern textbooks focus instead on the derivative d(\sin x)/dx, and the computation goes more or less like this:

\displaystyle{\frac{d(\sin x)}{dx}=\lim_{h\to 0}\frac{\sin(x+h)-\sin x}{h}=\lim_{h\to 0}\cos\left(x+\frac{h}{2}\right)\frac{\sin(h/2)}{h/2}=\cos x},

where the trigonometric identity for the difference of sines has been used, as well as the fact that \sin t/t\to 1 as t\to 0. All this is nice and clean but does not provide compelling intuition. Next, we provide a proof based on infinitesimals. No computation needed. It is essentially a picture-based proof.
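For readers who prefer a numerical check: the difference quotient of \sin does approach \cos x as the increment shrinks. A small Python sketch, with an arbitrary sample point x=0.7 of our choosing:

```python
import math

x = 0.7  # an arbitrary sample angle
for h in (1e-2, 1e-4, 1e-6):
    quotient = (math.sin(x + h) - math.sin(x)) / h
    # The gap to cos(x) shrinks proportionally to h:
    print(h, quotient - math.cos(x))
```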

In the figure below, the unit circle is represented, and we consider a generic angle x being increased by an infinitesimal amount dx.

The corresponding change of \sin x is d(\sin x)=EB-AD=EB-CE=BC. The triangles \Delta OEB and \Delta ABC are similar since they are right and \angle BOE=\angle ABC. Therefore,

\displaystyle{\frac{d(\sin x)}{dx}=\frac{BC}{AB}=\frac{OE}{OB}=\cos x}.

From the perspective of rigor, the above argument is flawed for many reasons: the “triangle” \Delta ABC is not really a rectilinear triangle, OE is actually \cos(x+dx), etc. BUT it provides a visual, compelling reason as to why d(\sin x)=\cos x\,dx, and the conclusion is absolutely correct. It is correct because we know that the vanishing circular arc AB is indistinguishable from its tangent, while \cos(x+dx) is indistinguishable from \cos x.

Admittedly, this type of argument relies on the ability to recognize which approximations will become exact in the limit. We believe this is a skill worth developing.

Here is another example mentioned in the beautiful book [1] and attributed there to Newton. Let us show that d(\tan\theta)=\sec^2\theta\, d\theta. In the figure below, |AB|=1 and \angle ABC is right. Then \tan\theta=|BD|. We increase the angle \theta by an infinitesimal amount d\theta; the tangent is then increased by d(\tan\theta)=|CD|. Let DE be the arc of a circle centered at A, with radius L=|AD|=\sec\theta, so that |ED|=Ld\theta.

The triangles \Delta ABC and \Delta EDC are similar and thus

\displaystyle{\frac{|CD|}{|ED|}=\frac{|AC|}{|AB|}}

or

\displaystyle{\frac{d(\tan\theta)}{Ld\theta}=L}

(because |AC| differs from L by an infinitesimal) and finally

\displaystyle{\frac{d(\tan\theta)}{d\theta}=L^2=\sec^2\theta}.
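Newton's geometric argument can be corroborated numerically as well; the sketch below (with an arbitrary angle \theta=0.5, our choice) compares the difference quotient of \tan with \sec^2\theta:

```python
import math

theta = 0.5  # an arbitrary sample angle
sec2 = 1.0 / math.cos(theta)**2  # sec^2(theta)
for dtheta in (1e-3, 1e-5, 1e-7):
    quotient = (math.tan(theta + dtheta) - math.tan(theta)) / dtheta
    print(dtheta, quotient - sec2)  # the gap shrinks with dtheta
```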

Solving problems using differentials

Let’s start with a simple optimization problem which is more elegantly solved using differentials.

Consider a sliding segment of length L, with endpoints on the positive X and Y semi-axes. What is the location of the segment for which the area of the triangle it determines in the first quadrant is maximal?

The standard approach to the problem is to express the area of the triangle as a function of some parameter (the horizontal projection of the segment, some angle, etc.) and find the value of the parameter that makes the derivative equal to zero. A more “symmetric” approach would be as follows. Let x\ge 0 and y\ge 0 be the lengths of the legs of the triangle. Then,

A=\frac{xy}{2};\qquad x^2+y^2=L^2.

Differentiating both relations,

dA=\frac{1}{2}(xdy+ydx);\qquad xdx+ydy=0.

A necessary condition for A to reach a local maximum is dA=0. Thus we get a system of equations

xdy+ydx=0;\qquad xdx+ydy=0.

For the above homogeneous linear system to have a non-trivial solution (so that infinitesimal changes (dx,dy) are allowed), we need a zero determinant: y^2-x^2=0, which in our case (x,y\ge 0) implies x=y. Then, from the second finite relation, we get x=y=\sqrt{2}L/2.
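The determinant condition together with the finite relation can also be handed to a computer algebra system; a sympy sketch of the computation just described (our illustration):

```python
import sympy as sp

x, y, L = sp.symbols('x y L', positive=True)

# Coefficient matrix of the homogeneous system in (dx, dy):
#   x*dy + y*dx = 0  ->  row (y, x)
#   x*dx + y*dy = 0  ->  row (x, y)
det = sp.Matrix([[y, x], [x, y]]).det()  # y**2 - x**2

# Zero determinant plus the finite relation x**2 + y**2 = L**2:
sol = sp.solve([det, x**2 + y**2 - L**2], [x, y], dict=True)
# The admissible (positive) solution is x = y = sqrt(2)*L/2.
assert any(sp.simplify(s[x] - sp.sqrt(2)*L/2) == 0 for s in sol)
```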

If we were to follow the approach based on derivatives, we would express y=\sqrt{L^2-x^2} and substitute, leading to the problem of maximizing the function of x

A(x)=x\sqrt{L^2-x^2}/2.

We would then find A'(x) and set it equal to zero. In this simple example there is no big difference, but notice that with our approach: a) no “independent variable” is singled out; b) we avoid the annoyance of differentiating a square root; c) L does not appear explicitly throughout the manipulation with differentials, but only at the beginning and at the end, when we appeal to finite relations (and this is, in principle, unavoidable given the nature of differential relations).

A similar example is presented in [2], where the classical problem of finding the shortest piecewise-straight path connecting two given points via a point on a given line (the so-called Heron's problem) is solved.

The given quantities (see Fig. above) are d=a+b, C and D, and the quantity to minimize is l=p+q. This problem involves more parameters and, no matter which quantity you choose as independent, the solution using derivatives is a bit messy. Using differentials, it reduces to a compatibility condition for a homogeneous linear system, as above. Namely, we have

p^2=C^2+a^2;\qquad q^2=D^2+b^2;\qquad a+b=d;\qquad p+q=l.

Differentiating all the equations gives

da+db=0;\qquad 2ada=2pdp;\qquad 2bdb=2qdq;\qquad dp+dq=dl.

The condition of extremum is dl=0. Expressing da and db in terms of dp and dq, we end up with a 2\times 2 homogeneous system for dp,\, dq:

\displaystyle{\frac{p}{a}dp+\frac{q}{b}dq=0;\qquad dp+dq=0}

The determinant should be zero, that is a/p=b/q. This in turn implies

\displaystyle{\frac{b^2}{a^2}=\frac{q^2}{p^2}=\frac{b^2+D^2}{a^2+C^2}}

and, consequently, b/a=D/C. Finally,

\displaystyle{a=\frac{Cd}{C+D};\qquad b=\frac{Dd}{C+D}}.

Observe in particular that a/C=b/D so the path should “reflect” on the line.
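A quick numerical check of the closed form, with hypothetical sample values C=3, D=1, d=8 (so the formula predicts a=Cd/(C+D)=6):

```python
import math

C, D, d = 3.0, 1.0, 8.0  # hypothetical sample values

def path_length(a):
    # l = p + q for the horizontal split a, b = d - a.
    return math.sqrt(C**2 + a**2) + math.sqrt(D**2 + (d - a)**2)

# Brute-force scan over the interval [0, d]:
a_best = min((i * d / 100000 for i in range(100001)), key=path_length)
print(a_best)  # close to C*d/(C + D) = 6.0
```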

In both examples above, we solved a constrained optimization problem. A possible alternative approach is the method of Lagrange multipliers. Thus, in our case, we would be trying to minimize

f(p,q)=p+q

under the constraint

a+b=g(p,q)=\sqrt{p^2-C^2}+\sqrt{q^2-D^2}=d.

The main advantage of that method is that it keeps the symmetry between the variables, at the expense of introducing a number of multipliers. In our next post, we will discuss how the method was introduced by Lagrange to deal with constraints in mechanical systems. Not surprisingly, infinitesimals were at the heart of his analysis, in the form of virtual displacements.
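To make the comparison concrete, here is a sympy sketch of ours of the Lagrange-multiplier computation; eliminating the multiplier recovers exactly the condition a/p=b/q found with differentials:

```python
import sympy as sp

p, q, lam = sp.symbols('p q lam', positive=True)
C, D, d = sp.symbols('C D d', positive=True)

f = p + q
g = sp.sqrt(p**2 - C**2) + sp.sqrt(q**2 - D**2) - d
lagrangian = f - lam * g

# Stationarity in p and q, each equation solved for the multiplier:
lam_p = sp.solve(sp.diff(lagrangian, p), lam)[0]
lam_q = sp.solve(sp.diff(lagrangian, q), lam)[0]

# lam_p = sqrt(p**2 - C**2)/p = a/p and lam_q = b/q, so equating the
# two expressions is precisely the reflection condition a/p = b/q.
assert sp.simplify(lam_p - sp.sqrt(p**2 - C**2) / p) == 0
assert sp.simplify(lam_q - sp.sqrt(q**2 - D**2) / q) == 0
```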

References:

[1] T. Needham, “Visual Complex Analysis”, Oxford University Press, 1999.

[2] T. Dray and C. A. Manogue, “Putting Differentials Back into Calculus”, The College Mathematics Journal 41 (2), 2010.