…many mathematicians think in terms of infinitesimal quantities: apparently, however, real mathematicians would never allow themselves to write down such thinking, at least not in front of the children. —Bill McCallum
The concepts of differentiable function and differential introduced in a previous post were related to the expansion

$$f(x + \Delta x) = f(x) + f'(x)\,\Delta x + o(\Delta x),$$

valid as $\Delta x \to 0$. During the seventeenth and eighteenth centuries, mathematicians thought of $\Delta x$ and $\Delta y$ as actual infinitesimals “à la Leibniz”. After the advent of the rigorous concept of limit, the point of view changed. The above relation is now mostly used to introduce the derivative $f'(x)$, which plays a central role, and both $dx$ and $dy$ are deprived of their infinitesimal nature. Namely, $dx = \Delta x$ is understood as an arbitrary, finite increment of the “independent” variable, whereas $dy = df(x, dx) = f'(x)\,dx$ is a function of $(x, dx)$, linear in its second argument. $df$ is thus a functional differential, and the relation $dy = f'(x)\,dx$ is true by definition. This point of view stresses the notion of derivative, rendering differentials superfluous, some sort of annoyance of the past, used only in the context of linear approximation; see [1].
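As an illustration of the functional differential, here is a minimal SymPy sketch (the example function $f(x) = x^2$ is chosen here, not taken from the previous post):

```python
import sympy as sp

# A minimal sketch: for f(x) = x**2 (an example function chosen here),
# the functional differential df(x, dx) = f'(x)*dx is the part of the
# exact increment f(x + dx) - f(x) that is linear in dx.
x, dx = sp.symbols('x dx')
f = x**2

df = sp.diff(f, x) * dx                    # 2*x*dx, linear in dx
Delta = sp.expand(f.subs(x, x + dx) - f)   # 2*x*dx + dx**2

print(Delta - df)  # dx**2 -- the leftover is of higher order in dx
```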
However, in Engineering and Physics practice, differentials come first and are perceived as actual infinitesimals, while derivatives are obtained as ratios. This seems more natural, the concept of ratio being psychologically more complex than that of infinitesimal. But that is not the only advantage. Indeed, as we will show through examples, reasoning with differentials as infinitesimals leads to shorter and clearer proofs of propositions and solutions to many problems. One should concede, however, that the notion of derivative is easier to put on a firm basis in the language of limits, while infinitesimals are logically problematic.
A more symmetric point of view, where no distinction between “independent” and “dependent” variables is made, is very fruitful. As an example, suppose the quantities $x$ and $y$ are linked by the relation

$$xy = 1.$$

If we increase $x$ and $y$ by $\Delta x$ (respectively $\Delta y$) in a way consistent with the above relation,

$$(x + \Delta x)(y + \Delta y) = 1,$$

we have

$$y\,\Delta x + x\,\Delta y + \Delta x\,\Delta y = 0.$$

The latter is a condition for the finite increments $\Delta x$ and $\Delta y$ to be “compatible” with $xy = 1$. By considering infinitesimal increments $dx$ and $dy$ instead, we can “filter” the above relation by keeping only the terms which are linear in $dx$ and $dy$:

$$y\,dx + x\,dy = 0,$$

where the quadratically small term $dx\,dy$ is dropped. This is a general principle in Differential Calculus: in any computation involving infinitesimals, one should keep the leading ones (i.e. the ones of lowest order) and drop those of higher order. This principle is followed by physicists, engineers and other users of differential calculus, and is deemed intuitively clear. It can ultimately be justified using the language of limits.
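The “filtering” step can be mimicked symbolically. Here is a minimal SymPy sketch (the scaling parameter $\varepsilon$ is a device of this sketch, not part of the original argument):

```python
import sympy as sp

# Linearize the compatibility condition (x + dx)*(y + dy) = 1:
# scale the increments by a small parameter and keep the first-order part.
x, y, dx, dy, eps = sp.symbols('x y dx dy epsilon')

increment = sp.expand((x + dx)*(y + dy) - x*y)  # y*dx + x*dy + dx*dy
linear = increment.subs({dx: eps*dx, dy: eps*dy}).expand().coeff(eps, 1)

print(linear)  # x*dy + y*dx -- i.e. the differential relation y*dx + x*dy = 0
```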
Thus, $y\,dx + x\,dy = 0$ is the differential relation between $x$ and $y$ corresponding to the “finite” relation $xy = 1$. We derived $y\,dx + x\,dy = 0$ from $xy = 1$. This time we are not differentiating a function, but rather an equation.
The opposite process is called “integration” or “solving a differential equation”. Of course, one cannot completely recover the finite relation from the differential one: the constant in $xy = 1$ could be replaced by any other constant without altering the resulting differential relation.
In line with a more “algebraic” point of view, differential equations in modern textbooks are written in the form

$$F(x, y, y') = 0,$$

where $F$ is a general function of three variables and the unknown is a function $y = y(x)$. This approach carries some unpleasant consequences. For example, the differential relation $x\,dx + y\,dy = 0$, obtained as above from the circle $x^2 + y^2 = 1$, may be presented in modern notation as

$$y' = -\frac{x}{y},$$

and solutions are sought as functional relations on some interval. Thus, the function $y = \sqrt{1 - x^2}$ on the interval $(-1, 1)$ is a solution of this equation. The function $y = -\sqrt{1 - x^2}$ on the same interval is also a solution. But it would definitely be much nicer to be able to say that the full circle $x^2 + y^2 = 1$ we started with is a solution of the symmetric differential equation $x\,dx + y\,dy = 0$, as well as any other circle centered at the origin. Some textbooks deal with the issue by saying that the equation $y' = -x/y$ actually presupposes the simultaneous consideration of

$$\frac{dy}{dx} = -\frac{x}{y} \quad \text{and} \quad \frac{dx}{dy} = -\frac{y}{x},$$

where we look for solutions $y = y(x)$ or $x = x(y)$. Then, the full set of solutions (to be precise, the non-prolongable ones) is made of the right, left, upper and lower half-circles. This is a consequence of insisting on functional relations and finite rates, where the variables play an asymmetric role. This was never an issue for Leibniz, Huygens, the Bernoullis, L’Hôpital or Euler. Much less for Newton, whose independent variable was always time, an extrinsic parameter. Equations in “differential form” like

$$x\,dx + y\,dy = 0$$

are still presented in many texts, especially those for Engineers. The solutions are then given in the natural implicit form $x^2 + y^2 = C$, apparently oblivious to the initial definition of a solution as a functional relation.
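The asymmetry can be seen concretely by asking a computer algebra system for the functional solutions; a minimal SymPy sketch (a standard `dsolve` call, though the exact form of the output may vary between versions):

```python
import sympy as sp

# Solving y' = -x/y as a functional equation yields only the half-circle
# branches; the full circles x**2 + y**2 = C appear only through the
# implicit (symmetric) description.
x = sp.symbols('x')
y = sp.Function('y')

sols = sp.dsolve(sp.Eq(y(x).diff(x), -x/y(x)))
print(sols)  # typically [Eq(y(x), -sqrt(C1 - x**2)), Eq(y(x), sqrt(C1 - x**2))]
```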
Back to the above procedure: in order to get $y\,dx + x\,dy = 0$ from $xy = 1$, it is assumed that both $dx$ and $dy$ are infinitesimals, and

$$(x + dx)(y + dy) = 1$$

is satisfied up to infinitesimals of order higher than $dx$ and $dy$. This process is called linearization. More generally, given an implicit relation

$$F(x_1, \ldots, x_n) = c$$

between the variables $x_1, \ldots, x_n$, we replace $x_i$ by $x_i + dx_i$ on the left-hand side and impose the equality

$$F(x_1 + dx_1, \ldots, x_n + dx_n) = c$$

up to infinitesimals of order higher than the $dx_i$ (quadratic, cubic, etc.). That leads to a differential relation

$$dF = 0,$$

valid “along” $F = c$ (see below).
In order to reconcile the above linearization process with our previous concept of functional differential, one can consider a new quantity $u$ defined by

$$u = F(x_1, \ldots, x_n)$$

and proceed as in the case of one independent variable, i.e. expand the change of $u$ as a sum of powers of the independent increments $\Delta x_1, \ldots, \Delta x_n$ of $x_1, \ldots, x_n$, respectively. For the sake of simplicity, assume there are two independent variables, $x$ and $y$. Then, if

$$\Delta u = F(x + \Delta x, y + \Delta y) - F(x, y)$$

can be written in the form

$$\Delta u = A\,\Delta x + B\,\Delta y + \text{higher order terms}$$

(where the coefficients $A$ and $B$ depend only on the point $(x, y)$, and the higher order terms contain higher powers of $\Delta x$ and $\Delta y$, products like $\Delta x\,\Delta y$, etc.), we say that $F$ is a differentiable function of its arguments, and call the linear part

$$du = A\,\Delta x + B\,\Delta y$$

the differential of $u$ (or $F$). The coefficients $A$ and $B$ are called the partial derivatives of $F$ with respect to $x$ (respectively, $y$), denoted as

$$A = \frac{\partial F}{\partial x}, \qquad B = \frac{\partial F}{\partial y}.$$
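In SymPy this recipe reads as follows; a minimal sketch, using the running example $F(x, y) = xy$ and the same $\varepsilon$-scaling device as before:

```python
import sympy as sp

# The differential of F is the part of the increment that is linear in the
# (independent) increments, with the partial derivatives as coefficients.
x, y, dx, dy, eps = sp.symbols('x y dx dy epsilon')
F = x*y  # the running example

dF = sp.diff(F, x)*dx + sp.diff(F, y)*dy  # y*dx + x*dy

Delta = sp.expand(F.subs({x: x + eps*dx, y: y + eps*dy}) - F)
assert Delta.coeff(eps, 1) == sp.expand(dF)  # the linear part equals dF
print(dF)
```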
Summing up, when differentiating an equation $F(x, y) = c$, we differentiate the function on the left-hand side as if the variables were independent. The obtained differential relation

$$dF = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy = 0,$$

however, only holds if $dx$ and $dy$ are compatible with $F = c$ to first order. Next, we present a nice geometric interpretation of the notion of compatibility.
The point of view of Differential Geometry
There is a very nice interpretation of the linearization process in the language of Differential Geometry. We can think of our relation $F(x_1, \ldots, x_n) = c$ as a (hyper)surface in the space of the variables $x_1, \ldots, x_n$, contained in the domain of $F$. Then,

$$dF = \frac{\partial F}{\partial x_1}\,dx_1 + \cdots + \frac{\partial F}{\partial x_n}\,dx_n$$

is an (exact) differential form, defined on the domain of $F$. At each point in this domain, it is a linear function of the differentials $dx_1, \ldots, dx_n$. For example, the above relation $xy = 1$ defines a hyperbola in the $(x, y)$-plane, and the corresponding differential form is

$$d(xy) = y\,dx + x\,dy.$$

Then, the relation $y\,dx + x\,dy = 0$ holds along $xy = 1$ or, more generally, along any level set of $F(x, y) = xy$. More precisely, it holds when the differentials $(dx_1, \ldots, dx_n)$ are the components of a vector tangent to the hypersurface $F = c$. Those familiar with Multivariable Calculus will recognize the condition $dF = 0$ as the requirement for the vector $(dx_1, \ldots, dx_n)$ to be perpendicular to the gradient vector $\nabla F$, which is in turn perpendicular to the level set $F = c$. This is the geometric meaning of the “differential increment being compatible with the given relation”, illustrated in the figure below.

[Figure: a level set of $F$, with the gradient $\nabla F$ and a tangent vector with components $(dx, dy)$ at one of its points.]

More generally, if $F$ is a scalar function defined on a manifold, the tangent space to the submanifold $\{F = c\}$ at each of its points is the kernel of the differential form $dF$ at that point. Solving an equation in “differential form” is precisely finding a submanifold whose tangent spaces are the kernels of the given form.
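As a concrete check of this kernel condition for the hyperbola example (the point and the parametrization below are chosen here for illustration):

```python
import sympy as sp

# At a point of the hyperbola x*y = 1, a tangent vector lies in the
# kernel of dF = y*dx + x*dy, i.e. it is orthogonal to grad F = (y, x).
x, y = sp.symbols('x y')
F = x*y

p = {x: 2, y: sp.Rational(1, 2)}                          # a point with x*y = 1
grad = sp.Matrix([sp.diff(F, x), sp.diff(F, y)]).subs(p)  # (1/2, 2)

# Along y = 1/x we have dy = -dx/x**2, so (dx, dy) = (1, -1/4) at x = 2:
tangent = sp.Matrix([1, -sp.Rational(1, 4)])

print(grad.dot(tangent))  # 0 -> the tangent vector is in the kernel of dF
```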
There is no trace of infinitesimal quantities in the above linearization procedure. After all, there is no restriction on the size of the vector $(dx_1, \ldots, dx_n)$; it just needs to be tangent to the hypersurface $F = c$. However, among Physicists, Engineers and other practitioners of Calculus, it is common to assume that the differentials of the variables involved are actual infinitesimals, and higher order terms are dropped by virtue of their relative smallness. Formally, both approaches lead to the same result, but the latter can often be used as a computational shortcut and is more intuitive.
Yet another advantage of differential relations like $dF = 0$ is that if the variables $x_1, \ldots, x_n$ depend on further variables $t_1, \ldots, t_m$, we obtain a valid differential relation between the new variables by just considering $dx_1, \ldots, dx_n$ as functional differentials of the new independent variables and substituting accordingly. This formal invariance of the first differential was already pointed out by Leibniz and is very useful in applications. From the point of view of derivatives, it is nothing but the chain rule. From an abstract point of view, it is a consequence of linearity, and thus does not extend to higher order differentials.
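A minimal sketch of this invariance (the parametrization $x = t^2$, $y = t^{-2}$ of the hyperbola is an arbitrary choice made here):

```python
import sympy as sp

# Substituting a parametrization of x*y = 1 together with the functional
# differentials dx = x'(t)*dt, dy = y'(t)*dt turns y*dx + x*dy = 0 into an
# identity -- the formal invariance of the first differential.
t, dt = sp.symbols('t dt')

x_t, y_t = t**2, t**-2        # a parametrization of the hyperbola x*y = 1
dx_t = sp.diff(x_t, t)*dt     # 2*t*dt
dy_t = sp.diff(y_t, t)*dt     # -2*dt/t**3

print(sp.simplify(y_t*dx_t + x_t*dy_t))  # 0
```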
In forthcoming posts, we will present some examples of the use of differentials thought of as infinitesimals, along with applications of the above calculus with differentials to Physics, to the solution of optimization problems, to related rates problems, etc.
References:
[1] T. Dray and C. A. Manogue, “Putting Differentials Back into Calculus”, The College Mathematics Journal 41 (2), 2010.