# Total derivative


In the mathematical field of differential calculus, the term total derivative has a number of closely related meanings.


## Differentiation with indirect dependencies

Suppose that f is a function of two variables, x and y. Normally these variables are assumed to be independent. However, in some situations they may be dependent on each other. For example y could be a function of x, constraining the domain of f to a curve in ${\mathbb {R} }^{2}$ . In this case the partial derivative of f with respect to x does not give the true rate of change of f with respect to changing x because changing x necessarily changes y. The total derivative takes such dependencies into account.

For example, suppose

$f(x,y)=xy$ .

The rate of change of f with respect to x is usually the partial derivative of f with respect to x; in this case,

${\frac {\partial f}{\partial x}}=y$ .

However, if y depends on x, the partial derivative does not give the true rate of change of f as x changes because it holds y fixed.

Suppose we are constrained to the line

$y=x;$ then

$f(x,y)=f(x,x)=x^{2}$ .

In that case, the total derivative of f with respect to x is

${\frac {{\mathrm {d} }f}{{\mathrm {d} }x}}=2x$ .

Instead of immediately substituting for y in terms of x, this can be found equivalently using the chain rule:

${\frac {{\mathrm {d} }f}{{\mathrm {d} }x}}={\frac {\partial f}{\partial x}}+{\frac {\partial f}{\partial y}}{\frac {{\mathrm {d} }y}{{\mathrm {d} }x}}=y+x\cdot 1=x+y,$

which equals $2x$ on the line $y=x$. Notice that this is not equal to the partial derivative:

${\frac {{\mathrm {d} }f}{{\mathrm {d} }x}}=2x\neq {\frac {\partial f}{\partial x}}=y=x$ .
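The distinction can be checked numerically. The following is a minimal sketch, not a definitive implementation: the helper `central_diff`, the step size `h`, and the evaluation point `x0 = 3.0` are illustrative choices that do not come from the text above.

```python
def f(x, y):
    """The example function f(x, y) = x * y."""
    return x * y


def central_diff(g, x, h=1e-6):
    """Central finite-difference approximation of g'(x) (illustrative helper)."""
    return (g(x + h) - g(x - h)) / (2 * h)


x0 = 3.0
y0 = x0  # the constraint y = x, evaluated at x0

# Partial derivative: vary x while holding y fixed at y0.
partial_x = central_diff(lambda x: f(x, y0), x0)

# Total derivative: vary x and let y follow the constraint y = x.
total_x = central_diff(lambda x: f(x, x), x0)

# Chain rule: df/dx = f_x + f_y * dy/dx = y + x * 1.
chain_rule = y0 + x0 * 1.0

print(partial_x)   # approximately 3.0, i.e. y
print(total_x)     # approximately 6.0, i.e. 2x
print(chain_rule)  # 6.0
```

The partial derivative reports only the direct dependence on x, while the total derivative also picks up the change in y forced by the constraint.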

While one can often perform substitutions to eliminate indirect dependencies, the chain rule provides a more efficient and general technique. Suppose $M(t,p_{1},\ldots ,p_{n})$ is a function of time t and n variables $p_{i}$ which themselves depend on time. Then, the total time derivative of M is

${\operatorname {d} M \over \operatorname {d} t}={\frac {\operatorname {d} }{\operatorname {d} t}}M{\bigl (}t,p_{1}(t),\ldots ,p_{n}(t){\bigr )}.$

The chain rule for differentiating a function of several variables implies that

${\operatorname {d} M \over \operatorname {d} t}={\frac {\partial M}{\partial t}}+\sum _{i=1}^{n}{\frac {\partial M}{\partial p_{i}}}{\frac {\operatorname {d} p_{i}}{\operatorname {d} t}}={\biggl (}{\frac {\partial }{\partial t}}+\sum _{i=1}^{n}{\frac {\operatorname {d} p_{i}}{\operatorname {d} t}}{\frac {\partial }{\partial p_{i}}}{\biggr )}(M).$

The operator in brackets (in the final expression) is also called the total derivative operator (with respect to t).

This expression is often used in physics for a gauge transformation of the Lagrangian, as two Lagrangians that differ only by the total time derivative of a function of time and the n generalized coordinates lead to the same equations of motion. An interesting example concerns the resolution of causality in the Wheeler–Feynman time-symmetric theory.

For example, the total derivative of f(x(t), y(t)) is

${\frac {\operatorname {d} f}{\operatorname {d} t}}={\partial f \over \partial x}{\operatorname {d} x \over \operatorname {d} t}+{\partial f \over \partial y}{\operatorname {d} y \over \operatorname {d} t}.$

Here there is no ∂f / ∂t term since f itself does not depend on the independent variable t directly.
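The general chain-rule formula for the total time derivative can be sketched in code as follows. This is only an illustration under stated assumptions: the function `total_time_derivative`, the finite-difference helper `central_diff`, and the test function $M(t,p_{1},p_{2})=tp_{1}+p_{2}^{2}$ with $p_{1}=\sin t$ and $p_{2}=t^{2}$ are hypothetical and do not appear in the text above.

```python
import math


def central_diff(g, x, h=1e-6):
    """Central finite-difference approximation of g'(x) (illustrative helper)."""
    return (g(x + h) - g(x - h)) / (2 * h)


def total_time_derivative(M, paths, t, h=1e-6):
    """Estimate dM/dt for M(t, p_1, ..., p_n) with p_i = paths[i](t),
    using dM/dt = dM/dt|_explicit + sum_i (dM/dp_i) * (dp_i/dt)."""
    p = [path(t) for path in paths]
    # Explicit time dependence: vary t with every p_i held fixed.
    result = central_diff(lambda s: M(s, *p), t, h)
    for i, path in enumerate(paths):
        def M_of_pi(v, i=i):
            q = list(p)
            q[i] = v  # vary only the i-th variable
            return M(t, *q)
        dM_dpi = central_diff(M_of_pi, p[i], h)
        dpi_dt = central_diff(path, t, h)
        result += dM_dpi * dpi_dt
    return result


# Hypothetical example: M(t, p1, p2) = t * p1 + p2**2.
def M(t, p1, p2):
    return t * p1 + p2 ** 2


paths = [math.sin, lambda t: t ** 2]  # p1(t) = sin t, p2(t) = t**2
t0 = 0.7

chain = total_time_derivative(M, paths, t0)
# Direct route: substitute the paths first, then differentiate in t.
direct = central_diff(lambda t: M(t, *(path(t) for path in paths)), t0)
# Closed form of d/dt (t sin t + t**4).
exact = math.sin(t0) + t0 * math.cos(t0) + 4 * t0 ** 3

print(chain, direct, exact)
```

Both routes should agree with the closed form up to finite-difference error; the chain-rule route never needs the substituted expression.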

## The total derivative via differentials

Differentials provide a simple way to understand the total derivative. For instance, suppose $M(t,p_{1},\dots ,p_{n})$ is a function of time t and n variables $p_{i}$ as in the previous section. Then, the differential of M is

$\operatorname {d} M={\frac {\partial M}{\partial t}}\operatorname {d} t+\sum _{i=1}^{n}{\frac {\partial M}{\partial p_{i}}}\operatorname {d} p_{i}.$

This expression is often interpreted heuristically as a relation between infinitesimals. However, if the variables t and $p_{i}$ are interpreted as functions, and $M(t,p_{1},\dots ,p_{n})$ is interpreted to mean the composite of M with these functions, then the above expression makes perfect sense as an equality of differential 1-forms, and is immediate from the chain rule for the exterior derivative. The advantage of this point of view is that it takes into account arbitrary dependencies between the variables. For example, if $p_{1}^{2}=p_{2}p_{3}$ then $2p_{1}\operatorname {d} p_{1}=p_{3}\operatorname {d} p_{2}+p_{2}\operatorname {d} p_{3}$. In particular, if the variables $p_{i}$ are all functions of t, as in the previous section, then

$\operatorname {d} M={\frac {\partial M}{\partial t}}\operatorname {d} t+\sum _{i=1}^{n}{\frac {\partial M}{\partial p_{i}}}{\frac {\operatorname {d} p_{i}}{\operatorname {d} t}}\,\operatorname {d} t.$

## The total derivative as a linear map

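Let $U\subseteq {\mathbb {R} }^{n}$ be an open subset. A function $f:U\rightarrow {\mathbb {R} }^{m}$ is said to be (totally) differentiable at a point $p\in U$ if there exists a linear map $\operatorname {d} f_{p}:{\mathbb {R} }^{n}\rightarrow {\mathbb {R} }^{m}$ such that

$\lim _{x\rightarrow p}{\frac {\|f(x)-f(p)-\operatorname {d} f_{p}(x-p)\|}{\|x-p\|}}=0.$

The linear map $\operatorname {d} f_{p}$ is called the (total) derivative or (total) differential of $f$ at $p$. A function is (totally) differentiable if its total derivative exists at every point in its domain.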

Note that f is differentiable if and only if each of its components $f_{i}:U\rightarrow {\mathbb {R} }$ is differentiable. For this it is necessary, but not sufficient, that the partial derivatives of each function $f_{i}$ exist. However, if these partial derivatives exist and are continuous, then f is differentiable and its differential at any point is the linear map determined by the Jacobian matrix of partial derivatives at that point.
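The limit in the definition can be observed numerically. The sketch below assumes a hypothetical smooth map $f(x,y)=(xy,\ \sin x+y^{2})$, whose Jacobian is written out by hand, and evaluates the ratio from the definition for shrinking displacements; the map, the base point, and the direction of approach are illustrative choices only.

```python
import math


def f(x, y):
    """Hypothetical smooth map f: R^2 -> R^2 used for illustration."""
    return (x * y, math.sin(x) + y ** 2)


def jacobian(x, y):
    """Hand-computed Jacobian of f at (x, y); rows correspond to components of f."""
    return ((y, x), (math.cos(x), 2 * y))


def remainder_ratio(p, v):
    """Return ||f(p + v) - f(p) - J(p) v|| / ||v||, the ratio in the definition."""
    J = jacobian(*p)
    fp = f(*p)
    fx = f(p[0] + v[0], p[1] + v[1])
    # Apply the linear map J(p) to the displacement v.
    lin = (J[0][0] * v[0] + J[0][1] * v[1],
           J[1][0] * v[0] + J[1][1] * v[1])
    r = (fx[0] - fp[0] - lin[0], fx[1] - fp[1] - lin[1])
    return math.hypot(*r) / math.hypot(*v)


p = (1.0, 2.0)
# Approach p along the direction (1, -1) with shrinking step sizes.
ratios = [remainder_ratio(p, (s, -s)) for s in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

Because the partial derivatives of this map are continuous, the ratios should shrink roughly in proportion to the step size, which is consistent with the Jacobian being the total derivative at that point.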

## Total differential equation

A total differential equation is a differential equation expressed in terms of total derivatives. Since the exterior derivative is a natural operator, in a sense that can be given a technical meaning, such equations are intrinsic and geometric.

## Application of the total differential to error estimation

In measurement, the total differential is used in estimating the error Δf of a function f based on the errors Δx, Δy, ... of the parameters x, y, .... Assuming that the interval is short enough for the change to be approximately linear:

$\Delta f(x)=f'(x)\,\Delta x$

and that all variables are independent, then for all variables,

$\Delta f=f_{x}\Delta x+f_{y}\Delta y+\cdots$

This is because the derivative $f_{x}$ with respect to the particular parameter x gives the sensitivity of the function f to a change in x, in particular the error Δx. As they are assumed to be independent, the analysis describes the worst-case scenario. The absolute values of the component errors are used, because a derivative may turn out to be negative and the contributions should not cancel in a worst-case estimate. From this principle the error rules of summation, multiplication etc. are derived, e.g.:

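Let $f(a,b)=ab$. Then

$\Delta f=f_{a}\Delta a+f_{b}\Delta b.$

Evaluating the derivatives gives

$\Delta f=b\,\Delta a+a\,\Delta b,$

and dividing by f, which is $ab$, gives

${\frac {\Delta f}{f}}={\frac {\Delta a}{a}}+{\frac {\Delta b}{b}}.$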

That is to say, in multiplication, the total relative error is the sum of the relative errors of the parameters.

To illustrate how this depends on the function considered, consider the case where the function is f(a, b) = a ln b instead. Then, it can be computed that the error estimate is

${\frac {\Delta f}{f}}={\frac {\Delta a}{a}}+{\frac {\Delta b}{b\ln b}}$

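with an extra 'ln b' factor in the denominator that is not found in the case of a simple product. For b > e this factor exceeds 1, so the relative contribution of Δb is smaller than in the product case; for 1 < b < e it is smaller than 1 and the contribution is larger.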
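The two error estimates above can be reproduced with a short sketch. It assumes the worst-case rule with absolute values described earlier, approximates the partial derivatives by central differences, and uses hypothetical measurement values (a = 4 ± 0.02, b = 10 ± 0.1); the helper names `partial` and `propagated_error` are illustrative, not part of any library.

```python
import math


def partial(f, args, i, h=1e-6):
    """Central-difference estimate of the partial derivative of f in argument i."""
    up = list(args)
    dn = list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2 * h)


def propagated_error(f, values, errors):
    """First-order worst-case error: sum over i of |df/dx_i| * (error in x_i)."""
    return sum(abs(partial(f, values, i)) * err
               for i, err in enumerate(errors))


# Hypothetical measurements: a = 4 +/- 0.02, b = 10 +/- 0.1.
a, b = 4.0, 10.0
da, db = 0.02, 0.1

# Product f = a * b: the relative error should equal da/a + db/b.
prod_rel = propagated_error(lambda a, b: a * b, (a, b), (da, db)) / (a * b)

# f = a * ln(b): the relative error should equal da/a + db/(b * ln b).
log_rel = (propagated_error(lambda a, b: a * math.log(b), (a, b), (da, db))
           / (a * math.log(b)))

print(prod_rel, da / a + db / b)
print(log_rel, da / a + db / (b * math.log(b)))
```

With b = 10 > e, the second relative error comes out smaller than the first, matching the remark about the 'ln b' factor.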