# Computational formula for the variance

In probability theory and statistics, the computational formula for the variance Var(X) of a random variable X is the formula

${\displaystyle \operatorname {Var} (X)=\operatorname {E} (X^{2})-[\operatorname {E} (X)]^{2}\,}$

where E(X) is the expected value of X. In French-language literature, this result is known as the König–Huygens theorem.
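As a quick numerical check of the identity, the following sketch computes the variance of a fair six-sided die both from the definition E[(X − E(X))²] and from the computational formula; both equal 35/12:

```python
# Fair six-sided die: outcomes 1..6, each with probability 1/6.
outcomes = range(1, 7)
p = 1 / 6

ex = sum(x * p for x in outcomes)       # E(X) = 3.5
ex2 = sum(x**2 * p for x in outcomes)   # E(X^2) = 91/6

var_definition = sum((x - ex) ** 2 * p for x in outcomes)  # E[(X - E(X))^2]
var_formula = ex2 - ex**2                                  # E(X^2) - [E(X)]^2

print(var_definition, var_formula)  # both ≈ 2.9167 (= 35/12)
```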

A closely related identity can be used to calculate the sample variance, which is often used as an unbiased estimate of the population variance:

${\displaystyle {\hat {\sigma }}^{2}:={\frac {1}{N-1}}\sum _{i=1}^{N}(x_{i}-{\bar {x}})^{2}={\frac {N}{N-1}}\left({\frac {1}{N}}\left(\sum _{i=1}^{N}x_{i}^{2}\right)-{\bar {x}}^{2}\right)}$

The second result is sometimes used in practice to calculate the variance, but doing so is numerically unwise: subtracting two nearly equal quantities can lead to catastrophic cancellation.[1]
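The hazard can be demonstrated directly. Shifting a data set by a large constant leaves its variance unchanged, but the one-pass computational formula then subtracts two enormous, nearly equal numbers and loses most of its significant digits, while the two-pass definition remains accurate. A minimal sketch with made-up data:

```python
data = [4.0, 7.0, 13.0, 16.0]      # sample variance (N-1 divisor) is exactly 30.0
shifted = [x + 1e8 for x in data]  # same variance, but a huge mean

def var_two_pass(xs):
    """Sample variance from the definition: sum of squared deviations from the mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def var_one_pass(xs):
    """Sample variance via the computational formula: sum of squares minus n * mean^2."""
    n = len(xs)
    mean = sum(xs) / n
    return (sum(x * x for x in xs) - n * mean * mean) / (n - 1)

print(var_two_pass(shifted))  # 30.0 (accurate)
print(var_one_pass(shifted))  # badly wrong: the subtraction cancels almost all digits
```

With the shifted data, each squared term is about 10¹⁶, while the true difference is only 90; in double precision the spacing between representable numbers near 10¹⁶ is 2, so the cancellation swamps the answer.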

## Proof

The computational formula for the population variance follows in a straightforward manner from the linearity of expected values and the definition of variance:

${\displaystyle {\begin{array}{ccl}\operatorname {Var} (X)&=&\operatorname {E} \left[(X-\operatorname {E} (X))^{2}\right]\\&=&\operatorname {E} \left[X^{2}-2X\operatorname {E} (X)+[\operatorname {E} (X)]^{2}\right]\\&=&\operatorname {E} (X^{2})-\operatorname {E} [2X\operatorname {E} (X)]+[\operatorname {E} (X)]^{2}\\&=&\operatorname {E} (X^{2})-2\operatorname {E} (X)\operatorname {E} (X)+[\operatorname {E} (X)]^{2}\\&=&\operatorname {E} (X^{2})-2[\operatorname {E} (X)]^{2}+[\operatorname {E} (X)]^{2}\\&=&\operatorname {E} (X^{2})-[\operatorname {E} (X)]^{2}\end{array}}}$

## Generalization to covariance

This formula can be generalized for covariance, with two random variables Xi and Xj:

${\displaystyle \operatorname {Cov} (X_{i},X_{j})=\operatorname {E} (X_{i}X_{j})-\operatorname {E} (X_{i})\operatorname {E} (X_{j})}$

as well as for the n by n covariance matrix of a random vector of length n:

${\displaystyle \operatorname {Var} (\mathbf {X} )=\operatorname {E} (\mathbf {XX^{\top }} )-\operatorname {E} (\mathbf {X} )\operatorname {E} (\mathbf {X} )^{\top }}$

and for the n by m cross-covariance matrix between two random vectors of length n and m:

${\displaystyle \operatorname {Cov} ({\textbf {X}},{\textbf {Y}})=\operatorname {E} (\mathbf {XY^{\top }} )-\operatorname {E} (\mathbf {X} )\operatorname {E} (\mathbf {Y} )^{\top }}$

where expectations are taken element-wise and ${\displaystyle \mathbf {X} =\{X_{1},X_{2},\ldots ,X_{n}\}}$ and ${\displaystyle \mathbf {Y} =\{Y_{1},Y_{2},\ldots ,Y_{m}\}}$ are random vectors of respective lengths n and m.
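Assuming NumPy is available, the matrix identity can be checked empirically on sampled data by comparing E(XXᵀ) − E(X)E(X)ᵀ, estimated from a sample, against NumPy's population covariance (`bias=True`, i.e. the 1/N divisor, which matches taking sample averages as the expectations):

```python
import numpy as np

rng = np.random.default_rng(0)
n, samples = 3, 100_000
X = rng.normal(size=(n, samples))   # each column is one draw of the random vector

EX = X.mean(axis=1, keepdims=True)  # estimate of E(X), shape (n, 1)
EXXt = (X @ X.T) / samples          # estimate of E(X X^T), shape (n, n)

cov_formula = EXXt - EX @ EX.T      # E(X X^T) - E(X) E(X)^T
cov_numpy = np.cov(X, bias=True)    # population (1/N) covariance matrix

print(np.allclose(cov_formula, cov_numpy))  # True
```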

## Applications

Applications of this formula in systolic geometry include Loewner's torus inequality.
