In statistics, the **observed information**, or **observed Fisher information**, is the negative of the second derivative (for a vector parameter, the Hessian matrix) of the log-likelihood (the logarithm of the likelihood function). It is a sample-based version of the Fisher information.

## Definition

Suppose we observe random variables $X_{1},\ldots ,X_{n}$, independent and identically distributed with density $f(X|\theta )$, where $\theta$ is a (possibly unknown) parameter vector. Then the log-likelihood of the parameters $\theta$ given the data $X_{1},\ldots ,X_{n}$ is

- $\ell (\theta |X_{1},\ldots ,X_{n})=\sum _{i=1}^{n}\log f(X_{i}|\theta )$.

We define the **observed information matrix** at $\theta ^{*}$, for a parameter vector $\theta =(\theta _{1},\ldots ,\theta _{p})^{\top }$, as

- ${\mathcal {J}}(\theta ^{*})=-\left.\nabla \nabla ^{\top }\ell (\theta )\right|_{\theta =\theta ^{*}}$

- $=-\left.\left({\begin{array}{cccc}{\tfrac {\partial ^{2}}{\partial \theta _{1}^{2}}}&{\tfrac {\partial ^{2}}{\partial \theta _{1}\partial \theta _{2}}}&\cdots &{\tfrac {\partial ^{2}}{\partial \theta _{1}\partial \theta _{p}}}\\{\tfrac {\partial ^{2}}{\partial \theta _{2}\partial \theta _{1}}}&{\tfrac {\partial ^{2}}{\partial \theta _{2}^{2}}}&\cdots &{\tfrac {\partial ^{2}}{\partial \theta _{2}\partial \theta _{p}}}\\\vdots &\vdots &\ddots &\vdots \\{\tfrac {\partial ^{2}}{\partial \theta _{p}\partial \theta _{1}}}&{\tfrac {\partial ^{2}}{\partial \theta _{p}\partial \theta _{2}}}&\cdots &{\tfrac {\partial ^{2}}{\partial \theta _{p}^{2}}}\\\end{array}}\right)\ell (\theta )\right|_{\theta =\theta ^{*}}$

In many instances, the observed information is evaluated at the maximum-likelihood estimate.^{[1]}
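
As an illustration, the following minimal sketch approximates the observed information matrix by central finite differences and evaluates it at the maximum-likelihood estimate. It assumes a normal model with parameter $\theta =(\mu ,\sigma )$, which is chosen here only as an example; the function names `log_likelihood` and `observed_information`, the step size `h` and the simulated data are likewise hypothetical. For this model the matrix at the maximum-likelihood estimate is diagonal with entries $n/{\hat {\sigma }}^{2}$ and $2n/{\hat {\sigma }}^{2}$, which gives a check on the numerical result.

```python
import math
import random


def log_likelihood(theta, xs):
    """Log-likelihood of i.i.d. N(mu, sigma^2) data, with theta = (mu, sigma)."""
    mu, sigma = theta
    return sum(
        -0.5 * math.log(2.0 * math.pi) - math.log(sigma)
        - (x - mu) ** 2 / (2.0 * sigma ** 2)
        for x in xs
    )


def observed_information(loglik, theta, xs, h=1e-4):
    """Negative Hessian of loglik at theta, by central finite differences."""
    p = len(theta)

    def shifted(i, di, j, dj):
        # Evaluate loglik with component i moved by di and component j by dj.
        t = list(theta)
        t[i] += di
        t[j] += dj
        return loglik(t, xs)

    info = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            second = (
                shifted(i, h, j, h) - shifted(i, h, j, -h)
                - shifted(i, -h, j, h) + shifted(i, -h, j, -h)
            ) / (4.0 * h * h)
            info[i][j] = -second
    return info


# Simulated data (an assumption for illustration only).
random.seed(0)
xs = [random.gauss(5.0, 2.0) for _ in range(200)]
n = len(xs)

# Closed-form maximum-likelihood estimates for the normal model.
mu_hat = sum(xs) / n
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in xs) / n)

J = observed_information(log_likelihood, (mu_hat, sigma_hat), xs)
analytic = [[n / sigma_hat ** 2, 0.0], [0.0, 2.0 * n / sigma_hat ** 2]]
print(J)
print(analytic)
```

Finite differences are used only to keep the sketch free of dependencies; an automatic-differentiation library would usually give a more accurate Hessian.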

## Fisher information

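The Fisher information ${\mathcal {I}}(\theta )$ is the expected value of the observed information computed from a single observation $X$ distributed according to the hypothesized model with parameter $\theta$:

- ${\mathcal {I}}(\theta )={\mathrm {E} }({\mathcal {J}}(\theta ))$.

For a sample of $n$ independent and identically distributed observations, the expected value of ${\mathcal {J}}(\theta )$ is $n{\mathcal {I}}(\theta )$.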
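
To illustrate this expectation, the sketch below averages the single-observation observed information over many simulated draws and compares the result with the closed-form Fisher information. It assumes a Bernoulli model with success probability $p$, for which ${\mathcal {J}}(p)=x/p^{2}+(1-x)/(1-p)^{2}$ and ${\mathcal {I}}(p)=1/(p(1-p))$; the model, the value `p = 0.3`, the number of draws and the function name are illustrative assumptions.

```python
import random


def observed_information_single(x, p):
    """Observed information of one Bernoulli(p) observation x in {0, 1}.

    log f(x | p) = x log p + (1 - x) log(1 - p), so the negative second
    derivative with respect to p is x / p^2 + (1 - x) / (1 - p)^2.
    """
    return x / p ** 2 + (1 - x) / (1 - p) ** 2


random.seed(1)
p = 0.3
draws = [1 if random.random() < p else 0 for _ in range(200_000)]

# Monte Carlo estimate of E[J(p)] over single observations X ~ Bernoulli(p).
mc_estimate = sum(observed_information_single(x, p) for x in draws) / len(draws)

# Closed-form Fisher information of the Bernoulli model.
fisher_information = 1.0 / (p * (1.0 - p))

print(mc_estimate, fisher_information)
```

The two printed values should agree up to Monte Carlo error, which shrinks as the number of draws grows.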

## Applications

In a notable article, Bradley Efron and David V. Hinkley^{[2]} argued that the observed information should be used in preference to the expected information when employing normal approximations for the distribution of maximum-likelihood estimates.
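
The practical difference between the two choices can be sketched as follows. The example assumes a Cauchy location model with unit scale, chosen here because its observed information depends on the sample while its expected information is $n/2$. It computes the standard error of the normal approximation from each quantity. The model, the simulated data, the crude grid-and-ternary search for the maximum-likelihood estimate and all function names are illustrative assumptions, not the procedure of the cited article.

```python
import math
import random
import statistics


def log_likelihood(theta, xs):
    """Cauchy location log-likelihood (unit scale), up to an additive constant."""
    return -sum(math.log(1.0 + (x - theta) ** 2) for x in xs)


def observed_information(theta, xs):
    """Negative second derivative of the log-likelihood at theta."""
    return sum(
        2.0 * (1.0 - (x - theta) ** 2) / (1.0 + (x - theta) ** 2) ** 2
        for x in xs
    )


def maximum_likelihood(xs, half_width=2.0, grid_points=2001):
    """Crude MLE: grid search around the sample median, then ternary refinement."""
    centre = statistics.median(xs)
    step = 2.0 * half_width / (grid_points - 1)
    grid = [centre - half_width + k * step for k in range(grid_points)]
    best = max(grid, key=lambda t: log_likelihood(t, xs))
    lo, hi = best - step, best + step
    for _ in range(60):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_likelihood(m1, xs) < log_likelihood(m2, xs):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


random.seed(2)
# Standard Cauchy draws via the inverse-CDF transform; the true location is 0.
xs = [math.tan(math.pi * (random.random() - 0.5)) for _ in range(25)]
n = len(xs)

theta_hat = maximum_likelihood(xs)
observed = observed_information(theta_hat, xs)
expected = n / 2.0  # Fisher information of n unit-scale Cauchy observations

# Standard errors of the normal approximation under each choice.
se_observed = 1.0 / math.sqrt(observed)
se_expected = 1.0 / math.sqrt(expected)
print(theta_hat, se_observed, se_expected)
```

The standard error based on the expected information is the same for every sample of size $n$, whereas the one based on the observed information varies with the data actually obtained.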

## References

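- Dodge, Y. (2003). *The Oxford Dictionary of Statistical Terms*. OUP. ISBN 0-19-920613-9
- Efron, B.; Hinkley, D. V. (1978). "Assessing the accuracy of the maximum likelihood estimator: Observed versus expected Fisher information". *Biometrika*. 65 (3): 457–487.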