# Normalization (statistics)


In statistics and applications of statistics, normalization can have a range of meanings. In the simplest cases, normalization of ratings means adjusting values measured on different scales to a notionally common scale, often prior to averaging. In more complicated cases, normalization may refer to more sophisticated adjustments where the intention is to bring the entire probability distributions of the adjusted values into alignment. In the normalization of scores in educational assessment, the intention may be to align distributions to a normal distribution. A different approach to normalizing probability distributions is quantile normalization, where the quantiles of the different measures are brought into alignment.
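Quantile normalization, as described above, can be sketched in a few lines. The following is a minimal illustration rather than a reference implementation: it assumes NumPy, assumes each column holds one measure with the same number of observations, uses the row-wise mean of the sorted columns as the common reference distribution (one common choice), and does no special handling of tied values. The function name `quantile_normalize` and the sample data are hypothetical.

```python
import numpy as np

def quantile_normalize(data):
    """Give every column the same empirical distribution.

    Assumptions (illustrative only): columns are the measures to align,
    rows are observations, and ties are not treated specially.
    """
    data = np.asarray(data, dtype=float)
    # Rank of each value within its own column (0 = smallest).
    ranks = np.argsort(np.argsort(data, axis=0), axis=0)
    # Reference distribution: mean of the k-th smallest values across columns.
    reference = np.sort(data, axis=0).mean(axis=1)
    # Replace each value by the reference quantile that matches its rank.
    return reference[ranks]

# Hypothetical data: three measures on different scales, four observations each.
raw = np.array([
    [5.0, 40.0, 3.0],
    [2.0, 10.0, 4.0],
    [3.0, 30.0, 6.0],
    [4.0, 20.0, 8.0],
])
normalized = quantile_normalize(raw)
```

After this step every column contains the same set of values, while the ordering of observations within each column is unchanged.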

In another usage in statistics, normalization refers to the creation of shifted and scaled versions of statistics. The intention is that these normalized values can be compared across different datasets in a way that eliminates the effects of certain gross influences, as in an anomaly time series. Some types of normalization involve only a rescaling, to arrive at values relative to some size variable. In terms of levels of measurement, such ratios only make sense for ratio measurements (where ratios of measurements are meaningful), not interval measurements (where only distances are meaningful, but not ratios).
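As a rough illustration of shifting and scaling, the sketch below computes a standardized anomaly: each value has a baseline mean subtracted and is then divided by the baseline standard deviation. This is only one possible reading of an anomaly series. The function name `standardized_anomaly`, the choice of the sample standard deviation (`ddof=1`), and the two example series are assumptions made for illustration.

```python
import numpy as np

def standardized_anomaly(series, baseline):
    """Shift by the baseline mean, then scale by the baseline spread."""
    series = np.asarray(series, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    return (series - baseline.mean()) / baseline.std(ddof=1)

# Two hypothetical series recorded with different offsets and scales.
series_a = [10.0, 12.0, 14.0]
series_b = [50.0, 60.0, 70.0]

# Here each series serves as its own baseline.
anomaly_a = standardized_anomaly(series_a, series_a)
anomaly_b = standardized_anomaly(series_b, series_b)
```

Both series map to the same normalized values, so they can be compared directly even though the raw numbers differ in level and spread.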

In theoretical statistics, parametric normalization can often lead to pivotal quantities – functions whose sampling distribution does not depend on the parameters – and to ancillary statistics – pivotal quantities that can be computed from observations, without knowing parameters.

## Examples

There are various normalizations in statistics – nondimensional ratios of errors, residuals, means and standard deviations, which are hence scale invariant – some of which may be summarized as follows. Note that in terms of levels of measurement, these ratios only make sense for ratio measurements (where ratios of measurements are meaningful), not interval measurements (where only distances are meaningful, but not ratios). See also Category:Statistical ratios.

| Name | Formula | Use |
| --- | --- | --- |
| Standard score | ${\frac {X-\mu }{\sigma }}$ | Normalizing errors when population parameters are known. |
| Student's t-statistic | ${\frac {X-{\overline {X}}}{s}}$ | Normalizing residuals when population parameters are unknown (estimated). |
| Studentized residual | ${\frac {{\hat {\epsilon }}_{i}}{{\hat {\sigma }}_{i}}}={\frac {X_{i}-{\hat {\mu }}_{i}}{{\hat {\sigma }}_{i}}}$ | Normalizing residuals when parameters are estimated, particularly across different data points in regression analysis. |
| Standardized moment | ${\frac {\mu _{k}}{\sigma ^{k}}}$ | Normalizing moments, using the standard deviation $\sigma$ as a measure of scale. |
| Coefficient of variation | ${\frac {\sigma }{\mu }}$ | Normalizing dispersion, using the mean $\mu$ as a measure of scale, particularly for positive distributions such as the exponential distribution and Poisson distribution. |
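Two of the ratios listed above, the standard score and the coefficient of variation, can be computed directly from their formulas. The sketch below assumes NumPy. The function names and the sample numbers are hypothetical, and the coefficient of variation is estimated here from a sample using the sample standard deviation (`ddof=1`), which is an illustrative choice.

```python
import numpy as np

def standard_score(x, mu, sigma):
    """(X - mu) / sigma, with the population parameters assumed known."""
    return (np.asarray(x, dtype=float) - mu) / sigma

def coefficient_of_variation(x):
    """sigma / mu, estimated from a sample of positive values."""
    x = np.asarray(x, dtype=float)
    return x.std(ddof=1) / x.mean()

# Hypothetical scores from a population with mean 100 and standard deviation 15.
z = standard_score([85.0, 100.0, 115.0], mu=100.0, sigma=15.0)

# Hypothetical positive measurements.
cv = coefficient_of_variation([2.0, 4.0, 6.0])
```

With these inputs, `z` evaluates to `[-1, 0, 1]` and `cv` to `0.5`: the scores lie one standard deviation below, at, and above the mean, and the sample standard deviation (2) is half the sample mean (4).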

Note that some other ratios, such as the variance-to-mean ratio $\left({\frac {\sigma ^{2}}{\mu }}\right)$, are also used for normalization but are not nondimensional: the units do not cancel, so the ratio carries units and is not scale invariant.
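The difference in scale invariance can be checked numerically by expressing the same data in two units. In the sketch below, which assumes NumPy and uses hypothetical lengths in metres and centimetres, the coefficient of variation is unchanged by the change of unit, while the variance-to-mean ratio is multiplied by the conversion factor.

```python
import numpy as np

def coefficient_of_variation(x):
    """sigma / mu: the units cancel."""
    return np.std(x) / np.mean(x)

def variance_to_mean_ratio(x):
    """sigma^2 / mu: one factor of the unit remains."""
    return np.var(x) / np.mean(x)

# The same hypothetical lengths in metres and in centimetres.
lengths_m = np.array([1.0, 2.0, 3.0, 4.0])
lengths_cm = 100.0 * lengths_m

cv_m = coefficient_of_variation(lengths_m)
cv_cm = coefficient_of_variation(lengths_cm)
vmr_m = variance_to_mean_ratio(lengths_m)
vmr_cm = variance_to_mean_ratio(lengths_cm)
```

Rescaling the data by a factor $c$ multiplies $\sigma$ and $\mu$ by $c$ but $\sigma^{2}$ by $c^{2}$, so $\sigma/\mu$ is unchanged while $\sigma^{2}/\mu$ grows by a factor of $c$ (here $c = 100$).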

## Applications

In an experimental context, normalizations are used to standardise microarray data to enable differentiation between real (biological) variations in gene expression levels and variations due to the measurement process.

In microarray analysis, normalization refers to the process of identifying and removing the systematic effects, and bringing the data from different microarrays onto a common scale.

## Related processes

In computer vision, combining images to a common scale is called image registration, in the sense of "aligning different images". Examples include stitching images together into a panorama and combining pictures taken from different angles.

In physics, the analogous process is nondimensionalization, in which the variables of an equation are rescaled so that units are removed.
