| {{Redirect|Mean deviation|the book|Mean Deviation (book)}}
| {{refimprove|date=December 2013}}
| |
| | |
| In [[statistics]], the '''absolute deviation''' of an element of a [[data set]] is the [[absolute difference]] between that element and a given point. Typically the deviation is reckoned from the [[central tendency|central value]], being construed as some type of [[average]], most often the [[median]] or sometimes the [[mean]] of the data set.
| |
| | |
| :<math>D_i = |x_i-m(X)| </math>
| |
| | |
| where
| |
| | |
| : ''D''<sub>''i''</sub> is the absolute deviation,
| |
| : ''x''<sub>''i''</sub> is the data element,
| |
| :and ''m''(''X'') is the chosen measure of [[central tendency]] of the data set, sometimes the [[mean]] (<math>\overline{x}</math>) but most often the [[median]].
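As a minimal sketch of the definition above, the following Python computes the absolute deviation of every element about a chosen central point. The function name `absolute_deviations` and the sample data are illustrative assumptions, not part of any library.

```python
from statistics import mean, median


def absolute_deviations(data, center):
    """Return |x_i - center| for every element x_i of data."""
    return [abs(x - center) for x in data]


data = [2, 2, 3, 4, 14]  # illustrative data set
print(absolute_deviations(data, median(data)))  # deviations about the median
print(absolute_deviations(data, mean(data)))    # deviations about the mean
```

The central point is passed in explicitly, so the same function serves whichever measure of central tendency is chosen.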
| |
| | |
| == Measures of dispersion ==
| |
| Several measures of [[statistical dispersion]] are defined in terms of the absolute deviation.
| |
| | |
| === Average absolute deviation ===
| |
| The '''average absolute deviation''', or simply '''average deviation''', of a data set is the [[average]] of the absolute deviations and is a [[summary statistics|summary statistic]] of [[statistical dispersion]] or variability. In its general form, the central point from which the deviations are measured can be the [[arithmetic mean|mean]], [[median]], [[mode (statistics)|mode]], or the result of another measure of [[central tendency]].
| |
| | |
| The average absolute deviation of a set {''x''<sub>1</sub>, ''x''<sub>2</sub>, ..., ''x''<sub>''n''</sub>} is
| |
| | |
| :<math>\frac{1}{n}\sum_{i=1}^n |x_i-m(X)|.</math>
| |
| | |
| The choice of measure of central tendency, <math>m(X)</math>, has a marked effect on the value of the average deviation. For example, for the data set {2, 2, 3, 4, 14}:
| |
| | |
| {| class="wikitable" style="margin:auto;width:100%;"
| |-
| !Measure of central tendency <math>m(X)</math>
| !Average absolute deviation
| |-
| | Mean = 5
| | <math>\frac{|2 - 5| + |2 - 5| + |3 - 5| + |4 - 5| + |14 - 5|}{5} = 3.6</math>
| |-
| | Median = 3
| | <math>\frac{|2 - 3| + |2 - 3| + |3 - 3| + |4 - 3| + |14 - 3|}{5} = 2.8</math>
| |-
| | Mode = 2
| | <math>\frac{|2 - 2| + |2 - 2| + |3 - 2| + |4 - 2| + |14 - 2|}{5} = 3.0</math>
| |}
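The figures in the table can be reproduced with a short Python sketch. The helper `average_absolute_deviation` is a hypothetical name introduced here for illustration; the central values come from the standard `statistics` module.

```python
from statistics import mean, median, mode


def average_absolute_deviation(data, center):
    """Mean of |x_i - center| over the data set."""
    return sum(abs(x - center) for x in data) / len(data)


data = [2, 2, 3, 4, 14]
for name, measure in (("mean", mean), ("median", median), ("mode", mode)):
    center = measure(data)
    value = average_absolute_deviation(data, center)
    print(f"{name} = {center}: average absolute deviation = {value:.1f}")
```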
| |
| | |
| The average absolute deviation from the median is less than or equal to the average absolute deviation from the mean. In fact, the average absolute deviation from the median is always less than or equal to the average absolute deviation from any other fixed number.
| | |
| The average absolute deviation from the mean is less than or equal to the [[standard deviation]]; one way of proving this relies on [[Jensen's inequality]].
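One way to write out the Jensen argument, given here as a sketch with <math>\mu</math> and <math>\sigma</math> denoting the mean and standard deviation of ''X'' (symbols assumed for this illustration): the square root is concave, so applying Jensen's inequality to the random variable <math>(X-\mu)^2</math> gives

```latex
E|X-\mu| \;=\; E\!\left[\sqrt{(X-\mu)^2}\right] \;\le\; \sqrt{E\!\left[(X-\mu)^2\right]} \;=\; \sigma .
```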
| |
| | |
| For the [[normal distribution]], the ratio of mean absolute deviation to standard deviation is <math>\scriptstyle \sqrt{2/\pi} = 0.79788456\dots</math>. Thus, if ''X'' is a normally distributed random variable with expected value 0, then, following Geary (1935),<ref>Geary, R. C. (1935). The ratio of the mean deviation to the standard deviation as a test of normality. Biometrika, 27(3/4), 310-332.</ref>
| |
| | |
| : <math> w=\frac{ E|X| }{ \sqrt{E(X^2)} } = \sqrt{\frac{2}{\pi}}. </math>
| |
| | |
| In other words, for a normal distribution, mean absolute deviation is about 0.8 times the standard deviation.
| |
| However, in-sample measurements of the ratio of mean absolute deviation to standard deviation for a Gaussian sample of size ''n'' yield values within the bounds <math> w_n \in [0,1] </math>, with a bias for small ''n''.<ref>See also Geary's 1936 and 1947 papers: Geary, R. C. (1936). Moments of the ratio of the mean deviation to the standard deviation for normal samples. Biometrika, 28(3/4), 295-307 and Geary, R. C. (1947). Testing for normality. Biometrika, 34(3/4), 209-242.</ref>
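The ratio for normal data can be illustrated by simulation. This is a rough sketch under stated assumptions: the helper name `mad_to_sd_ratio`, the seed, and the sample size are arbitrary choices made here, and the standard deviation uses the divisor ''n''.

```python
import math
import random


def mad_to_sd_ratio(sample):
    """Ratio of the mean absolute deviation about the sample mean to the
    standard deviation (divisor n). Assumes the sample is not constant."""
    n = len(sample)
    m = sum(sample) / n
    mad = sum(abs(x - m) for x in sample) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in sample) / n)
    return mad / sd


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
sample = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

# The in-sample ratio should be close to sqrt(2/pi) for a large normal sample.
print(mad_to_sd_ratio(sample), math.sqrt(2 / math.pi))
```

For any non-constant sample the computed ratio falls between 0 and 1, since the mean absolute deviation never exceeds the standard deviation.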
| |
| | |
| ==== Mean absolute deviation (MAD) ====
| |
| The '''mean absolute deviation''' (MAD), also referred to as the mean deviation (or sometimes '''average absolute deviation''', though see above for a distinction), is the mean of the absolute deviations of a set of data about the data's mean. In other words, it is the average distance of the data set from its mean. It has been proposed that MAD be used in place of the [[standard deviation]] because it corresponds better to real life.<ref>http://www.edge.org/response-detail/25401</ref> Because the MAD is a simpler measure of variability than the [[standard deviation]], it can be used as a pedagogical tool to help motivate the standard deviation.<ref name=Kader1999>{{cite journal|last=Kader|first=Gary|title=Means and MADS|journal=Mathematics Teaching in the Middle School|date=March 1999|volume=4|issue=6|pages=398–403|url=http://www.learner.org/courses/learningmath/data/overview/readinglist.html|accessdate=20 February 2013}}</ref><ref name=GAISE>{{cite book|last=Franklin|first=Christine, Gary Kader, Denise Mewborn, Jerry Moreno, Roxy Peck, Mike Perry, and Richard Scheaffer|title=Guidelines for Assessment and Instruction in Statistics Education|year=2007|publisher=American Statistical Association|isbn=978-0-9791747-1-1|url=http://www.amstat.org/education/gaise/GAISEPreK-12_Full.pdf}}</ref>
| |
| | |
| As a measure of forecast accuracy, the MAD is very closely related to the [[mean squared error]] (MSE), which is simply the average squared error of the forecasts. Although the two measures are very closely related, MAD is more commonly used{{citation needed|date=September 2013}} because it does not require squaring.
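In the forecasting setting, both quantities are computed from the forecast errors (actual value minus forecast). The following is a minimal sketch; the function names and the small data set are hypothetical and serve only to show the two calculations side by side.

```python
def forecast_mad(actual, forecast):
    """Mean of the absolute forecast errors."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def forecast_mse(actual, forecast):
    """Mean of the squared forecast errors."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


actual = [10, 12, 13, 12]    # hypothetical observed values
forecast = [11, 11, 13, 14]  # hypothetical forecasts
print(forecast_mad(actual, forecast))  # 1.0
print(forecast_mse(actual, forecast))  # 1.5
```

The largest error (2) contributes 4 of the 6 units of squared error but only 2 of the 4 units of absolute error, which is the practical difference between the two measures.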
| |
| | |
| More recently, the mean absolute deviation about the mean has been expressed as a covariance between a random variable and its under/over indicator functions<ref name=elamir2012>{{cite journal|last=Elamir|first=Elsayed A.H.|title=On uses of mean absolute deviation: decomposition, skewness and correlation coefficients|journal=Metron: International Journal of Statistics|year=2012|volume=LXX|issue=2-3|url=http://www.metronjournal.it/ultimo/home_en.lasso}}</ref> as
| |
| | |
| :<math>D_m = E|X-\mu| = 2\operatorname{Cov}(X,I_O) </math>
| |
| | |
| where
| | |
| : ''D''<sub>''m''</sub> is the expected value of the absolute deviation about the mean,
| |
| : "Cov" is the covariance between the random variable ''X'' and the over indicator function (<math>I_{O}</math>),
| |
| and the over indicator function is defined as
| |
| | |
| :<math>\mathbf{I}_O := \begin{cases} 1 &\text{if } x > \mu, \\ 0 &\text{otherwise.} \end{cases}</math>
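The covariance representation above can be checked numerically on the empirical distribution of a data set. The sketch below assumes the population form of the covariance (divisor ''n''); the function names are hypothetical.

```python
def mean_abs_dev(data):
    """Mean absolute deviation about the mean."""
    n = len(data)
    mu = sum(data) / n
    return sum(abs(x - mu) for x in data) / n


def two_cov_with_over_indicator(data):
    """2 * Cov(X, I_O), where I_O = 1 if x > mean and 0 otherwise,
    using the population covariance (divisor n)."""
    n = len(data)
    mu = sum(data) / n
    indicator = [1 if x > mu else 0 for x in data]
    indicator_mean = sum(indicator) / n
    cov = sum((x - mu) * (i - indicator_mean)
              for x, i in zip(data, indicator)) / n
    return 2 * cov


data = [2, 2, 3, 4, 14]
print(mean_abs_dev(data), two_cov_with_over_indicator(data))
```

Both expressions evaluate to 3.6 for this data set, up to floating-point rounding.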
| |
| | |
| Based on this representation, new correlation coefficients have been derived. These correlation coefficients are reported to give highly stable statistical inference for distributions that are not symmetric and for which the normal distribution is not an appropriate approximation. Moreover, the representation yields a simple semi-decomposition of Pietra's index of inequality.
| |
| | |
| === Average absolute deviation about median ===
| |
| The mean absolute deviation about the median (MAD median) offers a direct measure of the scale of a random variable about its median:
| |
| :<math>D_\text{med} = E|X-\text{median}| </math>
| |
| For the normal distribution we have <math>D_\text{med} = \sigma \sqrt{2/\pi} </math>. Since the median minimizes the average absolute distance, we have <math>D_\text{med} \le D_\text{mean} </math>. Using the general dispersion function, Habib (2011) defined the MAD about the median as
| |
| :<math>D_\text{med} = E|X-\text{median}| = 2\operatorname{Cov}(X,I_O) </math>
| |
| where the indicator function is
| |
| :<math>\mathbf{I}_O := \begin{cases} 1 &\text{if } x > \text{median}, \\ 0 &\text{otherwise.} \end{cases}</math>
| |
| This representation allows MAD median correlation coefficients to be obtained.<ref name=Habib2011>{{cite journal|last=Habib|first=Elsayed A.E.|title=Correlation coefficients based on mean absolute deviation about median|journal=International Journal of Statistics and Systems|year=2011|volume=6|issue=4|pages=413–428|url=http://www.ripublication.com/ijss.html}}</ref>
| |
| | |
| ==== Median absolute deviation (MAD) ====
| |
| {{main|Median absolute deviation}}
| |
| | |
| The '''median absolute deviation''' (also MAD) is the ''median'' of the absolute deviations from the ''median''. It is a [[Robust measures of scale|robust estimator of dispersion]].
| |
| | |
| For the example {2, 2, 3, 4, 14}, the median is 3, so the absolute deviations from the median are {1, 1, 0, 1, 11} (reordered as {0, 1, 1, 1, 11}), with a median of 1. The median absolute deviation is therefore 1, and in this case it is unaffected by the value of the outlier 14.
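The worked example can be reproduced with the standard `statistics` module. The function name `median_absolute_deviation` is an illustrative choice, and the second data set, with a much larger outlier, is a hypothetical addition to show the robustness.

```python
from statistics import median


def median_absolute_deviation(data):
    """Median of the absolute deviations from the median."""
    med = median(data)
    return median([abs(x - med) for x in data])


print(median_absolute_deviation([2, 2, 3, 4, 14]))    # 1
print(median_absolute_deviation([2, 2, 3, 4, 1000]))  # 1, outlier has no effect
```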
| |
| | |
| ==== Maximum absolute deviation ====
| |
| The '''maximum absolute deviation''' about a point is the maximum of the absolute deviations of a sample from that point. It can be found from the formula for the average absolute deviation above by taking the maximum, rather than the average, of the absolute deviations: <math>\max_i |x_i - m(X)|</math>. The maximum absolute deviation cannot be less than half the [[range (statistics)|range]].
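A short sketch of the definition in the first sentence above, together with a check of the half-range bound. The function name and the choice of reference points are assumptions made for illustration.

```python
def maximum_absolute_deviation(data, point):
    """Largest absolute deviation of the data from the given point."""
    return max(abs(x - point) for x in data)


data = [2, 2, 3, 4, 14]
half_range = (max(data) - min(data)) / 2  # 6.0

print(maximum_absolute_deviation(data, 3))  # about the median: 11
print(maximum_absolute_deviation(data, 8))  # about the mid-range: 6
```

For this data set the bound is attained at the point 8, halfway between the smallest and largest values.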
| |
| | |
| == Minimization ==
| |
| The measures of statistical dispersion derived from absolute deviation characterize various measures of central tendency as ''minimizing'' dispersion.
| |
| The median is the measure of central tendency most associated with the absolute deviation. Some location parameters can be compared as follows:
| |
| *[[L2 norm|''L''<sup>2</sup> norm]] statistics: the mean minimizes the [[mean squared error]]
| |
| *[[L1 norm|''L''<sup>1</sup> norm]] statistics: the median minimizes the ''average'' absolute deviation
| |
| *[[Uniform norm|''L''<sup>∞</sup> norm]] statistics: the [[mid-range]] minimizes the ''maximum'' absolute deviation
| |
| *trimmed [[Uniform norm|''L''<sup>∞</sup> norm]] statistics: for example, the [[midhinge]] (average of first and third [[quartile]]s) which minimizes the ''median'' absolute deviation of the whole distribution, also minimizes the ''maximum'' absolute deviation of the distribution after the top and bottom 25% have been trimmed off.
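The first three items of the list can be illustrated by a brute-force search over candidate locations. This is only a sketch: the grid of candidates, the data set, and the function names are assumptions chosen so that the minimizers fall exactly on grid points.

```python
def mean_squared_error(data, c):
    return sum((x - c) ** 2 for x in data) / len(data)


def average_absolute_deviation(data, c):
    return sum(abs(x - c) for x in data) / len(data)


def maximum_absolute_deviation(data, c):
    return max(abs(x - c) for x in data)


data = [2, 2, 3, 4, 14]
candidates = [k / 2 for k in range(0, 31)]  # 0.0, 0.5, ..., 15.0


def best_location(loss):
    """Candidate location with the smallest value of the given loss."""
    return min(candidates, key=lambda c: loss(data, c))


print(best_location(mean_squared_error))          # 5.0, the mean
print(best_location(average_absolute_deviation))  # 3.0, the median
print(best_location(maximum_absolute_deviation))  # 8.0, the mid-range
```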
| |
| | |
| == Estimation ==
| |
| {{Expand section|date=March 2009}}
| |
| The mean absolute deviation of a sample is a [[biased estimator]] of the mean absolute deviation of the population.
| |
| For the absolute deviation to be an unbiased estimator, the expected value (average) of all the sample absolute deviations would have to equal the population absolute deviation, but it does not. For the population 1, 2, 3, both the population absolute deviation about the median and the population absolute deviation about the mean are 2/3. The average of the absolute deviations about the mean over all samples of size 3 that can be drawn (with replacement) from the population is 44/81, while the corresponding average of the absolute deviations about the median is 4/9. Therefore the absolute deviation is a biased estimator.
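The figures for the population 1, 2, 3 can be verified by exhaustive enumeration. The sketch below assumes that the samples are the 27 ordered samples of size 3 drawn with replacement, which is the reading that reproduces the stated values; exact rational arithmetic avoids rounding.

```python
from fractions import Fraction
from itertools import product
from statistics import median


def avg_abs_dev(sample, center):
    """Average absolute deviation of the sample about center, as a Fraction."""
    return sum(abs(Fraction(x) - center) for x in sample) / len(sample)


population = [1, 2, 3]
samples = list(product(population, repeat=3))  # 27 ordered samples

population_value = avg_abs_dev(population, 2)  # mean and median are both 2
about_mean = sum(avg_abs_dev(s, Fraction(sum(s), len(s)))
                 for s in samples) / len(samples)
about_median = sum(avg_abs_dev(s, median(s)) for s in samples) / len(samples)

print(population_value, about_mean, about_median)  # 2/3 44/81 4/9
```

Both sample averages fall below the population value of 2/3, which is the bias described above.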
| |
| | |
| == See also ==
| |
| * [[Deviation (statistics)]]
| |
| * [[Errors and residuals in statistics]]
| |
| * [[Least absolute deviations]]
| |
| * [[Loss function]]
| |
| * [[Mean difference]]
| |
| * [[Median absolute deviation]]
| |
| * [[Squared deviations]]
| |
| | |
| == References ==
| |
| {{Reflist}}
| |
| | |
| == External links ==
| |
| *[http://www.leeds.ac.uk/educol/documents/00003759.htm Advantages of the mean absolute deviation]
| |
| {{Statistics}}
| |
| | |
| {{DEFAULTSORT:Absolute Deviation}}
| |
| [[Category:Statistical deviation and dispersion]]
| |