Difference between revisions of "Average"

From formulasearchengine
en>EmausBot
m (r2.7.2+) (Robot: Modifying ca:Mitjana (matemàtiques))
 
 
(2 intermediate revisions by 2 users not shown)
Line 1: Line 1:
In [[mathematics]], an '''average''' is a measure of the "middle" or "typical" value of a [[data set]].{{cn|date=May 2012}} It is thus a [[measure of central tendency]].  
+
In [[colloquial]] language '''average''' usually refers to the sum of a list of numbers divided by the size of the list, in other words the [[arithmetic mean]]. However, the word "average" can be used to refer to the [[median]], the [[Mode (statistics)|mode]], or some other central or typical value. In statistics, these are all known as [[measures of central tendency]]. Thus the concept of an average can be extended in various ways in mathematics, but in those contexts it is usually referred to as a [[mean]] (for example the [[Mean#Mean of a function|mean of a function]]).
  
In the most common case, the data set is a list of numbers. The average of a list of numbers is a single number intended to typify the numbers in the list. If all the numbers in the list are the same, then this number should be used. If the numbers are not the same, the average is calculated by combining the numbers from the list in a specific way and computing a single number as being the average of the list.
+
==Calculation==
 
 
Many different [[descriptive statistics|descriptive]] [[statistic]]s can be chosen as a measure of the central tendency of the data items. These include the [[arithmetic mean]], the [[median]], and the [[mode (statistics)|mode]]. Other statistics, such as the [[standard deviation]] and the [[Range (statistics)|range]], are called measures of [[Statistical dispersion|spread]] and describe how spread out the data is.
 
 
 
The most common statistic is the [[arithmetic mean]], but depending on the nature of the data other types of central tendency may be more appropriate. For example, the median is used most often when the [[Frequency distribution|distribution]] of the values is [[skewness|skewed]] with a small number of very high or low values, as seen with house prices or incomes. It is also used when extreme values are likely to be anomalous or less reliable than the other values (e.g. as a result of measurement error), because the median takes less account of extreme values than the mean does. <ref>An axiomatic approach to averages is provided by John Bibby (1974) "Axiomatisations of the average and a further generalization of monotonic sequences", Glasgow Mathematical Journal, vol. 15, pp. 63–65.</ref>
 
 
 
__TOC__
 
  
==Calculation==
 
[[Image:Comparison_Pythagorean_means.svg|thumb|right|Comparison of the arithmetic, geometric and harmonic means of a pair of numbers. The vertical dashed lines are [[asymptote]]s for the harmonic means.]]
 
The three most common averages are the [[Pythagorean means]] -- the arithmetic mean, the geometric mean, and the harmonic mean.
 
 
===Arithmetic mean===
 
===Arithmetic mean===
 
{{Main|Arithmetic mean}}
 
{{Main|Arithmetic mean}}
If ''n'' numbers are given, each number denoted by ''a<sub>i</sub>'', where ''i''&nbsp;=&nbsp;1,&nbsp;...,&nbsp;''n'', the arithmetic mean is the [sum] of the ''a<sub>i</sub>'s'' divided by ''n'' or
+
The most common type of average is the arithmetic mean. If ''n'' numbers are given, each number denoted by ''a<sub>i</sub>'', where ''i''&nbsp;=&nbsp;1,&nbsp;…,&nbsp;''n'', the arithmetic mean is the sum of the ''a<sub>i</sub>'s'' divided by ''n'', or
  
:<math>AM=\frac{1}{n}\sum_{i=1}^na_i=\frac{a_1+a_2+\cdots+a_n}{n}. </math>
+
:<math>AM = \frac{1}{n}\sum_{i=1}^na_i = \frac{1}{n}\left(a_1 + a_2 + \cdots + a_n\right)</math>
  
The arithmetic mean, often simply called the mean, of two numbers, such as 2 and 8, is obtained by finding a value A such that 2 + 8 = A + A. One may find that ''A'' =&nbsp;(2&nbsp;+&nbsp;8)/2 =&nbsp;5. Switching the order of 2 and 8 to read 8 and 2 does not change the resulting value obtained for A. The mean 5 is not less than the minimum 2 nor greater than the maximum&nbsp;8. If we increase the number of terms in the list for which we want an average, we get, for example, that the arithmetic mean of 2, 8, and 11 is found by solving for the value of ''A'' in the equation 2&nbsp;+&nbsp;8&nbsp;+&nbsp;11 =&nbsp;''A''&nbsp;+&nbsp;''A''&nbsp;+&nbsp;''A''. One finds that ''A'' =&nbsp;(2&nbsp;+&nbsp;8&nbsp;+&nbsp;11)/3 =&nbsp;7.
+
The arithmetic mean, often simply called the mean, of two numbers, such as 2 and 8, is obtained by finding a value A such that 2 + 8 = A + A. One may find that ''A'' =&nbsp;(2&nbsp;+&nbsp;8)/2 =&nbsp;5. Switching the order of 2 and 8 to read 8 and 2 does not change the resulting value obtained for A. The mean 5 is not less than the minimum 2 nor greater than the maximum&nbsp;8. If we increase the number of terms in the list to 2, 8, and 11, the arithmetic mean is found by solving for the value of ''A'' in the equation 2&nbsp;+&nbsp;8&nbsp;+&nbsp;11 =&nbsp;''A''&nbsp;+&nbsp;''A''&nbsp;+&nbsp;''A''. One finds that ''A'' =&nbsp;(2&nbsp;+&nbsp;8&nbsp;+&nbsp;11)/3 =&nbsp;7.
  
===Geometric mean===
+
===Pythagorean means===
{{Main|Geometric mean}}
+
{{main|Pythagorean means}}
 +
{{see also|Mean#Pythagorean means}}
 +
Along with the arithmetic mean above, the geometric mean and the harmonic mean are known collectively as the Pythagorean means.
  
The geometric mean of ''n'' non-negative numbers is obtained by multiplying them all together and then taking the ''n''th root.  In algebraic terms, the geometric mean of ''a''<sub>1</sub>,&nbsp;''a''<sub>2</sub>,&nbsp;...,&nbsp;''a''<sub>''n''</sub> is defined as
+
====Geometric mean====
 +
The [[geometric mean]] of ''n'' non-negative numbers is obtained by multiplying them all together and then taking the ''n''th root.  In algebraic terms, the geometric mean of ''a''<sub>1</sub>,&nbsp;''a''<sub>2</sub>,&nbsp;…,&nbsp;''a''<sub>''n''</sub> is defined as
  
: <math>\text{GM=} \sqrt[n]{\prod_{i=1}^n a_i}=\sqrt[n]{a_1 a_2 \cdots a_n}.</math>
+
: <math> GM= \sqrt[n]{\prod_{i=1}^n a_i} = \sqrt[n]{a_1 a_2 \cdots a_n}</math>
  
 
Geometric mean can be thought of as the [[antilog]] of the arithmetic mean of the [[logarithm|logs]] of the numbers.
  
Example: Geometric mean of 2 and 8 is <math>GM = \sqrt{2 \cdot 8} = 4.</math>
+
Example: Geometric mean of 2 and 8 is <math>GM = \sqrt{2 \cdot 8} = 4</math>
  
====Average Percentage Return and CAGR====
+
====Harmonic mean====
{{Main|Compound annual growth rate}}
+
[[Harmonic mean]] for a non-empty collection of numbers ''a''<sub>1</sub>,&nbsp;''a''<sub>2</sub>,&nbsp;…,&nbsp;''a''<sub>''n''</sub>, all different from 0, is defined as the reciprocal of the arithmetic mean of the reciprocals of the ''a''<sub>''i''</sub>{{'}}s:
The average percentage return is a type of average used in finance. It is an example of a geometric mean. When the returns are annual, it is called the Compound Annual Growth Rate (CAGR). For example, if we are considering a period of two years, and the investment return in the first year is −10% and the return in the second year is +60%, then the average percentage return or CAGR, ''R'', can be obtained by solving the equation: {{nowrap|1= (1 − 10%) × (1 + 60%) = (1 − 0.1) × (1 + 0.6) = (1 + ''R'') × (1 + ''R'')}}. The value  of ''R'' that makes this equation true is 0.2, or 20%. This means that the total return over the 2-year period is the same as if there had been 20% growth each year. Note that the order of the years makes no difference - the average percentage returns of +60% and −10% is the same result as that for −10% and +60%.
 
  
This method can be generalized to examples in which the periods are not equal. For example, consider a period of a half of a year for which the return is −23% and a period of two and a half years for which the return is +13%. The average percentage return for the combined period is the single year return, ''R'', that is the solution of the following equation: {{nowrap|1= (1 − 0.23)<sup>0.5</sup> × (1 + 0.13)<sup>2.5</sup> = (1 + ''R'')<sup>0.5+2.5</sup>}}, giving an average percentage return ''R'' of 0.0600 or 6.00%.
+
: <math>HM = \frac{1}{\frac{1}{n}\sum_{i=1}^n \frac{1}{a_i}} = \frac{n}{\frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n}}</math>
  
===Harmonic mean===
+
One example where the harmonic mean is useful is when examining the speed for a number of fixed-distance trips. For example, if the speed for going from point ''A'' to ''B'' was 60&nbsp;km/h, and the speed for returning from ''B'' to ''A'' was 40&nbsp;km/h, then the harmonic mean speed is given by
{{Main|Harmonic mean}}
 
Harmonic mean for a non-empty collection of numbers ''a''<sub>1</sub>,&nbsp;''a''<sub>2</sub>,&nbsp;...,&nbsp;''a''<sub>''n''</sub>, all different from 0, is defined as the reciprocal of the arithmetic mean of the reciprocals of the ''a''<sub>''i''</sub>{{'}}s:
 
  
: <math>HM = \frac{1}{\frac{1}{n}\sum_{i=1}^n \frac{1}{a_i}}=\frac{n}{\frac{1}{a_1}+\frac{1}{a_2}+\cdots+\frac{1}{a_n}}.</math>
+
: <math>\frac{2}{\frac{1}{60} + \frac{1}{40}} = 48</math>
  
One example where it is useful is calculating the average speed for a number of fixed-distance trips. For example, if the speed for going from point ''A'' to ''B'' was 60&nbsp;km/h, and the speed for returning from ''B'' to ''A'' was 40&nbsp;km/h, then the average speed is given by
+
====Inequality concerning AM, GM, and HM====
 +
A well-known inequality concerning the arithmetic, geometric, and harmonic means of any set of positive numbers is
  
: <math>\frac{2}{1/60+1/40}=48.</math>
+
: <math>AM \ge GM \ge HM</math>
  
===Inequality concerning AM, GM, and HM===
+
It is easy to remember by noting that the alphabetical order of the letters ''A'', ''G'', and ''H'' is preserved in the inequality. See [[Inequality of arithmetic and geometric means]].
A well known inequality concerning arithmetic, geometric, and harmonic means for any set of positive numbers is
 
  
: <math>AM \ge GM \ge HM. \, </math>
+
Thus for the above harmonic mean example: AM = 50, GM ≈ 49, and HM = 48 km/h.
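As a rough numerical check of the inequality, the following sketch computes the three Pythagorean means for the 60&nbsp;km/h and 40&nbsp;km/h speeds from the example above. The function names are illustrative and are not taken from any particular library; positive inputs are assumed.

```python
import math

def arithmetic_mean(values):
    # Sum of the values divided by how many there are.
    return sum(values) / len(values)

def geometric_mean(values):
    # n-th root of the product; assumes every value is positive.
    return math.prod(values) ** (1 / len(values))

def harmonic_mean(values):
    # Reciprocal of the arithmetic mean of the reciprocals; assumes no zeros.
    return len(values) / sum(1 / v for v in values)

speeds = [60, 40]
am = arithmetic_mean(speeds)
gm = geometric_mean(speeds)
hm = harmonic_mean(speeds)

# Rounded for display: roughly 50, 48.99 and 48.
print(round(am, 2), round(gm, 2), round(hm, 2))
```

The geometric mean here is the square root of 2400, which is slightly below 49, so the chain AM ≥ GM ≥ HM holds for this pair.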
  
It is easy to remember noting that the alphabetical order of the letters ''A'', ''G'', and ''H'' is preserved in the inequality. See [[Inequality of arithmetic and geometric means]].
+
===Statistical location===
 +
In addition to the [[mean]], the [[mode (statistics)|mode]], the [[median]], and the [[mid-range]] are often used as estimates of [[central tendency]] in [[descriptive statistics]].
  
===Mode===
+
====Mode====
 
[[Image:Comparison mean median mode.svg|thumb|300px|Comparison of [[mean|arithmetic mean]], [[median]] and [[mode (statistics)|mode]] of two [[log-normal distribution]]s with different [[skewness]].]]
 
{{Main|Mode (statistics)}}
 
{{Main|Mode (statistics)}}
The most frequently occurring number in a list is called the mode. The mode of the list (1, 2, 2, 3, 3, 3, 4) is 3. The mode is not necessarily well defined, the list (1, 2, 2, 3, 3, 5) has the two modes 2 and 3. The mode can be subsumed under the general method of defining averages by understanding it as taking the list and setting each member of the list equal to the most common value in the list if there is a most common value. This list is then equated to the resulting list with all values replaced by the same value.  Since they are already all the same, this does not require any change. The mode is more meaningful and potentially useful if there are many numbers in the list, and the frequency of the numbers progresses smoothly (e.g., if out of a group of 1000 people, 30 people weigh 61&nbsp;kg, 32 weigh 62&nbsp;kg, 29 weigh 63&nbsp;kg, and all the other possible weights occur less frequently, then 62&nbsp;kg is the mode).
+
The most frequently occurring number in a list is called the mode. For example, the mode of the list (1, 2, 2, 3, 3, 3, 4) is 3. It may happen that there are two or more numbers which occur equally often and more often than any other number. In this case there is no agreed definition of mode. Some authors say they are all modes and some say there is no mode.
 
 
The mode has the advantage that it can be used with non-numerical data (e.g., red cars are most frequent), while other averages cannot.
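A minimal sketch of the "all tied values are modes" convention, using Python's standard <code>collections.Counter</code>. The helper name <code>modes</code> is hypothetical. Because it only counts occurrences, it works for non-numerical data as well.

```python
from collections import Counter

def modes(items):
    """Return every value that occurs most often, in first-seen order."""
    counts = Counter(items)
    if not counts:
        return []
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]

print(modes([1, 2, 2, 3, 3, 3, 4]))      # [3]
print(modes([1, 2, 2, 3, 3, 5]))         # [2, 3]
print(modes(["red", "blue", "red"]))     # ['red']
```

Under the other convention, a caller would treat a result with more than one element as "no mode".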
 
  
===Median===
+
====Median====
 
{{Main|Median}}
 
{{Main|Median}}
The median is the middle number of the group when they are ranked in order. (If there are an even number of numbers, the mean of the middle two is taken.)  
+
The median is the middle number of the group when they are ranked in order. (If there are an even number of numbers, the mean of the middle two is taken.)
  
 
Thus to find the median, order the list according to its elements' magnitude and then repeatedly remove the pair consisting of the highest and lowest values until either one or two values are left. If exactly one value is left, it is the median; if two values, the median is the arithmetic mean of these two. This method takes the list 1, 7, 3, 13 and orders it to read 1, 3, 7, 13. Then the 1 and 13 are removed to obtain the list 3, 7. Since there are two elements in this remaining list, the median is their arithmetic mean, (3 + 7)/2 = 5.
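The trimming procedure described above can be sketched as follows. The function name <code>median_by_trimming</code> is an assumption made for illustration; in practice, sorting and indexing the middle directly is more efficient.

```python
def median_by_trimming(values):
    """Median found by repeatedly discarding the lowest and highest values."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("median of an empty list is undefined")
    # Drop one value from each end until only one or two remain.
    while len(ordered) > 2:
        ordered = ordered[1:-1]
    # One value left: that value. Two left: their arithmetic mean.
    return sum(ordered) / len(ordered)

print(median_by_trimming([1, 7, 3, 13]))  # 5.0
```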
  
==Types==
+
==Summary of types==
The [[table of mathematical symbols]] explains the symbols used below.
+
{{see also|Mean#Other means}}
 
{|class="wikitable" style="background:white;"
 
{|class="wikitable" style="background:white;"
 
|-
 
|-
 
! Name !! Equation or description
 
! Name !! Equation or description
 
|-
 
|-
| [[Arithmetic mean]] || <math>\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i  =  \frac{1}{n} (x_1+\cdots+x_n)</math>
+
| [[Arithmetic mean]] || <math>\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i  =  \frac{1}{n} (x_1 + \cdots + x_n)</math>
 
|-
 
|-
 
| [[Median]] || The middle value that separates the higher half from the lower half of the data set
 
| [[Median]] || The middle value that separates the higher half from the lower half of the data set
Line 85: Line 75:
 
| [[Harmonic mean]] || <math>\frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}</math>
 
| [[Harmonic mean]] || <math>\frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}</math>
 
|-
 
|-
| [[Quadratic mean]]<br />(or RMS) || <math>\sqrt{\frac{1}{n} \sum_{i=1}^{n} x_i^2} =
+
| [[Quadratic mean]]<br />(or RMS) || <math>\sqrt{\frac{1}{n} \sum_{i=1}^{n} x_i^2} = \sqrt{\frac{1}{n}\left(x_1^2 + x_2^2 + \cdots + x_n^2\right)}</math>
\sqrt {\frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n}}
 
</math>
 
 
|-
 
|-
 
| [[Generalized mean]] || <math>\sqrt[p]{\frac{1}{n} \cdot \sum_{i=1}^n x_{i}^p}</math>
 
| [[Generalized mean]] || <math>\sqrt[p]{\frac{1}{n} \cdot \sum_{i=1}^n x_{i}^p}</math>
Line 97: Line 85:
 
| [[Interquartile mean]] || A special case of the truncated mean, using the [[interquartile range]]
 
| [[Interquartile mean]] || A special case of the truncated mean, using the [[interquartile range]]
 
|-
 
|-
| [[Midrange]] || <math>\frac{\max x + \min x}{2}</math>
+
| [[Midrange]] || <math>\frac{1}{2}\left(\max x + \min x\right)</math>
 
|-
 
|-
 
| [[Winsorized mean]] ||  Similar to the truncated mean, but, rather than deleting the extreme values, they are set equal to the largest and smallest values that remain
 
| [[Winsorized mean]] ||  Similar to the truncated mean, but, rather than deleting the extreme values, they are set equal to the largest and smallest values that remain
|-
 
| [[Compound annual growth rate|Annualization]] || <math> {\left[ \prod (1+R_i )^{t_i} \right] }^{1/\sum t_i} -1</math>
 
 
|}
 
|}
 +
The [[table of mathematical symbols]] explains the symbols used above.
  
==Solutions to variational problems==
+
==Miscellaneous types==
Several measures of central tendency can be characterized as solving a variational problem, in the sense of the [[calculus of variations]], namely minimizing variation from the center. That is, given a measure of [[statistical dispersion]], one asks for a measure of central tendency that minimizes variation: such that variation from the center is minimal among all choices of center. In a quip, "dispersion precedes location". In the sense of [[Lp space|''L''<sup>''p''</sup> spaces]], the correspondence is:
 
{| class="wikitable"
 
! ''L''<sup>''p''</sup> !! dispersion !! central tendency
 
|-
 
! ''L''<sup>1</sup>
 
| [[average absolute deviation]]
 
| [[median]]
 
|-
 
! ''L''<sup>2</sup>
 
| [[standard deviation]]
 
| [[mean]]
 
|-
 
! ''L''<sup>∞</sup>
 
| [[maximum deviation]]
 
| [[midrange]]
 
|}
 
 
 
Thus standard deviation about the mean is lower than standard deviation about any other point, and the maximum deviation about the midrange is lower than the maximum deviation about any other point. The uniqueness of this characterization of mean follows from [[convex optimization]].  Indeed, for a given (fixed) data set ''x'', the function
 
 
 
:<math>f_2(c) = \|x-c\|_2</math>
 
 
 
represents the dispersion about a constant value ''c'' relative to the ''L''<sup>2</sup> norm.  Because the function ''ƒ''<sub>2</sub> is a strictly [[convex function|convex]] [[coercive function]], the minimizer exists and is unique.
 
 
 
Note that the median in this sense is not in general unique, and in fact any point between the two central points of a discrete distribution minimizes average absolute deviation.  The dispersion in the ''L''<sup>1</sup> norm, given by
 
:<math>f_1(c) = \|x-c\|_1</math>
 
is not ''strictly'' convex, whereas strict convexity is needed to ensure uniqueness of the minimizer.  In spite of this, the minimizer is unique for the ''L''<sup>∞</sup> norm.
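The correspondence in the table can be checked numerically with a coarse grid search. The sketch below is only an illustration under stated assumptions: the data set 1, 3, 7, 13 is borrowed from the median example, and the grid spacing of 0.01 and all function names are arbitrary choices.

```python
def l1_dispersion(c, xs):
    # Sum of absolute deviations from c.
    return sum(abs(x - c) for x in xs)

def l2_dispersion(c, xs):
    # Euclidean norm of the deviations from c.
    return sum((x - c) ** 2 for x in xs) ** 0.5

def linf_dispersion(c, xs):
    # Largest absolute deviation from c.
    return max(abs(x - c) for x in xs)

data = [1, 3, 7, 13]
candidates = [i / 100 for i in range(0, 1401)]  # 0.00, 0.01, ..., 14.00

best_l2 = min(candidates, key=lambda c: l2_dispersion(c, data))
best_linf = min(candidates, key=lambda c: linf_dispersion(c, data))

print(best_l2)    # 6.0, the arithmetic mean
print(best_linf)  # 7.0, the midrange

# The L1 dispersion is flat between the two central points 3 and 7,
# so every centre in that interval is a minimizer.
print(l1_dispersion(3, data), l1_dispersion(5, data), l1_dispersion(7, data))
```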
 
  
==Miscellaneous types==
 
 
Other more sophisticated averages are: [[trimean]], [[trimedian]], and [[normalized mean]], with their generalizations.<ref>{{cite journal |last1=Merigo |first1=Jose M. |last2=Cananovas |first2=Montserrat |title=The Generalized Hybrid Averaging Operator and its Application in Decision Making |year=2009 |journal=Journal of Quantitative Methods for Economics and Business Administration |volume=9 |pages=69–84 |issn=1886-516X |url=http://www.upo.es/RevMetCuant/art.php?id=38}}</ref>
  
 
One can create one's own average metric using the [[generalized f-mean|generalized ''f''-mean]]:
  
: <math>y = f^{-1}\left(\frac{f(x_1)+f(x_2)+\cdots+f(x_n)}{n}\right),</math>
+
: <math>y = f^{-1}\left(\frac{1}{n}\left[f(x_1) + f(x_2) + \cdots + f(x_n)\right]\right)</math>
  
where ''f'' is any invertible function. The harmonic mean is an example of this using ''f''(''x'') = 1/''x'', and the geometric mean is another, using ''f''(''x'') = log&nbsp;''x''.  
+
where ''f'' is any invertible function. The harmonic mean is an example of this using ''f''(''x'') = 1/''x'', and the geometric mean is another, using ''f''(''x'') = log&nbsp;''x''.
  
However, this method for generating means is not general enough to capture all averages.  A more general method<ref name=Bibby/> for defining an average takes any function ''g''(''x''<sub>1</sub>,&nbsp;''x''<sub>2</sub>,&nbsp;...,&nbsp;''x''<sub>''n''</sub>) of a list of arguments that is [[Continuous function|continuous]], [[Monotonicity|strictly increasing]] in each argument, and symmetric (invariant under [[permutation]] of the arguments). The average ''y'' is then the value which, when replacing each member of the list, results in the same function value: {{nowrap|1=''g''(''y'', ''y'', ..., ''y'') =}} {{nowrap|''g''(''x''<sub>1</sub>, ''x''<sub>2</sub>, ..., ''x''<sub>''n''</sub>)}}.  This most general definition still captures the important property of all averages that the average of a list of identical elements is that element itself.  
+
However, this method for generating means is not general enough to capture all averages.  A more general method<ref name=Bibby/> for defining an average takes any function ''g''(''x''<sub>1</sub>,&nbsp;''x''<sub>2</sub>,&nbsp;…,&nbsp;''x''<sub>''n''</sub>) of a list of arguments that is [[Continuous function|continuous]], [[Monotonicity|strictly increasing]] in each argument, and symmetric (invariant under [[permutation]] of the arguments). The average ''y'' is then the value that, when replacing each member of the list, results in the same function value: {{nowrap|1=''g''(''y'', ''y'', …, ''y'') =}} {{nowrap|''g''(''x''<sub>1</sub>, ''x''<sub>2</sub>, …, ''x''<sub>''n''</sub>)}}.  This most general definition still captures the important property of all averages that the average of a list of identical elements is that element itself. The function {{nowrap|1=''g''(''x''<sub>1</sub>, ''x''<sub>2</sub>, …, ''x''<sub>''n''</sub>) =}} {{nowrap|''x''<sub>1</sub>+''x''<sub>2</sub>+ ··· + ''x''<sub>''n''</sub>}} provides the arithmetic mean. The function {{nowrap|1 = ''g''(''x''<sub>1</sub>, ''x''<sub>2</sub>, …, ''x''<sub>''n''</sub>) =}} {{nowrap|''x''<sub>1</sub>''x''<sub>2</sub>···''x''<sub>''n''</sub>}} (where the list elements are positive numbers) provides the geometric mean. The function {{nowrap|1 = ''g''(''x''<sub>1</sub>, ''x''<sub>2</sub>, …, ''x''<sub>''n''</sub>) =}} {{nowrap|−(''x''<sub>1</sub><sup>−1</sup>+''x''<sub>2</sub><sup>−1</sup>+ ··· + ''x''<sub>''n''</sub><sup>−1</sup>)}} (where the list elements are positive numbers) provides the harmonic mean.<ref name=Bibby>John Bibby (1974). “Axiomatisations of the average and a further generalisation of monotonic sequences”. ''[[Glasgow Mathematical Journal]]'', vol. 15, pp.&nbsp;63–65.</ref>
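The generalized ''f''-mean can be sketched in a few lines. The helper <code>f_mean</code> and its parameter names are hypothetical; the caller is assumed to supply a function together with its inverse, and the inputs are assumed to lie in the domain where both are defined.

```python
import math

def f_mean(values, f, f_inverse):
    """Generalized f-mean: f_inverse of the arithmetic mean of f(x)."""
    return f_inverse(sum(f(x) for x in values) / len(values))

# f(x) = 1/x is its own inverse and gives the harmonic mean.
harmonic = f_mean([60, 40], lambda x: 1 / x, lambda y: 1 / y)

# f(x) = log x, with inverse exp, gives the geometric mean.
geometric = f_mean([2, 8], math.log, math.exp)

print(round(harmonic, 6), round(geometric, 6))
```

For these inputs the results agree, up to floating-point rounding, with the harmonic mean of 48 and the geometric mean of 4 quoted in the examples for 60 and 40, and for 2 and 8.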
The function {{nowrap|1=''g''(''x''<sub>1</sub>, ''x''<sub>2</sub>, ..., ''x''<sub>''n''</sub>) =}} {{nowrap|''x''<sub>1</sub>+''x''<sub>2</sub>+ ··· + ''x''<sub>''n''</sub>}} provides the arithmetic mean.  
 
The function {{nowrap|1=''g''(''x''<sub>1</sub>, ''x''<sub>2</sub>, ..., ''x''<sub>''n''</sub>) =}} {{nowrap|''x''<sub>1</sub>''x''<sub>2</sub>···''x''<sub>''n''</sub>}} (where the list elements are positive numbers) provides the geometric mean.  
 
The function {{nowrap|1=''g''(''x''<sub>1</sub>, ''x''<sub>2</sub>, ..., ''x''<sub>''n''</sub>) =}} {{nowrap|−(''x''<sub>1</sub><sup>−1</sup>+''x''<sub>2</sub><sup>−1</sup>+ ··· + ''x''<sub>''n''</sub><sup>−1</sup>)}} (where the list elements are positive numbers) provides the harmonic mean.<ref name=Bibby>John Bibby (1974). “Axiomatisations of the average and a further generalisation of monotonic sequences”. ''[[Glasgow Mathematical Journal]]'', vol. 15, pp.&nbsp;63–65.</ref>
 
  
==In data streams==
+
===Average percentage return and CAGR===
The concept of an average can be applied to a stream of data as well as a bounded set, the goal being to find a value about which recent data is in some way clustered.  The stream may be distributed in time, as in samples taken by some data acquisition system from which we want to remove noise, or in space, as in pixels in an image from which we want to extract some property.  An easy-to-understand and widely used application of average to a stream is the simple [[moving average]], in which we compute the arithmetic mean of the most recent ''N'' data items in the stream.  To advance one position in the stream, we add 1/''N'' times the new data item and subtract 1/''N'' times the data item ''N'' places back in the stream.
+
{{Main|Compound annual growth rate}}
 +
A type of average used in finance is the average percentage return. It is an example of a geometric mean. When the returns are annual, it is called the Compound Annual Growth Rate (CAGR). For example, if we are considering a period of two years, and the investment return in the first year is −10% and the return in the second year is +60%, then the average percentage return or CAGR, ''R'', can be obtained by solving the equation: {{nowrap|1= (1 − 10%) × (1 + 60%) = (1 − 0.1) × (1 + 0.6) = (1 + ''R'') × (1 + ''R'')}}. The value of ''R'' that makes this equation true is 0.2, or 20%. This means that the total return over the 2-year period is the same as if there had been 20% growth each year. Note that the order of the years makes no difference – the average percentage returns of +60% and −10% is the same result as that for −10% and +60%.
  
:Update rule for a window of size <math>k</math> upon seeing new element <math>x_n</math>:
+
This method can be generalized to examples in which the periods are not equal. For example, consider a period of a half of a year for which the return is −23% and a period of two and a half years for which the return is +13%. The average percentage return for the combined period is the single year return, ''R'', that is the solution of the following equation: {{nowrap|1= (1 − 0.23)<sup>0.5</sup> × (1 + 0.13)<sup>2.5</sup> = (1 + ''R'')<sup>0.5+2.5</sup>}}, giving an average percentage return ''R'' of 0.0600 or 6.00%.
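Both calculations above follow the same pattern: compound the growth factors, then take the root corresponding to the total elapsed time. A minimal sketch, in which the function name <code>average_return</code> and its argument layout are assumptions made for illustration:

```python
def average_return(returns, durations):
    """Solve prod((1 + r_i) ** t_i) == (1 + R) ** sum(t_i) for R.

    returns   -- per-period rates as fractions (e.g. -0.10 for -10%)
    durations -- length of each period in years
    """
    growth = 1.0
    for r, t in zip(returns, durations):
        growth *= (1 + r) ** t
    return growth ** (1 / sum(durations)) - 1

# Two one-year periods: -10% then +60%.
two_year = average_return([-0.10, 0.60], [1, 1])

# Unequal periods: -23% over half a year, +13% over two and a half years.
unequal = average_return([-0.23, 0.13], [0.5, 2.5])

print(f"{two_year:.2%}", f"{unequal:.2%}")
```

Swapping the order of the periods leaves the result unchanged, since the growth factors are simply multiplied together.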
<center>
+
 
<math>\mu_{n,n-k} = \mu_{n-1,n-k-1} + \frac{x_n}{k} - \frac{x_{n-k}}{k}</math>
+
==Moving average==
</center>
+
{{main|Moving average}}
  
==Averages of functions==
+
Given a [[time series]] such as daily stock market prices or yearly temperatures, people often want to create a smoother series.<ref>{{cite book | first1=George E.P. | last1= Box |first2=Gwilym M.| last2= Jenkins| title= Time Series Analysis: Forecasting and Control | edition= revised edition| publisher=Holden-Day | year=1976 | isbn=0816211043}}</ref> This helps to show underlying trends or perhaps periodic behavior. An easy way to do this is to choose a number ''n'' and create a new series by taking the arithmetic mean of the first ''n'' values, then moving forward one place and repeating, and so on. This is the simplest form of moving average. More complicated forms involve using a [[weighted average]]. The weighting can be used to enhance or suppress various periodic behavior and there is very extensive analysis of what weightings to use in the literature on [[Digital filter|filtering]]. In [[digital signal processing]] the term “moving average” is used even when the sum of the weights is not 1.0 (so the output series is a scaled version of the averages).<ref>{{cite book | first1=Simon | last1= Haykin | title= Adaptive Filter Theory | publisher=Prentice-Hall | year=1986 | isbn=0130040525}}</ref> The reason for this is that the analyst is usually interested only in the trend or the periodic behavior. A further generalization is an [[Autoregressive moving average model|“autoregressive moving average”]]. In this case the average also includes some of the recently calculated outputs. This allows samples from further back in the history to affect the current output.
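The simplest (unweighted) form can be sketched as a generator that keeps a running total over the last ''n'' values. The name <code>simple_moving_average</code> is hypothetical, and the sketch emits no output until a full window is available, which is one of several possible conventions.

```python
from collections import deque

def simple_moving_average(series, n):
    """Yield the arithmetic mean of each window of n consecutive values."""
    if n <= 0:
        raise ValueError("window size must be positive")
    window = deque()
    total = 0.0
    for x in series:
        window.append(x)
        total += x
        if len(window) > n:
            # Subtract the value that has just left the window.
            total -= window.popleft()
        if len(window) == n:
            yield total / n

print(list(simple_moving_average([1, 2, 3, 4, 5], 3)))  # [2.0, 3.0, 4.0]
```

Keeping a running total avoids re-summing the whole window at each step, at the cost of some accumulated floating-point error on very long series.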
The concept of average can be extended to functions.<ref>G. H. Hardy, J. E. Littlewood, and G. Pólya. ''Inequalities'' (2nd ed.), Cambridge University Press, ISBN 978-0-521-35880-4, 1988.</ref>  In [[calculus]], the average value of an [[integral|integrable]] function ''ƒ'' on an interval [''a'',''b''] is defined by
 
:<math>\overline{f} = \frac{1}{b-a}\int_a^bf(x)\,dx.</math>
 
  
 
==Etymology==
 
==Etymology==
An early meaning (c. 1500) of the word ''average'' is "damage sustained at sea". The root is found in Arabic as ''awar'', in Italian as ''avaria'', in French as ''avarie'' and in Dutch as ''averij''.  Hence an ''average adjuster'' is a person who assesses an insurable loss.
+
"Few words have received more etymological investigation." <ref>Oxford English Dictionary</ref> In the 16th century ''average'' meant a customs duty, or the like, and was used in the Mediterranean area. It came to mean the cost of damage sustained at sea. From that came an "average adjuster" who decided how to apportion a loss between the owners and insurers of a ship and cargo.
  
 
Marine damage is either ''particular average'', which is borne only by the owner of the damaged property, or [[general average]], where the owner can claim a proportional contribution from all the parties to the marine venture.  The type of calculations used in adjusting general average gave rise to the use of "average" to mean "arithmetic mean".
  
However, according to the Oxford English Dictionary, the earliest usage in English (1489 or earlier) appears to be an old legal term for a tenant's day labour obligation to a sheriff, probably anglicised from "avera" found in the English [[Domesday Book]] (1085). This pre-existing term thus lay to hand when an equivalent for ''avarie'' was wanted.
+
The root is found in Arabic as ''awar'', in Italian as ''avaria'', in French as ''avarie'' and in Dutch as ''averij''. It is unclear in which language the word first appeared.
 +
 
 +
There is an earlier (from at least the 11th century), unrelated use of the word. It appears to be an old legal term for a tenant's day labour obligation to a sheriff, probably anglicised from "avera" found in the English [[Domesday Book]] (1085).
  
 
==See also==
 
==See also==
 
{{Portal|Statistics}}
 
{{Portal|Statistics}}
*[[Algorithms for calculating variance|Algorithms for calculating mean and variance]]
 
 
*[[Law of averages]]
 
*[[Law of averages]]
*[[Spherical mean]]
+
*[[Expected value]]
  
==Notes==
+
==References==
 
{{Reflist}}
 
{{Reflist}}
 
==References==
 
*{{Cite book|first1=G.H.|last1=Hardy|authorlink1=G.H. Hardy|first2=J.E.|last2=Littlewood|authorlink2=John Edensor Littlewood|first3=G.|last3=Pólya|authorlink3=George Pólya|title=Inequalities|year=1988|publisher=Cambridge University Press|edition=2nd|isbn=978-0-521-35880-4|postscript=<!-- Bot inserted parameter. Either remove it; or change its value to "." for the cite to end in a ".", as necessary. -->{{inconsistent citations}}}}
 
  
 
==External links==
 
==External links==
Line 185: Line 139:
 
[[Category:Means]]
 
[[Category:Means]]
 
[[Category:Statistical terminology]]
 
[[Category:Statistical terminology]]
[[Category:Arabic words and phrases]]
+
[[Category:Arithmetic functions]]
 
 
[[ar:متوسط رياضي]]
 
[[bn:গড়]]
 
[[ca:Mitjana (matemàtiques)]]
 
[[cs:Míra polohy]]
 
[[sn:Chipakati]]
 
[[de:Mittelwert]]
 
[[el:Μέσος όρος]]
 
[[es:Media (matemáticas)]]
 
[[eo:Centra dispozicio]]
 
[[eu:Batezbesteko]]
 
[[fr:Moyenne]]
 
[[gl:Media (matemáticas)]]
 
[[it:Media (statistica)]]
 
[[he:ממוצע]]
 
[[kn:ಸರಾಸರಿ]]
 
[[lt:Vidurkis]]
 
[[mk:Просек]]
 
[[mr:सरासरी]]
 
[[nl:Gemiddelde]]
 
[[ja:平均]]
 
[[no:Gjennomsnitt]]
 
[[nn:Sentraltendens]]
 
[[pl:Średnia]]
 
[[pt:Média]]
 
[[scn:Media (statìstica)]]
 
[[sk:Priemer (štatistika)]]
 
[[sl:Srednja vrednost]]
 
[[su:Average]]
 
[[fi:Keskiluku]]
 
[[sv:Lägesmått]]
 
[[th:แนวโน้มสู่ส่วนกลาง]]
 
[[tr:Ortalama]]
 
[[ur:اوسط]]
 
[[wuu:平均]]
 
[[yi:דורכשניט]]
 

Latest revision as of 20:20, 6 January 2015

In colloquial language average usually refers to the sum of a list of numbers divided by the size of the list, in other words the arithmetic mean. However, the word "average" can be used to refer to the median, the mode, or some other central or typical value. In statistics, these are all known as measures of central tendency. Thus the concept of an average can be extended in various ways in mathematics, but in those contexts it is usually referred to as a mean (for example the mean of a function).

Calculation

Arithmetic mean

The most common type of average is the arithmetic mean. If n numbers are given, each denoted by $a_i$, where i = 1, …, n, the arithmetic mean is the sum of the $a_i$ divided by n, or

$$AM = \frac{1}{n}\sum_{i=1}^{n} a_i = \frac{a_1 + a_2 + \cdots + a_n}{n}.$$

The arithmetic mean, often simply called the mean, of two numbers, such as 2 and 8, is obtained by finding a value A such that 2 + 8 = A + A. One may find that A = (2 + 8)/2 = 5. Switching the order of 2 and 8 to read 8 and 2 does not change the resulting value obtained for A. The mean 5 is not less than the minimum 2 nor greater than the maximum 8. If we increase the number of terms in the list to 2, 8, and 11, the arithmetic mean is found by solving for the value of A in the equation 2 + 8 + 11 = A + A + A. One finds that A = (2 + 8 + 11)/3 = 7.
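The worked examples above can be reproduced in a few lines of Python. This is a minimal sketch; the function name `arithmetic_mean` is chosen here for illustration and is not part of the article.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by how many there are."""
    values = list(values)
    if not values:
        raise ValueError("the arithmetic mean of an empty list is undefined")
    return sum(values) / len(values)


print(arithmetic_mean([2, 8]))      # 5.0
print(arithmetic_mean([8, 2]))      # 5.0, the order does not matter
print(arithmetic_mean([2, 8, 11]))  # 7.0
```

The result always lies between the smallest and largest input, as noted for 2 and 8.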

Pythagorean means

Along with the arithmetic mean above, the geometric mean and the harmonic mean are known collectively as the Pythagorean means.

Geometric mean

The geometric mean of n non-negative numbers is obtained by multiplying them all together and then taking the nth root. In algebraic terms, the geometric mean of $a_1, a_2, \ldots, a_n$ is defined as

$$GM = \sqrt[n]{a_1 a_2 \cdots a_n}.$$

The geometric mean can be thought of as the antilog of the arithmetic mean of the logs of the numbers.

Example: the geometric mean of 2 and 8 is $\sqrt{2 \times 8} = \sqrt{16} = 4$.
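Both descriptions of the geometric mean, the nth root of the product and the antilog of the mean of the logs, can be checked numerically. The sketch below assumes Python 3.8 or later (for `math.prod`); the function names are illustrative only. The log-based version requires strictly positive inputs.

```python
import math


def geometric_mean(values):
    """nth root of the product of n non-negative numbers."""
    values = list(values)
    if not values:
        raise ValueError("the geometric mean of an empty list is undefined")
    if any(v < 0 for v in values):
        raise ValueError("the geometric mean needs non-negative numbers")
    return math.prod(values) ** (1 / len(values))


def geometric_mean_via_logs(values):
    """Antilog of the arithmetic mean of the logs (positive inputs only)."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


print(geometric_mean([2, 8]))           # 4.0
print(geometric_mean_via_logs([2, 8]))  # approximately 4
```

The log form is often preferred for long lists, since the running product can overflow or underflow while the sum of logs stays in a moderate range.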

Harmonic mean

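The harmonic mean of a non-empty collection of numbers $a_1, a_2, \ldots, a_n$, all different from 0, is defined as the reciprocal of the arithmetic mean of the reciprocals of the $a_i$:

$$HM = \frac{1}{\frac{1}{n}\left(\frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n}\right)} = \frac{n}{\frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n}}.$$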

One example where the harmonic mean is useful is when examining the speed for a number of fixed-distance trips. For example, if the speed for going from point A to B was 60 km/h, and the speed for returning from B to A was 40 km/h, then the harmonic mean speed is given by

$$\frac{2}{\frac{1}{60} + \frac{1}{40}} = 48 \text{ km/h}.$$
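The round-trip example can be verified by comparing the harmonic mean with the actual average speed (total distance divided by total time). This is a sketch only; the 120 km one-way distance is an assumed figure picked for convenience, and any positive distance gives the same answer.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    values = list(values)
    if not values or any(v == 0 for v in values):
        raise ValueError("the harmonic mean needs a non-empty list of non-zero numbers")
    return len(values) / sum(1 / v for v in values)


# Speeds from the example in the text (km/h).
speed_out, speed_back = 60, 40

# Assumed one-way distance in km; the result does not depend on it.
distance = 120
total_time = distance / speed_out + distance / speed_back  # 2 h + 3 h
average_speed = 2 * distance / total_time

print(harmonic_mean([speed_out, speed_back]))  # approximately 48
print(average_speed)                           # 48.0
```

The arithmetic mean of the two speeds (50 km/h) would overstate the average, because more time is spent travelling at the slower speed.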

Inequality concerning AM, GM, and HM

A well-known inequality concerning the arithmetic, geometric, and harmonic means of any set of positive numbers is

$$AM \ge GM \ge HM.$$

It is easy to remember by noting that the alphabetical order of the letters A, G, and H is preserved in the inequality. See Inequality of arithmetic and geometric means.

Thus for the above harmonic mean example: AM = 50, GM ≈ 49, and HM = 48 km/h.
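The three values for the speeds 60 and 40 km/h can be computed with Python's standard `statistics` module. The sketch assumes Python 3.8 or later, where `statistics.geometric_mean` is available.

```python
import statistics

speeds = [60, 40]

am = statistics.mean(speeds)            # arithmetic mean
gm = statistics.geometric_mean(speeds)  # geometric mean, sqrt(2400)
hm = statistics.harmonic_mean(speeds)   # harmonic mean

print(am, round(gm, 2), hm)

# The ordering AM >= GM >= HM holds for any positive inputs.
assert am >= gm >= hm
```

The geometric mean is not a whole number here (it is the square root of 2400, about 48.99), which is why the text quotes it as roughly 49.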

Statistical location

In addition to the mean, the mode, the median, and the mid-range are often used as estimates of central tendency in descriptive statistics.

Mode

Comparison of arithmetic mean, median and mode of two log-normal distributions with different skewness.

The most frequently occurring number in a list is called the mode. For example, the mode of the list (1, 2, 2, 3, 3, 3, 4) is 3. It may happen that there are two or more numbers which occur equally often and more often than any other number. In this case there is no agreed definition of mode. Some authors say they are all modes and some say there is no mode.
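Because ties have no agreed treatment, a program has to pick one convention. The sketch below follows the "they are all modes" convention and returns every value that shares the highest count; the function name `modes` is illustrative.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # Counter keeps first-seen order, so ties come back in input order.
    return [value for value, count in counts.items() if count == highest]


print(modes([1, 2, 2, 3, 3, 3, 4]))  # [3]
print(modes([1, 1, 2, 2, 3]))        # [1, 2], a tie
```

A caller that prefers the "no mode" convention can treat any result with more than one element as undefined.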

Median

The median is the middle number of the group when the numbers are ranked in order. (If there is an even number of values, the mean of the middle two is taken.)

Thus to find the median, order the list according to its elements' magnitude and then repeatedly remove the pair consisting of the highest and lowest values until either one or two values are left. If exactly one value is left, it is the median; if two values, the median is the arithmetic mean of these two. This method takes the list 1, 7, 3, 13 and orders it to read 1, 3, 7, 13. Then the 1 and 13 are removed to obtain the list 3, 7. Since there are two elements in this remaining list, the median is their arithmetic mean, (3 + 7)/2 = 5.
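The pair-removal procedure described above translates directly into code. This is a sketch of that specific procedure, not the most efficient way to find a median; the function name `median_by_trimming` is illustrative.

```python
def median_by_trimming(values):
    """Sort, then repeatedly drop the lowest and highest values."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("the median of an empty list is undefined")
    while len(ordered) > 2:
        ordered = ordered[1:-1]  # remove one value from each end
    if len(ordered) == 1:
        return ordered[0]
    return (ordered[0] + ordered[1]) / 2


print(median_by_trimming([1, 7, 3, 13]))  # 5.0
print(median_by_trimming([3, 1, 2]))      # 2
```

In practice the middle element (or middle pair) of the sorted list can be indexed directly, which gives the same result without the loop.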

Summary of types


Name | Equation or description
Arithmetic mean | $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Median | The middle value that separates the higher half from the lower half of the data set
Geometric median | A rotation invariant extension of the median for points in $\mathbb{R}^n$
Mode | The most frequent value in the data set
Geometric mean | $\left(\prod_{i=1}^{n} x_i\right)^{1/n}$
Harmonic mean | $\frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}$
Quadratic mean (or RMS) | $\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}$
Generalized mean | $\left(\frac{1}{n}\sum_{i=1}^{n} x_i^p\right)^{1/p}$
Weighted mean | $\frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$
Truncated mean | The arithmetic mean of data values after a certain number or proportion of the highest and lowest data values have been discarded
Interquartile mean | A special case of the truncated mean, using the interquartile range
Midrange | $\frac{\max x + \min x}{2}$
Winsorized mean | Similar to the truncated mean, but, rather than deleting the extreme values, they are set equal to the largest and smallest values that remain

The table of mathematical symbols explains the symbols used above.

Miscellaneous types

Other more sophisticated averages are: trimean, trimedian, and normalized mean, with their generalizations.[1]

One can create one's own average metric using the generalized f-mean:

$$y = f^{-1}\left(\frac{f(x_1) + f(x_2) + \cdots + f(x_n)}{n}\right)$$

where f is any invertible function. The harmonic mean is an example of this using f(x) = 1/x, and the geometric mean is another, using f(x) = log x.
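The generalized f-mean can be sketched as a function that takes f and its inverse as arguments. The name `f_mean` and its parameter names are assumptions made for this illustration; the caller is responsible for supplying a correct inverse.

```python
import math


def f_mean(values, f, f_inverse):
    """Apply f, take the arithmetic mean, then map back with the inverse of f."""
    values = list(values)
    if not values:
        raise ValueError("the f-mean of an empty list is undefined")
    return f_inverse(sum(f(v) for v in values) / len(values))


# f(x) = x gives the arithmetic mean.
print(f_mean([2, 8, 11], lambda x: x, lambda y: y))    # 7.0

# f(x) = log x gives the geometric mean (inverse is exp).
print(f_mean([2, 8], math.log, math.exp))              # approximately 4

# f(x) = 1/x gives the harmonic mean (1/x is its own inverse).
print(f_mean([60, 40], lambda x: 1 / x, lambda y: 1 / y))  # approximately 48
```

Choosing f(x) = x² with the square root as inverse would give the quadratic mean in the same way.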

However, this method for generating means is not general enough to capture all averages. A more general method[2] for defining an average takes any function $g(x_1, x_2, \ldots, x_n)$ of a list of arguments that is continuous, strictly increasing in each argument, and symmetric (invariant under permutation of the arguments). The average y is then the value that, when replacing each member of the list, results in the same function value: $g(y, y, \ldots, y) = g(x_1, x_2, \ldots, x_n)$. This most general definition still captures the important property of all averages that the average of a list of identical elements is that element itself. The function $g(x_1, x_2, \ldots, x_n) = x_1 + x_2 + \cdots + x_n$ provides the arithmetic mean. The function $g(x_1, x_2, \ldots, x_n) = x_1 x_2 \cdots x_n$ (where the list elements are positive numbers) provides the geometric mean. The function $g(x_1, x_2, \ldots, x_n) = -(x_1^{-1} + x_2^{-1} + \cdots + x_n^{-1})$ (where the list elements are positive numbers) provides the harmonic mean.[2]
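Since g is continuous and strictly increasing in each argument, g(y, y, …, y) is strictly increasing in y, and the solution must lie between the smallest and largest list element. That makes a numerical solution by bisection possible. The sketch below is one assumed way to do it, not a method taken from the cited paper; `average_from_g` is a hypothetical name.

```python
import math


def average_from_g(g, values, iterations=200):
    """Solve g(y, y, ..., y) = g(x1, ..., xn) for y by bisection."""
    values = list(values)
    n = len(values)
    target = g(*values)
    low, high = min(values), max(values)
    for _ in range(iterations):
        mid = (low + high) / 2
        if g(*([mid] * n)) < target:
            low = mid   # y is too small, search the upper half
        else:
            high = mid  # y is large enough, search the lower half
    return (low + high) / 2


# The three functions g named in the text.
print(average_from_g(lambda *xs: sum(xs), [2, 8, 11]))                # about 7
print(average_from_g(lambda *xs: math.prod(xs), [2, 8]))              # about 4
print(average_from_g(lambda *xs: -sum(1 / x for x in xs), [60, 40]))  # about 48
```

For these three choices of g a closed form exists, so the solver is only needed for functions g that cannot be inverted by hand.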

Average percentage return and CAGR

A type of average used in finance is the average percentage return. It is an example of a geometric mean. When the returns are annual, it is called the Compound Annual Growth Rate (CAGR). For example, if we are considering a period of two years, and the investment return in the first year is −10% and the return in the second year is +60%, then the average percentage return or CAGR, R, can be obtained by solving the equation: (1 − 10%) × (1 + 60%) = (1 − 0.1) × (1 + 0.6) = (1 + R) × (1 + R). The value of R that makes this equation true is 0.2, or 20%. This means that the total return over the 2-year period is the same as if there had been 20% growth each year. Note that the order of the years makes no difference: the average percentage return for +60% and −10% is the same as that for −10% and +60%.

This method can be generalized to examples in which the periods are not equal. For example, consider a period of a half of a year for which the return is −23% and a period of two and a half years for which the return is +13%. The average percentage return for the combined period is the single year return, R, that is the solution of the following equation: $(1 - 0.23)^{0.5} \times (1 + 0.13)^{2.5} = (1 + R)^{0.5 + 2.5}$, giving an average percentage return R of 0.0600 or 6.00%.
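Both calculations follow the same pattern: compound each rate over its period, then take the root corresponding to the total number of years. The sketch below treats each return as a yearly rate applied for the stated number of years, which is how the equations above use them; the function name `average_percentage_return` and its input format are assumptions for illustration.

```python
def average_percentage_return(periods):
    """periods: list of (yearly_rate, years) pairs, e.g. (-0.10, 1)."""
    total_years = sum(years for _, years in periods)
    if total_years <= 0:
        raise ValueError("the total period must be positive")
    growth = 1.0
    for rate, years in periods:
        growth *= (1 + rate) ** years  # compound growth over this period
    return growth ** (1 / total_years) - 1


# Two equal one-year periods: -10% then +60%.
r_equal = average_percentage_return([(-0.10, 1), (0.60, 1)])
print(round(r_equal, 4))    # 0.2

# Unequal periods: -23% for half a year, +13% for two and a half years.
r_unequal = average_percentage_return([(-0.23, 0.5), (0.13, 2.5)])
print(round(r_unequal, 4))  # 0.06
```

Swapping the order of the periods leaves the result unchanged, because multiplication is commutative.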

Moving average


Given a time series such as daily stock market prices or yearly temperatures, people often want to create a smoother series.[3] This helps to show underlying trends or perhaps periodic behavior. An easy way to do this is to choose a number n and create a new series by taking the arithmetic mean of the first n values, then moving forward one place and repeating, and so on. This is the simplest form of moving average. More complicated forms involve using a weighted average. The weighting can be used to enhance or suppress various periodic behaviors, and there is very extensive analysis of what weightings to use in the literature on filtering. In digital signal processing the term “moving average” is used even when the sum of the weights is not 1.0 (so the output series is a scaled version of the averages).[4] The reason for this is that the analyst is usually interested only in the trend or the periodic behavior. A further generalization is an “autoregressive moving average”. In this case the average also includes some of the recently calculated outputs. This allows samples from further back in the history to affect the current output.
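The simple and weighted forms of the moving average can be sketched as follows. The function names and the sample series are illustrative assumptions; as in the signal-processing usage mentioned above, the weighted version does not require the weights to sum to 1.

```python
def simple_moving_average(series, n):
    """Arithmetic mean of each window of n consecutive values."""
    series = list(series)
    if n <= 0 or n > len(series):
        raise ValueError("window size must be between 1 and len(series)")
    return [sum(series[i:i + n]) / n for i in range(len(series) - n + 1)]


def weighted_moving_average(series, weights):
    """Weighted sum over each window; weights need not sum to 1."""
    series = list(series)
    n = len(weights)
    if n == 0 or n > len(series):
        raise ValueError("need between 1 and len(series) weights")
    return [
        sum(w * x for w, x in zip(weights, series[i:i + n]))
        for i in range(len(series) - n + 1)
    ]


print(simple_moving_average([1, 2, 3, 4, 5], 3))       # [2.0, 3.0, 4.0]
print(weighted_moving_average([1, 3, 5], [0.5, 0.5]))  # [2.0, 4.0]
```

With equal weights of 1/n the weighted version reduces to the simple moving average, up to floating-point rounding. The output is shorter than the input by n − 1 values because only full windows are averaged.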

Etymology

"Few words have received more etymological investigation." [5] In the 16th century average meant a customs duty, or the like, and was used in the Mediterranean area. It came to mean the cost of damage sustained at sea. From that came an "average adjuster" who decided how to apportion a loss between the owners and insurers of a ship and cargo.

Marine damage is either particular average, which is borne only by the owner of the damaged property, or general average, where the owner can claim a proportional contribution from all the parties to the marine venture. The type of calculations used in adjusting general average gave rise to the use of "average" to mean "arithmetic mean".

The root is found in Arabic as awar, in Italian as avaria, in French as avarie and in Dutch as averij. It is unclear in which language the word first appeared.

There is an earlier (from at least the 11th century), unrelated use of the word. It appears to be an old legal term for a tenant's day labour obligation to a sheriff, probably anglicised from "avera" found in the English Domesday Book (1085).

See also


References

  1. {{#invoke:Citation/CS1|citation |CitationClass=journal }}
  2. John Bibby (1974). “Axiomatisations of the average and a further generalisation of monotonic sequences”. Glasgow Mathematical Journal, vol. 15, pp. 63–65.
  3. {{#invoke:citation/CS1|citation |CitationClass=book }}
  4. {{#invoke:citation/CS1|citation |CitationClass=book }}
  5. Oxford English Dictionary

External links
