{{For|the similarly named inequality involving series|Chebyshev's sum inequality}} ''Not to be confused with Chebyshev's inequalities on the size of the number-theoretic function <math>\scriptstyle{\pi(x)}</math>.''
In [[probability theory]], '''Chebyshev's inequality''' (also spelled as '''Tchebysheff's inequality'''; Russian: ''Нера́венство Чебышева'') guarantees that in any [[probability distribution]], "nearly all" values are close to the [[expected value|mean]]: no more than 1/''k''<sup>2</sup> of the distribution's values can be more than ''k'' [[standard deviations]] away from the mean, or equivalently, at least 1 − 1/''k''<sup>2</sup> of the distribution's values are within ''k'' standard deviations of the mean. The inequality has great utility because it can be applied to completely arbitrary distributions (unknown except for mean and variance); for example, it can be used to prove the [[weak law of large numbers]].
| | |
In practical usage, in contrast to the [[empirical rule]], which applies only to [[normal distribution]]s, Chebyshev's inequality guarantees merely that a minimum of 75% of values lie within two standard deviations of the mean and 89% within three standard deviations.<ref name=Kvanli>{{cite book |last1=Kvanli |first1=Alan H. |last2=Pavur |first2=Robert J. |last3=Keeling |first3=Kellie B. |title=Concise Managerial Statistics |url=http://books.google.com/books?id=h6CQ1J0gwNgC&pg=PT95 |year=2006 |publisher=[[Cengage Learning]] |isbn=9780324223880 |pages=81–82}}</ref><ref name=Chernick>{{cite book |last=Chernick |first=Michael R. |title=The Essentials of Biostatistics for Physicians, Nurses, and Clinicians |url=http://books.google.com/books?id=JP4azqd8ONEC&pg=PA50 |year=2011 |publisher=[[John Wiley & Sons]] |isbn=9780470641859 |pages=49–50}}</ref>
| |
| | |
The term ''Chebyshev's inequality'' may also refer to [[Markov's inequality]], especially in the context of analysis.
| |
| | |
| ==History==
| |
| The theorem is named after Russian mathematician [[Pafnuty Chebyshev]], although it was first formulated by his friend and colleague [[Irénée-Jules Bienaymé]].<ref>{{cite book
| |
| |title=The Art of Computer Programming: Fundamental Algorithms, Volume 1
| |
| |edition=3rd
| |
| |last1=Knuth |first1=Donald |authorlink1=Donald Knuth
| |
| |year=1997
| |
| |publisher=Addison–Wesley
| |
| |location=Reading, Massachusetts
| |
| |isbn=0-201-89683-4
| |
| |url=http://www-cs-faculty.stanford.edu/~uno/taocp.html
| |
| |accessdate=1 October 2012
| |
| |ref=KnuthTAOCP1
| |
| }}
| |
</ref>{{rp|98}} The theorem was first stated without proof by Bienaymé in 1853<ref name=Bienaymé1853>Bienaymé LJ (1853) Considérations à l'appui de la découverte de Laplace. Comptes Rendus de l'Académie des Sciences 37: 309–324
| |
| </ref> and later proved by Chebyshev in 1867.<ref name=Chebyshev1867>{{cite journal|last=Tchebichef|first=P.|title=Des valeurs moyennes|journal=Journal de mathématiques pures et appliquées|year=1867|volume=12|series=2|pages=177–184}}</ref> His student [[Andrey Markov]] provided another proof in his 1884 PhD thesis.<ref name=Markov1884>Markov A (1884) On certain applications of algebraic continued fractions, Ph.D. thesis, St Petersburg</ref>
| |
| | |
| == Statement ==
| |
| Chebyshev's inequality is usually stated for [[random variable]]s, but can be generalized to a statement about [[measure theory|measure spaces]].
| |
| | |
| === Probabilistic statement ===
| |
| Let ''X'' be a [[random variable]] with finite [[expected value]] ''μ'' and finite non-zero [[variance]] ''σ''<sup>2</sup>. Then for any [[real number]] {{nowrap|''k'' > 0}},
| |
| : <math>
| |
| \Pr(|X-\mu|\geq k\sigma) \leq \frac{1}{k^2}.
| |
| </math>
| |
Only the case {{nowrap|''k'' > 1}} provides useful information. When {{nowrap|''k'' < 1}} the right-hand side is greater than one, so the inequality is vacuous, as the probability of any event cannot exceed one. When {{nowrap|''k'' {{=}} 1}} it merely states that the probability is less than or equal to one, which is trivially true.
| |
| | |
| As an example, using {{nowrap|''k'' {{=}} {{sqrt|2}}}} shows that at least half of the values lie in the interval {{nowrap|(''μ'' − {{sqrt|2}}''σ'', ''μ'' + {{sqrt|2}}''σ'')}}.
| |
| | |
| Because it can be applied to completely arbitrary distributions (unknown except for mean and variance), the inequality generally gives a poor bound compared to what might be possible if something is known about the distribution involved.
| |
| | |
| {|class="wikitable" style="background-color:#FFFFFF"
| |
| |-
| |
| ! k
| |
| ! Min % within k standard deviations of mean
| |
| ! Max % beyond k standard deviations from mean
| |
| |-
| |
| | 1
| |
| || 0%
| |
| || 100%
| |
| |-
| |
| | {{sqrt|2}}
| |
| || 50%
| |
| || 50%
| |
| |-
| |
| | 2
| |
| || 75%
| |
| || 25%
| |
| |-
| |
| | 3
| |
| || 88.8889%
| |
| || 11.1111%
| |
| |-
| |
| | 4
| |
| || 93.75%
| |
| || 6.25%
| |
| |-
| |
| | 5
| |
| || 96%
| |
| || 4%
| |
| |-
| |
| | 6
| |
| || 97.2222%
| |
| || 2.7778%
| |
| |-
| |
| | 7
| |
| || 97.9592%
| |
| || 2.0408%
| |
| |-
| |
| | 8
| |
| || 98.4375%
| |
| || 1.5625%
| |
| |-
| |
| | 9
| |
| || 98.7654%
| |
| || 1.2346%
| |
| |-
| |
| | 10
| |
| || 99%
| |
| || 1%
| |
| |}
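These tabulated minima can be checked numerically. The following sketch (assuming Python with NumPy is available; the exponential and uniform distributions are arbitrary illustrations, not part of the theorem) estimates P(|''X'' − ''μ''| ≥ ''kσ'') by simulation and confirms that it never exceeds 1/''k''<sup>2</sup>.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Two illustrative distributions: a skewed one and a bounded one.
samples = {
    "exponential": rng.exponential(scale=1.0, size=n),
    "uniform": rng.uniform(0.0, 1.0, size=n),
}

for name, x in samples.items():
    mu, sigma = x.mean(), x.std()
    for k in (2, 3, 4):
        tail = np.mean(np.abs(x - mu) >= k * sigma)  # empirical P(|X - mu| >= k*sigma)
        print(f"{name:12s} k={k}: tail {tail:.4f} <= bound {1 / k**2:.4f}")
</syntaxhighlight>

For a particular distribution the empirical tail probabilities are typically well below the bound, illustrating how loose the distribution-free guarantee can be.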
| |
| | |
| === Measure-theoretic statement ===
| |
| Let (''X'', Σ, μ) be a [[measure space]], and let ''f'' be an [[extended real number line|extended real]]-valued [[measurable function]] defined on ''X''. Then for any real number ''t'' > 0,{{citation needed|date=May 2012}}
| |
| | |
| :<math>\mu(\{x\in X\,:\,\,|f(x)|\geq t\}) \leq {1\over t^2} \int_X |f|^2 \, d\mu.</math>
| |
| | |
| More generally, if ''g'' is an extended real-valued measurable function, nonnegative and nondecreasing on the range of ''f'', then{{citation needed|date=May 2012}}
| |
| | |
| :<math>\mu(\{x\in X\,:\,\,f(x)\geq t\}) \leq {1\over g(t)} \int_X g\circ f\, d\mu.</math>
| |
| | |
| The previous statement then follows by defining <math>g(t)</math> as <math>t^2</math> if <math>t\ge 0</math> and <math>0</math> otherwise, and taking <math>|f|</math> instead of <math>f</math>.
| |
| | |
| ==Example==
| |
Suppose we randomly select a journal article from a source with an average of 1000 words per article and a standard deviation of 200 words. We can then infer that the probability that it has between 600 and 1400 words (i.e. within ''k'' = 2 standard deviations of the mean) must be at least 75%, because by Chebyshev's inequality there is at most a {{nowrap|{{frac|1|''k''{{su|p=2}}}} {{=}} {{frac2|1|4}}}} chance of being outside that range. But if we additionally know that the distribution is normal, we can say there is a 75% chance the word count is between 770 and 1230 (which is an even tighter bound).
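The two intervals in this example can be reproduced with a short calculation (a sketch assuming Python with SciPy is available; the figures are those of the word-count example above).

<syntaxhighlight lang="python">
from scipy.stats import norm

mu, sigma = 1000.0, 200.0   # mean and standard deviation of words per article

# Chebyshev: at least 1 - 1/k^2 of the probability lies within k standard deviations.
k = 2
print("Chebyshev interval:", (mu - k * sigma, mu + k * sigma), "covers at least", 1 - 1 / k**2)

# Under a normal assumption, a central probability of 0.75 corresponds to the
# 0.875 quantile of the standard normal, about 1.15 standard deviations.
z = norm.ppf(0.875)
print("Normal 75% interval:", (mu - z * sigma, mu + z * sigma))
</syntaxhighlight>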
| |
| | |
| ;Note
| |
| | |
| This example should be treated with caution as the inequality is only stated for probability distributions rather than for finite sample sizes. The inequality has since been extended to apply to finite sample sizes ([[Chebyshev's_inequality#Finite_samples|see below]]).
| |
| | |
| ==Sharpness of bounds==
| |
As shown in the example above, the theorem typically provides rather loose bounds. However, these bounds cannot in general be improved upon while remaining valid for arbitrary distributions. For example, for any ''k'' ≥ 1, the following example meets the bounds exactly.
| |
| : <math>
| |
| X = \begin{cases}
| |
| -1, & \text{with probability }\frac{1}{2k^2} \\
| |
| 0, & \text{with probability }1 - \frac{1}{k^2} \\
| |
| 1, & \text{with probability }\frac{1}{2k^2}
| |
| \end{cases}
| |
| </math>
| |
| | |
| For this distribution, mean ''μ'' = 0 and standard deviation ''σ'' = {{frac2|1|''k''}}, so
| |
| : <math>
| |
| \Pr(|X-\mu| \ge k\sigma) = \Pr(|X|\ge1) = \frac{1}{k^2}.
| |
| </math>
| |
| Equality holds only for distributions that are a linear transformation of this one.
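This three-point construction is easy to verify directly (a sketch assuming Python with NumPy; the value ''k'' = 3 is an arbitrary choice).

<syntaxhighlight lang="python">
import numpy as np

k = 3.0
values = np.array([-1.0, 0.0, 1.0])
probs = np.array([1 / (2 * k**2), 1 - 1 / k**2, 1 / (2 * k**2)])

mu = probs @ values                                   # mean = 0
sigma = np.sqrt(probs @ (values - mu)**2)             # standard deviation = 1/k

tail = probs[np.abs(values - mu) >= k * sigma].sum()  # P(|X - mu| >= k*sigma)
print(mu, sigma, tail, 1 / k**2)                      # tail equals 1/k^2 exactly
</syntaxhighlight>

The printed tail probability equals 1/''k''<sup>2</sup> exactly, confirming that Chebyshev's bound is attained.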
| |
| | |
| == Proof (of the two-sided version) ==
| |
| === Probabilistic proof ===
| |
| [[Markov's inequality]] states that for any real-valued random variable ''Y'' and any positive number ''a'', we have Pr(|''Y''| > ''a'') ≤ E(|''Y''|)/''a''. One way to prove Chebyshev's inequality is to apply Markov's inequality to the random variable ''Y'' = (''X'' − μ)<sup>2</sup> with ''a'' = (σ''k'')<sup>2</sup>.
| |
| | |
| It can also be proved directly. For any event ''A'', let ''I''<sub>''A''</sub> be the indicator random variable of ''A'', i.e. ''I''<sub>''A''</sub> equals 1 if ''A'' occurs and 0 otherwise. Then
| |
| | |
| :<math>
| |
| \begin{align}
| |
| & {} \qquad \Pr(|X-\mu| \geq k\sigma) = \operatorname{E}(I_{|X-\mu| \geq k\sigma})
| |
| = \operatorname{E}(I_{[(X-\mu)/(k\sigma)]^2 \geq 1}) \\[6pt]
| |
| & \leq \operatorname{E}\left(\left({X-\mu \over k\sigma} \right)^2 \right)
| |
| = {1 \over k^2} {\operatorname{E}((X-\mu)^2) \over \sigma^2} = {1 \over k^2}.
| |
| \end{align}
| |
| </math>
| |
| | |
| The direct proof shows why the bounds are quite loose in typical cases: the number 1 to the right of "≥" is replaced by [(''X'' − μ)/(''k''σ)]<sup>2</sup> to the left of "≥" whenever the latter exceeds 1. In some cases it exceeds 1 by a very wide margin.
| |
| | |
| === Measure-theoretic proof ===
| |
| Fix <math>t</math> and let <math>A_t</math> be defined as <math>A_t = \{x\in X\mid f(x)\ge t\}</math>, and let <math>1_{A_t}</math> be the [[indicator function]] of the set <math>A_t</math>. Then, it is easy to check that, for any <math>x</math>,
| |
| | |
| :<math>0\leq g(t) 1_{A_t}\leq g(f(x))\,1_{A_t},</math>
| |
| | |
| since ''g'' is nondecreasing on the range of ''f'', and therefore,
| |
| | |
| :<math>\begin{align}g(t)\mu(A_t)&=\int_X g(t)1_{A_t}\,d\mu\\ &\leq\int_{A_t} g\circ f\,d\mu\\ &\leq\int_X g\circ f\,d\mu.\end{align}</math>
| |
| | |
| The desired inequality follows from dividing the above inequality by ''g''(''t'').
| |
| | |
| ==Extensions==
| |
| Several extensions of Chebyshev's inequality have been developed.
| |
| | |
| ===Asymmetric two-sided case===
| |
| An asymmetric two-sided version of this inequality is also known.<ref name=Steliga2010>{{cite journal |last1=Steliga |first1=Katarzyna |last2=Szynal |first2=Dominik |title=On Markov-Type Inequalities |journal=International Journal of Pure and Applied Mathematics |year=2010 |volume=58 |issue=2 |pages=137–152 |url=http://ijpam.eu/contents/2010-58-2/2/2.pdf |accessdate=10 October 2012 |issn=1311-8080}}</ref>
| |
| <!--
| |
| When the distribution is known to be symmetric
| |
| | |
| : <math> P( k_1 < X < k_2) \ge 1 - \frac{ 4 \sigma^2 }{ ( k_2 - k_1 )^2 }</math>
| |
| | |
| where ''σ''<sup>2</sup> is the [[variance]].
| |
| [The foregoing is incorrect as it stands, since if k_1 goes to infinity with k_2 fixed, the limit is 1.] -->
| |
| | |
| When the distribution is asymmetric or is unknown
| |
| | |
| : <math> P( k_1 < X < k_2 ) \ge \frac{ 4 [ ( \mu - k_1 )( k_2 - \mu ) - \sigma^2 ] }{ ( k_2 - k_1 )^2 } ,</math>
| |
| | |
| where ''σ''<sup>2</sup> is the variance and ''μ'' is the [[mean]].
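A small helper function (a sketch only; the function name, the gamma test distribution and the endpoints are illustrative and not taken from the cited source) evaluates this asymmetric bound and compares it with a simulated probability, assuming Python with NumPy.

<syntaxhighlight lang="python">
import numpy as np

def asymmetric_bound(k1, k2, mu, sigma2):
    """Lower bound on P(k1 < X < k2) from the asymmetric two-sided inequality above."""
    bound = 4 * ((mu - k1) * (k2 - mu) - sigma2) / (k2 - k1)**2
    return max(bound, 0.0)  # only informative when positive

# Illustrative check against a skewed (gamma) distribution.
rng = np.random.default_rng(1)
x = rng.gamma(shape=2.0, scale=1.0, size=1_000_000)
mu, sigma2 = x.mean(), x.var()

k1, k2 = 0.0, 6.0
print("lower bound:", asymmetric_bound(k1, k2, mu, sigma2))
print("empirical  :", np.mean((x > k1) & (x < k2)))
</syntaxhighlight>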
| |
| | |
| ===Bivariate case===
| |
| A version for the bivariate case is known.<ref name=Ferentinos1982>Ferentinos K (1982) "On Tchebycheff type inequalities". ''Trabajos Estadıst Investigacion Oper'', 33: 125–132</ref>
| |
| | |
Let ''X''<sub>1</sub> and ''X''<sub>2</sub> be two random variables with means ''μ''<sub>1</sub> and ''μ''<sub>2</sub> and finite variances ''σ''<sub>1</sub><sup>2</sup> and ''σ''<sub>2</sub><sup>2</sup> respectively. Then
| |
| | |
| :<math> P( k_{ 11 } \le X_1 \le k_{ 12 }, k_{ 21 } \le X_2 \le k_{ 22 }) \ge 1 - \sum T_i</math>
| |
| | |
| where for ''i'' = 1,2,
| |
| | |
:<math> T_i = \frac{ 4 \sigma_i^2 + [ 2 \mu_i - ( k_{ i1 } + k_{ i2 } ) ]^2 } { ( k_{ i2 } - k_{ i1 } )^2 }.</math>
| |
| | |
| ===Two correlated variables===
| |
| Berge derived an inequality for two correlated variables ''X''<sub>1</sub> and ''X''<sub>2</sub>.<ref name=Berge1938>Berge PO (1938) A note on a form of Tchebycheff's theorem for two variables. Biometrika 29, 405–406</ref> Let ''ρ'' be the correlation coefficient between ''X''<sub>1</sub> and ''X''<sub>2</sub> and let ''σ''<sub>''i''</sub><sup>2</sup> be the variance of ''X''<sub>''i''</sub>. Then
| |
| | |
| : <math> P\left( \bigcap_{ i = 1}^2 \left[ \frac{ | X_i - \mu_i | } { \sigma_i } < k \right] \right) \ge 1 - \frac{ 1 + \sqrt{ 1 - \rho^2 } } { k^2 }.</math>
| |
| | |
| Lal later obtained an alternative bound<ref name=Lal1955>Lal DN (1955) A note on a form of Tchebycheff's inequality for two or more variables. [[Sankhya (journal)|Sankhya]] 15(3):317–320</ref>
| |
| | |
| : <math> P\left( \bigcap_{ i = 1}^2 \left[ \frac{ | X_i - \mu_i | }{ \sigma_i } \le k_i \right] \right) \ge 1 - \frac{ k_1^2 + k_2^2 + \sqrt{ ( k_1^2 + k_2^2 )^2 - 4 k_1^2 k_2^2 \rho } } { 2 ( k_1 k_2 )^2 } </math>
| |
| | |
| Isii derived a further generalisation.<ref name=Isii1959>Isii K (1959) On a method for generalizations of Tchebycheff's inequality. Ann Inst Stat Math 10: 65–88</ref> Let
| |
| | |
| : <math> Z = P\left( \bigcap_{ i = 1}^2 ( - k_1 < X_i < k_2 )\right) </math>
| |
| | |
| with 0 < ''k''<sub>1</sub> ≤ ''k''<sub>2</sub>.
| |
| | |
| There are now three cases.
| |
| | |
| <strong>Case A</strong>: If <math> 2k_1^2 > 1 - \rho </math> and <math> k_2 - k_1 \ge 2 \lambda </math> where
| |
| | |
| : <math> \lambda = \frac{ k_1( 1 + \rho ) + \sqrt{ ( 1 - \rho^2 )( k_1^2 + \rho ) } }{ 2k_1 - 1 + \rho } </math>
| |
| | |
| then
| |
| | |
| : <math> Z \le \frac{ 2 \lambda^2 } { 2 \lambda^2 + 1 + \rho }. </math>
| |
| | |
| <strong>Case B</strong>: If the conditions in case A are not met but ''k''<sub>1</sub>''k''<sub>2</sub> ≥ 1 and
| |
| | |
| : <math> 2 ( k_1 k_2 - 1 )^2 \ge 2( 1 - \rho^2 ) + ( 1 - \rho )( k_2 - k_1 )^2 </math>
| |
| | |
| then
| |
| | |
| : <math> Z \le \frac{ ( k_2 - k_1 )^2 + 4 + \sqrt{ 16 ( 1 - \rho^2 ) + 8 ( 1 - \rho )( k_2 - k_1 ) } }{ ( k_1 +k_2 )^2 }.</math>
| |
| | |
| <strong>Case C</strong>: If the conditions in cases A or B are not met there is no universal bound other than 1.
| |
| | |
| ===Multivariate case===
| |
| The general case is known as the Birnbaum–Raymond–Zuckerman inequality after the authors who proved it for two dimensions.<ref name=Birnbaum1947>{{cite journal |last1=Birnbaum |first1=Z. W. |last2=Raymond |first2=J. |last3=Zuckerman |first3=H. S. |title=A Generalization of Tshebyshev's Inequality to Two Dimensions |journal=The Annals of Mathematical Statistics |issn=0003-4851 |year=1947 |volume=18 |issue=1 |pages=70–79 |doi=10.1214/aoms/1177730493 |mr=19849 |zbl=0032.03402 |url=http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.aoms/1177730493 |accessdate=7 October 2012}}</ref>
| |
| | |
| : <math> P\left[ \sum_{ i = 1 }^n \frac{ ( X_i - \mu_i )^2 }{ \sigma_i^2 t_i^2 } \ge k^2 \right] \le \frac{ 1 }{ k^2 } \sum_{ i = 1 }^n \frac{ 1 }{ t_i^2 } </math>
| |
| | |
| where ''X''<sub>i</sub> is the ''i''<sup>th</sup> random variable, ''μ''<sub>i</sub> is the ''i''<sup>th</sup> mean and ''σ''<sub>i</sub><sup>2</sup> is the ''i''<sup>th</sup> variance.
| |
| | |
| If the variables are independent this inequality can be sharpened.<ref name=Kotz2000>{{cite book |last1=Kotz |first1=Samuel |authorlink1=Samuel Kotz |last2=Balakrishnan |first2=N. |last3= Johnson |first3=Norman L. |authorlink3=Norman Lloyd Johnson |title=Continuous Multivariate Distributions, Volume 1, Models and Applications |year=2000 |publisher=Houghton Mifflin |location=Boston [u.a.] |isbn=978-0-471-18387-7 |url=http://www.wiley.com/WileyCDA/WileyTitle/productCd-0471183873.html |edition=2nd |accessdate=7 October 2012}}</ref>
| |
| | |
: <math> P\left[ \bigcap_{ i = 1 }^n \frac{ | X_i - \mu_i | }{ \sigma_i } \le k_i \right] \ge \prod_{ i = 1 }^n \left( 1 - \frac{ 1 }{ k_i^2 } \right) </math>
| |
| | |
| Olkin and Pratt derived an inequality for ''n'' correlated variables.<ref name=Olkin1958>{{cite journal|last1=Olkin|first1=Ingram |authorlink1=Ingram Olkin | last2=Pratt |first2=John W. |authorlink2=John W. Pratt |title=A Multivariate Tchebycheff Inequality|journal=The Annals of Mathematical Statistics|year=1958|volume=29|issue=1|pages=226–234|doi=10.1214/aoms/1177706720|url=http://projecteuclid.org/euclid.aoms/1177706720|zbl=0085.35204 |mr=MR93865 |accessdate=2 October 2012}}</ref>
| |
| | |
| : <math> P\left( \bigcap_{i = 1 }^n \frac{ | X_i - \mu_i | }{ \sigma_i } < k_i \right) \ge 1 - \frac{ [ \sqrt{ u } + \sqrt{ n - 1 } \sqrt{ n \sum{ \frac{ 1 }{ k_i^2 } - u } } ]^2 }{ n^2 } </math>
| |
| | |
| where the sum is taken over the ''n'' variables and
| |
| | |
| : <math> u = \sum_{ i = 1 }^n{ \frac{ 1 }{ k_i^2 } } + 2 \sum_{ i = 1 }^n \sum_{ j < i } \frac{ \rho_{ ij } } { k_i k_j } </math>
| |
| | |
| where ''ρ''<sub>''ij''</sub> is the correlation between ''X''<sub>''i''</sub> and ''X''<sub>''j''</sub>
| |
| | |
| Olkin and Pratt's inequality was subsequently generalised by Godwin.<ref name=Godwin1964>Godwin HJ (1964) Inequalities on distribution functions. New York, Hafner Pub Co</ref>
| |
| | |
| ===Vector version===
| |
| Ferentinos<ref name="Ferentinos1982"/> has shown that for a [[Multivariate random variable|vector]] ''X'' = (''x''<sub>1</sub>, ''x''<sub>2</sub>, ''x''<sub>3</sub>, ...) with mean ''μ'' = (''μ''<sub>1</sub>, ''μ''<sub>2</sub>, ''μ''<sub>3</sub>, ...), variance ''σ''<sup>2</sup> = (''σ''<sub>1</sub><sup>2</sup>, ''σ''<sub>2</sub><sup>2</sup>, ''σ''<sub>3</sub><sup>2</sup>, ...) and an arbitrary [[Normed vector space|norm]] (|| ||) that
| |
| | |
| : <math> P(|| X - \mu || \ge k || \sigma ||) \le \frac{ 1 } { k^2 }. </math>
| |
| | |
| A second related inequality has also been derived by Chen.<ref name=Chen2007>{{cite arXiv |author=Xinjia Chen |eprint=0707.0805 |title=A New Generalization of Chebyshev Inequality for Random Vectors |year=2007 |version=v2}}</ref> Let ''N'' be the [[dimension]] of the stochastic vector ''X'' and let ''E''[''X''] be the mean of ''X''. Let ''S'' be the [[covariance matrix]] and ''k'' > 0. Then
| |
| | |
| : <math> P( ( X - E[ X ] )^T S^{ -1 } ( X - E[ X ] ) < k ) \ge 1 - \frac{ N }{ k } </math>
| |
| | |
| where ''Y''<sup>T</sup> is the [[transpose]] of ''Y''.
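Chen's bound can be checked by simulation. The sketch below (assuming Python with NumPy; the multivariate normal distribution and its parameters are merely a convenient test case) estimates the left-hand side and compares it with 1 − ''N''/''k''.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)
N, k = 3, 10.0                        # dimension and threshold
mean = np.array([1.0, -2.0, 0.5])
cov = np.array([[2.0, 0.3, 0.1],
                [0.3, 1.0, 0.2],
                [0.1, 0.2, 1.5]])

x = rng.multivariate_normal(mean, cov, size=500_000)
d = x - mean
q = np.einsum('ij,jk,ik->i', d, np.linalg.inv(cov), d)  # (X - E[X])^T S^{-1} (X - E[X])

print("empirical P(q < k):", np.mean(q < k))
print("Chen lower bound  :", 1 - N / k)
</syntaxhighlight>

The empirical probability is typically well above the distribution-free guarantee, as expected.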
| |
| | |
===Infinite dimensions===
| |
| | |
| There is a straightforward extension of the vector version of Chebyshev's inequality to infinite dimensional settings. Let <math>X</math> be a random variable which takes values in a [[Fréchet space]] <math>\mathcal X</math> (equipped with seminorms <math>\|\cdot\|_\alpha</math>). This includes most common settings of vector-valued random variables, e.g., when <math>\mathcal X</math> is a [[Banach space]] (equipped with a single norm), a [[Hilbert space]], or the finite-dimensional setting as described above.
| |
| | |
| Suppose that <math>X</math> is of "[[strong order two]]", meaning that
| |
| : <math> \mathbb E\big(\| X\|_\alpha^2 \big) < \infty </math>
| |
| for every seminorm <math>\|\cdot\|_\alpha</math>. This is a generalization of the requirement that <math>X</math> have finite variance, and is necessary for this strong form of Chebyshev's inequality in infinite dimensions. The terminology "strong order two" is due to [[Vakhania]].<ref>Vakhania, Nikolai Nikolaevich. Probability distributions on linear spaces. New York: North Holland, 1981.</ref>
| |
| | |
Let <math>\mu \in \mathcal X</math> be the [[Pettis integral]] of <math>X</math> (i.e., the vector generalization of the mean), and let <math>\sigma_\alpha := \sqrt{\mathbb E\|X - \mu\|_\alpha^2}</math> be the standard deviation with respect to the seminorm <math>\|\cdot\|_\alpha</math>.
| |
| | |
| In this setting, the general version of Chebyshev's inequality states that
| |
| | |
| : <math> \mathbb P\big( \|X - \mu\|_\alpha \ge k \sigma_\alpha \big) \le \frac{ 1 } { k^2 }</math>
| |
| | |
| for all <math>k > 0</math>.
| |
| | |
| '''Proof.''' The proof is straightforward, and essentially the same as the finitary version. If <math>\sigma_\alpha = 0</math>, then <math>X</math> is constant (and equal to <math>\mu</math>) almost surely, so the inequality is trivial.
| |
| | |
On the event <math>\|X - \mu\|_\alpha \ge k \sigma_\alpha</math> we have <math>\|X - \mu\|_\alpha > 0</math>, so we may safely divide by <math>\|X - \mu\|_\alpha</math>. The crucial trick in Chebyshev's inequality is to recognize that <math> 1 = \tfrac{\|X - \mu\|_\alpha^2}{\|X - \mu\|_\alpha^2}</math>.
| |
| | |
| We calculate:
| |
| : <math> \mathbb P\big( \|X - \mu\|_\alpha \ge k \sigma_\alpha \big) = \int_\Omega 1_{\|X - \mu\|_\alpha \ge k \sigma_\alpha} \, \mathrm d \mathbb P = \int_\Omega \frac{\|X - \mu\|_\alpha^2}{\|X - \mu\|_\alpha^2} \cdot 1_{\|X - \mu\|_\alpha \ge k \sigma_\alpha} \, \mathrm d \mathbb P \le \int_\Omega \frac{\|X - \mu\|_\alpha^2}{(k\sigma_\alpha)^2} \cdot 1_{\|X - \mu\|_\alpha \ge k \sigma_\alpha} \, \mathrm d \mathbb P. </math>
| |
| Next, we use the fact that an indicator function is bounded above by 1 to calculate that this is bounded by
| |
| : <math> \frac{1}{k^2 \sigma_\alpha^2} \int_\Omega \|X - \mu\|_\alpha^2 \, \mathrm d \mathbb P = \frac{\mathbb E\|X - \mu\|_\alpha^2}{k^2 \sigma_\alpha^2} = \frac{\sigma_\alpha^2}{k^2 \sigma_\alpha^2} = \frac{1}{k^2}. </math>
| |
This completes the proof. ∎
| |
| | |
| ===Higher moments===
| |
| An extension to higher moments is also possible:
| |
| | |
| :<math> P( | X - \operatorname{ E } ( X ) | \ge k ) \le \frac{ \operatorname{ E }(| X - \operatorname{ E }( X ) |^n ) } { k^n } </math>
| |
| | |
| where ''k'' > 0 and ''n'' ≥ 2.
| |
| | |
| ===Exponential version===
| |
| A related inequality sometimes known as the exponential Chebyshev's inequality<ref name=RassoulAgha2010>[http://www.math.utah.edu/~firas/Papers/rassoul-seppalainen-ldp.pdf Section 2.1]</ref> is the inequality
| |
| | |
| :<math> P(X \ge \varepsilon) \le e^{ -t \varepsilon } \operatorname{ E } (e^{ t X }) </math>
| |
| | |
| where ''t'' > 0.
| |
| | |
| Let ''K''( ''x'', ''t'' ) be the [[cumulant generating function]],
| |
| | |
| : <math> K( x , t ) = \log( \operatorname{ E } ( e^{ t x } ) ). </math>
| |
| | |
| Taking the [[Legendre–Fenchel transformation]]{{clarify|reason=articles should be reasonably self contained, more explanation needed|date=May 2012}} of ''K''(''x'', ''t'') and using the exponential Chebyshev's inequality we have
| |
| | |
: <math> - \log P( X \ge \varepsilon ) \ge \sup_t( t \varepsilon - K( x , t ) ). </math>
| |
| | |
| This inequality may be used to obtain exponential inequalities for unbounded variables.<ref name=Baranoski2001>{{cite journal |last1=Baranoski |first1=Gladimir V. G. |last2=Rokne |first2=Jon G. |author3=Guangwu Xu |title=Applying the exponential Chebyshev inequality to the nondeterministic computation of form factors |journal=Journal of Quantitative Spectroscopy and Radiative Transfer |date=15 May 2001 |volume=69 |issue=4 |pages=199–200 |doi=10.1016/S0022-4073(00)00095-9 |url=http://www.sciencedirect.com/science/article/pii/S0022407300000959 |accessdate=2 October 2012}} (the references for this article are corrected by {{cite journal|last1=Baranoski |first1=Gladimir V. G. |last2=Rokne |first2=Jon G. |author3=Guangwu Xu |title=Corrigendum to: 'Applying the exponential Chebyshev inequality to the nondeterministic computation of form factors' |journal=Journal of Quantitative Spectroscopy and Radiative Transfer |date=15 January 2002 |volume=72 |issue=2 |pages=199–200 |doi=10.1016/S0022-4073(01)00171-6 |url=http://www.sciencedirect.com/science/article/pii/S0022407301001716 |accessdate=2 October 2012}})</ref>
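For a concrete random variable the supremum over ''t'' can be approximated numerically. The following sketch (assuming Python with NumPy and SciPy; a standard normal variable is used because its exponential moment E(e<sup>''tX''</sup>) = e<sup>''t''<sup>2</sup>/2</sup> has a closed form) minimises e<sup>−''tε''</sup>E(e<sup>''tX''</sup>) over ''t'' > 0.

<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import minimize_scalar

# X ~ Normal(0, 1), so E(exp(tX)) = exp(t^2 / 2) in closed form.
def exponential_chebyshev_bound(eps):
    """min over t > 0 of exp(-t*eps) * E(exp(tX)) for a standard normal X."""
    objective = lambda t: np.exp(-t * eps + t**2 / 2)
    res = minimize_scalar(objective, bounds=(1e-9, 50.0), method='bounded')
    return res.fun

for eps in (1.0, 2.0, 3.0):
    # For the standard normal the optimal t equals eps, giving exp(-eps^2 / 2).
    print(eps, exponential_chebyshev_bound(eps), np.exp(-eps**2 / 2))
</syntaxhighlight>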
| |
| | |
| ===Inequalities for bounded variables===
| |
If P(''x'') has finite support on the interval [''a'', ''b''], let ''M'' = max( |''a''|, |''b''| ) where |''x''| is the [[absolute value]] of ''x''. If the mean of P(''x'') is zero then for all ''k'' > 0<ref name=Dufour2003>Dufour (2003) [http://www2.cirano.qc.ca/~dufourj/Web_Site/ResE/Dufour_1999_C_TS_Moments.pdf Properties of moments of random variables]</ref>
| |
| | |
| : <math> \frac{ E( | X |^r ) - k^r }{ M^r } \le P( | X | \ge k ) \le \frac{ E( | X |^r ) }{ k^r }.</math>
| |
| | |
The second of these inequalities with ''r'' = 2 is the Chebyshev bound. The first provides a lower bound for P( |''X''| ≥ ''k'' ).
| |
| | |
| Sharp bounds for a bounded variate have been derived by Niemitalo<ref name=Niemitalo2012>Niemitalo O (2012) [http://yehar.com/blog/?p=1225 One-sided Chebyshev-type inequalities for bounded probability distributions.]</ref>
| |
| | |
| Let 0 ≤ ''X'' ≤ ''M'' where ''M'' > 0. Then
| |
| | |
| ;Case 1:
| |
| | |
| <math> P( X < k ) = 0 \text{ if } E( X ) > k \text{ and } E( X^2 ) < k E( X ) + M E( X ) - kM </math>
| |
| | |
| ;Case 2:
| |
| | |
| <math> P( X < k ) \ge 1 - \frac{ k E( X ) + M E( X ) - E( X^2 ) }{ kM } </math>
| |
| | |
| <math> \text{ if } [ E( X )> k \text{ and } E( X^2 ) \ge kE( X ) + ME( X ) - kM ] \text{ or } [ E( X ) \le k \text{ and } E( X^2 ) \ge kE( X ) ]</math>
| |
| | |
| ;Case 3:
| |
| | |
| <math> P( X < k ) \ge \frac{ E( X )^2 - 2 k E( X ) + k^2 }{ E( X^2 ) - 2 k E( X ) + k^2 } \text{ if } E( X ) \le k \text{ and } E( X^2 )< kE( X )</math>
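The three cases can be collected into a single helper (a sketch; the function name is illustrative, and the raw moments E(''X'') and E(''X''<sup>2</sup>) are assumed known or estimated separately).

<syntaxhighlight lang="python">
def lower_bound_P_X_below_k(EX, EX2, M, k):
    """Lower bound on P(X < k) for a variate with 0 <= X <= M, given the raw
    moments EX = E(X) and EX2 = E(X^2), following the three cases above."""
    if EX > k and EX2 < k * EX + M * EX - k * M:
        return 0.0                                                  # Case 1 (bound is exact)
    if (EX > k and EX2 >= k * EX + M * EX - k * M) or (EX <= k and EX2 >= k * EX):
        return 1.0 - (k * EX + M * EX - EX2) / (k * M)              # Case 2
    return (EX**2 - 2 * k * EX + k**2) / (EX2 - 2 * k * EX + k**2)  # Case 3

# Example: X uniform on [0, 1], so E(X) = 1/2 and E(X^2) = 1/3.
print(lower_bound_P_X_below_k(0.5, 1.0 / 3.0, 1.0, 0.75))
</syntaxhighlight>

In the uniform example the true value P(''X'' < 0.75) = 0.75 indeed exceeds the returned lower bound of about 0.43.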
| |
| | |
| ==Finite samples==
| |
| Saw ''et al'' extended Chebyshev's inequality to cases where the population mean and variance are not known but are instead replaced by their sample estimates.<ref name=Saw1984>{{cite doi/10.2307.2F2683249}}</ref>
| |
| | |
| : <math> P( | X - m | \ge ks ) \le \frac{ g_{ N + 1 }\left( \frac{ N k^2 }{ N - 1 + k^2 } \right) }{ N + 1 } \left( \frac{ N }{ N + 1 } \right)^{ 1 / 2 } </math>
| |
| | |
where ''N'' is the sample size, ''m'' is the sample mean, ''k'' is a constant and ''s'' is the sample standard deviation. The function ''g''<sub>''Q''</sub>(''x'') (where ''Q'' = ''N'' + 1) is defined as follows:
| |
| | |
| Let ''x'' ≥ 1, ''Q'' = ''N'' + 1, and ''R'' be the greatest integer less than ''Q'' / ''x''. Let
| |
| | |
| : <math> a^ 2 = \frac{ Q( Q - R ) } { 1 + R( Q - R ) }. </math>
| |
| | |
| Now
| |
| | |
| : <math> g_Q(x) = R \quad \text{if }R \text{ is even} </math>
| |
| : <math> g_Q(x) = R \quad \text{if }R \text{ is odd and }x < a^2 </math>
| |
| : <math> g_Q(x) = R - 1 \quad \text{if } R \text{ is odd and } x \ge a^2.</math>
| |
| | |
| This inequality holds when the population moments do not exist and when the sample is weakly exchangeably distributed.
| |
| | |
| Kabán gives a somewhat less complex version of this inequality.<ref name="Kabán2011">{{cite doi|10.1007/s11222-011-9229-0}}</ref>
| |
| | |
| : <math>P( | X - m | \ge ks ) \le \frac{ 1 }{ [ N( N + 1 ) ]^{ 1 / 2 } }\left[ \left( \frac{ N - 1 }{ k^2 } + 1 \right) \right]</math>
| |
| | |
| If the standard deviation is a multiple of the mean then a further inequality can be derived,<ref name="Kabán2011" />
| |
| | |
| : <math>P( | X - m | \ge ks ) \le \frac{ N - 1 }{ N } \frac{ 1 }{ k^2 } \frac{ s^2 }{ m^2 } + \frac{ 1 }{ N }.</math>
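Both finite-sample bounds are simple to evaluate. The sketch below (assuming Python with NumPy; the function name is illustrative) computes Kabán's first bound as printed above and compares it with the distributional Chebyshev bound 1/''k''<sup>2</sup> for several sample sizes.

<syntaxhighlight lang="python">
import numpy as np

def kaban_bound(N, k):
    """Kabán's finite-sample bound on P(|X - m| >= k s), as printed above."""
    return ((N - 1) / k**2 + 1) / np.sqrt(N * (N + 1))

k = 3
for N in (10, 100, 1000):
    print(N, kaban_bound(N, k), 1 / k**2)   # finite-sample bound vs distributional bound
</syntaxhighlight>

As ''N'' grows the finite-sample bound approaches the distributional value 1/''k''<sup>2</sup>.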
| |
| | |
| A table of values for the Saw–Yang–Mo inequality for finite sample sizes (''n'' < 100) has been determined by Konijn.<ref name=Konijn1987>{{cite journal |last=Konijn |first=Hendrik S. |title=Distribution-Free and Other Prediction Intervals |journal=The American Statistician |date=February 1987 |volume=41 |issue=1 |pages=11–15 |publisher=American Statistical Association |jstor=2684311}}</ref>
| |
| | |
| For fixed ''N'' and large ''m'' the Saw–Yang–Mo inequality is approximately<ref name=Beasley2004>{{cite journal |last1=Beasley |first1=T. Mark |last2=Page |first2=Grier P. |last3=Brand |first3=Jaap P. L. |last4=Gadbury |first4=Gary L. |last5=Mountz |first5=John D. |last6=Allison |first6=David B. |authorlink6=David B. Allison |title=Chebyshev's inequality for nonparametric testing with small ''N'' and α in microarray research |journal=Journal of the Royal Statistical Society |issn=1467-9876 |date=January 2004 |volume=53 |series=C (Applied Statistics) |issue=1 |pages=95–108 |doi=10.1111/j.1467-9876.2004.00428.x |url=http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9876.2004.00428.x/abstract |accessdate=3 October 2012}}</ref>
| |
| | |
| : <math> P( | X - m | \ge ks ) \le \frac{ 1 }{ N + 1 }. </math>
| |
| | |
| Beasley ''et al'' have suggested a modification of this inequality<ref name=Beasley2004 />
| |
| | |
| : <math> P( | X - m | \ge ks ) \le \frac{ 1 }{ k^2( N + 1 ) }. </math>
| |
| | |
| In empirical testing this modification is conservative but appears to have low statistical power. Its theoretical basis currently remains unexplored.
| |
| | |
| ===Dependence of sample size===
| |
The bounds these inequalities give on a finite sample are less tight than those the Chebyshev inequality gives for a distribution. To illustrate this let the sample size ''n'' = 100 and let ''k'' = 3. Chebyshev's inequality states that at most approximately 11.11% of the distribution will lie outside these limits. Kabán's version of the inequality for a finite sample states that at most approximately 12.05% of the sample lies outside these limits. The dependence of the confidence intervals on sample size is further illustrated below.
| |
| | |
| For ''N'' = 10, the 95% confidence interval is approximately ±13.5789 standard deviations.
| |
| | |
| For ''N'' = 100 the 95% confidence interval is approximately ±4.9595 standard deviations; the 99% confidence interval is approximately ±140.0 standard deviations.
| |
| | |
| For ''N'' = 500 the 95% confidence interval is approximately ±4.5574 standard deviations; the 99% confidence interval is approximately ±11.1620 standard deviations.
| |
| | |
| For ''N'' = 1000 the 95% and 99% confidence intervals are approximately ±4.5141 and approximately ±10.5330 standard deviations respectively.
| |
| | |
| The Chebyshev inequality for the distribution gives 95% and 99% confidence intervals of approximately ±4.472 standard deviations and ±10 standard deviations respectively.
| |
| | |
| ===Comparative bounds===
| |
| Although Chebyshev's inequality is the best possible bound for an arbitrary distribution, this is not necessarily true for finite samples. [[Samuelson's inequality]] states that all values of a sample will lie within √(''N'' − 1) standard deviations of the mean. Chebyshev's bound improves as the sample size increases.
| |
| | |
| When ''N'' = 10, Samuelson's inequality states that all members of the sample lie within 3 standard deviations of the mean: in contrast Chebyshev's states that 95% of the sample lies within 13.5789 standard deviations of the mean.
| |
| | |
| When ''N'' = 100, Samuelson's inequality states that all members of the sample lie within approximately 9.9499 standard deviations of the mean: Chebyshev's states that 99% of the sample lies within 140.0 standard deviations of the mean.
| |
| | |
| When ''N'' = 500, Samuelson's inequality states that all members of the sample lie within approximately 22.3383 standard deviations of the mean: Chebyshev's states that 99% of the sample lies within 11.1620 standard deviations of the mean.
| |
| | |
| It is likely that better bounds for finite samples than these exist.
| |
| | |
| ==Sharpened bounds==
| |
| Chebyshev's inequality is important because of its applicability to any distribution. As a result of its generality it may not (and usually does not) provide as sharp a bound as alternative methods that can be used if the distribution of the random variable is known. To improve the sharpness of the bounds provided by Chebyshev's inequality a number of methods have been developed.
| |
| | |
| ===Standardised variables===
| |
| Sharpened bounds can be derived by first standardising the random variable.<ref name=Ion2001>{{cite book|last=Ion|first=Roxana Alice|title=Nonparametric Statistical Process Control|year=2001|publisher=Universiteit van Amsterdam|isbn=9057760762|url=http://dare.uva.nl/document/60326|accessdate=1 October 2012|chapter=Chapter 4: Sharp Chebyshev-type inequalities}}</ref>
| |
| | |
Let ''X'' be a random variable with finite variance Var(''X''). Let ''Z'' be the standardised form defined as
| |
| | |
| :<math> Z = \frac {X - \operatorname{E}(X) } { \operatorname{Var}(X)^{ 1/2 } }.</math>
| |
| | |
| Cantelli's lemma is then
| |
| | |
| :<math> P(Z \ge k) \le \frac{ 1 } { 1 + k^2 }.</math>
| |
| | |
This inequality is sharp and is attained by the distribution taking the values ''k'' and −1/''k'' with probabilities 1/(1 + ''k''<sup>2</sup>) and ''k''<sup>2</sup>/(1 + ''k''<sup>2</sup>) respectively.
| |
| | |
| If ''k'' > 1 and the distribution of ''X'' is symmetric then we have
| |
| | |
| :<math> P(Z \ge k) \le \frac { 1 } { 2 k^2 } .</math>
| |
| | |
| Equality holds if and only if ''Z'' = −''k'', 0 or ''k'' with probabilities {{nowrap|1= 1 / 2 ''k''<sup>2</sup>}}, {{nowrap|1 − 1 / ''k''<sup>2</sup>}} and {{nowrap|1 / 2 ''k''<sup>2</sup>}} respectively.<ref name=Ion2001/>
| |
| An extension to a two-sided inequality is also possible.
| |
| | |
| Let ''u'', ''v'' > 0. Then we have<ref name=Ion2001/>
| |
| :<math> P(Z \le -u \text{ or } Z \ge v) \le \frac{ 4 + (u - v)^2 } { (u + v)^2 } .</math>
| |
| | |
| ===Semivariances===
| |
| An alternative method of obtaining sharper bounds is through the use of [[semivariance]]s (partial moments). The upper (''σ''<sub>+</sub><sup>2</sup>) and lower (''σ''<sub>−</sub><sup>2</sup>) semivariances are defined
| |
| | |
| : <math> \sigma_+^2 = \frac { \sum (x - m)^2 } { n - 1 } </math>
| |
| | |
| : <math> \sigma_-^2 = \frac { \sum (m - x)^2 } { n - 1 } </math>
| |
| | |
| where ''m'' is the arithmetic mean of the sample, ''n'' is the number of elements in the sample and the sum for the upper (lower) semivariance is taken over the elements greater (less) than the mean.
| |
| | |
| The variance of the sample is the sum of the two semivariances
| |
| | |
| : <math> \sigma^2 = \sigma_+^2 + \sigma_-^2. </math>
| |
| | |
| In terms of the lower semivariance Chebyshev's inequality can be written<ref name=Berck1982>{{cite journal |last1=Berck |first1=Peter |last2=Hihn |first2=Jairus M. |title=Using the Semivariance to Estimate Safety-First Rules |journal=American Journal of Agricultural Economics |date=May 1982 |volume=64 |issue=2 |pages=298–300 |doi=10.2307/1241139 |url=http://ajae.oxfordjournals.org/content/64/2/298.full.pdf+html |accessdate=8 October 2012 |publisher=Oxford University Press |issn=0002-9092}}</ref>
| |
| | |
| : <math> \Pr(x \le m - a \sigma_-) \le \frac { 1 } { a^2 }.</math>
| |
| | |
| Putting
| |
| | |
| : <math> a = \frac{ k \sigma } { \sigma_- }. </math>
| |
| | |
| Chebyshev's inequality can now be written
| |
| | |
| : <math> \Pr(x \le m - k \sigma) \le \frac { 1 } { k^2 } \frac { \sigma_-^2 } { \sigma^2 }.</math>
| |
| | |
| A similar result can also be derived for the upper semivariance.
| |
| | |
| If we put
| |
| | |
| : <math> \sigma_u^2 = \max(\sigma_-^2, \sigma_+^2) , </math>
| |
| | |
| Chebyshev's inequality can be written
| |
| | |
: <math> \Pr( x \le m - k \sigma ) \le \frac { 1 } { k^2 } \frac { \sigma_u^2 } { \sigma^2 } .</math>
| |
| | |
| Because ''σ''<sub>u</sub><sup>2</sup> ≤ ''σ''<sup>2</sup>, use of the semivariance sharpens the original inequality.
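The sharpened one-sided bound is straightforward to compute from data. The following sketch (assuming Python with NumPy; the function name and the log-normal sample are illustrative) estimates the semivariances from a sample and evaluates the bound on P(''X'' ≤ ''m'' − ''kσ'').

<syntaxhighlight lang="python">
import numpy as np

def lower_tail_bound(x, k):
    """Bound P(X <= m - k*sigma) <= (1/k^2) * (sigma_minus^2 / sigma^2),
    with the semivariances estimated from the sample x as defined above."""
    x = np.asarray(x, dtype=float)
    m, n = x.mean(), len(x)
    sigma2_minus = np.sum((m - x[x < m])**2) / (n - 1)
    sigma2_plus = np.sum((x[x > m] - m)**2) / (n - 1)
    sigma2 = sigma2_minus + sigma2_plus
    return (sigma2_minus / sigma2) / k**2

rng = np.random.default_rng(3)
sample = rng.lognormal(mean=0.0, sigma=0.5, size=100_000)  # right-skewed example
print(lower_tail_bound(sample, k=2), 1 / 2**2)
</syntaxhighlight>

For a right-skewed sample most of the variance comes from the upper semivariance, so the lower-tail bound is considerably smaller than the unadjusted 1/''k''<sup>2</sup>.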
| |
| | |
| If the distribution is known to be symmetric, then
| |
| | |
| : <math> \sigma_+^2 = \sigma_-^2 = \frac{ 1 } { 2 } \sigma^2 </math>
| |
| | |
| and
| |
| | |
| : <math> \Pr(x \le m - k \sigma) \le \frac { 1 } { 2 k^2 } .</math>
| |
| | |
| This result agrees with that derived using standardised variables.
| |
| | |
| ;Note: The inequality with the lower semivariance has been found to be of use in estimating downside risk in finance and agriculture.<ref name="Berck1982"/><ref name=Nantell1979>{{cite journal |last1=Nantell |first1=Timothy J. |last2=Price |first2=Barbara |title=An Analytical Comparison of Variance and Semivariance Capital Market Theories |journal=The Journal of Financial and Quantitative Analysis |publisher=University of Washington School of Business Administration |date=June 1979 |volume=14 |issue=2 |pages=221–242 |doi=10.2307/2330500 |jstor=2330500 |url=http://journals.cambridge.org/action/displayAbstract?fromPage=online&aid=4470644&fulltextType=RA&fileId=S0022109000005275 |accessdate=8 October 2012}}</ref><ref name=Neave2008>Neave EH, Ross MN, Yang J (2008) Distinguishing upside potential from downside risk. Management Research News. 32 (1) 26–36</ref>
| |
| | |
| ===Selberg's inequality===
| |
| Selberg derived an inequality for ''P''(''x'') when ''a'' ≤ ''x'' ≤ ''b''.<ref name=Selberg1940>{{cite journal |last=Selberg |first=Henrik L. |title=Zwei Ungleichungen zur Ergänzung des Tchebycheffschen Lemmas |trans_title=Two Inequalities Supplementing the Tchebycheff Lemma |journal=Skandinavisk Aktuarietidskrift (Scandinavian Actuarial Journal) |year=1940 |volume=1940 |issue=3–4 |pages=121–125 |doi=10.1080/03461238.1940.10404804 |url=http://www.tandfonline.com/doi/abs/10.1080/03461238.1940.10404804#preview |accessdate=7 October 2012 |language=German |issn=0346-1238 |oclc=610399869}}</ref> To simplify the notation let
| |
| | |
| : <math> Y = \alpha X + \beta </math>
| |
| | |
| where
| |
| | |
| : <math> \alpha = \frac{ 2 k }{ b - a }</math>
| |
| | |
| and
| |
| | |
| : <math> \beta = \frac{ - ( b + a ) k }{ b - a }. </math>
| |
| | |
| The result of this linear transformation is to make ''P''(''a'' ≤ ''X'' ≤ ''b'') equal to ''P''(|''Y''| ≤ ''k'').
| |
| | |
The mean (''μ''<sub>''X''</sub>) and variance (''σ''<sub>''X''</sub><sup>2</sup>) of ''X'' are related to the mean (''μ''<sub>''Y''</sub>) and variance (''σ''<sub>''Y''</sub><sup>2</sup>) of ''Y'':
| |
| | |
| : <math> \mu_Y = \alpha \mu_X + \beta </math>
| |
| | |
| : <math> \sigma_Y^2 = \alpha^2 \sigma_X^2. </math>
| |
| | |
| With this notation Selberg's inequality states that
| |
| | |
| : <math> P( | Y | < k ) \ge \frac{ ( k - \mu_Y )^ 2 }{ ( k - \mu_Y )^2 + \sigma_Y^2 } \quad\text{ if }\quad \sigma_Y^2 \le \mu_Y ( k - \mu_Y ) </math>
| |
| | |
| : <math> P( | Y | < k ) \ge 1 - \frac{ \sigma_Y^2 + \mu_Y^2 }{ k^2 } \quad\text{ if }\quad \mu_Y ( k - \mu_Y ) \le \sigma_Y^2 \le k^2 - \mu_Y^2 </math>
| |
| | |
| : <math> P( | Y | < k ) \ge 0 \quad\text{ if }\quad k^2 - \mu_Y^2 \le \sigma_Y^2.</math>
| |
| | |
| These are known to be the best possible bounds.<ref name=Conlon00>{{cite journal |last1=Conlon |first1=J. |last2=Dulá |first2=J. H. |title=A geometric derivation and interpretation of Tchebyscheff's Inequality |url=http://www.people.vcu.edu/~jdula/WORKINGPAPERS/tcheby.pdf |accessdate=2 October 2012}}</ref>
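Selberg's three cases can be combined into one function (a sketch; the function name is illustrative, and any positive value of the scale ''k'' of the linear transformation gives the same bound).

<syntaxhighlight lang="python">
def selberg_bound(mu_x, sigma2_x, a, b, k=1.0):
    """Selberg's lower bound on P(a <= X <= b), following the three cases above.
    Any positive scale k of the linear transformation gives the same value."""
    alpha = 2 * k / (b - a)
    beta = -(b + a) * k / (b - a)
    mu_y = alpha * mu_x + beta
    sigma2_y = alpha**2 * sigma2_x

    if sigma2_y <= mu_y * (k - mu_y):
        return (k - mu_y)**2 / ((k - mu_y)**2 + sigma2_y)
    if mu_y * (k - mu_y) <= sigma2_y <= k**2 - mu_y**2:
        return 1 - (sigma2_y + mu_y**2) / k**2
    return 0.0

# Mean 0, variance 1, interval [-2, 2]: the bound reduces to the Chebyshev value 0.75.
print(selberg_bound(0.0, 1.0, -2.0, 2.0))
</syntaxhighlight>

For a symmetric interval such as [''μ'' − 2''σ'', ''μ'' + 2''σ''] the second case applies and the bound reduces to the familiar Chebyshev value of 0.75.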
| |
| | |
| ===Cantelli's inequality===
| |
[[Cantelli's inequality]]<ref name=Cantelli1910>Cantelli F (1910) Intorno ad un teorema fondamentale della teoria del rischio. Bollettino dell'Associazione degli Attuari Italiani</ref> due to [[Francesco Paolo Cantelli]] states that for a real random variable (''X'') with mean (''μ'') and variance (''σ''<sup>2</sup>)
| |
| | |
| : <math> P(X - \mu \ge a) \le \frac{\sigma^2}{ \sigma^2 + a^2 } </math>
| |
| | |
| where ''a'' ≥ 0.
| |
| | |
| This inequality can be used to prove a one tailed variant of Chebyshev's inequality with ''k'' > 0<ref name=Grimmett00>Grimmett and Stirzaker, problem 7.11.9. Several proofs of this result can be found in [http://www.mcdowella.demon.co.uk/Chebyshev.html Chebyshev's Inequalities] by A. G. McDowell.</ref>
| |
| | |
| :<math> \Pr(X - \mu \geq k \sigma) \leq \frac{ 1 }{ 1 + k^2 }. </math>
| |
| | |
| The bound on the one tailed variant is known to be sharp. To see this consider the random variable ''X'' that takes the values
| |
| | |
| : <math> X = 1 </math> with probability <math> \frac{ \sigma^2 } { 1 + \sigma^2 }</math>
| |
| : <math> X = - \sigma^2 </math> with probability <math> \frac{ 1 } { 1 + \sigma^2 }.</math>
| |
| | |
| Then ''E''(''X'') = 0 and ''E''(''X''<sup>2</sup>) = ''σ''<sup>2</sup> and ''P''(''X'' < 1) = 1 / (1 + ''σ''<sup>2</sup>).
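The sharpness of the one-tailed bound can be confirmed numerically with the two-point distribution just described (a sketch assuming Python with NumPy; the variance ''σ''<sup>2</sup> = 2 is an arbitrary choice).

<syntaxhighlight lang="python">
import numpy as np

sigma2 = 2.0                                   # an arbitrary positive variance
values = np.array([1.0, -sigma2])
probs = np.array([sigma2, 1.0]) / (1.0 + sigma2)

mean = probs @ values                          # 0
var = probs @ (values - mean)**2               # sigma2
p_upper = probs[values >= 1.0].sum()           # P(X >= 1)

# p_upper equals sigma2 / (1 + sigma2), i.e. equality in Cantelli's bound with a = 1.
print(mean, var, p_upper, sigma2 / (1.0 + sigma2))
</syntaxhighlight>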
| |
| | |
| ; An application – distance between the mean and the median <!-- This section is linked from [[median]]. -->
| |
| | |
| The one-sided variant can be used to prove the proposition that for [[probability distribution]]s having an [[expected value]] and a [[median]], the mean and the median can never differ from each other by more than one [[standard deviation]]. To express this in symbols let ''μ'', ''ν'', and ''σ'' be respectively the mean, the median, and the standard deviation. Then
| |
| | |
| :<math> \left | \mu - \nu \right | \leq \sigma. </math>
| |
| | |
| There is no need to assume that the variance is finite because this inequality is trivially true if the variance is infinite.
| |
| | |
| The proof is as follows. Setting ''k'' = 1 in the statement for the one-sided inequality gives:
| |
| | |
| :<math>\Pr(X - \mu \geq \sigma) \leq \frac{ 1 }{ 2 }. </math>
| |
| | |
| Changing the sign of ''X'' and of ''μ'', we get
| |
| | |
| :<math>\Pr(X \leq \mu - \sigma) \leq \frac{ 1 }{ 2 }. </math>
| |
| | |
| Thus the median is within one standard deviation of the mean.
| |
| | |
| For a proof using Jensen's inequality see [[Median#An inequality relating means and medians|An inequality relating means and medians]].
| |
| | |
| ===Bhattacharyya's inequality===
| |
| Bhattacharyya<ref name=Bhattacharyya1987>{{cite journal |last=Bhattacharyya |first=B. B. |title=One-sided chebyshev inequality when the first four moments are known |journal=Communications in Statistics - Theory and Methods |date=27 June 2007 |year=1987 |volume=16 |issue=9|pages=2789–2791 |doi=10.1080/03610928708829540 |url=http://www.tandfonline.com/doi/abs/10.1080/03610928708829540 |accessdate=6 October 2012 |issn=0361-0926}}</ref> extended Cantelli's inequality using the third and fourth moments of the distribution.
| |
| | |
| Let ''μ'' = 0 and ''σ''<sup>2</sup> be the variance. Let γ = ''E''(''X''<sup>3</sup>) / ''σ''<sup>3</sup> and κ = ''E''(''X''<sup>4</sup>) / ''σ''<sup>4</sup>.
| |
| | |
| If ''k''<sup>2</sup> − ''k''γ − 1 > 0 then
| |
| | |
| :<math> P(X > k\sigma) \le \frac{ \kappa - \gamma^2 - 1 }{ (\kappa - \gamma^2 - 1) (1 + k^2) + (k^2 - k\gamma - 1) }.</math>
| |
| | |
| The necessity of ''k''<sup>2</sup> − ''k''γ − 1 > 0 requires that ''k'' be reasonably large.
| |
| | |
| ===Mitzenmacher and Upfal's inequality===
| |
| [[Michael Mitzenmacher | Mitzenmacher]] and [[Eli Upfal | Upfal]]<ref name=Mitzenmacher2005>{{cite book |last1=Mitzenmacher |first1=Michael |authorlink1=Michael Mitzenmacher |last2=Upfal |first2=Eli |authorlink2=Eli Upfal |title=Probability and Computing: Randomized Algorithms and Probabilistic Analysis |date=January 2005 |publisher=Cambridge Univ. Press |location=Cambridge [u.a.] |isbn=9780521835404 |url=http://www.cambridge.org/us/knowledge/isbn/item1171566/?site_locale=en_US |edition=Repr. |accessdate=6 October 2012}}</ref> note that
| |
| | |
| : <math> [ X - E( X ) ]^{ 2k } > 0 </math>
| |
| | |
| for any real ''k'' > 0 and that
| |
| | |
| : <math> E ( [ X - E( X ) ]^{ 2k } ) </math>
| |
| | |
is the 2''k''<sup>th</sup> central moment. They then show that for ''t'' > 0
| |
| | |
| : <math> P( | X - E( X ) | > t [ E( X - E( X ) )^{ 2k } ]^{ 1 / 2k } ) \le \min\left[ 1, \frac{ 1 }{ t^{ 2k } } \right]. </math>
| |
| | |
For ''k'' = 1 we obtain Chebyshev's inequality. For ''t'' ≥ 1, ''k'' > 1 and assuming that the 2''k''<sup>th</sup> moment exists, this bound is tighter than Chebyshev's inequality.
| |
| | |
| ==Related inequalities==
| |
| Several other related inequalities are also known.
| |
| | |
| ===Zelen's inequality===
| |
| Zelen has shown that<ref name=Zelen1954>Zelen M (1954) Bounds on a distribution function that are functions of moments to order four. J Res Nat Bur Stand 53:377-381</ref>
| |
| | |
: <math> P( X - \mu \ge k \sigma ) \le \left[ 1 + k^2 + \frac{ ( k^2 - k \theta_3 - 1 )^2 }{ \theta_4 - \theta_3^2 - 1 } \right]^{ -1 } </math>
| |
| | |
| with
| |
| | |
| :<math> k \ge \frac{ \theta_3 + \sqrt{ \theta_3^2 + 4 } }{ 2 } </math>
| |
| | |
| and
| |
| | |
:<math> \theta_m = \frac{ M_m }{ \sigma^m } </math>
| |
| | |
where ''M''<sub>''m''</sub> is the ''m''<sup>th</sup> moment and ''σ'' is the standard deviation.
| |
| | |
| ===He, Zhang and Zhang's inequality===
| |
| For any collection of ''n'' nonnegative independent random variables ''X''<sub>i</sub><ref name=He2010>He S, Zhang J, Zhang S (2010) Bounding probability of small deviation: A fourth moment approach. Mathematics of operations research 35 (1) 208-232. doi: 10.1287/moor.1090.0438</ref>
| |
| | |
: <math> P\left[ \frac{ \sum_{ i = 1 }^n X_i }{ n } - 1 \ge \frac{ 1 }{ n } \right] \le \frac{ 7 }{ 8 }. </math>
| |
| | |
| ===Hoeffding’s lemma===
| |
Let ''X'' be a random variable with ''a'' ≤ ''X'' ≤ ''b'' and ''E''[ ''X'' ] = 0. Then for any ''s'' > 0, we have
| |
| | |
| : <math> E[ e^{ sX } ] \le e^{ \frac{ s^2 ( b - a )^2 }{ 8 } }.</math>
| |
| | |
| === van Zuijlen's bound ===
| |
| van Zuijlen has proved the following result.<ref name=vanZuijlen2011>van Zuijlen Martien CA (2011) On a conjecture concerning the sum of independent Rademacher random variables. http://arxiv.org/abs/1112.4988</ref>
| |
| | |
| Let ''X<sub>i</sub>'' be a set of independent [[Rademacher distribution|Rademacher]] random variables: {{nowrap|''P''( ''X<sub>i</sub>'' {{=}} 1 ) {{=}}}} {{nowrap|P( ''X<sub>i</sub>'' {{=}} −1 ) {{=}} 0.5}}. Then
| |
| | |
: <math> P \Bigl( \Bigl| \frac{ \sum_{ i = 1 }^n X_i } { \sqrt n } \Bigr| \le 1 \Bigr) \ge 0.5. </math>
| |
| | |
| The bound is sharp and better than that which can be derived from the normal distribution (approximately ''P'' > 0.31).
| |
| | |
| ==Unimodal distributions==
| |
| A distribution function ''F'' is unimodal at ''ν'' if ''F'' is [[convex function|convex]] on (−∞, ''ν'') and [[concave function|concave]] on (''ν'',∞)<ref name=Feller1966>{{cite book |last=Feller |first=William |authorlink=William Feller |title=An Introduction to Probability Theory and Its Applications, Volume 2 |year=1966 |publisher=Wiley |url=http://books.google.com/?id=LhrvAAAAMAAJ&dq=%22An+introduction+to+probability+theory+and+its+applications%22+%22volume+2%22+feller |edition=2 |accessdate=6 October 2012 |page=155}}</ref> An empirical distribution can be tested for unimodality with the [[dip test]].<ref name=Hartigan1985>Hartigan J.A., Hartigan P.M. (1985) [http://projecteuclid.org/euclid.aos/1176346577 "The dip test of unimodality"]. ''Annals of Statistics'' 13 (1) 70–84 {{doi|10.1214/aos/1176346577}} {{MR|773153}}</ref>
| |
| | |
| In 1823 [[Gauss]] showed that for a unimodal distribution with a mode of zero<ref name=Gauss1823>Gauss C.F. Theoria Combinationis Observationum Erroribus Minimis Obnoxiae. Pars Prior. Pars Posterior. Supplementum. Theory of the Combination of Observations Least Subject to Errors. Part One. Part Two. Supplement. 1995. Translated by G.W. Stewart. Classics in Applied Mathematics Series, Society for Industrial and Applied Mathematics, Philadelphia</ref>
| |
| | |
| : <math> P( | X | \ge k ) \le \frac{ 4 \operatorname{ E }( X^2 ) } { 9k^2 } \quad\text{if} \quad k^2 \ge \frac{ 4 } { 3 } \operatorname{E} (X^2), </math>
| |
| | |
: <math> P( | X | \ge k ) \le 1 - \frac{ k } { \sqrt{ 3 \operatorname{ E }( X^2 ) } } \quad \text{if} \quad k^2 \le \frac{ 4 } { 3 } \operatorname{ E }( X^2 ). </math>
| |
| | |
| If the second condition holds then the second bound is always less than or equal to the first.{{citation needed|date=July 2012}}
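The two Gauss bounds are easy to evaluate and to compare with the plain Chebyshev bound when the mean and the mode coincide at zero (a sketch assuming Python with NumPy; the unit second moment and the values of ''k'' are arbitrary choices).

<syntaxhighlight lang="python">
import numpy as np

def gauss_bound(k, ex2):
    """Gauss's bound on P(|X| >= k) for a distribution unimodal about 0
    with second moment ex2 = E(X^2), following the two cases above."""
    if k**2 >= 4.0 * ex2 / 3.0:
        return 4.0 * ex2 / (9.0 * k**2)
    return 1.0 - k / np.sqrt(3.0 * ex2)

# Comparison with the plain Chebyshev bound E(X^2)/k^2 when the mean is also 0.
ex2 = 1.0
for k in (0.5, 1.0, 2.0, 3.0):
    print(k, gauss_bound(k, ex2), min(1.0, ex2 / k**2))
</syntaxhighlight>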
| |
| | |
| If the mode (''ν'') is not zero and the mean (''μ'') and standard deviation (''σ'') are both finite then denoting the root mean square deviation from the mode by ''ω'', we have{{citation needed|date=July 2012}}
| |
| | |
| : <math> \sigma \le \omega \le 2 \sigma, </math>
| |
| | |
| and
| |
| | |
| : <math> | \nu - \mu | \le \sqrt{ \frac{ 3 }{ 4 } } \omega. </math>
| |
| | |
| Winkler in 1866 extended Gauss' inequality to ''r''<sup>th</sup> moments <ref name=Winkler1886>Winkler A (1886) Math-Natur theorie Kl. Akad. Wiss Wien Zweite Abt 53, 6-41</ref> where ''r'' > 0 and the distribution is unimodal with a mode of zero:
| |
| | |
: <math> P( | X | \ge k ) \le \left( \frac{ r } { r + 1 } \right)^r \frac{ \operatorname{ E }( | X |^r ) } { k^r } \quad \text{if} \quad k^r \ge \frac{ r^r } { ( r + 1 )^{ r + 1 } } \operatorname{ E }( | X |^r ), </math>
| |
| | |
: <math> P( | X | \ge k) \le \left( 1 - \left[ \frac{ k^r }{ ( r + 1 ) \operatorname{ E }( | X |^r ) } \right]^{ 1 / r } \right) \quad \text{if} \quad k^r \le \frac{ r^r } { ( r + 1 )^{ r + 1 } } \operatorname{ E }( | X |^r ). </math>
| |
| | |
| Gauss' bound has been subsequently sharpened and extended to apply to departures from the mean rather than the mode: see the [[Vysochanskiï–Petunin inequality]] for details.
| |
| | |
| The Vysochanskiï–Petunin inequality has been extended by Dharmadhikari and Joag-Dev<ref name=Dharmadhikari1985>Dharmadhikari SW, Joag-Dev K(1985) The Gauss–Tchebyshev inequality for unimodal distributions. Teor Veroyatnost i Primenen 30(4) 817–820</ref>
| |
| | |
| : <math> P( | X | > k ) \le \max\left( \left[ \frac{ r }{( r + 1 ) k } \right]^r E| X^r |, \frac{ s }{( s - 1 ) k^r } E| X^r | - \frac{ 1 }{ s - 1 } \right) </math>
| |
| | |
| where ''s'' is a constant satisfying both ''s'' > ''r'' + 1 and ''s''(''s'' − ''r'' − 1) = ''r''<sup>''r''</sup> and ''r'' > 0.
| |
| | |
| It can be shown that these inequalities are the best possible and that further sharpening of the bounds requires that additional restrictions be placed on the distributions.
| |
| | |
| ===Unimodal symmetrical distributions===
| |
| The bounds on this inequality can also be sharpened if the distribution is both [[unimodal]] and [[Symmetric probability distribution|symmetrical]].<ref name=Clarkson2009>{{cite doi|10.1214/08-AAP536}}</ref> An empirical distribution can be tested for symmetry with a number of tests including McWilliam's R*.<ref name=McWilliams1990>McWilliams T.P. (1990) "A distribution-free test for symmetry based on a runs statistic".''Journal of the American Statistical Association'' 85 (412)1130–1133 {{jstor|2289611}}</ref> It is known that the variance of a unimodal symmetrical distribution with finite support [''a'', ''b''] is less than or equal to ( ''b'' − ''a'' )<sup>2</sup> / 12.<ref name=Seaman1987>{{cite journal |last1=Seaman |first1=John W., Jr. |last3=Odell |first3=Patrick. L. |last2=Young |first2=Dean M. |title=Improving small sample variance estimators for bounded random variables |journal=Industrial Mathematics |issn=0019-8528 |date=1987 |volume=37 |zbl=0637.62024 | pages=65–75}}</ref>
| |
| | |
| Let the distribution be supported on the [[finite (disambiguation)|finite]] [[interval (mathematics)|interval]] [ −''N'', ''N'' ] and the variance be finite. Let the [[mode (statistics)|mode]] of the distribution be zero and rescale the variance to 1. Let ''k'' > 0 and assume ''k'' < 2''N''/3. Then<ref name="Clarkson2009"/>
| |
| | |
| : <math> P( X \ge k ) \le \frac{ 1 }{ 2 } - \frac{ k }{ 2 \sqrt{ 3 } } \quad \text{if} \quad 0 \le k \le \frac{ 2 }{ \sqrt{ 3 } },</math>
| |
| | |
| : <math> P( X \ge k ) \le \frac{ 2 }{ 9k^2 } \quad \text{if} \quad \frac{ 2 }{ \sqrt{ 3 } } \le k \le \frac{ 2N }{ 3 }. </math>
| |
| | |
| If 0 < ''k'' ≤ 2 / √3 the bounds are reached with the density<ref name="Clarkson2009"/>
| |
| | |
| : <math> f( x ) = \frac{ 1 }{ 2 \sqrt{ 3 } } \quad \text{if} \quad | x | < \sqrt{ 3 } </math>
| |
| | |
| : <math> f( x ) = 0 \quad \text{if} \quad | x | \ge \sqrt{ 3 }. </math>
| |
| | |
| If 2 / √3 < ''k'' ≤ 2''N'' / 3 the bounds are attained by the distribution
| |
| | |
| :<math> ( 1 - \beta_k ) \delta_0 ( x ) + \beta_k f_k( x ), </math>
| |
| | |
where ''β''<sub>''k''</sub> = 4/(3''k''<sup>2</sup>), ''δ''<sub>0</sub> is the [[Dirac delta function]] and where
| |
| | |
| : <math> f_k( x ) = \frac{ 1 }{ 3k } \quad \text{if} \quad | x | < \frac{ 3k }{ 2 }, </math>
| |
| | |
| : <math> f_k( x ) = 0 \quad \text{if} \quad | x | \ge \frac{ 3k }{ 2 }. </math>
| |
| | |
| The existence of these densities shows that the bounds are optimal. Since ''N'' is arbitrary these bounds apply to any value of ''N''.
| |
| | |
The Camp–Meidell inequality is a related result.<ref name=Bickel1992>{{cite journal |last1=Bickel |first1=Peter J. |authorlink1=Peter J. Bickel |last2=Krieger |first2=Abba M. |title=Extensions of Chebyshev's Inequality with Applications |journal=Probability and Mathematical Statistics |year=1992 |volume=13 |issue=2 |pages=293–310 |url=http://www.math.uni.wroc.pl/~pms/publicationsArticle.php?nr=13.2&nrA=11&ppB=%20293&ppE=%20310 |accessdate=6 October 2012 |language=English |publisher=Wydawnictwo Uniwersytetu Wrocławskiego |issn=0208-4147}}</ref> For an absolutely continuous unimodal and symmetrical distribution
| |
| | |
| : <math> P( | X - \mu | \ge k \sigma ) \le 1 - \frac{ k }{ \sqrt{ 3 } } \quad \text{if} \quad k \le \frac{ 2 }{ \sqrt { 3 } }</math>
| |
| | |
| : <math> P( | X - \mu | \ge k \sigma ) \le \frac{ 4 }{ 9k^2 } \quad \text{if} \quad k > \frac{ 2 }{ \sqrt { 3 } }</math>
| |
| | |
The second of these inequalities is the same as the Vysochanskiï–Petunin inequality.
| |
| | |
| DasGupta has shown that if the distribution is known to be normal<ref name=DasGupta2000>DasGupta A (2000) Best constants in Chebychev inequalities with various applications. Metrika 5 (1) 185–200</ref>
| |
| | |
| : <math> P( | X - \mu | \ge k \sigma ) \le \frac{ 1 }{ 3 k^2 } </math>
| |
| | |
| ===Notes===
| |
| ;Effects of symmetry and unimodality
| |
| | |
| Symmetry of the distribution decreases the inequality's bounds by a factor of 2 while unimodality sharpens the bounds by a factor of 4/9.
| |
| | |
| ;Unimodal distributions
| |
| | |
| Because the mean and the mode in a unimodal distribution differ by at most √3 standard deviations<ref name=unimodal>{{cite web|url=http://www.se16.info/~se16/hgb/cheb2.htm#3unimodalinequalities |title=More thoughts on a one tailed version of Chebyshev's inequality – by Henry Bottomley |publisher=se16.info |date= |accessdate=2012-06-12}}</ref> at most 5% of a symmetrical unimodal distribution lies outside (2√10 + 3√3)/3 standard deviations of the mean (approximately 3.840 standard deviations). This is sharper than the bounds provided by the Chebyshev inequality (approximately 4.472 standard deviations).
| |
| | |
| These bounds on the mean are less sharp than those that can be derived from symmetry of the distribution alone which shows that at most 5% of the distribution lies outside approximately 3.162 standard deviations of the mean. The Vysochanskiï–Petunin inequality further sharpens this bound by showing that for such a distribution that at most 5% of the distribution lies outside 4√5/3 (approximately 2.981) standard deviations of the mean.
| |
| | |
| ;Symmetrical unimodal distributions
| |
| | |
| For any symmetrical unimodal distribution:
| |
| * approximately 5.784% of the distribution lies outside 1.96 standard deviations of the mode
| |
| * 5% of the distribution lies outside 2√10/3 (approximately 2.11) standard deviations of the mode
| |
| | |
| DasGupta's inequality states that for a normal distribution at least 95% lies within approximately 2.582 standard deviations of the mean. This is less sharp than the true figure (approximately 1.96 standard deviations of the mean).
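The figures quoted in these notes can be reproduced by solving each bound for the number of standard deviations ''k'' at which it equals 5% (a short Python/SciPy check, included only as an illustration):

<syntaxhighlight lang="python">
import math
from scipy.stats import norm

p = 0.05
bounds = [
    ("Chebyshev, 1/k^2",               math.sqrt(1.0 / p)),         # about 4.472
    ("symmetry alone, 1/(2k^2)",       math.sqrt(1.0 / (2 * p))),   # about 3.162
    ("Vysochanskii-Petunin, 4/(9k^2)", math.sqrt(4.0 / (9 * p))),   # about 2.981
    ("DasGupta (normal), 1/(3k^2)",    math.sqrt(1.0 / (3 * p))),   # about 2.582
    ("exact normal quantile",          norm.ppf(1 - p / 2)),        # about 1.960
]
for name, k in bounds:
    print(f"{name}: k = {k:.3f}")
</syntaxhighlight>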
| |
| | |
| ==Bounds for specific distributions==
| |
DasGupta has determined a set of best possible bounds for a [[normal distribution]] for this inequality.<ref name="DasGupta2000"/>
| |
| | |
Steliga and Szynal have extended these bounds to the [[Pareto distribution]].<ref name=Steliga2010>Steliga K, Szynal D (2010) Int J Pure Appl Math 58(2): 137–152</ref>
| |
| | |
| == Zero means ==
| |
When the mean (''μ'') is zero, Chebyshev's inequality takes a simple form. Let ''σ''<sup>2</sup> be the variance. Then
| |
| | |
| : <math> P(| X | \ge 1) \le \min(1, \sigma^2) </math>
| |
| | |
| With the same conditions Cantelli's inequality takes the form
| |
| | |
| : <math> P(X \ge 1) \le \frac{ \sigma^2 }{ 1 + \sigma^2 } </math>
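A Monte Carlo sketch (Python with NumPy; the normal distribution and the value ''σ'' = 0.8 are arbitrary illustrative choices, not prescribed by the text) comparing both bounds with the corresponding empirical probabilities:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)

sigma = 0.8                               # any standard deviation; arbitrary
x = rng.normal(0.0, sigma, size=10**6)    # a zero-mean example distribution

chebyshev = min(1.0, sigma**2)            # bound on P(|X| >= 1)
cantelli = sigma**2 / (1.0 + sigma**2)    # bound on P(X >= 1)

print("P(|X| >= 1) empirical:", np.mean(np.abs(x) >= 1), " bound:", chebyshev)
print("P( X  >= 1) empirical:", np.mean(x >= 1), " bound:", cantelli)
</syntaxhighlight>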
| |
| | |
| === Unit variance ===
| |
If in addition ''E''(''X''<sup>2</sup>) = 1 and ''E''(''X''<sup>4</sup>) = ''ψ'', then for any 0 ≤ ''ε'' ≤ 1<ref name=Godwin1964a>Godwin HJ (1964) Inequalities on distribution functions. (Chapter 3) New York, Hafner Pub Co</ref>
| |
| | |
| : <math> P( | X | > \epsilon ) \ge \frac{ ( 1 - \epsilon^2 )^2 }{ \psi - 1 + ( 1 - \epsilon^2 )^2 } \ge \frac{( 1 - \epsilon^2 )^2 }{ \psi }</math>
| |
| | |
| The first inequality is sharp.
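For a concrete check (a Python sketch; the uniform distribution is merely a convenient example satisfying the conditions), take ''X'' uniform on (−√3, √3), so that ''E''(''X''<sup>2</sup>) = 1 and ''ψ'' = ''E''(''X''<sup>4</sup>) = 9/5, and compare the two lower bounds with the exact value of P(|''X''| > ''ε''):

<syntaxhighlight lang="python">
import math

psi = 9.0 / 5.0    # E(X^4) for X ~ Uniform(-sqrt(3), sqrt(3))

for eps in (0.1, 0.5, 0.9):
    exact = 1.0 - eps / math.sqrt(3.0)                      # exact P(|X| > eps)
    lower1 = (1 - eps**2)**2 / (psi - 1 + (1 - eps**2)**2)  # sharper lower bound
    lower2 = (1 - eps**2)**2 / psi                          # weaker lower bound
    print(f"eps = {eps}: exact = {exact:.4f} >= {lower1:.4f} >= {lower2:.4f}")
</syntaxhighlight>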
| |
| | |
It is also known that, for a random variable obeying the above conditions,<ref name=Lesley2003>Lesley FD, Rotar VI (2003) Some remarks on lower bounds of Chebyshev's type for half-lines. J Inequalities Pure Appl Math 4 (5) Art 96</ref>
| |
| | |
| : <math> P( X \ge \epsilon ) \ge \frac{ C_0 }{ \psi } - \frac{ C_1 }{ \sqrt{ \psi } } \epsilon + \frac{ C_2 }{ \psi \sqrt{ \psi } } \epsilon </math>
| |
| | |
| where
| |
| | |
| : <math> C_0 = 2 \sqrt{ 3 } - 3 \quad ( \approxeq 0.464 ) </math>
| |
| | |
| : <math> C_1 = 1.397 </math>
| |
| | |
| : <math> C_2 = 0.0231 </math>
| |
| | |
| It is also known that<ref name="Lesley2003"/>
| |
| | |
| : <math> P( X > 0 ) \ge \frac{ C_0 }{ \psi } </math>
| |
| | |
| The value of C<sub>0</sub> is optimal and the bounds are sharp if
| |
| | |
| : <math> \psi \ge \frac{ 3 }{ \sqrt{ 3 } + 1 } \quad ( \approxeq 1.098 ) </math>
| |
| | |
| If
| |
| | |
| : <math> \psi \le \frac{ 3 }{ \sqrt{ 3 } + 1 } </math>
| |
| | |
| then the sharp bound is
| |
| | |
| : <math> P( X > 0 ) \ge \frac{ 2 }{ 3 + \psi + \sqrt{ ( 1 + \psi )^2 - 4 } } </math>
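The two cases can be combined into a single function of ''ψ'' (a Python sketch, illustrative only); the two expressions agree at ''ψ'' = 3/(√3 + 1), where both equal 1 − 1/√3 ≈ 0.423:

<syntaxhighlight lang="python">
import math

C0 = 2.0 * math.sqrt(3.0) - 3.0    # approximately 0.464

def lower_bound_p_positive(psi: float) -> float:
    """Sharp lower bound on P(X > 0) for a zero-mean variate with
    E(X^2) = 1 and E(X^4) = psi (note psi >= 1 for any such variate)."""
    threshold = 3.0 / (math.sqrt(3.0) + 1.0)   # approximately 1.098
    if psi >= threshold:
        return C0 / psi
    return 2.0 / (3.0 + psi + math.sqrt((1.0 + psi)**2 - 4.0))

for psi in (1.0, 1.098, 1.8, 3.0):
    print(f"psi = {psi}: P(X > 0) >= {lower_bound_p_positive(psi):.4f}")
</syntaxhighlight>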
| |
| | |
| == Integral Chebyshev inequality ==
| |
| | |
| There is a second (less well known) inequality also named after Chebyshev<ref name=Fink1984>{{cite journal |last1=Fink |first1=A. M. |last2=Jodeit |first2=Max, Jr. |title=On Chebyshev's Other Inequality |journal=Institute of Mathematical Statistics Lecture Notes - Monograph Series |isbn=0-940600-04-8 |mr=789242 |editor1-first=Y. L. |editor1-last=Tong |editor2-last=Gupta |editor2-first=Shanti S. |year=1984 |volume=5 |series=Inequalities in Statistics and Probability: Proceedings of the Symposium on Inequalities in Statistics and Probability, October 27–30, 1982, Lincoln, Nebraska |pages=115–120 |doi=10.1214/lnms/1215465637 |url=http://projecteuclid.org/euclid.lnms/1215465617 |accessdate=7 October 2012}}</ref>
| |
| | |
| If ''f'', ''g'' : [''a'', ''b''] → '''R''' are two [[monotonic]] [[function (mathematics)|function]]s of the same monotonicity, then
| |
| | |
| : <math> \frac{ 1 }{ b - a } \int_a^b \! f(x) g(x) \,dx \ge \left[ \frac{ 1 }{ b - a } \int_a^b \! f(x) \,dx \right] \left[ \frac{ 1 }{ b - a } \int_a^b \! g(x) \,dx \right] </math>
| |
| | |
If ''f'' and ''g'' are of opposite monotonicity, then the above inequality is reversed.
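A quick numerical check of both directions (Python with NumPy; the interval [0, 1] and the example functions are arbitrary choices):

<syntaxhighlight lang="python">
import numpy as np

x = np.linspace(0.0, 1.0, 100_001)   # fine uniform grid on [a, b] = [0, 1]

def mean_of(values):
    """Approximates (1/(b-a)) * integral over [a, b] by averaging on the grid."""
    return values.mean()

# Same monotonicity (both increasing): mean of product >= product of means
f, g = x, x**3
print(mean_of(f * g), ">=", mean_of(f) * mean_of(g))   # about 0.2 >= 0.125

# Opposite monotonicity: the inequality is reversed
f, g = x, 1.0 - x
print(mean_of(f * g), "<=", mean_of(f) * mean_of(g))   # about 0.167 <= 0.25
</syntaxhighlight>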
| |
| | |
| This inequality is related to [[Jensen's inequality]],<ref name=Niculescu2001>{{cite journal |last=Niculescu |first=Constantin P. |title=An extension of Chebyshev's inequality and its connection with Jensen's inequality |journal=Journal of Inequalities and Applications |year=2001 |volume=6 |issue=4 |pages=451–462 |doi=10.1155/S1025583401000273 |url=http://emis.matem.unam.mx/journals/HOA/JIA/Volume6_4/462.html |accessdate=6 October 2012 |issn=1025-5834}}</ref> [[Kantorovich's inequality]],<ref name=Niculescu2001a>{{cite journal |last1=Niculescu |first1=Constantin P. |last2=Pečarić |first2=Josip |authorlink2=Josip Pečarić |title=The Equivalence of Chebyshev's Inequality to the Hermite–Hadamard Inequality |journal=Mathematical Reports |year=2010 |volume=12 |issue=62 |pages=145–156 |url=http://www.csm.ro/reviste/Mathematical_Reports/Pdfs/2010/2/Niculescu.pdf |accessdate=6 October 2012 |publisher=Publishing House of the Romanian Academy |language=English |issn=1582-3067}}</ref> the [[Hermite–Hadamard inequality]]<ref name="Niculescu2001a"/> and [[Walter's conjecture]].<ref name=Malamud2001>{{cite journal |last=Malamud |first=S. M. |title=Some complements to the Jensen and Chebyshev inequalities and a problem of W. Walter |journal=Proceedings of the American Mathematical Society |date=15 February 2001 |volume=129 |issue=9 |pages=2671–2678 |doi=10.1090/S0002-9939-01-05849-X |mr=1838791 |url=http://www.ams.org/journals/proc/2001-129-09/S0002-9939-01-05849-X/ |accessdate=7 October 2012 |issn=0002-9939}}</ref>
| |
| | |
| ===Other inequalities===
| |
| | |
There are also a number of other inequalities associated with Chebyshev:
| |
| | |
| *[[Chebyshev's sum inequality]]
| |
| *[[Chebyshev–Markov–Stieltjes inequalities]]
| |
| | |
| == Haldane's transformation ==
| |
One use of Chebyshev's inequality in applications is to create confidence intervals for variates with an unknown distribution. [[J. B. S. Haldane|Haldane]] noted,<ref name=Haldane1952>Haldane JB (1952) Simple tests for bimodality and bitangentiality. ''[[Annals of Eugenics]]'' 16(4):359–364 {{doi|10.1111/j.1469-1809.1951.tb02488.x}}</ref> using an equation derived by [[Maurice Kendall|Kendall]],<ref name=Kendall1943>Kendall MG (1943) The Advanced Theory of Statistics, 1. London</ref> that if a variate (''x'') has a zero mean, unit variance and both finite [[skewness]] (''γ'') and [[kurtosis]] (''κ''), then the variate can be converted to a normally distributed [[standard score]] (''z''):
| |
| | |
| : <math> z = x - \frac{ \gamma }{ 6 } (x^2 - 1) + \frac{ x }{ 72 } [ 2 \gamma^2 (4 x^2 - 7) - 3 \kappa (x^2 - 3) ] + \cdots </math>
| |
| | |
| This transformation may be useful as an alternative to Chebyshev's inequality or as an adjunct to it for deriving confidence intervals for variates with unknown distributions.
| |
| | |
| While this transformation may be useful for moderately skewed and/or kurtotic distributions, it performs poorly when the distribution is markedly skewed and/or kurtotic.
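A minimal sketch of the transformation as displayed above, truncated after the terms shown (Python; the input values of ''x'', ''γ'' and ''κ'' are arbitrary illustrative choices):

<syntaxhighlight lang="python">
def haldane_z(x: float, gamma: float, kappa: float) -> float:
    """Approximate standard score for a standardized variate x with
    skewness gamma and kurtosis kappa, truncating the series after the
    terms displayed above."""
    return (x
            - (gamma / 6.0) * (x**2 - 1.0)
            + (x / 72.0) * (2.0 * gamma**2 * (4.0 * x**2 - 7.0)
                            - 3.0 * kappa * (x**2 - 3.0)))

# Example: a variate with mild skewness and kurtosis, observed at x = 1.5
print(haldane_z(1.5, gamma=0.5, kappa=0.6))
</syntaxhighlight>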
| |
| | |
| == Chernoff bounds ==
| |
If the random variables can also be assumed to be independent, it is possible to obtain sharper bounds. Let ''δ'' > 0. Then<ref name=Chernoff1952>{{cite journal
| |
| |last1=Chernoff |first1=H.
| |
| |year=1952
| |
| |title=A Measure of Asymptotic Efficiency for Tests of a Hypothesis Based on the sum of Observations
| |
| |journal=[[Annals of Mathematical Statistics]]
| |
| |volume=23 |issue=4 |pages=493–507
| |
| |doi=10.1214/aoms/1177729330
| |
| |jstor=2236576
| |
| |mr=57518
| |
| |zbl=0048.11804
| |
| }}
| |
| </ref>
| |
| | |
| : <math> \delta - (1 + \delta) \log(1 + \delta) < \frac{ -\delta^2 }{ 2 + \delta } .</math>
| |
| | |
| With this inequality it can be shown that
| |
| | |
| : <math> P(X > (1 + \delta) \mu) \le e^{ \frac{ -\delta^2 \mu }{ 2 + \delta } }, </math>
| |
| | |
: <math> P(X < (1 - \delta) \mu) \le e^{ \frac{ -\delta^2 \mu }{ 2 + \delta } }, </math>
| |
| | |
where ''μ'' is the mean of the distribution. Further discussion may be found in the article on [[Chernoff bounds]].
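As an illustration (a Python/NumPy sketch for a sum of independent Bernoulli variables; the parameters ''n'', ''p'' and ''δ'' are arbitrary choices, not from the cited source), the exponential Chernoff bound can be compared with the corresponding Chebyshev bound and the empirical upper tail:

<syntaxhighlight lang="python">
import math
import numpy as np

rng = np.random.default_rng(2)

n, p = 1000, 0.1          # arbitrary illustrative parameters
mu = n * p                # mean of the sum of Bernoulli(p) variables
var = n * p * (1 - p)
delta = 0.3

sums = rng.binomial(n, p, size=10**5)   # many replications of the sum

chernoff = math.exp(-delta**2 * mu / (2.0 + delta))
chebyshev = var / (delta * mu)**2       # Chebyshev applied to the same deviation
empirical = np.mean(sums > (1.0 + delta) * mu)

print("Chernoff bound  :", chernoff)    # about 0.020
print("Chebyshev bound :", chebyshev)   # 0.10, much weaker
print("empirical tail  :", empirical)   # about 0.001
</syntaxhighlight>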
| |
| | |
| ==Notes==
| |
| ===Caution concerning use of Chebyshev's inequality===
| |
| The [[Environmental Protection Agency]] has suggested best practices for the use of Chebyshev's inequality for estimating confidence intervals.<ref name=EPA2000>{{Cite report
| |
| | title = Calculating Upper Confidence Limits for Exposure Point Concentrations at hazardous Waste Sites
| |
| | publisher = Office of Emergency and Remedial Response of the U.S. Environmental Protection Agency
| |
| | year = 2002
| |
| | month = December
| |
| | url = http://www.epa.gov/oswer/riskassessment/pdf/ucl.pdf
| |
| | accessdate = 1 October 2012
| |
| format = PDF}}</ref> This caution appears to be justified, as use of the inequality in this context may be seriously misleading.[http://www.quantdec.com/envstats/notes/class_12/ucl.htm]
| |
| | |
| ==See also==
| |
| *[[Chernoff bound]] — a bound on sums of random variables
| |
| *[[Cornish–Fisher expansion]]
| |
| *[[Eaton's inequality]]
| |
| *[[Hoeffding's inequality]] — an exponential bound on the sum of a series of random variables
| |
| *[[Kolmogorov's inequality]]
| |
| *[[Law of large numbers/Proof|Proof of the weak law of large numbers]] using Chebyshev's inequality
| |
| *[[Le Cam's theorem]]
| |
| *[[Multidimensional Chebyshev's inequality]]
| |
| *[[Paley–Zygmund inequality]]
| |
| *[[Vysochanskiï–Petunin inequality]] — a stronger result applicable to [[unimodal probability distributions]]
| |
| | |
| ==References==
| |
| {{reflist|2}}
| |
| | |
| ==Further reading==
| |
| * A. Papoulis (1991), ''Probability, Random Variables, and Stochastic Processes'', 3rd ed. McGraw–Hill. ISBN 0-07-100870-5. pp. 113–114.
| |
| * [[Geoffrey Grimmett|G. Grimmett]] and D. Stirzaker (2001), ''Probability and Random Processes'', 3rd ed. Oxford. ISBN 0-19-857222-0. Section 7.3.
| |
| | |
| ==External links==
| |
| {{commonscat}}
| |
| * {{springer|title=Chebyshev inequality in probability theory|id=p/c021890}}
| |
| * [http://mws.cs.ru.nl/mwiki/random_2.html#T7 Formal proof] in the [[Mizar system]].
| |
| | |
| [[Category:Articles containing proofs]]
| |
| [[Category:Probabilistic inequalities]]
| |
| [[Category:Probability theory]]
| |
| [[Category:Statistical inequalities]]
| |