{{No footnotes|date=November 2010}}
 
This article supplements “[[Convergence of random variables]]” and provides proofs for selected results.
 
Several results will be established using the '''[[portmanteau lemma]]''': A sequence {''X<sub>n</sub>''} converges in distribution to ''X'' if and only if any one of the following conditions holds (an informal numerical check of condition A is sketched after the list):
<ol type=A>
  <li> E[''f''(''X<sub>n</sub>'')] → E[''f''(''X'')] for all [[Bounded function|bounded]], [[continuous function]]s ''f'';
  <li> E[''f''(''X<sub>n</sub>'')] → E[''f''(''X'')] for all bounded, [[Lipschitz function]]s ''f'';
  <li> limsup Pr(''X<sub>n</sub>'' ∈ ''C'') ≤ Pr(''X'' ∈ ''C'') for all [[closed set]]s ''C''.
</ol>
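
As an informal numerical check of condition A (an illustration only, not part of the lemma: NumPy, the sequence ''X<sub>n</sub>'' = ''X'' + 1/''n'', the test function, and the sample size are all choices made here), one can estimate both expectations by Monte Carlo:

<syntaxhighlight lang="python">
import numpy as np

# Illustrative sketch: X_n = X + 1/n converges in distribution to X,
# so E[f(X_n)] should approach E[f(X)] for bounded continuous f.
rng = np.random.default_rng(0)

def f(t):
    return np.tanh(t)               # bounded, continuous (and Lipschitz)

x = rng.standard_normal(1_000_000)  # Monte Carlo samples of X ~ N(0, 1)
print("E[f(X)]    ~", f(x).mean())
for n in (1, 10, 100):
    x_n = x + 1.0 / n               # samples of X_n = X + 1/n
    print(f"E[f(X_{n})] ~", f(x_n).mean())
</syntaxhighlight>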
 
=={{anchor|propA1}} Convergence almost surely implies convergence in probability==
: <math>X_n\ \xrightarrow{as}\ X  \quad\Rightarrow\quad  X_n\ \xrightarrow{p}\ X</math>
'''Proof:''' If {''X<sub>n</sub>''} converges to ''X'' almost surely, then the set of points {ω: lim ''X<sub>n</sub>''(ω) ≠ ''X''(ω)} has measure zero; denote this set ''O''. Now fix ε > 0 and consider the sequence of sets
: <math>A_n = \bigcup_{m\geq n} \left \{ \left |X_m-X \right |>\varepsilon \right\}</math>
 
This sequence of sets is decreasing: ''A''<sub>''n''</sub> ⊇ ''A''<sub>''n''+1</sub> ⊇ ..., and it decreases towards the set
 
:<math>A_{\infty} = \bigcap_{n \geq 1} A_n.</math>
 
For this decreasing sequence of events, the probabilities also form a decreasing sequence, which, by continuity from above, decreases towards Pr(''A''<sub>∞</sub>); we now show that this number is equal to zero. Any point ω in the complement of ''O'' is such that lim ''X<sub>n</sub>''(ω) = ''X''(ω), which implies that |''X<sub>n</sub>''(ω) − ''X''(ω)| < ε for all ''n'' greater than a certain number ''N''. Therefore, for all ''n'' ≥ ''N'' the point ω does not belong to the set ''A<sub>n</sub>'', and consequently does not belong to ''A''<sub>∞</sub>. This means that ''A''<sub>∞</sub> is disjoint from <span style="text-decoration:overline">''O''</span>; equivalently, ''A''<sub>∞</sub> is a subset of ''O'', and therefore Pr(''A''<sub>∞</sub>) = 0.
 
Finally, since the event {|''X<sub>n</sub>'' − ''X''| > ε} is contained in ''A<sub>n</sub>'', consider
: <math>\operatorname{Pr}\left(|X_n-X|>\varepsilon\right) \leq \operatorname{Pr}(A_n) \ \underset{n\to\infty}{\rightarrow} 0,</math>
which by definition means that ''X<sub>n</sub>'' converges in probability to ''X''.
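
The role of the sets ''A<sub>n</sub>'' can be visualized numerically. In the following sketch (an illustration only; the sequence ''X<sub>n</sub>'' = ''X'' + ''X''/''n'', the tolerance, and the sample size are choices made here, and NumPy is assumed), |''X<sub>m</sub>'' − ''X''| = |''X''|/''m'', so the supremum over ''m'' ≥ ''n'' equals |''X''|/''n'' and Pr(''A<sub>n</sub>'') can be estimated directly:

<syntaxhighlight lang="python">
import numpy as np

# Illustrative sketch: with X_n = X + X/n (so X_n(omega) -> X(omega) for
# every sample point), the events A_n = union over m >= n of {|X_m-X| > eps}
# shrink, and Pr(A_n) -> 0, which dominates Pr(|X_n - X| > eps).
rng = np.random.default_rng(0)
eps = 0.05
x = rng.standard_normal(1_000_000)         # X(omega) for many omegas
for n in (1, 10, 100, 1000):
    # here |X_m - X| = |X|/m, so the sup over m >= n equals |X|/n
    pr_A_n = np.mean(np.abs(x) / n > eps)  # estimate of Pr(A_n)
    print(f"n={n:5d}  Pr(A_n) ~ {pr_A_n:.4f}")
</syntaxhighlight>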
 
=={{anchor|propA1i}} Convergence in probability does not imply almost sure convergence in the discrete case==
If ''X<sub>n</sub>'' are independent random variables assuming value one with probability 1/''n'' and zero otherwise, then ''X<sub>n</sub>'' converges to zero in probability but not almost surely. This can be verified using the [[Borel–Cantelli lemma]]s: since Pr(|''X<sub>n</sub>''| > ε) = 1/''n'' → 0 for any ε ∈ (0, 1), the sequence converges to zero in probability; but since Σ Pr(''X<sub>n</sub>'' = 1) = Σ 1/''n'' diverges and the variables are independent, the second Borel–Cantelli lemma gives ''X<sub>n</sub>'' = 1 infinitely often with probability one, so the sequence does not converge to zero almost surely.
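
A simulation sketch of this example (illustrative only; NumPy, the seed, the horizon, and the number of paths are arbitrary choices made here): the marginal probability Pr(''X<sub>n</sub>'' = 1) = 1/''n'' becomes small, yet on each simulated path ones keep appearing arbitrarily late, in line with the second Borel–Cantelli lemma.

<syntaxhighlight lang="python">
import numpy as np

# Illustrative simulation: X_n ~ Bernoulli(1/n), independent across n.
# Pr(X_n = 1) = 1/n -> 0 (convergence in probability to 0), but since
# sum 1/n diverges, ones occur infinitely often on almost every path.
rng = np.random.default_rng(0)
n_max, n_paths = 100_000, 100
ns = np.arange(1, n_max + 1)
paths = rng.random((n_paths, n_max)) < 1.0 / ns   # X_n = 1 with prob 1/n

print("Pr(X_n = 1) at n =", n_max, "~", paths[:, -1].mean())   # small
# index of the last 1 on each path: typically close to n_max,
# suggesting the ones never stop occurring
last_one = n_max - np.argmax(paths[:, ::-1], axis=1)
print("median index of the last 1 per path:", int(np.median(last_one)))
</syntaxhighlight>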
 
=={{anchor|propA2}} Convergence in probability implies convergence in distribution==
: <math>    X_n\ \xrightarrow{p}\ X \quad\Rightarrow\quad X_n\ \xrightarrow{d}\ X,</math>
 
===Proof for the case of scalar random variables===
'''Lemma.''' Let ''X'', ''Y'' be random variables, ''a'' a real number and ε > 0. Then
: <math>    \operatorname{Pr}(Y \leq a) \leq \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr}(|Y - X| > \varepsilon).</math>
 
'''Proof of lemma:'''
: <math>\begin{align}
\operatorname{Pr}(Y\leq a) &= \operatorname{Pr}(Y\leq a,\ X\leq a+\varepsilon) + \operatorname{Pr}(Y\leq a,\ X>a+\varepsilon) \\
      &\leq \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr}(Y-X\leq a-X,\ a-X<-\varepsilon) \\
      &\leq \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr}(Y-X<-\varepsilon) \\
      &\leq \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr}(Y-X<-\varepsilon) + \operatorname{Pr}(Y-X>\varepsilon)\\
      &= \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr}(|Y-X|>\varepsilon)
  \end{align}</math>
 
'''Proof of the theorem:''' Recall that in order to prove convergence in distribution, one must show that the sequence of cumulative distribution functions of ''X<sub>n</sub>'' converges to ''F<sub>X</sub>'' at every point where ''F<sub>X</sub>'' is continuous. Let ''a'' be such a point. For every ε > 0, the preceding lemma (applied once with ''Y'' = ''X<sub>n</sub>'' and ''X'' = ''X'', and once with ''Y'' = ''X'', ''X'' = ''X<sub>n</sub>'', and ''a'' replaced by ''a'' − ε) gives:
: <math>\begin{align}
\operatorname{Pr}(X_n\leq a) &\leq \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr}(|X_n-X|>\varepsilon) \\
\operatorname{Pr}(X\leq a-\varepsilon)&\leq \operatorname{Pr}(X_n\leq a) + \operatorname{Pr}(|X_n-X|>\varepsilon)
\end{align}</math>
 
So, we have
: <math> \operatorname{Pr}(X\leq a-\varepsilon) - \operatorname{Pr} \left (\left |X_n-X \right |>\varepsilon \right ) \leq \operatorname{Pr} \left (X_n\leq a \right ) \leq \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr} \left (\left |X_n-X \right |>\varepsilon \right ).  </math>
 
Taking the limit as ''n'' → ∞ (using lim inf on the left-hand side and lim sup on the right-hand side, since the limit of Pr(''X<sub>n</sub>'' ≤ ''a'') is not yet known to exist), we obtain:
: <math>    F_X(a-\varepsilon) \leq \liminf_{n\to\infty} \operatorname{Pr}(X_n \leq a) \leq \limsup_{n\to\infty} \operatorname{Pr}(X_n \leq a) \leq F_X(a+\varepsilon),</math>
where ''F<sub>X</sub>''(''a'') = Pr(''X'' ≤ ''a'') is the [[cumulative distribution function]] of ''X''. This function is continuous at ''a'' by assumption, so both ''F<sub>X</sub>''(''a''−ε) and ''F<sub>X</sub>''(''a''+ε) converge to ''F<sub>X</sub>''(''a'') as ε → 0<sup>+</sup>. Taking this limit squeezes the lim inf and the lim sup together, so the limit exists and
: <math>    \lim_{n\to\infty} \operatorname{Pr}(X_n \leq a) = \operatorname{Pr}(X \leq a),</math>
which means that {''X<sub>n</sub>''} converges to ''X'' in distribution.
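
As a numerical illustration of the theorem (not part of the proof; the construction ''X<sub>n</sub>'' = ''X'' + ''Z<sub>n</sub>''/√''n'', the evaluation point, and the sample size are assumptions made here, and NumPy is assumed), one can watch Pr(''X<sub>n</sub>'' ≤ ''a'') approach ''F<sub>X</sub>''(''a''):

<syntaxhighlight lang="python">
import numpy as np

# Illustrative sketch: X_n = X + Z_n/sqrt(n) -> X in probability, so
# Pr(X_n <= a) should approach F_X(a) at continuity points a of F_X.
rng = np.random.default_rng(0)
a = 0.5
x = rng.standard_normal(1_000_000)          # samples of X ~ N(0, 1)
print("F_X(a)        ~", np.mean(x <= a))
for n in (1, 10, 100):
    x_n = x + rng.standard_normal(x.size) / np.sqrt(n)
    print(f"Pr(X_{n} <= a) ~", np.mean(x_n <= a))
</syntaxhighlight>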
 
===Proof for the generic case===
Since ''X<sub>n</sub>'' converges to ''X'' in probability, |''X<sub>n</sub>'' − ''X''| converges in probability to zero, and the constant sequence ''X'', ''X'', … trivially converges to ''X'' in distribution. Applying the property [[#propB2|proved later on this page]] (with ''Y<sub>n</sub>'' = ''X<sub>n</sub>'' and the constant sequence in place of {''X<sub>n</sub>''} there), we conclude that ''X<sub>n</sub>'' converges to ''X'' in distribution.
 
=={{anchor|propB1}} Convergence in distribution to a constant implies convergence in probability==
: <math>    X_n\ \xrightarrow{d}\ c \quad\Rightarrow\quad X_n\ \xrightarrow{p}\ c,</math> <span style="position:relative;top:.4em;left:2em;">provided ''c'' is a constant.</span>
 
'''Proof:''' Fix ε > 0. Let ''B''<sub>ε</sub>(''c'') be the open ball of radius ε around the point ''c'', and ''B''<sub>ε</sub><span style="position:relative;left:-.4em"><sup>''c''</sup>(''c'')</span> its complement. Then
: <math>\operatorname{Pr}\left(|X_n-c|\geq\varepsilon\right) = \operatorname{Pr}\left(X_n\in B_\varepsilon^c(c)\right).</math>
By the portmanteau lemma (part C), if ''X<sub>n</sub>'' converges in distribution to ''c'', then the [[limsup]] of the latter probability must be less than or equal to Pr(''c'' ∈ ''B''<sub>ε</sub><span style="position:relative;left:-.4em"><sup>''c''</sup>(''c'')</span>), which is obviously equal to zero, since the complement of an open ball is a closed set that does not contain ''c''. Therefore
 
: <math>\begin{align}
\lim_{n\to\infty}\operatorname{Pr}\left( \left |X_n-c \right |\geq\varepsilon\right) &\leq \limsup_{n\to\infty}\operatorname{Pr}\left( \left |X_n-c \right | \geq \varepsilon \right) \\
&= \limsup_{n\to\infty}\operatorname{Pr}\left(X_n\in B_\varepsilon^c(c)\right) \\
&\leq \operatorname{Pr}\left(c\in B_\varepsilon^c(c)\right) = 0
\end{align}</math>
 
which by definition means that ''X<sub>n</sub>'' converges to ''c'' in probability.
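
A numerical sketch of the proposition (illustrative only; the sequence ''X<sub>n</sub>'' = ''c'' + ''Z''/''n'', the tolerance, and the sample size are choices made here, and NumPy is assumed): ''X<sub>n</sub>'' converges in distribution to the constant ''c'', and the probability of an ε-deviation indeed vanishes.

<syntaxhighlight lang="python">
import numpy as np

# Illustrative sketch: X_n = c + Z/n converges in distribution to the
# constant c, and, as the proposition asserts, also in probability:
# Pr(|X_n - c| >= eps) -> 0.
rng = np.random.default_rng(0)
c, eps = 2.0, 0.1
z = rng.standard_normal(1_000_000)
for n in (1, 10, 100):
    x_n = c + z / n
    print(f"n={n:4d}  Pr(|X_n - c| >= eps) ~ {np.mean(np.abs(x_n - c) >= eps):.4f}")
</syntaxhighlight>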
 
=={{anchor|propB2}} Convergence in probability to a sequence converging in distribution implies convergence to the same distribution==
: <math>    |Y_n-X_n|\ \xrightarrow{p}\ 0,\ \ X_n\ \xrightarrow{d}\ X\  \quad\Rightarrow\quad  Y_n\ \xrightarrow{d}\ X</math>
 
'''Proof:''' We will prove this theorem using the portmanteau lemma, part B. As required in that lemma, consider any bounded function ''f'' (i.e. |''f''(''x'')| ≤ ''M'') which is also Lipschitz:
 
: <math>\exists K >0, \forall x,y: \quad |f(x)-f(y)|\leq K|x-y|.</math>
 
Take some ε > 0 and majorize the expression |E[''f''(''Y<sub>n</sub>'')] − E[''f''(''X<sub>n</sub>'')]| as
 
: <math>\begin{align}
\left|\operatorname{E}\left[f(Y_n)\right] - \operatorname{E}\left [f(X_n) \right] \right| &\leq \operatorname{E} \left [\left |f(Y_n) - f(X_n) \right | \right ]\\
&= \operatorname{E}\left[ \left |f(Y_n) - f(X_n) \right |\mathbf{1}_{\left \{|Y_n-X_n|<\varepsilon \right \}} \right] + \operatorname{E}\left[ \left |f(Y_n) - f(X_n) \right |\mathbf{1}_{\left \{|Y_n-X_n|\geq\varepsilon \right \}} \right] \\
&\leq \operatorname{E}\left[K \left |Y_n - X_n \right |\mathbf{1}_{\left \{|Y_n-X_n|<\varepsilon \right \}}\right] + \operatorname{E}\left[2M\mathbf{1}_{\left \{|Y_n-X_n|\geq\varepsilon \right \}}\right] \\
&\leq K \varepsilon \operatorname{Pr} \left (\left |Y_n-X_n \right |<\varepsilon\right) + 2M \operatorname{Pr} \left( \left |Y_n-X_n \right |\geq\varepsilon\right )\\
&\leq K \varepsilon + 2M \operatorname{Pr} \left (\left |Y_n-X_n \right |\geq\varepsilon \right )
\end{align}</math>
 
(here '''1'''<sub>{...}</sub> denotes the [[indicator function]]; the expectation of an indicator function is equal to the probability of the corresponding event). Therefore
: <math>\begin{align}
\left |\operatorname{E}\left [f(Y_n)\right ] - \operatorname{E}\left [f(X) \right ]\right | &\leq \left|\operatorname{E}\left[ f(Y_n) \right ]-\operatorname{E} \left [f(X_n) \right ] \right| + \left|\operatorname{E}\left [f(X_n) \right ]-\operatorname{E}\left [f(X) \right] \right| \\
    &\leq K\varepsilon + 2M \operatorname{Pr}\left (|Y_n-X_n|\geq\varepsilon\right )+ \left |\operatorname{E}\left[ f(X_n) \right]-\operatorname{E} \left [f(X) \right ]\right|.
  \end{align}</math>
If we take the lim sup in this expression as ''n'' → ∞, the second term goes to zero since {''Y<sub>n</sub>'' − ''X<sub>n</sub>''} converges to zero in probability; and the third term also converges to zero, by the portmanteau lemma and the fact that ''X<sub>n</sub>'' converges to ''X'' in distribution. Thus
: <math>    \limsup_{n\to\infty} \left|\operatorname{E}\left [f(Y_n) \right] - \operatorname{E}\left [f(X) \right ] \right| \leq K\varepsilon.</math>
Since ε was arbitrary, we conclude that the lim sup, and hence the limit, must be equal to zero, and therefore E[''f''(''Y<sub>n</sub>'')] → E[''f''(''X'')], which again by the portmanteau lemma implies that {''Y<sub>n</sub>''} converges to ''X'' in distribution. QED.
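
The bound can be observed numerically. In the following sketch (an illustration under assumed choices: ''X<sub>n</sub>'' = ''X'' + 1/''n'', ''Y<sub>n</sub>'' = ''X<sub>n</sub>'' + ''U<sub>n</sub>''/''n'', ''f'' = tanh, and NumPy), E[''f''(''Y<sub>n</sub>'')] approaches E[''f''(''X'')]:

<syntaxhighlight lang="python">
import numpy as np

# Illustrative sketch: |Y_n - X_n| -> 0 in probability and X_n -> X in
# distribution, so E[f(Y_n)] -> E[f(X)] for bounded Lipschitz f.
rng = np.random.default_rng(0)

def f(t):
    return np.tanh(t)   # bounded and Lipschitz

x = rng.standard_normal(1_000_000)
print("E[f(X)]    ~", f(x).mean())
for n in (1, 10, 100):
    x_n = x + 1.0 / n                             # X_n -> X in distribution
    y_n = x_n + rng.uniform(-1, 1, x.size) / n    # |Y_n - X_n| -> 0 in probability
    print(f"E[f(Y_{n})] ~", f(y_n).mean())
</syntaxhighlight>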
 
=={{anchor|propB3}} Convergence of one sequence in distribution and another to a constant implies joint convergence in distribution==
: <math>    X_n\ \xrightarrow{d}\ X,\ \ Y_n\ \xrightarrow{d}\ c\ \quad\Rightarrow\quad (X_n,Y_n)\ \xrightarrow{d}\ (X,c)
  </math> <span style="position:relative;top:.4em;left:2em;">provided ''c'' is a constant.</span>
 
'''Proof:''' We will prove this statement using the portmanteau lemma, part A.
 
First we want to show that (''X<sub>n</sub>'', ''c'') converges in distribution to (''X'', ''c''). By the portmanteau lemma this will be true if we can show that E[''f''(''X<sub>n</sub>'', ''c'')] → E[''f''(''X'', ''c'')] for any bounded continuous function ''f''(''x'', ''y''). So let ''f'' be an arbitrary bounded continuous function. Consider the function of a single variable ''g''(''x'') := ''f''(''x'', ''c''). This is clearly also bounded and continuous, and therefore, applying the portmanteau lemma to the sequence {''X<sub>n</sub>''} converging in distribution to ''X'', we have E[''g''(''X<sub>n</sub>'')] → E[''g''(''X'')]. The latter expression is equivalent to “E[''f''(''X<sub>n</sub>'', ''c'')] → E[''f''(''X'', ''c'')]”, and therefore we now know that (''X<sub>n</sub>'', ''c'') converges in distribution to (''X'', ''c'').
 
Secondly, consider |(''X<sub>n</sub>'', ''Y<sub>n</sub>'') − (''X<sub>n</sub>'', ''c'')| = |''Y<sub>n</sub>'' − ''c''|. This expression converges in probability to zero because ''Y<sub>n</sub>'' converges in probability to ''c''. Thus we have demonstrated two facts:
: <math>\begin{cases}
    \left| (X_n, Y_n) - (X_n,c) \right|\ \xrightarrow{p}\ 0, \\
    (X_n,c)\ \xrightarrow{d}\ (X,c).
  \end{cases}</math>
By the property [[#propB2|proved earlier]], these two facts imply that (''X<sub>n</sub>'', ''Y<sub>n</sub>'') converges in distribution to (''X'', ''c'').
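
A Monte Carlo illustration of the joint statement (not part of the proof; the particular sequences, the bounded continuous test function, and NumPy are assumptions made here):

<syntaxhighlight lang="python">
import numpy as np

# Illustrative sketch: X_n = X + 1/n -> X in distribution, Y_n -> c, and
# E[f(X_n, Y_n)] -> E[f(X, c)] for a bounded continuous f of two arguments.
rng = np.random.default_rng(0)
c = 2.0

def f(u, v):
    return np.tanh(u * v)   # bounded, continuous in (u, v)

x = rng.standard_normal(1_000_000)
print("E[f(X, c)]     ~", f(x, c).mean())
for n in (1, 10, 100):
    x_n = x + 1.0 / n                            # X_n -> X in distribution
    y_n = c + rng.standard_normal(x.size) / n    # Y_n -> c
    print(f"E[f(X_n, Y_n)] ~", f(x_n, y_n).mean(), f"(n={n})")
</syntaxhighlight>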
 
=={{anchor|propB4}} Convergence of two sequences in probability implies joint convergence in probability==
: <math>X_n\ \xrightarrow{p}\ X,\ \ Y_n\ \xrightarrow{p}\ Y\ \quad\Rightarrow\quad (X_n,Y_n)\ \xrightarrow{p}\ (X,Y)</math>
 
'''Proof:'''
: <math>\begin{align}
\operatorname{Pr}\left(\left|(X_n,Y_n)-(X,Y)\right|\geq\varepsilon\right) &\leq \operatorname{Pr}\left(|X_n-X| + |Y_n-Y|\geq\varepsilon\right) \\
&\leq\operatorname{Pr}\left(|X_n-X|\geq\tfrac{\varepsilon}{2}\right) + \operatorname{Pr}\left(|Y_n-Y|\geq\tfrac{\varepsilon}{2}\right)
\end{align}</math>
Each of the probabilities on the right-hand side converges to zero as ''n'' → ∞ by the definition of convergence of {''X<sub>n</sub>''} and {''Y<sub>n</sub>''} in probability to ''X'' and ''Y'' respectively. Taking the limit we conclude that the left-hand side also converges to zero, and therefore the sequence {(''X<sub>n</sub>'', ''Y<sub>n</sub>'')} converges in probability to (''X'', ''Y'').
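
The union bound used in this proof can be checked numerically (illustrative sketch only; the noise construction, tolerance, sample size, and NumPy are choices made here):

<syntaxhighlight lang="python">
import numpy as np

# Illustrative sketch: with X_n - X = Z_n/n and Y_n - Y = W_n/n, the
# probability of a joint eps-deviation (Euclidean norm) is bounded by the
# sum of the two marginal eps/2-deviation probabilities, and both vanish.
rng = np.random.default_rng(0)
eps = 0.1
size = 1_000_000
for n in (1, 10, 100):
    dx = rng.standard_normal(size) / n   # X_n - X
    dy = rng.standard_normal(size) / n   # Y_n - Y
    lhs = np.mean(np.hypot(dx, dy) >= eps)
    rhs = np.mean(np.abs(dx) >= eps / 2) + np.mean(np.abs(dy) >= eps / 2)
    print(f"n={n:4d}  Pr(|(X_n,Y_n)-(X,Y)| >= eps) ~ {lhs:.4f}  <=  bound ~ {rhs:.4f}")
</syntaxhighlight>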
 
==See also==
* [[Convergence of random variables]]
 
==References==
{{refbegin}}
* {{cite book
  | last = van der Vaart
  | first = Aad W.
  | title = Asymptotic statistics
  | year = 1998
  | publisher = Cambridge University Press
  | location = Cambridge
  | isbn = 978-0-521-49603-2
  | lccn = QA276.V22 1998
  | ref = CITEREFvan_der_Vaart1998
  }}
{{refend}}
 
{{DEFAULTSORT:Proofs Of Convergence Of Random Variables}}
[[Category:Article proofs]]
[[Category:Probability theory]]
