Talk:Expected value



Archived discussion to mid 2009

what is it with you math people and your inability to express yourselves ??

I have a PhD in molecular biology, and I do a much better job of explaining complex stuff. This article is - and I am being fact based here - a disgrace. The opening paragraph should read something like: Expected value is the average or most probable value that we can expect; for instance, if we toss 2 fair dice, the expected value is (think about 6 no); if we toss 100 fair pennies... You people need some fair, but harsh criticism; I'm not enough of a math person, but someone should step up. This article, like most on math-based subjects, is not an ENCYCLOPEDIA article; it is for advanced math students. I am really mad at you people - you have had a long time to work on this and have done a bad job. I don't care if you have PhDs in math, or are tenured profs at big shot universities: you deserve all the opprobrium I'm heaping on you. Instead of dismissing me, why don't you ask a historian or English major or poet or poli sci person what they think? — Preceding unsigned comment added by (talk) 20:22, 29 January 2012 (UTC)

I do hope you understand your molecular biology better than your maths. Typical life scientist spouting nonsense. (talk) 09:23, 31 March 2013 (UTC)
No, no! In no way! A typical life scientist is much more clever and nice. This is either not a life scientist, or a quite atypical life scientist. Boris Tsirelson (talk) 12:19, 31 March 2013 (UTC)
I completely agree. I've worked as a consultant to casinos training their management teams to understand expected value and standard deviation as it applies to casino gaming so I would hope that I would at least be able to casually read and understand the introduction. As it reads now I feel like an idiot. The standard deviation article, on the other hand, is much easier to read as a layman in advanced mathematics. As an example, please compare the "application" sections of this article and the standard deviation article. Here's an excerpt from :
Scientific journals and research papers. A Wikipedia article should not be presented on the assumption that the reader is well versed in the topic's field. Introductory language in the lead and initial sections of the article should be written in plain terms and concepts that can be understood by any literate reader of Wikipedia without any knowledge in the given field before advancing to more detailed explanations of the topic. While wikilinks should be provided for advanced terms and concepts in that field, articles should be written on the assumption that the reader will not or cannot follow these links, instead attempting to infer their meaning from the text.
Academic language. Texts should be written for everyday readers, not for academics. AddBlue (talk) 08:50, 17 February 2012 (UTC)

Absolutely agree. This is a disgracefully written page. — Preceding unsigned comment added by (talk) 19:57, 14 May 2012 (UTC)

Math people expressed themselves on Wikipedia talk:WikiProject Mathematics, see the Frequently Asked Questions on the top of that page. Boris Tsirelson (talk) 05:35, 17 May 2012 (UTC)
I can hack through quantum mechanics but this “math” is beyond reason. Whoever wrote this should be ashamed and whoever keeps it like this (who knows the subject well) is even worse. This is the worst article I have seen on this site. I have several science degrees and this is almost too far gone for even me. (talk) 16:38, 25 June 2012 (UTC)

"...most probable value that we can expect..."—excuse me, but this is plainly false, as explained in the last sentence of the second paragraph in the article. So the opening paragraph shall definitely not read like this. Personally I find the article well-written and understandable. bungalo (talk) 20:29, 27 June 2012 (UTC)

Proposition to add origin of the theory

I propose to add the following in a separate chapter:

Blaise Pascal was challenged by a friend, Antoine Gombaud (self-styled “Chevalier de Méré” and writer), with a gambling problem. The problem was that of two players who want to finish a game early and, given the current circumstances of the game, want to divide the stakes fairly, based on the chance each has of winning the game from that point. How should they find this “fair amount”? In 1654, he corresponded with Pierre de Fermat on the subject of gambling. It is in the discussion of this problem that the foundations of the mathematical theory of probability were laid and the notion of expected value was introduced.

--->PLEASE let me know (right here or on my talk page) if this would not be ok; otherwise I plan to add it in a few days<---

Phdb (talk) 14:00, 24 February 2009 (UTC)

Yes please add. In fact, for quite some time the concept of expectation was more fundamental than the concept of probability in the theory. Fermat and Pascal never mentioned the word "probability" in their correspondence, for example. The probability concept then eventually emerged from the concept of expectation and replaced it as the fundamental concept of the theory. iNic (talk) 01:03, 26 February 2009 (UTC)
Laplace used the term “hope” or “mathematical hope” to denote the concept of expected value (see [1], ch.6):
I wonder what name was used by Pascal (if any), and where did “expectation” come from?  … stpasha »  19:37, 24 September 2009 (UTC)


I am not very familiar with statistics, and possibly this is why I can't see any sense in counting the numbers written on the sides of the die. What if the sides of the die were assigned signs without a defined alphanumerical order? What would be the "expected value"? -- (talk) 12:14, 11 September 2009 (UTC)

In that case you can't really speak of a "value" of a certain side. If there's no value to a side, it's impossible to speak of an expectation value of throwing the die. The concept would be meaningless. Gabbe (talk) 15:52, 22 January 2010 (UTC)
Of course, you could assign values to the sides. The sides of a coin, for example, do not have numerical values. But if you assigned the value "-1" to heads and "1" to tails you would get the expected value
$E[X] = (-1) \cdot \tfrac{1}{2} + 1 \cdot \tfrac{1}{2} = 0.$
Similarly, if you instead gave heads the value "0" and tails the value "1" you would get
$E[X] = 0 \cdot \tfrac{1}{2} + 1 \cdot \tfrac{1}{2} = \tfrac{1}{2},$
and so forth. Gabbe (talk) 20:34, 22 January 2010 (UTC)
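
For readers who prefer code, the coin example in this thread can be sketched as follows. This is only an illustration: the coin is assumed to be fair, and the helper name `expected_value` is made up for this sketch.

```python
from fractions import Fraction


def expected_value(outcomes):
    """Sum value * probability over (value, probability) pairs."""
    return sum(value * prob for value, prob in outcomes)


half = Fraction(1, 2)  # assumption: a fair coin

# Heads is assigned -1 and tails is assigned 1.
ev_signed = expected_value([(-1, half), (1, half)])

# Heads is assigned 0 and tails is assigned 1.
ev_binary = expected_value([(0, half), (1, half)])

print(ev_signed)  # 0
print(ev_binary)  # 1/2
```

The expected value depends on the numbers assigned to the sides, not on the sides themselves, which is why a die with unordered symbols has no expected value until values are assigned.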

Upgrading the article

At present (Feb 2010) the article is rated as only "Start", yet supposedly has "Top" priority and is on the frequent-viewing lists. Some initial comments on where things fall short are:

  • The lead fails to say why "expected value" is important, either generally or regarding its importance as underlying statistical inference.
  • There is a lack of references, both at a general-reader level and for the more sophisticated stuff.
  • There is some duplication, which is not necessarily a problem, but it would help if the duplicated passages were cross-referenced.
  • There is a poor ordering of material, in terms of sophistication, with elementary level stuff interspersed.

I guess others will have other thoughts. Particularly for this topic, it should be a high priority to retain a good exposition that is accessible at an elementary level, for which there is good start already in the article. Melcombe (talk) 13:17, 25 February 2010 (UTC)


"The expected value is in general not a typical value that the random variable can take on. It is often helpful to interpret the expected value of a random variable as the long-run average value of the variable over many independent repetitions of an experiment."

So the expected value is the mean for repeated experiments (why not just say so?), and yet you explicitly tell me that it is "in general not a typical value that the random variable can take on". The normal distribution begs to disagree. Regardless of theoretical justifications in multimodal cases, this is simply bizarre. More jargon != smarter theoreticians. Doug (talk) 18:33, 21 October 2010 (UTC)

What is the problem? The expected value is in general not a typical value. In the special case of the normal distribution it really is, who says otherwise? In the asymmetric unimodal case it is different from the mode. For a discrete distribution it is (in general) not a possible value at all. Boris Tsirelson (talk) 19:22, 21 October 2010 (UTC)
(why not just say so?) — because this is the statement of the law of large numbers — that when the expected value exists, the long-run average will converge almost surely to the expected value. If you define the expected value as the long-run average, then this theorem becomes circularly-dependent. Also, for some random variables it is not possible to imagine that they can be repeated many times over (say, a random variable that a person dies tomorrow). Expected value is a mathematical construct which exists regardless of the possibility to repeat the experiment.  // stpasha »  02:21, 22 October 2010 (UTC)
From the article: «formally, the expected value is a weighted average of all possible values.». A formal definition should refer to a particular definition of weight: probability. As it happens, the Wikipedia article Weighted arithmetic mean refers to a "weighted average" as a "weighted mean". "Mean" is both more precise than the ambiguous "average", and less confusing. The Wikipedia article on the Law of large numbers links to average, which again links to mean, median and mode. Our current article talks about average, but then stresses that it does not refer to a typical, nor even an actual, value—so as to eliminate the other definitions of "average" than "mean". It would be much simpler to just say "mean".
Either just say "mean", or use "mean" when referring to probability distributions, and "expected value" when referring to random variables. That's not standard, though. — Preceding unsigned comment added by SvartMan (talkcontribs) 00:04, 3 March 2014 (UTC)
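
A small simulation may help separate the two ideas discussed in this thread: the expected value as a defined quantity, and the long-run average that the law of large numbers says converges to it. The sketch below assumes a fair six-sided die; the function name and the seed are arbitrary choices for illustration.

```python
import random


def sample_mean_of_die(n_rolls, seed=0):
    """Average of n_rolls simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    for _ in range(n_rolls):
        total += rng.randint(1, 6)
    return total / n_rolls


# Expected value by definition: probability-weighted mean of the faces.
exact = sum(range(1, 7)) / 6

for n in (10, 1_000, 100_000):
    print(n, sample_mean_of_die(n))
print("exact:", exact)
```

The exact value is 3.5, which no single roll can produce, so this is also an instance of the expected value not being a possible value of a discrete random variable.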

Proposition for alternative proof of $E[X] = \int_0^\infty (1 - F(x))\,dx$

I tried to add the proof below which I believe to be correct (except for a minor typo which is now changed). This was undone because "it does not work for certain heavy-tailed distribution such as Pareto (α < 1)". Can someone elaborate?

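Alternative proof: Using integration by parts

$E[X] = \int_0^\infty x f(x)\,dx = \left[-x(1 - F(x))\right]_0^\infty + \int_0^\infty (1 - F(x))\,dx,$

and the bracket vanishes because $1 - F(x) = o(1/x)$ as $x \to \infty$. —Preceding unsigned comment added by (talk) 02:50, 13 May 2011 (UTC)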

Actually the end of section 1.4 seems in agreement, so I am reinstating my changes —Preceding unsigned comment added by (talk) 02:54, 13 May 2011 (UTC)
I have removed it again. The "proof" is invalid as it explicitly relies on the assumption $1 - F(x) = o(1/x)$ as $x \to \infty$, which does not hold for all cdfs (e.g. Pareto as said above). You might try reversing the argument and doing an integration by parts, starting with the "result", which might then be shown to be equivalent to the formula involving the density. PS, please sign your posts on talk pages. JA(000)Davidson (talk) 09:40, 13 May 2011 (UTC)
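Let's try to sort this out: I claim that whenever X nonnegative has an expectation, then $1 - F(x) = o(1/x)$ as $x \to \infty$ (the Pareto distribution with alpha < 1 doesn't even have an expectation, so this is not a valid counter-example).
Proof: Assuming X has density function f, we have for any $c > 0$
$E[X] = \int_0^c x f(x)\,dx + \int_c^\infty x f(x)\,dx \ge \int_0^c x f(x)\,dx + c \int_c^\infty f(x)\,dx.$
Recognizing $\int_c^\infty f(x)\,dx = 1 - F(c)$ and rearranging terms:
$0 \le c\,(1 - F(c)) \le E[X] - \int_0^c x f(x)\,dx \to 0$ as $c \to \infty$,
as claimed.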
Are we all in agreement, or am I missing something again? Phaedo1732 (talk) 19:05, 13 May 2011 (UTC)
Regardless of the validity of the proof, is an alternative proof a strong addition to the page? CRETOG8(t/c) 19:12, 13 May 2011 (UTC)
I think so, because the current proof is more like a trick than a generic method, whereas the alternative proof could be generalized (as shown in Section 1.4) I also think the point of an encyclopedia is to give more information rather than less. Phaedo1732 (talk) 00:49, 14 May 2011 (UTC)
See item 6 of WP:NOTTEXTBOOK, and WP:MSM#Proofs. This doesn't seem to be a place that needs a proof at all. What is needed is a proper citation for the result, and a proper statement of the result and its generalisation to other lower bounds. (I.e. the result could be used as an alternative definition of "expected value", but are the definitions entirely equivalent?) JA(000)Davidson (talk) 08:28, 16 May 2011 (UTC)
Clearly the previous editor of that section thought a proof should be given. If anyone comes up with a good citation, I am all for it. Phaedo1732 (talk) 15:31, 16 May 2011 (UTC)

Simple generalization of the cumulative function integral

Currently the article has the integral

$E[X] = \int_0^\infty (1 - F(x))\,dx$

for non-negative random variables X. However, the non-negativeness restriction is easily removed, resulting in

$E[X] = -\int_{-\infty}^0 F(x)\,dx + \int_0^\infty (1 - F(x))\,dx.$

Should we give the more general form, too? -- Coffee2theorems (talk) 22:33, 25 November 2011 (UTC)

But do not forget the minus sign before the first integral. Boris Tsirelson (talk) 15:47, 26 November 2011 (UTC)
Oops. Fixed. Anyhow, do you think it would be a useful addition? -- Coffee2theorems (talk) 19:29, 4 December 2011 (UTC)
Yes, why not. I always present it in my courses.
And by the way, did you see in "general definition" these formulas:
I doubt it is true under just this condition. Boris Tsirelson (talk) 07:26, 5 December 2011 (UTC)
Moreover, the last formula is ridiculous:
if Pr[X ≥ 0] = 1, where F is the cumulative distribution function of X.
Who needs the absolute value of X assuming that X is non-negative? Boris Tsirelson (talk) 07:30, 5 December 2011 (UTC)
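
As a sanity check on the generalized formula proposed at the top of this section (with the minus sign before the first integral), here is a rough numerical sketch. It assumes a normal distribution with mean 1.5 and standard deviation 2, chosen only because it takes negative values; the helper names, the midpoint rule, and the cutoff of 40 are arbitrary choices for this illustration.

```python
from statistics import NormalDist


def midpoint_integral(func, lower, upper, steps=20_000):
    """Approximate the integral of func over [lower, upper] by the midpoint rule."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


def mean_from_cdf(cdf, cutoff=40.0):
    """E[X] = -int_{-inf}^0 F(x) dx + int_0^inf (1 - F(x)) dx, truncated at +/- cutoff."""
    negative_part = midpoint_integral(cdf, -cutoff, 0.0)
    positive_part = midpoint_integral(lambda x: 1.0 - cdf(x), 0.0, cutoff)
    return -negative_part + positive_part


dist = NormalDist(mu=1.5, sigma=2.0)  # assumed example distribution
estimate = mean_from_cdf(dist.cdf)
print(estimate)  # should be close to the mean, 1.5
```

Dropping the minus sign in `mean_from_cdf` gives a visibly wrong answer for this distribution, which is the point of the correction made above.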

"Expected value of a function" seems to be misplaced

Does the text starting with "The expected value of an arbitrary function of ..." really belong to the definition of the expectation, or would it be better to move it to Properties, between 3.6 and 3.7, and give it a new section (with which title?)? I am not entirely sure, but I think one can derive the expected value of a function of a random variable without the need for an explicit definition. After all, the function of a random variable is a random variable again; given that random variables are (measurable) functions themselves, it should be possible to construct $E(g(X))$ just from the general definition of $E$. Any thoughts? Grumpfel (talk) 21:54, 29 November 2011 (UTC)

I agree. Boris Tsirelson (talk) 06:33, 30 January 2012 (UTC)
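
To illustrate the point that $E(g(X))$ needs no separate definition, the sketch below computes it two ways for a fair die: once by weighting $g(x)$ with the distribution of X, and once by first building the distribution of the new random variable Y = g(X) and applying the ordinary definition to Y. The function $g(x) = (x - 3)^2$ is an arbitrary choice for this example.

```python
from collections import defaultdict
from fractions import Fraction

# Distribution of X: a fair six-sided die (assumption for this sketch).
pmf_x = {face: Fraction(1, 6) for face in range(1, 7)}


def g(x):
    """An arbitrary, non-injective function of the outcome."""
    return (x - 3) ** 2


# Route 1: weight g(x) by the probabilities of X.
direct = sum(g(x) * p for x, p in pmf_x.items())

# Route 2: build the distribution of Y = g(X), then take E[Y] by definition.
pmf_y = defaultdict(Fraction)
for x, p in pmf_x.items():
    pmf_y[g(x)] += p
via_y = sum(y * p for y, p in pmf_y.items())

print(direct, via_y)  # both 19/6
```

Both routes agree, which supports treating the formula for a function of a random variable as a property rather than as part of the definition.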

Expectation of the number of positive events

If there is a probability p that a certain event will happen, and there are N such events, then the expectation of the number of events is $Np$, even when the events are dependent. I think this is a useful application of the sum-of-expectations formula. --Erel Segal (talk) 14:39, 6 May 2012 (UTC)
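
One way to see this with clearly dependent events is the number of fixed points of a random permutation, an example chosen here for illustration and not taken from the article. Each of the N events "position i is a fixed point" has probability p = 1/N, the events are dependent, and the expected count is still N times p, which equals 1. The sketch enumerates all permutations exactly for small N.

```python
from fractions import Fraction
from itertools import permutations


def expected_fixed_points(n):
    """Exact expected number of fixed points of a uniformly random permutation of n items."""
    perms = list(permutations(range(n)))
    total = sum(
        sum(1 for index, value in enumerate(perm) if index == value)
        for perm in perms
    )
    return Fraction(total, len(perms))


for n in range(1, 7):
    print(n, expected_fixed_points(n))  # 1 for every n
```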


In the section on iterated expectations and the law of total expectation, the lower-case x is used to refer to particular values of the random variable denoted by capital X, so that for example

Then I found notation that looks like this:

Now what in the world would that be equal to in the case where x = 3?? It would be

but what is that??? This notation makes no sense, and I got rid of it. Michael Hardy (talk) 21:33, 13 August 2012 (UTC)

Incorrect example?

Example 2 in the definition section doesn't take into account the $1 wager. — Preceding unsigned comment added by Gregchaz (talkcontribs) 22:23, 17 November 2012 (UTC)

Isn't it factored into the $35 payout? —C.Fred (talk) 22:25, 17 November 2012 (UTC)
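
The question turns on what the $35 means. The sketch below works through both readings under stated assumptions: a $1 bet on a single number on a 38-pocket wheel (the pocket count is an assumption here, not taken from this thread), with either a net gain of $35 on a win (stake returned on top) or a total payout of $35 that includes the stake.

```python
from fractions import Fraction

POCKETS = 38  # assumption: American-style wheel
p_win = Fraction(1, POCKETS)
p_lose = 1 - p_win


def expected_net(net_gain_on_win, stake=1):
    """Expected net result of one bet: gain on a win, lose the stake otherwise."""
    return net_gain_on_win * p_win - stake * p_lose


# Reading 1: the $35 is the net gain, so the $1 wager is already accounted for.
stake_returned = expected_net(35)

# Reading 2: the $35 is the total paid out, so the net gain is only $34.
stake_included = expected_net(34)

print(stake_returned)  # -1/19
print(stake_included)  # -3/38
```

Which reading the article's example intends is for the article text to settle; the code only shows that the two readings give different expected values.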

Formulas for special cases - Non-negative discrete

In the example at the bottom, I think the sum should be from i=1, and equal (1/p)-1. For instance, if p=1, you get heads every time, so since they so carefully explained that this means that X=0, the sum should work out to 0; hence (1/p)-1 rather than (1/p).

Well, imagine that p = 1 but YOU don't know. Then you'll toss the coin, and of course it gives heads at the first try. Another way of explaining: suppose p = 0.9999, and you know it. But then you are not absolutely sure that you will get heads, and you have to toss, with a very high probability of success at the first try. Bdmy (talk) 07:27, 1 June 2013 (UTC)
The OP is correct -- Bdmy has missed the fact that getting heads on the first try with certainty or near certainty is, in the notation of the article's example, X=0 not X=1. There are several ways to see this: (1) The formula derived above the example says that the sum goes from one to infinity, not zero to infinity. Or, consider (2) if p=1/2, the possible sequences are H (X=0 with probability 1/2); TH (X=1 with probability 1/4); TTH (X=2 with probability 1/8); etc. So the expected value is 0 times 1/2 plus 1 times 1/4 plus 2 times 1/8 plus ... = 0 + 1/4 + 2/8 + 3/16 + 4/32 + ... = 1 = (1/p) - 1. Or, consider (3) the OP's example with p=1 is correct -- you will certainly get a success on the first try, so X=0 with certainty so E(X) = 0 = (1/p) - 1. I'll correct it in the article. Duoduoduo (talk) 18:37, 1 June 2013 (UTC)
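
The arithmetic in this thread can be checked with a truncated sum. In the sketch below X counts the tails before the first head, so X = i has probability $(1-p)^i p$; the function name and the number of terms are arbitrary choices for illustration.

```python
def expected_tails_before_first_head(p, terms=10_000):
    """Truncated sum of i * (1 - p)**i * p for i = 0, 1, 2, ..."""
    return sum(i * (1 - p) ** i * p for i in range(terms))


for p in (1.0, 0.5, 0.25):
    print(p, expected_tails_before_first_head(p), 1 / p - 1)
```

For each p the truncated sum agrees with (1/p) - 1, including the boundary case p = 1 where X = 0 with certainty. Counting the number of tosses instead of the number of tails would add 1 and give 1/p.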

Reason for deleting sub-section "Terminology"

I'm deleting the subsection "Terminology" of the section "Definition" for the following reasons. The section reads

When one speaks of the "expected price", "expected height", etc. one often means the expected value of a random variable that is a price, a height, etc. However, the "value" in expected value is more general than price or winnings. For example a game played to try to obtain the cost of a life saving operation would assign a high value where the winnings are above the required amount, but the value may be very low or zero for lesser amounts.
When one speaks of the "expected number of attempts needed to get one successful attempt", one might conservatively approximate it as the reciprocal of the probability of success for such an attempt. Cf. expected value of the geometric distribution.

The first sentence is a pointless tautology. The remainder of the first paragraph doesn't make a bit of sense, but maybe it is attempting to make the obvious point that sometimes "value" means a dollar value and sometimes not. This is obvious and not useful, even if it were well-expressed. The second paragraph doesn't have anything to do with either "Definition" or "Terminology", and in any event it is wrong (the "approximate" value it gives is actually exact). Duoduoduo (talk) 14:51, 2 June 2013 (UTC)

Upgrading the article

The article seems pretty clearly to have satisfied the criteria at least for C-class quality; I'd say it looks more like B-class at this point. I'm re-rating it to C-class, and I'd love to hear thoughts on the article's current quality. -Bryanrutherford0 (talk) 03:31, 18 July 2013 (UTC)

Multivariate formula

The following formula has been added for the expected value of a multivariate random variable:

First, I don't understand what calculation is called for by the formula. Why do we have a multiple integral? It seems to me that since the left side of the equation is an n-dimensional vector, the right side should also be an n-dimensional vector in which each element i has a single integral of

Second, I don't understand the subsequent sentence

Note that, for the univariate cases, the random variables X are taken as the identity functions over different sets of reals.

What different sets of reals? And in what way is X in the general case not based on the identity function -- is intended to mean something other than simply the vector  ? Duoduoduo (talk) 17:56, 12 September 2013 (UTC)

I agree, that is a mess. First, I guess that the formula
was really meant. Second, I guess, the author of this text is one of these numerous people that believe that, dealing with an n-dim random vector, we should take the probability space equal to $\mathbb{R}^n$, the probability measure equal to the distribution of the random vector, and yes, . (Probably because they have no other idea.) I am afraid that they can support this by some (more or less reliable) sources. Boris Tsirelson (talk) 18:17, 12 September 2013 (UTC)
Should it be reverted? Duoduoduo (talk) 19:40, 12 September 2013 (UTC)
Maybe. Or maybe partially deleted and partially reformulated? Boris Tsirelson (talk) 21:07, 12 September 2013 (UTC)
I'll leave it up to you -- you're more familiar with this material than I am. Duoduoduo (talk) 22:56, 12 September 2013 (UTC)
I wrote this formula. (I have a Ph.D. in Applied Mathematics, though I can admit that I am wrong if someone proves it.) I make the link between the general form and the univariate form. This formula is what I meant. This link can help to legitimate the statement . Note that if this line is not there it is hard to make the link between the general form and the univariate form.
The formula in Wolfram is OK; but why is yours different?
Before deciding whether or not your formula should be here we should decide whether or not it is correct.
In Wolfram one considers expectation of a function f of n random variables that have a joint density P. In contrast, you write "multivariate random variable admits a probability density function ". What could it mean? A (scalar) function of n random variables is a one-dimensional random variable, and its density (if exists) is a function of one variable. The random vector is a multivariate random variable and its density (if exists) is a function of n variables. What could you mean by X? Boris Tsirelson (talk) 13:59, 2 October 2013 (UTC)
You say " In Wolfram one considers expectation of a function f of n random variables that have a joint density P ". I do not agree. Indeed, they don't say that the are RANDOM variable. I would say that their definition is more prudent. More specifically, in this article, I consider in the definition of the multivariate case that the vector belongs to the sample space whereas is a random variable which is actually a function of this variable in the sample space.
Notice that (for simplicity) in the case of univariate functions, the variables of your sample space are equal to the observed random variable. I.e. you roll one die, you see 5, then the random variable returns 5; this is simple.
Let us now consider a multivariate random variable: Choose one longitude, one latitude, and a date (these are the variables of the sample space). Let us now measure something e.g. atmospheric pressure or temperature or simply the sum of longitude and latitude (even if it does not make much sense.) these are multivariate random variables. You just observe numbers and build your statistic like you would do in the section general definition.

(Unindent) Ah, yes, this is what I was fearing, see above where I wrote: "the author of this text is one of these numerous people that believe that, dealing with an n-dim random vector, we should take the probability space equal to $\mathbb{R}^n$, the probability measure equal to the distribution of the random vector, and yes, . (Probably because they have no other idea.)"

The problem is that (a) this is not the mainstream definition of a random variable (in your words it is rather the prudent definition, though I do not understand what the prudence is; as for me, the standard definition is more prudent); and (b) this "your" definition does not appear in Wikipedia (as far as I know). Really, I am not quite protesting against it. But for now the reader should be puzzled unless he/she reads your comment here on the talk page. In order to do it correctly you should first introduce "your" approach in other articles (first of all, "Random variable") and only then use it here, with the needed explanation. And of course, for succeeding with this project you need reliable sources. Boris Tsirelson (talk) 16:40, 2 October 2013 (UTC)

And please do not forget to sign your messages with four tildes: ~~~~. :-) Boris Tsirelson (talk) 16:44, 2 October 2013 (UTC)

1° Just to make things clear, I am not talking about random vector.
2° For me the definition of a random variable is the same as in the section "Measure-theoretic definition" of the article "Random Variable" and is what is actually used in the section "General definition" of the article Expected value.
3° What I am trying here is to fill the gap between the "univariate cases" and the "General definition". "univariate cases" are simplified cases of "General definition". It was not easy to see at first, so I am trying to fill the gap. My contribution is simply to consider the "General definition" with and then say that, if , then for simplicity one often consider the as done in the univariate cases. (talk) 11:38, 3 October 2013 (UTC)
But your formula is not parallel to the univariate case (as it is presented for now):
"If the probability distribution of X admits a probability density function f(x), then the expected value can be computed as $E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$."
You see, nothing special is assumed about the probability space; it is left arbitrary (as usual), and does not matter. What matters is the distribution. Not at all "". If you want to make it parallel, you should first add your-style formulation to the univariate case: "It is always possible to use the change-of-variable formula in order to pass from an arbitrary probability space to the special case where (you know what) without changing the distribution (and therefore the expectation as well)", something like that. Also your terminology... what you call a multivariate random variable is what I would call a univariate random variable defined on the n-dimensional probability space (you know which). What about sources for your terminology? Boris Tsirelson (talk) 12:46, 3 October 2013 (UTC)
1° (Just to be aware of what we talk about) How would you define formally a "Univariate random variable" ? Note that this term is not in the article "Random variable".
2° Don't you agree that there is a gap that needs to be filled between the univariate definitions and the general definition ? I totally agree if someone helps me to improve my possibly inadequate modification.
3° As far as terminology is concerned, here is a reference for bivariate (multivariate is a similar extension) (talk) 14:25, 3 October 2013 (UTC)
Ironically, the book pointed by you confirms my view and not yours! There I read (page 29): "bivariate continuous random variable is a variable that takes a continuum of values on the plane according to the rule determined by a joint density function defined over the plane. The rule is that the probability that a bivariate random variable falls into any region on the plane is equal..."
Exactly so! (a) Nothing special is assumed about the probability space; moreover, the probability space is not mentioned. Only the distribution matters. (b) it is exactly what I called a random vector (since a point of the plane is usually identified with a pair of real numbers, as well as a 2-dim vector). I do not insist on the word "vector"; but note: bivariate means values are two-dimensional (values! not the points of the probability space, but rather their images under the measurable map from the probability space to the plane). Accordingly, expectation of a bivariate random variable is a vector (well, a point of the plane), not a number! And the formula "" is neither written nor meant.
How would I define formally a "Univariate random variable"? As a measurable map from the given probability space to the real line, of course. Boris Tsirelson (talk) 18:34, 3 October 2013 (UTC)
Surely it would be nice, to improve the article. But please, on the basis of reliable sources, not reinterpreted and mixed with your original research. Boris Tsirelson (talk) 18:50, 3 October 2013 (UTC)
Actually, I do agree that my contribution is not good enough. If you see any way to make it right, do not hesitate to transform it. Otherwise, just remove it. Thank you for your patience and involvement in this discussion. (talk) 11:32, 7 October 2013 (UTC)
OK, I've moved it to a place of more appropriate context, and adapted it a little to that place. Happy editing. Boris Tsirelson (talk) 14:43, 7 October 2013 (UTC)
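
The distinction argued in this thread, that the expectation of a bivariate random variable is a vector while the expectation of a scalar function of two random variables is a number, can be sketched with a small discrete example. The joint distribution below is made up for illustration, and the helper names are hypothetical.

```python
from fractions import Fraction

# Hypothetical joint distribution of a pair (X, Y); values are probabilities.
joint_pmf = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 2),
}


def expectation_vector(pmf):
    """Componentwise expectation of the random pair: a point of the plane."""
    mean_x = sum(x * p for (x, y), p in pmf.items())
    mean_y = sum(y * p for (x, y), p in pmf.items())
    return (mean_x, mean_y)


def expectation_of_function(pmf, func):
    """Expectation of the scalar random variable func(X, Y): a single number."""
    return sum(func(x, y) * p for (x, y), p in pmf.items())


print(expectation_vector(joint_pmf))                             # (1/2, 3/4) as Fractions
print(expectation_of_function(joint_pmf, lambda x, y: x + y))    # 5/4
```

Only the distribution of the pair enters either computation; nothing is assumed about the underlying probability space.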