Stock correlation network: Difference between revisions
en>Addbot m Bot: Removing Orphan Tag (Nolonger an Orphan) (Report Errors) |
|||
| Line 1: | Line 1: | ||
'''Pseudo-random number sampling''' or '''non-uniform pseudo-random variate generation''' is the [[Numerical analysis|numerical]] practice of generating [[pseudo-random number]]s that are distributed according to a given [[probability distribution]]. | |||
Methods of sampling a non-[[Uniform distribution (continuous)|uniform distribution]] are typically based on the availability of a [[pseudo-random number generator]] producing numbers ''X'' that are uniformly distributed. Computational algorithms are then used to manipulate a single [[random variate]], ''X'', or often several such variates, into a new random variate ''Y'' such that these values have the required distribution. | |||
Historically, basic methods of pseudo-random number sampling were developed for [[Monte-Carlo method|Monte-Carlo simulations]] in the [[Manhattan project]];{{Citation needed|date=June 2011}} they were first published by [[John von Neumann]] in the early 1950s.{{Citation needed|date=June 2011}} | |||
== Finite discrete distributions == | |||
For a [[discrete probability distribution]] with a finite number ''n'' of indices at which the [[probability mass function]] ''f'' takes non-zero values, the basic sampling algorithm is straightforward. The interval <nowiki>[</nowiki>0, 1<nowiki>)</nowiki> is divided in ''n'' intervals [0, ''f''(1)), [''f''(1), ''f''(1) + ''f''(2)), ... The width of interval ''i'' equals the probability ''f''(''i''). | |||
One draws a uniformly distributed pseudo-random number ''X'', and searches for the index ''i'' of the corresponding interval. The so determined ''i'' will have the distribution ''f''(''i''). | |||
Formalizing this idea becomes easier by using the cumulative distribution function | |||
:<math>F(i)=\sum_{j=1}^i f(j).</math> | |||
It is convenient to set ''F''(0) = 0. The ''n'' intervals are then simply [''F''(0), ''F''(1)), [''F''(1), ''F''(2)), ..., [''F''(''n'' − 1), ''F''(''n'')). The main computational task is then to determine ''i'' for which ''F''(''i'' − 1) ≤ ''X'' < ''F''(''i''). | |||
This can be done by different algorithms: | |||
* [[Linear search]], computational time linear in ''n''. | |||
* [[Binary search]], computational time goes with log ''n''. | |||
* [[Indexed search]],<ref>Ripley (1987) {{Page needed|date=June 2011}}</ref> also called the ''cutpoint method''.<ref>Fishman (1996) {{Page needed|date=June 2011}}</ref> | |||
* [[Alias method]], computational time is constant, using some pre-computed tables. | |||
* There are other methods that cost constant time.<ref>Fishman (1996) {{Page needed|date=June 2011}}</ref> | |||
== Continuous distributions == | |||
Generic methods for generating [[statistical independence|independent]] samples: | |||
* [[Rejection sampling]] | |||
* [[Inverse transform sampling]] | |||
* [[Slice sampling]] | |||
* [[Ziggurat algorithm]], for monotonously decreasing density functions | |||
* [[Convolution random number generator]], not a sampling method in itself: it describes the use of arithmetics on top of one ore more existing sampling methods to generate more involved distributions. | |||
Generic methods for generating [[correlated]] samples (often necessary for unusually-shaped or high-dimensional distributions): | |||
* [[Markov chain Monte Carlo]], the general principle | |||
* [[Metropolis–Hastings algorithm]] | |||
* [[Gibbs sampling]] | |||
* [[Slice sampling]] | |||
* [[Reversible-jump Markov chain Monte Carlo]], when the number of dimensions is not fixed (e.g. when estimating a [[mixture model]] and simultaneously estimating the number of mixture components) | |||
* [[Particle filter]]s, when the observed data is connected in a [[Markov chain]] and should be processed sequentially | |||
For generating a [[normal distribution]]: | |||
* [[Box–Muller transform]] | |||
* [[Marsaglia polar method]] | |||
For generating a [[Poisson distribution]]: | |||
* See [[Poisson distribution#Generating Poisson-distributed random variables]] | |||
== Software Libraries == | |||
[[GNU Scientific Library]] has a section entitled "Random Number Distributions" with routines for sampling under more than twenty different distributions. | |||
== Footnotes == | |||
{{reflist}} | |||
== Literature == | |||
* Devroye, L. (1986) ''Non-Uniform Random Variate Generation.'' New York: Springer | |||
* Fishman, G.S. (1996) ''Monte Carlo. Concepts, Algorithms, and Applications.'' New York: Springer | |||
* Hörmann, W.; J Leydold, G Derflinger (2004,2011) ''Automatic Nonuniform Random Variate Generation.'' Berlin: Springer. | |||
* [[Donald Knuth|Knuth, D.E.]] (1997) ''[[The Art of Computer Programming]]'', Vol. 2 ''Seminumerical Algorithms'', Chapter 3.4.1 (3rd edition). | |||
* Ripley, B.D. (1987) ''Stochastic Simulation''. Wiley. | |||
{{DEFAULTSORT:Pseudo-Random Number Sampling}} | |||
[[Category:Pseudorandom number generators]] | |||
[[Category:Non-uniform random numbers]] | |||
Revision as of 04:11, 9 January 2013
Pseudo-random number sampling or non-uniform pseudo-random variate generation is the numerical practice of generating pseudo-random numbers that are distributed according to a given probability distribution.
Methods of sampling a non-uniform distribution are typically based on the availability of a pseudo-random number generator producing numbers X that are uniformly distributed. Computational algorithms are then used to manipulate a single random variate, X, or often several such variates, into a new random variate Y such that these values have the required distribution.
Historically, basic methods of pseudo-random number sampling were developed for Monte-Carlo simulations in the Manhattan project;Potter or Ceramic Artist Truman Bedell from Rexton, has interests which include ceramics, best property developers in singapore developers in singapore and scrabble. Was especially enthused after visiting Alejandro de Humboldt National Park. they were first published by John von Neumann in the early 1950s.Potter or Ceramic Artist Truman Bedell from Rexton, has interests which include ceramics, best property developers in singapore developers in singapore and scrabble. Was especially enthused after visiting Alejandro de Humboldt National Park.
Finite discrete distributions
For a discrete probability distribution with a finite number n of indices at which the probability mass function f takes non-zero values, the basic sampling algorithm is straightforward. The interval [0, 1) is divided in n intervals [0, f(1)), [f(1), f(1) + f(2)), ... The width of interval i equals the probability f(i). One draws a uniformly distributed pseudo-random number X, and searches for the index i of the corresponding interval. The so determined i will have the distribution f(i).
Formalizing this idea becomes easier by using the cumulative distribution function
It is convenient to set F(0) = 0. The n intervals are then simply [F(0), F(1)), [F(1), F(2)), ..., [F(n − 1), F(n)). The main computational task is then to determine i for which F(i − 1) ≤ X < F(i).
This can be done by different algorithms:
- Linear search, computational time linear in n.
- Binary search, computational time goes with log n.
- Indexed search,[1] also called the cutpoint method.[2]
- Alias method, computational time is constant, using some pre-computed tables.
- There are other methods that cost constant time.[3]
Continuous distributions
Generic methods for generating independent samples:
- Rejection sampling
- Inverse transform sampling
- Slice sampling
- Ziggurat algorithm, for monotonously decreasing density functions
- Convolution random number generator, not a sampling method in itself: it describes the use of arithmetics on top of one ore more existing sampling methods to generate more involved distributions.
Generic methods for generating correlated samples (often necessary for unusually-shaped or high-dimensional distributions):
- Markov chain Monte Carlo, the general principle
- Metropolis–Hastings algorithm
- Gibbs sampling
- Slice sampling
- Reversible-jump Markov chain Monte Carlo, when the number of dimensions is not fixed (e.g. when estimating a mixture model and simultaneously estimating the number of mixture components)
- Particle filters, when the observed data is connected in a Markov chain and should be processed sequentially
For generating a normal distribution:
For generating a Poisson distribution:
Software Libraries
GNU Scientific Library has a section entitled "Random Number Distributions" with routines for sampling under more than twenty different distributions.
Footnotes
43 year old Petroleum Engineer Harry from Deep River, usually spends time with hobbies and interests like renting movies, property developers in singapore new condominium and vehicle racing. Constantly enjoys going to destinations like Camino Real de Tierra Adentro.
Literature
- Devroye, L. (1986) Non-Uniform Random Variate Generation. New York: Springer
- Fishman, G.S. (1996) Monte Carlo. Concepts, Algorithms, and Applications. New York: Springer
- Hörmann, W.; J Leydold, G Derflinger (2004,2011) Automatic Nonuniform Random Variate Generation. Berlin: Springer.
- Knuth, D.E. (1997) The Art of Computer Programming, Vol. 2 Seminumerical Algorithms, Chapter 3.4.1 (3rd edition).
- Ripley, B.D. (1987) Stochastic Simulation. Wiley.
- ↑ Ripley (1987) Template:Page needed
- ↑ Fishman (1996) Template:Page needed
- ↑ Fishman (1996) Template:Page needed