Entropy estimation

In various science/engineering applications, such as [[independent component analysis]],<ref>Dinh-Tuan Pham (2004) Fast algorithms for mutual information based independent component analysis. In ''Signal Processing''. Volume 52,  Issue 10, 2690–2700, {{doi|10.1109/TSP.2004.834398}}</ref> [[image analysis]],<ref>Chang, C.-I.; Du, Y.; Wang, J.; Guo, S.-M.; Thouin, P.D. (2006) Survey and comparative analysis of entropy and relative entropy thresholding techniques. In ''Vision, Image and Signal Processing'', Volume 153,  Issue 6, 837–850, {{doi|10.1049/ip-vis:20050032}}</ref> [[genetic analysis]],<ref>Martins, D. C. ''et al.'' (2008) Intrinsically Multivariate Predictive Genes. In ''Selected Topics in Signal Processing''. Volume 2,  Issue 3, 424–439, {{doi|10.1109/JSTSP.2008.923841}}</ref> [[speech recognition]],<ref>Gue Jun Jung; Yung-Hwan Oh (2008) Information Distance-Based Subvector Clustering for ASR Parameter Quantization. In ''Signal Processing Letters'', Volume 15, 209–212, {{doi|10.1109/LSP.2007.913132 }}</ref> [[manifold learning]],<ref>Costa, J.A.; Hero, A.O. (2004), Geodesic entropic graphs for dimension and entropy estimation in manifold learning. In ''Signal Processing'', Volume 52,  Issue 8,  2210–2221, {{doi|10.1109/TSP.2004.831130}}</ref> and time delay estimation<ref>Benesty, J.; Yiteng Huang; Jingdong Chen (2007) Time Delay Estimation via Minimum Entropy. In ''Signal Processing Letters'', Volume 14,  Issue 3,  March 2007 157–160 {{doi|10.1109/LSP.2006.884038 }}</ref> it is useful to '''estimate the [[differential entropy]]''' of a system or process, given some observations.
 
The simplest and most common approach uses histogram-based estimation, but other approaches have been developed and used, each with their own benefits and drawbacks.<ref name="beirlant">J. Beirlant, E. J. Dudewicz, L. Gyorfi, and E. C. van der Meulen (1997) [http://www.its.caltech.edu/~jimbeck/summerlectures/references/Entropy%20estimation.pdf Nonparametric entropy estimation: An overview]. In ''International Journal of Mathematical and Statistical Sciences'', Volume 6, pp. 17–39.</ref> The main factor in choosing a method is often a trade-off between the bias and the variance of the estimate,<ref name="schurmann">T. Schürmann, Bias analysis in entropy estimation. In ''J. Phys. A: Math. Gen'', 37 (2004), pp. L295–L301. {{doi|10.1088/0305-4470/37/27/L02}}</ref> although the nature of the (suspected) distribution of the data may also be a factor.<ref name="beirlant"/>
 
==Histogram estimator==
The histogram approach uses the idea that the differential entropy,
 
:<math>H(X) = -\int_\mathbb{X} f(x)\log f(x)\,dx</math>
 
can be approximated by producing a [[histogram]] of the observations, and then finding the discrete entropy
 
:<math>
H(X)  =  - \sum_{i=1}^nf(x_i)\log \left(\frac{f(x_i)}{w(x_i)} \right)
</math>
 
of that histogram (which is itself a [[Maximum likelihood|maximum-likelihood (ML) estimate]] of the discretized frequency distribution), where ''w''(''x<sub>i</sub>'') is the width of the ''i''th bin. Histograms are quick to compute and simple, so this approach has some appeal. However, the estimate produced is [[Bias_(statistics)|bias]]ed, and although corrections can be made to the estimate, they may not always be satisfactory.<ref name="miller55">G. Miller (1955) Note on the bias of information estimates. In ''Information Theory in Psychology: Problems and Methods'', pp. 95–100.</ref>
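
As an illustration, the following sketch computes this basic histogram estimate for one-dimensional data (a minimal Python/NumPy example, not taken from the cited sources; the function name <code>histogram_entropy</code> is chosen here for exposition, and no bias correction is applied):

<syntaxhighlight lang="python">
import numpy as np

def histogram_entropy(samples, bins=20):
    """Rough differential-entropy estimate (in nats) from a histogram.

    Uses H ~= -sum_i p_i * log(p_i / w_i), where p_i is the relative
    frequency of bin i and w_i is its width.  No bias correction.
    """
    counts, edges = np.histogram(samples, bins=bins)
    widths = np.diff(edges)
    p = counts / counts.sum()      # ML estimate of the bin probabilities
    nonzero = p > 0                # empty bins contribute nothing
    return -np.sum(p[nonzero] * np.log(p[nonzero] / widths[nonzero]))

# Example: standard normal samples; the true differential entropy is
# 0.5 * log(2 * pi * e) ~= 1.419 nats.
rng = np.random.default_rng(0)
print(histogram_entropy(rng.standard_normal(10_000)))
</syntaxhighlight>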
 
A method better suited for multidimensional [[probability density function]]s (pdf) is to first make a [[Density estimation|pdf estimate]] with some method, and then, from the pdf estimate, compute the entropy. A useful pdf estimation method is, for example, Gaussian [[mixture model]]ing (GMM), where the [[expectation maximization]] (EM) algorithm is used to find an ML estimate of a [[weighted sum]] of Gaussian pdfs approximating the data pdf.
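
One concrete way to carry this out is sketched below; it is only an illustration under assumptions not fixed by the text (it uses scikit-learn's <code>GaussianMixture</code> and a Monte Carlo approximation of the entropy of the fitted mixture, which has no closed form):

<syntaxhighlight lang="python">
import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_entropy(samples, n_components=3, n_draws=100_000, seed=0):
    """Entropy estimate (in nats) via a Gaussian-mixture pdf estimate.

    Fits the mixture with EM, then approximates H = -E[log f(X)] by
    Monte Carlo sampling from the fitted model.
    """
    gmm = GaussianMixture(n_components=n_components, random_state=seed)
    gmm.fit(samples)                            # EM fit of the weighted Gaussian sum
    draws, _ = gmm.sample(n_draws)              # draw from the fitted mixture
    return -np.mean(gmm.score_samples(draws))   # score_samples = log-density

# Example: correlated 2-D Gaussian data.
rng = np.random.default_rng(0)
data = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.3], [0.3, 1.0]], size=5000)
print(gmm_entropy(data, n_components=1))
</syntaxhighlight>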
 
==Estimates based on sample-spacings==
If the data is one-dimensional, we can imagine taking all the observations and putting them in order of their value. The spacing between one value and the next then gives us a rough idea of (the [[Multiplicative inverse|reciprocal]] of) the probability density in that region: the closer together the values are, the higher the probability density. This is a very rough estimate with high [[variance]], but it can be improved, for example by considering the spacing between a given value and the one ''m'' places away from it, where ''m'' is some fixed number.<ref name="beirlant"/>
 
The probability density estimated in this way can then be used to calculate the entropy estimate, in a similar way to that given above for the histogram, but with some slight tweaks.
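
For concreteness, the sketch below implements one common member of this family, Vasicek's ''m''-spacing estimator (a minimal, hedged example: the exact normalisation and bias corrections vary between references):

<syntaxhighlight lang="python">
import numpy as np

def vasicek_entropy(samples, m=None):
    """Vasicek m-spacing estimate of differential entropy (1-D data, in nats).

    H ~= (1/n) * sum_i log( n/(2m) * (x_(i+m) - x_(i-m)) ),
    where x_(j) are the order statistics, clamped at the ends of the sample.
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = len(x)
    if m is None:
        m = max(1, int(np.sqrt(n)))               # a common rule of thumb
    upper = x[np.minimum(np.arange(n) + m, n - 1)]
    lower = x[np.maximum(np.arange(n) - m, 0)]
    spacings = np.maximum(upper - lower, 1e-12)   # guard against tied values
    return np.mean(np.log(n / (2.0 * m) * spacings))

# Example: Uniform(0, 1) samples; the true differential entropy is 0 nats.
rng = np.random.default_rng(0)
print(vasicek_entropy(rng.uniform(size=10_000)))
</syntaxhighlight>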
 
One of the main drawbacks with this approach is going beyond one dimension: the idea of lining the data points up in order falls apart in more than one dimension. However, using analogous methods, some multidimensional entropy estimators have been developed.<ref name="lm2003">E. G. Learned-Miller (2003) A new class of entropy estimators for multi-dimensional densities, in ''Proceedings of the [[International Conference on Acoustics, Speech, and Signal Processing]] (ICASSP’03)'', vol. 3, April 2003, pp. 297–300.</ref><ref name="il2010">I. Lee (2010) Sample-spacings based density and entropy estimators for spherically invariant multidimensional data, In ''Neural Computation'', vol. 22, issue 8, April 2010, pp. 2208–2227.</ref>
 
==Estimates based on nearest-neighbours==
For each point in our dataset, we can find the distance to its [[nearest neighbour]]. We can in fact estimate the entropy from the distribution of the nearest-neighbour distances of our datapoints.<ref name="beirlant"/> (In a uniform distribution these distances all tend to be fairly similar, whereas in a strongly nonuniform distribution they may vary a lot more.)
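
A widely used estimator of this kind is the Kozachenko–Leonenko nearest-neighbour estimator; the sketch below is a minimal implementation of it (the additive constants differ slightly between references), using SciPy's k-d tree for the neighbour search:

<syntaxhighlight lang="python">
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import gammaln

def nn_entropy(samples):
    """Kozachenko–Leonenko nearest-neighbour entropy estimate (in nats).

    H ~= (d/n) * sum_i log(rho_i) + log(V_d) + log(n - 1) + Euler gamma,
    where rho_i is the distance from point i to its nearest neighbour and
    V_d is the volume of the d-dimensional unit ball.
    """
    x = np.asarray(samples, dtype=float)
    if x.ndim == 1:
        x = x[:, None]                      # treat a flat array as 1-D data
    n, d = x.shape
    # k=2 because the closest point to each sample is the sample itself.
    rho = cKDTree(x).query(x, k=2)[0][:, 1]
    rho = np.maximum(rho, 1e-12)            # guard against duplicate points
    log_vd = (d / 2.0) * np.log(np.pi) - gammaln(d / 2.0 + 1.0)
    return d * np.mean(np.log(rho)) + log_vd + np.log(n - 1) + np.euler_gamma

# Example: standard normal in 2-D; the true entropy is log(2*pi*e) ~= 2.84 nats.
rng = np.random.default_rng(0)
print(nn_entropy(rng.standard_normal((5000, 2))))
</syntaxhighlight>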
 
== Bayesian estimator ==
In the under-sampled regime, having a prior on the distribution can help the estimation. One such [[Bayesian estimator]], proposed in the neuroscience context, is the NSB (Nemenman–Shafee–[[William Bialek|Bialek]]) estimator.<ref name=Nemenman2003>Ilya Nemenman, Fariel Shafee, William Bialek (2003) Entropy and Inference, Revisited. Advances in Neural Information Processing</ref><ref name=Nemenman2004>Ilya Nemenman, [[William Bialek]], de Ruyter (2004) Entropy and information in neural spike trains: Progress on the sampling problem. Physical Review E</ref> The NSB estimator uses a mixture-of-[[Dirichlet distribution|Dirichlet prior]], chosen such that the induced prior over the entropy is approximately uniform.
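
Concretely, in the construction described in the cited NSB papers, a symmetric Dirichlet prior with concentration parameter ''β'' over ''K'' outcomes has prior mean entropy

:<math>\xi(\beta) = \psi_0(K\beta + 1) - \psi_0(\beta + 1),</math>

where <math>\psi_0</math> is the digamma function; the NSB prior then weights ''β'' in proportion to <math>d\xi/d\beta</math>, so that the induced prior on <math>\xi</math> (and hence, approximately, on the entropy itself) is close to flat over its range from 0 to <math>\log K</math>.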
 
== Estimates based on expected entropy ==
A new approach to the problem of entropy evaluation is to compare the expected entropy of a sample of a random sequence with the calculated entropy of the sample. The method gives very accurate results, but it is limited to random sequences modeled as first-order Markov chains with small values of bias and correlations. This is the first known method that takes into account the size of the sample sequence and its impact on the accuracy of the entropy calculation.<ref name=Lesniewicz2014>Marek Lesniewicz (2014) Expected Entropy as a Measure and Criterion of Randomness of Binary Sequences [http://pe.org.pl/abstract_pl.php?nid=8206] In ''Przeglad Elektrotechniczny'', Volume 90, pp. 42–46.</ref>
 
==References==
{{reflist}}
 
{{DEFAULTSORT:Entropy Estimation}}
[[Category:Entropy and information]]
[[Category:Information theory]]
[[Category:Statistical randomness]]
[[Category:Randomness]]
[[Category:Random number generation]]
