Finnish verb conjugation: Difference between revisions

Latest revision as of 20:24, 12 December 2014

Advertising Manager Deshawn from Chalk River, has hobbies and interests including snooker, how can i get pregnant now and tombstone rubbing. Recently has made a journey to Historic Villages of Korea: Hahoe and Yangdong.

@@ Line 1: / Line 1: @@
-{{for|the signal processing concept|spectral density estimation}}
+Advertising Manager Deshawn from Chalk River, has hobbies and interests including snooker, how can i get pregnant now and tombstone rubbing. Recently has made a journey to Historic Villages of Korea: Hahoe and Yangdong.
-{{merge to|Kernel density estimation|date=September 2013}}
-{{Multiple issues|
-{{howto|date=August 2012}}
-{{refimprove|date=August 2012}}
-}}
-[[File:KernelDensityGaussianAnimated.gif|thumb|350px|Demonstration of density estimation using [[kernel smoothing]]: The true density is a mixture of two Gaussians centered around 0 and 3, shown with solid blue curve. In each frame, 100 samples are generated from the distribution, shown in red. Centered on each sample, a Gaussian kernel is drawn in gray. Averaging the Gaussians yields the density estimate shown in the dashed black curve.]]
-In [[probability]] and [[statistics]],
-'''density estimation''' is the construction of an estimate, based on observed [[data]], of an unobservable underlying [[probability density function]].  The unobservable density function is thought of as the density according to which a large population is distributed; the data are usually thought of as a random sample from that population.
-A variety of approaches to density estimation are used, including [[Parzen window]]s and a range of [[data clustering]] techniques, including [[vector quantization]]. The most basic form of density estimation is a rescaled [[histogram]].
-== Example of density estimation ==
-We will consider records of the incidence of [[diabetes]]. The following is quoted verbatim from the [[data set]] description:
-:A population of women who were at least 21 years old, of [[Pima people|Pima]] Indian heritage and living near Phoenix, Arizona,  was tested for [[diabetes mellitus]] according to [[World Health Organization]] criteria.  The data were collected by the US National Institute of Diabetes and Digestive and Kidney Diseases. We used the 532 complete records.<ref>{{cite web|url=http://stat.ethz.ch/R-manual/R-patched/library/MASS/html/Pima.tr.html|title=Diabetes in Pima Indian Women - R documentation}}</ref><ref>{{cite journal|author=Smith, J. W., Everhart, J. E., Dickson, W. C., Knowler, W. C. and Johannes, R. S.|year=1988|title=Using the ADAP learning algorithm to forecast the onset of diabetes mellitus|journal=Proceedings of the Symposium on Computer Applications in Medical Care (Washington, 1988)|editor=R. A. Greenes|pages=261–265|place=Los Alamitos, CA|publisher=IEEE Computer Society Press|url=http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2245318/}}</ref>
-In this example,
-we construct three density estimates for "glu" ([[Blood plasma|plasma]] [[glucose]] concentration),
-one [[Conditional probability|conditional]] on the presence of diabetes,
-the second conditional on the absence of diabetes,
-and the third not conditional on diabetes.
-The conditional density estimates are then used to construct the probability of diabetes conditional on "glu".
-The "glu" data were obtained from the MASS package<ref>{{cite web|url=http://cran.r-project.org/web/packages/MASS/index.html|title=Support Functions and Datasets for Venables and Ripley's MASS}}</ref> of the [[R programming language]]. Within R, <tt>?Pima.tr</tt> and <tt>?Pima.te</tt> give a fuller account of the data.
-The [[mean]] of "glu" in the diabetes cases is 143.1 and the standard deviation is 31.26.
-The mean of "glu" in the non-diabetes cases is 110.0 and the standard deviation is 24.29.
-From this we see that, in this data set, diabetes cases are associated with greater levels of "glu".
-This will be made clearer by plots of the estimated density functions.
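-The group summaries quoted above can be recomputed with a few lines of R. This is a minimal sketch, assuming the MASS package is installed; the names <tt>glu.mean</tt> and <tt>glu.sd</tt> are introduced here for illustration only.
-<source lang="rsplus">
- library(MASS)
- data(Pima.tr)
- data(Pima.te)
- # Combine the two sets into the 532 complete records
- Pima <- rbind(Pima.tr, Pima.te)
- # Mean and standard deviation of glu within each level of 'type' (No / Yes)
- glu.mean <- tapply(Pima$glu, Pima$type, mean)
- glu.sd <- tapply(Pima$glu, Pima$type, sd)
- print(round(rbind(mean = glu.mean, sd = glu.sd), 2))
-</source>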
-The first figure shows density estimates of ''p''(glu | diabetes=1), ''p''(glu | diabetes=0), and ''p''(glu).
-The density estimates are kernel density estimates using a Gaussian kernel.
-That is,
-a Gaussian density function is placed at each data point,
-and the sum of the density functions, divided by the number of data points, is computed over the range of the data.
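-The construction can be written out by hand and compared with R's built-in <tt>density()</tt>. This is a sketch only: <tt>kde.manual</tt> is a hypothetical helper, not part of any package, and the bandwidth is taken from <tt>bw.nrd0()</tt>, the rule of thumb that <tt>density()</tt> uses by default. Small differences are expected because <tt>density()</tt> evaluates the estimate with a binned approximation.
-<source lang="rsplus">
- library(MASS)
- data(Pima.tr)
- data(Pima.te)
- glu <- rbind(Pima.tr, Pima.te)[, 'glu']
- h <- bw.nrd0(glu)   # default rule-of-thumb bandwidth of density()
- # Hypothetical helper: average of Gaussian densities centred on each data point
- kde.manual <- function(x, data, h)
- {
-    sapply(x, function(x0) mean(dnorm(x0, mean = data, sd = h)))
- }
- # Same evaluation range as density(), which extends 3 bandwidths past the data
- grid <- seq(min(glu) - 3 * h, max(glu) + 3 * h, length.out = 200)
- est <- kde.manual(grid, glu, h)
- # Built-in estimate, interpolated onto the same grid for comparison
- d <- density(glu, bw = h)
- ref <- approx(d$x, d$y, xout = grid)$y
- max(abs(est - ref), na.rm = TRUE)   # largest discrepancy between the two
-</source>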
-[[File:P glu given diabetes.png|thumb|center|360px|Estimated density of ''p'' (glu &#124; diabetes=1) (red), ''p''&nbsp;(glu &#124; diabetes=0) (blue), and ''p''&nbsp;(glu) (black)]]
-From the density of "glu" conditional on diabetes,
-we can obtain the probability of diabetes conditional on "glu" via [[Bayes' rule]].
-For brevity, "diabetes" is abbreviated "db." in this formula.
-:<math> p(\mbox{diabetes}=1|\mbox{glu})
- = \frac{p(\mbox{glu}|\mbox{db.}=1)\,p(\mbox{db.}=1)}{p(\mbox{glu}|\mbox{db.}=1)\,p(\mbox{db.}=1) + p(\mbox{glu}|\mbox{db.}=0)\,p(\mbox{db.}=0)}
-</math>
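-As a rough numerical illustration of the formula, the two conditional densities can be replaced by normal densities with the means and standard deviations quoted earlier. This is a simplifying assumption made here for the sake of a short example, not the kernel estimates used for the figures; <tt>posterior.normal</tt> and <tt>prior.d1</tt> are names introduced for this sketch.
-<source lang="rsplus">
- library(MASS)
- data(Pima.tr)
- data(Pima.te)
- Pima <- rbind(Pima.tr, Pima.te)
- # Prior p(db.=1): the fraction of diabetes cases in the data
- prior.d1 <- mean(Pima$type == 'Yes')
- # Bayes' rule with normal approximations to p(glu|db.=1) and p(glu|db.=0)
- posterior.normal <- function(glu, prior.d1)
- {
-    p1 <- dnorm(glu, mean = 143.1, sd = 31.26) * prior.d1
-    p0 <- dnorm(glu, mean = 110.0, sd = 24.29) * (1 - prior.d1)
-    p1 / (p0 + p1)
- }
- # Posterior probability of diabetes at a lower and a higher glucose level
- posterior.normal(c(100, 150), prior.d1)
-</source>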
-The second figure shows the estimated posterior probability ''p''(diabetes=1 | glu).
-From these data,
-it appears that an increased level of "glu" is associated with diabetes.
-[[File:P diabetes given glu.png|thumb|center|360px|Estimated probability of ''p''(diabetes=1 &#124; glu)]]
-=== Script for example ===
-The following R commands will create the figures shown above. These commands can be entered at the R command prompt by copying and pasting.
-<source lang="rsplus">
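- library(MASS)
- data(Pima.tr)
- data(Pima.te)
- # Combine the training and test sets into the 532 complete records
- Pima <- rbind(Pima.tr, Pima.te)
- glu <- Pima[, 'glu']
- # Logical indicators for the non-diabetes and diabetes cases
- d0 <- Pima[, 'type'] == 'No'
- d1 <- Pima[, 'type'] == 'Yes'
- base.rate.d1 <- sum(d1) / (sum(d1) + sum(d0))
- # Gaussian kernel density estimates of p(glu), p(glu|db.=0) and p(glu|db.=1)
- glu.density <- density(glu)
- glu.d0.density <- density(glu[d0])
- glu.d1.density <- density(glu[d1])
- # Interpolating functions; they return NA outside the range of each estimate
- glu.d0.f <- approxfun(glu.d0.density$x, glu.d0.density$y)
- glu.d1.f <- approxfun(glu.d1.density$x, glu.d1.density$y)
- # Posterior probability of diabetes given glu, by Bayes' rule
- p.d.given.glu <- function(glu, base.rate.d1)
- {
-    p1 <- glu.d1.f(glu) * base.rate.d1
-    p0 <- glu.d0.f(glu) * (1 - base.rate.d1)
-    p1 / (p0 + p1)
- }
- x <- 1:250
- y <- p.d.given.glu(x, base.rate.d1)
- # Plot of the estimated posterior probability
- plot(x, y, type='l', col='red', xlab='glu', ylab='estimated p(diabetes|glu)')
- # Plot of the three density estimates
- plot(glu.d0.density, col='blue', xlab='glu',
-    ylab='estimated p(glu), p(glu|diabetes), p(glu|not diabetes)', main=NA)
- lines(glu.d1.density, col='red')
- lines(glu.density, col='black')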
-</source>
-Note that the above conditional density estimator uses bandwidths that are optimal for unconditional densities. Alternatively, one
-could use the method of Hall, Racine and Li (2004)<ref name=hallracineli/> and the  R np package<ref>{{cite web|url=http://cran.r-project.org/web/packages/np/index.html|title=The np package - An R package that provides a variety of nonparametric and semiparametric kernel methods that seamlessly handle a mix of continuous, unordered, and ordered factor data types}}</ref>
-for automatic (data-driven) bandwidth selection that is
-optimal for conditional density estimates; see the np vignette<ref>{{cite web|url=http://cran.r-project.org/web/packages/np/vignettes/np.pdf|title=The np Package|author=Tristen Hayfield and Jeffrey S. Racine}}</ref> for an introduction to the np package. The following R commands use the  <tt>npcdens()</tt> function to deliver optimal smoothing. Note that the response "Yes"/"No" is a factor.
-<source lang="rsplus">
- library(np)
- fy.x <- npcdens(type~glu,nmulti=1,data=Pima)
- Pima.eval <- data.frame(type=factor("Yes"),
-                        glu=seq(min(Pima$glu),max(Pima$glu),length=250))
- plot (x, y, type='l', lty=2, col='red', xlab='glu',
-      ylab='estimated p(diabetes|glu)')
- lines(Pima.eval$glu,predict(fy.x,newdata=Pima.eval),col="blue")
- legend(0,1,c("Unconditional bandwidth", "Conditional bandwidth"),
-        col=c("red","blue"),lty=c(2,1))
-</source>
-The third figure uses optimal smoothing via the method of Hall, Racine, and Li<ref name=hallracineli>{{cite journal|author=Peter Hall, Jeffrey S. Racine and Qi Li|title=Cross-Validation and the Estimation of Conditional Probability Densities|journal=Journal of The American Statistical Association|volume=99|issue=468|pages=1015–1026|year=2004|url=http://econpapers.repec.org/article/besjnlasa/v_3a99_3ay_3a2004_3ap_3a1015-1026.htm}}</ref> indicating that the unconditional density bandwidth used in the second figure above yields a conditional density estimate that may be somewhat undersmoothed.
-[[File:Glu opt.png|thumb|center|360px|Estimated probability of ''p''&nbsp;(diabetes=1 &#124; glu)]]
-== See also ==
-* [[Kernel density estimation]]
-* [[Mean integrated squared error]]
-* [[Histogram]]
-* [[Multivariate kernel density estimation]]
-* [[Spectral density estimation]]
-* [[Kernel embedding of distributions]]
-== References ==
-{{reflist}}
-'''Sources'''
-* {{cite book|author=Brian D. Ripley|title=Pattern Recognition and Neural Networks|place=Cambridge|publisher=Cambridge University Press|year=1996|url=http://books.google.de/books/about/Pattern_Recognition_and_Neural_Networks.html?hl=de&id=2SzT2p8vP1oC|isbn=978-0521460866}}
-* [[Trevor Hastie]], [[Robert Tibshirani]], and Jerome Friedman. ''The Elements of Statistical Learning''. New York: Springer, 2001. ISBN 0-387-95284-5. ''(See Chapter 6.)''
-* Qi Li and Jeffrey S. Racine. ''Nonparametric Econometrics: Theory and Practice''. Princeton University Press, 2007, ISBN 0-691-12161-3. ''(See Chapter 1.)''
-* D.W. Scott. ''Multivariate Density Estimation. Theory, Practice and Visualization''. New York: Wiley, 1992.
-* [[Bernard Silverman|B.W. Silverman]]. ''Density Estimation''. London: Chapman and Hall, 1986. ISBN 978-0-412-24620-3
-==External links==
-* [http://www.creem.st-and.ac.uk/software.php CREEM: Centre for Research Into Ecological and Environmental Modelling] Downloads for free density estimation software packages [http://www.ruwpa.st-and.ac.uk/distance/ ''Distance 4''] (from Research Unit for Wildlife Population Assessment "RUWPA") and [http://www.ruwpa.st-and.ac.uk/estimating.abundance/ ''WiSP''].
-* [http://www.ics.uci.edu/~mlearn/MLSummary.html UCI Machine Learning Repository Content Summary] ''(See "Pima Indians Diabetes Database" for the original data set of 768 records, and additional notes.)''
-* [http://www.mathworks.com/matlabcentral/fileexchange/authors/27236 Free MATLAB code for one and two dimensional density estimation]
-* [http://libagf.sourceforge.net libAGF] C++ software for [[variable kernel density estimation]].
-[[Category:Estimation of densities]]
-[[Category:Non-parametric statistics]]
