'''Functional principal component analysis''' ('''FPCA''') is a [[statistics|statistical method]] for investigating the dominant modes of variation of [[Functional data analysis|functional data]]. Using this method, a [[random function]] is represented in the eigenbasis, which is an [[orthonormality|orthonormal]] basis of the [[Hilbert space]] ''L''<sup>2</sup> that consists of the eigenfunctions of the [[covariance operator|autocovariance operator]]. FPCA represents functional data in the most parsimonious way, in the sense that when using a fixed number of [[basis functions]], the eigenfunction basis explains more variation than any other basis expansion. FPCA can be applied for representing random functions,<ref name="jones and rice 1992">{{cite doi|10.1080/00031305.1992.10475870}}</ref> or for functional regression<ref name="Yao 2005b">{{cite doi| 10.1214/009053605000000660}}</ref> and classification.
==Formulation==

For a [[Square-integrable function|square-integrable]] [[stochastic process]] ''X''(''t''), ''t'' ∈ 𝒯, let

:<math> \mu(t) = \text{E}(X(t)) </math>

and

:<math> G(s, t) = \text{Cov}(X(s), X(t)) = \sum_{k=1}^\infty \lambda_k \varphi_k(s) \varphi_k(t), </math>

where ''λ''<sub>1</sub> ≥ ''λ''<sub>2</sub> ≥ ··· ≥ 0 are the eigenvalues and ''φ''<sub>1</sub>, ''φ''<sub>2</sub>, ... are the [[orthonormality|orthonormal]] eigenfunctions of the linear [[Hilbert–Schmidt operator]]

:<math> G: L^2(\mathcal{T}) \rightarrow L^2(\mathcal{T}),\, G(f) = \int_\mathcal{T} G(s, t) f(s) ds. </math>

By the [[Karhunen–Loève theorem]], one can express the centered process in the eigenbasis,

:<math> X(t) - \mu(t) = \sum_{k=1}^\infty \xi_k \varphi_k(t), </math>

where

:<math> \xi_k = \int_\mathcal{T} (X(t) - \mu(t)) \varphi_k(t) dt </math>

is the principal component associated with the ''k''-th eigenfunction ''φ''<sub>''k''</sub>, with the properties

:<math> \text{E}(\xi_k) = 0, \text{Var}(\xi_k) = \lambda_k \text{ and } \text{E}(\xi_k \xi_l) = 0 \text{ for } k \ne l.</math>

The centered process is then equivalent to ''ξ''<sub>1</sub>, ''ξ''<sub>2</sub>, .... A common assumption is that ''X'' can be represented by only the first few eigenfunctions (after subtracting the mean function), i.e.

:<math> X(t) \approx X_m(t) = \mu(t) + \sum_{k=1}^m \xi_k \varphi_k(t),</math>

where

:<math> \mathrm{E}\left(\int_{\mathcal{T}} \left( X(t) - X_m(t)\right)^2 dt\right) = \sum_{j>m} \lambda_j \rightarrow 0 \text{ as } m \rightarrow \infty .</math>
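The truncation error formula can be checked numerically in a simple case. The following sketch uses standard [[Brownian motion]] on [0, 1] as an assumed example process, since its Karhunen–Loève eigenvalues are known in closed form as ''λ''<sub>''k''</sub> = ((''k'' − 1/2)π)<sup>−2</sup>:

```python
import numpy as np

# Karhunen-Loeve eigenvalues of standard Brownian motion on [0, 1]:
# lambda_k = ((k - 1/2) * pi)^(-2), with eigenfunctions
# phi_k(t) = sqrt(2) * sin((k - 1/2) * pi * t).
k = np.arange(1, 100001)
lam = 1.0 / (((k - 0.5) * np.pi) ** 2)

total = lam.sum()               # approaches E(int X(t)^2 dt) = int t dt = 1/2
resid = total - lam[:4].sum()   # truncation error sum_{j>m} lambda_j for m = 4
print(total, resid)
```

Here the total variance sums to 1/2 and the residual after keeping only ''m'' = 4 components is already small, illustrating how a few eigenfunctions can capture most of the variation.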
==Interpretation of eigenfunctions==

The first eigenfunction ''φ''<sub>1</sub> depicts the dominant mode of variation of ''X'',

:<math> \varphi_1 = \underset{\Vert \mathbf{\varphi} \Vert = 1}{\operatorname{arg\,max}} \left\{\operatorname{Var}\left(\int_\mathcal{T} (X(t) - \mu(t)) \varphi(t) dt\right) \right\}, </math>

where

:<math> \Vert \mathbf{\varphi} \Vert = \left( \int_\mathcal{T} \varphi(t)^2 dt \right)^{\frac{1}{2}}. </math>

The ''k''-th eigenfunction ''φ''<sub>''k''</sub> is the dominant mode of variation orthogonal to ''φ''<sub>1</sub>, ''φ''<sub>2</sub>, ..., ''φ''<sub>''k''-1</sub>,

:<math> \varphi_k = \underset{\Vert \mathbf{\varphi} \Vert = 1,\, \langle \varphi, \varphi_j \rangle = 0 \text{ for } j = 1, \dots, k-1}{\operatorname{arg\,max}} \left\{\operatorname{Var}\left(\int_\mathcal{T} (X(t) - \mu(t)) \varphi(t) dt\right) \right\}, </math>

where

:<math> \langle \varphi, \varphi_j \rangle = \int_\mathcal{T} \varphi(t)\varphi_j(t) dt, \text{ for } j = 1, \dots, k-1. </math>
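This variational characterization can be verified numerically: on a discretization grid, Var(∫(''X'' − ''μ'')''φ'' ''dt'') is proportional to the quadratic form ''φ''<sup>T</sup>''Gφ'', which the top eigenvector maximizes over unit-norm directions. A minimal sketch, using the Brownian-motion covariance min(''s'', ''t'') as an assumed stand-in for ''G'':

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 50)
G = np.minimum.outer(t, t)   # Brownian-motion covariance min(s, t), discretized

w, v = np.linalg.eigh(G)     # eigenvalues in ascending order
phi1 = v[:, -1]              # top eigenvector: discretized first eigenfunction

# On the grid, Var(int (X - mu) phi dt) is proportional to phi' G phi,
# so phi1 should beat every other unit-norm direction.
best = phi1 @ G @ phi1
others = []
for _ in range(1000):
    u = rng.normal(size=50)
    u /= np.linalg.norm(u)
    others.append(u @ G @ u)
print(best, max(others))
```

No random unit vector attains a larger quadratic form than the top eigenvector, as guaranteed by the Rayleigh-quotient characterization of eigenvalues.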
==Estimation==

Let ''Y''<sub>''ij''</sub> = ''X''<sub>''i''</sub>(''t''<sub>''ij''</sub>) + ε<sub>''ij''</sub> be the observations made at locations (usually time points) ''t''<sub>''ij''</sub>, where ''X''<sub>''i''</sub> is the ''i''-th realization of the smooth stochastic process that generates the data, and the ε<sub>''ij''</sub> are independent and identically distributed normal random variables with mean 0 and variance σ<sup>2</sup>, ''j'' = 1, 2, ..., ''m''<sub>''i''</sub>. To obtain an estimate of the mean function ''μ''(''t''<sub>''ij''</sub>), if a dense sample on a regular grid is available, one may take the average at each location ''t''<sub>''ij''</sub>:

:<math> \hat{\mu}(t_{ij}) = \frac{1}{n} \sum_{i=1}^n Y_{ij}. </math>

If the observations are sparse, one needs to smooth the data pooled from all observations to obtain the mean estimate,<ref name="yao 2005a">{{cite doi|10.1198/016214504000001745}}</ref> using smoothing methods like [[local regression|local linear smoothing]] or [[spline smoothing]].

Then an estimate of the covariance function <math> \hat{G}(s, t) </math> is obtained by averaging (in the dense case) or smoothing (in the sparse case) the raw covariances

:<math> G_i(t_{ij}, t_{il}) = (Y_{ij} - \hat{\mu}(t_{ij})) (Y_{il} - \hat{\mu}(t_{il})), \quad j \neq l, \quad i = 1, \dots, n. </math>

Note that the diagonal elements of ''G''<sub>''i''</sub> should be removed because they contain measurement error.<ref name="Staniswalis and Lee 1998">{{cite doi| 10.1080/01621459.1998.10473801}}</ref>

In practice, <math> \hat{G}(s, t) </math> is discretized to an equally spaced dense grid, and the estimation of the eigenvalues ''λ''<sub>''k''</sub> and eigenvectors ''v''<sub>''k''</sub> is carried out by numerical linear algebra.<ref name="rice and silverman 1991">{{cite journal| last=Rice | first=John|last2=Silverman|first2=B.|year=1991|title=Estimating the Mean and Covariance Structure Nonparametrically When the Data are Curves|journal=Journal of the Royal Statistical Society. Series B (Methodological)|volume= 53|issue=1|pages=233–243|publisher=Wiley}}</ref> The eigenfunction estimates <math> \hat{\varphi}_k</math> can then be obtained by [[interpolation|interpolating]] the eigenvectors <math> \hat{v}_k. </math>

The fitted covariance should be [[Positive-definite matrix|positive definite]] and [[Symmetric matrix|symmetric]] and is then obtained as

:<math> \tilde{G}(s, t) = \sum_{\hat{\lambda}_k > 0} \hat{\lambda}_k \hat{\varphi}_k(s) \hat{\varphi}_k(t). </math>

Let <math> \hat{V}(t) </math> be a smoothed version of the diagonal elements ''G''<sub>''i''</sub>(''t<sub>ij</sub>'', ''t<sub>ij</sub>'') of the raw covariance matrices. Then <math> \hat{V}(t) </math> is an estimate of (''G''(''t'', ''t'') + ''σ''<sup>2</sup>). An estimate of ''σ''<sup>2</sup> is obtained by

:<math> \hat{\sigma}^2 = \frac{2}{|\mathcal{T}|} \int_{\mathcal{T}} (\hat{V}(t) - \tilde{G}(t, t)) dt </math> if <math> \hat{\sigma}^2 > 0, </math> and <math> \hat{\sigma}^2 = 0 </math> otherwise.

If the observations ''X''<sub>''ij''</sub>, ''j'' = 1, 2, ..., ''m<sub>i</sub>'', are dense in 𝒯, then the ''k''-th FPC ''ξ''<sub>''k''</sub> can be estimated by [[numerical integration]], implementing

:<math> \hat{\xi}_k = \langle X - \hat{\mu}, \hat{\varphi}_k \rangle. </math>
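For densely and regularly sampled curves, the estimation steps above (cross-sectional mean, discretized covariance, numerical eigendecomposition, and scores by numerical integration) can be sketched in a few lines. The two-component simulated process, the sample sizes, and the absence of noise are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 101                   # number of curves and grid points (illustrative)
t = np.linspace(0, 1, m)
dt = t[1] - t[0]

# Simulate X_i(t) = xi_i1 phi_1(t) + xi_i2 phi_2(t) with known eigenstructure.
ks = np.array([1.0, 2.0])
phi = np.sqrt(2) * np.sin((ks[:, None] - 0.5) * np.pi * t)   # shape (2, m)
lam = 1.0 / (((ks - 0.5) * np.pi) ** 2)
xi = rng.normal(scale=np.sqrt(lam), size=(n, 2))
Y = xi @ phi                      # dense, noise-free observations

mu_hat = Y.mean(axis=0)           # cross-sectional mean estimate
G_hat = np.cov(Y, rowvar=False)   # discretized covariance estimate

# Numerical eigendecomposition; rescale so the eigenfunctions have unit L2 norm
# and the matrix eigenvalues match those of the integral operator.
w, v = np.linalg.eigh(G_hat)      # ascending order
lam_hat = w[::-1] * dt
phi_hat = v[:, ::-1].T / np.sqrt(dt)

# First FPC scores by numerical integration (Riemann sum).
xi1_hat = (Y - mu_hat) @ phi_hat[0] * dt
print(lam_hat[:2], lam)
```

The estimated eigenvalues approach the true ones as ''n'' grows, and the integrated scores recover the simulated ξ<sub>''i''1</sub> up to the usual sign ambiguity of eigenvectors.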
However, if the observations are sparse, this method will not work. Instead, one can use [[best linear unbiased prediction|best linear unbiased predictors]],<ref name="yao 2005a"/> yielding

:<math> \hat{\xi}_k = \hat{\lambda}_k \hat{\varphi}_k^T \hat{\Sigma}_{Y_i}^{-1}(Y_i - \hat{\mu}), </math>

where

:<math> \hat{\Sigma}_{Y_i} = \tilde{G} + \hat{\sigma}^2 \mathbf{I}_{m_i}, </math>

and <math> \tilde{G}</math> is evaluated at the grid points generated by ''t''<sub>''ij''</sub>, ''j'' = 1, 2, ..., ''m''<sub>''i''</sub>. The algorithm, PACE, is available as a Matlab package.<ref name="pace">{{cite web|url=http://www.stat.ucdavis.edu/PACE/|title=PACE: Principal Analysis by Conditional Expectation}}</ref>
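The best linear unbiased predictor above can be sketched for a single subject as follows. All fitted quantities here (the sparse time points, mean values, eigenvalues, eigenfunction values, error variance, and observations) are hypothetical placeholders standing in for the outputs of the smoothing steps described earlier:

```python
import numpy as np

# Hypothetical fitted quantities for one subject observed at 3 sparse times.
t_i = np.array([0.1, 0.4, 0.9])        # sparse observation times t_ij
mu_i = np.zeros(3)                     # mean estimate mu-hat at t_i (assumed 0)
lam_hat = np.array([0.405, 0.045])     # first two eigenvalue estimates
phi_i = np.sqrt(2) * np.sin(
    (np.array([[1.0], [2.0]]) - 0.5) * np.pi * t_i)  # phi_k(t_ij), shape (2, 3)
sigma2 = 0.01                          # measurement-error variance estimate

# Sigma_{Y_i} = G~(t_i, t_i) + sigma^2 I, with G~ from the truncated expansion.
G_i = (phi_i.T * lam_hat) @ phi_i
Sigma_i = G_i + sigma2 * np.eye(len(t_i))

Y_i = np.array([0.35, 0.80, 1.10])     # the subject's sparse observations

# xi-hat_k = lambda-hat_k * phi-hat_k' * Sigma^{-1} * (Y_i - mu-hat)
xi_hat = lam_hat * (phi_i @ np.linalg.solve(Sigma_i, Y_i - mu_i))
print(xi_hat)
```

Because the predictor conditions on only the observed points, it shrinks the scores toward zero as the measurement-error variance grows, which is the behavior expected of a BLUP.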
Asymptotic convergence properties of these estimates have been investigated.<ref name="yao 2005a"/><ref name="hall 2006">{{cite doi| 10.1214/009053606000000272}}</ref><ref name="li 2010">{{cite doi| 10.1214/10-AOS813}}</ref>

==Applications==

FPCA can be applied to display the modes of functional variation,<ref name="jones and rice 1992"/> to produce scatterplots of FPCs against each other or of responses against FPCs, to model sparse [[longitudinal data]],<ref name="yao 2005a"/> and for functional regression and classification, e.g., functional linear regression.<ref name="Yao 2005b"/> [[Factor analysis#Criteria_for_determining_the_number_of_factors|Scree plots]] and other methods can be used to determine the number of included components.
==Connection with principal component analysis==

The following table shows a comparison of various elements of [[principal component analysis]] (PCA) and FPCA. The two methods are both used for [[dimensionality reduction]]. In implementations, FPCA uses a PCA step.

However, PCA and FPCA differ in some critical aspects. First, the order of multivariate data in PCA can be [[permutation|permuted]], which has no effect on the analysis, but the order of functional data carries time or space information and cannot be reordered. Second, the spacing of observations in FPCA matters, while there is no spacing issue in PCA. Third, regular PCA does not work for high-dimensional data without [[Regularization (mathematics)|regularization]], while FPCA has built-in regularization due to the smoothness of the functional data and the truncation to a finite number of included components.
{| class="wikitable"
|-
! Element
! In PCA
! In FPCA
|-
! scope="row"| Data
| <math> X \in \mathbb{R}^p </math>
| <math> X \in L^2(\mathcal{T}) </math>
|-
! Dimension
| <math> p < \infty </math>
| <math> \infty </math>
|-
! Mean
| <math> \mu = \text{E}(X) </math>
| <math> \mu(t) = \text{E}(X(t)) </math>
|-
! Covariance
| <math> \text{Cov}(X) = \Sigma_{p \times p}</math>
| <math> \text{Cov}(X(s), X(t)) = G(s, t) </math>
|-
! Eigenvalues
| <math> \lambda_1, \lambda_2, \dots, \lambda_p </math>
| <math> \lambda_1, \lambda_2, \dots </math>
|-
! Eigenvectors/Eigenfunctions
| <math> \mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_p </math>
| <math> \varphi_1(t), \varphi_2(t), \dots </math>
|-
! Inner Product
| <math> \langle \mathbf{X}, \mathbf{Y} \rangle = \sum_{k=1}^p X_k Y_k </math>
| <math> \langle X, Y \rangle = \int_\mathcal{T} X(t) Y(t) dt </math>
|-
! Principal Components
| <math> z_k = \langle X - \mu, \mathbf{v}_k \rangle, k = 1, 2, \dots, p </math>
| <math> \xi_k = \langle X - \mu, \varphi_k\rangle, k = 1, 2, \dots </math>
|}
==See also==

*[[Principal component analysis]]

==Notes==

{{Reflist}}

==References==

* {{cite book|author1=James O. Ramsay|author2=B. W. Silverman|title=Functional Data Analysis|url=http://books.google.com/books?id=mU3dop5wY_4C|date=8 June 2005|publisher=Springer|isbn=978-0-387-40080-8}}
[[Category:Statistical methods]]
[[Category:Non-parametric statistics]]