Distributional data analysis: Difference between revisions

Revision as of 07:20, 22 June 2024 by David Eppstein (Autopatrolled, Administrators; 235,677 edits), edit summary: ce
Latest revision as of 08:44, 18 December 2024 by Aadirulez8 (Extended confirmed users; 85,228 edits), edit summary: m v2.05 - Autofix / Fix errors for CW project (Link equal to linktext), Tag: WPCleaner
(3 intermediate revisions by 3 users not shown)
Line 32:
=== Functional principal component analysis ===
[[Functional principal component analysis]] (FPCA) can be directly applied to the probability density functions.<ref>{{Cite journal|last1=Kneip|first1=A.|last2=Utikal|first2=K.J.|date=2001|title=Inference for density families using functional principal component analysis|journal=Journal of the American Statistical Association|volume=96|issue=454|pages=519–532|doi=10.1198/016214501753168235|s2cid=123524014 }}</ref> Consider a distribution process <math>\nu \sim \mathfrak{F}</math> and let <math>f</math> be the density function of <math>\nu</math>. Define the mean density function as <math>\mu(t) = \mathbb{E}\left[f(t)\right]</math> and the covariance function as <math>G(s,t) = \operatorname{Cov}(f(s), f(t))</math>, with orthonormal eigenfunctions <math>\{\phi_j\}_{j=1}^\infty</math> and eigenvalues <math>\{\lambda_j\}_{j=1}^\infty</math>. By the Karhunen-Loève theorem, <math>
Line 63:
</math>
Let the reference measure <math>\nu_0</math> be the Wasserstein mean <math>\mu_\oplus</math>.
Then, a ''principal geodesic subspace (PGS)'' of dimension <math>k</math> with respect to <math>\mu_\oplus</math> is a set <math>G_k = \operatorname{argmin}_{G \in \text{CG}_{\mu_\oplus, k}(\mathcal{W}_2)} K_{W_2}(G)</math>.<ref name="gpca1">{{Cite journal|last1=Bigot|first1=J.|last2=Gouet|first2=R.|last3=Klein|first3=T.|last4=López|first4=A.|date=2017|title=Geodesic PCA in the Wasserstein space by convex PCA|journal=Annales de l'Institut Henri Poincaré, Probabilités et Statistiques|volume=53|issue=1|pages=1–26|doi=10.1214/15-AIHP706|bibcode=2017AnIHP..53....1B |s2cid=49256652 |url=https://hal.archives-ouvertes.fr/hal-01978864/file/AIHP706.pdf }}</ref><ref name="gpca2">{{Cite journal|last1=Cazelles|first1=E.|last2=Seguy|first2=V.|last3=Bigot|first3=J.|last4=Cuturi|first4=M.|last5=Papadakis|first5=N.|date=2018|title=Geodesic PCA versus Log-PCA of histograms in the Wasserstein space|journal=SIAM Journal on Scientific Computing|volume=40|issue=2|pages=B429–B456|doi=10.1137/17M1143459 |bibcode=2018SJSC...40B.429C }}</ref> Note that the tangent space <math>T_{\mu_\oplus}</math> is a subspace of <math>L^2_{\mu_\oplus}</math>, the Hilbert space of <math>{\mu_\oplus}</math>-square-integrable functions. Obtaining the PGS is equivalent to performing PCA in <math>L^2_{\mu_\oplus}</math> under the constraint that the components lie in a convex and closed subset.<ref name="gpca2"/> Therefore, a simple approximation of Wasserstein geodesic PCA is log FPCA, obtained by relaxing the geodesicity constraint; alternative techniques have also been suggested.<ref name="gpca1"/><ref name="gpca2"/>
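
The relaxation to log FPCA can be illustrated for distributions on the real line. The following is a minimal sketch, not the method of the cited papers: it assumes each distribution is observed through a one-dimensional sample, and it uses the facts that in one dimension the Wasserstein mean has the averaged quantile function and that the log maps at the mean correspond, in the quantile domain, to differences of quantile functions in <math>L^2[0,1]</math>. The function name <code>log_fpca_1d</code> and its parameters are hypothetical, chosen only for this illustration.

```python
import numpy as np

def log_fpca_1d(samples, n_grid=200, n_components=2):
    """Log FPCA sketch for 1-D distributions, each given as an array of draws.

    Hypothetical helper for illustration; assumes univariate samples.
    """
    # Probability levels on a uniform midpoint grid (avoids 0 and 1).
    u = (np.arange(n_grid) + 0.5) / n_grid
    # Empirical quantile function of each distribution on the common grid.
    Q = np.stack([np.quantile(np.asarray(s, dtype=float), u) for s in samples])
    # In 1-D the Wasserstein mean has the pointwise-averaged quantile function.
    q_mean = Q.mean(axis=0)
    # Log maps at the Wasserstein mean, written in the quantile domain.
    V = Q - q_mean
    # Discretised covariance operator on L^2[0,1] (grid spacing 1/n_grid).
    C = V.T @ V / (V.shape[0] * n_grid)
    evals, evecs = np.linalg.eigh(C)
    order = np.argsort(evals)[::-1][:n_components]
    evals = evals[order]
    # Rescale eigenvectors so each eigenfunction has unit L^2[0,1] norm.
    phi = evecs[:, order].T * np.sqrt(n_grid)
    # Principal component scores: L^2[0,1] inner products with the log maps.
    scores = V @ phi.T / n_grid
    return u, q_mean, evals, phi, scores

# Usage on synthetic Gaussian samples with random location and scale.
rng = np.random.default_rng(0)
locs = rng.normal(0.0, 1.0, size=30)
scales = rng.uniform(0.5, 2.0, size=30)
samples = [rng.normal(m, s, size=400) for m, s in zip(locs, scales)]
u, q_mean, evals, phi, scores = log_fpca_1d(samples)
```

Because the geodesicity constraint is dropped, a curve such as <code>q_mean + t * phi[0]</code> is not guaranteed to be nondecreasing for large <code>t</code>, so it need not be a valid quantile function; this is the price of the relaxation compared with geodesic PCA.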