Multivariate kernel density estimation: Difference between revisions

m clean up inline <math> formulas and substitute named html entities using AWB
Line 15:
 
==Definition==
The previous figure is a graphical representation of a kernel density estimate, which we now define precisely. Let '''X'''<sub>1</sub>, '''X'''<sub>2</sub>, ..., '''X'''<sub>''n''</sub> be a ''d''-variate random sample drawn from a common density function ''ƒ''. The kernel density estimate is defined to be
 
: <math>\hat{f}_\bold{H}(\bold{x})= n^{-1} \sum_{i=1}^n K_\bold{H} (\bold{x} - \bold{X}_i)</math>
Line 21:
where
<ul>
<li>{{nowrap|'''x''' {{=}} (''x''<sub>1</sub>, ''x''<sub>2</sub>, ..., ''x<sub>d</sub>'')<sup>''T''</sup>}}, {{nowrap|'''X'''<sub>''i''</sub> {{=}} (''X''<sub>''i''1</sub>, ''X''<sub>''i''2</sub>, ..., ''X<sub>id</sub>'')<sup>''T''</sup>, ''i'' {{=}} 1, 2, ..., ''n''}} are ''d''-vectors
<li>''K'' is the kernel function which is a symmetric density function, with {{nowrap|''K''<sub>'''H'''</sub>('''x''') {{=}} {{!}}'''H'''{{!}}<sup>−1/2</sup>''K''('''H'''<sup>−1/2</sup>'''x''')}}
<li><strong>H</strong> is the bandwidth (or smoothing) matrix, which is a symmetric, [[positive definite matrix|positive definite]] ''d'' × ''d'' matrix.
</ul>
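
The definition above can be illustrated with a minimal NumPy sketch. It assumes that ''K'' is the standard ''d''-variate normal density, so that ''K''<sub>'''H'''</sub> is the normal density with mean zero and covariance '''H'''; the definition itself allows any symmetric density. The function name <code>kde_estimate</code> and its argument layout are chosen here for illustration only and do not come from any particular library.

```python
import numpy as np

def kde_estimate(x, data, H):
    """Evaluate f_hat_H at one or more points, assuming a Gaussian kernel K.

    x    : evaluation point(s), shape (d,) or (m, d)
    data : the sample X_1, ..., X_n as rows, shape (n, d)
    H    : symmetric positive definite bandwidth matrix, shape (d, d)
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    data = np.asarray(data, dtype=float)
    H = np.asarray(H, dtype=float)
    n, d = data.shape

    # Cholesky factor L with H = L L^T; it fails if H is not positive definite.
    L = np.linalg.cholesky(H)

    # All differences x - X_i, flattened to shape (m * n, d).
    diffs = (x[:, None, :] - data[None, :, :]).reshape(-1, d)

    # z = L^{-1} (x - X_i), so |z|^2 = (x - X_i)^T H^{-1} (x - X_i).
    z = np.linalg.solve(L, diffs.T)
    quad = np.sum(z ** 2, axis=0).reshape(len(x), n)

    # log of (2 pi)^{-d/2} |H|^{-1/2}, using |H|^{1/2} = prod(diag(L)).
    log_norm = -0.5 * d * np.log(2.0 * np.pi) - np.sum(np.log(np.diag(L)))

    # Average K_H(x - X_i) over the sample: n^{-1} sum_i K_H(x - X_i).
    return np.exp(log_norm - 0.5 * quad).mean(axis=1)
```

For example, <code>kde_estimate([0.0, 0.0], sample, np.eye(2))</code> returns a length-one array holding the estimate at the origin for a bivariate <code>sample</code> array. The Cholesky factor is used instead of an explicit '''H'''<sup>−1/2</sup> because only the quadratic form and the determinant of '''H''' are needed when the kernel is Gaussian.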
Line 60:
 
===Plug-in===
The plug-in (PI) estimate of the AMISE is formed by replacing '''Ψ'''<sub>4</sub> by its estimator <math>\hat{\bold{\Psi}}_4</math>
 
: <math>\operatorname{PI}(\bold{H}) = n^{-1} |\bold{H}|^{-1/2} R(K) + \tfrac{1}{4} m_2(K)^2
Line 66:
 
where <math>\hat{\bold{\Psi}}_4 (\bold{G}) = n^{-2} \sum_{i=1}^n
\sum_{j=1}^n [(\operatorname{vec} \, \operatorname{D}^2) (\operatorname{vec}^T \operatorname{D}^2)] K_\bold{G} (\bold{X}_i - \bold{X}_j)</math>. Thus <math>\hat{\bold{H}}_{\operatorname{PI}} = \operatorname{argmin}_{\bold{H} \in F} \, \operatorname{PI} (\bold{H})</math> is the plug-in selector.<ref>{{Cite journal| author1=Wand, M.P. | author2=Jones, M.C. | title=Multivariate plug-in bandwidth selection | journal=Computational Statistics | year=1994 | volume=9 | pages=97–177}}</ref><ref>{{Cite journal| doi=10.1080/10485250306039 | author1=Duong, T. | author2=Hazelton, M.L. | title=Plug-in bandwidth matrices for bivariate kernel density estimation | journal=Journal of Nonparametric Statistics | year=2003 | volume=15 | pages=17–30}}</ref> These references also contain algorithms for optimal estimation of the pilot bandwidth matrix <strong>G</strong> and establish that <math>\hat{\bold{H}}_{\operatorname{PI}}</math> [[convergence in probability|converges in probability]] to '''H'''<sub>AMISE</sub>.
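
The double sum defining the estimator of '''Ψ'''<sub>4</sub> can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the algorithm of the cited references: ''K'' is taken to be the standard normal density, so the fourth-order partial derivatives of ''K''<sub>'''G'''</sub> have a closed form, and the pilot matrix '''G''' is supplied by the caller rather than estimated. The name <code>psi4_hat</code> is hypothetical.

```python
import numpy as np

def psi4_hat(data, G):
    """Estimate Psi_4 as a (d^2, d^2) matrix, assuming a Gaussian kernel.

    data : the sample X_1, ..., X_n as rows, shape (n, d)
    G    : symmetric positive definite pilot bandwidth matrix, shape (d, d)
    """
    data = np.asarray(data, dtype=float)
    G = np.asarray(G, dtype=float)
    n, d = data.shape
    A = np.linalg.inv(G)

    # All pairwise differences X_i - X_j (including i = j), shape (n^2, d).
    diffs = (data[:, None, :] - data[None, :, :]).reshape(-1, d)
    u = diffs @ A                                   # rows are G^{-1} (X_i - X_j)
    quad = np.einsum('pi,pi->p', diffs, u)
    phi = np.exp(-0.5 * quad) / np.sqrt((2.0 * np.pi) ** d * np.linalg.det(G))

    # For the normal density phi_G, with u = G^{-1} x and A = G^{-1}:
    #   d^4 phi_G / dx_i dx_j dx_k dx_l
    #     = phi_G * (u_i u_j u_k u_l - [six terms A_ab u_c u_d]
    #                + A_ij A_kl + A_ik A_jl + A_il A_jk).
    # The sums over pairs are accumulated as weighted moments of u.
    m0 = phi.sum()
    m2 = np.einsum('p,pi,pj->ij', phi, u, u)
    m4 = np.einsum('p,pi,pj,pk,pl->ijkl', phi, u, u, u, u)

    t2 = (np.einsum('ij,kl->ijkl', A, m2) + np.einsum('ik,jl->ijkl', A, m2)
          + np.einsum('il,jk->ijkl', A, m2) + np.einsum('jk,il->ijkl', A, m2)
          + np.einsum('jl,ik->ijkl', A, m2) + np.einsum('kl,ij->ijkl', A, m2))
    t0 = m0 * (np.einsum('ij,kl->ijkl', A, A) + np.einsum('ik,jl->ijkl', A, A)
               + np.einsum('il,jk->ijkl', A, A))

    # The derivative tensor is fully symmetric, so the vec ordering used in
    # the reshape does not affect the resulting matrix.
    return ((m4 - t2 + t0) / n ** 2).reshape(d * d, d * d)
```

The sketch forms all ''n''<sup>2</sup> pairwise differences at once, which is only practical for small samples.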
 
===Smoothed cross validation===
Line 76:
 
Thus <math>\hat{\bold{H}}_{\operatorname{SCV}} = \operatorname{argmin}_{\bold{H} \in F} \, \operatorname{SCV} (\bold{H})</math> is the SCV selector.<ref>{{Cite journal| doi=10.1007/BF01205233 | author1=Hall, P. | author2=Marron, J. | author3=Park, B. | title=Smoothed cross-validation | journal=Probability Theory and Related Fields | year=1992 | volume=92 | pages=1–20}}</ref><ref>{{Cite journal| doi=10.1111/j.1467-9469.2005.00445.x | author1=Duong, T. | author2=Hazelton, M.L. | title=Cross validation bandwidth matrices for multivariate kernel density estimation | journal=Scandinavian Journal of Statistics | year=2005 | volume=32 | pages=485–506}}</ref>
These references also contain algorithms for optimal estimation of the pilot bandwidth matrix <strong>G</strong> and establish that <math>\hat{\bold{H}}_{\operatorname{SCV}}</math> converges in probability to '''H'''<sub>AMISE</sub>.
 
==Computer implementation==