Revision as of 11:49, 15 September 2010 by Drleft (102 edits), no edit summary		Revision as of 12:20, 15 September 2010 by Drleft (102 edits), no edit summary
Line 6: [[Kernel density estimation]] is one of the most popular techniques for density estimation. It can be viewed as a generalisation of [[histogram]] density estimation with improved statistical properties. Kernel density estimators were first introduced in the scientific literature for [[univariate]] data in the 1950s and 1960s<ref>{{cite journal|doi=10.1214/aoms/1177728190|last=Rosenblatt|first=M.|title=Remarks on some nonparametric estimates of a density function|url=http://projecteuclid.org/euclid.aoms/1177728190|journal=[[Annals of Mathematical Statistics]]|year=1956|volume=27|pages=832-837}}</ref><ref>{{cite journal|doi=10.1214/aoms/1177704472|last=Parzen|first=E.|title=On estimation of a probability density function and mode|url=http://projecteuclid.org/euclid.aoms/1177704472|journal=[[Annals of Mathematical Statistics]]|year=1962|volume=33|pages=1065-1076}}</ref> and have subsequently been widely adopted. It was soon recognised that analogous estimators for multivariate data would be an important addition to [[multivariate statistics]]. Based on research carried out in the 1990s and 2000s, multivariate kernel density estimation has reached a level of maturity comparable to that of its univariate counterpart.
== Motivation ==
To motivate the definition of multivariate kernel density estimators, we take as an illustrative [[bivariate]] data set drawn from .... Problems with bivariate histograms.

== Definition ==
Let <math>\bold{X}_1, \bold{X}_2, \dots, \bold{X}_n</math> be a <em>d</em>-variate random sample drawn from a common density function <em>f</em>. The kernel density estimate is defined to be

<math>\widehat{f}_\bold{H}(\bold{x})= n^{-1} \sum_{i=1}^n K_\bold{H} (\bold{x} - \bold{X}_i)</math>

where
<ul>
<li><math>\bold{x} = (x_1, x_2, \dots, x_d)^T</math>, <math>\bold{X}_i = (X_{i1}, X_{i2}, \dots, X_{id})^T, i=1, 2, \dots, n</math>;</li>
<li><em>K</em> is the kernel function, which is a symmetric density function, with <math>K_\bold{H}(\bold{x}) = |\bold{H}|^{-1/2} K(\bold{H}^{-1/2} \bold{x})</math>;</li>
<li><strong>H</strong> is the bandwidth (or smoothing) matrix, which is a symmetric, [[positive definite matrix|positive definite]] <em>d</em> × <em>d</em> matrix.</li>
</ul>

== References ==
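The estimator in the Definition section can be illustrated with a short sketch. It evaluates <math>n^{-1} \sum_{i=1}^n K_\bold{H} (\bold{x} - \bold{X}_i)</math> with <math>K_\bold{H}(\bold{x}) = |\bold{H}|^{-1/2} K(\bold{H}^{-1/2} \bold{x})</math> at a single point. Several things here are assumptions made for illustration and are not taken from the article: <em>K</em> is chosen to be the standard <em>d</em>-variate normal density, the function names <code>cholesky</code> and <code>gaussian_kde</code> are hypothetical, and a Cholesky factor of <strong>H</strong> is used in place of the symmetric square root, which gives the same value for a spherically symmetric kernel such as the normal.

```python
import math


def cholesky(H):
    """Return lower-triangular L with L L^T = H (H symmetric positive definite)."""
    d = len(H)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = H[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("H must be positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def gaussian_kde(x, sample, H):
    """Evaluate n^{-1} * sum_i K_H(x - X_i), assuming K is the standard normal density."""
    d = len(x)
    n = len(sample)
    L = cholesky(H)
    # The product of the diagonal of L equals |H|^{1/2}.
    sqrt_det_H = math.prod(L[i][i] for i in range(d))
    norm_const = (2.0 * math.pi) ** (-d / 2.0) / sqrt_det_H
    total = 0.0
    for X_i in sample:
        diff = [x[j] - X_i[j] for j in range(d)]
        # Forward substitution solves L z = diff, so z^T z = diff^T H^{-1} diff.
        z = []
        for r in range(d):
            z.append((diff[r] - sum(L[r][k] * z[k] for k in range(r))) / L[r][r])
        total += math.exp(-0.5 * sum(v * v for v in z))
    return norm_const * total / n


if __name__ == "__main__":
    # Hypothetical bivariate sample and bandwidth matrix, for illustration only.
    sample = [[0.0, 0.0], [1.0, 0.5], [-0.5, 1.5]]
    H = [[1.0, 0.3], [0.3, 0.5]]
    print(gaussian_kde([0.2, 0.4], sample, H))
```

Working with a triangular factor avoids forming <math>\bold{H}^{-1}</math> explicitly; other kernels or a different factorisation of <strong>H</strong> would be equally valid choices under the definition.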

Multivariate kernel density estimation: Difference between revisions