Multivariate kernel density estimation: Difference between revisions

Revision as of 03:59, 27 December 2024 by LR.127 (Adding short description: "Concept in statistics mathematics"; Tag: Shortdesc helper). Latest revision as of 12:02, 17 June 2025 by Frap (→Density estimation with a diagonal bandwidth matrix).
(One intermediate revision by one other user not shown)
Line 154:

<syntaxhighlight lang="matlab" style="overflow:auto;">
clear all
% generate synthetic data
data=[randn(500, 2); randn(500, 1) + 3.5, randn(500, 1);];
% call the routine, which has been saved in the current directory
[bandwidth, density, X, Y] = kde2d(data);
% plot the data and the density estimate
contour3(X, Y, density, 50), hold on
plot(data(:,1), data(:,2), 'r.', 'MarkerSize', 5)
</syntaxhighlight>

Line 197:

where ''N'' is the number of data points, ''d'' is the number of dimensions (variables), and <math>I_{\vec{A}}(\vec{t})</math> is a filter that is equal to 1 for 'accepted frequencies' and 0 otherwise. There are various ways to define this filter function. A simple one that works for univariate or multivariate samples is called the 'lowest contiguous hypervolume filter': <math>I_{\vec{A}}(\vec{t})</math> is chosen such that the only accepted frequencies are a contiguous subset of frequencies surrounding the origin for which <math>|\hat{\varphi}(\vec{t})|^2 \ge 4(N-1)N^{-2}</math> (see <ref name=":22"/> for a discussion of this and other filter functions).

Direct calculation of the ''empirical characteristic function'' (ECF) is slow, since it essentially involves a direct Fourier transform of the data samples. However, it has been found that the ECF can be approximated accurately using a [[Non-uniform discrete Fourier transform|non-uniform fast Fourier transform]] (nuFFT) method,<ref name=":1" /><ref name=":22"/> which increases the calculation speed by several orders of magnitude (depending on the dimensionality of the problem). The combination of this objective KDE method and the nuFFT-based ECF approximation has been referred to as ''[https://github.com/LBL-EESA/fastkde fastKDE]'' in the literature.<ref name=":22"/>

[[File:FastKDE_example.jpg|alt=A demonstration of fastKDE relative to a sample PDF. (a) True PDF, (b) a good representation with fastKDE, and (c) a slightly blurry representation.|none|thumb|664x664px|A non-trivial mixture of normal distributions: (a) the underlying PDF, (b) a fastKDE estimate on 1,000,000 samples, and (c) a fastKDE estimate on 10,000 samples.]]
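
The filter described above can be illustrated with a minimal Python sketch for the univariate case. This is not taken from the fastKDE code base: the function names <code>ecf</code> and <code>lowest_contiguous_filter</code>, the frequency grid, and the synthetic sample are all assumptions made for illustration. The ECF is evaluated by the slow direct sum rather than by a nuFFT, and the filter keeps only the contiguous run of frequencies around the origin whose squared ECF modulus is at least <math>4(N-1)N^{-2}</math>.

```python
import cmath
import random


def ecf(samples, t):
    """Direct empirical characteristic function at frequency t:
    the sample mean of exp(i * t * x). Cost is O(N) per frequency."""
    return sum(cmath.exp(1j * t * x) for x in samples) / len(samples)


def lowest_contiguous_filter(samples, freqs):
    """Return a list of 0/1 flags, one per entry of freqs.

    Assumes freqs is sorted in ascending order and contains 0.0.
    A frequency is accepted only if it belongs to the unbroken run of
    frequencies around the origin with |ECF(t)|^2 >= 4(N-1)/N^2.
    """
    n = len(samples)
    threshold = 4.0 * (n - 1) / n ** 2
    above = [abs(ecf(samples, t)) ** 2 >= threshold for t in freqs]

    origin = freqs.index(0.0)
    accepted = [0] * len(freqs)

    # Walk outward from the origin towards positive frequencies.
    i = origin
    while i < len(freqs) and above[i]:
        accepted[i] = 1
        i += 1

    # Walk outward from the origin towards negative frequencies.
    i = origin - 1
    while i >= 0 and above[i]:
        accepted[i] = 1
        i -= 1

    return accepted


# Synthetic univariate sample (an assumption for this example).
random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(500)]

# Uniform frequency grid that contains exactly 0.0 at its centre.
freqs = [k * 0.25 for k in range(-40, 41)]
accepted = lowest_contiguous_filter(samples, freqs)
```

Because the ECF equals 1 at the origin and <math>4(N-1)N^{-2} \le 1</math> for every ''N'', the zero frequency is always accepted. Frequencies that exceed the threshold but are separated from the origin by a rejected frequency are excluded, which is what makes the accepted set contiguous.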