Nonparametric regression: Difference between revisions

m Typo/general fixing, replaced: vice-versa → vice versa, typo(s) fixed: , → , using AWB
Line 13:
 
Here the penalty term is proportional to the [[RKHS]] norm of the vector of regression coefficients.
 
 
===Lasso===
Line 23 ⟶ 22:
[[File:NonparRegrGaussianKernel.png|thumb| Example of a curve (red line) fit to a small data set (black points) with nonparametric regression using a Gaussian kernel smoother. The pink shaded area illustrates the kernel function applied to obtain an estimate of y for a given value of x. The kernel function defines the weight given to each data point in producing the estimate for a target point.]]
Kernel regression estimates the continuous dependent variable from a limited set of data points by [[Convolution|convolving]] the data points' locations with a [[kernel function]]—approximately speaking, the kernel function specifies how to "blur" the influence of the data points so that their values can be used to predict the value for nearby locations.
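
The weighting described above can be illustrated with a minimal sketch of a Gaussian kernel smoother in the Nadaraya–Watson form, in which the estimate at a target point is a kernel-weighted average of the observed values. The function names, the sample data, and the bandwidth value are assumptions made for this illustration and are not taken from the article.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian density, used here as the weighting function."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_regression(x_train, y_train, x0, bandwidth):
    """Estimate y at x0 as a kernel-weighted average of y_train.

    `bandwidth` controls how far the influence of each data point
    is "blurred" along the x axis.
    """
    weights = [gaussian_kernel((x0 - xi) / bandwidth) for xi in x_train]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase the bandwidth")
    return sum(w * yi for w, yi in zip(weights, y_train)) / total


# Hypothetical data set: five (x, y) observations.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.1, 2.9, 4.2]

# Estimate at x = 2.0; nearby points dominate the weighted average.
estimate = kernel_regression(xs, ys, 2.0, bandwidth=0.5)
print(estimate)
```

A smaller bandwidth makes the estimate follow the nearest data points more closely, while a larger one averages over more of the data set.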
 
 
==Nonparametric multiplicative regression==
Line 40 ⟶ 38:
NPMR has been useful for modeling the response of an organism to its environment. Organismal response to environment tends to be nonlinear and to involve complex interactions among predictors. NPMR automatically models these interactions in much the same way that organisms integrate the numerous factors affecting their performance.<ref>{{Cite journal|last=McCune|first=B.|year=2006|title=Non-parametric habitat models with automatic interactions|journal=Journal of Vegetation Science|volume=17|pages=819–830|doi=10.1658/1100-9233(2006)17[819:NHMWAI]2.0.CO;2|issue=6}}</ref>
 
A key biological feature of an NPMR model is that failure of an organism to tolerate any single dimension of the predictor space results in overall failure of the organism. For example, assume that a plant needs a certain range of moisture in a particular temperature range. If either temperature or moisture falls outside the tolerance of the organism, then the organism dies. If it is too hot, then no amount of moisture can compensate to result in survival of the plant. Mathematically this works with NPMR because the product of the weights for the target point is zero or near zero if any of the weights for individual predictors (moisture or temperature) is zero or near zero. Note further that in this simple example, the second condition listed above is probably true: the response of the plant to moisture probably depends on temperature and vice versa.
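
The multiplicative behaviour of the weights can be sketched as follows, assuming a local-mean estimator with one Gaussian weight per predictor. The function names, the tolerance values, and the temperature and moisture data are hypothetical and serve only to show that one poorly matched predictor drives the product toward zero.

```python
import math


def gaussian_weight(distance, tolerance):
    """Unnormalized Gaussian weight; `tolerance` plays the role of a standard deviation."""
    return math.exp(-0.5 * (distance / tolerance) ** 2)


def product_weight(point, target, tolerances):
    """Multiply the per-predictor weights of one data point for a target point."""
    weight = 1.0
    for p, t, s in zip(point, target, tolerances):
        weight *= gaussian_weight(p - t, s)
    return weight


def local_mean(X, y, target, tolerances):
    """Weighted average of y using the product weights (a local-mean estimate)."""
    weights = [product_weight(row, target, tolerances) for row in X]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no data point carries weight at this target")
    return sum(w * yi for w, yi in zip(weights, y)) / total


# Hypothetical predictors: (temperature in degrees C, soil moisture fraction).
target = (20.0, 0.5)
tolerances = (3.0, 0.1)

# Same moisture as the target, but far too hot: the product is near zero.
w_match = product_weight((20.0, 0.5), target, tolerances)
w_hot = product_weight((45.0, 0.5), target, tolerances)
print(w_match, w_hot)

# Hypothetical survival data (1 = plant survived, 0 = plant died).
X = [(20.0, 0.5), (21.0, 0.55), (45.0, 0.5)]
y = [1.0, 1.0, 0.0]
survival = local_mean(X, y, target, tolerances)
print(survival)
```

Because the weights are multiplied rather than added, the hot site contributes almost nothing to the estimate even though its moisture matches the target exactly.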
 
Optimizing the selection of predictors and their smoothing parameters in a multiplicative model is computationally intensive. With a large pool of predictors, the computer must search through a huge number of potential models to find the best one. The best model has the best fit, subject to [[overfitting]] constraints or penalties (see below).<ref>{{cite journal |last=Grundel |first=R. |first2=N. B. |last2=Pavlovic |year=2007 |title=Response of bird species densities to habitat structure and fire history along a Midwestern open–forest Gradient |journal=[[The Condor (journal)|The Condor]] |volume=109 |issue=4 |pages=734–749 |doi=10.1650/0010-5422(2007)109[734:ROBSDT]2.0.CO;2 }}</ref><ref>{{cite journal |last=DeBano |first=S. J. |first2=P. B. |last2=Hamm |first3=A. |last3=Jensen |first4=S. I. |last4=Rondon |first5=P. J. |last5=Landolt |year=2010 |title=Spatial and temporal dynamics of potato tuberworm (''Lepidoptera: Gelechiidae'') in the Columbia Basin of the Pacific Northwest |journal=Environmental Entomology |volume=39 |issue=1 |pages=1–14 |doi=10.1603/EN08270 }}</ref>
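
The size of this search can be seen in a small sketch that tries every subset of predictors together with every combination of candidate smoothing parameters. Scoring each model by leave-one-out squared error is an assumption made here as one possible guard against overfitting; the article does not specify the criterion, and the function names, candidate tolerances, and data are hypothetical.

```python
import itertools
import math


def loo_predictions(X, y, columns, tolerances):
    """Leave-one-out local-mean predictions using a product of Gaussian weights."""
    preds = []
    for i, target in enumerate(X):
        num = den = 0.0
        for j, row in enumerate(X):
            if j == i:
                continue  # exclude the target point from its own estimate
            w = 1.0
            for c, s in zip(columns, tolerances):
                w *= math.exp(-0.5 * ((row[c] - target[c]) / s) ** 2)
            num += w * y[j]
            den += w
        if den > 0.0:
            preds.append(num / den)
        else:
            # All weights underflowed: fall back to the mean of the other points.
            preds.append((sum(y) - y[i]) / (len(y) - 1))
    return preds


def search_models(X, y, candidate_tolerances, max_predictors):
    """Exhaustively score every predictor subset and tolerance combination."""
    best = None
    n_tried = 0
    for k in range(1, max_predictors + 1):
        for columns in itertools.combinations(range(len(X[0])), k):
            for tolerances in itertools.product(candidate_tolerances, repeat=k):
                preds = loo_predictions(X, y, columns, tolerances)
                sse = sum((p - yi) ** 2 for p, yi in zip(preds, y))
                n_tried += 1
                if best is None or sse < best[0]:
                    best = (sse, columns, tolerances)
    return best, n_tried


# Hypothetical data: three candidate predictors, six observations.
X = [(0.0, 1.0, 3.0), (1.0, 0.0, 1.0), (2.0, 1.0, 4.0),
     (3.0, 0.0, 1.0), (4.0, 1.0, 5.0), (5.0, 0.0, 9.0)]
y = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

best, n_tried = search_models(X, y, [0.5, 2.0], max_predictors=2)
print(n_tried, best)
```

Even this toy problem, with three predictors, two candidate tolerances, and at most two predictors per model, evaluates 18 models; the count grows combinatorially with the size of the predictor pool and the tolerance grid.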
Line 96 ⟶ 94:
==Further reading==
* {{cite book |last=Bowman |first=A. W. |first2=A. |last2=Azzalini |year=1997 |title=Applied Smoothing Techniques for Data Analysis |publisher=Clarendon Press |location=Oxford |isbn=0198523963 }}
* Cleveland, W.S. (1979) "Robust locally weighted regression and smoothing scatterplots" J. Amer. Statist. Assoc., 74, pp.&nbsp;829–836
* McCune, B. and M. J. Mefford (2009). HyperNiche. Nonparametric Multiplicative Habitat Modeling. Version 2. MjM Software, Gleneden Beach, Oregon, U.S.A.
* Fan, J. (1993) "Local linear regression smoothers and their minimax efficiency" Ann. Statist., 21, pp.&nbsp;196–216
* Fan, J. and I. Gijbels (1992) "Variable bandwidth and local linear regression smoothers" Ann. Statist., 20, pp.&nbsp;2008–2036
* Fan, J. and I. Gijbels (1996) Local Polynomial Modelling and its Applications, Chapman and Hall
* Li, Q. and J. Racine (2007) Nonparametric Econometrics: Theory and Practice, Princeton University Press
* [http://www.sciencedirect.com/science/article/pii/S0167947314001741 Li, D., Simar, L. and V. Zelenyuk (2014) "Generalized nonparametric smoothing with mixed discrete and continuous data" Computational Statistics and Data Analysis, 1-21. doi:10.1016/j.csda.2014.06.003]
* Pagan, A. and A. Ullah (1999) Nonparametric Econometrics, Cambridge University Press.
* Racine, J. and Q. Li (2004) "Nonparametric estimation of regression functions with both categorical and continuous data" Journal of Econometrics 119, pp.&nbsp;99–130
* [https://ideas.repec.org/a/eee/econom/v146y2008i1p185-198.html Park, Byeong U. & Simar, Léopold & Zelenyuk, Valentin, 2008. "Local likelihood estimation of truncated regression and its partial derivatives: Theory and application," Journal of Econometrics, Elsevier, vol. 146(1), pages 185-198, September.]