Functional regression: Difference between revisions

Ms.chen (talk | contribs)
Line 17:
where in implementation the infinite sum is replaced by a finite sum truncated at <math>K</math>
<math display="block">Y = \beta_0 + \sum_{k=1}^K \beta_k x_k +\epsilon</math>
where <math>K\in\mathbb{N}</math> is finite<ref name=wang:16>Wang, Chiou and M&uuml;ller (2016). "Functional data analysis". ''Annual Review of Statistics and Its Application''. '''3''':257&ndash;295. [[Digital object identifier|doi]]:[http://doi.org/dx.doi.org/10.1146/annurev-statistics-041715-033624 10.1146/annurev-statistics-041715-033624]</ref>.<br />
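As a minimal sketch of the truncated model above, the following Python code computes the scores <math>x_k</math> by numerical integration against an orthonormal basis and then fits the coefficients by ordinary least squares. The cosine basis on <math>[0,1]</math>, the trapezoidal quadrature, the simulated data and the function names (<code>cosine_basis</code>, <code>trapezoid_weights</code>, <code>fit_truncated_flr</code>) are illustrative assumptions rather than part of the cited method; in practice the basis is often estimated from the data, for example by functional principal component analysis.

```python
import numpy as np

def cosine_basis(t, K):
    """First K functions of the orthonormal cosine basis on [0, 1] (assumed basis)."""
    phi = np.ones((K, t.size))
    for k in range(1, K):
        phi[k] = np.sqrt(2.0) * np.cos(k * np.pi * t)
    return phi

def trapezoid_weights(t):
    """Weights w such that f @ w approximates the integral of f over the grid t."""
    w = np.zeros_like(t)
    dt = np.diff(t)
    w[:-1] += dt / 2.0
    w[1:] += dt / 2.0
    return w

def fit_truncated_flr(X, y, t, K):
    """Fit Y = beta_0 + sum_{k=1}^K beta_k x_k + eps by least squares.

    X : (n, m) array of curves observed on the common grid t of length m
    y : (n,) array of scalar responses
    """
    Xc = X - X.mean(axis=0)                        # centered covariate X^c
    phi = cosine_basis(t, K)                       # (K, m)
    scores = (Xc * trapezoid_weights(t)) @ phi.T   # x_k = integral of X^c * phi_k
    design = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    beta0, beta_k = coef[0], coef[1:]
    beta_fun = beta_k @ phi                        # beta(t) evaluated on the grid
    return beta0, beta_k, beta_fun

# Simulated example (hypothetical data): curves built from K = 3 basis functions.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
K = 3
true_beta_k = np.array([2.0, -1.0, 0.5])
A = rng.normal(size=(200, K))                      # random basis coefficients
X = A @ cosine_basis(t, K)
y = 1.5 + A @ true_beta_k + rng.normal(scale=0.01, size=200)

beta0_hat, beta_k_hat, beta_fun_hat = fit_truncated_flr(X, y, t, K)
```

Because the covariate is centered, the fitted intercept equals the sample mean of the responses, and the choice of <math>K</math> controls the trade-off between approximation bias and estimation variance.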
Adding multiple functional and scalar covariates extends the FLR to
<math display="block">Y = \langle\mathbf{Z},\alpha\rangle + \sum_{j=1}^p \int_{\mathcal{T}_j} X_j^c(t) \beta_j(t) dt + \epsilon</math>
where <math>\mathbf{Z}=(Z_1,\cdots,Z_q)^T</math> with <math>Z_1=1</math> is a vector of scalar covariates, <math>\alpha=(\alpha_1,\cdots,\alpha_q)^T</math> is a vector of coefficients corresponding to <math>\mathbf{Z}</math>, <math>\langle\cdot,\cdot\rangle</math> denotes the inner product in Euclidean space, <math>X^c_1,\cdots,X^c_p</math> are multiple centered functional covariates given by <math>X_j^c(\cdot) = X_j(\cdot) - \mathbb{E}(X_j(\cdot))</math>, and <math>\mathcal{T}_j</math> is the interval on which <math>X_j(\cdot)</math> is defined. However, due to the parametric component <math>\alpha</math>, estimation of this model differs from that of the FLR. A possible approach to estimating <math>\alpha</math> is through a [[generalized estimating equation]] with the nonparametric part <math> \sum_{j=1}^p \int_{\mathcal{T}_j} X_j^c(t) \beta_j(t) dt</math> replaced by its estimate for a given <math>\alpha</math><ref>Hu, Wang and Carroll (2004). "Profile-kernel versus backfitting in the partially linear models for longitudinal/clustered data". ''Biometrika''. '''91''' (2): 251&ndash;262. [[Digital object identifier|doi]]:[http://doi.org/10.1093/biomet/91.2.251 10.1093/biomet/91.2.251]</ref>. Once <math>\alpha</math> is estimated, one can apply any suitable consistent method to <math>Y-\langle\mathbf{Z}, \hat\alpha\rangle</math> to estimate the functions <math>\beta_j</math><ref name=wang:16/>.<br />
 
=== Functional linear models with functional response ===
For a function <math>Y(\cdot)</math> on <math>\mathcal{T}_Y</math> and a functional covariate <math>X(\cdot)</math> on <math>\mathcal{T}_X</math>, two primary models have been considered<ref name=wang:16/><ref>Ramsay and [[Bernard Silverman|Silverman]] (2005). ''Functional data analysis'', 2nd ed., New York: Springer, [[Special:BookSources/038740080X|ISBN 0-387-40080-X]]</ref>. One functional linear model regressing <math>Y(\cdot)</math> on <math>X(\cdot)</math> is given by
<math display="block">Y(s) = \beta_0(s) + \int_{\mathcal{T}_X} \beta(s,t) X^c(t)dt + \epsilon(s)</math>
where <math>s\in\mathcal{T}_Y</math>, <math>t\in\mathcal{T}_X</math>, <math>X^c(\cdot) = X(\cdot) - \mathbb{E}(X(\cdot))</math> is still the centered functional covariate, <math>\beta_0(\cdot)</math> and <math>\beta(\cdot,\cdot)</math> are coefficient functions, and <math>\epsilon(\cdot)</math> is usually assumed to be a Gaussian process with mean zero. In this case, at any given time <math>s\in\mathcal{T}_Y</math>, the value of <math>Y</math>, i.e. <math>Y(s)</math>, depends on the entire trajectory of <math>X</math>. This model, for any given time <math>s</math>, is an extension of the traditional multivariate linear regression model by simply replacing the inner product in Euclidean space by that in <math>L^2</math> space. Thus, estimation of this model can be given by analogy to multivariate linear regression
Line 39:
== Functional nonlinear models ==
=== Functional polynomial models ===
Functional polynomial models are an extension of the FLMs, analogous to extending multivariate linear models to polynomial ones. For a scalar response <math>Y</math> and a functional covariate <math>X(\cdot)</math> defined on an interval <math>\mathcal{T}</math>, the simplest example of functional polynomial models is functional quadratic regression<ref>Yao and M&uuml;ller (2010). "Functional quadratic regression". ''Biometrika''. '''97''' (1):49&ndash;64. http://www.jstor.org/stable/27798896</ref>
<math display="block">Y = \alpha + \int_\mathcal{T}\beta(t)X^c(t)dt + \int_\mathcal{T} \int_\mathcal{T} \gamma(s,t) X^c(s)X^c(t) dsdt + \epsilon</math>
where <math>X^c(\cdot) = X(\cdot) - \mathbb{E}(X(\cdot))</math> is the centered functional covariate, <math>\alpha</math> is a scalar coefficient, <math>\beta(\cdot)</math> and <math>\gamma(\cdot,\cdot)</math> are coefficient functions defined on <math>\mathcal{T}</math> and <math>\mathcal{T}\times\mathcal{T}</math> respectively, and <math>\epsilon</math> is a random error with mean zero and finite variance. By analogy to FLMs, estimation of functional polynomial models can be obtained by expanding both the centered covariate <math>X^c</math> and the coefficient functions <math>\beta</math> and <math>\gamma</math> in an orthonormal basis. The model can then be written equivalently as a multivariate polynomial regression in the expansion coefficients, so the corresponding estimation is straightforward.
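To illustrate the reduction to multivariate polynomial regression, the sketch below assumes that the first <math>K</math> expansion coefficients (scores) <math>x_1,\cdots,x_K</math> of <math>X^c</math> in an orthonormal basis are already available. Under that assumption the truncated model reads <math>Y = \alpha + \sum_{k} \beta_k x_k + \sum_{j}\sum_{k} \gamma_{jk} x_j x_k + \epsilon</math>, which can be fitted by ordinary least squares. The function name <code>fit_functional_quadratic</code>, the symmetric parametrization of <math>\gamma_{jk}</math> and the simulated data are illustrative choices, not the procedure of the cited paper.

```python
import numpy as np
from itertools import combinations_with_replacement

def fit_functional_quadratic(scores, y):
    """Least-squares fit of the truncated functional quadratic model.

    scores : (n, K) array of basis coefficients of the centered covariate
    y      : (n,) array of scalar responses
    Returns alpha, the vector beta_k and a symmetric (K, K) matrix gamma_jk.
    """
    n, K = scores.shape
    # Only pairs j <= k are used: x_j * x_k and x_k * x_j are the same regressor,
    # so only the symmetric part of gamma is identifiable.
    pairs = list(combinations_with_replacement(range(K), 2))
    quad = np.column_stack([scores[:, j] * scores[:, k] for j, k in pairs])
    design = np.column_stack([np.ones(n), scores, quad])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    alpha, beta_k, c = coef[0], coef[1:K + 1], coef[K + 1:]
    gamma = np.zeros((K, K))
    for (j, k), c_jk in zip(pairs, c):
        if j == k:
            gamma[j, j] = c_jk
        else:
            # Split the cross-term coefficient evenly between (j, k) and (k, j).
            gamma[j, k] = gamma[k, j] = c_jk / 2.0
    return alpha, beta_k, gamma

# Simulated example (hypothetical data) with K = 2 scores.
rng = np.random.default_rng(1)
scores = rng.normal(size=(300, 2))
true_alpha = 0.5
true_beta = np.array([1.0, -2.0])
true_gamma = np.array([[0.3, 0.1], [0.1, -0.4]])
y = (true_alpha + scores @ true_beta
     + np.einsum("ij,jk,ik->i", scores, true_gamma, scores)
     + rng.normal(scale=0.01, size=300))

alpha_hat, beta_hat, gamma_hat = fit_functional_quadratic(scores, y)
```

The coefficient functions on the original domain can then be recovered from the basis functions <math>\phi_k</math> as <math>\beta(t)=\sum_k \beta_k \phi_k(t)</math> and <math>\gamma(s,t)=\sum_j\sum_k \gamma_{jk}\phi_j(s)\phi_k(t)</math>.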
Line 46:
A functional multiple index model is given by
<math display="block">Y = g\left(\int_{\mathcal{T}} X^c(t) \beta_1(t)dt, \cdots, \int_{\mathcal{T}} X^c(t) \beta_p(t)dt \right) + \epsilon.</math>
Taking <math>p=1</math> yields a functional single index model. For <math>p>1</math>, however, this model suffers from the [[curse of dimensionality]]: with relatively small sample sizes, the estimator often has high variability<ref>Chen, Hall and M&uuml;ller (2011). "Single and multiple index functional regression models with nonparametric link". ''The Annals of Statistics''. '''39''' (3):1720&ndash;1747. http://www.jstor.org/stable/23033613</ref>. As a preferable alternative, a <math>p</math>-component functional multiple index model can be formed as
<math display="block">Y = g_1\left(\int_{\mathcal{T}} X^c(t) \beta_1(t)dt\right)+ \cdots+ g_p\left(\int_{\mathcal{T}} X^c(t) \beta_p(t)dt \right) + \epsilon.</math>