Functional regression: Difference between revisions

Ms.chen (talk | contribs)
== Functional linear models (FLMs) ==
Functional linear models (FLMs) are an extension of [[Linear regression|linear models]] (LMs). A linear model with scalar response <math>Y\in\mathbb{R}</math> and scalar covariates <math>X\in\mathbb{R}^p</math> can be written as
{{NumBlk|::|<math display="block">Y = \beta_0 + \langle X,\beta\rangle + \varepsilon,</math>|{{EquationRef|1}}}}
where <math>\langle\cdot,\cdot\rangle</math> denotes the [[Inner product space|inner product]] in [[Euclidean space|Euclidean space]], <math>\beta_0\in\mathbb{R}</math> and <math>\beta\in\mathbb{R}^p</math> denote the regression coefficients, and <math>\varepsilon</math> is a random error with [[Expected value|mean]] zero and finite [[Variance|variance]]. FLMs can be divided into two types based on the responses.
 
=== Functional linear models with scalar responses ===
Functional linear models with scalar responses can be obtained by replacing the scalar covariates <math>X</math> and the coefficient vector <math>\beta</math> in model ({{EquationNote|1}}) by a centered functional covariate <math>X^c(\cdot) = X(\cdot) - \mathbb{E}(X(\cdot))</math> and a coefficient function <math>\beta = \beta(\cdot)</math> with [[Domain of a function|domain]] <math>\mathcal{T}</math>, respectively, and replacing the inner product in Euclidean space by that in [[Hilbert space]] [[Lp space|<math>L^2</math>]],
{{NumBlk|::|<math display="block">Y = \beta_0 + \langle X^c, \beta\rangle +\varepsilon = \beta_0 + \int_\mathcal{T} X^c(t)\beta(t)\,dt + \varepsilon,</math>|{{EquationRef|2}}}}
where <math>\langle \cdot, \cdot \rangle</math> here denotes the inner product in <math>L^2</math>. One approach to estimating <math>\beta_0</math> and <math>\beta(\cdot)</math> is to expand the centered covariate <math>X^c(\cdot)</math> and the coefficient function <math>\beta(\cdot)</math> in the same [[Basis function|functional basis]], for example, [[B-spline|B-spline]] basis or the eigenbasis used in the [[Karhunen&ndash;Lo&egrave;ve theorem|Karhunen&ndash;Lo&egrave;ve expansion]]. Suppose <math>\{\phi_k\}_{k=1}^\infty</math> is an [[Orthonormal basis|orthonormal basis]] of <math>L^2</math>. Expanding <math>X^c</math> and <math>\beta</math> in this basis, <math>X^c(\cdot) = \sum_{k=1}^\infty x_k \phi_k(\cdot)</math>, <math>\beta(\cdot) = \sum_{k=1}^\infty \beta_k \phi_k(\cdot)</math>, model ({{EquationNote|2}}) becomes
<math display="block">Y = \beta_0 + \sum_{k=1}^\infty \beta_k x_k +\varepsilon.</math>
For implementation, regularization is needed and can be done through truncation, <math>L^2</math> penalization or <math>L^1</math> penalization.<ref name=morr:15>Morris (2015). "Functional regression". ''Annual Review of Statistics and Its Application''. '''2''':321&ndash;359. [[Digital object identifier|doi]]:[http://doi.org/10.1146/annurev-statistics-010814-020413 10.1146/annurev-statistics-010814-020413].</ref> In addition, a [[Reproducing kernel Hilbert space|reproducing kernel Hilbert space]] (RKHS) approach can also be used to estimate <math>\beta_0</math> and <math>\beta(\cdot)</math> in model ({{EquationNote|2}}).<ref>Yuan and Cai (2010). "A reproducing kernel Hilbert space approach to functional linear regression". ''The Annals of Statistics''. '''38''' (6):3412&ndash;3444. [[Digital object identifier|doi]]:[http://doi.org/10.1214/09-AOS772 10.1214/09-AOS772].</ref>
<br />
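The truncation approach to model ({{EquationNote|2}}) can be illustrated with a minimal sketch. It assumes curves observed without error on a common uniform grid over <math>\mathcal{T}=[0,1]</math>, a Fourier basis as the orthonormal basis <math>\{\phi_k\}</math>, and a truncation level <math>K</math> chosen by the user. The function names <code>fourier_basis</code> and <code>fit_flm_truncated</code> and all simulation settings are hypothetical choices made for this example, not part of any cited method.

```python
import numpy as np

def fourier_basis(t, K):
    """Evaluate the first K orthonormal Fourier functions of L^2[0, 1] at t."""
    cols = [np.ones_like(t)]
    freq = 1
    while len(cols) < K:
        cols.append(np.sqrt(2) * np.sin(2 * np.pi * freq * t))
        if len(cols) < K:
            cols.append(np.sqrt(2) * np.cos(2 * np.pi * freq * t))
        freq += 1
    return np.column_stack(cols)              # shape (len(t), K)

def fit_flm_truncated(X, y, t, K):
    """Least-squares fit of the scalar-response model truncated at K terms."""
    dt = t[1] - t[0]                          # uniform grid spacing (assumed)
    Phi = fourier_basis(t, K)
    Xc = X - X.mean(axis=0)                   # centre with the sample mean curve
    scores = (Xc @ Phi) * dt                  # x_k ~ integral of X^c(t) phi_k(t) dt
    design = np.column_stack([np.ones(len(y)), scores])
    coef = np.linalg.lstsq(design, y, rcond=None)[0]
    return coef[0], coef[1:], Phi @ coef[1:]  # beta_0, (beta_k), beta(t) on the grid

# Simulated data; every number below is an arbitrary illustrative choice.
rng = np.random.default_rng(0)
n, m = 400, 200
t = (np.arange(m) + 0.5) / m                  # midpoint grid on [0, 1]
b_true = np.array([1.0, -0.5, 0.25, 0.0, 0.0])
xi = rng.normal(size=(n, 5)) / np.arange(1, 6)   # scores with decaying spread
X = np.sin(2 * np.pi * t) + xi @ fourier_basis(t, 5).T
y = 2.0 + xi @ b_true + rng.normal(scale=0.1, size=n)

beta0_hat, b_hat, beta_hat = fit_flm_truncated(X, y, t, K=5)
print(np.round(b_hat, 2))
```

In this sketch the truncation level plays the role of the regularization parameter: a larger <math>K</math> reduces approximation bias but adds coefficients to estimate.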
 
Adding multiple functional and scalar covariates, model ({{EquationNote|2}}) can be extended to
{{NumBlk|::|<math display="block">Y = \sum_{k=1}^q Z_k\alpha_k + \sum_{j=1}^p \int_{\mathcal{T}_j} X_j^c(t) \beta_j(t) \,dt + \varepsilon,</math>|{{EquationRef|3}}}}
where <math>Z_1,\ldots,Z_q</math> are scalar covariates with <math>Z_1=1</math>, <math>\alpha_1,\ldots,\alpha_q</math> are regression coefficients for <math>Z_1,\ldots,Z_q</math>, respectively, <math>X^c_j</math> is a centered functional covariate given by <math>X_j^c(\cdot) = X_j(\cdot) - \mathbb{E}(X_j(\cdot))</math>, <math>\beta_j</math> is the regression coefficient function for <math>X_j^c(\cdot)</math>, and <math>\mathcal{T}_j</math> is the domain of <math>X_j</math> and <math>\beta_j</math>, for <math>j=1,\ldots,p</math>. However, due to the parametric component <math>\alpha</math>, the estimation methods for model ({{EquationNote|2}}) cannot be used in this case<ref name=wang:16>Wang, Chiou and M&uuml;ller (2016). "Functional data analysis". ''Annual Review of Statistics and Its Application''. '''3''':257&ndash;295. [[Digital object identifier|doi]]:[http://doi.org/10.1146/annurev-statistics-041715-033624 10.1146/annurev-statistics-041715-033624].</ref> and alternative estimation methods for model ({{EquationNote|3}}) are available<ref>Kong, Xue, Yao and Zhang (2016). "Partially functional linear regression in high dimensions". ''Biometrika''. '''103''' (1):147&ndash;159. [[Digital object identifier|doi]]:[http://doi.org/10.1093/biomet/asv062 10.1093/biomet/asv062].</ref><ref>Hu, Wang and Carroll (2004). "Profile-kernel versus backfitting in the partially linear models for longitudinal/clustered data". ''Biometrika''. '''91''' (2): 251&ndash;262. [[Digital object identifier|doi]]:[http://doi.org/10.1093/biomet/91.2.251 10.1093/biomet/91.2.251].</ref>.<br />
 
=== Functional linear models with functional responses ===
For a functional response <math>Y(\cdot)</math> with domain <math>\mathcal{T}</math> and a functional covariate <math>X(\cdot)</math> with domain <math>\mathcal{S}</math>, two FLMs regressing <math>Y(\cdot)</math> on <math>X(\cdot)</math> have been considered<ref name=wang:16/><ref>Ramsay and [[Bernard Silverman|Silverman]] (2005). ''Functional data analysis'', 2nd ed., New York&#160;: Springer, [[Special:BookSources/038740080X|ISBN 0-387-40080-X]].</ref>. One of these two models is of the form
{{NumBlk|::|<math display="block">Y(t) = \beta_0(t) + \int_{\mathcal{S}} \beta(s,t) X^c(s)\,ds + \varepsilon(t),\ \text{for}\ t\in\mathcal{T},</math>|{{EquationRef|4}}}}
where <math>X^c(\cdot) = X(\cdot) - \mathbb{E}(X(\cdot))</math> is still the centered functional covariate, <math>\beta_0(\cdot)</math> and <math>\beta(\cdot,\cdot)</math> are coefficient functions, and <math>\varepsilon(\cdot)</math> is usually assumed to be a random process with mean zero and finite variance. In this case, at any given time <math>t\in\mathcal{T}</math>, the value of <math>Y</math>, i.e., <math>Y(t)</math>, depends on the entire trajectory of <math>X</math>. Model ({{EquationNote|4}}), for any given time <math>t</math>, is an extension of [[Multivariate linear regression|multivariate linear regression]] with the inner product in Euclidean space replaced by that in <math>L^2</math>. An estimating equation motivated by multivariate linear regression is
<math display="block">r_{XY} = R_{XX}\beta, \text{ for } \beta\in L^2(\mathcal{S}\times\mathcal{T}),</math>
where <math>r_{XY}(s,t) = \text{cov}(X(s),Y(t))</math>, <math>R_{XX}: L^2(\mathcal{S}\times\mathcal{T}) \rightarrow L^2(\mathcal{S}\times\mathcal{T})</math> is defined as <math>(R_{XX}\beta)(s,t) = \int_\mathcal{S} r_{XX}(s,w)\beta(w,t)\,dw</math> with <math>r_{XX}(s,w) = \text{cov}(X(s),X(w))</math> for <math>s,w\in\mathcal{S}</math><ref name=wang:16/>. Regularization is needed and can be done through truncation, <math>L^2</math> penalization or <math>L^1</math> penalization<ref name=morr:15/>. Various estimation methods for model ({{EquationNote|4}}) are available<ref>Ramsay and Dalzell (1991). "Some tools for functional data analysis". ''Journal of the Royal Statistical Society. Series B (Methodological)''. '''53''' (3):539&ndash;572. http://www.jstor.org/stable/2345586.</ref><ref>Yao, M&uuml;ller and Wang (2005). "Functional linear regression analysis for longitudinal data". ''The Annals of Statistics''. '''33''' (6):2873&ndash;2903. [[Digital object identifier|doi]]:[http://doi.org/10.1214/009053605000000660 10.1214/009053605000000660].</ref>.<br />
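As a rough illustration of the estimating equation for model ({{EquationNote|4}}), the sketch below discretizes <math>r_{XY}</math> and <math>R_{XX}</math> on uniform grids and regularizes by truncation, inverting <math>R_{XX}</math> only on its <math>K</math> leading eigen-components. It assumes fully observed, noise-free covariate curves; the function name <code>fit_fof_truncated</code>, the choice of <math>K</math> and the simulated data are hypothetical and do not reproduce any of the cited estimators.

```python
import numpy as np

def fit_fof_truncated(X, Y, s, K):
    """Estimate beta(s, t) from curves on uniform grids by a truncated inverse.

    X: (n, len(s)) covariate curves, Y: (n, len(t)) response curves.
    K: number of eigen-components of the covariance of X that are kept.
    """
    n = X.shape[0]
    ds = s[1] - s[0]                          # uniform grid spacing (assumed)
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    r_xx = Xc.T @ Xc / (n - 1)                # cov(X(s), X(w)) on the grid
    r_xy = Xc.T @ Yc / (n - 1)                # cov(X(s), Y(t)) on the grid
    # Discretised operator: (R_XX b)(s, t) ~ sum_w r_xx[s, w] * b[w, t] * ds
    evals, evecs = np.linalg.eigh(r_xx * ds)
    top = np.argsort(evals)[::-1][:K]         # indices of the K largest eigenvalues
    V, lam = evecs[:, top], evals[top]
    beta_hat = V @ ((V.T @ r_xy) / lam[:, None])
    beta0_hat = Y.mean(axis=0)                # E Y(t) = beta_0(t) since X^c has mean zero
    return beta0_hat, beta_hat

# Simulated data; all settings are arbitrary illustrative choices.
rng = np.random.default_rng(1)
n, m = 500, 50
s = (np.arange(m) + 0.5) / m                  # grid on S = [0, 1]
t = (np.arange(m) + 0.5) / m                  # grid on T = [0, 1]
phi = np.column_stack([np.ones(m),
                       np.sqrt(2) * np.sin(2 * np.pi * s),
                       np.sqrt(2) * np.cos(2 * np.pi * s)])
psi = np.column_stack([np.ones(m), np.sqrt(2) * np.sin(2 * np.pi * t)])
B = np.array([[1.0, 0.0], [0.0, 0.5], [-0.5, 0.0]])
beta_true = phi @ B @ psi.T                   # beta(s, t) on the grid
xi = rng.normal(size=(n, 3)) * np.array([1.0, 0.7, 0.5])
X = xi @ phi.T                                # mean-zero covariate curves
Y = np.cos(np.pi * t) + (X @ beta_true) * (s[1] - s[0])
Y = Y + rng.normal(scale=0.1, size=(n, m))

beta0_hat, beta_hat = fit_fof_truncated(X, Y, s, K=3)
print(np.max(np.abs(beta_hat - beta_true)))
```

Dropping the small eigenvalues is what keeps the inversion stable; dividing by all of them would amplify sampling noise in <math>r_{XY}</math>.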
When <math>X</math> and <math>Y</math> are concurrently observed, i.e., <math>\mathcal{S}=\mathcal{T}</math><ref>Grenander (1950). "Stochastic processes and statistical inference". ''Arkiv Matematik''. '''1''' (3):195&ndash;277. [[Digital object identifier|doi]]:[http://doi.org/10.1007/BF02590638 10.1007/BF02590638].</ref>, it is reasonable to consider a historical functional linear model, where the current value of <math>Y</math> only depends on the history of <math>X</math>, i.e., <math>\beta(s,t)=0</math> for <math>s>t</math> in model ({{EquationNote|4}})<ref name=wang:16/><ref>Malfait and Ramsay (2003). "The historical functional linear model". ''Canadian Journal of Statistics''. '''31''' (2):115&ndash;128. [[Digital object identifier|doi]]:[http://doi.org/10.2307/3316063 10.2307/3316063].</ref>. A simpler version of the historical functional linear model is the functional concurrent model (see below).<br />
Adding multiple functional covariates, model ({{EquationNote|4}}) can be extended to
{{NumBlk|::|<math display="block">Y(t) = \beta_0(t) + \sum_{j=1}^p\int_{\mathcal{S}_j} \beta_j(s,t) X^c_j(s)\,ds + \varepsilon(t),\ \text{for}\ t\in\mathcal{T},</math>|{{EquationRef|5}}}}
where for <math>j=1,\ldots,p</math>, <math>X_j^c(\cdot)=X_j(\cdot) - \mathbb{E}(X_j(\cdot))</math> is a centered functional covariate with domain <math>\mathcal{S}_j</math>, and <math>\beta_j(\cdot,\cdot)</math> is the corresponding coefficient function with domain <math>\mathcal{S}_j\times\mathcal{T}</math><ref name=wang:16/>. In particular, taking <math>X_j(\cdot)</math> as a constant function yields a special case of model ({{EquationNote|5}})
<math display="block">Y(t) = \sum_{j=1}^p X_j \beta_j(t) + \varepsilon(t),\ \text{for}\ t\in\mathcal{T},</math>
which is an FLM with functional responses and scalar covariates.<br />
 
==== Functional concurrent models ====
Assuming that <math>\mathcal{S} = \mathcal{T}</math>, another model, known as the functional concurrent model, sometimes also referred to as the varying-coefficient model, is of the form
{{NumBlk|::|<math display="block">Y(t) = \alpha_0(t) + \alpha(t)X(t)+\varepsilon(t),\ \text{for}\ t\in\mathcal{T},</math>|{{EquationRef|6}}}}
where <math>\alpha_0</math> and <math>\alpha</math> are coefficient functions. Note that model ({{EquationNote|6}}) assumes the value of <math>Y</math> at time <math>t</math>, i.e., <math>Y(t)</math>, only depends on that of <math>X</math> at the same time, i.e., <math>X(t)</math>. Various estimation methods can be applied to model ({{EquationNote|6}})<ref>Fan and Zhang (1999). "Statistical estimation in varying coefficient models". ''The Annals of Statistics''. '''27''' (5):1491&ndash;1518. [[Digital object identifier|doi]]:[http://doi.org/10.1214/aos/1017939139 10.1214/aos/1017939139].</ref><ref>Huang, Wu and Zhou (2004). "Polynomial spline estimation and inference for varying coefficient models with longitudinal data". ''Statistica Sinica''. '''14''' (3):763&ndash;788. http://www.jstor.org/stable/24307415.</ref><ref>&#350;ent&uuml;rk and M&uuml;ller (2010). "Functional varying coefficient models for longitudinal data". ''Journal of the American Statistical Association''. '''105''' (491):1256&ndash;1264. [[Digital object identifier|doi]]:[http://doi.org/10.1198/jasa.2010.tm09228 10.1198/jasa.2010.tm09228].</ref>.<br />
Adding multiple functional covariates, model ({{EquationNote|6}}) can also be extended to
<math display="block">Y(t) = \alpha_0(t) + \sum_{j=1}^p\alpha_j(t)X_j(t)+\varepsilon(t),\ \text{for}\ t\in\mathcal{T},</math>
where <math>X_1,\ldots,X_p</math> are multiple functional covariates with domain <math>\mathcal{T}</math> and <math>\alpha_0,\alpha_1,\ldots,\alpha_p</math> are the coefficient functions with the same domain.<ref name=wang:16/>
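Because the concurrent model relates <math>Y(t)</math> only to the covariate values at the same <math>t</math>, a simple (if crude) way to see how it works is to fit an ordinary least-squares regression separately at each grid point. The sketch below does this under the assumption that all curves are observed on a common grid; it is not one of the cited estimators, which additionally smooth across <math>t</math> with kernels or splines. The name <code>fit_concurrent_pointwise</code> and the simulated coefficient functions are hypothetical.

```python
import numpy as np

def fit_concurrent_pointwise(X, Y):
    """Separate least-squares fit at each grid point.

    X: (n, p, m) array of p covariate curves, Y: (n, m) response curves,
    all observed on the same grid of m time points.
    Returns a (p + 1, m) array: row 0 is alpha_0(t), row j is alpha_j(t).
    """
    n, p, m = X.shape
    coef = np.empty((p + 1, m))
    for j in range(m):
        design = np.column_stack([np.ones(n), X[:, :, j]])
        coef[:, j] = np.linalg.lstsq(design, Y[:, j], rcond=None)[0]
    return coef

# Simulated data; all settings are arbitrary illustrative choices.
rng = np.random.default_rng(2)
n, m = 300, 100
t = np.linspace(0.0, 1.0, m)
a = rng.normal(size=(n, 4))
X1 = 1.0 + a[:, [0]] + a[:, [1]] * t                  # first covariate curves
X2 = a[:, [2]] + a[:, [3]] * np.sin(2 * np.pi * t)    # second covariate curves
X = np.stack([X1, X2], axis=1)                        # shape (n, 2, m)
alpha_true = np.vstack([np.sin(2 * np.pi * t), 1.0 + t ** 2, -t])
Y = alpha_true[0] + alpha_true[1] * X1 + alpha_true[2] * X2
Y = Y + rng.normal(scale=0.2, size=(n, m))

alpha_hat = fit_concurrent_pointwise(X, Y)
print(np.max(np.abs(alpha_hat - alpha_true)))
```

The pointwise estimates are not smooth in <math>t</math>; in practice a smoothing step or a basis expansion of the coefficient functions would usually be added.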
 
== Functional nonlinear models ==
=== Functional polynomial models ===
Functional polynomial models are an extension of the FLMs with scalar responses, analogous to extending linear regression to [[Polynomial regression|polynomial regression]]. For a scalar response <math>Y</math> and a functional covariate <math>X(\cdot)</math> with domain <math>\mathcal{T}</math>, the simplest example of functional polynomial models is functional quadratic regression<ref name=yao:10>Yao and M&uuml;ller (2010). "Functional quadratic regression". ''Biometrika''. '''97''' (1):49&ndash;64. [[Digital object identifier|doi]]:[http://doi.org/10.1093/biomet/asp069 10.1093/biomet/asp069].</ref>
<math display="block">Y = \alpha + \int_\mathcal{T}\beta(t)X^c(t)\,dt + \int_\mathcal{T} \int_\mathcal{T} \gamma(s,t) X^c(s)X^c(t) \,ds\,dt + \varepsilon,</math>
where <math>X^c(\cdot) = X(\cdot) - \mathbb{E}(X(\cdot))</math> is the centered functional covariate, <math>\alpha</math> is a scalar coefficient, <math>\beta(\cdot)</math> and <math>\gamma(\cdot,\cdot)</math> are coefficient functions with domains <math>\mathcal{T}</math> and <math>\mathcal{T}\times\mathcal{T}</math>, respectively, and <math>\varepsilon</math> is a random error with mean zero and finite variance. By analogy to FLMs with scalar responses, estimation of functional polynomial models can be obtained through expanding both the centered covariate <math>X^c</math> and the coefficient functions <math>\beta</math> and <math>\gamma</math> in an orthonormal basis.<ref name=yao:10/>
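The basis-expansion idea for functional quadratic regression can be sketched as follows. After expanding in an orthonormal basis truncated at <math>K</math> terms, the model becomes an ordinary regression of <math>Y</math> on the scores <math>x_k</math> and their pairwise products <math>x_jx_k</math>. The sketch assumes a symmetric <math>\gamma</math>, a known Fourier basis and fully observed curves; the function name <code>fit_functional_quadratic</code> and all simulated values are hypothetical.

```python
import numpy as np
from itertools import combinations_with_replacement

def fit_functional_quadratic(scores, y):
    """Least squares on 1, x_k and the products x_j * x_k (j <= k)."""
    n, K = scores.shape
    pairs = list(combinations_with_replacement(range(K), 2))
    quad = np.column_stack([scores[:, j] * scores[:, k] for j, k in pairs])
    design = np.column_stack([np.ones(n), scores, quad])
    coef = np.linalg.lstsq(design, y, rcond=None)[0]
    alpha_hat, b_hat = coef[0], coef[1:K + 1]
    G_hat = np.zeros((K, K))                  # basis coefficients of gamma
    for c, (j, k) in zip(coef[K + 1:], pairs):
        if j == k:
            G_hat[j, j] = c
        else:                                 # x_j * x_k carries gamma_jk + gamma_kj
            G_hat[j, k] = G_hat[k, j] = c / 2.0
    return alpha_hat, b_hat, G_hat

# Simulated data; all settings are arbitrary illustrative choices.
rng = np.random.default_rng(3)
n, m, K = 1000, 200, 3
t = (np.arange(m) + 0.5) / m                  # midpoint grid on [0, 1]
Phi = np.column_stack([np.ones(m),
                       np.sqrt(2) * np.sin(2 * np.pi * t),
                       np.sqrt(2) * np.cos(2 * np.pi * t)])
xi = rng.normal(size=(n, K)) * np.array([1.0, 0.7, 0.5])
X = xi @ Phi.T
b_true = np.array([1.0, -0.5, 0.0])
G_true = np.array([[0.5, 0.2, 0.0], [0.2, -0.3, 0.0], [0.0, 0.0, 0.4]])
y = 1.0 + xi @ b_true + np.einsum("ij,jk,ik->i", xi, G_true, xi)
y = y + rng.normal(scale=0.1, size=n)

Xc = X - X.mean(axis=0)                       # centre with the sample mean curve
scores = (Xc @ Phi) * (t[1] - t[0])           # x_k ~ integral of X^c(t) phi_k(t) dt
alpha_hat, b_hat, G_hat = fit_functional_quadratic(scores, y)
gamma_hat = Phi @ G_hat @ Phi.T               # gamma(s, t) on the grid
print(np.round(G_hat, 2))
```

The number of quadratic terms grows as <math>K(K+1)/2</math>, so the truncation level has a stronger effect on variance here than in the linear model.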
 
=== Functional single and multiple index models ===
A functional multiple index model is given by
<math display="block">Y = g\left(\int_{\mathcal{T}} X^c(t) \beta_1(t)\,dt, \ldots, \int_{\mathcal{T}} X^c(t) \beta_p(t)\,dt \right) + \varepsilon.</math>
Taking <math>p=1</math> yields a functional single index model. However, for <math>p>1</math>, this model is problematic due to the [[Curse of dimensionality|curse of dimensionality]]. With <math>p>1</math> and relatively small sample sizes, the estimator given by this model often has large variance.<ref name=chen:11>Chen, Hall and M&uuml;ller (2011). "Single and multiple index functional regression models with nonparametric link". ''The Annals of Statistics''. '''39''' (3):1720&ndash;1747. [[Digital object identifier|doi]]:[http://doi.org/10.1214/11-AOS882 10.1214/11-AOS882].</ref> An alternative <math>p</math>-component functional multiple index model can be expressed as
<math display="block">Y = g_1\left(\int_{\mathcal{T}} X^c(t) \beta_1(t)\,dt\right)+ \cdots+ g_p\left(\int_{\mathcal{T}} X^c(t) \beta_p(t)\,dt \right) + \varepsilon.</math>
Estimation methods for functional single and multiple index models are available<ref name=chen:11/><ref>Jiang and Wang (2011). "Functional single index models for longitudinal data". ''The Annals of Statistics''. '''39''' (1):362&ndash;388. [[Digital object identifier|doi]]:[http://doi.org/10.1214/10-AOS845 10.1214/10-AOS845].</ref>.
 
Line 60:
<math display="block">\mathbb{E}(Y|X)=\mathbb{E}(Y) + \sum_{k=1}^\infty f_k(x_k),</math>
where <math>f_k</math> satisfies <math>\mathbb{E}(f_k(x_k))=0</math> for <math>k\in\mathbb{N}</math><ref name=wang:16/><ref>M&uuml;ller and Yao (2008). "Functional additive models". ''Journal of the American Statistical Association''. '''103''' (484):1534&ndash;1544. [[Digital object identifier|doi]]:[http://doi.org/10.1198/016214508000000751 10.1198/016214508000000751].</ref>. Another form of FAMs consists of a sequence of time-additive models:
<math display="block">\mathbb{E}(Y|X(t_1),\ldots,X(t_p))=\sum_{j=1}^p f_j(X(t_j)),</math>
where <math>\{t_1,\ldots,t_p\}</math> is a dense grid on <math>\mathcal{T}</math> with increasing size <math>p\in\mathbb{N}</math>, and <math>f_j(x) = g(t_j,x)</math> with <math>g</math> a smooth function, for <math>j=1,\ldots,p</math><ref name=wang:16/><ref>Fan, James and Radchenko (2015). "Functional additive regression". ''The Annals of Statistics''. '''43''' (5):2296&ndash;2325. [[Digital object identifier|doi]]:[http://doi.org/10.1214/15-AOS1346 10.1214/15-AOS1346].</ref>.
 
== Extensions ==
A direct extension of FLMs with scalar responses shown in model ({{EquationNote|2}}) is to add a link function to create a [[Generalized functional linear model|generalized functional linear model]] (GFLM) by analogy to extending [[Linear regression|linear regression]] to [[Generalized linear model|generalized linear regression]] (GLM), of which the three components are:
# Linear predictor <math>\eta = \beta_0 + \int_{\mathcal{T}} X^c(t)\beta(t)\,dt</math>;
# [[Variance function]] <math>\text{Var}(Y|X) = V(\mu)</math>, where <math>\mu = \mathbb{E}(Y|X)</math> is the [[Conditional expectation|conditional mean]];
# Link function <math>g</math> connecting the conditional mean and the linear predictor through <math>\mu=g(\eta)</math>.
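
The three components above can be made concrete with a minimal sketch for a binary response. It assumes <math>g(\eta) = 1/(1+e^{-\eta})</math> and <math>V(\mu)=\mu(1-\mu)</math> (a logistic model), a linear predictor truncated at three Fourier basis terms, and fitting by Newton iterations (iteratively reweighted least squares). These choices, the function name <code>fit_gflm_logistic</code> and the simulated data are assumptions made for illustration only.

```python
import numpy as np

def fit_gflm_logistic(scores, y, n_iter=25):
    """Newton (IRLS) iterations for a truncated GFLM with a logistic g."""
    design = np.column_stack([np.ones(len(y)), scores])
    coef = np.zeros(design.shape[1])
    for _ in range(n_iter):
        eta = design @ coef                   # truncated linear predictor
        mu = 1.0 / (1.0 + np.exp(-eta))       # conditional mean mu = g(eta)
        w = mu * (1.0 - mu)                   # variance function V(mu)
        grad = design.T @ (y - mu)            # score of the log-likelihood
        hess = design.T @ (design * w[:, None])
        coef = coef + np.linalg.solve(hess, grad)
    return coef                               # (beta_0, beta_1, ..., beta_K)

# Simulated data; all settings are arbitrary illustrative choices.
rng = np.random.default_rng(4)
n, m = 4000, 200
t = (np.arange(m) + 0.5) / m                  # midpoint grid on [0, 1]
Phi = np.column_stack([np.ones(m),
                       np.sqrt(2) * np.sin(2 * np.pi * t),
                       np.sqrt(2) * np.cos(2 * np.pi * t)])
xi = rng.normal(size=(n, 3)) * np.array([1.0, 0.8, 0.6])
X = xi @ Phi.T                                # mean-zero covariate curves
coef_true = np.array([-0.5, 1.0, -1.0, 0.5])
eta_true = coef_true[0] + xi @ coef_true[1:]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta_true)))

Xc = X - X.mean(axis=0)                       # centre with the sample mean curve
scores = (Xc @ Phi) * (t[1] - t[0])           # x_k ~ integral of X^c(t) phi_k(t) dt
coef_hat = fit_gflm_logistic(scores, y)
beta_hat = Phi @ coef_hat[1:]                 # beta(t) on the grid
print(np.round(coef_hat, 2))
```

Other response types would change only <math>g</math> and <math>V</math> in this sketch; the reduction of the functional covariate to a finite vector of scores stays the same.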