Functional regression: Difference between revisions

OAbot (talk | contribs)
m Open access bot: url-access updated in citation with #oabot.
 
(13 intermediate revisions by 2 users not shown)
Line 18:
With multiple functional and scalar covariates, model ({{EquationNote|2}}) can be extended to
{{NumBlk|::|<math display="block">Y = \sum_{k=1}^q Z_k\alpha_k + \sum_{j=1}^p \int_{\mathcal{T}_j} X_j^c(t) \beta_j(t) \,dt + \varepsilon,</math>|{{EquationRef|3}}}}
where <math>Z_1,\ldots,Z_q</math> are scalar covariates with <math>Z_1=1</math>, <math>\alpha_1,\ldots,\alpha_q</math> are regression coefficients for <math>Z_1,\ldots,Z_q</math>, respectively, <math>X^c_j</math> is a centered functional covariate given by <math>X_j^c(\cdot) = X_j(\cdot) - \mathbb{E}(X_j(\cdot))</math>, <math>\beta_j</math> is the regression coefficient function for <math>X_j^c(\cdot)</math>, and <math>\mathcal{T}_j</math> is the domain of <math>X_j</math> and <math>\beta_j</math>, for <math>j=1,\ldots,p</math>. However, due to the parametric component <math>\alpha</math>, the estimation methods for model ({{EquationNote|2}}) cannot be used in this case,<ref name=wang:16>{{cite journal|doi=10.1146/annurev-statistics-041715-033624|title=Functional Data Analysis|year=2016|last1=Wang|first1=Jane-Ling|last2=Chiou|first2=Jeng-Min|last3=Müller|first3=Hans-Georg|journal=[[Annual Review of Statistics and Its Application]]|volume=3|issue=1|pages=257–295|bibcode=2016AnRSA...3..257W|url=https://zenodo.org/record/895750|doi-access=free}}</ref> but alternative estimation methods for model ({{EquationNote|3}}) are available.<ref>{{Cite journal |last=Kong |first=Dehan |last2=Xue |first2=Kaijie |last3=Yao |first3=Fang |last4=Zhang |first4=Hao H. |date= |title=Partially functional linear regression in high dimensions |url=https://academic.oup.com/biomet/article-lookup/doi/10.1093/biomet/asv062 |journal=Biometrika |language=en |volume=103 |issue=1 |pages=147–159 |doi=10.1093/biomet/asv062 |issn=0006-3444|url-access=subscription }}</ref><ref>{{Cite journal |last=Hu |first=Z. |last2=Wang |last3=Carroll |date=2004-06-01 |title=Profile-kernel versus backfitting in the partially linear models for longitudinal/clustered data |url=https://academic.oup.com/biomet/article-lookup/doi/10.1093/biomet/91.2.251 |journal=Biometrika |language=en |volume=91 |issue=2 |pages=251–262 |doi=10.1093/biomet/91.2.251 |issn=0006-3444|url-access=subscription }}</ref>
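
As an illustration of how the right-hand side of model ({{EquationNote|3}}) is evaluated when the curves are only observed on a grid, the following minimal sketch replaces each integral by a trapezoidal sum. The function names, the use of the trapezoidal rule, and the assumption that the mean curves <math>\mathbb{E}(X_j(\cdot))</math> are known are illustrative choices, not part of any cited estimation method.

```python
def trapezoid(values, grid):
    """Trapezoidal approximation of the integral of a sampled function."""
    total = 0.0
    for k in range(1, len(grid)):
        total += 0.5 * (values[k - 1] + values[k]) * (grid[k] - grid[k - 1])
    return total


def partial_flm_predictor(z, alpha, x_curves, mean_curves, beta_curves, grids):
    """Noise-free right-hand side of model (3) for one subject (hypothetical helper).

    z, alpha     : scalar covariates (z[0] = 1) and their coefficients
    x_curves     : x_curves[j][k] = X_j evaluated at grids[j][k]
    mean_curves  : mean_curves[j][k] = E(X_j) at the same point (assumed known)
    beta_curves  : beta_curves[j][k] = beta_j at the same point
    grids        : grids[j] = increasing evaluation points covering T_j
    """
    scalar_part = sum(z_k * a_k for z_k, a_k in zip(z, alpha))
    functional_part = 0.0
    for x, mu, beta, grid in zip(x_curves, mean_curves, beta_curves, grids):
        # Center the covariate, multiply by beta_j, and integrate over T_j.
        integrand = [(x_k - mu_k) * b_k for x_k, mu_k, b_k in zip(x, mu, beta)]
        functional_part += trapezoid(integrand, grid)
    return scalar_part + functional_part


# Toy example: one functional covariate with X(t) = 1 + t, E(X(t)) = 1
# and beta(t) = 1 on [0, 1], plus two scalar covariates.
grid = [k / 10 for k in range(11)]
value = partial_flm_predictor(
    z=[1.0, 2.0],
    alpha=[0.5, 1.0],
    x_curves=[[1.0 + t for t in grid]],
    mean_curves=[[1.0] * len(grid)],
    beta_curves=[[1.0] * len(grid)],
    grids=[grid],
)
```

In this toy example the scalar part is <math>0.5 + 2 = 2.5</math> and the functional part is <math>\int_0^1 t\,dt = 0.5</math>, which the trapezoidal rule reproduces because the integrand is linear.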
 
=== Functional linear models with functional responses ===
Line 25:
where <math>X^c(\cdot) = X(\cdot) - \mathbb{E}(X(\cdot))</math> is still the centered functional covariate, <math>\beta_0(\cdot)</math> and <math>\beta(\cdot,\cdot)</math> are coefficient functions, and <math>\varepsilon(\cdot)</math> is usually assumed to be a random process with mean zero and finite variance. In this case, at any given time <math>t\in\mathcal{T}</math>, the value of <math>Y</math>, i.e., <math>Y(t)</math>, depends on the entire trajectory of <math>X</math>. Model ({{EquationNote|4}}), for any given time <math>t</math>, is an extension of [[multivariate linear regression]] with the inner product in Euclidean space replaced by that in <math>L^2</math>. An estimating equation motivated by multivariate linear regression is
<math display="block">r_{XY} = R_{XX}\beta, \text{ for } \beta\in L^2(\mathcal{S}\times\mathcal{T}),</math>
where <math>r_{XY}(s,t) = \text{cov}(X(s),Y(t))</math>, <math>R_{XX}: L^2(\mathcal{S}\times\mathcal{T}) \rightarrow L^2(\mathcal{S}\times\mathcal{T})</math> is defined as <math>(R_{XX}\beta)(s,t) = \int_\mathcal{S} r_{XX}(s,w)\beta(w,t)\,dw</math> with <math>r_{XX}(s,w) = \text{cov}(X(s),X(w))</math> for <math>s,w\in\mathcal{S}</math>.<ref name=wang:16/> Regularization is needed and can be done through truncation, <math>L^2</math> penalization or <math>L^1</math> penalization.<ref name=morr:15/> Various estimation methods for model ({{EquationNote|4}}) are available.<ref>{{Cite journal |last=Ramsay |first=J. O. |last2=Dalzell |first2=C. J. |date=1991 |title=Some Tools for Functional Data Analysis |url=https://www.jstor.org/stable/2345586 |journal=Journal of the Royal Statistical Society. Series B (Methodological) |volume=53 |issue=3 |pages=539–572 |issn=0035-9246}}</ref><ref>{{Cite journal |last=Yao |first=Fang |last2=Müller |first2=Hans-Georg |last3=Wang |first3=Jane-Ling |date=2005 |title=Functional linear regression analysis for longitudinal data |url=https://projecteuclid.org/journals/annals-of-statistics/volume-33/issue-6/Functional-linear-regression-analysis-for-longitudinal-data/10.1214/009053605000000660.full |journal=The Annals of Statistics |volume=33 |issue=6 |pages=2873–2903 |doi=10.1214/009053605000000660 |issn=0090-5364 |arxiv=math/0603132 }}</ref><br />
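As a sketch of why regularization is needed and how truncation works, suppose (notation introduced here only for illustration) that <math>r_{XX}</math> has eigenvalues <math>\lambda_1\ge\lambda_2\ge\cdots>0</math> with orthonormal eigenfunctions <math>\phi_1,\phi_2,\ldots</math>, that <math>\psi_1,\psi_2,\ldots</math> is an orthonormal basis of <math>L^2(\mathcal{T})</math>, and that <math>r_{XY}(s,t) = \sum_k\sum_m \sigma_{km}\phi_k(s)\psi_m(t)</math>. Substituting <math>\beta(w,t) = \sum_k\sum_m b_{km}\phi_k(w)\psi_m(t)</math> into the estimating equation and using <math>\int_\mathcal{S} r_{XX}(s,w)\phi_k(w)\,dw = \lambda_k\phi_k(s)</math> gives
<math display="block">(R_{XX}\beta)(s,t) = \sum_k\sum_m \lambda_k b_{km}\phi_k(s)\psi_m(t), \quad\text{so that}\quad b_{km} = \frac{\sigma_{km}}{\lambda_k}.</math>
Because the eigenvalues <math>\lambda_k</math> decay to zero, errors in the estimated <math>\sigma_{km}</math> are amplified for large <math>k</math>. Under these assumptions, truncation keeps only the first <math>K</math> and <math>M</math> terms,
<math display="block">\beta_{K,M}(s,t) = \sum_{k=1}^K\sum_{m=1}^M \frac{\sigma_{km}}{\lambda_k}\phi_k(s)\psi_m(t).</math>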
When <math>X</math> and <math>Y</math> are concurrently observed, i.e., <math>\mathcal{S}=\mathcal{T}</math>,<ref>{{Cite journal |last=Grenander |first=Ulf |date=1950 |title=Stochastic processes and statistical inference |url=https://projecteuclid.org/journals/arkiv-for-matematik/volume-1/issue-3/Stochastic-processes-and-statistical-inference/10.1007/BF02590638.full |journal=Arkiv för Matematik |volume=1 |issue=3 |pages=195–277 |doi=10.1007/BF02590638 |issn=0004-2080}}</ref> it is reasonable to consider a historical functional linear model, where the current value of <math>Y</math> only depends on the history of <math>X</math>, i.e., <math>\beta(s,t)=0</math> for <math>s>t</math> in model ({{EquationNote|4}}).<ref name=wang:16/><ref>{{Cite journal |last=Malfait |first=Nicole |last2=Ramsay |first2=James O. |date=2003 |title=The historical functional linear model |url=https://onlinelibrary.wiley.com/doi/10.2307/3316063 |journal=Canadian Journal of Statistics |language=en |volume=31 |issue=2 |pages=115–128 |doi=10.2307/3316063 |issn=1708-945X|url-access=subscription }}</ref> A simpler version of the historical functional linear model is the functional concurrent model (see below).<br />
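The historical constraint <math>\beta(s,t)=0</math> for <math>s>t</math> means that the integral in model ({{EquationNote|4}}) only runs over the past of <math>X</math>. A minimal sketch of this, assuming a common observation grid and a trapezoidal approximation of the integral (the function name and arguments are hypothetical, and the noise term is omitted):

```python
def historical_predict(x_centered, beta, beta0, grid):
    """Evaluate beta0(t) + integral over s <= t of beta(s, t) X^c(s) ds on a grid.

    x_centered : list of X^c(s_j) values on `grid`
    beta       : function beta(s, t); only values with s <= t are used
    beta0      : function beta0(t)
    grid       : increasing list of time points shared by X and Y (S = T)
    """
    y = []
    for i, t in enumerate(grid):
        # Trapezoidal rule over the history [grid[0], t]; points with s > t
        # are skipped because beta(s, t) = 0 there.
        integral = 0.0
        for j in range(1, i + 1):
            left = beta(grid[j - 1], t) * x_centered[j - 1]
            right = beta(grid[j], t) * x_centered[j]
            integral += 0.5 * (left + right) * (grid[j] - grid[j - 1])
        y.append(beta0(t) + integral)
    return y


# Toy check: with beta = 1 on the history, beta0 = 0 and X^c = 1,
# the noise-free response is Y(t) = t - grid[0].
grid = [k / 10 for k in range(11)]
y = historical_predict([1.0] * len(grid), lambda s, t: 1.0, lambda t: 0.0, grid)
```

Restricting the inner loop to <code>j <= i</code> is what encodes the dependence on the history alone; letting it run over the whole grid would give the unrestricted model ({{EquationNote|4}}) instead.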
With multiple functional covariates, model ({{EquationNote|4}}) can be extended to
{{NumBlk|::|<math display="block">Y(t) = \beta_0(t) + \sum_{j=1}^p\int_{\mathcal{S}_j} \beta_j(s,t) X^c_j(s)\,ds + \varepsilon(t),\ \text{for}\ t\in\mathcal{T},</math>|{{EquationRef|5}}}}
Line 36:
When <math>\mathcal{S} = \mathcal{T}</math>, another model, known as the functional concurrent model and sometimes also referred to as the varying-coefficient model, takes the form
{{NumBlk|::|<math display="block">Y(t) = \alpha_0(t) + \alpha(t)X(t)+\varepsilon(t),\ \text{for}\ t\in\mathcal{T},</math>|{{EquationRef|6}}}}
where <math>\alpha_0</math> and <math>\alpha</math> are coefficient functions. Note that model ({{EquationNote|6}}) assumes the value of <math>Y</math> at time <math>t</math>, i.e., <math>Y(t)</math>, only depends on that of <math>X</math> at the same time, i.e., <math>X(t)</math>. Various estimation methods can be applied to model ({{EquationNote|6}}).<ref>{{Cite journal |last=Fan |first=Jianqing |last2=Zhang |first2=Wenyang |date= |title=Statistical estimation in varying coefficient models |url=https://projecteuclid.org/journals/annals-of-statistics/volume-27/issue-5/Statistical-estimation-in-varying-coefficient-models/10.1214/aos/1017939139.full |journal=The Annals of Statistics |volume=27 |issue=5 |pages=1491–1518 |doi=10.1214/aos/1017939139 |issn=0090-5364}}</ref><ref>{{Cite journal |last=Huang |first=Jianhua Z. |last2=Wu |first2=Colin O. |last3=Zhou |first3=Lan |date=2004 |title=Polynomial Spline Estimation and Inference for Varying Coefficient Models with Longitudinal Data |url=https://www.jstor.org/stable/24307415 |journal=Statistica Sinica |volume=14 |issue=3 |pages=763–788 |issn=1017-0405}}</ref><ref>{{Cite journal |last=Şentürk |first=Damla |last2=Müller |first2=Hans-Georg |date=2010-09-01 |title=Functional Varying Coefficient Models for Longitudinal Data |url=https://www.tandfonline.com/doi/abs/10.1198/jasa.2010.tm09228 |journal=Journal of the American Statistical Association |volume=105 |issue=491 |pages=1256–1264 |doi=10.1198/jasa.2010.tm09228 |issn=0162-1459|url-access=subscription }}</ref><br />
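Because model ({{EquationNote|6}}) links <math>Y(t)</math> to <math>X(t)</math> at the same time point only, a naive estimator fits a separate simple linear regression at each grid point. The sketch below illustrates that idea under the assumption that all curves are observed without gaps on a common grid; it is not one of the cited methods, which additionally smooth the coefficient functions across <math>t</math>. The function name and the simulated data are hypothetical.

```python
import random


def concurrent_ols(x_curves, y_curves):
    """Pointwise least squares for Y(t) = alpha0(t) + alpha(t) X(t) + eps(t).

    x_curves, y_curves : lists of n curves, each a list of values on a common grid.
    Returns (alpha0_hat, alpha_hat), one estimate per grid point.
    """
    n = len(x_curves)
    m = len(x_curves[0])
    alpha0_hat, alpha_hat = [], []
    for j in range(m):
        xs = [x[j] for x in x_curves]
        ys = [y[j] for y in y_curves]
        x_bar = sum(xs) / n
        y_bar = sum(ys) / n
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        slope = sxy / sxx  # least-squares slope at grid point j
        alpha_hat.append(slope)
        alpha0_hat.append(y_bar - slope * x_bar)
    return alpha0_hat, alpha_hat


# Simulated, noise-free data with alpha0(t) = t and alpha(t) = 1 + 2t.
random.seed(0)
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
x_curves = [[random.gauss(0.0, 1.0) for _ in grid] for _ in range(50)]
y_curves = [[t + (1.0 + 2.0 * t) * x_t for t, x_t in zip(grid, x)] for x in x_curves]
alpha0_hat, alpha_hat = concurrent_ols(x_curves, y_curves)
```

With noise-free data the pointwise fits recover <math>\alpha_0</math> and <math>\alpha</math> on the grid up to rounding error. With noisy data the pointwise estimates are rough in <math>t</math>, which is one motivation for smoothing-based estimators.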
With multiple functional covariates, model ({{EquationNote|6}}) can also be extended to
<math display="block">Y(t) = \alpha_0(t) + \sum_{j=1}^p\alpha_j(t)X_j(t)+\varepsilon(t),\ \text{for}\ t\in\mathcal{T},</math>
Line 43:
== Functional nonlinear models ==
=== Functional polynomial models ===
Functional polynomial models are an extension of the FLMs with scalar responses, analogous to extending linear regression to [[polynomial regression]]. For a scalar response <math>Y</math> and a functional covariate <math>X(\cdot)</math> with domain <math>\mathcal{T}</math>, the simplest example of functional polynomial models is functional quadratic regression<ref name="yao:10">{{Cite journal |last=Yao |first=F. |last2=Müller |first2=H.-G. |date=2010-03-01 |title=Functional quadratic regression |url=https://academic.oup.com/biomet/article-lookup/doi/10.1093/biomet/asp069 |journal=Biometrika |language=en |volume=97 |issue=1 |pages=49–64 |doi=10.1093/biomet/asp069 |issn=0006-3444|url-access=subscription }}</ref>
<math display="block">Y = \alpha + \int_\mathcal{T}\beta(t)X^c(t)\,dt + \int_\mathcal{T} \int_\mathcal{T} \gamma(s,t) X^c(s)X^c(t) \,ds\,dt + \varepsilon,</math>
where <math>X^c(\cdot) = X(\cdot) - \mathbb{E}(X(\cdot))</math> is the centered functional covariate, <math>\alpha</math> is a scalar coefficient, <math>\beta(\cdot)</math> and <math>\gamma(\cdot,\cdot)</math> are coefficient functions with domains <math>\mathcal{T}</math> and <math>\mathcal{T}\times\mathcal{T}</math>, respectively, and <math>\varepsilon</math> is a random error with mean zero and finite variance. By analogy to FLMs with scalar responses, estimation of functional polynomial models can be obtained through expanding both the centered covariate <math>X^c</math> and the coefficient functions <math>\beta</math> and <math>\gamma</math> in an orthonormal basis.<ref name=yao:10/>
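The basis-expansion step can be sketched as follows, under the assumption (notation introduced here for illustration) that <math>\phi_1,\phi_2,\ldots</math> is an orthonormal basis of <math>L^2(\mathcal{T})</math>, with scores <math>x_k = \int_\mathcal{T} X^c(t)\phi_k(t)\,dt</math> and with <math>\beta_k</math> and <math>\gamma_{jk}</math> denoting the coefficients of <math>\beta</math> and <math>\gamma</math> in that basis:
<math display="block">X^c(t) = \sum_{k=1}^\infty x_k\phi_k(t), \quad \beta(t) = \sum_{k=1}^\infty \beta_k\phi_k(t), \quad \gamma(s,t) = \sum_{j=1}^\infty\sum_{k=1}^\infty \gamma_{jk}\phi_j(s)\phi_k(t).</math>
By orthonormality, <math>\int_\mathcal{T}\beta(t)X^c(t)\,dt = \sum_k \beta_k x_k</math> and <math>\int_\mathcal{T}\int_\mathcal{T}\gamma(s,t)X^c(s)X^c(t)\,ds\,dt = \sum_j\sum_k \gamma_{jk}x_jx_k</math>, so the functional quadratic model becomes
<math display="block">Y = \alpha + \sum_{k=1}^\infty \beta_k x_k + \sum_{j=1}^\infty\sum_{k=1}^\infty \gamma_{jk}x_jx_k + \varepsilon.</math>
Truncating the sums at <math>K</math> components leaves an ordinary quadratic regression of <math>Y</math> on the scores <math>x_1,\ldots,x_K</math>. Since <math>x_jx_k = x_kx_j</math>, only the sums <math>\gamma_{jk}+\gamma_{kj}</math> are identifiable, so the truncated model has an intercept, <math>K</math> linear coefficients and <math>K(K+1)/2</math> distinct quadratic coefficients.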
Line 50:
A functional multiple index model is given by
<math display="block">Y = g\left(\int_{\mathcal{T}} X^c(t) \beta_1(t)\,dt, \ldots, \int_{\mathcal{T}} X^c(t) \beta_p(t)\,dt \right) + \varepsilon.</math>
Taking <math>p=1</math> yields a functional single index model. However, for <math>p>1</math>, this model is problematic due to the [[curse of dimensionality]]. With <math>p>1</math> and relatively small sample sizes, the estimator given by this model often has large variance.<ref name="chen:11">{{Cite journal |last=Chen |first=Dong |last2=Hall |first2=Peter |last3=Müller |first3=Hans-Georg |date=2011 |title=Single and multiple index functional regression models with nonparametric link |url=https://projecteuclid.org/journals/annals-of-statistics/volume-39/issue-3/Single-and-multiple-index-functional-regression-models-with-nonparametric-link/10.1214/11-AOS882.full |journal=The Annals of Statistics |volume=39 |issue=3 |pages=1720–1747 |doi=10.1214/11-AOS882 |issn=0090-5364 |arxiv=1211.5018 }}</ref> An alternative <math>p</math>-component functional multiple index model can be expressed as
<math display="block">Y = g_1\left(\int_{\mathcal{T}} X^c(t) \beta_1(t)\,dt\right)+ \cdots+ g_p\left(\int_{\mathcal{T}} X^c(t) \beta_p(t)\,dt \right) + \varepsilon.</math>
Estimation methods for functional single and multiple index models are available.<ref name=chen:11/><ref>{{Cite journal |last=Jiang |first=Ci-Ren |last2=Wang |first2=Jane-Ling |date=2011 |title=Functional single index models for longitudinal data |url=https://projecteuclid.org/journals/annals-of-statistics/volume-39/issue-1/Functional-single-index-models-for-longitudinal-data/10.1214/10-AOS845.full |journal=The Annals of Statistics |volume=39 |issue=1 |pages=362–388 |doi=10.1214/10-AOS845 |issn=0090-5364|arxiv=1103.1726 }}</ref>
 
=== Functional additive models (FAMs) ===
Line 59:
One form of FAMs is obtained by replacing the linear function of <math>x_k</math>, i.e., <math>\beta_k x_k</math>, by a general smooth function <math>f_k</math>,
<math display="block">\mathbb{E}(Y|X)=\mathbb{E}(Y) + \sum_{k=1}^\infty f_k(x_k),</math>
where <math>f_k</math> satisfies <math>\mathbb{E}(f_k(x_k))=0</math> for <math>k\in\mathbb{N}</math>.<ref name=wang:16/><ref>{{Cite journal |last=Müller |first=Hans-Georg |last2=Yao |first2=Fang |date=2008-12-01 |title=Functional Additive Models |url=https://www.tandfonline.com/doi/abs/10.1198/016214508000000751 |journal=Journal of the American Statistical Association |volume=103 |issue=484 |pages=1534–1544 |doi=10.1198/016214508000000751 |issn=0162-1459|url-access=subscription }}</ref> Another form of FAMs consists of a sequence of time-additive models:
<math display="block">\mathbb{E}(Y|X(t_1),\ldots,X(t_p))=\sum_{j=1}^p f_j(X(t_j)),</math>
where <math>\{t_1,\ldots,t_p\}</math> is a dense grid on <math>\mathcal{T}</math> with increasing size <math>p\in\mathbb{N}</math>, and <math>f_j(x) = g(t_j,x)</math> with <math>g</math> a smooth function, for <math>j=1,\ldots,p</math>.<ref name=wang:16/><ref>{{Cite journal |last=Fan |first=Yingying |last2=James |first2=Gareth M. |last3=Radchenko |first3=Peter |date=2015 |title=Functional additive regression |url=https://projecteuclid.org/journals/annals-of-statistics/volume-43/issue-5/Functional-additive-regression/10.1214/15-AOS1346.full |journal=The Annals of Statistics |volume=43 |issue=5 |pages=2296–2325 |doi=10.1214/15-AOS1346 |issn=0090-5364 |arxiv=1510.04064 }}</ref>
 
== Extensions ==