Functional regression: Difference between revisions

Ms.chen (talk | contribs)
<math>Y = \beta_0 + \langle\mathbf{X},\beta\rangle + \epsilon</math>
where <math>\langle\cdot,\cdot\rangle</math> denotes the [[Inner product space|inner product]] in [[Euclidean space|Euclidean space]], <math>\beta_0\in\mathbb{R}</math> and <math>\beta\in\mathbb{R}^p</math> denote the regression coefficients, and <math>\epsilon</math> is a random error with [[Expected value|mean]] zero and finite [[Variance|variance]]. FLMs can be divided into three types based on responses and covariates.
 
=== Functional linear models with scalar response ===
Functional linear models with scalar response (also known as <a href="/wiki/Generalized_functional_linear_model" title="Generalized functional linear model">functional linear regression (FLR)</a>) are obtained by replacing the scalar covariates $\mathbf{X}$ and the coefficient vector $\beta$ in the traditional multivariate linear model with a centered functional covariate $X^c(t) = X(t) - \mathbb{E}(X(t))$ and a coefficient function $\beta = \beta(t)$ for $t\in\mathcal{T}$, respectively:
$$Y = \beta_0 + \langle X^c, \beta\rangle +\epsilon = \beta_0 + \int_\mathcal{T} X^c(t)\beta(t)dt + \epsilon$$
where $\langle \cdot, \cdot \rangle$ here denotes the inner product in $L^2$ space. One approach to estimating $\beta_0$ and $\beta(t)$ is to expand the centered covariate $X^c$ and the coefficient function $\beta(t)$ on the same <a href="/wiki/Basis_function" title="Basis function">functional basis</a>, such as a <a href="/wiki/B-spline" title="B-spline">B-spline</a> basis or the eigenfunctions in the <a href="/wiki/Karhunen%E2%80%93Lo%C3%A8ve_theorem" title="Karhunen&ndash;Lo&egrave;ve theorem">Karhunen&ndash;Lo&egrave;ve expansion</a>. Suppose $\{\phi_k\}_{k=1}^\infty$ is an <a href="/wiki/Orthonormal_basis" title="Orthonormal basis">orthonormal basis</a> of the functional space. The expansions of $X^c$ and $\beta$ on this basis are $X^c(t) = \sum_{k=1}^\infty x_k \phi_k(t)$ and $\beta(t) = \sum_{k=1}^\infty \beta_k \phi_k(t)$, respectively. By orthonormality, $\int_\mathcal{T} X^c(t)\beta(t)dt = \sum_{k=1}^\infty \beta_k x_k$, so the FLR model is equivalent to the multivariate linear model of the form
$$Y = \beta_0 + \sum_{k=1}^\infty \beta_k x_k +\epsilon$$
where in implementation the infinite sum is replaced by a finite sum truncated at $K$
$$Y = \beta_0 + \sum_{k=1}^K \beta_k x_k +\epsilon$$
where $K\in\mathbb{N}$ is finite<sup id="cite_ref-Wang_1-0" class="reference"><a href="#cite_note-Wang-1">[1]</a></sup>.<br />
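The truncated regression above can be sketched numerically. The example below is only an illustration under stated assumptions, not a reference implementation: the curves are simulated on a regular grid, the orthonormal basis is taken to be the cosine basis $\sqrt{2}\cos(k\pi t)$ on $[0,1]$, the scores $x_k$ are computed by trapezoidal quadrature, and the sample mean curve stands in for $\mathbb{E}(X(t))$. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n, m, K = 200, 101, 4
t = np.linspace(0.0, 1.0, m)

# Orthonormal cosine basis phi_k(t) = sqrt(2) cos(k pi t) on [0, 1] (illustrative choice).
Phi = np.stack([np.sqrt(2.0) * np.cos(k * np.pi * t) for k in range(1, K + 1)])  # (K, m)

# Simulate curves X_i(t) = mu(t) + sum_k x_ik phi_k(t) and scalar responses Y_i.
score_sd = np.array([1.0, 0.7, 0.5, 0.3])
true_scores = rng.normal(size=(n, K)) * score_sd
X = t + true_scores @ Phi                      # mean function mu(t) = t (assumed)
beta0_true = 2.0
beta_true = np.array([1.0, -0.5, 0.25, 0.0])   # basis coefficients of beta(t)
Y = beta0_true + true_scores @ beta_true + 0.1 * rng.normal(size=n)

# Center the curves with the sample mean function (stand-in for E X(t)).
Xc = X - X.mean(axis=0)

# Scores x_ik = integral of Xc_i(t) phi_k(t) dt, via trapezoidal quadrature weights.
w = np.full(m, t[1] - t[0])
w[[0, -1]] *= 0.5
scores = (Xc * w) @ Phi.T                      # (n, K)

# Truncated model: ordinary least squares of Y on (1, x_1, ..., x_K).
design = np.column_stack([np.ones(n), scores])
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
beta0_hat, beta_coef_hat = coef[0], coef[1:]

# Reconstruct the estimated coefficient function beta(t) on the grid.
beta_hat_t = beta_coef_hat @ Phi
```

In practice the truncation level $K$ has to be chosen from the data, for example by cross-validation; here it is fixed at the number of components used in the simulation.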
With multiple functional and scalar covariates, the FLR model can be extended to
$$Y = \langle\mathbf{Z},\alpha\rangle + \sum_{j=1}^p \int_{\mathcal{T}_j} X_j^c(t) \beta_j(t) dt + \epsilon$$
where $\mathbf{Z}=(Z_1,\cdots,Z_q)^T$ with $Z_1=1$ is a vector of scalar covariates, $\alpha=(\alpha_1,\cdots,\alpha_q)^T$ is a vector of coefficients corresponding to $\mathbf{Z}$, $\langle\cdot,\cdot\rangle$ denotes the inner product in Euclidean space, $X^c_1,\cdots,X^c_p$ are multiple centered functional covariates given by $X_j^c(\cdot) = X_j(\cdot) - \mathbb{E}(X_j(\cdot))$, and $\mathcal{T}_j$ is the interval on which $X_j(\cdot)$ is defined. However, due to the parametric component $\alpha$, the estimation of this model is different from that of the FLR. A possible approach to estimating $\alpha$ is through a <a href="/wiki/Generalized_estimating_equation" title="Generalized estimating equation">generalized estimating equation</a> with the nonparametric part $ \sum_{j=1}^p \int_{\mathcal{T}_j} X_j^c(t) \beta_j(t) dt$ replaced by its estimate for a given $\alpha$.<sup id="cite_ref-Hu_2-0" class="reference"><a href="#cite_note-Hu-2">[2]</a></sup> Once $\alpha$ is estimated, one can apply any suitable consistent method to $Y-\langle\mathbf{Z}, \hat\alpha\rangle$ to estimate the coefficient functions $\beta_j$<sup id="cite_ref-Wang_1-1" class="reference"><a href="#cite_note-Wang-1">[1]</a></sup>.<br />
 
=== Functional linear models with functional response ===
For a function $Y(\cdot)$ on $\mathcal{T}_Y$ and a functional covariate $X(\cdot)$ on $\mathcal{T}_X$, two primary models have been considered<sup id="cite_ref-Wang_1-2" class="reference"><a href="#cite_note-Wang-1">[1]</a></sup><sup id="cite_ref-Ramsay_3-0" class="reference"><a href="#cite_note-Ramsay-3">[3]</a></sup>. One functional linear model regressing $Y(\cdot)$ on $X(\cdot)$ is given by
$$Y(s) = \beta_0(s) + \int_{\mathcal{T}_X} \beta(s,t) X^c(t)dt + \epsilon(s)$$
where $s\in\mathcal{T}_Y$, $t\in\mathcal{T}_X$, $X^c(\cdot) = X(\cdot) - \mathbb{E}(X(\cdot))$ is still the centered functional covariate, $\beta_0(\cdot)$ and $\beta(\cdot,\cdot)$ are coefficient functions, and $\epsilon(\cdot)$ is usually assumed to be a Gaussian process with mean zero. In this case, at any given time $s\in\mathcal{T}_Y$, the value of $Y$, i.e. $Y(s)$, depends on the entire trajectory of $X$. This model, for any given time $s$, is an extension of the traditional multivariate linear regression model by simply replacing the inner product in Euclidean space by that in $L^2$ space. Thus, estimation of this model can be given by analogy to multivariate linear regression
$$r_{XY} = R_{XX}\beta, \text{ for } \beta\in L^2(\mathcal{T}_X\times\mathcal{T}_Y)$$
where $r_{XY}(s,t) = \text{cov}(X(s),Y(t))$ for $s\in\mathcal{T}_X$ and $t\in\mathcal{T}_Y$, and $R_{XX}: L^2(\mathcal{T}_X\times\mathcal{T}_Y) \rightarrow L^2(\mathcal{T}_X\times\mathcal{T}_Y)$ is defined as $(R_{XX}\beta)(s,t) = \int_{\mathcal{T}_X} r_{XX}(s,w)\beta(w,t)dw$ with $r_{XX}(s,w) = \text{cov}(X(s),X(w))$. In this equation the first argument of $\beta$ is the covariate time and the second is the response time. Furthermore, regularization is needed because $R_{XX}$ is a compact operator and its inverse is not bounded<sup id="cite_ref-Wang_1-3" class="reference"><a href="#cite_note-Wang-1">[1]</a></sup>.<br />
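To illustrate why regularization is needed, the solution can be written formally in terms of the eigenvalues $\lambda_1\geq\lambda_2\geq\cdots>0$ and orthonormal eigenfunctions $\phi_k$ of the covariance operator of $X$. These symbols are introduced here for illustration only, under the assumption that $r_{XX}(s,w) = \sum_{k=1}^\infty \lambda_k\phi_k(s)\phi_k(w)$. Substituting into $r_{XY} = R_{XX}\beta$ and using orthonormality gives
$$\beta(w,t) = \sum_{k=1}^\infty \frac{1}{\lambda_k}\left(\int_{\mathcal{T}_X}\phi_k(s)\, r_{XY}(s,t)\,ds\right)\phi_k(w)$$
Since $\lambda_k\rightarrow 0$ as $k\rightarrow\infty$, the factors $1/\lambda_k$ are unbounded, so small errors in the estimated covariances are amplified in the higher-order terms. One simple form of regularization is to truncate the sum at a finite number of terms $K$.<br />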
In particular, taking $X(\cdot)$ as a constant function gives a special case of this model
$$Y(s) = \sum_{j=1}^p X_j \beta_j(s) + \epsilon(s)$$
which is an FLM with functional response and scalar covariates.
 
==== Concurrent models ====
Assuming that $\mathcal{T}_X = \mathcal{T}_Y := \mathcal{T}$, another model, called the varying-coefficient model, is of the form
$$Y(s) = \alpha_0(s) + \alpha(s)X(s)+\epsilon(s)$$
Note that this model assumes the value of $Y$ at time $s$, i.e. $Y(s)$, only depends on that of $X$ at the same time, $X(s)$, and thus is a concurrent regression model. A possible way to estimate $\alpha$ is a two-step procedure: (i) For any $s\in\mathcal{T}$ fixed, an estimate of $\alpha(s)$ can be computed by applying <a href="/wiki/Ordinary_least_squares" title="Ordinary least squares">ordinary least squares</a> to a neighborhood of $s$. Let the corresponding estimate be denoted by $\tilde\alpha(s)$. (ii) The final estimate $\hat\alpha$ is then obtained by smoothing $\tilde\alpha(s)$ with respect to $s$<sup id="cite_ref-Wang_1-4" class="reference"><a href="#cite_note-Wang-1">[1]</a></sup>.
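The two-step procedure can be sketched as follows. This is a minimal illustration under assumptions not in the text above: the data are simulated on a common regular grid, the "neighborhood" in step (i) is taken to be a fixed window of adjacent grid points whose observations are pooled, and step (ii) uses a Gaussian-kernel (Nadaraya-Watson) smoother with a hand-picked bandwidth. The function names `local_ols` and `kernel_smooth` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

n, m = 150, 50
s = np.linspace(0.0, 1.0, m)

# Simulated truth (assumed for the example): intercept and slope functions.
alpha0_true = np.sin(2 * np.pi * s)
alpha_true = 1.0 + s**2

# Covariate values X_i(s_j) and responses Y_i(s_j) from the concurrent model.
X = rng.normal(size=(n, 1)) + rng.normal(size=(n, m))
Y = alpha0_true + alpha_true * X + 0.2 * rng.normal(size=(n, m))

def local_ols(j, half_width=2):
    """Step (i): OLS of Y on (1, X), pooling grid points within half_width of s_j."""
    lo, hi = max(0, j - half_width), min(m, j + half_width + 1)
    x = X[:, lo:hi].ravel()
    y = Y[:, lo:hi].ravel()
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # (intercept, slope)

raw = np.array([local_ols(j) for j in range(m)])  # shape (m, 2)
alpha_tilde = raw[:, 1]                           # pointwise slope estimates

def kernel_smooth(values, grid, bandwidth=0.05):
    """Step (ii): Nadaraya-Watson smoothing with a Gaussian kernel."""
    diff = (grid[:, None] - grid[None, :]) / bandwidth
    weights = np.exp(-0.5 * diff**2)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ values

alpha_hat = kernel_smooth(alpha_tilde, s)
```

The window width and the bandwidth both trade bias against variance; the values used here are arbitrary and would normally be tuned to the data.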
 
== Functional nonlinear models ==
=== Functional polynomial models ===
Functional polynomial models are an extension of the FLMs, analogous to extending multivariate linear models to polynomial ones. For a scalar response $Y$ and a functional covariate $X(\cdot)$ defined on an interval $\mathcal{T}$, the simplest example of a functional polynomial model is functional quadratic regression<sup id="cite_ref-Yao_5-0" class="reference"><a href="#cite_note-Yao-5">[5]</a></sup>
$$Y = \alpha + \int_\mathcal{T}\beta(t)X^c(t)dt + \int_\mathcal{T} \int_\mathcal{T} \gamma(s,t) X^c(s)X^c(t) dsdt + \epsilon$$
where $X^c(\cdot) = X(\cdot) - \mathbb{E}(X(\cdot))$ is the centered functional covariate, $\alpha$ is a scalar coefficient, $\beta(\cdot)$ and $\gamma(\cdot,\cdot)$ are coefficient functions defined on $\mathcal{T}$ and $\mathcal{T}\times\mathcal{T}$ respectively, and $\epsilon$ is a random error with mean zero and finite variance. By analogy to FLMs, functional polynomial models can be estimated by expanding both the centered covariate $X^c$ and the coefficient functions $\beta$ and $\gamma$ on an orthonormal basis. The model can then be written equivalently as a multivariate polynomial regression in the expansion coefficients of $X^c$, for which estimation is straightforward.
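As a worked illustration of this expansion for the quadratic case, let $\{\phi_k\}_{k=1}^\infty$ be an orthonormal basis and write $x_k = \int_\mathcal{T} X^c(t)\phi_k(t)dt$, $\beta_k = \int_\mathcal{T}\beta(t)\phi_k(t)dt$ and $\gamma_{jk} = \int_\mathcal{T}\int_\mathcal{T}\gamma(s,t)\phi_j(s)\phi_k(t)dsdt$. The symbol $\gamma_{jk}$ and the truncation level $K$ are notation assumed here for illustration. Truncating each expansion at $K$ terms gives
$$Y = \alpha + \sum_{k=1}^K \beta_k x_k + \sum_{j=1}^K\sum_{k=1}^K \gamma_{jk} x_j x_k + \epsilon$$
which is an ordinary quadratic regression of $Y$ on the scores $x_1,\cdots,x_K$.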
 
=== Functional single and multiple index models ===
A functional multiple index model is given by
$$Y = g\left(\int_{\mathcal{T}} X^c(t) \beta_1(t)dt, \cdots, \int_{\mathcal{T}} X^c(t) \beta_p(t)dt \right) + \epsilon.$$
Taking $p=1$ yields a functional single index model. However, this model is problematic due to the <a href="/wiki/Curse_of_dimensionality" title="Curse of dimensionality">curse of dimensionality</a>. In other words, with $p>1$ and relatively small sample sizes, this model often leads to high variability of the estimator<sup id="cite_ref-Chen_4-0" class="reference"><a href="#cite_note-Chen-4">[4]</a></sup>. Alternatively, a preferable $p$-component functional multiple index model can be formed as
$$Y = g_1\left(\int_{\mathcal{T}} X^c(t) \beta_1(t)dt\right)+ \cdots+ g_p\left(\int_{\mathcal{T}} X^c(t) \beta_p(t)dt \right) + \epsilon.$$
 
=== Functional additive models ===
Given the expansion of the centered functional covariate $X^c$ on an orthonormal basis $\{\phi_k\}_{k=1}^\infty$: $X^c(t) = \sum_{k=1}^\infty x_k \phi_k(t)$, a functional linear model with scalar response as stated before can be written as
$$\mathbb{E}(Y|X)=\mathbb{E}(Y) + \sum_{k=1}^\infty \beta_k x_k.$$
A functional additive model can be given by replacing the linear function of $x_k$ by a general smooth function $f_k$
$$\mathbb{E}(Y|X)=\mathbb{E}(Y) + \sum_{k=1}^\infty f_k(x_k)$$
where $f_k$ satisfies $\mathbb{E}(f_k(x_k))=0$ for $k\in\mathbb{N}$<sup id="cite_ref-Wang_1-5" class="reference"><a href="#cite_note-Wang-1">[1]</a></sup>.
 
== Extensions ==
A direct extension of functional linear models with scalar response is to add a link function to create a <a href="/wiki/Generalized_functional_linear_model" title="Generalized functional linear model">generalized functional linear model</a> (GFLM) by analogy to extending <a href="/wiki/Linear_regression" title="Linear regression">linear regression</a> to <a href="/wiki/Generalized_linear_model" title="Generalized linear model">generalized linear regression</a>
$$Y=g\left(\beta_0 + \int_{\mathcal{T}} X^c(t)\beta(t)dt\right) +\epsilon$$
where $g$ is a pre-specified link function.
 
== See also ==
* [[Functional principal component analysis|Functional principal component analysis]]
* [[Functional data analysis|Functional data analysis]]
* [[Generalized linear model|Generalized linear model]]
* [[Generalized functional linear model|Generalized functional linear model]]
* [[Karhunen–Loève theorem|Karhunen–Loève theorem]]
* [[Stochastic processes|Stochastic processes]]
* [[Lp space|Lp space]]
 
== Further reading ==
* <ref>Morris (2015). Functional regression. ''Annual Review of Statistics and Its Application''. '''2''':321&ndash;359.</ref>