Error correction model

{{Short description|Type of time series model}}
An '''error correction model''' ('''ECM''') belongs to a category of multiple [[time series]] models most commonly used for data where the underlying variables have a long-run common stochastic trend, also known as [[cointegration]]. ECMs are a theoretically driven approach useful for estimating both short-term and long-term effects of one time series on another. The term ''error correction'' relates to the fact that the last period's deviation from a long-run equilibrium, the ''error'', influences the short-run dynamics. Thus ECMs directly estimate the speed at which a dependent variable returns to equilibrium after a change in other variables.
 
==History of ECM==
[[Udny Yule|Yule]] (1926) and [[Clive Granger|Granger]] and [[Paul Newbold|Newbold]] (1974) were the first to draw attention to the problem of [[spurious correlation]] and to propose ways to address it in time series analysis.<ref>{{cite journal|last1=Yule|first1=George Udny|title=Why do we sometimes get nonsense correlations between time series? – A study in sampling and the nature of time-series|journal=Journal of the Royal Statistical Society|date=1926|volume=89|issue=1|pages=1–63|doi=10.2307/2341482 |jstor=2341482 }}</ref><ref>{{cite journal |last1=Granger |first1=C.W.J. |first2=P.|last2=Newbold |year=1974 |title=Spurious regressions in Econometrics | volume=2| issue=2| journal=[[Journal of Econometrics]] |pages=111–120 |doi=10.1016/0304-4076(74)90034-7 }}</ref> Given two completely unrelated but integrated (non-stationary) time series, the [[regression analysis]] of one on the other will tend to produce an apparently statistically significant relationship, and thus a researcher might falsely believe to have found evidence of a true relationship between these variables. [[Ordinary least squares]] will no longer be consistent and commonly used test statistics will be invalid. In particular, [[Monte Carlo method|Monte Carlo simulations]] show that one will get a very high [[coefficient of determination|R squared]], very high individual [[t-statistic]]s and a low [[Durbin–Watson statistic]].
Technically speaking, Phillips (1986) proved that parameter estimates will not [[Convergence in probability|converge in probability]], the [[Y-intercept|intercept]] will diverge and the slope will have a non-degenerate distribution as the sample size increases.<ref>{{cite journal|last1=Phillips|first1=Peter C.B.|title=Understanding Spurious Regressions in Econometrics|journal=Cowles Foundation Discussion Papers 757|date=1985|url=http://cowles.yale.edu/sites/default/files/files/pub/d07/d0757.pdf|publisher=Cowles Foundation for Research in Economics, Yale University}}</ref> However, there might be a common [[cointegration|stochastic trend]] to both series that a researcher is genuinely interested in because it reflects a long-run relationship between these variables.
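The spurious regression result described above can be reproduced with a short simulation. The following Python sketch is illustrative only: the sample size, the number of replications, the seed and all function names are assumptions made for this example and are not taken from the cited studies. It repeatedly regresses one simulated random walk on another, independent one, and records how often the slope appears significant at the conventional 5% level, together with the average R squared and Durbin–Watson statistic.

```python
import math
import random


def random_walk(n, rng):
    """Return a driftless Gaussian random walk of length n."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def ols_summary(x, y):
    """Regress y on a constant and x.

    Returns (t-statistic of the slope, R squared, Durbin-Watson statistic).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [b - intercept - slope * a for a, b in zip(x, y)]
    ssr = sum(e * e for e in resid)
    t_stat = slope / math.sqrt(ssr / (n - 2) / sxx)
    r_squared = 1.0 - ssr / syy
    dw = sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, n)) / ssr
    return t_stat, r_squared, dw


def simulate(replications=500, n=100, seed=0):
    """Monte Carlo over pairs of independent random walks (assumed settings)."""
    rng = random.Random(seed)
    results = [
        ols_summary(random_walk(n, rng), random_walk(n, rng))
        for _ in range(replications)
    ]
    reject_share = sum(abs(t) > 1.96 for t, _, _ in results) / replications
    mean_r2 = sum(r for _, r, _ in results) / replications
    mean_dw = sum(d for _, _, d in results) / replications
    return reject_share, mean_r2, mean_dw


if __name__ == "__main__":
    share, r2, dw = simulate()
    print(f"share of |t| > 1.96: {share:.2f}")
    print(f"mean R squared:      {r2:.2f}")
    print(f"mean Durbin-Watson:  {dw:.2f}")
```

Although the two series are independent by construction, the slope should be declared significant far more often than the nominal 5%, while the Durbin–Watson statistic stays close to zero.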
 
Because of the stochastic nature of the trend, it is not possible to break up integrated series into a deterministic (predictable) [[trend-stationary process|trend]] and a stationary series containing deviations from trend. Even in deterministically detrended [[random walk]]s, spurious correlations will eventually emerge. Thus detrending does not solve the estimation problem.

In order to still use the [[Box–Jenkins|Box–Jenkins approach]], one could difference the series and then estimate models such as [[ARIMA]], given that many commonly used time series (e.g. in economics) appear to be stationary in first differences. Forecasts from such a model will still reflect cycles and seasonality that are present in the data. However, any information about long-run adjustments that the data in levels may contain is omitted, and longer-term forecasts will be unreliable.
 
This led [[John Denis Sargan|Sargan]] (1964) to develop the ECM methodology, which retains the level information.<ref>Sargan, J. D. (1964). "Wages and Prices in the United Kingdom: A Study in Econometric Methodology", 16, 25–54. in ''Econometric Analysis for National Economic Planning'', ed. by P. E. Hart, G. Mills, and J. N. Whittaker. London: Butterworths</ref><ref>{{cite journal |last1=Davidson |first1=J. E. H. |first2=D. F. |last2=Hendry |author-link2=David Forbes Hendry |first3=F. |last3=Srba |first4=J. S. |last4=Yeo |year=1978 |title=Econometric modelling of the aggregate time-series relationship between consumers' expenditure and income in the United Kingdom |journal=[[Economic Journal]] |volume=88 |issue=352 |pages=661–692 |doi=10.2307/2231972 |jstor=2231972 }}</ref>
 
==Estimation==
Several methods are known in the literature for estimating a refined dynamic model as described above. Among these are the [[Robert F. Engle|Engle]] and Granger 2-step approach, estimating their ECM in one step and the vector-based VECM using [[Johansen test|Johansen's method]].<ref>{{cite journal |last1=Engle |first1=Robert F. |last2=Granger |first2=Clive W. J. |year=1987 |title=Co-integration and error correction: Representation, estimation and testing |journal=[[Econometrica]] |volume=55 |issue=2 |pages=251–276 |doi=10.2307/1913236 |jstor=1913236 |url=http://pe.cemi.rssi.ru/pe_2015_3_106-135.pdf }}</ref>
 
===Engle and Granger 2-step approach===
The first step of this method is to pretest the individual time series in order to confirm that they are [[Stationary process|non-stationary]] in the first place. This can be done with the standard [[Dickey–Fuller test|Dickey–Fuller (DF)]] [[unit root]] test or with the [[ADF test|augmented Dickey–Fuller (ADF) test]], which addresses the problem of serially correlated errors.
Take the case of two different series <math>x_t</math> and <math>y_t</math>. If both are I(0), standard regression analysis will be valid. If they are integrated of a different order, e.g. one being I(1) and the other being I(0), one has to transform the model.
 
 
: <math> A(L) \, \Delta y_t = \gamma + B(L) \, \Delta x_t + \alpha (y_{t-1} -\beta_0 - \beta_1 x_{t-1} ) + \nu_t. </math>
 
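Here <math>A(L)</math> and <math>B(L)</math> denote polynomials in the [[lag operator]] <math>L</math> (with <math>L y_t = y_{t-1}</math>), which collect any further lagged differences of <math>y_t</math> and <math>x_t</math>.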
 
''If'' both variables are integrated and this ECM exists, they are cointegrated by the Engle–Granger representation theorem.
 
The second step is then to estimate the model using [[ordinary least squares]]: <math> y_t = \beta_0 + \beta_1 x_t + \varepsilon_t </math>
If the regression is not spurious as determined by the test criteria described above, [[ordinary least squares]] will not only be valid, but also [[consistent estimator|consistent]] (Stock, 1987).
Then the predicted residuals <math>\hat{\varepsilon}_t = y_t - \hat{\beta}_0 - \hat{\beta}_1 x_t </math> from this regression are saved and used in a regression of differenced variables plus a lagged error term
 
 
One can then test for cointegration using a standard [[t-statistic]] on <math>\alpha</math>.
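The two regressions of this approach can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the data are simulated from an assumed ECM with <math>\beta_0 = 1</math>, <math>\beta_1 = 0.9</math>, <math>\alpha = -0.2</math> and a short-run coefficient of 0.5, the lag polynomials are reduced to <math>A(L) = 1</math> and a constant <math>B(L)</math>, the unit root pretests are omitted, and the helper names <code>ols</code>, <code>simulate_cointegrated</code> and <code>engle_granger_two_step</code> are hypothetical.

```python
import random


def ols(columns, y):
    """Least-squares coefficients of y on the regressor columns.

    Solves the normal equations by Gauss-Jordan elimination with pivoting.
    """
    k = len(columns)
    a = [
        [sum(u * v for u, v in zip(ci, cj)) for cj in columns]
        + [sum(u * v for u, v in zip(ci, y))]
        for ci in columns
    ]
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[p] = a[p], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [vr - f * vi for vr, vi in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]


def simulate_cointegrated(n, seed=0):
    """Simulate a random walk x and a series y tied to it by an assumed ECM."""
    rng = random.Random(seed)
    x, y = [0.0], [1.0]  # start on the assumed long-run relation y = 1 + 0.9 x
    for _ in range(n - 1):
        dx = rng.gauss(0.0, 1.0)
        gap = y[-1] - 1.0 - 0.9 * x[-1]  # last period's equilibrium error
        dy = 0.5 * dx - 0.2 * gap + rng.gauss(0.0, 0.5)
        x.append(x[-1] + dx)
        y.append(y[-1] + dy)
    return x, y


def engle_granger_two_step(x, y):
    """Levels regression, then a regression in differences on the lagged residual."""
    n = len(y)
    ones = [1.0] * n
    # Levels regression: y_t = beta0 + beta1 * x_t + e_t
    beta0, beta1 = ols([ones, x], y)
    resid = [yt - beta0 - beta1 * xt for xt, yt in zip(x, y)]
    # Differences regression: dy_t = gamma + b * dx_t + alpha * resid_{t-1} + v_t
    dy = [y[t] - y[t - 1] for t in range(1, n)]
    dx = [x[t] - x[t - 1] for t in range(1, n)]
    gamma, short_run, alpha = ols([ones[1:], dx, resid[:-1]], dy)
    return {"beta0": beta0, "beta1": beta1, "gamma": gamma,
            "short_run": short_run, "alpha": alpha}


if __name__ == "__main__":
    x, y = simulate_cointegrated(5000, seed=0)
    for name, value in engle_granger_two_step(x, y).items():
        print(f"{name}: {value:.3f}")
```

With a long simulated sample, the estimates of <math>\beta_1</math>, the short-run coefficient and <math>\alpha</math> should lie close to the assumed values 0.9, 0.5 and −0.2.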
While this approach is easy to apply, it has numerous problems:
 
* The univariate unit root tests used in the first stage have low [[statistical power]]
 
===An example of ECM===
The idea of cointegration may be demonstrated in a simple macroeconomic setting. Suppose consumption <math>C_t</math> and disposable income <math>Y_t</math> are macroeconomic time series that are related in the long run (see [[Permanent income hypothesis]]). Specifically, let the [[average propensity to consume]] be 90%, that is, in the long run <math>C_t = 0.9 Y_t</math>. From the econometrician's point of view, this long-run relationship (also known as cointegration) exists if the errors from the regression <math>C_t = \beta Y_t+\varepsilon_t</math> are a [[Stationary process|stationary]] series, although <math>Y_t</math> and <math>C_t</math> are non-stationary. Suppose also that if <math>Y_t</math> suddenly changes by <math>\Delta Y_t</math>, then <math>C_t</math> changes by <math>\Delta C_t = 0.5 \, \Delta Y_t</math>, that is, the [[marginal propensity to consume]] equals 50%. Our final assumption is that the gap between current and equilibrium consumption decreases each period by 20%.
 
In this setting a change <math>\Delta C_t = C_t - C_{t-1}</math> in consumption level can be modelled as <math>\Delta C_t = 0.5 \, \Delta Y_t - 0.2 (C_{t-1}-0.9 Y_{t-1}) +\varepsilon_t</math>. The first term on the right-hand side describes the short-run impact of a change in <math>Y_t</math> on <math>C_t</math>, the second term explains the long-run gravitation towards the equilibrium relationship between the variables, and the third term reflects random shocks that the system receives (e.g. shocks to consumer confidence that affect consumption). To see how the model works, consider two kinds of shocks: permanent and transitory (temporary). For simplicity, let <math>\varepsilon_t</math> be zero for all ''t''. Suppose that in period ''t''&nbsp;−&nbsp;1 the system is in equilibrium, i.e. <math>C_{t-1} = 0.9 Y_{t-1}</math>. Suppose that in period ''t'' disposable income <math>Y_t</math> increases by 10 and then returns to its previous level. Then <math>C_t</math> first (in period ''t'') increases by 5 (half of 10), but from the following period onward it decreases and converges to its initial level. In contrast, if the shock to <math>Y_t</math> is permanent, then <math>C_t</math> slowly converges to a value that exceeds the initial <math>C_{t-1}</math> by&nbsp;9.
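The two shock scenarios can be traced numerically by iterating the model equation with <math>\varepsilon_t = 0</math>. In the following Python sketch the starting levels <math>Y_{t-1} = 100</math> and <math>C_{t-1} = 90</math>, the horizon of 60 periods and the function name <code>consumption_path</code> are assumptions chosen for illustration; only the coefficients 0.5, 0.2 and 0.9 come from the example.

```python
def consumption_path(income, c0, y0, mpc=0.5, adjustment=0.2, apc=0.9):
    """Iterate dC_t = mpc * dY_t - adjustment * (C_{t-1} - apc * Y_{t-1}).

    `income` lists Y_t for each simulated period; c0 and y0 are the levels
    in the period before the shock. Random shocks are set to zero.
    """
    c_prev, y_prev, path = c0, y0, []
    for y in income:
        gap = c_prev - apc * y_prev  # last period's deviation from equilibrium
        c = c_prev + mpc * (y - y_prev) - adjustment * gap
        path.append(c)
        c_prev, y_prev = c, y
    return path


periods = 60  # assumed horizon, long enough for the gap to die out

# Transitory shock: income is 10 higher for one period, then returns to 100.
transitory = consumption_path([110.0] + [100.0] * (periods - 1), c0=90.0, y0=100.0)

# Permanent shock: income stays 10 higher in every period.
permanent = consumption_path([110.0] * periods, c0=90.0, y0=100.0)

print("transitory:", [round(c, 2) for c in transitory[:4]], "...", round(transitory[-1], 2))
print("permanent: ", [round(c, 2) for c in permanent[:4]], "...", round(permanent[-1], 2))
```

Under these assumed starting levels, consumption jumps from 90 to 95 in both scenarios. After the transitory shock it falls to about 90.8 in the next period and then drifts back to 90, whereas after the permanent shock it keeps rising towards 99, which is 9 above its initial level.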
 
This structure is common to all ECMs. In practice, econometricians often first estimate the cointegration relationship (equation in levels), and then insert it into the main model (equation in differences).
 
==Further reading==
* {{cite book |last1=Dolado |first1=Juan J. |last2=Gonzalo |first2=Jesús |last3=Marmol |first3=Francesc |chapter=Cointegration |pages=[https://archive.org/details/companiontotheor00balt/page/n646 634]–654 |title=A Companion to Theoretical Econometrics |url=https://archive.org/details/companiontotheor00balt |url-access=limited |editor-first=Badi H. |editor-last=Baltagi |location=Oxford |publisher=Blackwell |year=2001 |isbn=0-631-21254-X |doi=10.1002/9780470996249.ch31 }}
* {{cite book |first=Walter |last=Enders |title=Applied Econometric Time Series |edition=Third |location=New York |publisher=John Wiley & Sons |year=2010 |isbn=978-0-470-50539-7 |pages=272–355 }}
* {{cite book |last=Lütkepohl |first=Helmut |author-link=Helmut Lütkepohl |title=New Introduction to Multiple Time Series Analysis |url=https://archive.org/details/newintroductiont00ltke |url-access=limited |location=Berlin |publisher=Springer |year=2006 |isbn=978-3-540-26239-8 |pages=[https://archive.org/details/newintroductiont00ltke/page/n251 237]–352 }}
* {{cite book |last1=Martin |first1=Vance |last2=Hurn |first2=Stan |last3=Harris |first3=David |title=Econometric Modelling with Time Series |location=New York |publisher=Cambridge University Press |year=2013 |isbn=978-0-521-13981-6 |pages=662–711 }}
 
[[Category:Error detection and correction]]