Partial autocorrelation function: Difference between revisions

Change image of ACF vs PACF with only PACF since ACF wasn't relevant to article text
Added headers for contents, updated pacf picture caption, and tweaked math formulas
This function plays an important role in data analysis aimed at identifying the extent of the lag in an [[autoregressive model]]. The use of this function was introduced as part of the [[Box–Jenkins]] approach to time series modelling, whereby, by plotting the partial autocorrelation function, one can determine the appropriate lag order '''p''' in an AR('''p''') [[autoregressive model|model]] or in an extended [[Autoregressive integrated moving average|ARIMA]]('''p''','''d''','''q''') model.
 
== Definition ==
 
Given a time series <math>z_t</math>, the partial autocorrelation of lag <math>k</math>, denoted <math>\phi_{k,k}</math>, is the [[autocorrelation]] between <math>z_t</math> and <math>z_{t+k}</math> with the linear dependence of <math>z_t</math> on <math>z_{t+1}</math> through <math>z_{t+k-1}</math> removed. Equivalently, it is the autocorrelation between <math>z_t</math> and <math>z_{t+k}</math> that is not accounted for by lags <math>1</math> through <math>k-1</math>, inclusive.<math display="block">\phi_{1,1} = \operatorname{corr}(z_{t+1}, z_{t}),\text{ for }k= 1,</math><math display="block">\phi_{k,k} = \operatorname{corr}(z_{t+k} - \hat{z}_{t+k},\, z_{t} - \hat{z}_{t}),\text{ for }k\geq 2,</math>
 
 
where <math>\hat{z}_{t+k} = \beta_1 z_{t+k-1} + \beta_2 z_{t+k-2} + \cdots + \beta_{k-1} z_{t+1}</math> is the [[linear combination]] of <math>\{z_{t+k-1}, z_{t+k-2}, \ldots, z_{t+1}\}</math> that minimizes the [[mean squared error]], <math>\operatorname{E}\left[(z_{t+k} - \hat{z}_{t+k})^2\right]</math>. Similarly, <math>\hat{z}_t = \beta_1 z_{t+1} + \beta_2 z_{t+2} + \cdots + \beta_{k-1} z_{t+k-1} </math> is the linear combination minimizing <math>\operatorname{E}\left[(z_t - \hat{z}_t)^2\right]</math>. For [[Stationary process|stationary processes]], the coefficients <math>\beta_1, \beta_2, \ldots, \beta_{k-1} </math> are the same in both expressions.<ref>{{Cite book |last=Shumway |first=Robert H. |url=http://link.springer.com/10.1007/978-3-319-52452-8 |title=Time Series Analysis and Its Applications: With R Examples |last2=Stoffer |first2=David S. |date=2017 |publisher=Springer International Publishing |isbn=978-3-319-52451-1 |series=Springer Texts in Statistics |location=Cham |pages=97–98 |language=en |doi=10.1007/978-3-319-52452-8}}</ref>
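
As a worked illustration of the definition, consider lag <math>k = 2</math>. This is a sketch that assumes a zero-mean stationary series and writes <math>\rho(j)</math> for its autocorrelation at lag <math>j</math>. The best linear predictor of either <math>z_{t+2}</math> or <math>z_t</math> from <math>z_{t+1}</math> alone is then <math>\rho(1) z_{t+1}</math>, so that<math display="block">\phi_{2,2} = \operatorname{corr}(z_{t+2} - \rho(1) z_{t+1},\, z_{t} - \rho(1) z_{t+1}) = \frac{\rho(2) - \rho(1)^2}{1 - \rho(1)^2}.</math>For an AR(1) process, <math>\rho(j) = \rho(1)^j</math>, so the numerator vanishes and <math>\phi_{2,2} = 0</math>.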
 
== Calculation ==
The theoretical partial autocorrelation function of a stationary time series can be calculated by using the [[Levinson-Durbin|Levinson–Durbin Algorithm]], solving iteratively for increasing lags and starting from <math>\phi_{1,1} = \rho(1)</math>:<math display="block">\phi_{n,n} = \frac{\rho(n) - \sum_{k=1}^{n-1} \phi_{n-1, k} \rho(n - k)}{1 - \sum_{k=1}^{n-1} \phi_{n-1, k} \rho(k) }</math>where <math>\phi_{n,k} = \phi_{n-1, k} - \phi_{n,n} \phi_{n-1,n-k}</math> for <math>1 \leq k \leq n - 1</math> and <math>\rho(n)</math> is the autocorrelation function.<ref>{{Cite journal |last=Durbin |first=J. |date=1960 |title=The Fitting of Time-Series Models |url=https://www.jstor.org/stable/1401322 |journal=Revue de l'Institut International de Statistique / Review of the International Statistical Institute |volume=28 |issue=3 |pages=233–244 |doi=10.2307/1401322 |issn=0373-1138}}</ref><ref>{{Cite book |last=Shumway |first=Robert H. |url=http://link.springer.com/10.1007/978-3-319-52452-8 |title=Time Series Analysis and Its Applications: With R Examples |last2=Stoffer |first2=David S. |date=2017 |publisher=Springer International Publishing |isbn=978-3-319-52451-1 |series=Springer Texts in Statistics |location=Cham |pages=103–104 |language=en |doi=10.1007/978-3-319-52452-8}}</ref><ref>{{Cite book |last=Enders |first=Walter |url=https://www.worldcat.org/oclc/52387978 |title=Applied econometric time series |date=2004 |publisher=J. Wiley |isbn=0-471-23065-0 |edition=2nd |location=Hoboken, NJ |pages=65–67 |language=en |oclc=52387978}}</ref>
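
The recursion above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name <code>pacf_from_acf</code> and the convention that the input list starts with <math>\rho(0) = 1</math> are assumptions made here and do not come from the cited sources.

```python
def pacf_from_acf(rho):
    """Return [phi_11, phi_22, ..., phi_KK] from autocorrelations.

    rho is a list with rho[0] == 1 and rho[k] the autocorrelation at
    lag k (a hypothetical input convention chosen for this sketch).
    """
    pacf = []
    prev = []  # prev[k - 1] holds phi_{n-1,k} from the previous step
    for n in range(1, len(rho)):
        # Numerator and denominator of the formula for phi_{n,n};
        # both sums are empty when n == 1, giving phi_{1,1} = rho(1).
        num = rho[n] - sum(prev[k - 1] * rho[n - k] for k in range(1, n))
        den = 1.0 - sum(prev[k - 1] * rho[k] for k in range(1, n))
        phi_nn = num / den
        # Update rule: phi_{n,k} = phi_{n-1,k} - phi_{n,n} * phi_{n-1,n-k}
        cur = [prev[k - 1] - phi_nn * prev[n - k - 1] for k in range(1, n)]
        cur.append(phi_nn)
        pacf.append(phi_nn)
        prev = cur
    return pacf


# AR(1) with coefficient 0.5 has autocorrelation rho(k) = 0.5 ** k,
# so only the lag-1 partial autocorrelation is non-zero.
print(pacf_from_acf([1.0, 0.5, 0.25, 0.125]))  # prints [0.5, 0.0, 0.0]
```

Only the coefficients from the previous lag are kept between iterations, so the memory needed grows linearly with the largest lag requested.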
 
There are algorithms for estimating the partial autocorrelation from the sample autocorrelations. In particular, the formula above can be applied to the sample autocorrelations to obtain the sample partial autocorrelation function of any given time series.<ref name=":0">{{Cite book |last=Box |first=George E. P. |title=Time Series Analysis: Forecasting and Control |last2=Reinsel |first2=Gregory C. |last3=Jenkins |first3=Gwilym M. |publisher=John Wiley |year=2008 |isbn=9780470272848 |edition=4th |location=Hoboken, New Jersey |language=en}}</ref><ref>{{Cite book |last=Brockwell |first=Peter J. |title=Time Series: Theory and Methods |last2=Davis |first2=Richard A. |publisher=Springer |year=1991 |isbn=9781441903198 |edition=2nd |location=New York, NY |pages=102, 243–245 |language=en}}</ref>
 
== Autoregressive Model Identification ==
 
[[File:Partial Autocorrelation Function Graph.png|alt=The partial autocorrelation graph has 3 spikes and the rest is close to 0.|thumb|Sample partial autocorrelation function of a simulated AR(3) time series]]
 
Partial autocorrelation plots are a commonly used tool for identifying the order of an [[autoregressive model]].<ref name=":0" /> The partial autocorrelation of an AR(''p'') process is zero at lag <math>p+1</math> and greater. If the sample autocorrelation plot indicates that an AR model may be appropriate, then the sample partial autocorrelation plot is examined to help identify the order. One looks for the point on the plot where the partial autocorrelations for all higher lags are essentially zero. Placing on the plot an indication of the sampling uncertainty of the sample PACF is helpful for this purpose: this is usually constructed on the basis that the true value of the PACF, at any given positive lag, is zero. This can be formalised as described below.
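
The identification procedure can be illustrated with a short Python sketch. It simulates an AR(2) series, computes the sample partial autocorrelations from the sample autocorrelations, and flags the lags whose values fall outside an approximate 95% band. The band <math>\pm 1.96/\sqrt{n}</math> is the usual large-sample approximation under the hypothesis that the true partial autocorrelation is zero and is an assumption of this sketch, as are all function names, the coefficients 0.5 and 0.3, and the random seed.

```python
import math
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations r(0), ..., r(max_lag) of the list x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]


def pacf_from_acf(rho):
    """Partial autocorrelations via the recursion on rho (rho[0] == 1)."""
    pacf, prev = [], []
    for n in range(1, len(rho)):
        num = rho[n] - sum(prev[k - 1] * rho[n - k] for k in range(1, n))
        den = 1.0 - sum(prev[k - 1] * rho[k] for k in range(1, n))
        phi_nn = num / den
        prev = [prev[k - 1] - phi_nn * prev[n - k - 1]
                for k in range(1, n)] + [phi_nn]
        pacf.append(phi_nn)
    return pacf


def simulate_ar2(phi1, phi2, n, seed=0, burn_in=200):
    """Simulate z_t = phi1 z_{t-1} + phi2 z_{t-2} + Gaussian noise."""
    rng = random.Random(seed)
    z = [0.0, 0.0]
    for _ in range(n + burn_in):
        z.append(phi1 * z[-1] + phi2 * z[-2] + rng.gauss(0.0, 1.0))
    return z[-n:]  # drop the start-up values


series = simulate_ar2(0.5, 0.3, n=5000)
pacf = pacf_from_acf(sample_acf(series, max_lag=10))
bound = 1.96 / math.sqrt(len(series))  # approximate 95% band
significant = [k + 1 for k, v in enumerate(pacf) if abs(v) > bound]
print("lags outside the band:", significant)
```

For an AR(2) series the first two lags would be expected to fall clearly outside the band. Each higher lag can still exceed it by chance roughly one time in twenty, so an isolated exceedance at a high lag is not by itself evidence of a larger order.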