Box–Jenkins method

{{Short description|Method to find best fit of a time-series model}}
In [[time series analysis]], the '''Box–Jenkins method''',<ref>{{cite book |last1=Box |first1=George |last2=Jenkins |first2=Gwilym |year=1970 |title=Time Series Analysis: Forecasting and Control |url=https://archive.org/details/timeseriesanalys0000boxg |url-access=registration |location=San Francisco |publisher=Holden-Day }}</ref> named after the [[statistician]]s [[George Box]] and [[Gwilym Jenkins]], applies [[autoregressive moving average]] (ARMA) or [[autoregressive integrated moving average]] (ARIMA) models to find the best fit of a time-series model to past values of a [[time series]].
 
==Modeling approach==
The original model uses an iterative three-stage modeling approach:
 
#''[[Model identification]] and [[model selection]]'': making sure that the variables are [[stationary process|stationary]], identifying [[seasonality]] in the dependent series (seasonally differencing it if necessary), and using plots of the [[autocorrelation|autocorrelation (ACF)]] and [[partial autocorrelation|partial autocorrelation (PACF)]] functions of the dependent time series to decide which (if any) autoregressive or moving average component should be used in the model.
#''[[Parameter estimation]]'' using computational algorithms to arrive at coefficients that best fit the selected ARIMA model. The most common methods use [[maximum likelihood estimation]] or [[non-linear least-squares estimation]].
#''[[Statistical model validation|Statistical model checking]]'' by testing whether the estimated model conforms to the specifications of a stationary univariate process. In particular, the residuals should be independent of each other and constant in mean and variance over time. (Plotting the mean and variance of residuals over time and performing a [[Ljung–Box test]] or plotting autocorrelation and partial autocorrelation of the residuals are helpful to identify misspecification.) If the estimation is inadequate, one has to return to step one and attempt to build a better model.
The data they used were from a gas furnace. These data are well known as the Box and Jenkins gas furnace data for benchmarking predictive models.
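
The residual check in the third stage can be illustrated with a minimal Python sketch of the Ljung–Box statistic, <math>Q = n(n+2)\sum_{k=1}^{h} r_k^2/(n-k)</math>, where <math>r_k</math> is the sample autocorrelation of the residuals at lag ''k''. The function names <code>sample_acf</code> and <code>ljung_box_q</code> are illustrative and not taken from any library, and the sketch computes only the statistic, not a p-value. Under the usual assumptions, ''Q'' computed on the residuals of an ARMA(''p'',''q'') fit is compared with a chi-squared distribution with ''h'' − ''p'' − ''q'' degrees of freedom.

```python
def sample_acf(x, max_lag):
    """Return the sample autocorrelations r_1..r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares (normaliser)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..h} r_k^2 / (n - k)."""
    n = len(residuals)
    r = sample_acf(residuals, max_lag)
    return n * (n + 2) * sum(rk * rk / (n - k) for k, rk in enumerate(r, start=1))


# Strongly autocorrelated "residuals" (alternating sign) give a large Q.
q = ljung_box_q([1.0, -1.0] * 5, max_lag=1)
print(round(q, 3))  # 10.8
```

A large ''Q'' relative to the reference distribution suggests that the residuals are still autocorrelated, which in the iterative scheme above sends the analyst back to the identification stage.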
 
Commandeur & Koopman (2007, §10.4)<ref>{{cite book |last1=Commandeur |first1=J. J. F. |last2=Koopman |first2=S. J. |year=2007 |title=Introduction to State Space Time Series Analysis |location= |publisher=[[Oxford University Press]] |isbn= }}</ref> argue that the Box–Jenkins approach is fundamentally problematic. The problem arises because in "the economic and social fields, real series are never stationary however much differencing is done". Thus the investigator has to face the question: how close to stationary is close enough? As the authors note, "This is a hard question to answer". The authors further argue that rather than using Box–Jenkins, it is better to use state space methods, as stationarity of the time series is then not required.
 
==Box–Jenkins model identification==
 
===Stationarity and seasonality===
The first step in developing a Box–Jenkins model is to determine whether the [[time series]] is [[Stationary process|stationary]] and whether there is any significant [[seasonality]] that needs to be modelled.
 
====Detecting stationarity====
Stationarity can be assessed from a [[run sequence plot]]. The run sequence plot should show constant location and [[Scale (ratio)|scale]]. It can also be detected from an [[autocorrelation plot]]. Specifically, non-stationarity is often indicated by an autocorrelation plot with very slow decay. One can also use a [[Dickey-Fuller test]] or [[Augmented Dickey-Fuller test]].
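
The slow-decay signature can be seen in a small Python sketch that compares the sample autocorrelations of white noise with those of a random walk built from the same noise. The helper <code>sample_acf</code> and the seed are illustrative choices, not part of any library or of the original method. As a rough guide, autocorrelations of white noise are expected to lie mostly within about ±2/√''n''.

```python
import random


def sample_acf(x, max_lag):
    """Return the sample autocorrelations r_1..r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
noise = [rng.gauss(0.0, 1.0) for _ in range(500)]  # stationary series

walk = []  # random walk: cumulative sum of the noise (non-stationary)
level = 0.0
for e in noise:
    level += e
    walk.append(level)

acf_noise = sample_acf(noise, 20)
acf_walk = sample_acf(walk, 20)

# The noise autocorrelations stay near zero; the walk's decay only slowly.
for lag in (1, 5, 10, 20):
    print(lag, round(acf_noise[lag - 1], 2), round(acf_walk[lag - 1], 2))
```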
 
====Detecting seasonality====
Seasonality (or periodicity) can usually be assessed from an autocorrelation plot, a [[seasonal subseries plot]], or a [[spectral plot]].
 
====Differencing to achieve stationarity====
Box and Jenkins recommend the differencing approach to achieve stationarity. However, [[curve fitting|fitting a curve]] and subtracting the fitted values from the original data can also be used in the context of Box–Jenkins models.
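
As a minimal illustration of differencing, the following Python sketch applies the operation <math>y_t = x_t - x_{t-1}</math> to a series with a quadratic trend. The function name <code>difference</code> is an illustrative choice. One pass leaves a linear trend and a second pass leaves a constant, which corresponds to ''d'' = 2 in ARIMA notation.

```python
def difference(x, lag=1):
    """Return the lag-differenced series x[t] - x[t - lag]."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]


series = [t * t for t in range(6)]  # quadratic trend: 0, 1, 4, 9, 16, 25
first = difference(series)          # linear trend remains
second = difference(first)          # constant: the trend has been removed

print(first)   # [1, 3, 5, 7, 9]
print(second)  # [2, 2, 2, 2]
```

Each differencing pass shortens the series by one observation, so over-differencing costs data as well as introducing unnecessary moving average structure.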
 
====Seasonal differencing====
At the model identification stage, the goal is to detect seasonality, if it exists, and to identify the order for the seasonal autoregressive and seasonal moving average terms. For many series, the period is known and a single seasonality term is sufficient. For example, for monthly data one would typically include either a seasonal AR 12 term or a seasonal MA 12 term. For Box–Jenkins models, one does not explicitly remove seasonality before fitting the model. Instead, one includes the order of the seasonal terms in the model specification to the [[ARIMA]] estimation software. However, it may be helpful to apply a seasonal difference to the data and regenerate the autocorrelation and partial autocorrelation plots. This may help in the model identification of the non-seasonal component of the model. In some cases, the seasonal differencing may remove most or all of the seasonality effect.
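
A seasonal difference subtracts the observation one full period earlier, <math>y_t = x_t - x_{t-s}</math>. The Python sketch below uses a made-up series with period 4 (a repeating pattern plus a linear trend) purely for illustration; the function name <code>seasonal_difference</code> and the numbers are assumptions, not data from the source.

```python
def seasonal_difference(x, period):
    """Return the seasonally differenced series x[t] - x[t - period]."""
    return [x[t] - x[t - period] for t in range(period, len(x))]


# Hypothetical series: linear trend plus a repeating pattern of period 4.
pattern = [10, 0, -5, 3]
series = [t + pattern[t % 4] for t in range(16)]

deseasonalised = seasonal_difference(series, period=4)
print(deseasonalised)  # [4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
```

In this idealised case the seasonal difference removes the periodic pattern entirely and turns the linear trend into a constant; with real data, the autocorrelation and partial autocorrelation plots of the differenced series would then be inspected for the remaining non-seasonal structure.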
 
 
===Identify ''p'' and ''q''===
Once stationarity and seasonality have been addressed, the next step is to identify the order (i.e. the ''p'' and ''q'') of the autoregressive and moving average terms. Different authors have different approaches for identifying ''p'' and ''q''. Brockwell and Davis (1991)<ref>{{cite book |last1=Brockwell |first1=Peter J. |last2=Davis |first2=Richard A. |year=1991 |title=Time Series: Theory and Methods |publisher=Springer-Verlag |page=273|bibcode=1991tstm.book.....B }}</ref> state "our prime criterion for model selection [among ARMA(p,q) models] will be the AICc", i.e. the [[Akaike information criterion]] with correction. Other authors use the autocorrelation plot and the partial autocorrelation plot, described below.
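
Selection by AICc can be sketched in Python as follows. The sketch uses the general small-sample form <math>\mathrm{AICc} = -2\ln L + 2k + \tfrac{2k(k+1)}{n-k-1}</math> and assumes that ''k'' = ''p'' + ''q'' + 1 parameters are counted for an ARMA(''p'',''q'') model (the coefficients plus the noise variance). The log-likelihood values are hypothetical placeholders standing in for the output of an estimation routine; they are not results from any real data set.

```python
def aicc(log_likelihood, k, n):
    """Corrected Akaike information criterion for k parameters, n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return -2.0 * log_likelihood + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)


n_obs = 60  # hypothetical series length

# Hypothetical maximised log-likelihoods for candidate ARMA(p, q) orders.
candidates = {(1, 0): -105.2, (1, 1): -103.9, (2, 1): -103.8}

# Assumption: k = p + q + 1 (AR and MA coefficients plus the noise variance).
scores = {
    order: aicc(loglik, k=order[0] + order[1] + 1, n=n_obs)
    for order, loglik in candidates.items()
}

best = min(scores, key=scores.get)  # smallest AICc is preferred
print(best)  # (1, 1)
```

In this made-up example the ARMA(2,1) candidate has the highest likelihood but is penalised for its extra parameter, so the ARMA(1,1) candidate is preferred.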
 
====Autocorrelation and partial autocorrelation plots====
| Autoregressive model. Use the partial autocorrelation plot to help identify the order.
|-
! One or more spikes, rest are essentially zero (or close to zero)
| [[Moving average model]], order identified by where plot becomes zero.
|-
| Include seasonal autoregressive term.
|-
! No decay to zero (or it decays extremely slowly)
| Series is not stationary.
|}
 
Hyndman & Athanasopoulos suggest the following:<ref>{{cite book|last1=Hyndman|first1=Rob J|last2=Athanasopoulos|first2=George|title=Forecasting: principles and practice|url=https://www.otexts.org/fpp/8/5|access-date=18 May 2015}}</ref>
 
:The data may follow an ARIMA(''p'',''d'',0) model if the ACF and PACF plots of the differenced data show the following patterns:
 
==Further reading==
* {{citation | title= Comparison of Box–Jenkins and objective methods for determining the order of a non-seasonal ARMA model | author1-first= S. | author1-last= Beveridge | author2-first= C. | author2-last= Oickle | journal= [[Journal of Forecasting]] | year= 1994 | volume= 13 | issue= 5 | pages= 419–434 | doi= 10.1002/for.3980130502}}
* {{citation |last=Pankratz |first=Alan |year=1983 |title=Forecasting with Univariate Box–Jenkins Models: Concepts and Cases |publisher= [[John Wiley & Sons]] }}
 
==External links==
* [https://web.archive.org/web/20070318000551/http://statistik.mathematik.uni-wuerzburg.de/timeseries/ A First Course on Time Series Analysis] – an open source book on time series analysis with SAS (Chapter 7)
* [http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc445.htm Box–Jenkins models] in the Engineering Statistics Handbook of [[NIST]]
* [http://robjhyndman.com/papers/BoxJenkins.pdf Box–Jenkins modelling] by Rob J Hyndman
 
{{NIST-PD}}
{{Authority control}}
 
{{DEFAULTSORT:Box-Jenkins}}