UNDERSTANDING AND IDENTIFICATION OF ARIMA MODELS

THIS SEMINAR SECTION EXPLAINS, IN SIMPLE TERMS, SOME OF THE WHYS AND WHEREFORES OF SIMPLE AUTO-PROJECTIVE MODELS.

Time series analysis attempts to match the observable patterns in the data to an underlying model or sequence of models. These models may be identical or different for distinct ranges of time. Even if the model form is identical, the model parameters may be different. It is possible, and even probable, that the variance of the series is not constant. If it is not constant, it may be related to the level of the series, may simply be evidence of major changes, or may even evolve stochastically over time.

Time series modelling has three major cases. The first and simplest is a single endogenous (dependent) variable that may or may not be affected by unusual events. The second extends the first by including input (causal) variables that may have a role in predicting or modelling this one dependent variable. The third allows for the simultaneous modelling of multiple dependent series. Just as unexpected interventions may have affected the Y variable in case 1, these same Intervention Variables might also impact cases 2 and 3.

The modeling skills required to do this are rare, and the work is labor intensive. Thus it is natural to develop and hone methods that aid or automate the process. AUTOMATIC FORECASTING SYSTEMS has been doing this since 1976.

The fundamental objective of AUTOBOX is to aid the researcher in developing candidate models by comparing the theoretical autocorrelation structures of viable alternatives with the observed state of nature, i.e. the actual autocorrelation structure. Test statistics are used to find the best match, thus suggesting a model that approximates the true but unknown model.

Statisticians can develop the theoretical fingerprints for different underlying models. What is done in practice is to follow the advice often attributed to Yogi Berra: "You can observe a lot by simply watching." We observe the data, characterize it, and then attempt to match it to alternative states of nature. Sardonically, this is sometimes referred to as Hypothesis Generation.

The alternative approach, none too pretty I might add, is to assume knowledge of the underlying model before the data is actually collected and analyzed.

For example, if the underlying model is a simple autoregressive model of order 1 with a parameter of .8, we can write it as Y(t) = .8*Y(t-1) + a(t), where a(t) is a random (white noise) error term.

This model has a theoretical autocorrelation function that follows a simple recursive, geometrically decaying shape: the autocorrelation at lag k is .8 raised to the power k. If we observe a time series with a similarly shaped autocorrelation function, then we may have found a match and can set out to actually test a statistical hypothesis.
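
As a quick supplementary illustration (not part of the original seminar), the sketch below uses Python with numpy and statsmodels, purely as stand-ins, to confirm that the theoretical ACF of an AR(1) with coefficient .8 decays geometrically:

import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

phi = 0.8
# lag-polynomial convention: the AR part is written as 1 - 0.8*B
theoretical = ArmaProcess(np.array([1, -phi]), np.array([1])).acf(lags=6)
print(theoretical)              # [1.0, 0.8, 0.64, 0.512, 0.4096, 0.32768]
print(phi ** np.arange(6))      # the same geometric sequence, computed directly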

We will now attempt to apply this paradigm to an actual time series. We show a plot of the original time series and ponder how to characterize these values in order to uncover the underlying pattern or scheme.

As a first cut, we can compute various basic statistics such as the mean and variance. The problem is that these statistics don't speak to the underlying memory structure or auto-dependence between successive values.

The sample autocorrelation measures the auto-dependence, or internal predictability, between values a given number of periods apart. For example, this series has a simple correlation of .764 between values one period apart. Additionally, the simple correlation between values two periods apart is .315. Our task will be to attempt to match this with a good candidate. In this regard, we are behaving like a "yenta" trying to arrange a good match.

There is a statistical tool, analogous to a partial regression coefficient, called the partial autocorrelation function (PACF), which measures the importance of additional or auxiliary lags. For example, over and above the one-period auto-dependency reflected in the lag-one value of .764, there is evidence for an additional dependency at lag two, since the PACF value of -.645 is quite significant. Note that subsequent auxiliary lags are not significant (-.051, -.193, -.101, etc.).
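
To make these ACF/PACF fingerprints concrete, here is a small, purely illustrative sketch. Since the seminar's actual data is not reproduced here, a series is simulated from an assumed AR(2) whose theoretical lag-1 and lag-2 correlations are close to the .764 and .315 quoted above; Python and statsmodels are the assumed tools, not AUTOBOX:

import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.stattools import acf, pacf

np.random.seed(0)
# assumed generating model: Y(t) = 1.26*Y(t-1) - .645*Y(t-2) + a(t)
y = ArmaProcess(np.array([1, -1.26, 0.645]), np.array([1])).generate_sample(
    nsample=500, burnin=100)

print(acf(y, nlags=5))     # sample ACF: lags 1 and 2 near .76 and .31
print(pacf(y, nlags=5))    # sample PACF: two large values, then roughly zero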

The next six visuals present alternative models, and an examination of them supports the ultimate selection of a mixed model (AR2 with MA1). This selection turns out to be a slight over-parameterization, which leads us to a reduced form, an AR2. Model identification is essentially a trial-and-error approach, continued until one reaches a comfortable solution or match.

The six candidates, in order, are MA2, AR1, AR1/MA1, AR1/MA2, AR2 and finally AR2/MA1, the ultimate choice; a rough comparison of these fits in code follows the list below.

MA2

AR1

AR1/MA1

AR1/MA2

AR2

AR2/MA1
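
The sketch below is a rough stand-in for this trial-and-error comparison: the same simulated series used above is fit with each of the six candidate forms via statsmodels (not AUTOBOX), and AIC is printed as a convenient numerical summary in place of the visual ACF/PACF comparison the seminar performs:

import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima.model import ARIMA

np.random.seed(0)
y = ArmaProcess(np.array([1, -1.26, 0.645]), np.array([1])).generate_sample(
    nsample=500, burnin=100)

candidates = {"MA2": (0, 0, 2), "AR1": (1, 0, 0), "AR1/MA1": (1, 0, 1),
              "AR1/MA2": (1, 0, 2), "AR2": (2, 0, 0), "AR2/MA1": (2, 0, 1)}
for name, order in candidates.items():
    fit = ARIMA(y, order=order).fit()
    print(f"{name:8s}  AIC = {fit.aic:8.2f}")   # lower AIC suggests a better trade-off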

We show the results of estimation for the candidate model. Note the clear non-significance of the MA1 coefficient: the estimated value is .158, with an associated probability value of .2.

This test of necessity leads to a reduced model, an AR2.
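
A minimal sketch of this necessity test, under the same assumptions (simulated data, statsmodels rather than AUTOBOX): fit the AR2/MA1 candidate, inspect the p-value attached to the MA1 term, and refit the reduced AR2 when that term is not significant:

import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima.model import ARIMA

np.random.seed(0)
y = ArmaProcess(np.array([1, -1.26, 0.645]), np.array([1])).generate_sample(
    nsample=500, burnin=100)

full = ARIMA(y, order=(2, 0, 1)).fit()
pvals = dict(zip(full.model.param_names, full.pvalues))
print(pvals)                           # inspect the p-value on 'ma.L1'
if pvals["ma.L1"] > 0.05:              # MA1 term not significant -> drop it
    reduced = ARIMA(y, order=(2, 0, 0)).fit()
    print(dict(zip(reduced.model.param_names, reduced.params)))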

Expressed as a lagged rational expectations model we have the following.

The model is then used to make a prediction.
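
A minimal forecasting sketch under the same assumptions (simulated data, statsmodels): once the reduced AR2 is fit, out-of-sample predictions come from its forecast method:

import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima.model import ARIMA

np.random.seed(0)
y = ArmaProcess(np.array([1, -1.26, 0.645]), np.array([1])).generate_sample(
    nsample=500, burnin=100)

ar2 = ARIMA(y, order=(2, 0, 0)).fit()
print(ar2.forecast(steps=12))          # the next 12 point forecasts from the fitted AR2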

In terms of analysis we show the Fit, Actual and Forecast.

And now the Actuals and the Residuals. Notice how the residuals have been purged of any structure and are thus unpredictable and random in form. Statisticians are Noise Makers in practice.
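
One way to make "purged of any structure" concrete is a whiteness test on the residuals. The sketch below, again with simulated data and statsmodels rather than AUTOBOX, applies a Ljung-Box test, where a large p-value indicates no remaining autocorrelation:

import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

np.random.seed(0)
y = ArmaProcess(np.array([1, -1.26, 0.645]), np.array([1])).generate_sample(
    nsample=500, burnin=100)

resid = ARIMA(y, order=(2, 0, 0)).fit().resid
print(acorr_ljungbox(resid, lags=[12]))   # large lb_pvalue -> residuals look like noise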

Finally the Actual and the Forecasts.

Some software packages, SAS for example, try to crunch their way to a solution. For more on the deficiencies of some very well-known models, follow the arrow.

I have been told that it is never a good idea to show students what shouldn't be done, as they might get the idea that they should follow the example. However ..... To see an example of what you should studiously avoid, please click on the arrow. One unusual value is enough to make the "pick-best" from a pre-set selection of models absolutely go bonkers! The data should be allowed to mix and match the combination of history, causals and dummies. Try it yourself: create a time series with a KNOWN structure, or select a series from a textbook, and perturb it with a pulse (or pulses), a seasonal pulse, a level shift or a local time trend (a small sketch of this experiment follows). Canned or "crunch" approaches simply don't work. The main idea is that pre-set or pick-from-the-bunch approaches don't offer enough latitude or breadth.
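
Here is a small do-it-yourself sketch of the suggested experiment. The AR(1) coefficient of .8, the pulse of 10 units at period 60, and the level shift of 5 units from period 80 onward are all arbitrary illustrative choices, and a fixed statsmodels AR1 fit stands in for a "canned" approach that ignores the interventions:

import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima.model import ARIMA

np.random.seed(0)
# a series with a KNOWN structure: Y(t) = .8*Y(t-1) + a(t)
y = ArmaProcess(np.array([1, -0.8]), np.array([1])).generate_sample(nsample=120, burnin=100)

perturbed = y.copy()
perturbed[60] += 10.0      # a single pulse outlier
perturbed[80:] += 5.0      # a level shift

# the same fixed model form fit to both versions, ignoring the interventions
for label, series in [("clean", y), ("perturbed", perturbed)]:
    fit = ARIMA(series, order=(1, 0, 0)).fit()
    est = dict(zip(fit.model.param_names, fit.params))["ar.L1"]
    print(label, "estimated AR1 coefficient:", round(float(est), 3))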

CLICK HERE: Home Page For AUTOBOX