Philosophy



At AFS, our forecasting packages are based on two central beliefs. The first is that the Box-Jenkins (BJ) approach to model identification, estimation, diagnostic checking and forecasting, in both its univariate (ARIMA) and transfer function (causal model) forms, provides the proper framework for forecasting. The Box-Jenkins modeling paradigm is very rich and in fact subsumes most other common numerical forecasting techniques, such as regression and exponential smoothing.
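As a small illustration of how one technique can be a special case of another, the sketch below checks numerically that simple exponential smoothing with smoothing constant alpha gives the same one-step forecasts as an ARIMA(0,1,1) model with moving-average coefficient theta = 1 - alpha. The function names, the made-up data and the choice to start both recursions at the first observation are assumptions made for this example only; this is not code from any AFS product.

```python
def ses_forecasts(y, alpha):
    """One-step-ahead forecasts from simple exponential smoothing."""
    forecasts = [y[0]]  # assumption: start the recursion at the first observation
    for obs in y[:-1]:
        forecasts.append(alpha * obs + (1 - alpha) * forecasts[-1])
    return forecasts


def ima11_forecasts(y, theta):
    """One-step-ahead forecasts from y[t] = y[t-1] + e[t] - theta * e[t-1]."""
    forecasts = [y[0]]  # same starting assumption as above
    for obs in y[:-1]:
        error = obs - forecasts[-1]            # one-step forecast error e[t]
        forecasts.append(obs - theta * error)  # next forecast: y[t] - theta * e[t]
    return forecasts


y = [12.0, 15.0, 11.0, 14.0, 18.0, 16.0, 13.0, 17.0]  # made-up data
alpha = 0.3
ses = ses_forecasts(y, alpha)
ima = ima11_forecasts(y, theta=1 - alpha)

# The two forecast sequences should agree up to floating-point rounding.
max_gap = max(abs(a - b) for a, b in zip(ses, ima))
print(max_gap)
```

The agreement follows from substituting the error into the ARIMA recursion: y[t] - theta * (y[t] - f[t]) equals (1 - theta) * y[t] + theta * f[t], which is the smoothing formula with alpha = 1 - theta.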

The second belief is quite simple: procedures that consist of methods applied in a consistent way are subject to automation. We strongly believe that BJ forecasting techniques are one such set of methods. So we developed a sophisticated forecasting engine that applies the modeling philosophy of Box and Jenkins to time series and will, at your option, build the models automatically. The result is a powerful package that provides a set of tools for both beginners and experts.

We feel that it is crucial to understand our philosophy of forecasting before evaluating our products. Given these philosophies, the elegance and forethought in the design of these packages should be apparent.

Statistical methods can be very useful in summarizing information and very powerful in testing hypotheses. As in all things, there can be drawbacks. Before applying a statistical test, one has to validate the assumptions under which the test is valid. When an assumption is violated, it is often possible to remedy the violation, after which it is safer to proceed. One of the most often violated assumptions is that the observations are independent. Unfortunately, the real world operates in ignorance of many statistical assumptions, which can lead to problems in analysis. The good news is that these problems can often be overcome, so long as they are recognized and dealt with correctly.

The assumption of independence of observations implies that the most recent data point contains no more information or value than any other data point, including the first one that was measured. In practice, this means that the most recent reading has no special effect on estimating the next reading or measurement; it provides no additional information about the next reading. If this assumption (independence of all readings) is true, then the overall mean or average is the best estimate of future values. If, however, there is some serial or autoprojective (autocorrelation) structure, the best estimate of the next reading will depend on the recently observed values. The exact form of this prediction is based on the observed correlation structure. Stock market prices are an example of an autocorrelated data set. Weather patterns move slowly, so daily temperatures have been found to be reasonably described by an AR(2) model, which implies that today's temperature is a weighted combination of the temperatures on the previous two days, plus a random shock. Try it and see if it doesn't work!
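The point about AR(2) structure can be tried on simulated data. The sketch below generates a hypothetical "daily temperature" series from an AR(2) process and compares one-step forecasts from the AR(2) rule with forecasts from the overall mean. The coefficients (0.6 and 0.2), the mean of 15 and the noise level are invented for illustration, and the forecasts use the true coefficients; with real data they would have to be estimated.

```python
import random


def simulate_ar2(n, phi1, phi2, mean, sigma, seed=1, burn_in=200):
    """Simulate n readings whose deviations from `mean` follow an AR(2) process."""
    rng = random.Random(seed)
    dev = [0.0, 0.0]
    for _ in range(n + burn_in):
        dev.append(phi1 * dev[-1] + phi2 * dev[-2] + rng.gauss(0.0, sigma))
    return [mean + d for d in dev[-n:]]  # drop the burn-in values


# Hypothetical parameters, chosen only so that the process is stationary.
PHI1, PHI2, MEAN = 0.6, 0.2, 15.0
temps = simulate_ar2(2000, PHI1, PHI2, MEAN, sigma=2.0)
overall_mean = sum(temps) / len(temps)

sq_err_ar2 = 0.0
sq_err_mean = 0.0
for t in range(2, len(temps)):
    # AR(2) forecast: weighted combination of the last two deviations from the mean.
    ar2_pred = MEAN + PHI1 * (temps[t - 1] - MEAN) + PHI2 * (temps[t - 2] - MEAN)
    sq_err_ar2 += (temps[t] - ar2_pred) ** 2
    # Forecast that ignores recent readings and always predicts the overall mean.
    sq_err_mean += (temps[t] - overall_mean) ** 2

n_pred = len(temps) - 2
mse_ar2 = sq_err_ar2 / n_pred
mse_mean = sq_err_mean / n_pred
print(f"MSE using last two readings: {mse_ar2:.2f}")
print(f"MSE using overall mean:      {mse_mean:.2f}")
```

When the readings are autocorrelated, as here, the forecast based on recent values should show the smaller mean squared error; for independent readings the two would be about the same.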

Statistical Process Control (SPC) in industry has proved to be a major factor in saving costs and improving productivity. The statistical control chart developed by Walter Shewhart in 1931 is still the most popular chart in use today, due to its simplicity and effectiveness. Shewhart assumed that the observations are normally distributed and statistically independent. If these assumptions are not met, the standard formulas do not apply.

The need for the samples to be statistically independent is critical, while the lack of normality is less serious. In many cases, statistical independence cannot be assumed. Inertial elements in the process frequently cause the observations to become positively autocorrelated: that is, if X(t) is above its mean, it is likely that X(t+1) will also be above its mean. In sales and marketing applications, high sales in one month might lead to low sales the following month. This is sometimes known as "lumpy demand". In this case, there is negative autocorrelation.
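The sign of the autocorrelation can be checked directly from data with the lag-1 sample autocorrelation. The sketch below computes it for two simulated series, one with inertia and one that alternates like lumpy demand. Both series come from a first-order autoregression with made-up coefficients (+0.7 and -0.7); the function names are hypothetical.

```python
import random


def lag1_autocorrelation(x):
    """Sample autocorrelation between each reading and the one that follows it."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


def simulate_ar1(n, phi, seed):
    """Simulate x[t] = phi * x[t-1] + noise, with standard normal noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


inertial = simulate_ar1(1000, 0.7, seed=1)   # high readings tend to follow high readings
lumpy = simulate_ar1(1000, -0.7, seed=2)     # high readings tend to follow low readings

r_inertial = lag1_autocorrelation(inertial)
r_lumpy = lag1_autocorrelation(lumpy)
print(f"inertial process: r1 = {r_inertial:.2f}")
print(f"lumpy demand:     r1 = {r_lumpy:.2f}")
```

A value near zero is what independence would produce; a clearly positive or clearly negative value is a sign that the independence assumption should not be relied on.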

The mean of the observations also tends to meander or drift. This does not mean that the process is "out-of-control" -- it actually represents the inherent process variation. The application of a Shewhart control chart in this case results in many false alarms, leading to expensive and fruitless searches for assignable causes. There is then a clear need to incorporate the autocorrelation effect into the Control Charts.
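The false-alarm effect can be seen in a small simulation. The sketch below builds a Shewhart-style individuals chart, with limits at the mean plus or minus 2.66 times the average moving range, and counts the points outside the limits for an independent series and for a positively autocorrelated one. Neither series contains an assignable cause. The autoregressive coefficient of 0.8 and the function names are assumptions for illustration.

```python
import random


def individuals_chart_limits(x):
    """Control limits for an individuals chart based on the average moving range."""
    centre = sum(x) / len(x)
    moving_ranges = [abs(x[t] - x[t - 1]) for t in range(1, len(x))]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    half_width = 2.66 * mr_bar  # 3 / d2, with d2 = 1.128 for ranges of two readings
    return centre - half_width, centre + half_width


def alarm_rate(x):
    """Fraction of readings that fall outside the control limits."""
    lower, upper = individuals_chart_limits(x)
    return sum(1 for v in x if v < lower or v > upper) / len(x)


def simulate_ar1(n, phi, seed):
    """Simulate x[t] = phi * x[t-1] + noise; phi = 0 gives independent readings."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


independent = simulate_ar1(2000, 0.0, seed=1)
drifting = simulate_ar1(2000, 0.8, seed=1)   # hypothetical inertial process

rate_independent = alarm_rate(independent)
rate_drifting = alarm_rate(drifting)
print(f"alarm rate, independent readings:    {rate_independent:.3f}")
print(f"alarm rate, autocorrelated readings: {rate_drifting:.3f}")
```

With positive autocorrelation, successive readings sit close together, so the moving range understates the overall variation and the limits come out too narrow. The extra alarms reflect the meandering mean, not a process that has gone out of control.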

Autoprojective tools or models are surrogates for omitted variables. An ARIMA model is the ultimate case of an omitted variable or set of variables. In most cases, as long as the true cause variables don't change, history is prologue. However, one should always try to collect and use information or data for the cause variables or potential cause variables. The two approaches can be contrasted by referring to autoprojective schemes as "rear-window driving" while causative models or approaches are "front and rear-window driving", because one often has good estimates of future values of the cause variables (price, promotion schedule, occurrence of holidays, competitive price, etc.).

There is no question that "ALL MODELS ARE WRONG", but, simply stated, "SOME MODELS ARE USEFUL". When Box and Jenkins codified the process for rigorous, systematic identification of models, using (of all things) the observed data to aid this process, they laid out the ground rules for model validation. Many researchers failed to validate the model assumptions for a myriad of reasons, such as the cost of computing and the time needed to perform checks. Box and Jenkins themselves took shortcuts and were often misled by the results. Researchers point out that simple-minded differencing, rather than detrending, can yield poor results. Very true! Autobox deals with this issue.
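One well-known case of the differencing problem can be reproduced in a few lines. The sketch below simulates a straight-line trend plus independent noise, then removes the trend in two ways: by fitting and subtracting a least-squares line, and by taking first differences. The trend parameters are made up, and the code illustrates the general point only; it is not a description of how Autobox chooses between the two.

```python
import random


def lag1_autocorrelation(x):
    """Sample autocorrelation between each value and the one that follows it."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


rng = random.Random(3)
n = 1000
# Hypothetical series: deterministic linear trend plus independent noise.
y = [10.0 + 0.5 * t + rng.gauss(0.0, 1.0) for t in range(n)]

# Detrending: fit a straight line by ordinary least squares and subtract it.
t_mean = (n - 1) / 2
y_mean = sum(y) / n
slope = (sum((t - t_mean) * (y[t] - y_mean) for t in range(n))
         / sum((t - t_mean) ** 2 for t in range(n)))
intercept = y_mean - slope * t_mean
detrended = [y[t] - (intercept + slope * t) for t in range(n)]

# Differencing: subtract the previous value from each value.
differenced = [y[t] - y[t - 1] for t in range(1, n)]

r_detrended = lag1_autocorrelation(detrended)
r_differenced = lag1_autocorrelation(differenced)
print(f"lag-1 autocorrelation after detrending:   {r_detrended:.2f}")
print(f"lag-1 autocorrelation after differencing: {r_differenced:.2f}")
```

Detrending should leave residuals with autocorrelation near zero, matching the independent noise that was put in. Differencing the same series creates a lag-1 autocorrelation near -0.5 that was never in the data, which a model then has to undo.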

In all science there is one recurring theme: "a more comprehensive approach will lead to better results". There is one caveat, and that is "implemented correctly". The failure has not been in the method, but in those implementing the method. AUTOBOX tries to help the user avoid such pitfalls.

 

 