Frequently Asked Statistical Questions (INTERVENTION/OUTLIERS)

QUESTION:

Give me the formal presentation of the IMPACT of outliers.

ANSWER:

Outliers and structural changes are commonly encountered in time series analysis. The presence of such extraordinary events can mislead, and has misled, conventional time series analysts, resulting in erroneous conclusions. The impact of these events is often overlooked, however, for lack of a simple yet effective means of incorporating them. Several approaches have been considered in the literature for handling outliers in a time series.

We will first illustrate the effect of unknown events that cause simple model identification to go awry. We will then illustrate what to do when one knows a priori the date and nature of the isolated event. We will also point out a major flaw that arises when one assumes an incorrect model specification. Then we introduce the notion of finding the intervention variables through a sequence of alternative regression models, yielding maximum likelihood estimates of both the form and the effect of the isolated event.

Standard identification of ARIMA models uses the sample ACF as one of the two vehicles for model identification. The ACF is computed using the covariance and the variance. An outlier distorts both of these and in effect dampens the ACF: it inflates both measures, but the variance in the denominator more than the covariance in the numerator. Outliers can also distort the sample ACF and PACF by introducing spurious structure or correlations. For example, consider the circumstance where the outlier dampens the ACF:
ACF = COVARIANCE/VARIANCE
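
The dampening can be seen in a small simulation. The following is only a sketch under stated assumptions: the series is simulated rather than real data, and the AR(1) coefficient (0.8), the series length, the seed, and the outlier size (30) are all arbitrary illustrative choices. The helper names simulate_ar1 and sample_acf are hypothetical, not part of any package.

```python
import random


def simulate_ar1(phi, n, seed):
    """Simulate Y(t) = phi*Y(t-1) + A(t) with standard normal A(t), starting from 0."""
    rng = random.Random(seed)
    series, previous = [], 0.0
    for _ in range(n):
        previous = phi * previous + rng.gauss(0.0, 1.0)
        series.append(previous)
    return series


def sample_acf(series, max_lag):
    """Sample ACF at lags 1..max_lag: autocovariance divided by variance."""
    n = len(series)
    mean = sum(series) / n
    dev = [value - mean for value in series]
    variance = sum(d * d for d in dev) / n
    acf = []
    for lag in range(1, max_lag + 1):
        covariance = sum(dev[t] * dev[t - lag] for t in range(lag, n)) / n
        acf.append(covariance / variance)
    return acf


clean = simulate_ar1(phi=0.8, n=200, seed=1)

# Assumption for illustration: a single one-time outlier of size 30 at t = 100.
contaminated = list(clean)
contaminated[100] += 30.0

acf_clean = sample_acf(clean, max_lag=3)
acf_contaminated = sample_acf(contaminated, max_lag=3)

for lag, (r_clean, r_bad) in enumerate(zip(acf_clean, acf_contaminated), start=1):
    print(f"lag {lag}: clean {r_clean:.2f}, with outlier {r_bad:.2f}")
```

The single contaminated point adds roughly its squared size to the sum of squares in the denominator but contributes far less to the lagged cross-products in the numerator, so the printed autocorrelations for the contaminated series are smaller than those for the clean one.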

Thus the net effect is to conclude that the ACF is flat, and the resulting conclusion is that no information from the past is useful. These are the results of incorrectly using statistics without validating the parametric requirements. It is necessary to check that no isolated event has inflated either of these measures, leading to an "Alice in Wonderland" conclusion. Various researchers have concluded that the history of stock market prices is information-less. Perhaps the conclusion should have been that the analysts were statistic-less.

Another way to understand this is to derive the estimator of the coefficient from a simple model and to evaluate the effect of a distortion. Consider the true model as an AR(1) with the following familiar form:

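Y(t) = PHI1*Y(t-1) + A(t)

or

Y(t) - PHI1*Y(t-1) = A(t)

or

(1 - PHI1*B) Y(t) = A(t)

or

Y(t) = A(t) / (1 - PHI1*B)

where B is the backshift operator, B*Y(t) = Y(t-1), and A(t) is a white-noise error term.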

The variance of Y can be derived as:

variance(Y) = PHI1*PHI1*variance(Y) + variance(A)

thus

variance(Y) = variance(A) / (1 - PHI1*PHI1)

For the simple AR(1) case, where the general autoregressive operator 1 - P(B) reduces to 1 - PHI1*B, we have:

PHI1*PHI1 = 1 - variance(A)/variance(Y)

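Now, if the true state of nature is one where an intervention of form I(t) occurs at time period t with a magnitude of W, we have:

Y(t) = W*I(t) + A(t) / (1 - PHI1*B)

with

observed variance(Y) = true variance(Y) + distortion

thus

estimated PHI1*PHI1 = 1 - (variance(A) + squared bias) / (true variance(Y) + distortion)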

The inaccuracy or bias due to the intervention is not easily predictable because of the complex nature of the relationship. At one extreme, the addition of the squared bias to variance(A) increases the numerator, driving the ratio variance(A)/variance(Y) to 1 and the estimate of PHI1 to zero. The rate at which this happens depends on the relative size of the variances and on the magnitude and duration of the isolated event. Thus the presence of an outlier can hide the true model.

Now consider the other extreme, where variance(Y) is large relative to variance(A). The effect of the bias is to drive the ratio to zero and the estimate of PHI1 to unity. A shift in the mean, for example, generates an ACF that does not die out, or dies out only slowly, leading to a misidentified first-difference model.

In conclusion, the effects of an outlier depend on the true state of nature. It can both incorrectly hide the true model form and incorrectly generate evidence of a bogus model.
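
The mean-shift case can be sketched the same way. This is an illustration under assumptions, not a definitive procedure: the data are simulated from an AR(1) with PHI1 = 0.5, and the shift size (20), its timing (halfway through the series), the seed, and the helper name sample_acf are all hypothetical choices made for the example.

```python
import random


def sample_acf(series, lags):
    """Return {lag: sample autocorrelation} using autocovariance / variance."""
    n = len(series)
    mean = sum(series) / n
    dev = [value - mean for value in series]
    sum_sq = sum(d * d for d in dev)
    return {
        lag: sum(dev[t] * dev[t - lag] for t in range(lag, n)) / sum_sq
        for lag in lags
    }


# Simulate a stationary AR(1): Y(t) = 0.5*Y(t-1) + A(t), A(t) standard normal.
rng = random.Random(7)
phi = 0.5
clean, previous = [], 0.0
for _ in range(400):
    previous = phi * previous + rng.gauss(0.0, 1.0)
    clean.append(previous)

# Assumption for illustration: a permanent level shift of 20 from t = 200 onward.
shifted = [value + (20.0 if t >= 200 else 0.0) for t, value in enumerate(clean)]

lags = (1, 5, 10)
acf_clean = sample_acf(clean, lags)
acf_shifted = sample_acf(shifted, lags)

for lag in lags:
    print(f"lag {lag}: clean {acf_clean[lag]:.2f}, with level shift {acf_shifted[lag]:.2f}")
```

In the shifted series the two halves sit on opposite sides of the overall mean, so almost every lagged cross-product is large and positive. The sample ACF stays close to one out to long lags, which is the pattern usually read as a need for first differencing, even though the underlying process is a stationary AR(1) with a modest coefficient.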

