QUESTION:
Could someone give me a brief explanation of autocorrelation and maybe a formula? I was satisfied with
averaging until I started reading this group. Thanks.
ANSWER:
Autocorrelation is, as the name suggests, the correlation of a series with itself. A sequence of equally
spaced readings is called a time series. It is also called longitudinal data. When dealing with time series, we
are concerned that the Mean may not be representative or, restated, that the Mean may not be the Expected
Value. The Mean is equal to the Expected Value if the random variable being analyzed has the form
X(t) = u + A(t), where u is a constant and A(t) is NIID (normally, independently and identically
distributed), i.e. Gaussian noise. The issue is that if, for example, X(t) = u + .7 * X(t-1) + A(t) is more
appropriate for the data, then the Expected Value of X(t), given the previous reading, is u + .7 * X(t-1)
and no longer the Mean. The way you find out that this might be the correct model is to lag the observed
series one period to create a new series called Z and to compute the simple correlation coefficient between
X and Z. This simple correlation coefficient between the series and its lagged copy is, in essence, the
lag 1 autocorrelation coefficient. Autocorrelation coefficients for other lags can be computed in the same way.
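
Since a formula was asked for: the usual sample autocorrelation at lag k, for readings X(1)..X(n) with
overall mean Xbar, is

    r(k) = sum over t=k+1..n of (X(t) - Xbar) * (X(t-k) - Xbar)  /  sum over t=1..n of (X(t) - Xbar)^2

What follows is only a rough sketch in Python with NumPy, not anything taken from AUTOBOX. It computes both
the lag-and-correlate version described above and the formula just given. The function names, the simulated
series and its settings (u = 1, coefficient .7, 2000 readings, the seed) are all assumptions made for
illustration.

```python
import numpy as np

def lagged_correlation(x, lag=1):
    """Simple correlation between the series X and its lagged copy Z."""
    x = np.asarray(x, dtype=float)
    current = x[lag:]     # X(t)
    lagged = x[:-lag]     # Z(t) = X(t-lag)
    return np.corrcoef(current, lagged)[0, 1]

def sample_autocorrelation(x, lag=1):
    """Sample autocorrelation r(lag), using the overall mean and total sum of squares."""
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()
    return np.sum(dev[lag:] * dev[:-lag]) / np.sum(dev ** 2)

def simulate_ar1(n=2000, u=1.0, phi=0.7, seed=0):
    """Hypothetical test series: X(t) = u + phi * X(t-1) + A(t), with A(t) Gaussian noise."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = u / (1.0 - phi)          # start at the long-run level of the series
    for t in range(1, n):
        x[t] = u + phi * x[t - 1] + a[t]
    return x

x = simulate_ar1()
for k in (1, 2, 3):
    print(k, lagged_correlation(x, k), sample_autocorrelation(x, k))
```

For a long series the two numbers should be close to each other, and for this simulated series the lag 1
value should be near .7. On short series they can differ noticeably, because the formula uses the overall
mean and the total sum of squares rather than those of the two overlapping pieces.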
The bottom line is that the autocorrelation coefficient measures the unconditional correlation between the
series and a lagged copy of itself. The partial autocorrelation coefficient measures the conditional
relationship between the two, that is, the correlation that remains after allowing for the intervening
lags. This is the same as the partial correlation coefficient. One would compute the partial for lag 2 by
estimating a multiple regression with two input series. The first would be the series lagged 1 period
while the second would be the series lagged 2 periods. The coefficient for the second series would be the
Partial Autocorrelation or Partial Correlation due to lag 2. Box and Jenkins used these concepts in their
work on IDENTIFYING model forms. Please see our Web Site (http://www.autobox.com) for more on
autocorrelation. AUTOBOX uses the sample autocorrelation and sample partial autocorrelation to
identify an initial model and then to refine it. Like any other estimate based on minimizing an error
sum of squares, these coefficients are not robust to outliers, be they pulses, level shifts, seasonal pulses
or local time trends. Hence the need for INTERVENTION DETECTION, which provides more robust
(healthy) estimates. These are things that we have been concerned with in our work.
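
The lag 2 regression described above can be sketched as follows, again in Python with NumPy. This is an
illustration under stated assumptions and not how AUTOBOX does it: the function name, the constant term in
the regression and the simulated series are all made up for the example. The function regresses X(t) on the
series lagged 1 through k periods and returns the coefficient on the longest lag.

```python
import numpy as np

def partial_autocorrelation(x, lag=2):
    """Coefficient on X(t-lag) in a least squares regression of X(t)
    on a constant and X(t-1), ..., X(t-lag)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    y = x[lag:]                               # X(t)
    columns = [np.ones(n - lag)]              # constant term (an assumption)
    for k in range(1, lag + 1):
        columns.append(x[lag - k:n - k])      # the series lagged k periods
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[-1]

def simulate_ar1(n=2000, u=1.0, phi=0.7, seed=0):
    """Hypothetical test series: X(t) = u + phi * X(t-1) + A(t), with A(t) Gaussian noise."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = u / (1.0 - phi)
    for t in range(1, n):
        x[t] = u + phi * x[t - 1] + a[t]
    return x

x = simulate_ar1()
print("partial, lag 1:", partial_autocorrelation(x, 1))
print("partial, lag 2:", partial_autocorrelation(x, 2))
```

For a series that really follows X(t) = u + .7 * X(t-1) + A(t), the lag 1 partial should come out near .7
and the lag 2 partial near zero, since lag 2 adds nothing once lag 1 is accounted for. That cut-off pattern
is the sort of signature used when identifying a model form.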