QUESTION:
When is it appropriate to use the Mean?
ANSWER:
The Mean has been oversold as a describer of data. Early work in
Statistics focused on cross-sectional data (agricultural experiments, sampling
etc.) with little focus on correlated data. In my world, as a practicing statistician,
I seldom if ever come upon independent data. All the data I see is longitudinal,
otherwise known as time series. The simple mean, and the simple variance computed around the mean
rather than around the expected value of an autocorrelated process, are not useful and send the wrong
message to students. I empathize with your concern and respond in that vein. To
understand the MEAN it is helpful to recognize that the superstructure for parametric
statistics is a model. For example, introducing the mean as part of a
modeling approach would state that DATA = FIT + RESIDUALS. This approach
leads to regression, ANOVA, ARIMA and TRANSFER FUNCTIONS. The mean is always
equal to the first sample moment but is not always equal to the Expected Value. Expected value
is a property of the variable, and so of the model that generates it. For example, if the variable is
X(t) = U + A(t)
where U is a constant and A(t) is a r.v. with mean zero,
then E(X) = E(U) + E(A), and since E(U) = U and E(A) = 0 by assumption,
E(X) = U.
If, however, X(t) = U + .7 X(t-1) + A(t), then for a stationary series
E(X) = E(U) + .7 E(X) + 0
E(X) - .7 E(X) = E(U)
E(X) = U/(1 - .7B) = U/(1 - .7) = U/.3, or about 3.33*U,
since B applied to a constant leaves it unchanged.
NOTE: the backshift operator B (used like X in algebra) allows us to say that Y(t-1) = B * Y(t)
or in general Y(t-k) = [B^k][Y(t)]
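As a rough numerical check of the expected value of X(t) = U + .7 X(t-1) + A(t), the sketch below simulates the process and compares its long-run average with U/(1 - .7). The function name simulate_ar1, the choice U = 3, the unit-variance Gaussian noise, the seed and the burn-in length are all assumptions made for illustration; they are not part of the original argument.

```python
import random


def simulate_ar1(u, phi, n, seed=0, burn_in=1000):
    """Average of n simulated values of X(t) = u + phi*X(t-1) + A(t).

    A(t) is drawn as standard Gaussian noise (an assumption for this sketch).
    """
    rng = random.Random(seed)
    x = u / (1 - phi)  # start near the theoretical level to shorten warm-up
    total = 0.0
    for t in range(n + burn_in):
        x = u + phi * x + rng.gauss(0.0, 1.0)
        if t >= burn_in:  # discard the warm-up values
            total += x
    return total / n


U, PHI = 3.0, 0.7
sample_mean = simulate_ar1(U, PHI, n=200_000)
theoretical = U / (1 - PHI)  # U/.3, i.e. about 3.33*U

print("long-run average:", round(sample_mean, 2))
print("U/(1 - .7):      ", round(theoretical, 2))
print("U alone:         ", U)
```

Under these assumptions the long-run average should settle near U/.3 rather than near U, which is the point of the derivation above.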
If the series follows a trend or an autoregressive (autocorrelated) model,
for example X(t) = b0 + b1*t or X(t) = b0 + b1 X(t-1),
then the mean of the series is not particularly representative and
is NOT the EXPECTED VALUE. This issue arises when explaining why the standard deviation
of a series doesn't detect an outlier.
Consider the series 1,9,1,9,1,9,9,9,1,9
If you compute the standard deviation using the MEAN rather than the
Expected Value, you get a bad estimate: the mean is 5.8 and the standard deviation is about 4,
so the unusual value at time period 7 (a 9 where the alternating pattern calls for a 1) lies
within one standard deviation of the mean and goes unnoticed. If, however, the expected value is
used, as it should be, then the resulting standard deviation leads to detection of the anomaly at
time period 7. In general, the mean of an autocorrelated series is not the EXPECTED VALUE and thus
has been oversold as a describer.
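The contrast for the series 1,9,1,9,1,9,9,9,1,9 can be sketched in a few lines. This is only an illustration, not the procedure used by any particular package: it assumes the expected value is a simple period-2 pattern (one level for odd periods, one for even periods, each estimated by the median), and it assumes a cutoff of 2.5 standard deviations. The helper name flag_outliers is hypothetical.

```python
from statistics import mean, median, stdev

series = [1, 9, 1, 9, 1, 9, 9, 9, 1, 9]


def flag_outliers(values, expected, k=2.5):
    """Return 1-based time periods whose residual exceeds k residual std devs."""
    residuals = [v - e for v, e in zip(values, expected)]
    scale = stdev(residuals)
    return [t + 1 for t, r in enumerate(residuals) if abs(r) > k * scale]


# Approach 1: treat the overall mean as the expected value at every period.
mean_fit = [mean(series)] * len(series)
by_mean = flag_outliers(series, mean_fit)

# Approach 2 (assumed model): an alternating pattern with one level for
# odd periods and another for even periods, each estimated by the median.
odd_level = median(series[0::2])   # periods 1, 3, 5, 7, 9
even_level = median(series[1::2])  # periods 2, 4, 6, 8, 10
pattern_fit = [odd_level if t % 2 == 0 else even_level
               for t in range(len(series))]
by_pattern = flag_outliers(series, pattern_fit)

print("flagged using the mean:   ", by_mean)     # → []
print("flagged using the pattern:", by_pattern)  # → [7]
```

Around the mean, every residual is either -4.8 or 3.2, so nothing stands out; around the alternating pattern, every residual is zero except the 8 at period 7.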