QUESTION:

 When is it appropriate to use the Mean?

 

 ANSWER:

 The Mean has been oversold as a describer of data. Early work in

Statistics focused on cross-sectional data (agricultural experiments, sampling

etc.) with little focus on correlated data. As a practicing statistician,

I seldom, if ever, come upon independent data. All the data I see is longitudinal,

otherwise known as time series. For an autocorrelated process, the simple mean, and the simple variance

computed around that mean rather than around the expected value, are not useful and send the wrong

message to students. I empathize with your concern and respond in that vein. To

understand the MEAN it is helpful to recognize that the super-structure for parametric

statistics is a model. For example, the idea of introducing the mean as part of a

modeling approach would state that DATA = FIT + RESIDUALS. This approach

leads to regression, ANOVA, ARIMA and TRANSFER FUNCTIONS. The sample mean is always

equal to the first sample moment but is not always equal to the Expected Value. The expected value

is a property of the variable and of the model that generates it. For example, if the variable

X(t) = U + A(t)

where U is a constant and A(t) is a r.v. with mean zero,

then E(X) = E(U) + E(A), and since E(U) = U and E(A) = 0 by assumption,

E(X) = U. If, however, X(t) = U + .7 X(t-1) + A(t)

and the series is stationary, so that E(X(t-1)) = E(X(t)) = E(X), then

E(X) = U + .7 E(X) + 0, which gives E(X) - .7 E(X) = U and

E(X) = U / (1 - .7) = U / .3, about 3.33 * U.

NOTE: the backshift operator B (handled like X in algebra) allows us to say that Y(t-1) = B * Y(t),

or in general Y(t-k) = [B^k][Y(t)]. A constant is unchanged by B, so taking expectations turns (1 - .7B) into (1 - .7).
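
The expected value of the autoregressive example, X(t) = U + .7 X(t-1) + A(t), can be checked numerically. The sketch below is only an illustration: the function name ar1_expected_value and the value U = 3 are hypothetical choices, not part of the original answer. It solves E(X) = U + .7 E(X) by repeated substitution and compares the result with U / (1 - .7).

```python
def ar1_expected_value(u, phi, tol=1e-12, max_iter=10_000):
    """Solve E = u + phi * E by repeated substitution (requires |phi| < 1)."""
    if abs(phi) >= 1:
        raise ValueError("no stationary expected value when |phi| >= 1")
    e = 0.0
    for _ in range(max_iter):
        e_next = u + phi * e          # E(X) = U + phi * E(X), with E(A) = 0
        if abs(e_next - e) < tol:
            return e_next
        e = e_next
    return e


u, phi = 3.0, 0.7                     # U = 3 is an assumed value for illustration
iterated = ar1_expected_value(u, phi)
closed_form = u / (1 - phi)           # U / (1 - .7)

print("by repeated substitution:", iterated)
print("closed form U / (1 - phi):", closed_form)
```

Both numbers agree to within the tolerance, and neither equals the constant U itself, which is the point of the derivation.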

If the random variable follows a trend or an autoregressive (autocorrelated) model,

for example X(t) = b0 + b1*t or X(t) = b0 + b1*X(t-1),

then the mean of the series is not particularly representative and

is NOT the EXPECTED VALUE. This issue arises when explaining why the standard deviation

of a series doesn't detect an outlier.
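
A minimal sketch of why the overall mean misrepresents a trending series, assuming a noise-free trend X(t) = b0 + b1*t with hypothetical coefficients b0 = 2 and b1 = 3 over ten periods (these values are chosen only for illustration). The standard deviation around the single overall mean is large, while the deviations from the time-varying expected value are zero.

```python
from statistics import mean, pstdev

b0, b1 = 2.0, 3.0                          # hypothetical coefficients
periods = range(1, 11)
series = [b0 + b1 * t for t in periods]    # X(t) = b0 + b1*t, no noise added

# One number for the whole series: the overall mean.
overall_mean = mean(series)
sd_around_mean = pstdev(series)

# Under the trend model, the expected value is different at every period.
expected = [b0 + b1 * t for t in periods]
residuals = [x - e for x, e in zip(series, expected)]
sd_around_expected = pstdev(residuals)

print("overall mean:", overall_mean)
print("std. dev. around the mean:", round(sd_around_mean, 3))
print("std. dev. around the expected value:", sd_around_expected)
```

With real data the residuals would not be exactly zero, but the same comparison applies: the spread should be measured around the fit, not around one overall mean.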

 

Consider the series 1,9,1,9,1,9,9,9,1,9

If you compute the standard deviation using the MEAN (5.8) rather than the

Expected Value, you get an inflated estimate (roughly 4) and the unusual value at time period 7 goes

unnoticed. If, however, the expected value is used, as it should be, then the resulting

standard deviation leads to a detection of the anomaly at time period 7. In general,

the mean of an autocorrelated series is not the EXPECTED VALUE and thus has been

oversold as a describer.
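
The comparison for the series 1,9,1,9,1,9,9,9,1,9 can be sketched as follows. The expected-value model used here is an assumption made for illustration, not necessarily the model AUTOBOX would identify: the series is taken to alternate with period 2, and the median of the odd-period values and of the even-period values stands in for the expected value at each period.

```python
from statistics import mean, median, pstdev

series = [1, 9, 1, 9, 1, 9, 9, 9, 1, 9]    # periods 1..10 from the text

# Approach 1: measure every value against the overall mean.
m = mean(series)
s = pstdev(series)
z_mean = [(x - m) / s for x in series]

# Approach 2 (assumed model): the expected value alternates with period 2.
expected_odd = median(series[0::2])        # periods 1, 3, 5, 7, 9
expected_even = median(series[1::2])       # periods 2, 4, 6, 8, 10
expected = [expected_odd if t % 2 == 1 else expected_even
            for t in range(1, len(series) + 1)]
residuals = [x - e for x, e in zip(series, expected)]

# Periods whose value departs from the expected value.
flagged = [t for t, r in enumerate(residuals, start=1) if r != 0]

print("z-score of period 7 around the mean:", round(z_mean[6], 2))
print("residuals around the expected value:", residuals)
print("periods departing from the expected value:", flagged)
```

Around the mean, period 7 sits less than one standard deviation away and looks unremarkable. Around the alternating expected value, it is the only period with a nonzero residual.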

  

 
