DISCUSSION OF ARIMA AS A PARTICULAR CASE OF A TRANSFER
FUNCTION:
How does an ARIMA model capture the effects of omitted causal variables?
Consider the following ...
EQUATION A: Y(t ) = f ( X(t ))
and ...
EQUATION B: X(t ) = g ( X(t-1 ))
thus we can substitute EQUATION B into EQUATION A ...
EQUATION C: Y(t ) = h ( X(t-1 ))
but since from EQUATION A, lagged one period ...
EQUATION D: Y(t-1 ) = f ( X(t-1 ))
rearranging EQUATION D (assuming f can be inverted) we get ...
EQUATION E: X(t-1 ) = i ( Y(t-1 ))
thus we can substitute EQUATION E into EQUATION C ...
EQUATION F: Y(t ) = h ( i ( Y(t-1 ))) = k ( Y(t-1 ))
or the now familiar ARIMA model ...
Y(t ) = [T(B)/P(B)] A(t )
- As an example of this, consider a regression: Y(t) = v0*X(t) + A(t) (1)
where A is an i.i.d. ( gaussian ) error series
and where an auto-projective model for X exists such that
X(t) = w0*X(t-12) + E(t)
where E is an i.i.d. ( gaussian ) error series
or X(t) = ( [1 - w0*B**12 ] **-1 ) E(t) (2)
Substituting (2) into (1) we get
Y(t) = v0 ( [1 - w0*B**12] **-1 ) E(t) + A(t)
Since both A and E are normal random variables, their combined effect can be written in terms of a single error series AA(t):
Y(t) = [T1(B)/P1(B)]AA(t)
where [T1(B)/P1(B)] is the ARMA model driven by the unobserved series AA(t)
or restated as a Rational Expectations Model
Y(t) = V0 + V1*Y(t-1) + V2*Y(t-2) + ...
Thus an ARIMA model is simply a regression model in sheep's clothing.
The effect of X has been captured in the history of Y, thus
while the ARIMA is explicit in the history of Y it is implicit in
the history of X.
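
The substitution above can be checked numerically. Multiplying (1) through by [1 - w0*B**12] gives [1 - w0*B**12] Y(t) = v0*E(t) + [1 - w0*B**12] A(t), so Y alone should show seasonal memory at lag 12 even though X never enters the fitted model. The following is only a minimal sketch in Python with NumPy. The values v0 = 2.0 and w0 = 0.8, the unit error variances, the independence of A and E, and all function names are illustrative assumptions, not part of the original example.

```python
import numpy as np

def simulate_y(n=20000, v0=2.0, w0=0.8, seed=0, burn=240):
    """Simulate Y(t) = v0*X(t) + A(t) with X(t) = w0*X(t-12) + E(t).
    A and E are independent standard normal series (an assumption)."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n + burn)
    a = rng.standard_normal(n + burn)
    x = np.zeros(n + burn)
    for t in range(n + burn):
        x[t] = e[t] + (w0 * x[t - 12] if t >= 12 else 0.0)
    y = v0 * x + a
    return y[burn:]  # drop the start-up period

def sample_acf(y, lag):
    """Sample autocorrelation of y at the given lag."""
    d = y - y.mean()
    return float(np.dot(d[lag:], d[:-lag]) / np.dot(d, d))

def theoretical_acf12(v0, w0, var_e=1.0, var_a=1.0):
    """Lag-12 autocorrelation of Y implied by equations (1) and (2)."""
    var_x = var_e / (1.0 - w0 ** 2)
    return v0 ** 2 * w0 * var_x / (v0 ** 2 * var_x + var_a)

def fit_lagged_y(y, lags=(12, 24, 36)):
    """Least-squares fit of Y(t) on a constant and its own seasonal lags.
    X is never used: only the history of Y enters the model."""
    m = max(lags)
    target = y[m:]
    cols = [np.ones(len(target))] + [y[m - k: len(y) - k] for k in lags]
    coef, *_ = np.linalg.lstsq(np.column_stack(cols), target, rcond=None)
    return coef  # [V0, weight on Y(t-12), Y(t-24), Y(t-36)]

y = simulate_y()
print("sample acf at lag 12 :", sample_acf(y, 12))
print("implied acf at lag 12:", theoretical_acf12(2.0, 0.8))
print("weights on own lags  :", fit_lagged_y(y))
```

Under these assumptions the sample autocorrelation at lag 12 should sit close to the implied value, and the regression of Y on its own seasonal lags should put a clearly positive weight on Y(t-12). That is the sense in which the history of Y stands in for the omitted X.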
What this means in practice is that if, for whatever reason, you ignore or simply can't capture an
X variable, you can incorporate its effect through the memory of Y. The important thing to remember is that this structure may
reflect either omitted stochastic (probabilistic) or omitted deterministic series.
If the omitted variable is deterministic (an intervention) and arose
at random points in time, then intervention detection (pulses) will deal with that effect. If instead the omitted deterministic variable arose
every S periods, it will be dealt with using a seasonal pulse intervention.
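
To make the two kinds of deterministic surrogate concrete, the sketch below builds a pulse series (1 at a single period, 0 elsewhere) and a seasonal pulse series (1 every S periods from a starting period) and estimates their effects by least squares. It is an illustration under stated assumptions only: the intervention dates are taken as known, whereas intervention detection searches for them, and the function names, the level of 10, the effect sizes, and the noise scale are all hypothetical.

```python
import numpy as np

def pulse(n, t0):
    """Pulse dummy: 1 at period t0, 0 elsewhere."""
    p = np.zeros(n)
    p[t0] = 1.0
    return p

def seasonal_pulse(n, t0, s):
    """Seasonal pulse dummy: 1 at t0, t0+s, t0+2s, ..., 0 elsewhere."""
    p = np.zeros(n)
    p[t0::s] = 1.0
    return p

def fit_interventions(y, dummies):
    """Least-squares fit of y on a constant plus the supplied dummy series."""
    design = np.column_stack([np.ones(len(y))] + list(dummies))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # [level, effect of each dummy in order]

# Hypothetical series: constant level, one isolated shock at period 40,
# and a recurring effect every 12 periods starting at period 5.
rng = np.random.default_rng(0)
n = 120
y = 10.0 + 0.3 * rng.standard_normal(n)
y[40] += 8.0
y[5::12] += 5.0

coef = fit_interventions(y, [pulse(n, 40), seasonal_pulse(n, 5, 12)])
print("level, pulse effect, seasonal pulse effect:", coef)
```

The estimated coefficients should land near the level and the two effects that were built into the hypothetical series. A detection procedure would in addition have to decide where such dummies are needed.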
Consider the following ...
A classic transfer function is as follows.
Y(t ) = [W(B)/D(B)] X(t ) + [T(B)/P(B)] A(t )
If we omit the X variable then we have to deal with ...
Y(t ) = [T1(B)/P1(B)] AA(t )
If the omitted variable is stochastic and has no internal time dependency
(white noise) then its effect is simply to increase the background variance, resulting
in a downward bias of the tests of necessity and sufficiency. If, however, the omitted
series is stochastic and has some internal autocorrelation, then this structure shows up
in the error process, can be identified as a regular phenomenon, and appears
as ARIMA structure. For example, if "degree days" is needed to forecast beer sales but is omitted, a seasonal ARIMA
structure will be identified and becomes a surrogate for the omitted variable.
If the omitted variable is deterministic and without recurring pattern,
it may be identified via surrogate (intervention) series.
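
The contrast between the two stochastic cases can be sketched by simulation. In the Python code below, Y(t) = v0*X(t) + A(t) is generated twice, once with a white-noise X and once with a seasonally autocorrelated X, and in both cases X is then left out so that only a mean is fitted. The coefficient v0 = 1.5, the seasonal weight 0.8, the unit error variances, and the function names are assumptions made for illustration. They are not taken from the beer sales example.

```python
import numpy as np

def acf(series, lag):
    """Sample autocorrelation at the given lag."""
    d = series - series.mean()
    return float(np.dot(d[lag:], d[:-lag]) / np.dot(d, d))

def seasonal_ar(n, w0=0.8, s=12, seed=2, burn=240):
    """X(t) = w0*X(t-s) + E(t), E standard normal (illustrative values)."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n + burn)
    x = np.zeros(n + burn)
    for t in range(n + burn):
        x[t] = e[t] + (w0 * x[t - s] if t >= s else 0.0)
    return x[burn:]

def residuals_when_x_omitted(x, v0=1.5, seed=1):
    """Generate Y = v0*X + A, then fit a mean-only model that ignores X."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal(len(x))
    y = v0 * x + a
    return y - y.mean()

n = 20000
white_x = np.random.default_rng(3).standard_normal(n)
auto_x = seasonal_ar(n)

res_white = residuals_when_x_omitted(white_x)
res_auto = residuals_when_x_omitted(auto_x)

# White-noise X: residual variance rises above var(A) = 1, no lag-12 memory.
print("white X : variance", res_white.var(), " acf(12)", acf(res_white, 12))
# Autocorrelated X: the residuals inherit the seasonal structure of X.
print("auto X  : variance", res_auto.var(), " acf(12)", acf(res_auto, 12))
```

With a white-noise X the residual variance should be roughly 1 + v0**2 with no visible autocorrelation, which is the inflated background variance described above. With the seasonal X the lag-12 autocorrelation of the residuals should be large, and that is the structure a seasonal ARIMA model would pick up as a surrogate.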
AN UNKNOWN STOCHASTIC VARIABLE IS NEEDED