THE
NECESSARY AND SUFFICIENT CONDITIONS FOR AN
EFFICIENT
TIME SERIES MODEL
By
John C. Pickett
And
David P. Reilly
Robert M. McIntyre
Old Dominion University
Department of Psychology
Hampton Boulevard
Norfolk, VA 23507
Voice: 757.651.1082
FAX: 757-683-5087
Email: rmcingtyr@odu.edu
Applied economic practitioners have a long
history of forecasting economic time series. The challenge facing all
forecasters is to provide forecasts that minimize the forecasting error.
Reference to many introductory statistics texts reveals popular--but at times
misguided--practices. Most texts fail
to offer an integrated discussion of causal variables, memory structure, and
intervention variables.
The practitioner seeks an understanding
of how to identify and estimate the most efficient model. This paper defines an
efficient model as the one that satisfies the necessary and sufficient
conditions.
Following the chapters on simple and multiple
regression, introductory business school statistics texts introduce the student
to time series models, usually in a pejorative way. This “introduction” may discuss extrapolation techniques and
seasonal components. In effect, because
the basis of this introduction is OLS, these discussions of “time series
techniques” can best be labeled deterministic trend extension strategies--not
bona fide time series techniques.
Econometric methods courses introduce students
to “intermediate” statistical methods.
The majority of modern econometric texts used in these courses take
students on a rather complicated journey that requires them to confront and
best a beast named matrix algebra,
all the while fighting little side skirmishes with simultaneous equations,
asymptotic distributions, covariance-stationary processes, unit roots,
cointegration, and the generalized method of moments. However satisfying to erudite
mathematical authors, complicated and confusingly described statistical
techniques do not serve the needs of students whose vocation is that of
practitioner. Our experience indicates
that what practitioners seek is sound understanding of how to identify and
estimate a valid and reliable forecasting model. Unfortunately--perhaps because introductory texts are the only
source accessible to practitioners--they seem to rely on erroneous but
apparently simple techniques. Typical
forecasting practices used by practitioners involve attempts at pre-specifying
a forecasting model’s form, the use of OLS software to estimate the model, the
application of dummy variables to account for seasonal variations and outliers,
and the use of Hildreth-Lu techniques for dealing with the problem of first
order autocorrelation. Practitioners
identify the final model only after extensive hands-on manipulation of the data
and numerous iterations.
In addition to using inappropriate model-building strategies, the
practitioner suffers the burden of a tedious iterative process.
The
time series techniques discussed below focus on the dynamic consequences of
events occurring over time.
Observations in a time series will be at equally spaced intervals
through time, such as hourly, daily, monthly, quarterly, or annually. The critically important distinction between
time series data and cross-sectional data is that observations in
a cross-sectional data set are assumed to be randomly sampled from some
population. Put another way,
cross-sectional sample data are assumed to be independent of one another. As such, a relatively high value for one
data point would tell us nothing at all about whether another value is likely
to be high or low. In contrast, time
series analysts do not expect independence among time series data. Instead, the analysts expect there to be
dependence (covariation) among observations collected over time, particularly
among observations that occur at relatively close time intervals. In other words, knowledge of an observation
collected at time t may well provide information about the value of
another observation collected at time t + 1. Since time series data do not
behave as a random cross-sectional sample of data, they require special
statistical methods--Time Series Analysis.
The latter provides the basis for circumventing the non-independence in
time series data. In fact, beyond dealing with the non-independence of time
series data as a nuisance to be controlled, this set of techniques actually
exploits the non-independence while OLS techniques do not.
Time series methods should be used to
analyze time series data. OLS
techniques should only be used to analyze cross-sectional data sets. This argument is based on the fact that time
series observations do not comprise a
random sample drawn from a population and cannot satisfy the underlying
assumption that the observations are independently distributed.
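The contrast between independent cross-sectional draws and dependent time series observations can be made concrete with a short simulation. The sketch below is our own illustration (the AR(1) coefficient 0.8 and the sample size are arbitrary choices): it compares the lag-1 correlation of an independent sample with that of an autocorrelated series.

```python
import numpy as np

rng = np.random.default_rng(0)

def lag1_corr(x):
    """Sample correlation between x_t and x_{t-1}."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]

n = 2000

# Cross-sectional analogue: independent draws from one population.
iid = rng.standard_normal(n)

# Time series: AR(1) process y_t = 0.8 * y_{t-1} + e_t.
y = np.zeros(n)
e = rng.standard_normal(n)
for t in range(1, n):
    y[t] = 0.8 * y[t - 1] + e[t]

iid_r = lag1_corr(iid)  # near zero: one value says nothing about the next
ar_r = lag1_corr(y)     # near 0.8: adjacent observations covary strongly
```

It is exactly this covariation that time series methods exploit and that OLS, built on the independence assumption, ignores.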
BEST PRACTICE METHOD
The
following provides a description of how a time series model can be selected
from among all possible models. The
rapid growth in microcomputer power and powerful software innovations, coupled
with the growing maturity of modeling techniques, offers the analyst an
opportunity to adopt the “best practices” method of estimating the final
model. Here, “best practices” refers to
the set of methods implied by cutting edge knowledge on efficient model
identification and estimation.
Before beginning our discussion of the
best practices, several points must be made, some of which have already been
mentioned above. First, most data analyzed by forecasting practitioners are of
the time series variety. As was pointed out above, OLS techniques, whose modern
inferential machinery was developed in the early 20th century, were designed to analyze cross-sectional
data. Since time series techniques were
not codified until the 1960s, analysts adopted OLS techniques from the
cross-sectional domain and applied them within the time series domain. Second,
OLS techniques require the analyst to pre-identify the functional form. There are thousands of candidate models, and
time constraints prevent the analyst from considering all the possible OLS
models. Third, most OLS techniques require the parameters to be linear. While non-linear estimation techniques are
available, most analysts do not begin with nonlinear methods. In contrast, time series techniques assume
nonlinearity in parameters from the outset;
linearity in parameters is a special case of the more general nonlinear
specification. Fourth, OLS techniques can
be used to estimate autoregressive time series models but cannot be used to
estimate moving average models. This is
an especially important deficiency. Moving
average models assume that an observation at time t is, at least in part,
determined by the residuals of observations occurring prior to time t. The effects of omitted variables are
technically embedded in the residuals of the past. Therefore, to omit the moving average class of models is to
mishandle models with omitted variables.
Fifth, current OLS techniques focus on testing each estimated parameter
for statistical significance, followed by a selected set
of ‘fix-up’ routines if heteroscedasticity, autocorrelation, or
seasonality appears to be present.
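The fourth point above deserves a concrete illustration. In a moving average process, today's observation depends on yesterday's error, not on yesterday's observed value, which is why no regression of Y on its own lags can represent it exactly. A brief simulation of our own (the MA(1) coefficient 0.6 is an illustrative choice) confirms the implied lag-1 autocorrelation theta/(1 + theta^2):

```python
import numpy as np

rng = np.random.default_rng(1)

theta = 0.6            # illustrative MA(1) coefficient
n = 50_000
e = rng.standard_normal(n + 1)

# MA(1): y_t = e_t + theta * e_{t-1} -- the dependence runs through
# past *errors*, which an OLS regression on past y cannot capture exactly.
y = e[1:] + theta * e[:-1]

sample_r1 = np.corrcoef(y[:-1], y[1:])[0, 1]
theory_r1 = theta / (1 + theta**2)   # implied lag-1 autocorrelation
```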
AN EFFICIENT METHOD
Model selection using OLS techniques entails
maximizing adjusted R2 or minimizing the forecasting error. The theme of this paper is that there are
necessary and sufficient conditions that should be used to select the best
forecasting model. Think of the
necessary and sufficient conditions that ensure the selection of the efficient
model in the same way as a mathematician uses them to establish a local maximum
or minimum of a function.
NECESSARY CONDITION
The necessary condition of an
efficient forecasting model is that it contains only essential parameter
estimates. One should think of the necessary condition as asking the question
“Within the candidate forecasting model, are all parameter estimates required?”
This question is implicitly answered by determining whether parameter estimates
are statistically significantly different from zero.
A simple example may be helpful to
understand the necessary condition and make the transition to the sufficient
condition. Consider a model:
Yt = St + At

where:
Yt is the dependent series,
St is the signal, referring to a set of parameter estimates initially hypothesized to be of importance, and
At is the error series.
The
necessary condition points to the requirement that only essential
(“statistically significant”) parameter estimates are included within St. Within OLS multiple-regression procedures,
the well-known procedure of stepwise regression
exemplifies a type of necessity testing.
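A minimal numpy-only sketch of such necessity testing follows. It is our own illustration, not a procedure from the paper: the backward-elimination rule and the rough |t| >= 2 cutoff are simplifications of a stepwise procedure, and the data are simulated so that only x1 truly belongs in the signal.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)          # irrelevant regressor
y = 2.0 * x1 + rng.standard_normal(n)

def t_stats(X, y):
    """OLS coefficients and their t-statistics."""
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    s2 = resid @ resid / dof
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, beta / se

def backward_eliminate(X, y, names, cutoff=2.0):
    """Drop the least significant regressor until all |t| >= cutoff."""
    names = list(names)
    while names:
        _, t = t_stats(X, y)
        worst = int(np.argmin(np.abs(t)))
        if abs(t[worst]) >= cutoff:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return names

kept = backward_eliminate(np.column_stack([x1, x2]), y, ["x1", "x2"])
# x1's large t-statistic guarantees it survives; the irrelevant x2 is
# typically dropped, leaving only the essential parameter.
```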
The sufficient condition focuses on a
second question: “Have any statistically significant parameter estimates been
omitted from the model?” In other words, does At contain any
non-random “structure” that can be identified and moved into St? This question is the downfall of most
forecasters using OLS techniques.
Furthermore, answering the question requires the analyst to determine if
a statistically significant lag structure is present within the errors or if
there is any other “untapped” information embedded in the dependent and independent
variables. Answering this question is
complicated by the presence of outliers, which can mask the relationship
between the dependent and independent variables and the structure of the
errors.
SUFFICIENT CONDITIONS
The sufficient conditions focus
on determining whether or not the assumptions underpinning time series
analysis--in particular, assumptions regarding the behavior of At--are
satisfied. The analyst begins with a
set of assumptions about the residuals and constructs a set of mathematical
calculations so that when the final model is determined, the residuals from the
model meet these assumptions. If none
of the assumptions is
satisfied, then the model is completely inefficient. Stated another way, if the sufficient conditions are not
satisfied, then the second question-- Have any statistically significant
parameter estimates been omitted from the model?--is answered in the
affirmative.
_____________________________________________________________________________________________
Table 1. List of Sufficiency Conditions

Sufficiency Variable Type    Sufficiency Condition Description
1                            Lagged values of Y
2                            Lagged values of each X and lead values of each X
3                            Intervention variable(s) representing a pulse(s)
4                            Intervention variable(s) representing a seasonal pulse(s)
5                            Intervention variable(s) representing a level shift(s)
6                            Intervention variable(s) representing local time trend(s)
7                            Moving average term(s) representing lagged values of the error terms
_____________________________________________________________________________________________
The following is a brief
discussion along with an example of each of the sufficiency conditions and the
sufficiency variables designed to address it.
Note that the graphs presented are only single instances portraying the
failure of the sufficiency assumption.
Sufficiency Condition 1. E(ei) = 0
This condition implies that the mean of
the residuals should not deviate from zero in any subset of the time series.
The analyst must therefore investigate all possible subsets and not
limit the test to the overall mean of the residuals. If the mean of a subset of the residuals deviates significantly
from zero, then type 3-6 sufficient
variables (listed above) may be required.
Figure 1 shows the mean of the residuals increasing at observation
30. The difference in the means of the
two subsets is statistically significantly different from zero, and in this
case, visually obvious.
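A sketch of this subset check follows. It is our own illustration: the break location and shift size are simulated to mirror the Figure 1 pattern, and the scan over split points is a simple stand-in for a full subset search.

```python
import numpy as np

rng = np.random.default_rng(3)

# Residuals whose mean jumps from 0 to 2 at observation 30
# (mirroring the pattern described for Figure 1).
e = np.concatenate([rng.standard_normal(30),
                    2.0 + rng.standard_normal(30)])

def mean_shift_t(e, k):
    """Two-sample t statistic for a mean difference at split point k."""
    a, b = e[:k], e[k:]
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return (b.mean() - a.mean()) / se

# Scan every admissible split and keep the most extreme statistic.
splits = list(range(5, len(e) - 5))
tstats = [abs(mean_shift_t(e, k)) for k in splits]
best_k = splits[int(np.argmax(tstats))]
max_t = max(tstats)
```

The most extreme statistic both signals the violation and locates the break, which then suggests the appropriate type 3-6 sufficient variable.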
Sufficiency Condition 2. σi² = k
The second sufficiency condition implies
that the residuals have constant variance throughout the series.
Here again, the analyst
must test all possible pairs of subsets.
If the variance of the residuals is not constant, then a number of
remedies are possible. A weighted
regression, a power transformation, or generalized autoregressive conditional
heteroskedasticity (GARCH) techniques may be used to ensure the variance is
constant. Figure 2 shows a plot of the
residuals, where the variance increases beginning with observation 60. Notice that not only is there a difference
in the variance, but the residuals are also clearly autocorrelated. Often a violation manifests
several symptoms at once.
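A Goldfeld-Quandt-style ratio of subset variances makes the check concrete. This is our illustrative stand-in for the full pairwise-subset test; the simulated residuals mirror the Figure 2 pattern of a variance increase at observation 60.

```python
import numpy as np

rng = np.random.default_rng(4)

# Residuals whose standard deviation triples at observation 60
# (as in the pattern described for Figure 2).
e = np.concatenate([rng.standard_normal(60),
                    3.0 * rng.standard_normal(40)])

def variance_ratio(e, k):
    """Ratio of later-subset to earlier-subset residual variance."""
    return e[k:].var(ddof=1) / e[:k].var(ddof=1)

f = variance_ratio(e, 60)   # well above 1 when the variance truly jumps
```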
Sufficiency Condition 3. Cov(ei, ej) = 0 for i ≠ j
This
condition requires that residuals not be autocorrelated for all lags. With regard to tests for autocorrelation,
readers should be aware that the Durbin-Watson statistic tests for the
existence of first-order autocorrelation only.
The analyst must, therefore, examine autocorrelation of residuals for
all possible lags. If the residuals are
autocorrelated, then type 1, 2, or 7 sufficient variables from Table 1 may be
required. Figure 3 shows a plot of the
residuals with a pattern that clearly evidences autocorrelation of residuals.
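The full-lag scan can be sketched as follows. This is our own illustration of examining the entire autocorrelation function rather than only lag 1 as Durbin-Watson does; the ±2/√n band is the usual large-sample approximation, and the AR(1) coefficient is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(5)

def acf(e, max_lag):
    """Sample autocorrelations of e for lags 1..max_lag."""
    e = e - e.mean()
    denom = e @ e
    return np.array([(e[:-k] @ e[k:]) / denom for k in range(1, max_lag + 1)])

n = 500
white = rng.standard_normal(n)

# AR(1) residuals: autocorrelated at *every* lag (rho**k), not just lag 1.
ar = np.zeros(n)
shock = rng.standard_normal(n)
for t in range(1, n):
    ar[t] = 0.7 * ar[t - 1] + shock[t]

band = 2.0 / np.sqrt(n)       # approximate 95% band for white noise
white_flags = int(np.sum(np.abs(acf(white, 20)) > band))
ar_flags = int(np.sum(np.abs(acf(ar, 20)) > band))
```

A Durbin-Watson test would flag only the first of these lags; the scan flags every lag at which the band is crossed.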
Sufficiency Condition 4. ei ~ NID
This condition states that the residuals are normally and
independently distributed. Failure of
this so-called independence
assumption is closely related to the failure of assumption 3 above. If the residuals are not independent, then
type 1, 2, or 7 sufficient variables may be required. If the residuals are not
normally distributed, then either a power transform of the dependent variable
or a type 1 sufficient variable may cure the deficiency. Figure 4 shows a plot of the residuals that
do not meet this condition. A histogram
of the first 16 observations would have a different shape than the histogram of
the last 24 observations.
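The normality half of this condition can be screened with the Jarque-Bera statistic, computed here from first principles. This is our own sketch; the exponential residuals are an illustrative non-normal case, not data from the paper.

```python
import numpy as np

rng = np.random.default_rng(6)

def jarque_bera(e):
    """Jarque-Bera statistic from sample skewness and excess kurtosis."""
    s = (e - e.mean()) / e.std()
    skew = np.mean(s**3)
    kurt = np.mean(s**4) - 3.0
    return len(e) / 6.0 * (skew**2 + kurt**2 / 4.0)

n = 1000
normal_resid = rng.standard_normal(n)
skewed_resid = rng.exponential(1.0, n)   # clearly non-normal residuals

jb_ok = jarque_bera(normal_resid)    # small: consistent with normality
jb_bad = jarque_bera(skewed_resid)   # large: normality rejected
```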
Sufficiency Condition 5. ei ≠ f(Xi-t)
This condition states that the residuals
are not a function of lagged values of X.
If the residuals are a function of lagged values of X, then the analyst has
omitted a statistically significant lag structure on X. This deficiency may be remedied by a type 2
sufficient variable. Figure 5 shows a
characteristic pattern of data that fail to meet this condition. A simple regression of the residuals on the
lagged values of X would show the estimated parameter to be statistically
significant.
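The sketch below simulates this deficiency (our own illustration; the coefficients are arbitrary): the true process uses X at lags 0 and 1, the fitted model omits the lag, and the omitted effect surfaces as a significant correlation between the residuals and lagged X.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 400
x = rng.standard_normal(n)
noise = rng.standard_normal(n)

# True process uses X at lags 0 and 1, but the analyst's model omits
# lag 1, so the omitted effect ends up in the residuals.
y = 1.0 * x + 0.8 * np.roll(x, 1) + noise
y[0] = x[0] + noise[0]               # no lagged X exists for the first point

# Misspecified fit: regress y on contemporaneous x only.
b = (x @ y) / (x @ x)
resid = y - b * x

# Correlation of residuals with x lagged one period.
r = np.corrcoef(resid[1:], x[:-1])[0, 1]
```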
Sufficiency Condition 6. Xi ≠ f(ei-t)
This
condition states that the X values in a series are not a function of the lagged
residuals.
If the Xs are a function
of the lagged residuals, then the one-way causal model is the incorrect
functional form. Failure of this
assumption is frequently observed when modeling large macroeconomic systems. In
these models, the dependent and independent variables are all interdependent,
which requires multiple time series techniques (vector time series). Figure 6 shows a characteristic pattern
where this condition is not met. The parameter estimated from a simple
regression between X and the lagged residuals would be statistically
significant. However, there are many cases
where a series is affected by “future values,” such as when customers forestall
purchasing products because they are aware of price changes that will occur.
This is a “lead effect” and can be modeled without loss of generality.
Sufficiency Condition 7. The distribution of the residuals is invariant over time.
One subset
of the series data should have the same covariance structure as another
subset. In Figure 7 the autocorrelation
function for the first 10 observations is different from the last nine
observations. The autoregressive
parameter for the first 10 observations would be positive while it would be
negative for the last nine. Hence the
parameter would not be consistent over time.
If the covariance structure is not the same, then the data exhibit
time-variant parameters. Corrections
for the failure of this assumption are not found in the seven sufficient
variables. Rather, modeling a change in
parameters, while theoretically interesting, can lead to “death by model”--a
model with too many parameters. It is quite reasonable to apply “Occam’s Razor”
and simply focus on the most recent homogeneous subset. It is fair to say that
modeling a realization that is effectively a sample of more than one process is
technically beyond the state of the art at this time. If this sufficiency condition is not met, then the analyst
identifies when the change occurs and estimates the model within the set of
most recent observations. Figure 7 shows
a characteristic pattern. Clearly the slope of the line fitted for the first 10
observations would be positive while the slope of the line fitted for the last
nine observations would be negative.
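A sketch of the recommended diagnosis follows. It is our own illustration in the spirit of the Figure 7 pattern: the AR(1) parameter flips sign halfway through the simulated sample, and subset estimates reveal the instability.

```python
import numpy as np

rng = np.random.default_rng(8)

def ar1_coef(x):
    """Lag-1 least-squares slope, a rough AR(1) estimate."""
    x = x - x.mean()
    return (x[:-1] @ x[1:]) / (x[:-1] @ x[:-1])

# A series whose AR(1) parameter flips sign halfway through --
# effectively two different processes in one realization.
n = 400
y = np.zeros(n)
e = rng.standard_normal(n)
for t in range(1, n):
    phi = 0.8 if t < n // 2 else -0.8
    y[t] = phi * y[t - 1] + e[t]

phi_early = ar1_coef(y[: n // 2])    # near +0.8
phi_late = ar1_coef(y[n // 2 :])     # near -0.8
```

Having located the change, the analyst would discard the early observations and estimate the model on the most recent homogeneous subset only.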
Practitioners devote significant resources to
forecasting economic time series. Traditional OLS methods are not appropriate
techniques for forecasting time series. This follows because observations on a
time series are not a random sample from a population. Rather, time series
observations have dependencies among them, and, as a result, violate the
underlying assumption of independence.
The present paper's primary focus is to review the necessary and
sufficient conditions that the final model must satisfy before it can be
declared the efficient model. Think of
the necessary condition as a checklist to identify the statistically
significant parameters that must be included in the efficient model. The sufficiency condition ensures that the
data and their residuals satisfy the underlying assumptions. If one or more of the underlying
assumptions are not satisfied, then one or more of the above sufficient
variables are necessary for the final model to be deemed efficient.
The "best practices method"
described in this paper offers the analyst two insights. First, it provides a paradigm for focusing
on the necessary and sufficient conditions to ensure the final model is
efficient. Second, it encourages the analyst
to identify an efficient model from among all possible model
specifications. Referring to the
previous model,
Yt = St + At
the
feedback resulting from the necessity and sufficiency tests indicates that information embedded in At should
be moved to St by adding one or more of the seven sufficiency
variables.
An extensive
list of applicable references may be found at: