SPURIOUS CORRELATION
When we use Pearson's correlation procedure, we sometimes
end up with coefficients that indicate a relationship when
there really isn't one. For example, there is a fairly
strong positive correlation between fire trucks and fire
damage: the more fire trucks at a fire scene, the more
damage is done at that scene. Are the fire trucks causing
the damage? No! This correlation is a false, accidental
correlation. Statisticians and users of statistics refer to
this type of accidental association as a
SPURIOUS CORRELATION.
Using Pearson's correlation coefficient with time series
can be a pitfall. If the two series are normally
distributed and free of autocorrelation, then it is correct
to use such simple procedures. Independent observations,
which are not autocorrelated, are typically found in
cross-sectional data and not in time series data.
Otherwise, you must use transfer function methods as
detailed by Box and Jenkins, building a pre-whitening
filter for the input series in order to assess the
conditional impact of one series on another.
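The time series pitfall can be illustrated with a small
simulation. The sketch below is only an illustration under
stated assumptions: it generates pairs of independent random
walks (so no true relationship exists), and it uses simple
first differencing as a stand-in for a full Box-Jenkins
pre-whitening filter, which happens to be adequate for a pure
random walk but not for series in general. The function name,
sample sizes and the 0.5 threshold are arbitrary choices made
for this example.

```python
import numpy as np


def share_of_large_correlations(n_sims=2000, n_obs=200, threshold=0.5, seed=0):
    """Fraction of simulations in which |r| exceeds `threshold`,
    for the raw levels and for the first differences of two
    independent random walks."""
    rng = np.random.default_rng(seed)
    large_levels = 0
    large_diffs = 0
    for _ in range(n_sims):
        # Two independent random walks: cumulative sums of white noise.
        # They are unrelated by construction but strongly autocorrelated.
        x = np.cumsum(rng.standard_normal(n_obs))
        y = np.cumsum(rng.standard_normal(n_obs))

        # Pearson correlation on the autocorrelated levels.
        r_levels = np.corrcoef(x, y)[0, 1]
        # Pearson correlation after differencing, which recovers the
        # underlying white noise for a random walk.
        r_diffs = np.corrcoef(np.diff(x), np.diff(y))[0, 1]

        large_levels += abs(r_levels) > threshold
        large_diffs += abs(r_diffs) > threshold
    return large_levels / n_sims, large_diffs / n_sims


frac_levels, frac_diffs = share_of_large_correlations()
print(f"share with |r| > 0.5 on levels:      {frac_levels:.3f}")
print(f"share with |r| > 0.5 on differences: {frac_diffs:.3f}")
```

Under these assumptions a sizeable share of the level
correlations look "strong" even though the series are
unrelated, while the differenced series almost never do.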
Spurious correlation is normally due to extraneous
variables that are associated with both the independent and
the dependent variable under study. In the fire damage
example the extraneous variable is fire intensity. The
intensity of the fire is positively related to the number
of fire trucks at the scene and positively related to the
amount of damage at the scene. This situation results in
the statistical appearance that fire trucks and fire damage
are directly related. They are related, but only through
fire intensity, that is, by accident (or spuriously).
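The role of the extraneous variable can be sketched with
simulated data. In the example below all numbers are invented
for illustration: fire intensity drives both the number of
trucks and the damage, and trucks have no effect on damage at
all. The helper `residuals` is a hypothetical name; it removes
the linear effect of intensity so that the remaining (partial)
correlation between trucks and damage can be inspected.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Assumed data-generating process (hypothetical numbers):
# intensity is the extraneous variable behind both measurements.
intensity = rng.uniform(1.0, 10.0, size=n)
trucks = 2.0 * intensity + rng.normal(0.0, 2.0, size=n)
damage = 10.0 * intensity + rng.normal(0.0, 10.0, size=n)  # no truck term


def residuals(y, x):
    """Part of y left over after a straight-line fit on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)


# Raw Pearson correlation between trucks and damage.
r_raw = np.corrcoef(trucks, damage)[0, 1]

# Partial correlation: correlate what is left of each variable
# once the linear effect of intensity has been removed.
r_partial = np.corrcoef(residuals(trucks, intensity),
                        residuals(damage, intensity))[0, 1]

print(f"raw correlation:     {r_raw:.3f}")
print(f"partial correlation: {r_partial:.3f}")
```

In this simulated setting the raw correlation is strong,
while the correlation that remains after accounting for
intensity is close to zero.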
Statistics on the amount of damage caused in house fires
show that the larger the number of firefighters attending the
scene, the worse the damage!
This is an example of a spurious correlation produced by a
lurking variable. The apparent association is due to the
omission of some important information. In the example of
house fires, the size of the fire needs to be taken into
account: more firefighters are sent to larger fires, and the
larger the fire, the worse the damage.
This is also related to Simpson's paradox if you
consider fire size as categorical (e.g. small, medium,
large). The overall effect is that more firefighters (seem
to) imply more damage; within each category of fire,
however, more firefighters imply less damage. The
relationship within every subgroup is the opposite of the
relationship for the entire group taken as a whole.
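The reversal described above can be reproduced with made-up
data. The sketch below is an illustration only: the three
size categories, their typical crew sizes and damage levels,
and the within-category slope are all assumed values, and the
number of firefighters is treated as a continuous measure for
simplicity.

```python
import numpy as np

rng = np.random.default_rng(7)
n_per_group = 300

# Hypothetical categories: (typical firefighters, typical damage).
groups = {"small": (5.0, 10.0), "medium": (15.0, 50.0), "large": (30.0, 120.0)}

all_ff, all_dmg = [], []
within_r = {}
for name, (base_ff, base_dmg) in groups.items():
    # Crew size varies around the typical value for this fire size.
    ff = base_ff + rng.normal(0.0, 2.0, n_per_group)
    # Within a category, extra firefighters REDUCE damage (slope -2).
    dmg = base_dmg - 2.0 * (ff - base_ff) + rng.normal(0.0, 3.0, n_per_group)
    within_r[name] = np.corrcoef(ff, dmg)[0, 1]
    all_ff.append(ff)
    all_dmg.append(dmg)

# Pooling the categories ignores fire size altogether.
overall_r = np.corrcoef(np.concatenate(all_ff), np.concatenate(all_dmg))[0, 1]

for name, r in within_r.items():
    print(f"{name:>6} fires: r = {r:+.2f}")
print(f"all fires pooled: r = {overall_r:+.2f}")
```

With these assumed values the correlation is negative inside
each category and positive once the categories are pooled.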
In many situations, the explanation for an apparent
association cannot be identified easily. One example is the
association between smoking and lung cancer. It has been
argued that the apparent association between the two may be
due to some genetic factor that predisposes people both to
nicotine addiction and to lung cancer. If this were true,
smoking could not be blamed for causing cancer. Only after
considerable research, with the aid of statistical methods,
has it become generally accepted that smoking is a
contributory cause of lung cancer.