Tom Reilly

Waging a war against how to model time series vs fitting

  • Home
    Home This is where you can find all the blog posts throughout the site.
  • Categories
    Categories Displays a list of categories from this blog.
  • Tags
    Tags Displays a list of tags that has been used in the blog.
  • Bloggers
    Bloggers Search for your favorite blogger from this site.

Machine Learning - It might be "machiney", but it's not learning

Posted by on in Forecasting
  • Font size: Larger Smaller
  • Hits: 30058
  • 4 Comments
  • Subscribe to this entry
  • Print
  • PDF

Let's take a look at Microsoft's Azure platform where they offer machine learning. I am not real impressed. Well, I should state that it's not really a Microsoft product as they are just using an R package. There is no learning here with the models being actually built. It is fitting and not intelligent modeling. Not machine learning.

The assumptions when you do any kind of modeling/forecasting is that the residuals are random with a constant mean and variance.  Many aren't aware of this unless you have taken a course in time series.

Azure is using the R package auto.arima to do it's forecasting. Auto.arima doesn't look for outliers or level shifts or changes in trend, seasonality, parameters or variance.

Here is the monthly data used. 3.479,3.68,3.832,3.941,3.797,3.586,3.508,3.731,3.915,3.844,3.634,3.549,3.557,3.785,3.782,3.601,3.544,3.556,3.65,3.709,3.682,3.511, 3.429,3.51,3.523,3.525,3.626,3.695,3.711,3.711,3.693,3.571,3.509

It is important to note that when presenting examples many will choose a "good example" so that the results can show off a good product.  This data set is "safe" as it is on the easier side to model/forecast, but we need to delve into the details that distinguish the difference between real "machine learning" vs. fitting approaches.  It's important to note that the data looks like it has been scaled down from a large multiple.  Alternatively, if the data isn't scaled and really is 3 digits out then you also are looking for extreme accuracy in your forecast.  The point I am going to make now is that there is a small difference in the actual forecasts, but the level(lower) that Autobox delivers makes more sense and that it delivers residuals that are more random.  The important term here is "is it robust?" and that is what Box-Jenkins stressed and coined the term "robustness".

Here is the model when running this using auto.arima.  It's not too different than Autobox's except one major item which we will discuss.

The residuals from the model are not random.  This is a "red flag". They clearly show the first half of the data above 0 and the second half below zero signaling a "level shift" that is missing in the model.

Now, you could argue that there is an outlier R package with some buzz about it called "tsoutliers" that might help.  If you run this using tsoutliers,  a SPURIOUS Temporary Change(TC) up (for a bit and then back to the same level is identified at period #4 and another bad outlier at period #13 (AO). It doesn't identify the level shift down and made 2 bad calls so that is "0 for 3". Periods 22 to 33 are at a new level, which is lower. Small but significant. I wonder if MSFT chose not to test use the tsoutliers package here.

 

Autobox's model is just about the same, but there is a level shift down beginning at period 11 of a magnitude of .107.

Y(T) =  3.7258                                azure                                                                     
       +[X1(T)][(-  .107)]                              :LEVEL SHIFT       1/ 11    11
      +     [(1-  .864B** 1+  .728B** 2)]**-1  [A(T)]

Here are both forecasts.  That gap between green and red is what you pay for.



Note that the Autobox upper confidence limits are much lower in level.

 

Autobox's residuals are random

 

 

 

 

 

Comments

  • Robinjack
    Robinjack Friday, 29 March 2019

    Your article constantly have many of really up to date info. Where do you come up with this? Just declaring you are very creative. Thanks again JNC Wheels

  • Robinjack
    Robinjack Thursday, 13 June 2019

    Utterly written content , Really enjoyed studying. binance

  • Robinjack
    Robinjack Saturday, 15 June 2019

    In this awesome scheme of things you actually get an A+ just for hard work. Exactly where you lost everybody ended up being in your particulars. You know, people say, details make or break the argument.. And that could not be much more true here. Having said that, permit me say to you what exactly did give good results. The article (parts of it) is actually incredibly powerful which is most likely the reason why I am making the effort in order to comment. I do not really make it a regular habit of doing that. Secondly, although I can certainly see the jumps in reason you come up with, I am not really sure of exactly how you seem to connect your ideas which inturn make the actual conclusion. For the moment I will yield to your point however wish in the foreseeable future you connect your dots much better. binance

  • Robinjack
    Robinjack Sunday, 16 June 2019

    Wonderful paintings! That is the kind of information that are meant to be shared across the web. Disgrace on the seek for no longer positioning this submit upper! Come on over and visit my website . Thanks =) Coinnama

  • Please login first in order for you to submit comments
Go to top