Tom Reilly

Waging a war against how to model time series vs fitting


Machine Learning - It might be "machiney", but it's not learning

Posted in Forecasting

Let's take a look at Microsoft's Azure platform, where they offer machine learning. I am not really impressed. Well, I should state that it's not really a Microsoft product, as they are just using an R package. There is no learning going on when the models are built. It is fitting, not intelligent modeling. It is not machine learning.

The assumption behind any kind of modeling/forecasting is that the residuals are random, with a constant mean and variance.  Many people aren't aware of this unless they have taken a course in time series.
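As a rough illustration of what checking that assumption can look like, here is a minimal Python sketch. It is not what Autobox or auto.arima does internally. The function names and the residual values are made up for the example. It compares the mean and variance of the two halves of a residual series and computes the lag-1 autocorrelation, all of which should be unremarkable if the residuals really are random.

```python
from statistics import mean, pvariance


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    m = mean(x)
    denom = sum((v - m) ** 2 for v in x)
    if denom == 0:
        return 0.0
    num = sum((x[i] - m) * (x[i - 1] - m) for i in range(1, len(x)))
    return num / denom


def split_half_summary(resid):
    """Compare mean and variance of the first and second half of the residuals."""
    k = len(resid) // 2
    first, second = resid[:k], resid[k:]
    return {
        "mean_first": mean(first),
        "mean_second": mean(second),
        "var_first": pvariance(first),
        "var_second": pvariance(second),
        "lag1_autocorr": lag1_autocorr(resid),
    }


# Hypothetical residuals (not from the post) with an unmodeled
# level shift half way through the series.
resid = [0.10, 0.12, 0.08, 0.11, 0.09, -0.10, -0.09, -0.12, -0.08, -0.11]
summary = split_half_summary(resid)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

For residuals like these, the two half means have opposite signs and the lag-1 autocorrelation is strongly positive, which is the kind of pattern that says the mean is not constant.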

Azure is using auto.arima, from the R forecast package, to do its forecasting. auto.arima doesn't look for outliers, level shifts, or changes in trend, seasonality, parameters or variance.

Here is the monthly data used (33 observations):

3.479, 3.68, 3.832, 3.941, 3.797, 3.586, 3.508, 3.731, 3.915, 3.844, 3.634, 3.549, 3.557, 3.785, 3.782, 3.601, 3.544, 3.556, 3.65, 3.709, 3.682, 3.511, 3.429, 3.51, 3.523, 3.525, 3.626, 3.695, 3.711, 3.711, 3.693, 3.571, 3.509

It is important to note that when presenting examples, many will choose a "good example" so that the results show off the product.  This data set is "safe" in that it is on the easier side to model and forecast, but we need to delve into the details that distinguish real "machine learning" from fitting approaches.  The data looks like it has been scaled down by some large multiple.  Alternatively, if the data isn't scaled and really is measured to three decimal places, then you are also looking for extreme accuracy in your forecast.  The point I am going to make is that the difference in the actual forecasts is small, but the lower level that Autobox delivers makes more sense, and its residuals are more random.  The important question here is "is it robust?", which is what Box and Jenkins stressed; it was Box who coined the term "robustness".

Here is the model from running this data through auto.arima.  It's not too different from Autobox's, except for one major item, which we will discuss.

The residuals from the model are not random.  This is a "red flag". They are clearly above zero in the first half of the data and below zero in the second half, signaling a "level shift" that is missing from the model.

Now, you could argue that there is an outlier R package with some buzz about it called "tsoutliers" that might help.  If you run this data through tsoutliers, a SPURIOUS Temporary Change (TC) up (for a bit, and then back to the same level) is identified at period #4, along with another bad outlier, an Additive Outlier (AO), at period #13. It doesn't identify the level shift down, and it made 2 bad calls, so that is "0 for 3". Periods 22 to 33 are at a new, lower level. The shift is small but significant. I wonder if MSFT chose not to use the tsoutliers package here.
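To make the idea of searching for a level shift concrete, here is a naive Python sketch. It is neither Autobox's procedure nor the tsoutliers algorithm. The function name scan_level_shift and the min_seg parameter are made up for the example. It tries every candidate break point, compares the mean before and after, and scores the difference against the pooled within-segment variation. It also ignores autocorrelation in the data, which a real intervention-detection procedure would have to account for, so treat the result as a rough screen only.

```python
from math import sqrt
from statistics import mean, pvariance


def scan_level_shift(y, min_seg=3):
    """Return (break_index, shift, score) for the best single step change.

    break_index is the 0-based position at which the new level starts,
    shift is mean(after) - mean(before), and score is a t-like ratio.
    """
    best = None
    for k in range(min_seg, len(y) - min_seg + 1):
        before, after = y[:k], y[k:]
        shift = mean(after) - mean(before)
        # Pooled within-segment variance, assuming independent errors.
        pooled = (len(before) * pvariance(before)
                  + len(after) * pvariance(after)) / len(y)
        se = sqrt(pooled * (1 / len(before) + 1 / len(after)))
        if se > 0:
            score = abs(shift) / se
        else:
            score = float("inf") if shift != 0 else 0.0
        if best is None or score > best[2]:
            best = (k, shift, score)
    return best


# The monthly data from the post.
y = [3.479, 3.68, 3.832, 3.941, 3.797, 3.586, 3.508, 3.731, 3.915, 3.844,
     3.634, 3.549, 3.557, 3.785, 3.782, 3.601, 3.544, 3.556, 3.65, 3.709,
     3.682, 3.511, 3.429, 3.51, 3.523, 3.525, 3.626, 3.695, 3.711, 3.711,
     3.693, 3.571, 3.509]

k, shift, score = scan_level_shift(y)
print(f"candidate shift starts at period {k + 1}: "
      f"change in mean {shift:.3f}, score {score:.2f}")
```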

 

Autobox's model is just about the same, but it includes a level shift down of magnitude .107 beginning at period 11.

Y(T) =  3.7258                                    azure
       +[X1(T)][(- .107)]                         :LEVEL SHIFT    1/ 11    11
       +[(1 - .864B**1 + .728B**2)]**-1  [A(T)]
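One way to read the equation above is sketched below in Python. This is my interpretation of the notation, not Autobox output. It assumes B is the backshift operator, X1(T) is a step variable that is 0 before period 11 and 1 from period 11 on, and the constant is the level of the series before the shift. All function and variable names are made up for the example. Under that reading, the deviation from the level, N(T), follows N(T) = .864 N(T-1) - .728 N(T-2) + A(T).

```python
# The monthly data from the post.
Y = [3.479, 3.68, 3.832, 3.941, 3.797, 3.586, 3.508, 3.731, 3.915, 3.844,
     3.634, 3.549, 3.557, 3.785, 3.782, 3.601, 3.544, 3.556, 3.65, 3.709,
     3.682, 3.511, 3.429, 3.51, 3.523, 3.525, 3.626, 3.695, 3.711, 3.711,
     3.693, 3.571, 3.509]

CONST = 3.7258        # level before the shift (assumed reading)
SHIFT = -0.107        # level shift coefficient
SHIFT_START = 11      # 1-based period at which the shift begins
PHI1, PHI2 = 0.864, -0.728   # N(t) = PHI1*N(t-1) + PHI2*N(t-2) + A(t)


def level(t):
    """Deterministic level at 1-based period t."""
    return CONST + (SHIFT if t >= SHIFT_START else 0.0)


def deviations(y):
    """N(t): the series with the deterministic level removed."""
    return [y[i] - level(i + 1) for i in range(len(y))]


def residuals(y):
    """One-step-ahead errors A(t) for periods 3..len(y)."""
    n = deviations(y)
    return [n[i] - PHI1 * n[i - 1] - PHI2 * n[i - 2]
            for i in range(2, len(n))]


def forecast(y, horizon):
    """Forecasts for the next `horizon` periods under the model as read."""
    n = deviations(y)
    out = []
    for h in range(1, horizon + 1):
        nxt = PHI1 * n[-1] + PHI2 * n[-2]   # future A(t) set to zero
        n.append(nxt)
        out.append(level(len(y) + h) + nxt)
    return out


a = residuals(Y)
print("mean residual:", round(sum(a) / len(a), 4))
print("next 6 forecasts:", [round(f, 3) for f in forecast(Y, 6)])
```

Under this reading the forecasts oscillate around, and settle toward, the post-shift level of 3.7258 - .107 = 3.6188 rather than the original level, which is the point of including the level shift in the model.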

Here are both forecasts.  That gap between green and red is what you pay for.



Note that the Autobox upper confidence limits are much lower in level.

 

Autobox's residuals are random.

 

 

 

 

 
