Chenhui Hu, Vanja Paunic, Hong Ooi, Tao Wu, Wee Hyong Tok
Time series forecasting is one of the most important topics in data science. As a business owner, you might want to predict many kinds of future events to make better decisions and optimize your resource allocation. Typical use cases include retail sales forecasting, package shipment delay forecasting, energy demand forecasting, and financial forecasting. As you can see, forecasting is everywhere! Given its ubiquity and wide-ranging business applications, we have developed an open-source forecasting repo that puts world-class models and forecasting best practices in the hands of data scientists and industry experts, i.e., you!
Figure 1: Visualization of training and testing iterations of a sales forecasting scenario using a LightGBM model
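To make the idea of repeated training and testing iterations concrete, here is a minimal sketch of rolling-origin splits in plain Python. It is not the repository's implementation: the function name `rolling_origin_splits`, its parameters, and the sample numbers are all hypothetical, and it assumes an expanding training window followed by a fixed-length test window.

```python
def rolling_origin_splits(n_obs, n_rounds, horizon, gap=0):
    """Yield (train_indices, test_indices) for each forecasting round.

    Hypothetical helper for illustration. The training window starts at
    index 0 and grows by `horizon` each round; the test window is the
    `horizon` observations that follow, after an optional `gap`.
    """
    first_train_end = n_obs - n_rounds * horizon - gap
    if first_train_end <= 0:
        raise ValueError("not enough observations for the requested rounds")
    for r in range(n_rounds):
        train_end = first_train_end + r * horizon
        test_start = train_end + gap
        yield range(0, train_end), range(test_start, test_start + horizon)


if __name__ == "__main__":
    # Made-up weekly sales figures, used only to show the index windows.
    weekly_sales = [120, 135, 128, 150, 160, 155, 170, 165, 180, 175, 190, 200]
    for train_idx, test_idx in rolling_origin_splits(
        len(weekly_sales), n_rounds=3, horizon=2
    ):
        print(f"train weeks 0-{train_idx[-1]}, test weeks {list(test_idx)}")
```

Each round trains only on data that precedes its test window, so every forecast is evaluated on observations the model has not seen.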
This repository provides examples of building forecasting solutions, presented as Python Jupyter notebooks and R Markdown files, along with a library of utility functions. Our goal is to help data scientists and machine learning engineers with varying levels of forecasting knowledge.
In the repository, you will find state-of-the-art (SOTA) forecasting models using traditional machine learning and deep learning approaches. Implementations of SOTA models in this release are centered around retail sales forecasting and are written in Python and R, two of the most popular programming languages in the forecasting domain. To enable high-throughput forecasting scenarios, we have included notebooks for forecasting multiple time series with distributed training techniques such as Ray in Python, the parallel package in R, and multi-threading in LightGBM. The following is a quick summary of the forecasting models covered in this repository.
| Model | Language | Description |
| --- | --- | --- |
|  | Python | Autoregressive Integrated Moving Average (ARIMA) model that is automatically selected |
|  | Python | Linear regression model trained on lagged features of the target variable and external features |
|  | Python | Gradient boosting decision tree implemented with the LightGBM package for high accuracy and fast speed |
|  | Python | Dilated Convolutional Neural Network that captures long-range temporal flow with dilated causal connections |
|  | R | Simple forecasting method based on historical mean |
|  | R | ARIMA model with or without external features |
|  | R | Exponential Smoothing algorithm with additive errors |
|  | R | Automated forecasting procedure based on an additive model with non-linear trends and Tidyverts framework |
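As a rough illustration of two ideas from the summary above, lagged features of the target variable and a historical-mean baseline, the following plain-Python sketch may help. It is not taken from the repository: the helper names (`make_lag_features`, `mean_forecast`, `fit_one_lag_regression`) and the toy series are hypothetical, and the regression uses a single lag and no external features to keep the example short.

```python
from statistics import mean


def make_lag_features(series, lags):
    """Return (X, y): each row of X holds the values `lag` steps before y.

    Hypothetical helper for illustration only.
    """
    max_lag = max(lags)
    X, y = [], []
    for t in range(max_lag, len(series)):
        X.append([series[t - lag] for lag in lags])
        y.append(series[t])
    return X, y


def mean_forecast(series, horizon):
    """Baseline: repeat the historical mean for every future step."""
    return [mean(series)] * horizon


def fit_one_lag_regression(series, lag=1):
    """Ordinary least squares of y[t] on y[t - lag].

    Returns (intercept, slope). Assumes the lagged values are not all
    identical, otherwise the slope is undefined.
    """
    X, y = make_lag_features(series, [lag])
    x = [row[0] for row in X]
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


if __name__ == "__main__":
    sales = [10, 12, 14, 16, 18, 20]  # toy series, not real data
    intercept, slope = fit_one_lag_regression(sales, lag=1)
    print("mean baseline:", mean_forecast(sales, horizon=2))
    print("one-step regression forecast:", intercept + slope * sales[-1])
```

In practice, a regression library would fit many lags and external features at once; the hand-written single-lag fit here only shows how lagged targets become model inputs.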
The repository also comes with Azure Machine Learning (Azure ML) themed notebooks and best-practice recipes to accelerate the development of scalable, production-grade forecasting solutions on Azure. You will find the following examples for forecasting with Azure AutoML, as well as for tuning and deploying a forecasting model on Azure.
| Method | Language | Description |
| --- | --- | --- |
|  | Python | Azure ML service that automates the model development process and identifies the best machine learning pipeline |
|  | Python | Azure ML service for tuning hyperparameters of machine learning models in parallel in the cloud |
|  | Python | Azure ML service for deploying a model as a web service on Azure Container Instance |
Developing an accurate forecasting solution can be a complex and time-consuming process. We hope the forecasting repo will help shorten your development cycle.
For more information, please visit: https://github.com/microsoft/forecasting
Contributions from the open-source community are always welcome! Please feel free to check our contribution guide if you would like to contribute to the content and bring in the latest SOTA algorithms.