Application of Clustering for Computationally Light Short Term Demand Forecasting

Author(s): Parimesh Panda, Rohan Kumar, Tanish Verma, Ish Chaudhary


Retail manufacturers struggle to predict customer demand for each product with high forecast accuracy while refreshing forecasts at weekly intervals, because training forecast models at that frequency is computationally expensive. The overall cost of demand forecasting is the sum of product shortage cost (lost sales from unmet customer demand), excess product cost (inventory holding costs), product pilferage cost (expired goods cost), and computational cost (incurred in training forecast models). While the first three components depend on forecast accuracy, the computational cost can be controlled through computationally light forecasting systems. This research is motivated by the very limited empirical evidence on the adoption of unsupervised techniques to identify groups of products with similar level, trend, and seasonality patterns for the development of such systems. This study aims to reduce the number of forecast model training cycles by using unsupervised techniques such as K-means Clustering and Hierarchical Clustering to identify clusters of products with similar customer purchasing behavior. It introduces a Clustering-based Demand Forecasting Framework that determines the best forecasting algorithm for each cluster. In the experiments, the framework was used to predict customer demand for more than 500 dairy products over the next 8 weeks. A comparative study of computational time for product-level and cluster-level model training is presented to illustrate the reduction in computational cost. The product-level forecast accuracy of the Clustering-based Demand Forecasting Framework, measured by Weighted Absolute Percentage Error (WAPE), was found to be superior to that of the Standard Demand Forecasting Framework.
Further, this research improves the interpretability of the clusters by assigning each cluster a product category-specific name through Natural Language Processing (NLP) techniques. Overall, this research provides empirical evidence to support enterprise-wide adoption of a computationally light short-term forecasting system by retail manufacturers.
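
The idea of training one model per cluster rather than one per product can be illustrated with a minimal sketch. This is not the authors' framework: the demand figures, product names, the mean-scaling step, the plain K-means routine, and the linear-trend "model" are all assumptions chosen to keep the example small and dependency-free. It groups products by the shape of their demand history, fits one trend per cluster, rescales that trend to each product's level, and scores the result with WAPE, computed as the sum of absolute errors divided by the sum of actual demand.

```python
def wape(actual, forecast):
    """Weighted Absolute Percentage Error: sum|a - f| / sum|a|."""
    abs_error = sum(abs(a - f) for a, f in zip(actual, forecast))
    return abs_error / sum(abs(a) for a in actual)

def scale_to_unit_mean(series):
    """Divide by the series mean so clustering compares shape, not volume."""
    mean = sum(series) / len(series)
    return [x / mean for x in series]

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(points, k, iterations=10):
    """Plain Lloyd's algorithm with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(
            points, key=lambda p: min(squared_distance(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def fit_linear_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (intercept, slope)."""
    n = len(values)
    t_mean, y_mean = (n - 1) / 2, sum(values) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
             / sum((t - t_mean) ** 2 for t in range(n)))
    return y_mean - slope * t_mean, slope

# Hypothetical weekly demand for four products (illustrative numbers only).
demand = {
    "milk_1l": [100, 110, 120, 130, 140, 150],
    "milk_2l": [10, 11, 12, 13, 14, 15],
    "yogurt": [150, 140, 130, 120, 110, 100],
    "cheese": [30, 28, 26, 24, 22, 20],
}
TRAIN_WEEKS, HORIZON = 4, 2  # assumed split: 4 weeks to train, 2 held out

names = list(demand)
train = {name: demand[name][:TRAIN_WEEKS] for name in names}
shapes = [scale_to_unit_mean(train[name]) for name in names]
labels = kmeans(shapes, k=2)

# One model fit per cluster (on the cluster's average shape), not one per product.
cluster_models = {}
for c in set(labels):
    members = [s for s, lab in zip(shapes, labels) if lab == c]
    mean_shape = [sum(col) / len(members) for col in zip(*members)]
    cluster_models[c] = fit_linear_trend(mean_shape)

# Rescale the cluster-level forecast to each product's own demand level.
product_wape = {}
for name, lab in zip(names, labels):
    intercept, slope = cluster_models[lab]
    level = sum(train[name]) / TRAIN_WEEKS
    forecast = [level * (intercept + slope * t)
                for t in range(TRAIN_WEEKS, TRAIN_WEEKS + HORIZON)]
    product_wape[name] = wape(demand[name][TRAIN_WEEKS:], forecast)

print(len(cluster_models), "model fits for", len(names), "products")
print(product_wape)
```

In this toy setting the number of model fits drops from four to two. The actual saving and any accuracy change on real data would depend on how well the clusters capture shared level, trend, and seasonality, which is what the study evaluates empirically.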