Exploration of Linear Model fit using observation weights and parameter constraints to handle poor quality data

Author(s): Ullas M S Rao, Mattias Jönsson


In machine learning, it is common to encounter data with high variability and heteroscedasticity. One way to address this is to assign weights to the observations, so that outliers receive lower weights and have less influence on the model. In some cases, it is also necessary to impose linear constraints on the estimates, restricting them to be greater or less than a certain value, or to lie within a given range. Such constraints are mostly driven by a deep understanding of real-world behavior and by specific business scenarios. In our case, data analysis showed that market volume had an impact on costs, which required us to weight the observations in proportion to the volume of container movement over twelve months. In addition, the model had to accurately reflect the higher cost of refrigerated containers compared with containers for dry cargo, given that all other features remained the same. This is a clear case for a linear regression model that combines observation weights with parameter constraints.
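To make the combination concrete, here is a minimal sketch of a weighted, bound-constrained linear fit. It is an illustration under stated assumptions, not necessarily the implementation used for the analysis described here: the choice of SciPy's `lsq_linear`, the function name `fit_weighted_constrained`, the feature names, and the synthetic data are all hypothetical. The weights are applied by scaling each row by the square root of its weight, and the "refrigerated costs at least as much as dry" requirement is expressed as a lower bound of zero on the coefficient of a refrigerated-container indicator, which is the simplest special case of a linear constraint.

```python
import numpy as np
from scipy.optimize import lsq_linear


def fit_weighted_constrained(X, y, weights, lower, upper):
    """Minimise sum_i w_i * (y_i - x_i @ beta)**2 subject to lower <= beta <= upper.

    Scaling each row of X and y by sqrt(w_i) turns the weighted problem
    into an ordinary least-squares problem that lsq_linear can solve.
    """
    sqrt_w = np.sqrt(np.asarray(weights, dtype=float))
    X_scaled = np.asarray(X, dtype=float) * sqrt_w[:, None]
    y_scaled = np.asarray(y, dtype=float) * sqrt_w
    result = lsq_linear(X_scaled, y_scaled, bounds=(lower, upper))
    return result.x


# Synthetic data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n = 200
distance = rng.uniform(100.0, 1000.0, n)          # made-up numeric feature
is_reefer = rng.integers(0, 2, n).astype(float)   # 1 = refrigerated, 0 = dry
volume = rng.uniform(1.0, 500.0, n)               # used as observation weight
cost = 200.0 + 0.5 * distance + 150.0 * is_reefer + rng.normal(0.0, 20.0, n)

# Design matrix: intercept, distance, refrigerated indicator.
X = np.column_stack([np.ones(n), distance, is_reefer])

# Only the refrigerated coefficient is constrained (must be >= 0).
lower = np.array([-np.inf, -np.inf, 0.0])
upper = np.array([np.inf, np.inf, np.inf])

beta = fit_weighted_constrained(X, cost, volume, lower, upper)
print("intercept, distance, reefer premium:", beta)
```

With all other features held equal, a non-negative coefficient on `is_reefer` guarantees that the predicted cost for a refrigerated container is never below that of a dry one. Constraints that relate several coefficients to each other would need a more general solver than simple bounds.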