
ML-based high-cardinality reduction methods to create geo-score to improve auto insurance Tweedie pricing model

Author(s): Suguna Jayaraj, Harmandeep Kaur


A typical automobile insurance rating plan contains many risk factors, ranging from driver and vehicle characteristics to policy characteristics. Including geographical risk characteristics in pricing has been challenging owing to their high cardinality. The traditional approach groups postal codes by historical loss experience, which suffers from two major drawbacks: (a) for geographies with low exposure, the observed loss cost is almost always zero, and (b) confidence is low because information on latent variables is lost. In this paper, we present a case study of a Greek automobile insurance product offered by a major US-based P&C provider, in which a geo-score was developed at the postal code level to improve risk segmentation in own-damage cover pricing. The base loss cost (loss/exposure) model was built using Tweedie compound Poisson regression, and geospatial attributes were added to the model without changing the existing rating structure. External attributes such as socio-demographic variables and highway/network data were sourced to create geographical clusters using partitioning around medoids (PAM). Further, various high-cardinality feature reduction techniques were used to predict the residual loss cost. This paper illustrates a hybrid approach that combines target-based encoding methods with XGBoost to create the geo-score.
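To make the idea of target-based encoding concrete, the following is a minimal sketch of one common variant, a smoothed (shrunken) target encoding of the residual loss cost by postal code. It is not the authors' implementation. The function name `smoothed_target_encoding`, the `prior_weight` parameter, the postal codes, and all numbers are hypothetical, and the residual loss is assumed to be the observed loss minus the loss predicted by the base model. Shrinking towards the portfolio-wide mean addresses the low-exposure drawback described above: a postal code with very little exposure is pulled towards the overall average instead of being scored at zero.

```python
from collections import defaultdict


def smoothed_target_encoding(records, prior_weight=50.0):
    """Encode each postal code as a shrunken residual loss cost.

    records: iterable of (postal_code, exposure, residual_loss) tuples.
    prior_weight: hypothetical smoothing constant, expressed in exposure
        units; larger values pull low-exposure codes harder towards the
        portfolio-wide mean.
    """
    exposure_by_code = defaultdict(float)
    loss_by_code = defaultdict(float)
    total_exposure = 0.0
    total_loss = 0.0

    # Aggregate exposure and residual loss per postal code and overall.
    for code, exposure, residual_loss in records:
        exposure_by_code[code] += exposure
        loss_by_code[code] += residual_loss
        total_exposure += exposure
        total_loss += residual_loss

    # Portfolio-wide residual loss cost (loss per unit of exposure).
    global_mean = total_loss / total_exposure

    # Exposure-weighted blend of the code's own experience and the prior.
    return {
        code: (loss_by_code[code] + prior_weight * global_mean)
        / (exposure + prior_weight)
        for code, exposure in exposure_by_code.items()
    }


# Made-up data: (postal_code, exposure, residual_loss).
records = [
    ("10431", 120.0, 600.0),
    ("10431", 80.0, 200.0),
    ("85100", 5.0, 0.0),      # low exposure, no observed residual loss
    ("54622", 295.0, -300.0),
]

encoding = smoothed_target_encoding(records)
for code, value in sorted(encoding.items()):
    print(code, round(value, 4))
```

In a hybrid setup like the one described, an encoded value of this kind could be supplied as a numeric feature, alongside cluster labels and external attributes, to a gradient-boosted model that predicts the residual loss cost. That wiring is an assumption here and is not shown.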


Association of Data Scientists
