
Default Rate Prediction Models for Self-employment in Korea using Ridge, Random Forest and Deep Neural Network

Author(s): Dongsuk Hong, Hanjong Baeck


This study introduces machine learning (ML) and deep learning (DL) models that predict self-employment default rates from credit information. Most previous studies of corporate credit risk focus on bankruptcy prediction models for listed companies, using financial information as the main variables and macro-economic information as auxiliary variables. Such models are difficult to apply where financial information is insufficient, as with small and medium-sized enterprises (SMEs) and self-employed businesses. Studies that predict corporate default rates by industry are also very scarce. To improve default-rate prediction, we used micro-level variables derived from the credit information of individual businesses in the Korean manufacturing sector, such as loan and overdue histories, from April 2014 through June 2019, together with typical macro-economic variables. We then evaluated how the choice of algorithm, namely Ridge, Random Forest (RF), or Deep Neural Network (DNN), affects the performance of the proposed default-rate prediction model for self-employment. The DNN is used in two roles: a sub-model that selects credit information variables, and a final model, cascaded from the sub-model, that receives the selected variables as input and predicts default rates. The sub-model has 2 hidden layers and the final model has 3, with 5 nodes in each layer. The activation function, solver, and learning rate were determined through hyper-parameter tuning.
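The two-stage DNN layout described above can be made concrete with a short Python sketch. It only enumerates the dense layers implied by "2 and 3 hidden layers" with 5 nodes each and counts their trainable parameters. The input counts (`N_CREDIT_CANDIDATES`, `N_SELECTED`, `N_MACRO`), the single output unit, and the choice to feed macro-economic variables into the final model are illustrative assumptions, not values taken from the study.

```python
def layer_shapes(n_inputs, hidden_layers, nodes_per_layer, n_outputs=1):
    """Return (fan_in, fan_out) for every dense layer of a fully connected net."""
    widths = [n_inputs] + [nodes_per_layer] * hidden_layers + [n_outputs]
    return list(zip(widths[:-1], widths[1:]))


def count_parameters(shapes):
    """Weights plus one bias per output unit, summed over all layers."""
    return sum(fan_in * fan_out + fan_out for fan_in, fan_out in shapes)


# Hypothetical input counts (the abstract does not report them).
N_CREDIT_CANDIDATES = 20  # candidate credit-information variables
N_SELECTED = 6            # credit variables kept by the selection sub-model
N_MACRO = 4               # macro-economic variables

# Sub-model for variable selection: 2 hidden layers of 5 nodes.
selector_shapes = layer_shapes(N_CREDIT_CANDIDATES, hidden_layers=2, nodes_per_layer=5)

# Final model: 3 hidden layers of 5 nodes, fed the selected credit variables
# (assumed here to be joined with the macro-economic variables).
final_shapes = layer_shapes(N_SELECTED + N_MACRO, hidden_layers=3, nodes_per_layer=5)

print("selector layers:", selector_shapes, "params:", count_parameters(selector_shapes))
print("final layers:   ", final_shapes, "params:", count_parameters(final_shapes))
```

With these hypothetical input counts, both networks stay under 200 trainable parameters, which illustrates how compact the described architecture is.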
As a result, using credit information variables together with macro-economic variables increased prediction performance by 3.48 percentage points (R² = 0.981) compared with the Ridge model using only macro-economic variables, and the DNN final model increased performance by 4.74 percentage points (R² = 0.993).
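To make the reported figures easier to read, the following Python sketch shows how R² is computed and how a gain in percentage points relates two R² values. The helper names are hypothetical. The back-calculated baseline assumes that both gains are measured against the same macro-only Ridge model and that the gains are differences in R², which the abstract suggests but does not state explicitly.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def percentage_point_gain(r2_new, r2_base):
    """Difference between two R² values, expressed in percentage points."""
    return (r2_new - r2_base) * 100.0


def implied_baseline(r2_new, gain_pp):
    """Baseline R² implied by a reported R² and its percentage-point gain."""
    return r2_new - gain_pp / 100.0


# Figures reported in the abstract; the baselines are derived, not reported.
ridge_baseline = implied_baseline(0.981, 3.48)
dnn_baseline = implied_baseline(0.993, 4.74)
print(round(ridge_baseline, 4), round(dnn_baseline, 4))
```

Under that assumption, both reported gains point to a macro-only baseline R² of roughly 0.946, so the two figures are consistent with each other up to rounding.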
