Imbalance Handling with Combination of Deep Variational Autoencoder and NEATER

Author(s): Divye Singh, Jayaraman Valadi, Hrushikesh Bhosle, Aamod Sane, Kanchan Kalunge

Abstract

In real-world applications, class imbalance is a common problem encountered in areas ranging from medical diagnosis to anomaly detection. An imbalanced class distribution makes extracting useful information challenging for many popular algorithms. In this situation, optimizing overall accuracy can heavily skew predictions toward the majority class label. Consequently, the false positive rate increases. Several methods have been introduced to address this problem, but they are less effective when the minority class has very few examples. The growing popularity of deep learning frameworks has led to the development of synthetic example generators such as the generative adversarial network (GAN) and the variational autoencoder (VAE). A variational autoencoder is a deep learning-based generative modelling technique that uses variational inference to learn the data distribution. In this paper, we propose a synergistic over-sampling method that generates informative synthetic minority class data by filtering noise from the over-sampled examples. A disentangled variational autoencoder generates the synthetic examples, and a game-theory-based filtering algorithm, NEATER, carries out the filtering. This algorithm efficiently handles the filtering of noisy examples by treating it as a non-cooperative game. Experimental results on several real-life imbalanced datasets, taken from UCI and KEEL, demonstrate the effectiveness of the proposed method for binary classification problems.
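The filtering step described in the abstract can be illustrated with a small sketch. The abstract does not give NEATER's payoffs or update rule, so everything here is an assumption made for illustration: the function name `neater_like_filter`, the k-nearest-neighbour similarity payoff, the replicator-dynamics update, and the 0.5 threshold are all hypothetical and are not taken from the paper. The synthetic points are assumed to have been produced already by some generator (a disentangled VAE in the paper); the sketch covers only the idea of treating each synthetic example as a player that settles on a class label.

```python
import math


def neater_like_filter(X_orig, y_orig, X_syn, k=3, iters=20):
    """Keep synthetic minority points whose final strategy favours class 1.

    Simplified stand-in for a game-theoretic filter (NOT the paper's NEATER):
    original points play fixed pure strategies, while each synthetic point
    holds a mixed strategy over {majority=0, minority=1} and updates it with
    discrete replicator dynamics.
    """
    points = list(X_orig) + list(X_syn)
    n_orig = len(X_orig)
    # p[i] = probability that point i plays "minority" (label 1).
    p = [float(label) for label in y_orig] + [0.5] * len(X_syn)

    # For each synthetic point, precompute its k nearest neighbours
    # together with a similarity weight 1 / (1 + distance).
    neighbours = []
    for i in range(n_orig, len(points)):
        nearest = sorted(
            (math.dist(points[i], points[j]), j)
            for j in range(len(points)) if j != i
        )[:k]
        neighbours.append([(1.0 / (1.0 + d), j) for d, j in nearest])

    for _ in range(iters):
        new_p = p[:]
        for s, nbrs in enumerate(neighbours):
            i = n_orig + s
            # Payoff of each pure strategy: similarity-weighted agreement
            # with the current strategies of the neighbours.
            u1 = sum(w * p[j] for w, j in nbrs)
            u0 = sum(w * (1.0 - p[j]) for w, j in nbrs)
            avg = p[i] * u1 + (1.0 - p[i]) * u0
            if avg > 0:
                new_p[i] = p[i] * u1 / avg  # replicator update
        p = new_p

    # Synthetic points that lean toward the majority class are treated as noise.
    return [x for x, prob in zip(X_syn, p[n_orig:]) if prob > 0.5]


# Toy data: five majority points near the origin, two minority points near (5, 5).
X_orig = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (5, 5), (5, 6)]
y_orig = [0, 0, 0, 0, 0, 1, 1]
# Two hypothetical generator outputs: one plausible, one inside the majority region.
X_syn = [(5.2, 5.4), (0.4, 0.6)]

kept = neater_like_filter(X_orig, y_orig, X_syn)
print(kept)
```

In this toy run the synthetic point near the minority cluster is kept, while the one surrounded by majority examples is discarded as noise. A real pipeline would pass the filtered points, together with the original data, to the downstream binary classifier.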
