
Imbalance Handling with Combination of Deep Variational Autoencoder and NEATER

Author(s): Divye Singh, Jayaraman Valadi, Hrushikesh Bhosle, Aamod Sane, Kanchan Kalunge

Abstract

In real-world applications, class imbalance is a common problem encountered in areas ranging from medical diagnosis to anomaly detection. This imbalanced class distribution makes extracting useful information very challenging for many popular algorithms. In this situation, optimizing overall accuracy can heavily skew the predictions toward the majority class label. Consequently, the false positive rate increases. Several methods have been introduced to address this problem, but they are less effective when the minority class has very few examples. The growing popularity of deep learning frameworks has led to the development of synthetic example generators such as the generative adversarial network (GAN) and the variational autoencoder (VAE). The variational autoencoder is a deep learning-based generative modelling technique that uses variational inference to learn the data distribution. In this paper, we propose a synergistic over-sampling method that generates informative synthetic minority class data by filtering the noise from the over-sampled examples. To generate the synthetic examples, a disentangled variational autoencoder is used, while the filtering is carried out using a game-theory-based filtering algorithm, NEATER. This algorithm efficiently handles the filtering of noisy examples as a non-cooperative game. The experimental results on several real-life imbalanced datasets, taken from UCI and KEEL, demonstrate the effectiveness of the proposed method for binary classification problems.
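The generate-then-filter idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: a per-feature Gaussian fitted to the minority class stands in for the disentangled VAE generator, and a simple k-nearest-neighbour vote stands in for the game-theoretic NEATER filter. The function names `oversample_gaussian` and `knn_filter`, the toy data, and the parameter `k` are all hypothetical and chosen only for illustration.

```python
import math
import random


def oversample_gaussian(minority, n_new, rng):
    """Stand-in generator (NOT a VAE): sample each feature independently
    from a Gaussian fitted to the minority class."""
    dims = len(minority[0])
    n = len(minority)
    means = [sum(p[d] for p in minority) / n for d in range(dims)]
    stds = [
        math.sqrt(sum((p[d] - means[d]) ** 2 for p in minority) / n)
        for d in range(dims)
    ]
    return [
        tuple(rng.gauss(means[d], stds[d]) for d in range(dims))
        for _ in range(n_new)
    ]


def knn_filter(synthetic, minority, majority, k=3):
    """Stand-in filter (NOT NEATER): keep a synthetic point only if more than
    half of its k nearest real neighbours belong to the minority class."""
    labelled = [(p, 1) for p in minority] + [(p, 0) for p in majority]
    kept = []
    for s in synthetic:
        nearest = sorted(labelled, key=lambda item: math.dist(s, item[0]))[:k]
        minority_votes = sum(label for _, label in nearest)
        if minority_votes * 2 > k:
            kept.append(s)
    return kept


# Toy imbalanced data: 200 majority points, 10 minority points (assumed values).
rng = random.Random(0)
majority = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200)]
minority = [(rng.gauss(3.0, 0.5), rng.gauss(3.0, 0.5)) for _ in range(10)]

synthetic = oversample_gaussian(minority, 50, rng)      # over-sampling step
kept = knn_filter(synthetic, minority, majority, k=3)   # noise-filtering step
print(f"generated {len(synthetic)} synthetic points, kept {len(kept)} after filtering")
```

Only the filtered points would be added to the training set. Swapping in a trained VAE decoder for the generator and NEATER for the filter would keep the same two-stage structure.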
