Enhancing Zero-Shot Image Classification: A Triad Approach with Prompt Refinement, Confidence Calibration, and Ensembling

Author(s): Sabarish Vadarevu, Raghav Mehta, Rakshith Sundaraiah, Vijay Karamcheti

CLIP (Contrastive Language-Image Pre-training) excels at zero-shot image classification across diverse domains, making it an ideal candidate for pre-labelling unlabelled datasets. This paper introduces three pivotal enhancements designed to improve CLIP-based pre-labelling without the need for labelled data. First, we introduce prompt refinement using a large language model (GPT-3.5-Turbo) to generate more descriptive prompts, significantly boosting accuracy on various datasets. Second, we address overconfident predictions through confidence calibration, achieving improved results without the need for a separate labelled validation set.
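To illustrate the calibration idea, temperature scaling is one standard way to soften overconfident softmax outputs. The sketch below is a minimal NumPy example under that assumption, not the paper's actual method; the function names are hypothetical.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def calibrate(logits, temperature=1.0):
    # Dividing logits by T > 1 flattens the distribution,
    # reducing overconfidence without changing the predicted class.
    return softmax(logits / temperature)

logits = np.array([[8.0, 2.0, 1.0]])
print(calibrate(logits, temperature=1.0).max())  # close to 1.0: overconfident
print(calibrate(logits, temperature=4.0).max())  # noticeably softer confidence
```

Because scaling by a positive temperature preserves the ordering of the logits, the argmax (and hence the assigned label) is unchanged; only the reported confidence is tempered.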

Lastly, we leverage the inductive biases of CLIP and DINOv2 through ensembling, demonstrating a substantial boost in zero-shot labelling accuracy. Experimental results across various datasets consistently demonstrate enhanced performance, particularly in handling ambiguous classes. This work not only addresses limitations in CLIP but also provides valuable insights for advancing multimodal models in real-world applications.
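One simple way to combine two models with different inductive biases is to average their per-class probabilities. The sketch below is a hypothetical NumPy illustration of that idea, not the ensembling scheme used in the paper; `ensemble_probs` and the weighting are assumptions for illustration.

```python
import numpy as np

def ensemble_probs(probs_a, probs_b, weight=0.5):
    # Weighted average of per-class probabilities from two models,
    # e.g. a CLIP zero-shot head and a DINOv2-feature-based classifier.
    combined = weight * probs_a + (1.0 - weight) * probs_b
    # Renormalise so each row remains a valid distribution.
    return combined / combined.sum(axis=-1, keepdims=True)

clip_probs = np.array([[0.6, 0.3, 0.1]])
dino_probs = np.array([[0.2, 0.7, 0.1]])
print(ensemble_probs(clip_probs, dino_probs))  # → [[0.4 0.5 0.1]]
```

When the two models disagree on ambiguous classes, the averaged distribution lets the more confident model dominate, which is one intuition behind ensembling complementary backbones.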
