The Machine Learning Developers Summit (MLDS) 2024 in Bengaluru witnessed an illuminating session by Sriram Gudimella, Senior Manager of Analytics at Tredence. With a technologist’s heart and a management degree, Sriram delved into the intriguing realm of mitigating hallucinations in large language models. Tackling challenges in the marine classification domain, Sriram shared a journey marked by persistence, innovative strategies, and a triumphant 81% straight-through processing rate.
Why Dive into Hallucination Mitigation?
Sriram began by addressing the pivotal question: why embark on the journey of understanding and handling hallucinations in language models? The trigger was a unique challenge faced by a prestigious US marine classification company. Tasked with certifying vessels’ seaworthiness and advising on global regulations, the company grappled with a flood of queries arriving from across the globe and across time zones. Responses often took a week, a serious problem in a highly regulated industry.
Hallucinations in Marine Classification
Sriram elucidated the challenges faced during the implementation. The first hurdle encountered was hallucination in responses. In the context of language models, a hallucination is an answer that sounds accurate but turns out to be incorrect on verification. Sriram illustrated the point with a quick audience quiz about the capital of France, showing how a misleading context can steer a model toward a confidently wrong answer. In the marine classification scenario, such hallucinations posed a serious threat to the accuracy of responses.
Strategies in Action: Navigating Hallucinations
Sriram outlined the multifaceted strategies employed to overcome hallucinations. The journey started with a proposal to adopt Jina, a tool to answer queries and reduce the burden on human responders. Challenges persisted, however, especially with fluctuating responses and question paraphrasing. Temperature settings, acronym handling, and context selection emerged as crucial techniques. Sriram emphasized continuous monitoring, iterative improvement, and collaboration with Azure OpenAI, highlighting the importance of prompt engineering for precise results.
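The session did not walk through code, but a minimal sketch of the kind of prompt-side controls described above might look like the following: a low temperature to curb fluctuating answers, an explicit acronym glossary, and a prompt that constrains the model to the retrieved context. The deployment name, the glossary entries, and the retrieval step are illustrative assumptions, not details from the talk.

```python
# Minimal sketch (assumptions marked): Azure OpenAI chat completion with
# low temperature, an acronym glossary, and context-constrained answering.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
    api_key="<your-api-key>",                                   # placeholder
    api_version="2024-02-01",
)

# Hypothetical glossary: acronyms are spelled out so the model does not
# resolve them against the wrong domain.
ACRONYMS = {
    "IMO": "International Maritime Organization",
    "SOLAS": "Safety of Life at Sea convention",
}

def answer(question: str, retrieved_passages: list[str]) -> str:
    glossary = "\n".join(f"{k}: {v}" for k, v in ACRONYMS.items())
    system = (
        "You answer marine classification queries.\n"
        "Use ONLY the provided context. If the answer is not in the context, "
        "say you don't know.\n"
        f"Acronym glossary:\n{glossary}"
    )
    context = "\n\n".join(retrieved_passages)
    response = client.chat.completions.create(
        model="gpt-4o",   # Azure deployment name, an assumption for this sketch
        temperature=0,    # low temperature reduces run-to-run fluctuation
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```

The design choice worth noting is that every control lives in the prompt or the request parameters, so it can be tuned iteratively and monitored without retraining anything.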
Deeper Challenges in Multi-Domain Expansion
Expanding the model to multiple domains brought additional challenges. Sriram shared the complexities arising from fluctuating responses and inaccurate passage selection. Because the same acronym can carry different meanings across domains, the model often struggled in areas like marine classification. The strategy shifted to domain constraints, context selection, and classification models that route each query to the right domain to improve accuracy.
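As an illustration of the routing idea, a lightweight classifier can assign each query to a single domain before retrieval, so that passage selection and acronym expansion happen within that domain only. The labels, training examples, and model choice below are assumptions made for the sketch, not the team’s actual setup.

```python
# Illustrative domain router: TF-IDF features plus logistic regression,
# trained on a tiny set of hypothetical labelled queries.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical historical queries tagged with their domain.
train_queries = [
    "What class notation applies to an LNG carrier hull survey?",
    "When is the next annual survey due for this vessel?",
    "How do I reset my account password on the portal?",
    "Which ballast water regulations apply in US waters?",
]
train_domains = ["classification", "survey", "support", "regulatory"]

router = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
router.fit(train_queries, train_domains)

def route(query: str) -> str:
    """Return the predicted domain; retrieval is then constrained to that domain's index."""
    return router.predict([query])[0]

print(route("What are the SOLAS requirements for fire doors?"))
```

Constraining retrieval to the predicted domain is what keeps an acronym like IMO from being resolved against the wrong knowledge base.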
Achieving 81% Straight-Through Processing
Despite the challenges, Sriram and the team achieved a commendable 81% straight-through processing rate. The implementation cut response times for marine classification queries and surfaced relevant information for subject matter experts. Transparency into the response process increased by 40%, and the platform’s adoption and engagement soared, a significant win in the quest to streamline and enhance marine classification processes.
Conclusion
Sriram Gudimella’s talk at MLDS 2024 showcased both the challenges and the innovative strategies employed to mitigate hallucinations in large language models. The journey from conceptualization to implementation, overcoming domain-specific hurdles, and ultimately achieving success serves as an inspiration for developers and organizations navigating the seas of generative AI. Beyond the intricacies of marine classification, the talk offered valuable insights into the broader problem of managing hallucinations in language models. As MLDS 2024 unfolded, Sriram’s presentation stood out as a beacon of innovation and problem-solving in the ever-evolving field of machine learning.