IntelliQSense: An intelligent, real-time Query Autocompletion Framework using GPT-2

Author(s): Taaniya Arora, Shashank Srivastava, Neha Prabhugaonkar

Abstract

Query Autocompletion (QAC) is a common feature of text-input applications that completes a user’s partially typed prefix. It has primarily been studied for search applications, where queries are short sequences or phrases. We present a novel approach to QAC for a Question Answering system in an Augmented Analytics platform, where queries are business and analytical questions posed in natural language. Our approach combines semantic search with natural language generation via beam search to complete questions. To enable generative completion in natural language and to handle unseen prefixes, we use a pretrained distilgpt2 model fine-tuned for the question completion task. In addition, we describe a method for synthesizing training data from a limited set of past queries, which is used to fine-tune the model and generate quality completions. We evaluate the proposed method on two datasets from different business domains, assessing fluency, text style, and business sense for the application using perplexity and precision. Experimental results demonstrate that our framework is effective and generalizes well to unseen prefixes, with acceptable latency once suitable optimizations are applied.
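To make the beam search step concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical toy next-token table (`TOY_MODEL`) in place of the fine-tuned distilgpt2 model, and the function name `beam_search_complete` and its parameters are illustrative only. It shows how several candidate completions of a prefix can be expanded in parallel and ranked by cumulative log-probability.

```python
import math

# Hypothetical toy next-token table standing in for a fine-tuned language model.
# Keys are the previous token; values map each possible next token to a probability.
TOY_MODEL = {
    "what": {"is": 0.6, "are": 0.4},
    "is": {"the": 0.7, "total": 0.3},
    "are": {"the": 1.0},
    "the": {"total": 0.5, "average": 0.3, "sales": 0.2},
    "total": {"sales": 0.8, "revenue": 0.2},
    "average": {"revenue": 0.6, "sales": 0.4},
    "sales": {"<eos>": 1.0},
    "revenue": {"<eos>": 1.0},
}


def beam_search_complete(prefix, beam_width=2, max_new_tokens=5, top_k=3):
    """Return up to top_k completions of prefix, best first."""
    tokens = prefix.lower().split()
    if not tokens:
        return []

    beams = [(0.0, tokens)]  # (cumulative log-probability, token sequence)
    finished = []

    for _ in range(max_new_tokens):
        candidates = []
        for score, seq in beams:
            # An unseen last token yields no continuations in this toy model.
            for tok, prob in TOY_MODEL.get(seq[-1], {}).items():
                new_score = score + math.log(prob)
                if tok == "<eos>":
                    finished.append((new_score, seq))
                else:
                    candidates.append((new_score, seq + [tok]))
        if not candidates:
            break
        # Keep only the beam_width highest-scoring partial sequences.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]

    finished.sort(key=lambda c: c[0], reverse=True)
    return [" ".join(seq) for _, seq in finished[:top_k]]


completions = beam_search_complete("what is")
```

In a real system the lookup into `TOY_MODEL` would be replaced by the language model's next-token distribution; the pruning to `beam_width` candidates per step is what keeps the search tractable.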
