SEAD: Simple Ensemble and Knowledge Distillation Framework for Natural Language Understanding

Author(s): Moyan Mei, Rohit Sroch

Abstract:

Language models based on deep learning are widely used in applications such as text summarization, conversational bots, question answering, text generation, translation, semantic search, and information retrieval. Many researchers and industry practitioners use pre-trained language models (PLMs) to build these use cases. With the widespread use of PLMs, there has been increased research on how to make them applicable, especially in limited-resource or low-latency, high-throughput scenarios. One of the dominant approaches is knowledge distillation (KD), where a smaller model is trained with guidance from a large PLM. While there are many successful designs for learning knowledge from teachers, it remains unclear how students can learn better. Inspired by real university teaching processes, in this work we further explore knowledge distillation and propose a simple yet effective framework, SEAD, that improves task-specific generalization by utilizing multiple teachers. Our experiments show that SEAD outperforms other popular KD methods and achieves comparable or superior performance to its teacher models, such as BERT, on a total of 13 tasks from the GLUE and SuperGLUE benchmarks.
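
To make the idea of distilling from multiple teachers concrete, here is a minimal sketch of a generic ensemble distillation loss. The abstract does not specify SEAD's actual objective, so everything below is an assumption for illustration: the teachers' temperature-softened probabilities are averaged uniformly, the student is pulled toward that average with a KL term, and a standard cross-entropy term on the hard label is mixed in. The function names and the `temperature` and `alpha` parameters are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_soft_targets(teacher_logits, temperature):
    """Average the softened distributions of all teachers (uniform weights, an assumption)."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n_teachers = len(probs)
    n_classes = len(probs[0])
    return [sum(p[i] for p in probs) / n_teachers for i in range(n_classes)]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Generic multi-teacher KD loss; not necessarily the loss used by SEAD.

    alpha weights the soft (teacher) term against the hard-label term.
    """
    targets = ensemble_soft_targets(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    # KL(teacher ensemble || student) on the softened distributions
    kl = sum(t * math.log(t / s) for t, s in zip(targets, student_soft) if t > 0)
    # Cross-entropy against the ground-truth label at temperature 1
    ce = -math.log(softmax(student_logits)[label])
    # The T^2 factor is the usual rescaling of the soft term's gradients
    return alpha * (temperature ** 2) * kl + (1 - alpha) * ce


if __name__ == "__main__":
    teachers = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5]]  # two hypothetical teachers
    student = [1.0, 0.8, 0.1]
    print(distillation_loss(student, teachers, label=0))
```

With `alpha=1.0` the student is trained only on the ensemble's soft targets, and with `alpha=0.0` the loss reduces to ordinary supervised cross-entropy.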

Association of Data Scientists
