Lattice | Volume-4 ISSUE-1

Weighted clustering on fast sentence embeddings to determine themes from large unstructured data

Author(s): Paritosh Sinha, Mohan Krishna Askani


Abstract

Most engineering product improvements are driven by feedback from users and engineers. B2C products, such as those used to target customers, send personalised communications, or manage order requests, track event-level actions and failures to improve product performance. However, the volume of failure logs (often on the order of a billion) and their unstructured nature (machine logs that are not written for human readers) often hinder the detection of underlying themes in event failures. This paper discusses an efficient approach to tuning and leveraging a language model for embedding generation. The embeddings are then grouped by a weighted clustering technique into automatically detected failure themes. The paper also proposes methods for managing embeddings that improve the algorithm's performance while retaining its focus on efficiency and computation time. Our experiments show that the proposed technique performs comparably to the latest language models while taking less than one-tenth of the overall computation time.
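To make the idea of weighted clustering on embeddings concrete, here is a minimal sketch and not the authors' implementation. It assumes that identical log messages are collapsed into one entry whose weight is its occurrence count, and it substitutes a toy hashed bag-of-words vector for the tuned language model described above. The names `embed`, `weighted_kmeans`, `raw_logs`, and the sample messages are all hypothetical, chosen only for illustration.

```python
import math
import re
import zlib
from collections import Counter

DIM = 256  # size of the toy embedding; an arbitrary choice for this sketch


def embed(text, dim=DIM):
    """Toy stand-in for a sentence embedding: hashed bag of words, L2-normalised."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z]+", text.lower()):
        # crc32 is used because it is stable across runs, unlike built-in hash().
        vec[zlib.crc32(token.encode()) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_kmeans(points, weights, k, iters=20):
    """K-means in which each centroid is the weighted mean of its members."""
    # Deterministic farthest-first initialisation, starting from the heaviest point.
    start = max(range(len(points)), key=lambda i: weights[i])
    centroids = [points[start]]
    while len(centroids) < k:
        far = max(range(len(points)),
                  key=lambda i: min(sq_dist(points[i], c) for c in centroids))
        centroids.append(points[far])

    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every unique message.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Update step: frequent messages pull the centroid harder than rare ones.
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            total = sum(weights[i] for i in members)
            if total == 0:
                continue  # keep the previous centroid if a cluster is empty
            centroids[j] = [
                sum(weights[i] * points[i][d] for i in members) / total
                for d in range(len(points[0]))
            ]
    return labels, centroids


# Hypothetical failure logs with many exact repeats.
raw_logs = (
    ["connection timeout while sending email"] * 500
    + ["connection timeout while sending sms"] * 300
    + ["invalid payload missing field order id"] * 120
    + ["invalid payload missing field customer id"] * 80
)

# Collapse duplicates so that only unique messages are embedded and clustered.
counts = Counter(raw_logs)
messages = list(counts)
weights = [counts[m] for m in messages]

labels, _ = weighted_kmeans([embed(m) for m in messages], weights, k=2)

# Size of each theme measured in original log lines, not unique messages.
theme_sizes = Counter()
for lab, w in zip(labels, weights):
    theme_sizes[lab] += w

for msg, lab, w in zip(messages, labels, weights):
    print(f"theme {lab} (weight {w}): {msg}")
```

In this sketch the clustering cost depends on the number of unique messages rather than the number of raw log lines, which is one plausible way a weighted scheme could save computation on highly repetitive logs. Whether the paper uses counts as weights is an assumption here, not something the abstract states.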
