Traditional search methods often fall short in capturing the semantics and complexity of data. This is where vector search emerges as a solution. By representing data as numerical vectors, vector search unlocks the ability to understand semantic meaning and relationships. MongoDB’s Atlas Vector Search is a tool that enables users to build intelligent applications by combining the flexibility of a document database with the efficiency of vector search. This article provides a hands-on tutorial on Atlas Vector Search, with explanations along the way.
Table of Contents
- Understanding Vector Search
- Overview of Atlas Vector Search
- Implementation of Atlas Vector Search in RAG Application
Understanding Vector Search
Vector search is a technique for finding and retrieving information based on the meaning and context of data. It uses the concept of numerical representations known as vectors. Vectors are mathematical objects having both magnitude and direction. They can be represented as arrays of numbers, where each element represents a specific dimension or component.
Vector search builds on the concepts of embeddings, vector spaces, and semantic meaning, combined with similarity search. An embedding, a numerical representation of the data, is placed in a high-dimensional space. The position of a vector in this space reflects its semantic meaning: similar items have vectors that lie closer together.
Instead of treating words as isolated entities, vector embeddings represent them as multidimensional vectors. Each dimension captures a specific aspect of the word, like its meaning, part of speech, or sentiment. Words with similar meanings tend to have similar vector representations in this multidimensional space. For example, “helicopter”, “drone” and “rocket” are close neighbours in this space, reflecting their semantic closeness.
The search query is also transformed into a vector. The system then calculates the distance between the query vector and all other vectors in the data. The items with vectors closest to the query vector are considered most similar and are ranked accordingly. As a final output, the system returns the top-ranked items as search results.
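The ranking step above can be sketched in plain Python. The snippet below is a toy illustration, not how Atlas implements it: it uses made-up 3-dimensional embeddings (real embedding models output hundreds or thousands of dimensions) and cosine similarity, one common distance measure for comparing vectors.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of the
    # vector magnitudes; 1.0 means the vectors point the same way.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional embeddings; values are illustrative only.
embeddings = {
    "helicopter": [0.9, 0.8, 0.1],
    "drone":      [0.8, 0.9, 0.2],
    "banana":     [0.1, 0.2, 0.9],
}
query = [0.85, 0.85, 0.15]  # imagine this is the embedded query "flying machine"

# Rank stored items by similarity of their vectors to the query vector.
ranked = sorted(embeddings,
                key=lambda w: cosine_similarity(query, embeddings[w]),
                reverse=True)
print(ranked)  # "helicopter" and "drone" outrank "banana"
```

The items whose vectors score highest against the query vector come first, which is exactly the behaviour vector search systems scale up to millions of documents.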
Overview of Atlas Vector Search
MongoDB Atlas Search is a feature integrated directly into MongoDB’s cloud database service. It combines the robust document storage capabilities of MongoDB with advanced search functionalities, allowing users to perform complex queries on their data without requiring a separate search engine.
Atlas Search enables full-text search across text fields in MongoDB documents. Users can search for specific words, phrases, or even fuzzy matches within the data. This goes beyond simple text matching, as it incorporates advanced search elements such as relevance scoring, which helps rank results based on the degree of their match with the search query.
Another standout feature of Atlas Search is its seamless integration with MongoDB’s aggregation pipeline. This allows users to combine data processing operations with search queries, enabling data manipulation and analysis in a single operation.
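To make this concrete, here is a sketch of an aggregation pipeline that combines an Atlas Search `$search` stage with ordinary aggregation stages. The index name, field names, and collection used here ("default", "title", "category", "articles") are illustrative assumptions, not part of this tutorial's dataset.

```python
# Hypothetical pipeline: a full-text $search stage followed by
# regular aggregation stages, all in one operation.
pipeline = [
    {
        "$search": {
            "index": "default",
            "text": {"query": "vector database", "path": "title", "fuzzy": {}},
        }
    },
    # Ordinary aggregation stages then run over the search results:
    {"$match": {"category": "tutorial"}},
    {"$project": {"title": 1, "score": {"$meta": "searchScore"}}},
    {"$limit": 5},
]

# Against a live Atlas cluster this would be executed as:
# results = db.articles.aggregate(pipeline)
print(len(pipeline), "stages")
```

Because `$search` is just another pipeline stage, filtering, projection, and scoring all happen server-side in a single round trip.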
Atlas Vector Search is a feature built on top of Atlas Search that allows for similarity-based searches using vector representations of data. It provides an efficient way to store, index, and query high-dimensional vector data, which is fundamental to many LLM applications. These models often represent text, images, etc. as dense vectors, and being able to quickly find similar vectors is the key to their functionality.
Atlas Vector Search integrates seamlessly with a wide variety of LLMs and frameworks such as LangChain, LlamaIndex, OpenAI, Cohere, Hugging Face, Haystack, Microsoft Semantic Kernel and AWS, which further increases its utility and range of applications.
To perform vector search in Atlas, users need to create an Atlas Vector Search index. These indexes are separate from basic database indexes and are used to efficiently retrieve documents that contain vector embeddings at query time. In the vector search index definition, users specify which fields in the collection contain embeddings, so that vector search can be performed against those fields.
Atlas Vector Search supports Approximate Nearest Neighbor (ANN) search using the Hierarchical Navigable Small Worlds (HNSW) algorithm, as well as Exact Nearest Neighbor (ENN) search.
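At query time, both modes are expressed through the `$vectorSearch` aggregation stage. The sketch below shows the shape of an ANN query and an ENN query; the query vector is a placeholder (a real one would match the index's dimensionality), and `"exact": True` for ENN assumes a recent Atlas version that supports it.

```python
query_vector = [0.02, -0.17, 0.31]  # placeholder; real query embeddings have many more dimensions

# ANN query: numCandidates controls how many neighbours the HNSW
# graph considers before the top `limit` results are returned.
ann_stage = {
    "$vectorSearch": {
        "index": "vector_index",   # name of the Atlas Vector Search index
        "path": "embedding",       # field holding the stored embeddings
        "queryVector": query_vector,
        "numCandidates": 100,
        "limit": 5,
    }
}

# ENN query: "exact": True replaces numCandidates and scans all vectors.
enn_stage = {
    "$vectorSearch": {
        "index": "vector_index",
        "path": "embedding",
        "queryVector": query_vector,
        "exact": True,
        "limit": 5,
    }
}

# Against a live cluster: db.test.aggregate([ann_stage])
```

ANN trades a little recall for much lower latency on large collections; ENN guarantees the true nearest neighbours at higher cost.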
Implementation of Atlas Vector Search in RAG Application
Let’s implement MongoDB’s Atlas Vector Search and perform Retrieval-Augmented Generation (RAG).
Prerequisites:
A: Sign up/log in on the MongoDB Atlas webpage and create a default cluster (M0 – Free)
B: Make a note of the connection string – this will be used to connect to the cluster.
Once Atlas has been set up, let’s go to our Python IDE and start coding.
Step 1: Install the required libraries
pip install --quiet --upgrade llama-index llama-index-vector-stores-mongodb llama-index-embeddings-openai pymongo
Step 2: Import the libraries
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, StorageContext
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.settings import Settings
from llama_index.core.vector_stores import MetadataFilter, MetadataFilters, ExactMatchFilter, FilterOperator
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch
import getpass, os, pymongo, pprint
Step 3: Setup the OpenAI API and MongoDB Atlas Connection String
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")
ATLAS_CONNECTION_STRING = getpass.getpass("MongoDB Atlas SRV Connection String:")
Step 4: Configure LlamaIndex settings for OpenAI Embedding Model
Settings.llm = OpenAI()
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
Settings.chunk_size = 256
Settings.chunk_overlap = 20
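These settings mean documents are split into chunks of roughly 256 tokens, with 20 tokens shared between consecutive chunks so context isn't lost at the boundaries. The toy chunker below illustrates the idea using characters instead of tokens; LlamaIndex's actual splitter counts tokens and prefers sentence boundaries, so this is only a simplified sketch.

```python
def chunk_text(text, chunk_size=256, overlap=20):
    # Simplified character-based sketch of overlapping chunking.
    # Each chunk starts (chunk_size - overlap) characters after the
    # previous one, so consecutive chunks share `overlap` characters.
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "".join(chr(97 + i % 26) for i in range(600))  # 600-char dummy document
chunks = chunk_text(doc)
print([len(c) for c in chunks])  # [256, 256, 128]
```

The tail of each chunk equals the head of the next (`chunks[0][-20:] == chunks[1][:20]`), which is what keeps a sentence that straddles a boundary retrievable from at least one chunk.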
Step 5: Load the data and create a vector store
sample_data = SimpleDirectoryReader(input_files = ["data/waitgpt.pdf"]).load_data()
mongodb_client = pymongo.MongoClient(ATLAS_CONNECTION_STRING)
atlas_vector_store = MongoDBAtlasVectorSearch(
    mongodb_client,
    db_name = "llamaindex_db",
    collection_name = "test",
    index_name = "vector_index"
)
vector_store_context = StorageContext.from_defaults(vector_store = atlas_vector_store)
Step 6: Store the data as vector embeddings
vector_store_index = VectorStoreIndex.from_documents(
    sample_data, storage_context=vector_store_context, show_progress = True
)
Step 7: Set up an Atlas Vector Search index for the MongoDB cluster using the following JSON definition
{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1536,
      "similarity": "cosine"
    },
    {
      "type": "filter",
      "path": "metadata.page_label"
    }
  ]
}
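As an alternative to the Atlas UI, the same index can be created programmatically. The sketch below assumes a recent pymongo (4.7+, which supports `SearchIndexModel` with `type="vectorSearch"`) and a live cluster; the function is defined here but not executed.

```python
def create_vector_index(mongodb_client):
    # Sketch: create the vector search index from code instead of the
    # Atlas UI. Requires pymongo 4.7+ and a live Atlas cluster.
    from pymongo.operations import SearchIndexModel

    collection = mongodb_client["llamaindex_db"]["test"]
    model = SearchIndexModel(
        definition={
            "fields": [
                {
                    "type": "vector",
                    "path": "embedding",
                    "numDimensions": 1536,  # text-embedding-ada-002 output size
                    "similarity": "cosine",
                },
                {"type": "filter", "path": "metadata.page_label"},
            ]
        },
        name="vector_index",
        type="vectorSearch",
    )
    collection.create_search_index(model)
```

Either route produces the same index; the UI is convenient for a one-off tutorial, while the programmatic form suits repeatable deployments.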
We can see that the vector index has been created. Let’s implement RAG and check whether vector search is working properly.
Step 8: Implementing RAG over the vector index using vector search
vector_store_retriever = VectorIndexRetriever(index = vector_store_index, similarity_top_k = 2)
query_engine = RetrieverQueryEngine(retriever = vector_store_retriever)
response = query_engine.query('What is WaitGPT?')
print(response)
print("\nSource documents: ")
pprint.pprint(response.source_nodes)
Output:
Step 9: We can also enhance the entire operation using a ChatEngine for back-and-forth conversation
chat_engine = vector_store_index.as_chat_engine(
    chat_mode="condense_question", streaming=True
)
response_stream = chat_engine.stream_chat("What are the benefits of WaitGPT")
response_stream.print_response_stream()
Output:
WaitGPT facilitates monitoring and steering of data analysis performed by LLMs, enabling users to enhance error detection and increase their overall confidence in the results. Additionally, WaitGPT empowers users to monitor and steer data analysis performed by LLM agents, enhancing error detection rate and improving overall confidence in the results.
response_stream = chat_engine.stream_chat("How is WaitGPT able to enhance LLMs?")
response_stream.print_response_stream()
Output:
WaitGPT facilitates monitoring and steering of data analysis performed by LLMs by translating stream-based code into a growing visualization of key data operations, allowing for granular interactions. This approach empowers users to actively monitor and steer the data analysis process, leading to enhanced error detection rates and increased overall confidence in the results.
Step 10: Similarity search can also be implemented on the vector index
retriever = vector_store_index.as_retriever(similarity_top_k=2)
nodes = retriever.retrieve("How does WaitGPT address reliability issues and user challenges in LLM apps?")
for node in nodes:
    print(node)
Output:
Node ID: 640da0ef-c300-4876-af75-bf2e5c90af43
Text: By translating stream-based code into a growing visualization of the key data operations and affording granular interactions, WaitGPT empowers users to monitor and steer data analysis performed by LLM agents. A user study (N=12) covering basic data analysis tasks demonstrated that WaitGPT could enhance error detection rate and improve overall co…
Score: 0.926
Node ID: a1d10be9-fd5e-41e2-a25f-8171f77c7087
Text: Publication rights licensed to ACM. This is the author’s version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in The 37th Annual ACM Symposium on User Interface Software and Technology (UIST ’24), October 13–16, 2024, Pittsburgh, PA, USA…
Score: 0.926
Final Words
MongoDB’s Atlas Vector Search provides a comprehensive platform for implementing vector search, combining the strengths of a robust document database with advanced search capabilities. Its integration with and support for various LLMs and frameworks make it a versatile choice for building intelligent applications and agents.