
GraphRAG with MongoDB Atlas: Integrating Knowledge Graphs with LLMs


A key challenge AI developers face is providing context to large language models (LLMs) to build reliable AI-enhanced applications; retrieval-augmented generation (RAG) is widely used to tackle this challenge. While vector-based RAG, the standard (or baseline) implementation of retrieval-augmented generation, works well for many use cases, it gives LLMs little ability to reason over relationships between diverse concepts scattered throughout large knowledge bases. As a result, vector RAG-enhanced LLM outputs can disappoint, and even mislead, end users.

Now generally available, MongoDB Atlas’ new LangChain integration for GraphRAG—a variation of RAG architecture that integrates a knowledge graph with LLMs—can help address these limitations.

GraphRAG: Connecting the dots

First, a short explanation of knowledge graphs: a knowledge graph is a structured representation of information in which entities (such as people, organizations, or concepts) are connected by relationships. Knowledge graphs work like maps, showing how different pieces of information relate to each other. This structure helps computers understand connections between facts, answer complex questions, and find relevant information more easily.
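
To make this concrete, a knowledge graph can be reduced to entities plus the relationships that connect them. The sketch below uses hypothetical names purely for illustration:

```python
# A tiny, hypothetical knowledge graph: entities (people, organizations,
# concepts) and the relationships that connect them.
entities = {
    "Jane Smith": {"type": "Person"},
    "ExampleCorp": {"type": "Organization"},
    "Sustainability": {"type": "Concept"},
}

# Each relationship is a (source, relation, target) triple.
relationships = [
    ("Jane Smith", "works_for", "ExampleCorp"),
    ("ExampleCorp", "focuses_on", "Sustainability"),
]
```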

Traditional RAG applications split knowledge data into chunks, vectorize them into embeddings, and then retrieve chunks of data through semantic similarity search; GraphRAG builds on this approach. But instead of treating each document or chunk as an isolated piece of information, GraphRAG considers how different pieces of knowledge are connected and relate to each other through a knowledge graph.

Figure 1. Embedding-based vector search vs. entity-based graph search.

GraphRAG improves RAG architectures in three ways:

First, GraphRAG can improve response accuracy. Multiple publications have shown that integrating knowledge graphs into the retrieval component of RAG yields significant accuracy improvements. For example, benchmarks in the AWS investigation, “Improving Retrieval Augmented Generation Accuracy with GraphRAG,” demonstrated nearly double the number of correct answers compared to traditional embedding-based RAG.

Second, GraphRAG offers more explainability. Embedding-based methods rely on numerical vectors, which makes it difficult to interpret why certain chunks are related. A graph-based approach, by contrast, provides a visual and auditable representation of document relationships, giving clearer insight into why certain data is being retrieved. These insights can help optimize data retrieval patterns and further improve accuracy.

Finally, GraphRAG can help answer questions that vector-based RAG is not well suited for—particularly when understanding a knowledge base's structure, hierarchy, and links is essential. Vector-based RAG struggles in these cases because breaking documents into chunks loses the big picture.

For example, prompts like “What are the themes covered in the 2025 strategic plan?” are not handled well, because the semantic similarity between the prompt’s keywords (like “themes”) and the actual themes in the document may be weak, especially if those themes are scattered across different sections. Another prompt, such as “What is John Doe’s role in ACME’s renewable energy projects?”, also presents challenges: if the relationships between the person, the company, and the related projects are mentioned in different places, it becomes difficult for vector-based RAG to provide an accurate response.

Traditional vector-based RAG can struggle in cases like these because it relies solely on semantic similarity search. The logical connections between different entities—such as contract clauses, legal precedents, financial indicators, and market conditions—are often complex and lack semantic keyword overlap. Making logical connections across entities is often referred to as multi-hop retrieval or reasoning in GraphRAG.
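
To make the multi-hop idea concrete, here is a minimal sketch (entity names and schema are hypothetical) of how the John Doe example could be stored as separate entity records, and how answering the question becomes a two-hop traversal rather than a single similarity lookup:

```python
# Hypothetical entity records; each fact may come from a different
# source document or chunk.
entities = [
    {"_id": "john_doe", "type": "Person",
     "relationships": [{"target": "acme", "type": "works_for"}]},
    {"_id": "acme", "type": "Organization",
     "relationships": [{"target": "project_sunrise", "type": "sponsors"}]},
    {"_id": "project_sunrise", "type": "Project",
     "attributes": {"domain": "renewable energy"}},
]

def neighbors(entity_id):
    """Return the ids of entities directly connected to entity_id."""
    doc = next(e for e in entities if e["_id"] == entity_id)
    return [r["target"] for r in doc.get("relationships", [])]

# "What is John Doe's role in ACME's renewable energy projects?" requires
# hopping john_doe -> acme -> project_sunrise; no single chunk holds all
# three facts, which is what trips up pure semantic similarity search.
two_hops = {second for first in neighbors("john_doe") for second in neighbors(first)}
print(two_hops)  # {'project_sunrise'}
```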

However, GraphRAG has its own limitations, and whether it achieves better accuracy than vector-based RAG depends on the use case:

  • It introduces an extra step: creating the knowledge graph by using LLMs to extract entities and relationships. Maintaining and updating the graph as new data arrives becomes an ongoing operational burden. Unlike vector-based RAG, which only requires embedding and indexing—a relatively lightweight and fast process—GraphRAG depends on a capable LLM to accurately extract and map complex relationships and integrate them into the existing graph.

  • The added complexity of graph traversal can lead to response latency and scalability challenges as the knowledge base grows. Latency is closely tied to the depth of traversal and the chosen retrieval strategy, both of which must align with the specific requirements of the application.

  • GraphRAG introduces additional retrieval options, such as keyword- and entity-based retrieval, semantic similarity search on the first node, and more. While this gives developers more flexibility in the implementation, it also adds complexity.

MongoDB Atlas: A unified database for operational data, vectors, and graphs

MongoDB Atlas is perfectly suited as a unified database for documents, vectors, and graphs. As a unified platform, it’s ideal for powering LLM-based applications with vector-based or graph-based RAG. Indeed, adopting MongoDB Atlas eliminates the need for point or bolt-on solutions for vector or graph functionality, which often introduce unnecessary complexity, such as data synchronization challenges that can lead to increased latency and potential errors.

The unified approach offered by MongoDB Atlas simplifies the architecture and reduces operational overhead, but most importantly, it greatly simplifies the development experience. In practice, this means you can leverage MongoDB Atlas' document model to store rich application data, use vector indexes for similarity search, and model relationships using document references for graph-like structures.
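
As a minimal sketch of that pattern with pymongo (the collection name, field names, and embedding dimension below are assumptions for illustration):

```python
from pymongo import MongoClient
from pymongo.operations import SearchIndexModel

client = MongoClient("<your-atlas-connection-string>")
collection = client["rag_demo"]["entities"]

# One document holds operational data, a vector embedding, and
# graph-like references to related documents in the same collection.
collection.insert_one({
    "_id": "acme",
    "type": "Organization",
    "description": "ACME develops renewable energy projects.",
    "embedding": [0.01, 0.02, 0.03],  # placeholder; use a real embedding model
    "relationships": [
        {"target": "project_sunrise", "type": "sponsors"},
    ],
})

# Atlas Vector Search index over the embedding field
# (numDimensions must match your embedding model).
collection.create_search_index(
    SearchIndexModel(
        name="vector_index",
        type="vectorSearch",
        definition={
            "fields": [{
                "type": "vector",
                "path": "embedding",
                "numDimensions": 3,
                "similarity": "cosine",
            }]
        },
    )
)
```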

Implementing GraphRAG with MongoDB Atlas and LangChain

Starting from version 0.5.0, the langchain-mongodb package introduces a new class to simplify the implementation of a GraphRAG architecture.

Figure 2. GraphRAG architecture with MongoDB Atlas and LangChain.

First, it enables the automatic creation of a knowledge graph. Under the hood, it uses a specific prompt sent to an LLM of your choice to extract entities and relationships, structuring the data to be stored as a graph in MongoDB Atlas.
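
The exact prompt the integration uses is internal to the package, but conceptually the extraction step looks something like the sketch below, where the prompt wording, model choice, and output schema are all illustrative assumptions:

```python
from langchain_openai import ChatOpenAI  # any LangChain chat model could be used

llm = ChatOpenAI(model="gpt-4o")  # hypothetical model choice

extraction_prompt = """Extract the entities and relationships from the text below.
Return JSON with two arrays: "entities" (name, type) and
"relationships" (source, target, type).

Text:
{text}"""

chunk = "John Doe leads ACME's Project Sunrise, a renewable energy initiative."
response = llm.invoke(extraction_prompt.format(text=chunk))
print(response.content)
# Expected shape (actual LLM output will vary):
# {"entities": [{"name": "John Doe", "type": "Person"}, ...],
#  "relationships": [{"source": "John Doe", "target": "Project Sunrise",
#                     "type": "leads"}, ...]}
```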

Then, at query time, it sends the user’s query to the LLM to extract entities, and searches the graph to find connected entities, their relationships, and associated data. This information, along with the original query, is then sent back to the LLM to generate an accurate final response.
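
A minimal usage sketch is shown below. It assumes the MongoDBGraphStore class that langchain-mongodb 0.5.0 introduces, along with its add_documents and chat_response methods; treat the import path, parameter names, and return types as assumptions and check the package’s API reference for the current details:

```python
from langchain_core.documents import Document
from langchain_mongodb import MongoDBGraphStore  # assumed import path
from langchain_openai import ChatOpenAI

# Constructor parameters below are assumptions; verify against the
# langchain-mongodb API reference.
graph_store = MongoDBGraphStore(
    connection_string="<your-atlas-connection-string>",
    database_name="rag_demo",
    collection_name="knowledge_graph",
    entity_extraction_model=ChatOpenAI(model="gpt-4o"),
)

# 1) Build the knowledge graph: the LLM extracts entities and
#    relationships from each document and stores them in Atlas.
docs = [Document(page_content="John Doe leads ACME's renewable energy projects.")]
graph_store.add_documents(docs)

# 2) Query: entities are extracted from the question, the graph is
#    traversed, and the LLM composes a grounded answer.
answer = graph_store.chat_response("What is John Doe's role at ACME?")
print(answer)  # return type may vary; inspect it or see the docs
```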

MongoDB Atlas’ LangChain integration for GraphRAG follows an entity-based graph approach. However, you can also develop and implement your own GraphRAG with a hybrid approach using MongoDB drivers and MongoDB Atlas’ rich search and aggregation capabilities.
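
For such a custom or hybrid implementation, one option (sketched here with the collection and field names assumed in the earlier examples) is to pair Atlas Vector Search with the $graphLookup aggregation stage: use semantic similarity to find entry-point entities, then traverse their relationships in the same collection:

```python
from pymongo import MongoClient

client = MongoClient("<your-atlas-connection-string>")
collection = client["rag_demo"]["entities"]

query_vector = [0.01, 0.02, 0.03]  # embed the user query with the same model

pipeline = [
    # 1) Semantic similarity to find entry-point entities.
    {"$vectorSearch": {
        "index": "vector_index",
        "path": "embedding",
        "queryVector": query_vector,
        "numCandidates": 100,
        "limit": 3,
    }},
    # 2) Multi-hop traversal: follow relationship references up to 2 hops.
    {"$graphLookup": {
        "from": "entities",
        "startWith": "$relationships.target",
        "connectFromField": "relationships.target",
        "connectToField": "_id",
        "maxDepth": 2,
        "as": "connected_entities",
    }},
    {"$project": {"description": 1, "connected_entities._id": 1,
                  "connected_entities.description": 1}},
]

for doc in collection.aggregate(pipeline):
    print(doc["_id"], [e["_id"] for e in doc["connected_entities"]])
```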

Enhancing knowledge retrieval with GraphRAG

GraphRAG complements traditional RAG methods by enabling deeper understanding of complex, hierarchical relationships, supporting effective information aggregation and multi-hop reasoning. Hybrid approaches that combine GraphRAG with embedding-based vector search further enhance knowledge retrieval, making them especially effective for advanced RAG and agentic systems.

MongoDB Atlas’ unified database simplifies RAG implementation and its variants, including GraphRAG and other hybrid approaches, by supporting documents, vectors, and graph representations in a unified data model that can seamlessly scale from prototype to production. With robust retrieval capabilities ranging from full-text and semantic search to graph search, MongoDB Atlas provides a comprehensive solution for building AI applications. And its integration with proven developer frameworks like LangChain accelerates the development experience—enabling AI developers to build more advanced and efficient retrieval-augmented generation systems that underpin AI applications.

Ready to dive into GraphRAG? Learn how to implement it with MongoDB Atlas and LangChain.

Head over to the Atlas Learning Hub to boost your MongoDB skills and knowledge.