MCP vs. RAG

The belief that AI agents inherently "know everything on the internet" is a critical misconception that IBM Technology's Sr. AI Productivity Expert Melissa Hadley addresses directly. As Hadley explains, large language models (LLMs) on their own are "kind of like brilliant interns with literally no memory and no access to your systems". They can communicate, but they "don't know your data" and "certainly cannot act on your behalf". This limitation reinforces the truth that "AI is only as good as the data you give it". To make AI agents more accurate and useful, Hadley details two distinct methodologies for giving agents access to data: Retrieval Augmented Generation (RAG) and Model Context Protocol (MCP).

Hadley emphasizes that both RAG and MCP aim to make models "smarter and more useful" by helping AI "provide more insight, answer questions, help users while being grounded in actual information". A key similarity is that the data they access "doesn't actually live in the large language model, but is instead provided by outside knowledge". Both methods can also "reduce hallucinations by grounding the model in real-time or specialized information". The fundamental difference lies in their ultimate goals: "RAG helps models know more by pulling in the right information, while MCP helps models do more by connecting them to tools and systems that drive work". The information involved can include documents, PDFs, videos, and websites, or entire systems and applications.

Using the example of an employee preparing for a vacation, Hadley breaks down the distinct purposes, data, and processes of these two approaches.

RAG: Retrieval Augmented Generation (Knowing More)
Hadley states that RAG's main purpose is to "add information" to the LLM by providing it with additional context.

RAG allows LLMs to access and reference proprietary or specialized knowledge bases so that generated responses are "grounded in up-to-date and authoritative information". RAG focuses on retrieving "static, semi-structured, or even unstructured" data, such as documents, manuals, and PDFs. Crucially, RAG aids verification by giving the user the source of the information behind an answer, so the answer can be checked.
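
As a rough illustration of that source-tracking, the sketch below shows how a knowledge base might store each retrievable passage alongside source metadata. The Chunk class and the sample handbook entry are illustrative assumptions, not a real implementation:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str    # the passage the retriever can return
    source: str  # document the passage came from
    page: int    # location inside that document

# A tiny stand-in knowledge base: in practice these chunks would be
# produced by splitting documents such as the employee handbook.
knowledge_base = [
    Chunk(
        text="Full-time employees accrue 1.5 vacation days per month.",
        source="employee_handbook.pdf",
        page=12,
    ),
]

# Because the source travels with each chunk, a generated answer can
# cite where its information came from and be verified by the user.
for chunk in knowledge_base:
    print(f"{chunk.text} (source: {chunk.source}, p. {chunk.page})")
```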

The RAG process, as Hadley outlines it, follows five steps:

1. Ask: The user submits a question, such as, "What is our vacation policy?".
2. Retrieval: The system transforms the prompt into a search query and retrieves the most relevant data from a knowledge base, such as an employee handbook, perhaps in PDF format.
3. Return: The retrieved passage is sent back to the integration layer for context building.
4. Augmentation: The system builds an enhanced prompt for the LLM by combining the user's question with the retrieved content.
5. Generation: The LLM uses the augmented prompt to produce a grounded answer and returns it to the user.
For the vacation example, Hadley notes that RAG would help the employee "read through the employee handbook, any payroll documentation to understand maybe the company's vacation policy, how it works, how employees accrue time off, and more".
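
To make the five steps concrete, here is a minimal, self-contained sketch of the pipeline. Everything in it is an illustrative assumption: the retriever is simple keyword overlap standing in for vector search, and call_llm is a placeholder for a real model API:

```python
def retrieve(question: str, knowledge_base: list[str]) -> str:
    """Step 2 (Retrieval): return the passage sharing the most words
    with the question; a real system would use vector search."""
    words = set(question.lower().split())
    return max(knowledge_base, key=lambda p: len(words & set(p.lower().split())))

def call_llm(prompt: str) -> str:
    """Step 5 (Generation): placeholder for a real chat-completion call."""
    return f"[grounded answer produced from]\n{prompt}"

knowledge_base = [
    "Vacation policy: employees accrue 1.5 vacation days per month.",
    "Expense policy: submit receipts within 30 days.",
]

question = "What is our vacation policy?"        # 1. Ask
passage = retrieve(question, knowledge_base)     # 2. Retrieval, 3. Return
prompt = (                                       # 4. Augmentation
    f"Answer using only this context:\n{passage}\n\nQuestion: {question}"
)
print(call_llm(prompt))                          # 5. Generation
```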

MCP: Model Context Protocol (Doing More)
In contrast, Hadley states that MCP's main purpose is to "take action". MCP is a communication protocol that lets the AI agent connect to an external system, whether to "gather information, update systems with new information, execute actions," or to handle "orchestrating workflows or going to get live data". Hadley emphasizes that MCP connects the LLM directly to external systems.

The MCP process, as Hadley details, also unfolds in five distinct steps:

1. Discover: The LLM connects to an MCP server and identifies which tools, APIs, and resources are available. For a question like "How many vacation days do I have?", it checks whether it has access to the payroll system.
2. Understand: The LLM reads each tool's schema, including its inputs and outputs, to learn how to call it.
3. Plan: The LLM decides which tools to use and the order in which to use them to fulfill the user's request.
4. Execute: Structured calls are sent through the secure MCP runtime, which executes the tools and returns the results.
5. Integrate: The LLM uses these results to keep reasoning, make more calls if needed, or finalize an answer or an action.
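
The loop below is a schematic of these five steps with the server and tools simulated in-process. It is not the real MCP wire protocol (which runs over JSON-RPC between a client and a server), and every name in it (TOOLS, call_tool, the payroll lookup) is a hypothetical stand-in:

```python
# 1. Discover: the server advertises its tools and their schemas.
TOOLS = {
    "get_vacation_balance": {
        "description": "Look up remaining vacation days for an employee.",
        "inputs": {"employee_id": "string"},
        "output": {"days_remaining": "number"},
    },
    "submit_time_off_request": {
        "description": "File a vacation request with the employee's manager.",
        "inputs": {"employee_id": "string", "days": "number"},
        "output": {"status": "string"},
    },
}

def call_tool(name: str, args: dict) -> dict:
    """4. Execute: the runtime performs the call and returns the result."""
    if name == "get_vacation_balance":
        return {"days_remaining": 7}          # stand-in for a payroll query
    if name == "submit_time_off_request":
        return {"status": "sent to manager"}
    raise ValueError(f"unknown tool: {name}")

# 2. Understand and 3. Plan: a real LLM reads the schemas in TOOLS and
# chooses a sequence of calls; here the plan is hard-coded for clarity.
plan = [("get_vacation_balance", {"employee_id": "E123"})]

for name, args in plan:
    result = call_tool(name, args)
    # 5. Integrate: the model folds each result back into its reasoning,
    # possibly making further calls before finalizing an answer.
    print(f"{name} -> {result}")
```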

In the vacation scenario, Hadley explains that an AI agent using MCP could "pull the employee's open number of vacation days from an HR system and perhaps even submit a request to their manager for additional days off through that same system".

The key takeaway for IBM Technology clients is that "RAG is all about knowing more, while on the other hand, MCP is about doing more". Hadley advises that these methods are not mutually exclusive; "there are times that MCP uses RAG as a tool to be even more effective at information retrieval for a user". When planning future AI projects, the key isn't choosing one pattern or the other, but "understanding when to retrieve knowledge, when to call tools, and how to architect both for things like security, governance, and scale".
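
As a rough illustration of that combined pattern, the sketch below exposes a RAG-style retriever as one tool alongside an action tool, so an agent could both "know more" and "do more" in the same loop. The names search_handbook and submit_time_off are invented for this example, not part of any real MCP server:

```python
def search_handbook(query: str) -> str:
    """A retrieval tool: wraps a RAG-style lookup over policy documents."""
    policies = {
        "vacation": "Employees accrue 1.5 vacation days per month.",
        "expenses": "Submit receipts within 30 days.",
    }
    hits = [text for topic, text in policies.items() if topic in query.lower()]
    return hits[0] if hits else "No matching policy found."

def submit_time_off(days: int) -> str:
    """An action tool: changes state in a system instead of retrieving text."""
    return f"Request for {days} day(s) sent to manager."

# The agent first retrieves grounded policy text (knowing more), then
# acts on the payroll system through a tool call (doing more).
print(search_handbook("What is our vacation policy?"))
print(submit_time_off(2))
```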
