
AI Agents with Hybrid RAG

The demands of highly regulated environments, particularly the legal sector, are pushing the boundaries of AI agent development, forcing a shift from merely "smart" agents to fundamentally trustworthy ones. This engineering mandate, highlighted in a discussion presented by Joseph Washington (Lead AI and Automation) of IBM Technology, centers on the absolute requirement for explainability when AI tools handle sensitive data. In fields like law and medicine, trust is foundational: engineers must build trustworthy systems, not just intelligent ones.

The challenge is perhaps best exemplified by the arduous process of e-discovery following a major legal action, such as a discrimination suit filed by a former employee. The company's legal team faces the task of preserving, collecting, and sharing every relevant document and message—a vast reservoir of data encompassing thousands of files across platforms including Outlook, Gmail, Slack, Box, and SharePoint. This includes everything from performance review emails to contracts and personal text messages, all of which must be securely contained within a Document Management System (DMS) as potential legal evidence. To gain insight over this enormous data pool, AI research agents are employed to filter the material—for instance, identifying documents mentioning the employee alongside terms like "termination" or "performance review"—and summarizing the key findings for the legal team.
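The keyword pre-filter described above can be sketched in a few lines. This is an illustrative stand-in, not the actual e-discovery tooling: the document contents, the employee name, and the `filter_documents` helper are all hypothetical.

```python
# Illustrative keyword pre-filter over a collected document set.
# All documents and the employee name below are hypothetical examples.

DOCUMENTS = [
    {"id": "doc-1", "source": "Outlook",
     "text": "Performance review for J. Doe: rating below expectations."},
    {"id": "doc-2", "source": "Slack",
     "text": "Reminder: team lunch on Friday."},
    {"id": "doc-3", "source": "SharePoint",
     "text": "J. Doe termination checklist and final pay details."},
]

def filter_documents(docs, employee, terms):
    """Return docs mentioning the employee together with any case term."""
    hits = []
    for doc in docs:
        text = doc["text"].lower()
        if employee.lower() in text and any(t.lower() in text for t in terms):
            hits.append(doc)
    return hits

relevant = filter_documents(DOCUMENTS, "J. Doe",
                            ["termination", "performance review"])
print([d["id"] for d in relevant])  # doc-1 and doc-3 match
```

A real pipeline would run this kind of filter inside the DMS at scale, then hand only the matching documents to the summarization agent.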


However, the utility of these findings is constrained by a strict legal principle: the AI agent’s output is inadmissible and useless unless it is demonstrably trustworthy. Trustworthiness demands complete provenance, requiring the agent to answer critical questions about its output. It must specify precisely which source documents were utilized, the exact timestamp, the author of the message, and the specific keywords that triggered its retrieval. By answering these questions, the agent’s outputs become explainable, defensible, and reliable.
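One way to make those provenance questions answerable is to attach a structured provenance record to every retrieved passage, so each claim in the agent's summary traces back to a source document, author, timestamp, and the keywords that triggered retrieval. The record shape below is a hypothetical sketch, not a standard schema:

```python
from dataclasses import dataclass, field

# Hypothetical provenance record carried alongside every retrieved
# passage, so the agent's output stays explainable and defensible.

@dataclass
class Provenance:
    document_id: str
    platform: str                 # e.g. "Outlook", "Slack"
    author: str
    timestamp: str                # ISO 8601 timestamp from the DMS
    matched_keywords: list = field(default_factory=list)

@dataclass
class RetrievedPassage:
    text: str
    provenance: Provenance

passage = RetrievedPassage(
    text="Performance review: rating below expectations.",
    provenance=Provenance(
        document_id="doc-1",
        platform="Outlook",
        author="hr@example.com",
        timestamp="2023-04-12T09:30:00Z",
        matched_keywords=["performance review"],
    ),
)
print(passage.provenance.document_id, passage.provenance.matched_keywords)
```

Because the provenance travels with the text rather than being reconstructed after the fact, the legal team can audit exactly which sources produced each finding.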

Many current AI systems, while useful for initial development and testing, fail this trust mandate because they rely on simple Retrieval-Augmented Generation (RAG). Simple RAG typically involves converting data from the DMS into vector embeddings and storing them in a vector database such as Milvus. This approach, however, often overlooks considerations vital for high-stakes compliance: it fails to adequately handle the differences between structured and unstructured data, struggles with diverse file formats, including image, video, and audio files, and critically ignores essential metadata associated with each file, such as access control and change history.
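The retrieval step in simple RAG reduces to "embed, index, rank by similarity." The sketch below illustrates that shape using a crude bag-of-words similarity as a stand-in; a real pipeline would use a learned embedding model and a vector store such as Milvus, and all data here is hypothetical. Note what is missing: no keyword guarantees, no metadata, no access-control awareness.

```python
import math
from collections import Counter

def embed(text):
    """Crude bag-of-words 'embedding' -- a stand-in for a real model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical corpus; in simple RAG, only the text is indexed --
# authorship, timestamps, and permissions are discarded.
corpus = {
    "doc-1": "performance review rating below expectations",
    "doc-2": "team lunch friday reminder",
}
index = {doc_id: embed(text) for doc_id, text in corpus.items()}

query = embed("employee performance rating")
best = max(index, key=lambda doc_id: cosine(query, index[doc_id]))
print(best)  # the semantically closest document
```

The ranking works, but nothing in it can answer who wrote the document, when, or whether the requesting user was even allowed to see it, which is exactly the gap hybrid RAG addresses.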

To achieve the necessary precision and traceable output, practitioners must adopt a hybrid RAG approach. This method mandates a tighter, more integrated connection with the DMS, allowing the agent to combine semantic and structured search. Hybrid RAG runs semantic search, which identifies contextually similar documents, concurrently with robust keyword filtering to capture exact phrases pertinent to the case, such as "noncompete" or "harassment". It also supports metadata filters, letting users narrow searches by criteria like author, date range, or platform, while accessing structured access control and change history information. By leveraging structured and semantic search simultaneously, hybrid RAG systems deliver greater precision and give the Large Language Model (LLM) and the overall AI agent the traceability needed to make the resulting data defensible and trustworthy in a legal context.
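The combination described above, semantic ranking constrained by exact-phrase and metadata filters, can be sketched as follows. This is a minimal illustration: a production system would push these filters down into the DMS and vector store (Milvus, for example, supports filtered vector search), and the documents, authors, and the `hybrid_search` helper are all hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Bag-of-words stand-in for a learned embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical documents with the DMS metadata simple RAG discards.
DOCS = [
    {"id": "doc-1", "author": "hr@example.com", "platform": "Outlook",
     "date": "2023-04-12", "text": "noncompete clause in severance agreement"},
    {"id": "doc-2", "author": "legal@example.com", "platform": "Slack",
     "date": "2021-01-05", "text": "noncompete discussion thread"},
    {"id": "doc-3", "author": "hr@example.com", "platform": "Outlook",
     "date": "2023-05-01", "text": "quarterly budget figures"},
]

def hybrid_search(query, keyword, author=None, after=None):
    """Semantic ranking constrained by keyword and metadata filters."""
    q = embed(query)
    results = []
    for doc in DOCS:
        if keyword.lower() not in doc["text"].lower():
            continue                       # exact-phrase filter
        if author and doc["author"] != author:
            continue                       # metadata filter: author
        if after and doc["date"] < after:
            continue                       # metadata filter: date range
        results.append((cosine(q, embed(doc["text"])), doc["id"]))
    return [doc_id for _, doc_id in sorted(results, reverse=True)]

print(hybrid_search("severance agreement terms", "noncompete",
                    author="hr@example.com", after="2023-01-01"))
```

Because every returned document passed an explicit keyword match and explicit metadata predicates, the agent can report exactly why each document was retrieved, which is the traceability a legal team needs to defend the output.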
