Building an Agentic RAG System

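A Retrieval Augmented Generation (RAG) system combines data retrieval with a generative model to produce context-aware responses. For example, the user's query is passed to a search engine, and the retrieved results are given to the model together with the query. The model then generates a response based on both the query and the retrieved information.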
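The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the corpus, the `retrieve` and `build_prompt` helpers, and the naive keyword-overlap ranking are all assumptions made for the example, and the final call to an LLM is left out.

```python
import re

# Toy corpus standing in for a search index (hypothetical data).
DOCUMENTS = [
    "Gold accents and velvet curtains suit a luxury masquerade ball.",
    "A professional DJ can play superhero-themed music.",
    "Serve dishes named after superheroes at the catering table.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by the number of words they share with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    # Highest score first; sorted() is stable, so ties keep corpus order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:k] if score > 0]


def build_prompt(query: str, passages: list[str]) -> str:
    """Combine the retrieved passages and the query into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."


query = "What music should the DJ play?"
passages = retrieve(query, DOCUMENTS)
prompt = build_prompt(query, passages)
print(prompt)  # In a real system, this prompt would be sent to the model.
```

Note that retrieval happens exactly once here, using the user's wording as-is.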

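Agentic RAG extends traditional RAG systems by combining autonomous agents with dynamic knowledge retrieval.

Whereas a traditional RAG system uses an LLM to answer user queries from retrieved data, agentic RAG lets the agent intelligently control both the retrieval and the generation process, which improves efficiency and accuracy.

Traditional RAG systems face several key limitations: they rely on a single retrieval step and focus on direct semantic similarity with the user's query, which can cause relevant information to be overlooked.

Agentic RAG addresses these problems by letting the agent autonomously formulate search queries, evaluate the retrieved results, and perform multiple retrieval steps, producing more targeted and more comprehensive output.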

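1. Basic retrieval with `DuckDuckGo`

Let's build a simple agent that searches the web with DuckDuckGo. The agent retrieves information, synthesizes a response, and answers the query. With agentic RAG, the Alfred agent can:

Search for the latest superhero party trends

Refine the results to include luxury elements

Integrate the information into a complete plan

Here is how the Alfred agent does this: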

from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

# Initialize the search tool
search_tool = DuckDuckGoSearchTool()

# Initialize the model
model = HfApiModel()

agent = CodeAgent(
    model=model,
    tools=[search_tool],
)

# Example usage
response = agent.run(
    "Search for luxury superhero-themed party ideas, including decorations, entertainment, and catering."
)
print(response)

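The agent follows this process:

Analyze the request: the Alfred agent identifies the key elements of the query, namely planning a luxury superhero-themed party with a focus on decorations, entertainment, and catering.

Perform retrieval: the agent uses DuckDuckGo to search for the most relevant and up-to-date information, making sure it matches the preference for a luxury event.

Integrate the information: after collecting the results, the agent processes them into a coherent, actionable plan that covers every aspect of the party.

Store for future reference: the agent stores the retrieved information so that it is easy to access when planning future events, making subsequent tasks more efficient.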

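2. Custom knowledge base tool

For specialized tasks, a custom knowledge base can be invaluable. The tool below is meant for querying a vector database of technical documentation or specialized knowledge. With semantic search, the agent can find the information most relevant to its needs.

A vector database stores numerical representations (embeddings) of text or other data, created by machine learning models. It enables semantic search by identifying items with similar meaning in a high-dimensional space.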
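To make the idea of "similar meaning in a high-dimensional space" concrete, here is a minimal sketch of nearest-neighbor lookup by cosine similarity. The three-dimensional vectors are hand-written, hypothetical values chosen for illustration; a real embedding model would produce vectors with hundreds or thousands of dimensions, and a vector database would index them for fast lookup.

```python
import math

# Hand-written toy "embeddings" (hypothetical values, not from a real model).
EMBEDDINGS = {
    "gold and velvet decorations": [0.9, 0.1, 0.0],
    "themed DJ playlist": [0.1, 0.9, 0.1],
    "superhero-named dishes": [0.0, 0.2, 0.9],
}


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vector: list[float], embeddings: dict[str, list[float]]) -> str:
    """Return the stored text whose vector is most similar to the query vector."""
    return max(
        embeddings,
        key=lambda text: cosine_similarity(query_vector, embeddings[text]),
    )


# Pretend this is the embedding of a query such as "music for the party".
query_vector = [0.2, 0.8, 0.0]
best = nearest(query_vector, EMBEDDINGS)
print(best)
```

The query vector points mostly along the second axis, so the entry whose vector does the same is returned even though no words are compared at all.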

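This approach combines predefined knowledge with semantic search to provide context-aware solutions for event planning. With access to specialized knowledge, Alfred can perfect every detail of the party.

In this example, we create a tool that retrieves party planning ideas from a custom knowledge base. It uses a BM25 retriever to search the knowledge base and return the most relevant results, and a RecursiveCharacterTextSplitter to split the documents into smaller chunks for more efficient search. Note that BM25 ranks documents by keyword statistics rather than by embeddings, so here it stands in for the vector search described above.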

from langchain.docstore.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter
from smolagents import Tool
from langchain_community.retrievers import BM25Retriever
from smolagents import CodeAgent, HfApiModel


class PartyPlanningRetrieverTool(Tool):
    name = "party_planning_retriever"
    description = "Uses semantic search to retrieve relevant party planning ideas for Alfred’s superhero-themed party at Wayne Manor."
    inputs = {
        "query": {
            "type": "string",
            "description": "The query to perform. This should be a query related to party planning or superhero themes.",
        }
    }
    output_type = "string"

    def __init__(self, docs, **kwargs):
        super().__init__(**kwargs)
        self.retriever = BM25Retriever.from_documents(
            docs, k=5  # Retrieve the top 5 documents
        )

    def forward(self, query: str) -> str:
        assert isinstance(query, str), "Your search query must be a string"

        docs = self.retriever.invoke(
            query,
        )
        return "\nRetrieved ideas:\n" + "".join(
            [
                f"\n\n===== Idea {str(i)} =====\n" + doc.page_content
                for i, doc in enumerate(docs)
            ]
        )

# Simulate a knowledge base about party planning
party_ideas = [
    {"text": "A superhero-themed masquerade ball with luxury decor, including gold accents and velvet curtains.", "source": "Party Ideas 1"},
    {"text": "Hire a professional DJ who can play themed music for superheroes like Batman and Wonder Woman.", "source": "Entertainment Ideas"},
    {"text": "For catering, serve dishes named after superheroes, like 'The Hulk's Green Smoothie' and 'Iron Man's Power Steak.'", "source": "Catering Ideas"},
    {"text": "Decorate with iconic superhero logos and projections of Gotham and other superhero cities around the venue.", "source": "Decoration Ideas"},
    {"text": "Interactive experiences with VR where guests can engage in superhero simulations or compete in themed games.", "source": "Entertainment Ideas"}
]

source_docs = [
    Document(page_content=doc["text"], metadata={"source": doc["source"]})
    for doc in party_ideas
]

# Split the documents into smaller chunks for more efficient search
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50,
    add_start_index=True,
    strip_whitespace=True,
    separators=["\n\n", "\n", ".", " ", ""],
)
docs_processed = text_splitter.split_documents(source_docs)

# Create the retriever tool
party_planning_retriever = PartyPlanningRetrieverTool(docs_processed)

# Initialize the agent
agent = CodeAgent(tools=[party_planning_retriever], model=HfApiModel())

# Example usage
response = agent.run(
    "Find ideas for a luxury superhero-themed party, including entertainment, catering, and decoration options."
)

print(response)

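This enhanced agent can:

First check the documentation for relevant information

Integrate knowledge from the knowledge base

Maintain conversation context in its memory system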
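For readers curious about how the BM25 retriever used in this section ranks documents, the following is a minimal pure-Python sketch of the commonly cited Okapi BM25 scoring formula. The function names, the sample documents, and the parameter defaults (`k1=1.5`, `b=0.75`) are assumptions made for the example; the library implementation behind `BM25Retriever` may differ in tokenization and in how it computes the IDF term.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, documents: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with the Okapi BM25 formula."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in term_freq:
                continue
            # Rare terms get a higher weight (non-negative IDF variant).
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            # Term frequency saturates and is normalized by document length.
            denom = term_freq[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


docs = [
    "Hire a professional DJ who can play themed music.",
    "Serve dishes named after superheroes.",
    "Decorate with iconic superhero logos.",
]
scores = bm25_scores("themed music", docs)
best_index = max(range(len(docs)), key=scores.__getitem__)
print(docs[best_index])
```

Because the score depends only on shared terms, a query phrased with synonyms that appear in no document would score zero everywhere, which is the main practical difference from embedding-based search.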

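3. Enhanced retrieval capabilities

When building an agentic RAG system, the agent can employ sophisticated strategies such as:

Query reformulation: instead of using the user's raw query directly, the agent crafts optimized search terms that better match the target documents

Multi-step retrieval: the agent can run several searches, using the initial results to guide subsequent queries

Source integration: information from multiple sources, such as web search and local documents, can be combined

Result validation: retrieved content can be analyzed for relevance and accuracy before it is included in the response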
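Three of the strategies above (query reformulation, multi-step retrieval, and result validation) can be sketched as a simple loop. This is a toy under stated assumptions: `search`, `reformulate`, and `is_valid` are hypothetical stand-ins, the rewrites come from a fixed hand-written list rather than from an LLM guided by earlier results, and validation is a plain keyword check.

```python
import re

# Toy corpus standing in for whatever the search tool can reach (hypothetical).
CORPUS = [
    "Decorate the ballroom with gold accents and velvet curtains.",
    "Hire a DJ to play themed music for Batman and Wonder Woman.",
    "Offer VR superhero simulations as interactive entertainment.",
]

# Hand-written rewrites; a real agent would ask the model to produce these.
REWRITES = ["fancy party fun", "luxury entertainment", "interactive entertainment music"]


def tokenize(text: str) -> set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))


def search(query: str) -> list[str]:
    """Stand-in search tool: return passages sharing at least one word with the query."""
    terms = tokenize(query)
    return [passage for passage in CORPUS if terms & tokenize(passage)]


def reformulate(user_query: str, step: int) -> str:
    """Stand-in for query rewriting: pick the next rewrite, else keep the raw query."""
    return REWRITES[step] if step < len(REWRITES) else user_query


def is_valid(passage: str, must_mention: set[str]) -> bool:
    """Result validation: keep a passage only if it mentions a required term."""
    return bool(must_mention & tokenize(passage))


def multi_step_retrieve(user_query: str, must_mention: set[str], max_steps: int = 3) -> list[str]:
    collected: list[str] = []
    for step in range(max_steps):
        query = reformulate(user_query, step)
        for passage in search(query):
            if is_valid(passage, must_mention) and passage not in collected:
                collected.append(passage)
        if len(collected) >= 2:  # Stop early once enough evidence is gathered.
            break
    return collected


results = multi_step_retrieve("fun stuff for the party", {"entertainment", "music"})
for passage in results:
    print(passage)
```

The first rewrite finds nothing, the second finds one passage, and the third finds a second one, at which point the loop stops.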

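An effective agentic RAG system requires careful attention to several key aspects. The agent should choose among the available tools based on the query type and context. A memory system helps maintain conversation history and avoids redundant retrievals. A fallback strategy ensures that the system can still provide value even when the primary retrieval method fails. In addition, implementing validation steps helps ensure that the retrieved information is accurate and relevant.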
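As a rough illustration of the memory and fallback points above, the sketch below caches results per query and falls back to a second source when the primary one fails or returns nothing. Everything here is hypothetical: `knowledge_base_search` and `web_search` are stand-in functions with canned data, and the `calls` counter exists only to show when each source is actually hit.

```python
from typing import Callable

# Counts how often each stand-in source is queried (for demonstration only).
calls = {"knowledge_base": 0, "web": 0}


def knowledge_base_search(query: str) -> list[str]:
    """Stand-in primary source: a tiny local knowledge base keyed by topic."""
    calls["knowledge_base"] += 1
    notes = {"catering": ["Serve dishes named after superheroes."]}
    return [p for topic, passages in notes.items() if topic in query.lower() for p in passages]


def web_search(query: str) -> list[str]:
    """Stand-in fallback source: pretends to search the web."""
    calls["web"] += 1
    return [f"(web result for: {query})"]


def retrieve(
    query: str,
    memory: dict[str, tuple[str, list[str]]],
    primary: Callable[[str], list[str]],
    fallback: Callable[[str], list[str]],
) -> tuple[str, list[str]]:
    """Return (source, results), reusing memory and falling back when needed."""
    if query in memory:  # Avoid a redundant retrieval.
        return memory[query]
    source = "primary"
    try:
        results = primary(query)
    except Exception:
        results = []  # Treat a failing primary source like an empty result.
    if not results:
        source, results = "fallback", fallback(query)
    memory[query] = (source, results)
    return memory[query]


memory: dict[str, tuple[str, list[str]]] = {}
first = retrieve("catering ideas", memory, knowledge_base_search, web_search)
second = retrieve("fireworks permits", memory, knowledge_base_search, web_search)
repeat = retrieve("catering ideas", memory, knowledge_base_search, web_search)
print(first[0], second[0], calls)
```

The repeated query is answered from memory, so neither source is called a third time.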
