AI 智能体 - 记忆管理

2025-11-26
浙江
本文字数：7191 字
阅读完需：约 24 分钟

📖 引言：拒绝做“金鱼”智能体

想象一下，你雇佣了一位私人助理。你们聊了半小时，商定了下周去巴黎的行程，确认了你对花生过敏，还讨论了你喜欢的酒店风格。然而，当你第二天问他：“帮我订酒店了吗？”时，他一脸茫然地看着你：“您好，我是您的助理。请问有什么可以帮您？”

这就是没有 记忆管理（Memory Management） 的 AI 智能体。

在大语言模型（LLM）的原生状态下，它们是“无状态”的。每一次 API 调用都是全新的开始，就像只有 7 秒记忆的金鱼。为了让智能体从简单的“问答机器”进化为真正的“智能伙伴”，我们需要赋予它们记忆——保留过往交互、追踪任务状态、积累长期知识的能力。

本章将带你深入 AI 的大脑深处，解构短期记忆与长期记忆的本质，并手把手教你使用 Google ADK 和最新的 LangChain / LangGraph 架构，为你的智能体构建一个强大的“海马体”。

第一部分：智能体记忆的认知架构

在计算机科学与认知心理学的交叉点上，我们将智能体的记忆分为两大核心类别。理解这两者的区别，是设计高效系统的第一步。

1.1 短期记忆 (Short-Term Memory)：上下文的艺术

短期记忆类似于人类的工作记忆（Working Memory）。它是智能体在当前对话或任务执行过程中，能够“立即”访问的信息。

物理载体：LLM 的上下文窗口（Context Window）。
内容：
最近的 User/AI 消息对。
当前任务的中间状态（如：工具调用的结果）。
即时的系统指令（System Prompt）。
局限性：
容量限制：虽然 Gemini 2.5 Pro 拥有 1M+ 的上下文，但无限堆叠信息会导致“大海捞针（Lost-in-the-Middle）”效应，且推理成本极高。
易失性：一旦会话（Session）结束或窗口溢出，信息即被丢弃。

1.2 长期记忆 (Long-Term Memory)：持久化的智慧

长期记忆是智能体的知识库。它存储了跨越时间、跨越会话的信息。

物理载体：外部数据库（向量数据库、SQL、图数据库、文件系统）。
分类：
语义记忆 (Semantic Memory)：事实性知识（如：“公司的差旅报销标准是每天 500 元”）。通常通过 RAG（检索增强生成）实现。
情景记忆 (Episodic Memory)：过往的经历（如：“用户上次在周五晚上预订了意大利餐厅”）。
程序性记忆 (Procedural Memory)：关于“如何做”的知识（如：智能体自我反思后更新的操作规则）。
机制：智能体在需要时，通过 检索（Retrieval） 将长期记忆中的相关片段“加载”到短期记忆（上下文窗口）中。

第二部分：Google ADK 中的记忆管理实战

Google Agent Developer Kit (ADK) 提供了一套非常严谨的、工程化的记忆架构。它并没有把所有东西都混在 Prompt 里，而是清晰地划分了 Session（会话）、State（状态） 和 Memory（记忆）。

2.1 架构三杰：Session, State, Memory

Session (会话线)：
代表一次完整的对话交互。
记录了事件日志（Events Log），即发生过的每一件事（用户说话、模型回复、工具调用）。
生命周期：开始 -> 交互 -> 结束。
State (状态)：
这是 ADK 的精髓。它不仅仅是文本历史，而是一个结构化的字典（Key-Value Store）。
它用于存储当前会话的“暂存数据”（Scratchpad）。
作用域管理：
user:*：用户级数据（跨 Session 存在）。
app:*：全局应用配置。
temp:*：仅在当前轮次有效的临时数据。
Memory (长期记忆)：
这是持久化的知识库，通常对接 Vertex AI RAG 或向量数据库。

2.2 深度实战：基于工具的状态管理 (State Management via Tools)

在 ADK 中，严禁直接在代码中修改 session.state 字典。所有的状态变更必须通过 事件 (Event) 来记录，以确保系统的可追溯性和并发安全性。

最佳实践：将状态更新封装在工具 (Tool) 内部。

场景案例：构建一个“购物车智能体”

我们需要智能体记住用户想买什么，并在结算时计算总价。

import timefrom typing import Dict, Anyfrom google.adk.tools.tool_context import ToolContextfrom google.adk.sessions import InMemorySessionServicefrom google.adk.agents import LlmAgentfrom google.adk.runners import Runnerfrom google.genai import types
# --- 1. 定义工具：负责安全的更新状态 ---def add_to_cart(tool_context: ToolContext, product_name: str, price: float) -> Dict[str, Any]:    """    将商品添加到购物车，并更新会话状态。        Args:        product_name: 商品名称        price: 商品价格    """    # 获取当前状态的引用（只读视角）    state = tool_context.state        # 获取现有购物车数据，如果不存在则初始化    # 注意：这里我们使用 user: 前缀，意味着即使开启新会话，该用户的购物车依然存在    current_cart = state.get("user:cart", [])    current_total = state.get("user:cart_total", 0.0)        # 执行业务逻辑    new_item = {        "item": product_name,        "price": price,        "timestamp": time.time()    }    current_cart.append(new_item)    new_total = current_total + price        # --- 关键步骤：通过 EventActions 更新状态 ---    # 在 Tool 内部，直接修改 state 字典通常在某些实现中有效，    # 但最规范的方式是依靠 ADK 的机制捕获这些变更。    # ADK 的 ToolContext 代理了对状态的写入，确保其被记录为 StateDelta。    state["user:cart"] = current_cart    state["user:cart_total"] = new_total        # 添加一个临时状态，告诉智能体下一步该做什么    state["temp:next_action"] = "ask_checkout"        print(f"🛒 [Tool Log] 已添加 {product_name} (${price})。当前总价: ${new_total}")        return {        "status": "success",         "message": f"Added {product_name}. Total items: {len(current_cart)}",        "current_total": new_total    }
# --- 2. 配置智能体与服务 ---# 使用内存存储（开发模式），生产环境请换成 VertexAiSessionServicesession_service = InMemorySessionService()
# 定义智能体shopper_agent = LlmAgent(    name="ShoppingAssistant",    model="gemini-2.0-flash",    instruction="""    你是一个购物助手。    当用户想要购买商品时，必须使用 `add_to_cart` 工具。    工具执行后，请告诉用户当前购物车的总金额。    """,    tools=[add_to_cart])
# --- 3. 模拟执行流程 ---async def run_shopping_demo():    app_name = "shop_app"    user_id = "customer_007"    session_id = "session_v1"        # 创建会话    session = await session_service.create_session(        app_name=app_name,         user_id=user_id,         session_id=session_id    )        runner = Runner(agent=shopper_agent, app_name=app_name, session_service=session_service)        # 用户请求    user_input = "我想买一个 iPhone 15，价格是 999 美元"    print(f"👤 用户: {user_input}")        content = types.Content(role='user', parts=[types.Part(text=user_input)])        # 运行智能体    async for event in runner.run_async(user_id=user_id, session_id=session_id, new_message=content):        if event.is_final_response():            print(f"🤖 智能体: {event.content.parts[0].text}")
    # --- 4. 验证状态持久化 ---    # 获取更新后的会话    updated_session = await session_service.get_session(app_name, user_id, session_id)    print("\n📊 [System Audit] 最终状态快照:")    print(updated_session.state)
# (运行代码需在异步环境中执行)# import asyncio# asyncio.run(run_shopping_demo())

复制代码

2.3 Vertex AI Memory Bank：托管的长期记忆

如果你不想自己维护向量数据库，Google 的 Memory Bank 是一个全托管的“黑科技”。它不仅存储数据，还会主动处理数据。

自动提取：它会在后台分析对话，自动提取用户偏好（如“用户喜欢靠窗座位”）。
冲突解决：如果用户先说喜欢红色，后说讨厌红色，Memory Bank 会智能更新。

from google.adk.memory import VertexAiMemoryBankService
# 初始化服务memory_service = VertexAiMemoryBankService(    project="your-project-id",    location="us-central1",    agent_engine_id="your-agent-engine-id")
# 在会话结束时，将整个会话“归档”进长期记忆# 系统会自动进行向量化、实体提取和索引await memory_service.add_session_to_memory(session)
# 在新会话开始时，检索相关记忆relevant_memories = await memory_service.search_memory(    query="用户以前买过电子产品吗？",    user_id="customer_007")

复制代码

第三部分：LangChain 与 LangGraph 的现代记忆管理

在 LangChain 的早期版本（0.1 之前），我们习惯使用 ConversationBufferMemory 配合 LLMChain。但在 LangChain 1.0 及现在的 LangGraph 时代，记忆管理的范式发生了根本性的转变。

现在的核心理念是：Persistence via Checkpointing (通过检查点实现持久化)。

3.1 为什么放弃旧的 `Memory` 类？

旧的 ConversationBufferMemory 只是简单的字符串拼接。而在复杂的 Agent 中，我们需要管理：

消息历史（Message History）。
结构化状态（Structured State，如当前步骤、工具输出）。
分支与回溯（Branching）。

因此，LangGraph 成为了构建有记忆智能体的首选标准。

3.2 实战：使用 LangGraph 构建带有“持久记忆”的聊天机器人

我们将构建一个智能体，它不仅能记住聊天记录（短期），还能将重要信息写入持久化存储（长期），并在系统重启后依然记得。

场景：一个旅行助手，需要记住用户的行程偏好。

环境准备

pip install langgraph langchain-openai langchain-core

复制代码

完整代码实现

import operatorfrom typing import Annotated, TypedDict, List, Union
from langchain_openai import ChatOpenAIfrom langchain_core.messages import SystemMessage, HumanMessage, AIMessage, BaseMessagefrom langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
# LangGraph 核心组件from langgraph.graph import StateGraph, START, ENDfrom langgraph.checkpoint.memory import MemorySaver # 用于短期记忆（Checkpointer）from langgraph.store.memory import InMemoryStore    # 用于长期记忆（Store）
# --- 1. 定义状态 (The State) ---# 这是短期记忆的载体，类似于 Google ADK 的 session.stateclass AgentState(TypedDict):    # messages: 一个消息列表，add_messages reducer 会自动处理追加逻辑    messages: Annotated[List[BaseMessage], operator.add]    # user_profile: 从长期记忆中加载的用户画像    user_profile: dict
# --- 2. 初始化模型与存储 ---llm = ChatOpenAI(model="gpt-4o", temperature=0)
# Checkpointer: 负责保存对话的每一步状态（短期记忆，支持回溯）checkpointer = MemorySaver()
# Store: 负责保存跨会话的长期记忆（知识库/用户画像）# 在生产环境中，这里会连接到 Redis, Postgres 或 Vector DBlong_term_store = InMemoryStore() 
# --- 3. 定义节点逻辑 ---
def load_memory(state: AgentState, config):    """    节点：在对话开始前，从长期记忆中加载用户画像    """    user_id = config["configurable"]["user_id"]    # 命名空间设计：(user_id, scope)    namespace = (user_id, "profile")        # 从 Store 中获取数据    memory_item = long_term_store.get(namespace, "preferences")        profile = memory_item.value if memory_item else {}    return {"user_profile": profile}
def call_model(state: AgentState, config):    """    节点：调用 LLM 生成回复    """    profile = state.get("user_profile", {})    messages = state["messages"]        # 构建系统提示词，注入长期记忆    system_prompt = "你是一个贴心的旅行助手。"    if profile:        system_prompt += f"\n\n已知用户信息：\n{profile}"            prompt = ChatPromptTemplate.from_messages([        ("system", system_prompt),        MessagesPlaceholder(variable_name="messages"),    ])        chain = prompt | llm    response = chain.invoke({"messages": messages})        return {"messages": [response]}
def save_memory(state: AgentState, config):    """    节点：分析对话，更新长期记忆    （这里简化为使用 LLM 提取偏好）    """    last_message = state["messages"][-1]    if not isinstance(last_message, AIMessage):        return {}            # 定义一个简单的提取指令    user_id = config["configurable"]["user_id"]        # 只有当包含特定关键词时才触发更新（模拟）    # 在真实场景中，这可以是专门的 Extraction Chain    content = last_message.content    if "记住了" in content or "更新" in content:        # 假设 LLM 已经确认更新，我们这里模拟写入 Store        # 实际应用中应解析 LLM 的工具调用来获取结构化数据        pass             # 演示：手动写入一个偏好    # 假设通过提取，我们发现用户喜欢 "靠窗座位"    # long_term_store.put(...)     return {}
# --- 4. 构建图 (The Graph) ---workflow = StateGraph(AgentState)
# 添加节点workflow.add_node("load_memory", load_memory)workflow.add_node("agent", call_model)# workflow.add_node("save_memory", save_memory) # 可选：添加自动保存节点
# 定义边workflow.add_edge(START, "load_memory")workflow.add_edge("load_memory", "agent")workflow.add_edge("agent", END)
# 编译图app = workflow.compile(checkpointer=checkpointer, store=long_term_store)
# --- 5. 运行演示 ---
# 步骤 A：预置一些长期记忆user_id = "traveler_joe"namespace = (user_id, "profile")long_term_store.put(    namespace,     "preferences",     {"diet": "素食主义者", "seat": "靠过道"})
# 步骤 B：开始对话config = {"configurable": {"thread_id": "thread_1", "user_id": user_id}}
print("--- 第一轮对话 ---")input_msg = HumanMessage(content="帮我查一下飞往伦敦的航班，我想吃点东西。")for update in app.stream({"messages": [input_msg]}, config=config):    pass
# 获取最终回复final_state = app.get_state(config)print(f"🤖 AI: {final_state.values['messages'][-1].content}")# 预期输出：AI 会提到素食餐，因为它读取了长期记忆
print("\n--- 第二轮对话 (同一个 thread，拥有短期记忆) ---")input_msg_2 = HumanMessage(content="那里天气怎么样？") # 指代“伦敦”for update in app.stream({"messages": [input_msg_2]}, config=config):    pass
final_state_2 = app.get_state(config)print(f"🤖 AI: {final_state_2.values['messages'][-1].content}")# 预期输出：AI 知道你在问“伦敦”的天气，因为 Checkpointer 保持了上下文

复制代码

3.3 关键概念解析：Store vs Checkpointer

在 LangGraph 架构中，这是最容易混淆但也最重要的区分：

第四部分：高级模式——反思性记忆 (Reflective Memory)

最高级的记忆管理不仅仅是“存储”，而是“加工”。程序性记忆的实现往往依赖于反思机制。

场景：自我进化的智能体

智能体在与用户的交互中，如果发现自己总是犯错（例如总是忘记用户要求使用 Markdown 格式），它应该能自我反思，并更新自己的“长期指令”。

实现逻辑：

执行：智能体回答用户问题。
反馈：用户说“格式不对！”。
反思：智能体调用内部的反思 Prompt：“我为什么错了？我应该如何修改我的系统指令？”
更新：智能体调用 store.put()，更新 system_instructions 命名空间下的内容。
应用：下一次对话时，load_memory 节点加载新的、改进后的指令。

# 伪代码逻辑def update_instructions(state, store):    # 1. 获取当前指令    current_instr = store.get(("system",), "core_rules")        # 2. 调用 LLM 反思    reflection = llm.invoke(f"基于用户反馈 {state['feedback']}，优化以下指令：{current_instr}")        # 3. 写入长期记忆    store.put(("system",), "core_rules", {"text": reflection.content})

复制代码

第五部分：核心要点总结

拒绝“无状态”：构建智能体时，必须从一开始就设计好 Session（短期）和 Memory（长期）的架构。
短期记忆靠 State：不要只把对话历史当成字符串。使用结构化的 State（如 LangGraph 的 TypedDict 或 ADK 的 session.state）来追踪任务进度。
长期记忆靠 Retrieval：利用向量数据库或键值存储（Store），实现跨会话的知识持久化。
LangGraph 是未来：在 Python 生态中，LangGraph 的 Checkpointer + Store 模式是目前处理复杂记忆最先进的标准范式。
安全第一：不要在 Prompt 中泄露所有记忆。使用 Namespace（命名空间）和 Scope（作用域）来隔离不同用户的数据。

记忆，是智能体产生“自我”错觉的基石。通过有效的记忆管理，你的 AI 将不再是一个冷冰冰的 API 接口，而是一个能够积累经验、理解偏好、与用户共同成长的数字伙伴。

参考资料

1.ADK Memory: https://google.github.io/adk-docs/sessions/memory/2.LangGraph Memory: https://langchain-ai.github.io/langgraph/concepts/memory/3.Vertex AI Agent Engine Memory Bank: https://cloud.google.com/blog/products/ai-machine-learning/vertex-ai-memory-bank-in-public-preview=4.Antonio Gulli 《Agentic Design Patterns》

发布于: 刚刚阅读数: 5

原文链接:【http://xie.infoq.cn/article/9df99178c225daf5e8414ccc4】。文章转载请联系作者。

Hernon AI

关注

创意心中美好世界 2020-08-19 加入

AI爱好者

发布

暂无评论

创作场景

AI 智能体 - 记忆管理

📖 引言：拒绝做“金鱼”智能体

第一部分：智能体记忆的认知架构

1.1 短期记忆 (Short-Term Memory)：上下文的艺术

1.2 长期记忆 (Long-Term Memory)：持久化的智慧

第二部分：Google ADK 中的记忆管理实战

2.1 架构三杰：Session, State, Memory

2.2 深度实战：基于工具的状态管理 (State Management via Tools)

场景案例：构建一个“购物车智能体”

2.3 Vertex AI Memory Bank：托管的长期记忆

第三部分：LangChain 与 LangGraph 的现代记忆管理

3.1 为什么放弃旧的 Memory 类？

3.2 实战：使用 LangGraph 构建带有“持久记忆”的聊天机器人

环境准备

完整代码实现

3.3 关键概念解析：Store vs Checkpointer

第四部分：高级模式——反思性记忆 (Reflective Memory)

场景：自我进化的智能体

第五部分：核心要点总结

参考资料

Hernon AI

评论

3.1 为什么放弃旧的 `Memory` 类？