The LangChain team shares the full design story of Agent Builder's memory system: why they prioritized building memory, the technical implementation details, lessons learned along the way, and what new capabilities the memory system unlocks.

A memory system is the core infrastructure that lets an agent keep context and learn across sessions. The article walks through technical trade-offs such as layered memory design, handling memory conflicts, and deciding when to write memories, making it a useful reference for engineers building agent applications.


How we built Agent Builder’s memory system



We launched LangSmith Agent Builder last month as a no-code way to build agents. A key part of Agent Builder is its memory system. In this article we cover our rationale for prioritizing a memory system, technical details of how we built it, learnings from building the memory system, what the memory system enables, and discuss future work.

What is LangSmith Agent Builder

LangSmith Agent Builder is a no-code agent builder, built on top of the Deep Agents harness. It is a hosted web product aimed at less-technical "citizen developers." In LangSmith Agent Builder, builders create an agent to automate a particular workflow or part of their day, such as an email assistant or a documentation helper.

Early on we made a conscious choice to prioritize memory as part of the platform. This was not an obvious choice – most AI products launch without any form of memory, and even when memory has been added, it hasn't yet transformed those products the way some expected. We prioritized it because of the usage patterns of our users.

Unlike ChatGPT, Claude, or Cursor, LangSmith Agent Builder is not a general-purpose agent. Rather, it is specifically designed to let builders customize agents for particular tasks. With a general-purpose agent, you do a wide variety of tasks that may be completely unrelated, so learnings from one session may not be relevant to the next. A LangSmith Agent, by contrast, does the same task over and over again, so lessons from one session translate to the next at a much higher rate. In fact, it would be a bad user experience if memory were not present – you would have to repeat yourself to the agent in every session.

When thinking about what memory would even mean for LangSmith Agents, we turned to a third-party definition of memory. The CoALA paper defines memory for agents in three categories:

Procedural: the set of rules that can be applied to working memory to determine the agent’s behavior

Semantic: facts about the world

Episodic: sequences of the agent’s past behavior

How we built our memory system

We represent memory in Agent Builder as a set of files. This is an intentional choice to take advantage of the fact that models are good at using filesystems. In this way, we could easily let the agent read and modify its memory without having to give it specialized tools - we just give it access to the filesystem!

When possible, we try to use industry standards. We use AGENTS.md to define the core instruction set for the agent. We use agent skills to give the agents particular specialized instructions for specific tasks. There is no subagent standard, but we use a similar format to Claude Code. For MCP access, we use a custom tools.json file. The reason we use a custom tools.json file and not the standard mcp.json is that we want to allow users to give the agent only a subset of the tools in an MCP server to avoid context overflow.
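For illustration, a minimal tools.json in this spirit might look like the following. The field names and URL here are assumptions made for the example, not Agent Builder's actual schema; the point is the explicit allowlist that exposes only a subset of the server's tools:

```json
{
  "mcpServers": {
    "linkedin": {
      "url": "https://example.com/linkedin-mcp",
      "allowed_tools": ["search_people"]
    }
  }
}
```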

We actually do not use a real filesystem to store these files. Rather, we store them in Postgres and expose them to the agent in the shape of a filesystem. We do this because LLMs are great at working with filesystems, but from an infrastructure perspective it is easier and more efficient to use a database. This “virtual filesystem” is natively supported by DeepAgents - and is completely pluggable so you could bring any storage layer you want (S3, MySQL, etc).
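To sketch the idea, here is a toy virtual filesystem with a pluggable backend, where a plain dict stands in for Postgres. The class and method names are hypothetical illustrations, not the actual DeepAgents API:

```python
# A toy "virtual filesystem": the agent sees filesystem-style tools
# (ls / read / write / edit), while contents live in a pluggable backend.

class VirtualFileSystem:
    def __init__(self, backend=None):
        # Any mapping-like store works: dict, a Postgres-backed wrapper, S3, ...
        self.backend = backend if backend is not None else {}

    def ls(self, prefix=""):
        # List file paths, optionally filtered by a directory-like prefix.
        return sorted(p for p in self.backend if p.startswith(prefix))

    def read_file(self, path):
        return self.backend[path]

    def write_file(self, path, content):
        self.backend[path] = content

    def edit_file(self, path, old, new):
        # String-replacement edit, the same shape of tool coding agents use.
        self.backend[path] = self.backend[path].replace(old, new)


fs = VirtualFileSystem()
fs.write_file("AGENTS.md", "Summarize meeting notes.")
fs.write_file("memory/preferences.md", "• Use bullet points, not paragraphs")
fs.edit_file("AGENTS.md", "meeting notes.", "meeting notes in bullet points.")

print(fs.ls("memory/"))  # ['memory/preferences.md']
```

Because the backend is just a mapping, swapping storage layers means swapping one constructor argument, while the agent-facing tool surface stays identical.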

We also allow users (and agents themselves) to write other files to an agent’s memory folder. These files can contain arbitrary knowledge as well, that the agent can reference as it runs. The agent would edit these files as it’s working, “in the hot path”.

The reason it is possible to build complicated agents without any code or any domain specific language (DSL) is that we use a generic agent harness like Deep Agents under the hood. Deep Agents abstracts away a lot of complex context engineering (like summarization, tool call offloading, and planning) and lets you steer your agent with relatively simple configuration.

These files map nicely onto the memory types defined in the CoALA paper. Procedural memory – what drives the core agent directive – is AGENTS.md and tools.json. Semantic memory is agent skills and other knowledge files. The only type missing is episodic memory, which we didn't think was as important for these types of agents as the other two.

What agent memory in a file system looks like

We can look at a real agent we’ve been using internally – a LinkedIn recruiter – built on LangSmith Agent Builder.

AGENTS.md: defines the core agent's instructions

subagents/: defines only one subagent

linkedin_search_worker: after the main agent is calibrated on a search, it will kick off this agent to source ~50 candidates.

tools.json: defines an MCP server with access to a LinkedIn search tool

There are also currently three other files in the memory, containing job descriptions (JDs) for the different roles we've been searching for. As we've worked with the agent on these searches, it has updated and maintained those JDs.
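Put together, the recruiter agent's memory might be laid out roughly like this (the subagent file extension and JD file names are invented for illustration):

```
AGENTS.md
tools.json
subagents/
  linkedin_search_worker.md
memory/
  jd_backend_engineer.md
  jd_product_designer.md
  jd_ml_engineer.md
```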

How memory editing works: a concrete example

To make it more concrete how memory works, we can walk through an illustrative example.

You start with a simple AGENTS.md:

Summarize meeting notes.

The agent produces paragraph summaries. You correct it: "Use bullet points instead." The agent edits AGENTS.md to be:

Formatting Preferences

  • User prefers bullet points for summaries, not paragraphs.

You ask the agent to summarize a different meeting. It reads its memory and uses bullet points automatically. No reminder needed. During this session, you ask it to: "Extract action items separately at the end." Memory updates:

Formatting Preferences

  • User prefers bullet points for summaries, not paragraphs.
  • Extract action items in separate section at end.

Both patterns apply automatically. You continue adding refinements as new edge cases surface.

The agent's memory includes:

Formatting preferences for different document types

Domain-specific terminology

Distinctions between "action items", "decisions", and "discussion points"

Names and roles of frequent meeting participants

Meeting type handling (engineering vs. planning vs. customer)

Edge case corrections accumulated through use

The memory file might look like:

Meeting Summary Preferences

  • Use bullet points, not paragraphs
  • Extract action items in separate section at end
  • Use past tense for decisions
  • Include timestamp at top
  • Engineering meetings: highlight technical decisions and rationale
  • Planning meetings: emphasize priorities and timelines
  • Customer meetings: redact sensitive information
  • Short meetings (<10 min): just key points
  • Sarah Chen (Engineering Lead) - focus on technical details
  • Mike Rodriguez (PM) - focus on business impact

The AGENTS.md built itself through corrections, not through upfront documentation. We arrived iteratively at an appropriately detailed agent specification, without the user ever manually changing the AGENTS.md.
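The accumulation loop above can be sketched as a simple string edit on the memory file. The append-a-bullet rule here is an illustration of the idea, not Agent Builder's actual update logic:

```python
# Toy sketch: each user correction becomes a durable bullet in AGENTS.md,
# under a "Formatting Preferences" section that is created on first use.

def apply_correction(agents_md: str, correction: str) -> str:
    """Record a user correction as a preference bullet in AGENTS.md."""
    header = "Formatting Preferences"
    if header not in agents_md:
        # First correction: create the preferences section.
        agents_md = agents_md.rstrip() + f"\n\n{header}\n"
    return agents_md.rstrip() + f"\n• {correction}\n"


memory = "Summarize meeting notes."
memory = apply_correction(
    memory, "User prefers bullet points for summaries, not paragraphs."
)
memory = apply_correction(
    memory, "Extract action items in separate section at end."
)
print(memory)
```

Running this appends each correction as a new bullet while leaving the original instruction and earlier bullets intact, which is the same shape as the session-by-session growth shown above.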

Learnings from building this memory system

There are several lessons we learned along the way.