Everything Gets Rebuilt: The New AI Agent Stack — Harrison Chase on MAD Podcast

Matt Turck / Harrison Chase · Original

LangChain 联合创始人 Harrison Chase 做客 MAD Podcast（2026-03-12），系统阐述 Agent 从简单 prompt 系统演变为具备规划、工具使用、代码编写、文件管理和长期记忆能力的软件后，Harness（控制框架）而非 Model 才是真正的差异化所在。

核心论点

1. Agent 两大类型正在融合

Conversational Agent：低延迟、语音交互、少量 Tool Call（客服/CX 场景）
Long Horizon Agent：长周期运行、能规划、能保持连贯性——几乎都是 Coding Agent
融合方向：一个同步对话式 Agent 在后台触发异步长周期 Agent，最终呈现为一个统一入口

2. Harness > Model

“Manus was an end-user product, but their harness was so good—that was the secret sauce. And it worked with any of the models under the hood.”

Harness = 模型如何与环境交互的整套机制（Tool、File System、Sub-Agent、Skill、Prompt Caching、Context Compression）
Manus、Claude Code、Deep Research 的成功都归功于 Harness + UI 的组合
有趣的脱节：Anthropic 模型内置的 file editing tool 和 Claude Code 实际用的是完全不同的一套。Harrison 问了几次没得到明确回答

3. 现代 Agent 五大核心 Primitive

Primitive	作用	关键洞察
System Prompt	SOP 载体	一部分 Harness 内置，一部分用户配置（如 CLAUDE.md）
Planning Tool	思维草稿本	不强制执行计划，只是放在 context 中让 Agent 参考
Sub-Agent	Context 隔离	“沟通是生活中最难的事——也是 Agent 协作最难的部分”
File System	LLM 自管理 Context	底层可以是任何东西，但对 LLM 暴露为文件接口
Skill	Progressive Disclosure	不预加载到 System Prompt，Agent 按需读取

4. Context Compaction 的新方向

当前主流：达到阈值（如 80% context window）时触发压缩
新方向：给 Agent 一个 Tool 让它自己决定何时压缩——符合”让模型承担更多职责”的精神
压缩时保留最近 ~10 条消息保持连贯性，原始消息转储到 File System 做兜底

5. Memory 三层

Semantic Memory：RAG 式事实检索——“怎么进入记忆？怎么被提取？这部分还没有答案”
Episodic Memory：历史对话记录——相对成熟
Procedural Memory：如何做事的指令 = Agent 的 Configuration——最有趣，因为 Agent 可以修改自己的 Procedural Memory 来”学习”

6. 差异化在哪？

“I would not get too attached to harnesses, skills, all these things—because the way of building will change. But that knowledge, those tools, that’s specific to your domain—that’s the stuff that won’t change.”

Harness 的 primitive 在趋同（所有人都有 File System + Sub-Agent + Skill + Code Execution）
真正的护城河在：领域知识编码成的 Instruction + 专用 Tool + Skill
对企业的建议：把精力集中在构建 Instruction 和 Tool 上，无论最终用什么 Harness 暴露它们

LangChain 产品演进

LangChain 0.x：抽象层 + Chain 模板（RAG chain 五行代码）→ 入门容易但缺乏生产控制
LangGraph：底层 orchestration，无隐藏 prompt，durable execution + streaming + human-in-the-loop → Agent Runtime
LangChain 1.0：在 LangGraph 之上重构，只保留 create_agent（LLM in a loop + tool calling）
DeepAgents：开箱即用的 Agent Harness（Planning + File System + Sub-Agent + Skill）
LangSmith：Observability++ → traces/evals/analytics 联动；No-code Agent Builder

值得标注的细节

Harrison 把 Sandbox 分两种用法：Agent 装在 Sandbox 里运行 vs Agent 在外部把 Sandbox 当 Tool 调用——实践中 50/50
Sandbox 安全：API Key 不应放在 Sandbox 内（LLM 可见 = prompt injection 漏洞），用 Proxy 在外部注入
LangGraph 在强监管行业仍有需求——“coding agent 行为不可预测，没有确定性保证”
Evals、Memory、Prompt Optimization 三者本质相关：都是 Agent 做事 → reward function 评判 → 更新参数

Discussion 补充（2026-04-12）

1. Skill 加载的可靠性是 Harrison 没展开的盲区

Harrison 把 Skill 描述为 “progressive disclosure, agent reads on demand”，但没有讨论 Agent 忘了加载怎么办。这在本次 session 就发生了——Claude 跳过 learning skill 直接给摘要，需要用户手动纠正。

Skill 触发的三层可靠性递进： - LLM 自行判断（Harrison 描述的）→ 最不可靠 - CLAUDE.md 里写规则（Justin 目前的做法）→ 规则可靠但执行者是 LLM，会跳过 - Hook 硬注入（Superpowers 的做法：SessionStart hook + 极端措辞）→ 注入可靠，但遵从仍依赖 LLM

Harrison 的隐含假设是”LLM 足够强就能靠 prompt 驱动一切”，但他自己也承认 “there’s no guarantee”。

2. Procedural Memory ≠ Agent Configuration 的信任边界问题

Harrison 说 DeepAgents 把 System Prompt、Skill、Tool 配置全部以文件形式暴露给 Agent 修改，这就是”学习”。Justin 的设计做了一个明确的 信任边界：

架构层（Skill、CLAUDE.md）→ 只有人类维护
经验层（memory 文件）→ Agent 写入、人类定期审计

关键质疑：Agent 修改自己配置的 reward signal 从哪来？ 没有明确的成功标准，Agent 凭什么判断”这次改配置是改对了”？这是 Harrison 没有回答的问题。配置漂移的风险在长期运行的系统中尤其严重——memory 文件漂移可以定期清理，Skill 文件漂移了整个行为模式都变了。

→ Justin 的分层设计（架构层人控、经验层 Agent 写）更适合长期演化的个人系统；Harrison 的做法更适合一次性任务型 Agent。

3. Skill 复杂度管理：编译 vs 解释

Harrison 的 Skill 是”一个 skill.md”扁平结构。Justin 的 Skill 是多层嵌套（hub SKILL.md → workflows → references → shared primitives），本质上是一个微型软件系统。

这不是 over-engineering——而是 把本来就要做的复杂调用过程自动化成步骤。如果不分层，要么一个巨大的 skill.md（token 更多），要么 Agent 每次手动判断该读哪些文件（更不可靠）。Harrison 的简单模型在 Skill 复杂度增长后会遇到这个问题。

4. Agent 主动 Compact 时机未到

Harrison 说的”给 Agent 一个 tool 让它自己决定何时压缩 context”是方向性愿景，目前没有证据表明效果好。Compact 的决策比 Skill 加载的决策难得多——Skill 加载错了成本低（用户提醒即可），Compact 判断错了信息永久丢失，不可逆。目前最务实的方案仍是阈值自动触发 + session-end 人工切断。

5. 整体定位：命名体系与行业坐标

这篇对 Justin 的价值不在于”学到新做法”，而在于 给已有实践提供了一套命名体系和行业坐标。Harrison 描述的 Harness 架构（System Prompt + Skill + Sub-Agent + File System + Context Compaction）与 CC 系统的工程实现高度对应，但两者的设计语境不同——Harrison 做通用产品化基础设施（面向所有开发者），Justin 做个人化长期演化系统（只服务一个用户）。同样的 primitive，不同语境下的设计选择自然不同。