May 22, 2026

May 22, 2026

framework

Agency Lives in the Model. Your Job Is Building the Vehicle.

A new open-source curriculum reframes how engineers should think about agent products: the model supplies agency through training, and the harness is the infrastructure that lets it operate. Here is what that split means for builders.

Most agent tutorials start with orchestration code. Learn Claude Code starts one layer earlier, with a claim that changes how you design everything: agency comes from model training, not from the surrounding code.

The distinction matters. If you believe the harness creates agency, you keep adding logic to compensate for model weaknesses. If you accept that agency was learned during training, you stop overbuilding the wrapper and focus on giving the model a clean environment to operate in.

The repo states it plainly: the model is the driver, the harness is the vehicle. An agent product needs both, but they are not equivalent. The model handles perception, reasoning, and action. The harness handles the specific environment that lets those capabilities land.

The historical evidence cited is hard to argue with. In 2013, DeepMind's DQN learned seven Atari 2600 games from raw pixels and game scores alone, with no game-specific rules. By 2015, it reached professional tester level across 49 games, published in Nature. In 2019, OpenAI Five played 45,000 years of Dota 2 through self-play over ten months, then defeated OG (the TI8 world champions) 2-0 in a live match, winning 99.4% of 42,729 public games. That same year, DeepMind's AlphaStar beat a professional StarCraft II player 10-1 in closed matches and reached Grandmaster rank on the European server, placing in the top 0.15% of 90,000 players. Also in 2019, Tencent's Jueyu system defeated KPL professional players in full 5v5 at the World Champion Cup.

None of those systems relied on scripted strategies. The agency was in the trained weights. The surrounding infrastructure just gave the model a place to act.

This framing has a direct consequence for product engineers. When your agent misbehaves, the first question should be whether the model can actually do the task, not whether your orchestration layer needs another conditional branch. Adding harness complexity to cover a capability gap does not fix the gap. It hides it and makes debugging harder.

The concrete implication: audit your current agent architecture and separate what is model responsibility from what is harness responsibility. If you are writing code to reason on behalf of the model, stop. Write code that gives the model better context, cleaner tools, and tighter feedback loops instead. The harness should serve the model, not substitute for it.