Recursive Coding Agents: Managing AI Geniuses

The "Mismanaged Genius" Problem

Modern LLMs are capable enough to solve complex coding tasks, yet they often fail to deliver reliable outcomes. The bottleneck is not raw model capability but the lack of a management layer that specifies, verifies, and reuses work. Raymond Weitekamp argues that agents should be treated as "mismanaged geniuses": they need better orchestration, not more raw intelligence.

Recursive Language Models (RLMs) as Reasoning Engines

RLMs change how inference-time compute is spent by combining tool calling with reasoning. In standard prompting, the model reads a static context; an RLM instead treats the context as a variable that it manipulates symbolically.

Key characteristics of an RLM include:

Externalized Context: The prompt is not fully loaded into the context window; instead, the model uses tools to explore it symbolically.
Recursive Decomposition: The model breaks down a problem into sub-tasks, delegating them to sub-agents (or sub-RLMs) that operate independently.
Executable Environment: The reasoning process is unified with code execution, allowing the agent to "reason through tool calling."

This approach allows smaller models (e.g., Qwen 2.5 7B) to outperform frontier models on long-reasoning tasks. The model maintains a single thread of execution across recursive calls, so the work as a whole can cover more material than fits in one context window.
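The recursive pattern described above can be illustrated with a minimal Python sketch. It is not an RLM implementation: `solve_leaf` and `combine` are hypothetical stand-ins for model calls, and the toy task (counting error lines in a log) is an assumption chosen so the example runs without a model. The point is only that the full context lives in a variable, and no single call sees more than `max_lines` of it.

```python
from typing import Callable, List

# Hypothetical stand-ins for model calls (assumptions, not a real API):
#   solve_leaf: answers the question for a slice small enough to read directly
#   combine:    merges two partial answers into one
LeafSolver = Callable[[List[str]], int]
Combiner = Callable[[int, int], int]


def recursive_solve(lines: List[str], solve_leaf: LeafSolver,
                    combine: Combiner, max_lines: int = 4) -> int:
    """Answer a question over `lines` without handing any single call
    more than `max_lines` lines of context."""
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    if len(lines) <= max_lines:
        # Small enough: one call reads this slice directly.
        return solve_leaf(lines)
    # Too large: split the externalized context and recurse on each half.
    mid = len(lines) // 2
    left = recursive_solve(lines[:mid], solve_leaf, combine, max_lines)
    right = recursive_solve(lines[mid:], solve_leaf, combine, max_lines)
    return combine(left, right)


# Toy leaf solver that records how much context each call received.
seen_sizes: List[int] = []


def count_errors(chunk: List[str]) -> int:
    seen_sizes.append(len(chunk))
    return sum(1 for line in chunk if "ERROR" in line)


# The "prompt" is held as a variable; lines 0, 3, 6 and 9 contain ERROR.
log = [f"line {i}: {'ERROR' if i % 3 == 0 else 'ok'}" for i in range(10)]
total = recursive_solve(log, count_errors, lambda a, b: a + b)
print(total, max(seen_sizes))  # prints: 4 3
```

In a real system the split would presumably be chosen by the model through tool calls rather than by halving, but the budget invariant is the same.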

Implementing Recursive Coding Agents

To transform standard coding agents into RLMs, developers must implement a harness that allows the agent to call itself or other agents recursively.

Pi (Coding Agent): A minimal, extensible agent that now supports pure recursive extensions, allowing for deep, nested task execution.
Open Pros: A markdown-based programming language that allows users to declare sub-agent workflows. It enables developers to define explicit dependencies (skills/tools) for sub-agents, ensuring they are configured correctly before executing their portion of a contract.
Dynamic Workflows: Tools like Anthropic's Claude Code now support dynamic workflows, which effectively turn them into RLMs by enabling recursive, multi-step agentic loops.
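A harness of this kind can be sketched in a few lines of Python. This is an illustrative assumption, not the API of Pi, Open Pros, or Claude Code: `Harness`, `toy_policy`, and the `"answer"`/`"delegate"` decision format are hypothetical names, and the policy function stands in for a model. The sketch shows the two things a recursive harness needs at minimum: a way for an agent to hand subtasks to child agents, and a depth limit so the recursion terminates.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple, Union

# A policy stands in for the model (hypothetical format): it returns
# ("answer", text) to finish, or ("delegate", [subtasks]) to recurse.
Decision = Tuple[str, Union[str, List[str]]]
Policy = Callable[[str], Decision]


@dataclass
class Harness:
    policy: Policy
    max_depth: int = 3
    trace: List[Tuple[int, str]] = field(default_factory=list)

    def run(self, task: str, depth: int = 0) -> str:
        """Run one agent on `task`; delegation spawns child runs at depth + 1."""
        self.trace.append((depth, task))
        kind, payload = self.policy(task)
        if kind == "answer":
            return str(payload)
        if depth >= self.max_depth:
            # Refuse to delegate further once the depth budget is spent.
            return f"[depth limit] {task}"
        # Each subtask is handled by an independent recursive call.
        results = [self.run(sub, depth + 1) for sub in payload]
        return "; ".join(results)


def toy_policy(task: str) -> Decision:
    """Toy stand-in for a model: split compound tasks on ' and '."""
    parts = [p.strip() for p in task.split(" and ")]
    if len(parts) > 1:
        return ("delegate", parts)
    return ("answer", f"done: {task}")


harness = Harness(toy_policy)
result = harness.run("fix parser and update tests")
print(result)  # prints: done: fix parser; done: update tests
```

The `trace` list records each call with its depth, which is one simple way to make nested execution inspectable after the fact.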

Capturing Reliability

Reliability comes from moving from "one-off" prompts to reusable workflows. Weitekamp demonstrates that agents can be instructed to deconstruct a "golden session" (a successful execution path) into a reusable program, such as an Open Pros markdown file. This lets developers codify successful reasoning patterns so that the agent can reproduce high-performance results consistently across tasks such as bug sweeps, large-scale refactors, or adversarial audits.
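As a rough illustration of deconstructing a golden session, the Python sketch below keeps only the steps that succeeded and replaces the task-specific file name with a placeholder. Everything here is an assumption made for the example: the `Step` record, the `distill_workflow` and `instantiate` helpers, and the output layout are hypothetical, and the generated text is not Open Pros syntax.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    tool: str          # which tool the agent called (hypothetical labels)
    instruction: str   # what it was asked to do
    succeeded: bool    # whether the step worked in the recorded session


def distill_workflow(name: str, session: List[Step], target: str) -> str:
    """Turn a recorded session into a reusable template: drop failed
    steps and replace the concrete target with a {target} placeholder."""
    lines = [f"# Workflow: {name}", ""]
    kept = [s for s in session if s.succeeded]
    for i, step in enumerate(kept, start=1):
        generic = step.instruction.replace(target, "{target}")
        lines.append(f"{i}. [{step.tool}] {generic}")
    return "\n".join(lines)


def instantiate(template: str, target: str) -> str:
    """Fill the placeholder to reuse the workflow on a new target."""
    return template.replace("{target}", target)


# A made-up session for illustration; the failed step is discarded.
golden = [
    Step("grep", "search src/parser.py for TODO markers", True),
    Step("edit", "rewrite all of src/parser.py at once", False),
    Step("edit", "fix each TODO in src/parser.py one at a time", True),
    Step("test", "run the tests covering src/parser.py", True),
]

template = distill_workflow("bug sweep", golden, "src/parser.py")
print(instantiate(template, "src/lexer.py"))
```

In the approach described above, the agent itself performs the deconstruction; the code only shows what the resulting artifact might capture, namely the successful path with the one-off details abstracted away.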
