The "Mismanaged Genius" Problem
Modern LLMs are capable enough to solve complex coding tasks, yet they often fail to deliver reliable outcomes. The bottleneck is not raw model capability but the lack of a management layer to specify, verify, and reuse work. Raymond Weitekamp argues that agents should be treated as "mismanaged geniuses": what they need is better orchestration, not more raw intelligence.
Recursive Language Models (RLMs) as Reasoning Engines
RLMs represent a paradigm shift in inference-time compute by marrying tool calling with reasoning. Unlike standard prompting, where a model reads a static context, an RLM treats the context as a variable to be manipulated symbolically.
Key characteristics of an RLM include:
- Externalized Context: The prompt is not fully loaded into the context window; instead, the model uses tools to explore it symbolically.
- Recursive Decomposition: The model breaks down a problem into sub-tasks, delegating them to sub-agents (or sub-RLMs) that operate independently.
- Executable Environment: The reasoning process is unified with code execution, allowing the agent to "reason through tool calling."
This approach can allow smaller models (e.g., Qwen 2.5 7B) to outperform frontier models on long-reasoning tasks. Because each recursive call works on only a slice of the problem, the chain of calls maintains a thread of execution over more material than a single context window can hold.
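The characteristics above can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the implementation described in the talk: `ContextEnv`, `rlm`, and `toy_llm` are hypothetical names, the "model" is a plain function that filters lines, and a fixed binary split stands in for the decomposition that a real RLM would choose for itself through tool calls.

```python
from typing import Callable, List, Optional

# Hypothetical stand-in for a model call: prompt text in, answer text out.
LLM = Callable[[str], str]


class ContextEnv:
    """Keeps the full context outside the model and exposes a small tool over it."""

    def __init__(self, lines: List[str]) -> None:
        self._lines = lines

    def size(self) -> int:
        return len(self._lines)

    def peek(self, start: int, end: int) -> str:
        """Return only the requested slice of lines as text."""
        return "\n".join(self._lines[start:end])


def rlm(query: str, env: ContextEnv, llm: LLM, window: int,
        start: int = 0, end: Optional[int] = None) -> str:
    """Answer `query` over env[start:end], reading at most `window` context lines per call."""
    if window < 1:
        raise ValueError("window must be at least 1")
    end = env.size() if end is None else end
    if end - start <= window:
        # Base case: the slice fits in one call, so the model reads it directly.
        return llm(f"{query}\n---\n{env.peek(start, end)}")
    # Recursive case: delegate each half to a sub-call, then merge their answers.
    mid = (start + end) // 2
    left = rlm(query, env, llm, window, start, mid)
    right = rlm(query, env, llm, window, mid, end)
    return llm(f"{query}\n---\n{left}\n{right}")


def toy_llm(prompt: str) -> str:
    """Toy 'model' for the demo: echo any payload line containing 'KEY='."""
    _, _, payload = prompt.partition("\n---\n")
    return "\n".join(line for line in payload.splitlines() if "KEY=" in line)


# 1,000 lines of context, with one relevant line buried inside.
lines = [f"filler {i}" for i in range(1000)]
lines[777] = "KEY=42"
answer = rlm("find the key", ContextEnv(lines), toy_llm, window=50)
print(answer)  # prints KEY=42
```

No single call in this sketch sees more than 50 lines of the original context, yet the top-level call answers a question over all 1,000. That is the property the recursive structure is meant to provide.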
Implementing Recursive Coding Agents
To turn a standard coding agent into an RLM, developers need a harness that lets the agent call itself or other agents recursively. Several tools already support this pattern:
- Pi (Coding Agent): A minimal, extensible agent that now supports pure recursive extensions, allowing for deep, nested task execution.
- Open Pros: A markdown-based programming language that allows users to declare sub-agent workflows. It enables developers to define explicit dependencies (skills/tools) for sub-agents, ensuring they are configured correctly before executing their portion of a contract.
- Dynamic Workflows: Tools like Anthropic's Claude Code now support dynamic workflows, which effectively turn them into RLMs by enabling recursive, multi-step agentic loops.
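A recursive harness of this kind might look roughly like the following sketch. It does not reproduce the API of Pi, Open Pros, or Claude Code. `Harness`, `toy_agent`, and the `delegate` callback are hypothetical names chosen for illustration, and the "agent" is a plain function that sums integers. The sketch shows two ideas from the list: an agent delegating to deeper copies of itself, and declared dependencies being checked before a sub-agent starts work.

```python
from typing import Callable, Dict, Set

Tool = Callable[..., str]
# Hypothetical agent signature: (task, granted tools, delegate callback) -> result.
Delegate = Callable[[str, Set[str]], str]
Agent = Callable[[str, Dict[str, Tool], Delegate], str]


class Harness:
    """Lets an agent spawn sub-agents of itself, with a depth limit and tool checks."""

    def __init__(self, agent: Agent, tools: Dict[str, Tool], max_depth: int = 3) -> None:
        self.agent = agent
        self.tools = tools
        self.max_depth = max_depth

    def run(self, task: str, requires: Set[str], depth: int = 0) -> str:
        if depth > self.max_depth:
            raise RecursionError(f"depth {depth} exceeds limit {self.max_depth}")
        # Verify declared dependencies before the sub-agent starts work.
        missing = requires - set(self.tools)
        if missing:
            raise ValueError(f"missing tools: {sorted(missing)}")
        granted = {name: self.tools[name] for name in requires}

        def delegate(subtask: str, sub_requires: Set[str]) -> str:
            # A sub-agent is the same harness entry point, one level deeper.
            return self.run(subtask, sub_requires, depth + 1)

        return self.agent(task, granted, delegate)


def toy_agent(task: str, tools: Dict[str, Tool], delegate: Delegate) -> str:
    """Toy 'agent': sums space-separated integers by splitting the list in half."""
    numbers = task.split()
    if len(numbers) == 1:
        return numbers[0]
    mid = len(numbers) // 2
    left = delegate(" ".join(numbers[:mid]), {"add"})
    right = delegate(" ".join(numbers[mid:]), {"add"})
    return tools["add"](left, right)


tools = {"add": lambda a, b: str(int(a) + int(b))}
harness = Harness(toy_agent, tools)
print(harness.run("1 2 3 4", {"add"}))  # prints 10
```

The depth limit is a design choice of this sketch rather than something the source prescribes. Without some budget of this kind, a recursive agent that keeps delegating would never terminate.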
Capturing Reliability
Reliability comes from moving from "one-off" prompts to reusable workflows. Weitekamp demonstrates that agents can be instructed to deconstruct a "golden session" (a successful execution path) into a reusable program, such as an Open Pros markdown file. This lets developers codify reasoning patterns that have already worked, so the agent can reproduce strong results more consistently on tasks such as bug sweeps, large-scale refactors, or adversarial audits.
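As a rough illustration of turning a golden session into a reusable program, the sketch below records a session as a list of steps and renders it as a parameterized markdown workflow. Everything here is an assumption made for the example: the `Step` record, the function names, and the markdown layout are hypothetical and are not Open Pros syntax. The source also describes asking the agent itself to do the deconstruction, whereas this sketch does a mechanical version by replacing one concrete value with a placeholder.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One recorded step of a successful session (hypothetical record format)."""
    agent: str
    instruction: str
    tools: List[str]


def session_to_workflow(title: str, steps: List[Step],
                        concrete: str, placeholder: str) -> str:
    """Render recorded steps as markdown, replacing `concrete` with a {placeholder} slot."""
    slot = "{" + placeholder + "}"
    lines = [f"# {title}", "", f"Parameters: {placeholder}", ""]
    for number, step in enumerate(steps, start=1):
        generalized = step.instruction.replace(concrete, slot)
        tools = ", ".join(step.tools) if step.tools else "none"
        lines.append(f"{number}. [{step.agent}] {generalized} (tools: {tools})")
    return "\n".join(lines) + "\n"


def instantiate(workflow: str, placeholder: str, value: str) -> str:
    """Fill the slot so the same workflow can be rerun on a new target."""
    return workflow.replace("{" + placeholder + "}", value)


# A recorded session that worked once, for one specific file.
golden = [
    Step("scanner", "List functions in src/auth.py that lack tests", ["grep"]),
    Step("fixer", "Write tests for the untested functions in src/auth.py",
         ["edit", "pytest"]),
]
workflow = session_to_workflow("Test coverage sweep", golden,
                               concrete="src/auth.py", placeholder="target")
rerun = instantiate(workflow, "target", "src/billing.py")
print(rerun)
```

The point of the parameterized form is that the step order, agent roles, and tool requirements from the successful run are preserved, while only the target changes between runs.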