Anthropic Just Dropped a Masterclass on Building Agent Harnesses (for Large Codebases)

Working effectively with AI coding agents in large codebases requires building a comprehensive 'AI layer' of context and tools around the model, not just rel...

By Sean Weldon

Building an AI Layer: How to Make Coding Agents Work in Large Codebases

TL;DR

AI coding agents struggle with large codebases unless you build a comprehensive AI layer - a harness of context and tools around the model. This layer includes global rules, hooks, skills, LSP/MCP servers, and subagents, and it matters as much as the model itself. Claude Code uses agentic search with command-line tools rather than traditional RAG, so it depends on upfront context curation through layered claude.md files. With that harness in place, it can navigate multi-million line monorepos at enterprise scale.

Key Takeaways

Why Do AI Coding Agents Struggle with Large Codebases?

Coding agents face a fundamental challenge when dealing with tens or hundreds of thousands of lines of code. Strategies that work perfectly in simple codebases fail catastrophically as complexity grows. Developers often assume their codebase is "too complex" for AI assistance.

This assumption is incorrect. Claude Code already operates successfully at enterprise scale across multi-million line monorepos and legacy systems. The difference between success and failure isn't the model's capabilities - it's the infrastructure built around it.

The real problem is navigation without proper context. AI agents need to know where to look before they can effectively work with complex code.

How Does Claude Code Navigate Large Codebases?

Claude Code uses agentic search with command-line tools rather than traditional RAG (Retrieval-Augmented Generation) or semantic search. The agent relies on grep, folder structure inspection, and other CLI tools to explore codebases dynamically.

This approach eliminates codebase indexing entirely. No sync overhead exists because Claude Code doesn't maintain a separate index that needs updating. The agent searches the actual codebase in real-time using the same tools developers use.

The system works best when given sufficient starting context to know where to look. Upfront context curation becomes critical for complex codebases - the agent needs navigation hints to begin its agentic search effectively.
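To make the index-free approach concrete, here is a minimal sketch of the two moves described above: inspecting folder structure and scanning files on disk for a pattern. The function names `list_tree` and `grep` are hypothetical stand-ins for the real CLI tools, not part of Claude Code.

```python
import re
from pathlib import Path


def list_tree(root, max_depth=2):
    """Return relative directory paths up to max_depth, like a shallow `ls -R`."""
    root = Path(root)
    found = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if path.is_dir() and len(rel.parts) <= max_depth:
            found.append(rel.as_posix())
    return found


def grep(root, pattern, glob="*.py"):
    """Scan files on disk for a regex, like `grep -rn`. No index is built or kept."""
    root = Path(root)
    regex = re.compile(pattern)
    hits = []
    for path in sorted(root.rglob(glob)):
        if not path.is_file():
            continue
        text = path.read_text(errors="ignore")
        for lineno, line in enumerate(text.splitlines(), 1):
            if regex.search(line):
                hits.append((path.relative_to(root).as_posix(), lineno, line.strip()))
    return hits
```

Every call reads the current state of the working tree, which is why there is nothing to keep in sync. The cost is that each search is only as good as the starting directory and pattern, which is where curated context comes in.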

What Is the AI Layer Framework?

The AI layer represents the set of context and tools you give your coding agent to work on a codebase. This harness matters as much as the underlying model itself. The ecosystem and context built around the model determine success or failure in real-world applications.

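The framework consists of seven components: global rules (claude.md files), hooks, skills, LSP integration, MCP servers, subagents, and plugins.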

Global rules form the foundation and dictate Claude Code's behavior continuously. The other components - hooks, skills, LSP, and MCP servers - activate sporadically based on need. This progressive disclosure pattern prevents context overload while maintaining comprehensive coverage.

How Should You Structure Global Rules?

Keep your global rules lean and layered. Avoid thousands-of-line files that overwhelm the LLM and consume valuable context window space. The root claude.md file should contain only core information that applies universally.

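Include these essential elements in global rules: codebase purpose, tech stack and architecture, coding conventions and standards, and common commands.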

Implement a layered approach with claude.md files in subdirectories. The root claude.md loads for all sessions automatically. Subdirectory claude.md files load progressively when Claude Code navigates into those areas, adding context only when relevant.
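The layering can be pictured as collecting rule files along the path from the repository root down to the directory being worked in. The sketch below illustrates that ordering only; `layered_rules` is a hypothetical helper, and Claude Code's actual loader may behave differently.

```python
from pathlib import Path


def layered_rules(repo_root, working_dir, name="claude.md"):
    """Return rule files from repo_root down to working_dir, root first."""
    repo_root = Path(repo_root).resolve()
    working_dir = Path(working_dir).resolve()
    # relative_to raises ValueError if working_dir is outside the repository.
    parts = working_dir.relative_to(repo_root).parts
    directories = [repo_root]
    for i in range(1, len(parts) + 1):
        directories.append(repo_root.joinpath(*parts[:i]))
    # Keep only the directories that actually contain a rule file.
    return [d / name for d in directories if (d / name).is_file()]
```

A session working in `services/payments` would pick up the root file plus any rule file on that path, while rule files in unrelated directories never enter the context.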

Initialize Claude Code in specific subdirectories to scope navigation and context to relevant parts of large codebases. This initialization strategy prevents the agent from wandering into irrelevant sections. Build codebase maps in global rules when directory structure doesn't provide sufficient navigation hints - these maps help Claude discover which parts of the codebase to focus on for specific tasks.
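One way to bootstrap a codebase map is to generate a skeleton from the top-level directories and fill in the descriptions by hand. This is a sketch under assumptions: the output format and the `codebase_map` helper are hypothetical, and the descriptions are whatever your team writes.

```python
from pathlib import Path


def codebase_map(root, descriptions):
    """Build one map line per top-level directory, skipping hidden ones."""
    lines = []
    for entry in sorted(Path(root).iterdir()):
        if not entry.is_dir() or entry.name.startswith("."):
            continue
        file_count = sum(1 for p in entry.rglob("*") if p.is_file())
        # Directories without a description are flagged for a human to fill in.
        note = descriptions.get(entry.name, "TODO: describe")
        lines.append(f"- {entry.name}/ ({file_count} files): {note}")
    return "\n".join(lines)
```

The resulting text can be pasted into the root claude.md so the agent knows, for example, that payment logic lives under one directory and UI code under another before it starts searching.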

What Are Hooks and How Do They Enable Self-Improvement?

Hooks are automation scripts that run at specific points in your Claude Code session. They enable dynamic context management and continuous improvement of your AI layer without manual intervention.

Stop hooks run at session end, executing separate headless Claude sessions to reflect on changes made during the session. These hooks analyze what was modified and propose updates to claude.md files while context remains fresh. This mechanism prevents rules from becoming stale as your codebase evolves - the AI layer improves itself automatically.
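A stop hook of this kind could be a small script that lists what changed and builds a reflection prompt for a headless session. The sketch below is an assumption-laden outline: the helper names are hypothetical, the prompt wording is illustrative, and it assumes `claude -p <prompt>` runs a single non-interactive turn, which you should verify against your installed CLI.

```python
import subprocess
from typing import Optional


def changed_files(repo_dir="."):
    """Tracked files that differ from HEAD in the working tree."""
    result = subprocess.run(
        ["git", "diff", "--name-only", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def reflection_prompt(files) -> Optional[str]:
    """Build the prompt for the reflection session, or None if nothing changed."""
    if not files:
        return None
    listing = "\n".join(f"- {name}" for name in files)
    return (
        "These files changed in the session that just ended:\n"
        f"{listing}\n"
        "Propose updates to the relevant claude.md files. "
        "Output a diff for human review; do not edit files directly."
    )


def headless_command(prompt):
    """Command line for the separate headless session."""
    # Assumption: `-p` is the CLI's non-interactive print mode.
    return ["claude", "-p", prompt]
```

Asking for proposals rather than direct edits keeps a human in the loop, which matches the "propose updates" behavior described above.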

Start hooks load team-specific context dynamically based on developer role or codebase location. These hooks can pull documentation from external sources like Confluence automatically, ensuring relevant context loads without manual copying. A start hook might detect which team you're on and load that team's specific conventions and patterns into the session context.
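The team-detection step might look like the following sketch, which maps the session's location in the repository to a team document. The `TEAM_CONTEXT` table, its paths, and `context_doc_for` are all hypothetical; fetching from Confluence or another external source would replace the local file paths used here.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical mapping from repository areas to team convention documents.
TEAM_CONTEXT = {
    "services": "docs/teams/backend.md",
    "services/payments": "docs/teams/payments.md",
    "web": "docs/teams/frontend.md",
}


def context_doc_for(cwd_rel: str) -> Optional[str]:
    """Return the doc for the most specific configured area containing cwd_rel."""
    path = PurePosixPath(cwd_rel)
    best_prefix = None
    for prefix in TEAM_CONTEXT:
        area = PurePosixPath(prefix)
        inside = path == area or area in path.parents
        # Prefer the deepest matching area so sub-teams override parents.
        if inside and (best_prefix is None or len(area.parts) > len(PurePosixPath(best_prefix).parts)):
            best_prefix = prefix
    return TEAM_CONTEXT[best_prefix] if best_prefix else None
```

A hook built on this would print the chosen document's contents so they land in the session context, and print nothing when no team matches.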

How Do Skills Enable Progressive Disclosure?

Skills are reusable prompts and processes for specific task types. A skill might define the complete workflow for adding an API route, including file creation, routing configuration, and test setup. These specialized workflows load only when needed.

Skills can be scoped to specific directory paths using path parameters. A skill for frontend component creation activates only when working in the components directory. This scoping prevents overwhelming the agent with context for tasks that don't apply to the current work.
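Path scoping amounts to matching the file being worked on against each skill's patterns. The source does not show the real skill configuration format, so the registry below is a hypothetical stand-in that only illustrates the matching logic.

```python
from fnmatch import fnmatch

# Hypothetical registry: skill name -> path patterns where the skill applies.
SKILLS = {
    "add-api-route": ["services/*/api/*"],
    "create-component": ["web/components/*"],
    "write-migration": ["db/migrations/*"],
}


def active_skills(file_path):
    """Return the names of skills whose patterns match file_path, sorted."""
    # Note: fnmatch's `*` also matches `/`, so patterns cover nested paths.
    return sorted(
        name
        for name, patterns in SKILLS.items()
        if any(fnmatch(file_path, pattern) for pattern in patterns)
    )
```

With this table, editing `web/components/button/Button.tsx` activates only `create-component`, and a file outside every pattern activates nothing, so unrelated workflows stay out of the context.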

The distinction from global rules is clear: rules are conventions and requirements that always apply, while skills are workflows and processes for specific situations. Rules tell Claude what standards to follow; skills tell Claude how to accomplish particular types of tasks. This separation enables progressive disclosure - load specialized knowledge only when the task requires it.

Why Do You Need LSP and MCP Servers?

Language Server Protocol (LSP) gives Claude the same code navigation capabilities developers have in IDEs. LSP enables go-to-definition (the control-click jump in an editor), type hints, symbol references, and syntax highlighting. These capabilities become critical in large codebases where string matching falls short.

MCP (Model Context Protocol) servers expose LSP functionality to Claude Code through custom tools. Custom MCP servers enable intelligent symbol-level searches beyond grep capabilities. Instead of searching for string matches, Claude can search specifically for function definitions, class references, or type declarations.

These tools become essential for codebases with six-digit line counts. Grep becomes slow and token-inefficient at this scale - searching for a common term might return thousands of matches. MCP servers enable directed searches for definitions and references rather than string matching, dramatically reducing token consumption and improving accuracy.
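The difference between string matching and symbol-level search is easy to see on a small example. The sketch below uses Python's standard `ast` module as a stand-in for a language server; a real setup would query LSP through an MCP tool instead, and the sample source and helper names are made up for illustration.

```python
import ast

SOURCE = '''
def charge(amount):
    return amount

def refund(amount):
    # reverses an earlier charge
    return -charge(amount)

class ChargeReport:
    def charge_total(self, items):
        return sum(charge(i) for i in items)
'''


def grep_lines(source, term):
    """Line numbers containing term anywhere, as plain string matching would."""
    return [n for n, line in enumerate(source.splitlines(), 1) if term in line]


def find_definitions(source, name):
    """Line numbers where a function or class called name is defined."""
    kinds = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    return sorted(
        node.lineno for node in ast.walk(ast.parse(source))
        if isinstance(node, kinds) and node.name == name
    )


def find_calls(source, name):
    """Line numbers where name is called directly, e.g. name(...)."""
    return sorted(
        node.lineno for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == name
    )


print(grep_lines(SOURCE, "charge"))        # [2, 6, 7, 10, 11]
print(find_definitions(SOURCE, "charge"))  # [2]
print(find_calls(SOURCE, "charge"))        # [7, 11]
```

String matching returns the comment and the unrelated `charge_total` method alongside the real hits, while the symbol queries return only the definition or only the call sites. In a large codebase that gap is what separates a handful of relevant results from thousands of noisy ones.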

How Do Subagents Prevent Context Window Bloat?

Subagents split exploration tasks from editing tasks using separate context windows. When Claude needs to research something - whether web documentation or codebase analysis - dispatch that work to a subagent rather than consuming your primary session's context.

Subagents return summaries to the primary session after completing exploration work. Exploration tasks can consume hundreds of thousands of tokens investigating documentation, analyzing code patterns, or researching solutions. These tokens don't impact your editing context when handled by subagents.
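The accounting behind this can be shown with a toy model in which each context window is just a list of token counts. The class, function, and numbers below are illustrative assumptions, not measurements or a real API.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """A context window modeled as a list of (label, token_count) entries."""
    entries: list = field(default_factory=list)

    def add(self, label, tokens):
        self.entries.append((label, tokens))

    @property
    def tokens(self):
        return sum(count for _, count in self.entries)


def explore_in_subagent(primary, documents, summary_tokens):
    """Read documents in a separate context; hand only a summary to primary.

    Returns the number of tokens the subagent consumed.
    """
    subagent = Context()
    for label, tokens in documents:
        subagent.add(label, tokens)
    # Only the summary crosses over into the primary session.
    primary.add("subagent summary", summary_tokens)
    return subagent.tokens


primary = Context()
primary.add("editing task", 2_000)
spent = explore_in_subagent(
    primary, [("api docs", 120_000), ("legacy module", 80_000)], summary_tokens=1_500
)
print(spent, primary.tokens)  # 200000 3500
```

The subagent absorbs the full cost of reading, and the primary session grows only by the size of the summary it receives.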

Built-in explorer subagents are available in Claude Code and Codex, eliminating the need for custom definitions. Simply dispatch exploration work to these subagents and receive concise summaries. This separation keeps your primary session focused on the actual coding task while background research happens in parallel context windows.

How Should Organizations Implement the AI Layer?

Assign ownership to an individual or small team to champion AI layer development. This owner builds the foundational rules, skills, and infrastructure during a quiet investment period. Avoid the scenario where every developer creates separate AI layers - inconsistent results and wasted effort follow.

Start with core global rules and expand progressively. Build the root claude.md file with essential codebase information. Add subdirectory rules for major components. Develop skills for common workflows your team performs repeatedly.

Roll out the standardized AI layer to your organization over time. Prevent user disappointment by ensuring the AI layer exists before widespread adoption of Claude Code. Developers trying Claude without proper context infrastructure will conclude the tool doesn't work, when actually the supporting harness was missing.

The plugin approach allows quick integration of AI layer strategies into existing codebases. Package your global rules, hooks, and skills as plugins that new projects can adopt immediately. This standardization ensures consistent AI agent performance across your organization's codebases.

What the Experts Say

"The harness matters as much as the model."

This statement captures the core thesis that model capabilities alone don't determine success with AI coding agents. The infrastructure and context you build around the model - the harness - determines whether the agent can navigate and modify your codebase effectively.

"Keep your global rules lean and layered."

This principle prevents the common mistake of creating massive claude.md files that overwhelm the LLM's context window. Layered rules that load progressively based on location provide comprehensive coverage without context overload, enabling effective navigation of even multi-million line codebases.

Frequently Asked Questions

Q: Does Claude Code require codebase indexing or RAG setup?

No. Claude Code uses agentic search with command-line tools like grep and folder structure inspection rather than RAG or semantic search. This approach eliminates codebase indexing and sync overhead entirely, working directly with the actual codebase in real-time.

Q: What should I include in the root claude.md file?

Include only core information that applies universally: codebase purpose, tech stack and architecture, coding conventions and standards, and common commands. Keep this file lean - thousands of lines overwhelm the LLM. Use subdirectory claude.md files for area-specific context.

Q: How do subdirectory claude.md files work?

Subdirectory claude.md files load automatically when Claude Code navigates into those directories, adding context progressively. The root claude.md always loads for all sessions, while subdirectory files load only when working in those specific areas, preventing context overload.

Q: What's the difference between global rules and skills?

Global rules are conventions and requirements that always apply throughout sessions - coding standards, architecture patterns, and core commands. Skills are reusable workflows and processes for specific task types that load only when needed, like procedures for adding API routes or creating components.

Q: How do stop hooks improve the AI layer automatically?

Stop hooks run separate headless Claude sessions at session end to analyze changes made during the session and propose updates to claude.md files while context is fresh. This prevents rules from becoming stale as the codebase evolves, enabling continuous self-improvement.

Q: When do I need MCP servers instead of just grep?

MCP servers become essential for codebases with six-digit line counts where grep becomes slow and token-inefficient. They expose LSP functionality for symbol-level searches of definitions and references rather than string matching, dramatically reducing token consumption and improving accuracy.

Q: How do subagents prevent context window bloat?

Subagents operate with separate context windows for exploration tasks like web research or codebase analysis. These tasks can consume hundreds of thousands of tokens without impacting your primary editing session. Subagents return concise summaries to the primary session.

Q: Should each developer create their own AI layer?

No, assign ownership to an individual or small team to create a standardized AI layer for your organization. Everyone developing separate AI layers wastes effort and produces inconsistent results. Roll out the organizational standard progressively to ensure consistent AI agent performance.

The Bottom Line

Building an effective AI layer around your coding agent matters as much as the model's underlying capabilities. The harness of global rules, hooks, skills, LSP/MCP servers, and subagents determines whether AI agents can successfully navigate and modify large codebases at enterprise scale.

Claude Code already operates successfully across multi-million line monorepos when supported by proper context architecture. The key is layered global rules that load progressively, self-improving hooks that keep context fresh, and specialized tools that activate only when needed. This infrastructure enables agentic search with CLI tools to scale effectively without the overhead of RAG indexing.

Start building your AI layer today by creating a lean root claude.md file with core codebase information, then expand progressively with subdirectory rules and skills for common workflows. Assign ownership to ensure organizational standardization rather than fragmented individual efforts. The investment in this third pillar of your codebase - alongside code and tests - will determine your success with AI-assisted development at scale.


About the Author

Sean Weldon is an AI engineer and systems architect specializing in autonomous systems, agentic workflows, and applied machine learning. He builds production AI systems that automate complex business operations.
