How Building with AI Can Double the Throughput of Your Engineering Team — Brian Scanlan, Intercom

2026-05-19 By Sean Weldon

Abstract

This research synthesis examines how Intercom doubled engineering productivity within twelve months through comprehensive AI agent integration. The company established an agent-first software development paradigm by consolidating around Claude Code as a unified platform, developing durable skill-based architectures, and implementing organizational changes that treat AI agents as first-class engineering citizens. Key achievements include reaching the 90th percentile for AI-generated pull requests, attaining a 17.6% automatic code approval rate while maintaining SOC 2, ISO 27001, and HIPAA compliance, and processing 2.2 million customer support resolutions weekly through the company's Fin agent. The findings indicate that productivity gains require decisive platform consolidation, continuous skill refinement through backtesting, and comprehensive organizational commitment rather than incremental tool adoption. This case study provides evidence for the viability of treating AI agents as full engineering participants rather than auxiliary assistance tools.

1. Introduction

The integration of artificial intelligence into software engineering workflows represents a fundamental transformation in how technical organizations operate. While many companies experiment with AI-assisted coding tools, few have systematically restructured their entire engineering organization around agent-first principles. This synthesis examines Intercom's comprehensive approach to achieving a 2x productivity increase in engineering throughput within one year, accomplished without proportional headcount expansion.

Agent-first development refers to the architectural and organizational principle that all technical work executable by human engineers must be equally executable by AI agents. This paradigm shift extends beyond code generation to encompass debugging, testing, planning, and incident response. The central thesis posits that achieving substantial productivity gains requires treating AI agents as first-class citizens in the software development lifecycle, necessitating platform consolidation, skill-based architecture, and decisive organizational change management.

The significance of this research extends beyond a single company's transformation. Intercom's approach demonstrates that the question is not whether AI will transform knowledge work, but rather how organizations can systematically implement this transformation while maintaining quality, security, and compliance standards. The analysis proceeds through five sections: establishing the business context and transformation rationale, examining the platform consolidation strategy, analyzing the skill-based architecture framework, evaluating metrics and compliance outcomes, and synthesizing broader implications for software engineering organizations.

2. Background and Related Work

Intercom's transformation occurred against the backdrop of rapid advancement in large language model capabilities for code generation. The company pivoted to AI-focused product development when ChatGPT launched in late 2022, and released its customer support agent Fin on the same day GPT-4 became available. Fin now serves over 8,000 customers and approaches $100 million in revenue, processing 2.2 million resolutions weekly with industry-leading resolution rates. The company also developed a proprietary model that serves 100% of Fin's English text conversations and that reportedly outperforms frontier models at lower cost and higher speed.

This external product success established organizational credibility for aggressive AI adoption internally. The productivity initiative launched mid-2024, deliberately timed with what leadership identified as the most significant capability shift in coding models, particularly around December 2024. The primary metric—code changes per research and development person—was measured through developer surveys and developer experience tools, providing quantitative validation of throughput improvements.
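The primary metric can be made concrete with a small worked example. The sketch below is illustrative only: the class and function names are assumptions introduced here, and the figures are hypothetical values chosen to show the arithmetic, not Intercom's data. It shows why a per-person ratio, rather than a raw count of changes, is what separates a throughput gain from headcount growth.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodStats:
    """Merged code changes and R&D headcount for one measurement period."""

    merged_changes: int
    rd_headcount: int

    @property
    def changes_per_person(self) -> float:
        if self.rd_headcount <= 0:
            raise ValueError("rd_headcount must be positive")
        return self.merged_changes / self.rd_headcount


def throughput_ratio(baseline: PeriodStats, current: PeriodStats) -> float:
    """Per-person throughput in `current` relative to `baseline`."""
    return current.changes_per_person / baseline.changes_per_person


# Hypothetical numbers, chosen only to illustrate the arithmetic.
baseline = PeriodStats(merged_changes=1200, rd_headcount=300)  # 4.0 per person
current = PeriodStats(merged_changes=2640, rd_headcount=330)   # 8.0 per person

ratio = throughput_ratio(baseline, current)
print(f"per-person throughput ratio: {ratio:.1f}x")
```

In this hypothetical, raw merged changes grew 2.2x while headcount grew 10%, so the per-person figure is exactly 2x.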

The theoretical foundation draws on parallel transformations in software engineering history, particularly the transition from system administrators to site reliability engineers that accompanied cloud computing adoption. This historical analogy suggests that engineer and product builder roles are similarly moving up the abstraction stack, with AI agents handling lower-level technical execution while humans focus on problem definition and architectural decisions.

3. Core Analysis

3.1 Platform Consolidation Strategy

The decision to consolidate around Claude Code as a single unified platform emerged as a critical success factor. Following an initial period of tool diversity—including Cursor, Augment, and GitHub Copilot—leadership made the platform consolidation decision in December 2024, with rollout beginning in January 2025. This strategic choice addressed two fundamental challenges: model anxiety stemming from multi-cloud fragmentation and the inability to achieve compounding optimization benefits when effort disperses across multiple agent platforms.

The platform consolidation enabled treating Claude Code as a senior engineer capable of any technical task across the organization. Engineering leadership connected Claude Code to all engineering workflows with mature controls, permissions, audits, and sufficient confidence to deploy it with the same autonomy granted to human engineers. This integration extended beyond simple code generation to encompass the full spectrum of technical work, including debugging, testing, planning, and incident response.

The consolidation strategy yielded measurable results. Pull requests generated by Claude Code reached the 90th percentile range following platform consolidation, indicating quality comparable to or exceeding typical human-generated contributions. The company built tens of thousands of lines of code in internal Claude Code plugins and pushed them directly to every engineer's laptop, bypassing standard update mechanisms so that installation problems would not have to be debugged at scale.

3.2 Skill-Based Architecture and Knowledge Encapsulation

The technical implementation centered on developing durable, testable skills rather than multi-agent orchestrators or custom workflows. This architectural choice prioritized small, high-quality, reusable components that encapsulate Intercom-specific knowledge including Rails conventions, architecture patterns, React standards, testing requirements, and security rules. These skills were implemented as engineering captures, guidance documents, and hooks that enforce best practices automatically.

The skill development methodology employed a continuous improvement flywheel: when agents encountered issues or pursued incorrect approaches, engineers updated guidance documents, creating a self-reinforcing cycle of capability enhancement. This approach proved superior to prescriptive task-based systems. Rather than instructing agents on specific steps to execute, the framework gave agents problems and allowed them to determine which skills to invoke, paralleling how senior engineers approach complex technical challenges.

Validation of skill quality occurred through backtesting against historical data, including previous code changes, incidents, and work artifacts. This empirical validation approach ensured skills produced reliable outcomes before deployment. A notable example involved developing a flaky test fixer skill: rather than prescribing specific requirements, engineers engaged in a feedback loop with the agent, which produced a well-organized solution featuring lookup tables and progressive disclosure for managing hundreds of thousands of flaky tests.
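The backtesting idea described above can be sketched as a small harness that replays historical cases through a skill and compares each result with the outcome humans actually accepted. This is a minimal sketch under stated assumptions: HistoricalCase, BacktestReport, backtest, and the toy skill are hypothetical names introduced for illustration, and the source does not describe Intercom's actual harness or its data format.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class HistoricalCase:
    """A past work item together with the outcome humans actually accepted."""

    case_id: str
    problem: str
    accepted_outcome: str


@dataclass(frozen=True)
class BacktestReport:
    total: int
    passed: int
    failed_ids: tuple[str, ...]

    @property
    def pass_rate(self) -> float:
        return self.passed / self.total if self.total else 0.0


def backtest(
    skill: Callable[[str], str],
    cases: Sequence[HistoricalCase],
    matches: Callable[[str, str], bool],
) -> BacktestReport:
    """Replay every historical case through `skill` and record mismatches."""
    failed = [
        case.case_id
        for case in cases
        if not matches(skill(case.problem), case.accepted_outcome)
    ]
    return BacktestReport(
        total=len(cases),
        passed=len(cases) - len(failed),
        failed_ids=tuple(failed),
    )


def toy_flaky_test_skill(problem: str) -> str:
    """Hypothetical stand-in for an agent skill: maps a failure to a fix label."""
    text = problem.lower()
    if "timeout" in text:
        return "increase-wait"
    if "order" in text:
        return "isolate-shared-state"
    return "needs-human-triage"


# Hypothetical history, invented for this example.
history = [
    HistoricalCase("case-1", "Test fails with timeout waiting for element", "increase-wait"),
    HistoricalCase("case-2", "Test fails depending on execution order", "isolate-shared-state"),
    HistoricalCase("case-3", "Test fails only around midnight UTC", "freeze-clock"),
]

report = backtest(toy_flaky_test_skill, history, matches=lambda got, want: got == want)
print(f"{report.passed}/{report.total} passed; failures: {report.failed_ids}")
```

The failing case identifiers are the useful output here: in the flywheel described earlier, each one would prompt an update to the skill's guidance before the skill is deployed more widely.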

3.3 Organizational Change Management

Achieving systematic adoption required interventions extending far beyond tool provisioning. Leadership established binary performance expectations: failure to adopt AI constituted performance failure regardless of role, explicitly including designers, product managers, and engineers. Job descriptions were updated to reflect this requirement, and leadership repeated this message over 100 times across every organizational forum to maintain urgency and consistency.

The change management approach incorporated multiple reinforcement mechanisms. Leadership celebrated AI adoption wins in dedicated Slack channels, showcasing effective techniques and successful outcomes across teams. The organization conducted hackathons and AI immersion days to accelerate learning for hundreds of engineers. Critically, a dedicated 2x team was staffed full-time with the organization's best people, reflecting the conviction that medium and large organizations require top talent focused exclusively on this transformation rather than treating it as an auxiliary responsibility.

A maturity model helped individuals understand their current capability level and progression path. The model defined five stages: using Claude Code for basic tasks, automating repetitive work, moving to skill-based approaches, writing and approving skills, and finally optimizing the entire environment—including software architecture and documentation—for agent effectiveness. This framework addressed the reality that AI adoption remained unevenly distributed even within Intercom, providing clear guidance for capability development.

3.4 Problem-Driven Agent Invocation

The implementation philosophy emphasized giving agents problems rather than tasks, allowing autonomous determination of skill invocation. This approach proved particularly effective in incident response scenarios. In one documented example, an engineer opened Claude Code and described a security incident. The agent automatically invoked the data breach policy skill, analyzed relevant files, concluded the incident was innocuous, and provided next steps—completing in two minutes what would have required twenty minutes of manual work.
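The difference between handing an agent a problem and handing it a task can be illustrated with a toy skill registry. In the sketch below, the caller supplies only a problem description and a selector decides which skills are relevant. The keyword-overlap selector is a deliberately crude stand-in for the model's own judgment and is not how Claude Code chooses skills; the Skill class, the skill names, and the descriptions are all hypothetical.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Sequence

# Small stopword list so that filler words do not count as evidence of relevance.
STOPWORDS = {"a", "an", "the", "and", "or", "in", "to", "of", "for", "from", "we", "may", "have"}


@dataclass(frozen=True)
class Skill:
    name: str
    description: str


def _tokens(text: str) -> set[str]:
    """Lowercased word tokens with stopwords removed."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def select_skills(problem: str, skills: Sequence[Skill], min_overlap: int = 2) -> list[Skill]:
    """Return skills whose descriptions share enough words with the problem.

    Assumption: word overlap stands in for the agent's relevance judgment.
    """
    problem_tokens = _tokens(problem)
    scored = []
    for skill in skills:
        overlap = len(problem_tokens & _tokens(skill.description))
        if overlap >= min_overlap:
            scored.append((overlap, skill))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [skill for _, skill in scored]


# Hypothetical registry, invented for this example.
registry = [
    Skill("data-breach-policy", "Assess a suspected security incident or data breach and recommend next steps"),
    Skill("flaky-test-fixer", "Diagnose and fix flaky tests in the test suite"),
    Skill("rails-conventions", "Apply Rails conventions when writing or reviewing code"),
]

problem = "We may have a security incident: a customer reported seeing data from another workspace."
chosen = select_skills(problem, registry)
print([skill.name for skill in chosen])
```

The caller never names a skill or lists steps. Under this design, improving the agent's behavior means improving skill descriptions and contents rather than rewriting task scripts.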

This problem-driven approach reflects a fundamental architectural principle: everything humans can do, agents must be able to do. The scope explicitly extended beyond code production to encompass all technical work. This comprehensive mandate drove adoption beyond traditional engineering boundaries, with product managers, designers, and other roles demanding access to Claude Code. Single-person teams began conducting product experiments and shipping code, with agents even assuming product management responsibilities in some contexts.

4. Technical Insights

The technical implementation achieved several notable outcomes that provide actionable insights for other organizations. The 17.6% automatic code approval rate represents a significant milestone, accomplished through sophisticated backtesting, human labeling, and confidence scoring mechanisms. Importantly, these automatic approvals maintained full SOC 2, ISO 27001, and HIPAA compliance without requiring humans in the approval loop, demonstrating that regulatory requirements do not inherently preclude agent autonomy when appropriate controls and auditing mechanisms exist.
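One way the three mechanisms named above could fit together is sketched here: human labels on historical pull requests calibrate a confidence cutoff, and new pull requests are auto-approved only at or above that cutoff, with every decision recorded for audit. This is an assumption-laden illustration rather than Intercom's system. The function names, the zero-tolerance false-approval setting, and all scores are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Decision:
    """Audit record for one pull request."""

    pr_id: str
    confidence: float
    auto_approved: bool


def pick_threshold(
    labeled: Sequence[tuple[float, bool]],
    max_false_approval_rate: float,
) -> Optional[float]:
    """Lowest cutoff whose approvals on labeled history stay within tolerance.

    Each labeled item is (confidence score, whether a human judged it safe to approve).
    Returns None when no cutoff satisfies the tolerance.
    """
    for cutoff in sorted({score for score, _ in labeled}):
        approved = [ok for score, ok in labeled if score >= cutoff]
        false_rate = approved.count(False) / len(approved)
        if false_rate <= max_false_approval_rate:
            return cutoff
    return None


def decide(pr_id: str, confidence: float, threshold: float) -> Decision:
    """Auto-approve only when the score meets the calibrated threshold."""
    return Decision(pr_id, confidence, auto_approved=confidence >= threshold)


# Hypothetical human-labeled history: (confidence score, human said "safe").
labeled_history = [(0.99, True), (0.97, True), (0.92, False), (0.90, True), (0.60, False)]
threshold = pick_threshold(labeled_history, max_false_approval_rate=0.0)
assert threshold is not None

# Hypothetical new pull requests with model confidence scores.
incoming = [("pr-1", 0.98), ("pr-2", 0.96), ("pr-3", 0.80), ("pr-4", 0.99), ("pr-5", 0.50)]
audit_log = [decide(pr_id, score, threshold) for pr_id, score in incoming]
approval_rate = sum(d.auto_approved for d in audit_log) / len(audit_log)
print(f"threshold={threshold}, auto-approval rate={approval_rate:.1%}")
```

Choosing the cutoff from labeled history rather than by intuition is what makes the resulting approval rate defensible to an auditor: the rate is an outcome of the tolerated error level, not a target set in advance.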

Code quality metrics validated the approach's effectiveness. Pull requests from Claude Code consistently reached the 90th percentile range, and validation by a Stanford research group confirmed that code quality metrics increased over time. Defect counts initially increased during the transformation, but defects subsequently began to be closed faster than ever, with some teams pursuing backlog zero, the complete elimination of outstanding defects.

The skill-based architecture scaled to hundreds of contributors producing tens of thousands of lines of code in Claude Code plugins. This distributed contribution model enabled rapid capability expansion while maintaining quality through the backtesting validation framework. The architecture's durability manifested in skills that self-updated and leveraged new capabilities as they became available, avoiding the brittleness characteristic of tightly coupled automation systems.

Implementation trade-offs emerged in several areas. Platform consolidation required abandoning tools some engineers preferred, creating short-term friction. The binary adoption expectation generated pressure that not all team members adapted to equally quickly, necessitating the maturity model framework to provide structured progression paths. The investment in a dedicated 2x team represented significant resource allocation, though leadership deemed this essential for achieving systematic rather than incremental change.

5. Discussion

The findings synthesize into several broader implications for software engineering organizations. First, the research demonstrates that substantial productivity gains require treating AI adoption as a comprehensive organizational transformation rather than incremental tool deployment. The platform consolidation decision, while potentially controversial, enabled compounding optimization benefits impossible with fragmented tooling approaches. This suggests organizations should resist the temptation to support multiple AI coding platforms simultaneously, despite individual preferences.

Second, the skill-based architecture framework offers a generalizable approach to knowledge encapsulation and agent capability development. The emphasis on durable, testable, reusable skills rather than complex multi-agent orchestrators aligns with software engineering principles of modularity and composability. The continuous improvement flywheel—where agent failures drive guidance updates—provides a systematic mechanism for capability enhancement that scales across organizations.

Third, the compliance outcomes challenge assumptions about regulatory requirements and automation. Achieving SOC 2, ISO 27001, and HIPAA compliance without human-in-the-loop approvals suggests that appropriate controls, auditing mechanisms, and confidence scoring can satisfy regulatory standards while enabling agent autonomy. This finding has significant implications for regulated industries considering AI adoption.

Areas for future investigation include longitudinal studies of productivity sustainability, analysis of skill architecture patterns across different technology stacks, and examination of how the agent-first paradigm affects software architecture decisions. The observation that some teams pursued backlog zero suggests potential quality improvements beyond productivity gains that warrant systematic study. Additionally, the viral adoption beyond engineering—with product managers and designers demanding access—indicates broader applicability of agent-first principles to knowledge work generally.

6. Conclusion

This research synthesis documents Intercom's systematic approach to doubling engineering productivity through comprehensive AI agent integration. The key contributions include empirical validation of the agent-first development paradigm, demonstration of platform consolidation benefits, articulation of skill-based architecture principles, and proof that regulatory compliance does not preclude agent autonomy with appropriate controls.

The practical takeaways for technical organizations are clear: achieving substantial productivity gains requires decisive platform consolidation, investment in durable skill-based architectures validated through backtesting, comprehensive organizational change management with binary adoption expectations, and dedicated teams of top talent focused exclusively on the transformation. The approach extends beyond code generation to encompass all technical work, treating AI agents as first-class engineering citizens rather than auxiliary assistance tools.

Organizations seeking to implement similar transformations should prioritize platform selection and consolidation early, establish clear maturity progression frameworks, invest in knowledge encapsulation through testable skills, and maintain unwavering leadership commitment to adoption expectations. The evidence suggests that the question facing software engineering organizations is not whether to pursue agent-first development, but rather how quickly and systematically they can execute this fundamental transformation in their engineering practices.

Sources

How Building with AI Can Double the Throughput of Your Engineering Team — Brian Scanlan, Intercom - Original Creator (YouTube)
Analysis and summary by Sean Weldon using AI-assisted research tools

About the Author

Sean Weldon is an AI engineer and systems architect specializing in autonomous systems, agentic workflows, and applied machine learning. He builds production AI systems that automate complex business operations.
