The Prompt is the Platform - Dominik Tornow, Resonate HQ

Software engineering is moving toward bespoke implementations generated on-demand from abstract specifications using agentic engineering, where agents partic...

By Sean Weldon

From Implementation to Specification: Enabling Agent-Driven System Design Through Deterministic Simulation

Abstract

This paper examines the emerging paradigm of agentic engineering, wherein AI agents generate bespoke software implementations from abstract specifications rather than relying on general-purpose reusable code. The central challenge addressed is bridging the gap between abstract system specifications and production-ready distributed systems that handle concurrency, partial failure, and eventual consistency. Drawing on the development of the Resonate durable execution platform, this work describes a four-stage methodology: abstract specification → simulation implementation → concrete specification → concrete implementation. The key innovation is a deterministic simulation environment that exposes "forbidden fruit" information - internal system state that is invisible to production algorithms but critical for agent learning. This approach enables agents to participate in system design through executable simulation, discovering algorithms that remain correct under partial order and partial failure. The findings suggest a fundamental shift in software value from implementation artifacts to specifications and protocols.

1. Introduction

The software engineering discipline confronts a fundamental transformation in how systems are designed, implemented, and distributed. Traditional models emphasizing reusable, general-purpose implementations face displacement by on-demand generation of bespoke systems tailored to specific infrastructure contexts. This shift is precipitated by advances in AI agent capabilities, yet early deployment attempts revealed critical limitations: agents produced prototypes functioning under ideal conditions but failing catastrophically when confronted with distributed systems challenges including concurrency conflicts, process failures, and network partitions.

Agentic engineering represents a methodology wherein AI agents participate not merely as code generators but as active contributors to system design. The central question addressed herein concerns the architectural and methodological requirements for enabling agents to design and implement production-grade distributed systems from high-level specifications. Initial approaches that tasked agents with jumping directly from abstract specifications to concrete implementations proved inadequate - the semantic gap was too large, and agents lacked the intermediate guidance necessary to reason about partial failure and eventual consistency.

This analysis examines the evolution of the Resonate durable execution platform as a case study in agent-driven development. The platform's three-year development trajectory, characterized by iterative protocol reduction and the introduction of deterministic simulation environments, illuminates how agents can be elevated from implementation tools to design participants. The subsequent sections establish theoretical foundations, analyze the multi-stage development methodology with its critical simulation layer, examine technical implementation details, and synthesize broader implications for software engineering practice and platform architecture.

2. Background and Related Work

2.1 Durable Execution and Distributed Systems Challenges

Durable execution platforms provide infrastructure for long-running, fault-tolerant workflows that survive process failures and maintain execution state across restarts. The Resonate platform exemplifies this category, implementing its protocol using two core primitives: durable promises and durable tasks. These systems must handle fundamental distributed systems challenges including eventual consistency, where replicas may temporarily diverge, and partial failure scenarios where network partitions or process crashes occur independently across system components.

The target infrastructure for Resonate implementations, NATS.io, provides three key primitives: message queues, versioned key-value stores, and delayed message scheduling. Notably, the versioned key-value store operates under a consistency model permitting stale reads - data retrieval operations may return outdated values while more recent versions exist elsewhere in the system. Optimistic concurrency control mechanisms address this challenge by versioning data and rejecting write operations when the version read by a transaction no longer represents current state. This approach requires algorithms that remain correct despite operating on potentially stale views of system state, a design constraint that proved particularly challenging for agents to navigate without additional scaffolding.
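The optimistic concurrency control described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the NATS.io API: the names VersionedKV, VersionConflict, and update are hypothetical and exist only for this example.

```python
class VersionConflict(Exception):
    """Raised when a write is based on a version that is no longer current."""


class VersionedKV:
    """Hypothetical in-memory store where every key carries a version counter."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def get(self, key):
        # Missing keys read as (None, 0), so the first write expects version 0.
        return self._data.get(key, (None, 0))

    def put(self, key, value, expected_version):
        _, current = self._data.get(key, (None, 0))
        if expected_version != current:
            raise VersionConflict(
                f"{key}: expected v{expected_version}, store is at v{current}"
            )
        self._data[key] = (value, current + 1)
        return current + 1


def update(store, key, fn, max_attempts=5):
    """Read-modify-write that re-reads and retries after a version conflict."""
    for _ in range(max_attempts):
        value, version = store.get(key)
        try:
            return store.put(key, fn(value), version)
        except VersionConflict:
            continue
    raise VersionConflict(f"{key}: gave up after {max_attempts} attempts")


store = VersionedKV()
store.put("counter", 0, expected_version=0)   # creates the key at v1
value, seen = store.get("counter")            # reader A observes (0, v1)
update(store, "counter", lambda v: v + 1)     # writer B moves the key to v2
try:
    store.put("counter", value + 10, seen)    # A's write is still based on v1
    rejected = False
except VersionConflict:
    rejected = True
```

The rejected write is the point of the mechanism: reader A acted on a view that was no longer current, and the store refused the write rather than silently overwriting writer B's update.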

2.2 Deterministic Simulation Testing

Deterministic simulation testing provides a methodology for validating distributed systems under controlled conditions that remain repeatable and inspectable. Unlike testing against real distributed infrastructure, where timing variations and network conditions introduce non-determinism, simulation environments can reproduce identical execution traces across runs. This repeatability proves essential for debugging complex failure scenarios and, as this analysis demonstrates, for providing agents with consistent feedback about why their designs fail under specific conditions.

3. Core Analysis

3.1 The Inadequacy of Direct Specification-to-Implementation Translation

Initial attempts to leverage agents for system implementation followed a two-stage process: abstract specification directly to concrete implementation. The abstract specification described system behavior at a high level without committing to specific infrastructure choices or data representations. Agents were tasked with generating production implementations directly from these specifications.

This approach failed systematically. As observed in the development process, "the implementation was closer to a prototype, but not a production system. It broke on the concurrency. It broke on the process failure. It broke on the network failure." The agents successfully generated code that functioned along the happy path - when requests arrived sequentially, processes remained stable, and network connections persisted. However, the implementations exhibited fundamental correctness violations under realistic distributed systems conditions.

The root cause was identified as an excessive semantic gap. Abstract specifications intentionally omit implementation details to remain infrastructure-agnostic and reusable across contexts. However, this abstraction meant agents received insufficient guidance about critical design decisions: how to structure data schemas, where to place transaction boundaries, which operations require atomic execution, and how to handle stale reads from eventually consistent storage. Without this intermediate layer of concrete decision-making, agents lacked the scaffolding necessary to reason about failure modes.

3.2 The Three-Stage Process: Concrete Specification as Bridge

Recognition of this gap led to the introduction of a concrete specification as an intermediary artifact positioned between abstract specification and implementation. The concrete specification makes target-specific decisions explicit: it defines precise data schemas, specifies which indices must exist, articulates exact SQL queries or key-value operations, and establishes transaction boundaries. Critically, the concrete specification commits to specific infrastructure primitives while remaining at a higher abstraction level than actual code.

With this intermediate artifact in place, agents successfully generated production-quality implementations. The concrete specification provided sufficient detail that the translation to code became a more mechanical process, while the gap between abstract and concrete specifications could be managed through structured decision-making about infrastructure mapping. This success revealed an important limitation: agents were now effective at building systems from detailed specifications, but they were not yet participating in system design. The concrete specification itself was still human-authored, representing the actual design work that determined system correctness under failure conditions.

3.3 Enabling Design Participation Through Deterministic Simulation

To elevate agents from implementation tools to design participants, the methodology required a fourth stage: simulation implementation. This created a four-stage pipeline: abstract specification → simulation implementation → concrete specification → concrete implementation. The simulation implementation serves as executable design - not the product itself, but a vehicle for exploring and validating design decisions before committing to concrete specifications.

The deterministic simulation environment replicates target platform behavior, specifically the versioned key-value store semantics of NATS.io, under controlled conditions. The simulation incorporates a deterministic random generator that decides whether each read operation returns the latest version or an older version, mimicking the stale reads possible under eventual consistency. Critically, this simulation is deterministic, repeatable, and inspectable - identical initial conditions produce identical execution traces, enabling systematic exploration of failure scenarios.
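A rough sketch of how a seeded generator might drive stale reads follows. The class StaleReadStore, the 0.5 stale probability, and the choice of which older version to return are assumptions made for illustration; the source does not describe the actual simulation code.

```python
import random


class StaleReadStore:
    """Hypothetical store that keeps every version and may serve an old one."""

    def __init__(self, seed, stale_probability=0.5):
        self._rng = random.Random(seed)   # the only source of randomness
        self._stale_probability = stale_probability
        self._history = {}                # key -> values, index = version - 1

    def write(self, key, value):
        self._history.setdefault(key, []).append(value)

    def read(self, key):
        """Return (value, version); the version may be older than the latest."""
        versions = self._history[key]
        latest = len(versions)
        if latest > 1 and self._rng.random() < self._stale_probability:
            version = self._rng.randint(1, latest - 1)  # a strictly older version
        else:
            version = latest
        return versions[version - 1], version


def observed_versions(seed):
    """Write five values and record which version each read returned."""
    store = StaleReadStore(seed)
    seen = []
    for value in range(5):
        store.write("k", value)
        seen.append(store.read("k")[1])
    return seen


# Same seed, same sequence of fresh and stale reads.
assert observed_versions(42) == observed_versions(42)
```

Because all randomness flows through one seeded generator, a run is fully determined by its seed and its inputs, which is what makes a failing scenario repeatable.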

The simulation's power derives from its exposure of "forbidden fruit" information - data invisible to production algorithms but invaluable for learning. Trace events record not only the values returned by read operations but also whether each read was fresh or stale, what the latest value was if a stale read occurred, which logic was triggered based on the (possibly stale) read, which write operations failed due to version conflicts, and which system invariants were violated as a consequence. This rich feedback enables agents to establish cause-and-effect relationships: they observe not merely that invariants failed, but precisely why they failed - because the algorithm made decisions based on a stale view of system state.
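To make the idea of trace events concrete, the sketch below records what the algorithm saw alongside what it could not see. The event shapes (ReadEvent, WriteEvent) and the TracedStore class are hypothetical, and the sketch covers only reads and writes, not invariant checks.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadEvent:
    key: str
    returned: object   # what the algorithm saw
    latest: object     # forbidden fruit: what a fresh read would have returned
    stale: bool


@dataclass(frozen=True)
class WriteEvent:
    key: str
    based_on_version: int
    current_version: int
    accepted: bool


class TracedStore:
    """Hypothetical simulated store that logs every read and write."""

    def __init__(self, seed):
        self._rng = random.Random(seed)
        self._versions = {}   # key -> list of values, index = version - 1
        self.trace = []

    def read(self, key):
        versions = self._versions.setdefault(key, [0])
        latest = len(versions)
        stale = latest > 1 and self._rng.random() < 0.5
        version = latest - 1 if stale else latest
        self.trace.append(ReadEvent(key, versions[version - 1], versions[-1], stale))
        return versions[version - 1], version

    def write(self, key, value, based_on_version):
        versions = self._versions.setdefault(key, [0])
        accepted = based_on_version == len(versions)
        self.trace.append(WriteEvent(key, based_on_version, len(versions), accepted))
        if accepted:
            versions.append(value)
        return accepted


store = TracedStore(seed=7)
for _ in range(6):
    value, version = store.read("counter")
    store.write("counter", value + 1, version)

for event in store.trace:
    print(event)
```

A production algorithm only ever receives the returned value and version. The stale flag and the latest value exist solely in the trace, where an agent can line up a rejected write with the stale read that preceded it.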

3.4 Protocol Minimalism as Design Foundation

The effectiveness of agent-driven design through simulation depends critically on protocol simplicity. The Resonate protocol, after three years of iterative reduction, centers on just two primitives: durable promises and durable tasks. This minimalism was not a starting point but a finish line - achieved through systematic elimination of abstractions, erasure of unnecessary properties, and breaking of extraneous relationships.

Simplicity proves essential because even minimal concurrent distributed protocols exhibit complex state and behavior spaces. The design question becomes: how can the protocol be expressed using only target platform primitives - queues, key-value stores, and delayed messages - while maintaining correctness under partial failure? The constrained primitive set forces design discipline while the simulation environment enables validation that correctness properties hold even when reads are stale and writes conflict. Agents can explore this design space systematically, proposing algorithms and receiving immediate, causally-informative feedback about failures.
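As an illustration of how small the constrained primitive set is, the sketch below models the three primitives in memory on a logical clock. Everything here is an assumption for the sake of the example: the InMemoryTarget class, its method names, the key layout, and the "pending" and "timedout" state labels are hypothetical and are not taken from the Resonate protocol or the NATS.io API.

```python
import heapq
from collections import deque


class InMemoryTarget:
    """Hypothetical queue, versioned key-value store, and delayed messages."""

    def __init__(self):
        self.now = 0           # logical clock, advanced explicitly
        self.queue = deque()
        self._kv = {}          # key -> (value, version)
        self._delayed = []     # heap of (due_time, sequence, message)
        self._sequence = 0     # tie-breaker keeps delivery order deterministic

    # Message queue
    def publish(self, message):
        self.queue.append(message)

    def consume(self):
        return self.queue.popleft() if self.queue else None

    # Versioned key-value store
    def get(self, key):
        return self._kv.get(key, (None, 0))

    def put(self, key, value, expected_version):
        _, current = self._kv.get(key, (None, 0))
        if expected_version != current:
            return False
        self._kv[key] = (value, current + 1)
        return True

    # Delayed message scheduling
    def publish_after(self, delay, message):
        heapq.heappush(self._delayed, (self.now + delay, self._sequence, message))
        self._sequence += 1

    def advance(self, ticks):
        """Move logical time forward and release every message that is now due."""
        self.now += ticks
        while self._delayed and self._delayed[0][0] <= self.now:
            _, _, message = heapq.heappop(self._delayed)
            self.publish(message)


# Illustrative use: a record with a timeout, built only from the three primitives.
target = InMemoryTarget()
target.put("promise/p1", "pending", expected_version=0)
target.publish_after(10, ("timeout", "promise/p1"))

target.advance(5)
early = target.consume()        # nothing is due yet
target.advance(5)
kind, key = target.consume()    # the timeout message has been released
state, version = target.get(key)
if kind == "timeout" and state == "pending":
    target.put(key, "timedout", expected_version=version)
```

Driving time through an explicit advance call, rather than the wall clock, keeps the sketch deterministic in the same spirit as the simulation environment described in this paper.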

4. Technical Insights

4.1 Simulation Architecture and Implementation

The deterministic simulation environment implements a versioned key-value store abstraction that mirrors NATS.io behavior. Each key maintains a version counter that increments with successful writes. Read operations invoke the deterministic random generator to decide whether to return the current version or an earlier one. Write operations specify both the key and the version observed during the preceding read; writes succeed only if that version remains current, otherwise raising an exception to signal a concurrent modification.

This architecture enables systematic exploration of concurrency scenarios. By controlling the random seed, developers and agents can reproduce specific interleavings of operations that expose race conditions or incorrect handling of stale data. The trace events generated during simulation make the cause of each failure visible: for each operation, the trace records the operation type, the value returned, whether that value was fresh or stale, the actual current value if different, and any invariant violations that resulted from subsequent actions taken based on the (possibly stale) data.
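The seed-controlled exploration described above might look like the following sketch. The scenario (two workers incrementing one counter), the invariant (the final value equals the number of accepted writes), and the function names are assumptions chosen for illustration.

```python
import random


def run(seed, check_versions, steps=6):
    """Two workers alternate incrementing a counter; returns (final value, trace)."""
    rng = random.Random(seed)
    history = [0]                  # history[i] is the value at version i + 1
    trace = []
    for step in range(steps):
        worker = "AB"[step % 2]
        latest = len(history)
        stale = latest > 1 and rng.random() < 0.5
        version = latest - 1 if stale else latest
        value = history[version - 1]
        # A version-checked write is accepted only if it was based on the latest
        # version; the unchecked variant accepts every write.
        accepted = (version == latest) or not check_versions
        if accepted:
            history.append(value + 1)
        trace.append((worker, version, latest, stale, accepted))
    return history[-1], trace


def first_violation(check_versions, seeds=range(200)):
    """Return the first seed whose run breaks the invariant, or None."""
    for seed in seeds:
        final, trace = run(seed, check_versions)
        accepted = sum(1 for event in trace if event[4])
        if final != accepted:      # an accepted increment was lost
            return seed
    return None


bad_seed = first_violation(check_versions=False)
safe_seed = first_violation(check_versions=True)

# Replaying the failing seed reproduces the exact same interleaving.
if bad_seed is not None:
    assert run(bad_seed, check_versions=False) == run(bad_seed, check_versions=False)
```

Sweeping seeds turns "it sometimes loses an update" into a single seed that can be replayed on demand. In this sketch the version-checked variant never breaks the invariant, because a write based on a stale read is always rejected.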

4.2 Implementation Trade-offs and Limitations

The four-stage methodology introduces additional artifacts and process steps compared to direct implementation. Each stage requires validation: abstract specifications must be verified for completeness, simulation implementations must be tested for determinism and platform fidelity, concrete specifications must be checked for correct mapping to target primitives, and final implementations must be validated against both specifications and real-world conditions.

Furthermore, the simulation environment's value depends on its fidelity to target platform behavior. If the simulation's consistency model diverges from actual platform semantics - for example, if it permits stale reads that the real platform guarantees are fresh, or vice versa - agents may learn incorrect lessons about which algorithms are safe. Maintaining simulation fidelity as target platforms evolve requires ongoing investment in the simulation infrastructure itself.

4.3 Reusability and Value Migration

The methodology fundamentally alters where value resides in software products. As implementations become generatable from specifications, the specification - not the implementation - becomes the reusable artifact. A single abstract specification can yield multiple bespoke implementations for different infrastructure contexts: one targeting NATS.io, another for AWS services, a third for Google Cloud Platform primitives. Each implementation is minimal, containing only the extensions necessary for that specific infrastructure rather than general-purpose abstraction layers attempting to accommodate all possible targets.

This shift has profound implications for platform strategy. The product is no longer the implementation but rather the specification and protocol. Reuse moves upstream: rather than distributing general-purpose implementation libraries that users must adapt to their contexts, vendors distribute specifications from which users generate implementations perfectly fitted to their specific infrastructure. This approach eliminates the impedance mismatch between general-purpose libraries and specific deployment contexts, while enabling continuous optimization as agents improve their generation capabilities.

5. Discussion

The findings presented herein suggest several broader implications for software engineering practice and research. First, the inadequacy of direct specification-to-implementation translation highlights fundamental limitations in current agent capabilities around reasoning about distributed systems failure modes. Agents demonstrate proficiency at code generation when provided sufficient scaffolding but struggle to independently derive correct concurrency control and failure handling strategies from high-level requirements alone. The introduction of intermediate artifacts - concrete specifications and simulation implementations - effectively decomposes the problem into manageable steps that align with current agent capabilities.

Second, the "forbidden fruit" approach to agent training through simulation represents a general methodology applicable beyond the specific case of durable execution platforms. Any domain where correctness depends on handling partial information, stale data, or concurrent modifications could benefit from simulation environments that expose causally relevant hidden information. This approach transforms opaque failures - where agents observe only that something went wrong - into transparent learning opportunities where agents can trace failures to specific design decisions. Future research might explore applying this methodology to other distributed systems domains including consensus protocols, eventual consistency resolution, or distributed transaction coordination.

Third, the three-year iterative process of protocol reduction underscores that simplicity itself is a design achievement requiring sustained effort. The final two-primitive protocol emerged not from initial minimalism but from systematic elimination of complexity. This finding suggests that agent-driven design may be most effective when applied to already-simplified protocols rather than to complex systems where the design space remains poorly understood. The relationship between protocol complexity and agent design effectiveness warrants further investigation.

Several questions remain open for future work. The current methodology relies on human-authored abstract specifications; whether agents can participate in specification development itself remains unexplored. Additionally, the simulation approach requires that target platform behavior can be modeled deterministically, which may not hold for all infrastructure primitives or consistency models. Finally, the methodology's applicability to domains beyond distributed systems - such as user interface design, security protocol development, or performance optimization - has not been established.

6. Conclusion

This analysis demonstrates that effective agent participation in system design requires carefully structured intermediate artifacts and feedback mechanisms. The four-stage methodology - abstract specification, simulation implementation, concrete specification, concrete implementation - enables agents to discover correct distributed systems algorithms through deterministic simulation that exposes causally relevant hidden information. This approach bridges the gap between abstract requirements and production-ready implementations that handle concurrency, partial failure, and eventual consistency.

The practical implications are significant. Software value migrates from implementations to specifications, enabling generation of bespoke systems tailored to specific infrastructure contexts rather than distribution of general-purpose libraries. Platform vendors can provide specifications from which users generate implementations fitted to their own infrastructure, reducing impedance mismatches while maintaining protocol correctness. The methodology's success depends critically on protocol simplicity - complex protocols with large state spaces remain challenging even with simulation support.

For practitioners, the key takeaway is that agent-driven development requires investment in intermediate artifacts that provide agents with appropriate scaffolding and feedback. For researchers, the findings suggest productive directions including extending the forbidden fruit approach to other distributed systems domains, investigating the relationship between protocol complexity and agent design effectiveness, and exploring whether agents can participate in specification development itself. As agent capabilities continue advancing, the methodology presented herein offers a pathway, demonstrated on one platform, for elevating agents from code generators to genuine design participants.


About the Author

Sean Weldon is an AI engineer and systems architect specializing in autonomous systems, agentic workflows, and applied machine learning. He builds production AI systems that automate complex business operations.
