What if the network was the sandbox? — Remy Guercio, Tailscale

Network-layer identity and permissions can be leveraged to build secure AI agent sandboxes that provide complete visibility into LLM usage, tool calls, and r...

By Sean Weldon

Network-Layer Identity as a Foundation for Secure AI Agent Sandboxing

Abstract

This paper examines a novel architectural approach to securing AI agent sandboxes by relocating authentication and authorization mechanisms from the application layer to the network layer. Traditional sandbox implementations embed credentials within the sandbox environment, creating inherent exfiltration vulnerabilities. The proposed architecture leverages WireGuard protocol and Tailscale identity primitives combined with a centralized AI gateway (Aperture) to eliminate credential exposure while providing comprehensive observability. Implementation across multiple LLM providers (Anthropic, OpenAI, Gemini, Vertex, Bedrock) demonstrates that network-layer identity enables granular access control, complete tool call visibility, and per-identity resource tracking without requiring agents to possess credentials. Empirical deployment reveals that bash command execution constitutes the dominant execution pattern in production environments, challenging assumptions about structured tool calling prevalence and informing future guardrail development priorities.

1. Introduction

The deployment of autonomous AI agents within sandboxed environments represents a critical challenge at the intersection of artificial intelligence and systems security. Sandboxes—isolated execution environments designed to contain potentially untrusted code—traditionally comprise two fundamental components: boundaries that enforce separation between internal and external environments, and permission systems that govern resource access through identity verification and authorization policies. Contemporary implementations typically position authentication mechanisms within the sandbox itself, granting agents direct access to API keys or OAuth tokens necessary for external service interaction.

This architectural pattern, while functionally sufficient for many use cases, introduces fundamental security vulnerabilities. When authentication credentials reside within the sandbox boundary, compromised or malicious agents can exfiltrate these credentials for unauthorized use, exceed intended authorization scopes, or obscure their activities from monitoring systems. As AI agents become increasingly autonomous and are granted access to production systems, code repositories, and sensitive data, the security implications of credential exposure intensify.

This analysis proposes and examines an alternative architectural paradigm: relocating authentication and authorization mechanisms entirely to the network layer, external to the sandbox environment. By treating the network itself as the primary security boundary and leveraging network-layer identity primitives, this approach addresses credential exposure risks while simultaneously enabling unprecedented visibility into agent behavior, tool invocations, and resource consumption. The following sections establish the theoretical foundation for network-layer identity systems, describe the technical implementation of a centralized AI gateway architecture, analyze observability capabilities, and discuss practical implications for secure agent deployment in production environments.

2. Background and Related Work

2.1 Limitations of Application-Layer Permission Models

Contemporary sandbox architectures implement permissions through two primary mechanisms: API key authentication and OAuth/OIDC (OpenID Connect) protocols. API keys provide simple bearer token authentication, while OAuth/OIDC offers more sophisticated delegated authorization with token refresh capabilities. Both approaches, however, execute authentication logic within the sandbox boundary, requiring agents to possess and manage credentials directly. This design pattern creates attack surfaces where malicious or compromised agents can extract credentials for unauthorized use beyond intended scopes, share credentials across contexts, or bypass usage monitoring through direct API access.

2.2 Network Identity and Zero Trust Architecture

WireGuard protocol establishes encrypted peer-to-peer connections between arbitrary network nodes through a lightweight, modern VPN implementation. Unlike traditional VPN architectures that route traffic through centralized servers, WireGuard enables direct encrypted tunnels with minimal protocol overhead. Tailscale extends WireGuard's capabilities by adding a comprehensive identity layer that associates each network connection with user identity, group membership, and arbitrary tags. This identity metadata accompanies every network request, transforming network connections from anonymous IP-based communications into authenticated, attributed interactions that carry rich contextual information.

The LLM gateway pattern introduces an intermediary service that mediates all interactions between agents and LLM providers, enabling centralized policy enforcement, request logging, and cost management. When combined with network-layer identity primitives, gateways can make authorization decisions based on authenticated identity rather than embedded credentials, fundamentally altering the security model.

3. Core Analysis

3.1 Architecture Design Principles

The proposed architecture relocates authentication and authorization from the application layer to the network layer through three core mechanisms. First, sandbox environments receive network identities expressed as tags (e.g., tag:pr-review-bot-project-x) rather than API credentials. These tags encode both the agent's functional role and its authorization scope. Second, all LLM interactions traverse a centralized gateway node (Aperture) that maintains provider API keys external to any sandbox environment. Third, network-layer access control lists determine which tagged identities can reach which resources, enforcing permissions before requests reach application endpoints.

This design eliminates credential exfiltration vectors entirely: agents possess no API keys, OAuth tokens, or other bearer credentials that could be extracted or misused. The architectural principle can be summarized as: "What if we took the components of authN and authZ and we just stuck them at the network level?" This relocation transforms the network itself into the primary security boundary, with sandboxes becoming truly untrusted execution environments that cannot authenticate independently.

3.2 Implementation Through Tailscale and Aperture

The technical implementation leverages Tailscale's identity primitives to attach metadata to network connections. When a sandbox environment initializes—such as a GitHub Actions runner spinning up—it uses federated OIDC to automatically join the Tailscale network (tailnet) with appropriate tags assigned based on repository, workflow, or other contextual factors. These tags persist throughout the connection lifecycle and accompany every network request originating from that sandbox.

Aperture functions as a specialized node within the tailnet that implements the AI gateway pattern. It maintains single provider API keys for multiple LLM services (Anthropic, OpenAI, Gemini, Vertex, Bedrock) and accepts requests from tagged sandbox identities. When a request arrives, Aperture extracts the network identity, evaluates access policies defined in Tailscale ACL files or its own configuration, and either forwards the request with the provider API key or denies access. The critical security property is that "the moment you say no, it's not like it has a key and can try another endpoint—it's just a dash." Denied agents have no alternative authentication path.

3.3 Comprehensive Observability and Tool Call Extraction

Network-layer mediation provides guaranteed visibility into agent behavior. All LLM requests flow through Aperture, enabling complete logging of headers, request bodies, response bodies, token consumption, cost per request, and model selection. This observability extends beyond simple request logging to include tool call extraction—the identification and recording of every bash command, Model Context Protocol (MCP) tool invocation, and function call executed by the agent.

The architectural guarantee of tool call visibility stems from network topology: "There's no hiding it from you because it's not like, 'I'm going to be super helpful and go do this thing'—it just has to be here." Any tool execution requiring external resources (API calls, database queries, file system access beyond the sandbox) must traverse the network layer, making it observable at Aperture. This property holds regardless of whether agents use structured tool calling protocols or execute arbitrary bash commands, addressing a critical gap in application-layer monitoring approaches that can be bypassed through obfuscation.

Empirical deployment reveals that bash command execution dominates actual usage patterns in production environments, contrary to assumptions that structured MCP tool calls would predominate. This finding has significant implications for guardrail development, suggesting that command-level analysis (detecting patterns like rm -rf /) should receive priority over structured tool call validation.

3.4 Granular Access Control and Resource Management

Permission configuration operates through two complementary mechanisms: visual policy editors in the Aperture UI and JSON-based policy files compatible with infrastructure-as-code workflows. Tailscale ACL files support application capabilities—arbitrary metadata sent with identity information to control fine-grained access. This enables per-user budgets, per-group quotas, per-model restrictions, and per-provider limits, all enforced at the network layer before requests reach LLM endpoints.

The quota system supports union logic, allowing flexible allocation strategies where team budgets and individual budgets can be combined or kept separate. Request logging captures cost metrics with high precision; for example, a simple "hello world" request to Claude Code consumes $0.20 due to context window overhead, demonstrating the importance of per-request cost tracking for budget management. Usage metrics aggregate by user identity, agent tag, or group membership, providing organizational visibility into AI resource consumption patterns.

4. Technical Insights

4.1 Implementation Considerations

The architecture's extensibility derives from TSnet, an open-source Go library that enables custom programs to join tailnets and read identity information from network connections. This primitive allows organizations to replicate Aperture's functionality or build custom MCP servers and API endpoints using the same identity-based access control without implementing OAuth infrastructure. Webhook integrations enable forwarding of tool call data and request logs to third-party systems for additional processing, compliance logging, or security analysis.

A significant design decision involves explicit base URL configuration rather than transparent network interception. Agents must explicitly direct requests to Aperture's endpoint rather than having the network layer transparently redirect provider API calls. This choice prioritizes clarity and debuggability over transparency: developers can inspect network traffic and understand the gateway's role in the request path, facilitating troubleshooting and reducing cognitive overhead.

4.2 Trade-offs and Limitations

The network-layer approach was selected over MCP-layer implementation specifically because it captures all execution patterns, including bash commands, code execution, and tool calls, regardless of protocol. MCP-layer monitoring would only observe structured tool calls, missing the dominant execution pattern observed in production. However, this comprehensiveness requires that all external interactions traverse the network layer; purely local computations within the sandbox remain invisible unless they generate network traffic.

The single API key model—one provider credential stored on the gateway, zero credentials in sandboxes—eliminates credential exposure vectors but introduces the gateway as a critical dependency and potential single point of failure. High availability and disaster recovery considerations become paramount in production deployments. Additionally, the architecture assumes network connectivity between sandboxes and the gateway; air-gapped or highly restricted network environments may require alternative approaches.

Future development priorities include guardrails for dangerous command detection (e.g., identifying and blocking destructive filesystem operations) and more sophisticated policy languages for expressing complex authorization rules. The current implementation provides foundational visibility and access control, but advanced use cases may require integration with external policy engines or runtime analysis systems.

5. Discussion

The network-layer identity approach represents a fundamental shift in how security boundaries are conceptualized for AI agent systems. Traditional application-layer security treats the network as a transport mechanism and implements trust boundaries within applications. This architecture inverts that model, treating the network itself as the primary trust boundary and reducing applications to untrusted execution environments. This paradigm aligns with zero trust security principles, where identity verification occurs at every access point rather than assuming trust within network perimeters.

The empirical finding that bash commands dominate production usage patterns challenges prevailing assumptions about AI agent behavior. Much contemporary research focuses on structured tool calling protocols and formal verification of tool invocations, yet actual deployment reveals agents frequently resort to shell commands for flexibility and capability. This observation suggests that security research should prioritize command-level analysis, process isolation, and filesystem access controls alongside protocol-level tool call validation.

The architecture's observability guarantees—that tool calls cannot be hidden because they must traverse observable network boundaries—provide a foundation for compliance and audit requirements in regulated industries. Complete request and response logging, combined with per-identity cost tracking, enables organizations to demonstrate control over AI system behavior and resource consumption. However, the volume of data generated by comprehensive logging presents challenges for storage, analysis, and retention policy implementation.

6. Conclusion

This analysis demonstrates that network-layer identity and permissions provide a viable foundation for secure AI agent sandboxing, addressing credential exfiltration vulnerabilities while enabling comprehensive observability. By relocating authentication and authorization to the network layer through WireGuard protocol, Tailscale identity primitives, and centralized AI gateway patterns, the architecture eliminates the need for agents to possess credentials while maintaining granular access control and complete visibility into tool calls and resource consumption.

The practical implications extend beyond security to operational concerns: per-identity cost tracking, quota management, and audit logging become inherent properties of the architecture rather than additional features requiring separate implementation. The finding that bash commands dominate actual usage patterns informs future research priorities, suggesting that command-level guardrails and process isolation deserve greater attention than structured tool calling protocols alone.

Organizations deploying autonomous AI agents in production environments should consider network-layer identity as a foundational security primitive, particularly when agents require access to sensitive resources, production systems, or external APIs. The open-source availability of TSnet and the extensibility of the Aperture model enable custom implementations tailored to specific organizational requirements, making this approach accessible beyond the specific Tailscale ecosystem. Future work should explore integration with runtime analysis systems, development of sophisticated policy languages for complex authorization rules, and evaluation of performance characteristics under high-throughput production workloads.


Sources


About the Author

Sean Weldon is an AI engineer and systems architect specializing in autonomous systems, agentic workflows, and applied machine learning. He builds production AI systems that automate complex business operations.

LinkedIn | Website | GitHub