What You’ll Learn

  • Uploading a dataset once as a Declaw volume
  • Fanning out N agents, each in an isolated sandbox
  • Attaching the same volume to every sandbox at create time (no per-sandbox re-upload)
  • Wiring a single run_shell tool that dispatches into the sandbox the agent is attached to

Prerequisites

export DECLAW_API_KEY="your-api-key"
export DECLAW_DOMAIN="your-declaw-instance.example.com:8080"
pip install "declaw[openai-agents]"
Env:
  • DECLAW_API_KEY, DECLAW_DOMAIN — your Declaw API key and instance address
  • OPENAI_API_KEY — OpenAI key for GPT-4.1
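As a quick pre-flight check, you can verify all three variables are set before running anything. This is a minimal sketch; the `missing_env` helper is hypothetical (the variable names come from the Env list above, not from the declaw package):

```python
import os

# Hypothetical pre-flight helper; the names come from this guide's
# Env list, not from the declaw package itself.
def missing_env(required=("DECLAW_API_KEY", "DECLAW_DOMAIN", "OPENAI_API_KEY")):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not os.environ.get(name)]

# e.g. at the top of your script:
#   if missing_env():
#       raise SystemExit("set DECLAW_API_KEY / DECLAW_DOMAIN / OPENAI_API_KEY first")
```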

The Pattern

One upload, N sandboxes, N agents asking different questions of the same data.
             ┌────────────────────────────┐
             │   Volumes.create("sales")  │   one tar.gz upload
             └─────────────┬──────────────┘

                           ▼  (blob in GCS)
         ┌─────────────────┴──────────────────┐
         │                 │                  │
   Sandbox A          Sandbox B         Sandbox C
   /data/sales.csv    /data/sales.csv   /data/sales.csv
   Agent: revenue     Agent: avg/region Agent: outliers

Code Walkthrough

Upload the dataset once:
from declaw.openai import AsyncVolumes

with open("sales.tar.gz", "rb") as f:
    vol = await AsyncVolumes.create(name="sales-dataset", data=f)
Spin up N sandboxes in parallel, each with the volume attached and its own GPT-4.1 agent:
import asyncio
from agents import Agent, Runner, function_tool
from declaw.openai import SecurityPolicy, PIIConfig, InjectionDefenseConfig, SandboxNetworkOpts, VolumeAttachment
from declaw.sandbox_async.main import AsyncSandbox


async def analyze(question: str, volume_id: str) -> str:
    sbx = await AsyncSandbox.create(
        template="python",
        network=SandboxNetworkOpts(allow_out=["api.openai.com"]),
        security=SecurityPolicy(
            pii=PIIConfig(enabled=True, action="redact"),
            injection_defense=InjectionDefenseConfig(enabled=True, sensitivity="medium"),
        ),
        volumes=[VolumeAttachment(volume_id=volume_id, mount_path="/data")],
    )

    @function_tool
    async def run_shell(command: str) -> str:
        r = await sbx.run_command(command, timeout=60)
        return (r.stdout or "") + (r.stderr or "")

    agent = Agent(
        name="analyst",
        model="gpt-4.1",
        instructions="You are a data analyst. The dataset is at /data/sales.csv. "
                     "Use run_shell to inspect with awk/python. Answer with numbers.",
        tools=[run_shell],
    )
    try:
        result = await Runner.run(agent, question, max_turns=6)
        return result.final_output or ""
    finally:
        await sbx.kill()


async def main():
    with open("sales.tar.gz", "rb") as f:
        vol = await AsyncVolumes.create(name="sales", data=f)
    try:
        answers = await asyncio.gather(
            analyze("Which product has the highest total revenue?", vol.volume_id),
            analyze("Average total_usd per order for each region?", vol.volume_id),
            analyze("Any orders with total_usd > 50× the median?", vol.volume_id),
        )
        for a in answers:
            print(a, "\n---")
    finally:
        await AsyncVolumes.delete(vol.volume_id)

asyncio.run(main())

Why a Volume Instead of sbx.files.write() per Sandbox?

The naive way would be to upload the CSV separately into each sandbox with sbx.files.write("/data/sales.csv", csv_bytes). That re-sends the bytes for every fan-out branch: a 500 MiB dataset across 10 agents is roughly 5 GiB of egress from your process. With a volume:
  1. The bytes cross the network once (the AsyncVolumes.create upload to Declaw’s object store).
  2. Every Sandbox.create(volumes=[...]) streams the same blob straight from object storage into its own overlay — in parallel, with no back-pressure between the sandboxes.
  3. Your script doesn’t re-read or re-send the dataset past step 1.
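The bandwidth math behind point 1 is easy to sanity-check. A back-of-the-envelope sketch with the illustrative numbers from the paragraph above:

```python
# Egress from your process: per-sandbox upload vs. a shared volume.
DATASET_MIB = 500
AGENTS = 10

per_sandbox_mib = DATASET_MIB * AGENTS  # sbx.files.write() into each sandbox
volume_mib = DATASET_MIB                # one Volumes.create upload, reused N times

print(per_sandbox_mib)  # 5000 MiB (~4.9 GiB)
print(volume_mib)       # 500 MiB
```

The per-sandbox cost scales linearly with fan-out width; the volume cost stays constant because the fan-out happens inside object storage, not from your machine.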

Security Surface Still Applies

Attaching a volume does not bypass any of Declaw’s guardrails. Each sandbox still runs behind its own network proxy, with PII redaction, prompt-injection detection, and the other SecurityPolicy scanners scoped to that sandbox. The volume’s bytes are never exposed to the network proxy (they arrive via a direct orchestrator→envd hop, not through the user-facing traffic path).

Full Example

The runnable version is at cookbook/openai_agents_volumes.py:
python cookbook/openai_agents_volumes.py
It synthesizes a 1000-row sales CSV, plants two outlier rows, and fans out three agents (revenue per product, average-per-region, outlier detector). Each agent returns a concrete numerical answer — the outlier agent correctly flags the planted rows.
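If you want to reproduce the setup without the cookbook file, a rough sketch of such a generator follows. The column names and value ranges here are assumptions for illustration, not necessarily the cookbook's exact schema:

```python
import csv
import random

random.seed(0)
regions = ["NA", "EU", "APAC"]
products = ["widget", "gadget", "gizmo"]

with open("sales.csv", "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["order_id", "region", "product", "total_usd"])
    for i in range(1000):
        w.writerow([i, random.choice(regions), random.choice(products),
                    round(random.uniform(5, 200), 2)])
    # Two planted outliers, far past 50x any plausible median of the range above.
    w.writerow([1000, "NA", "widget", 99999.0])
    w.writerow([1001, "EU", "gizmo", 88888.0])
```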

Phase 1 Limits Recap

  • Volume body must be application/gzip (a gzipped tar archive).
  • 4 GiB upload cap.
  • Volumes are read-at-boot. Edits a sandbox makes to files under mount_path stay private to that sandbox and do not flow back to the volume.
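Given the application/gzip requirement above, packaging a dataset file is a one-liner with the standard library. A minimal sketch; `pack_dataset` is a hypothetical helper, not part of declaw:

```python
import tarfile
from pathlib import Path

def pack_dataset(src: str, out: str = "sales.tar.gz") -> str:
    """Gzip-tar a single file; after attach it appears at <mount_path>/<filename>."""
    with tarfile.open(out, "w:gz") as tar:
        tar.add(src, arcname=Path(src).name)
    return out
```

With mount_path="/data", an archive member named sales.csv surfaces as /data/sales.csv inside every sandbox it is attached to.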