
What You’ll Learn

  • Uploading a volume once and reusing it across many sandbox creates
  • Avoiding per-sandbox upload bandwidth and time cost
  • Verifying each sandbox sees a private-but-identical copy under the mount path

Prerequisites

export DECLAW_API_KEY="your-api-key"
export DECLAW_DOMAIN="your-declaw-instance.example.com:8080"

Why Volumes for Fan-out

Without volumes, every new sandbox repeats the upload: each sbx.files.write() or sbx.files.put_raw() call pushes bytes from the SDK, through sandbox-manager, into that specific sandbox’s overlay. Run 20 sandboxes over the same 500 MiB dataset and you pay for roughly 10 GiB of ingress. With volumes, the 500 MiB blob lives in object storage once; each Sandbox.create(volumes=[...]) hydrates directly from there, in parallel with the other creates.
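The bandwidth arithmetic above can be checked with a quick sketch (purely illustrative; the sizes and sandbox count are the ones from the example):

```python
# Rough ingress math for fanning a 500 MiB dataset out to 20 sandboxes.
MIB = 1024 ** 2
dataset_bytes = 500 * MIB
sandboxes = 20

per_sandbox_uploads = sandboxes * dataset_bytes   # upload repeated per create
with_volume = dataset_bytes                       # uploaded to object storage once

print(f"without volumes: {per_sandbox_uploads / MIB / 1024:.2f} GiB of ingress")
print(f"with a volume:   {with_volume / MIB / 1024:.2f} GiB, paid once")
```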

Code Walkthrough

from declaw import Sandbox, Volumes, VolumeAttachment

# 1. Upload the dataset once.
with open("training.tar.gz", "rb") as f:
    vol = Volumes.create(name="training-set", data=f)
attachment = VolumeAttachment(volume_id=vol.volume_id, mount_path="/data")

# 2. Create N sandboxes, each attaching the same volume.
sandboxes = [
    Sandbox.create(template="python", timeout=300, volumes=[attachment])
    for _ in range(4)
]

try:
    # Each sandbox sees its own copy of /data; writes don't cross sandbox boundaries.
    for i, sbx in enumerate(sandboxes):
        r = sbx.commands.run("wc -l /data/rows.csv")
        print(f"sandbox {i}: {r.stdout.strip()}")
finally:
    for sbx in sandboxes:
        sbx.kill()
    Volumes.delete(vol.volume_id)
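Since each create is independent and hydration happens server-side, the creation loop can also be parallelized. A minimal sketch of the pattern, with create_sandbox as a hypothetical stand-in for Sandbox.create(template="python", timeout=300, volumes=[attachment]):

```python
from concurrent.futures import ThreadPoolExecutor

def create_sandbox(i: int) -> str:
    # Stand-in for Sandbox.create(...); each real call blocks until its own
    # hydrate finishes, so running the calls in threads overlaps those waits.
    return f"sandbox-{i}"

with ThreadPoolExecutor(max_workers=4) as pool:
    handles = list(pool.map(create_sandbox, range(4)))

print(handles)  # ['sandbox-0', 'sandbox-1', 'sandbox-2', 'sandbox-3']
```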

What’s Happening Under the Hood

  1. Volumes.create streams the tarball into Declaw’s object store under an owner-scoped key. The SDK doesn’t retain any state about the payload past the returned volume_id.
  2. Each Sandbox.create hands the orchestrator a VolumeAttachment list. After the VM boots (warm-pool or cold), the orchestrator streams the blob back from object storage, unpacks it directly into the VM’s overlay via the in-VM file API, then acknowledges the create.
  3. Sandboxes are isolated — a write to /data/foo in sandbox A does not appear in sandbox B. The volume itself is read-only at hydrate time.
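Step 2, stream-then-unpack, has roughly the following shape. This is a self-contained sketch using the standard tarfile module, not Declaw’s actual orchestrator code:

```python
import io
import pathlib
import tarfile
import tempfile

def hydrate(blob: bytes, mount_path: str) -> None:
    # Unpack a (possibly gzipped) tarball into the mount path -- the shape of
    # what the orchestrator does into the VM overlay via the in-VM file API.
    # On Python >= 3.12, consider passing filter="data" for safer extraction.
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:*") as tar:
        tar.extractall(mount_path)

# Build a tiny tarball in memory to stand in for the object-store blob.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    payload = b"a,b\n1,2\n"
    info = tarfile.TarInfo(name="rows.csv")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

with tempfile.TemporaryDirectory() as mount:
    hydrate(buf.getvalue(), mount)
    print((pathlib.Path(mount) / "rows.csv").read_text())
```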

Tips

  • The same volume_id can be attached concurrently from many processes; there is no lock or contention.
  • For multi-GiB uploads, set request_timeout on Volumes.create and on the client-side HTTPS call: a 3 GiB upload over a 25 MB/s pipe takes roughly two minutes, while httpx’s default timeout is only 5 seconds.
  • If a hydrate fails for any reason (object-store outage, tar corruption), the sandbox still boots and reports healthy; the files simply won’t be present. Check that the mount path exists before running logic that depends on it.
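The hydrate-failure tip can be turned into a guard at the top of the command you run in each sandbox. A sketch, assuming the dataset lands at /data/rows.csv as in the walkthrough:

```shell
check_volume() {
  # Succeed only if the hydrated file is actually present under the mount path.
  [ -e "$1" ]
}

if check_volume /data/rows.csv; then
  wc -l /data/rows.csv
else
  echo "volume hydrate missing: /data/rows.csv"
fi
```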

Phase 1 Limits

  • Phase 1 volumes are read-at-boot. There is no write-back channel: if a sandbox edits a file under mount_path, those changes die with the sandbox.