VM structure
Rootfs isolation
Each sandbox gets an independent rootfs through Linux OverlayFS:
- Base image (/opt/declaw/rootfs/rootfs.ext4): Read-only ext4 image shared across all sandboxes. Contains the OS, packages, and tools from the selected template.
- Overlay image (/opt/declaw/run/<sandbox_id>/overlay.ext4): Per-sandbox writable ext4 image. All writes go here. Deleting a sandbox deletes only this file.
- OverlayFS mount: The VM sees the merged view, with the base image as the lower layer and the overlay as the upper layer.
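As a rough illustration of the layering above, the following Python sketch builds the argument list for a standard OverlayFS mount. The mount points (/mnt/base, /mnt/overlay, /mnt/merged) and the upper and work subdirectory names are assumptions made for this example; the text does not say where the two ext4 images are mounted inside the VM. The lowerdir, upperdir, and workdir options are the standard OverlayFS mount options.

```python
from pathlib import PurePosixPath


def overlay_mount_argv(base_mnt: str, overlay_mnt: str, merged: str) -> list[str]:
    """Build a `mount -t overlay` command line (mount points are hypothetical)."""
    overlay = PurePosixPath(overlay_mnt)
    # OverlayFS requires upperdir and workdir to be on the same filesystem,
    # so both are placed inside the writable per-sandbox overlay image.
    upper = overlay / "upper"
    work = overlay / "work"
    options = f"lowerdir={base_mnt},upperdir={upper},workdir={work}"
    return ["mount", "-t", "overlay", "overlay", "-o", options, merged]


argv = overlay_mount_argv("/mnt/base", "/mnt/overlay", "/mnt/merged")
print(" ".join(argv))
```

Because all writes land in the upper directory, discarding the overlay image discards every change the sandbox made while leaving the shared base image untouched.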
Sandbox ID validation
Sandbox IDs must match the pattern sbx-[a-z0-9][a-z0-9-]{0,63} before they are used in any filesystem path. The orchestrator rejects IDs containing .., /, or any path separator. This prevents path traversal attacks where a crafted sandbox ID like sbx-../../../etc could resolve to an unintended directory.
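The check can be sketched in a few lines of Python. This is an illustration, not the orchestrator's actual code: the function name overlay_path is hypothetical, while the pattern and the /opt/declaw/run layout are taken from the text. Using fullmatch anchors the pattern at both ends, so a valid prefix followed by extra characters (including a trailing newline) is still rejected.

```python
import re
from pathlib import PurePosixPath

# Pattern quoted from the text; fullmatch() requires the whole string to match.
SANDBOX_ID_RE = re.compile(r"sbx-[a-z0-9][a-z0-9-]{0,63}")
RUN_DIR = PurePosixPath("/opt/declaw/run")


def overlay_path(sandbox_id: str) -> PurePosixPath:
    """Validate the ID first, then derive the per-sandbox overlay path."""
    if SANDBOX_ID_RE.fullmatch(sandbox_id) is None:
        raise ValueError(f"invalid sandbox ID: {sandbox_id!r}")
    return RUN_DIR / sandbox_id / "overlay.ext4"


print(overlay_path("sbx-abc123"))

try:
    overlay_path("sbx-../../../etc")
except ValueError as err:
    print("rejected:", err)
```

Since the character class admits only lowercase letters, digits, and hyphens, neither a dot nor a slash can ever reach the path join.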
Network setup
Each sandbox gets a dedicated Linux network namespace (ns-sbx-<id>) with:
- veth pair: One end in the host network namespace (for routing), one end in the sandbox namespace.
- TAP device: Inside the sandbox namespace, bridges the namespace to the Firecracker VM’s virtual NIC.
- iptables rules: Applied to the veth interface in the sandbox namespace for CIDR-based network policies.
Network slot pool
The orchestrator maintains a pool of 256 pre-allocated network slots. Each slot includes a pre-created network namespace, veth pair, and TAP device. When a sandbox is created, a slot is claimed from the pool (instant), and the pool replenishes in the background. This eliminates networking setup time from the sandbox creation hot path.
Boot process
Cold boot time from the CreateVM() call to envd ready is approximately 125 milliseconds. With snapshot restore, it drops to approximately 30 milliseconds.
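One contributor to these figures is the pre-allocated network slot pool described under Network slot pool, which keeps networking setup off the creation path. The claim-then-replenish pattern might look like the following sketch. The class and function names (SlotPool, NetworkSlot, provision_slot) are hypothetical, and provision_slot is a stand-in for the real namespace, veth, and TAP setup; only the pool size of 256 comes from the text.

```python
import queue
import threading
from dataclasses import dataclass

POOL_SIZE = 256  # pool size stated in the text


@dataclass(frozen=True)
class NetworkSlot:
    index: int  # hypothetical identifier for one namespace/veth/TAP set


def provision_slot(index: int) -> NetworkSlot:
    # Stand-in for the slow work: creating the namespace, veth pair, and TAP device.
    return NetworkSlot(index)


class SlotPool:
    def __init__(self, size: int = POOL_SIZE) -> None:
        self._slots = queue.Queue(maxsize=size)
        self._next_index = 0
        self._lock = threading.Lock()
        for _ in range(size):
            self._slots.put(self._provision())

    def _provision(self) -> NetworkSlot:
        with self._lock:
            index = self._next_index
            self._next_index += 1
        return provision_slot(index)

    def claim(self) -> NetworkSlot:
        slot = self._slots.get()  # hot path: a queue pop, no network setup
        # Replace the claimed slot in the background so the caller never waits.
        threading.Thread(target=self._refill, daemon=True).start()
        return slot

    def _refill(self) -> None:
        self._slots.put(self._provision())

    def available(self) -> int:
        return self._slots.qsize()


pool = SlotPool()
slot = pool.claim()
print(slot.index)
```

The caller pays only for a queue pop; the cost of provisioning a replacement slot is moved to a background thread.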
Resource defaults
| Resource | Default | Range |
|---|---|---|
| vCPUs | 1 | 1–8 |
| Memory | 256 MB | 128 MB–8 GB |
| Disk (overlay) | 20 GB ext4 | Fixed |
| Boot time | ~125 ms cold | ~30 ms from snapshot |
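A request validator for the vCPU and memory limits in the table could be sketched as follows. The SandboxResources class and its field names are hypothetical, and the upper memory bound assumes 8 GB means 8 × 1024 MB; the defaults and ranges themselves come from the table.

```python
from dataclasses import dataclass

MAX_MEMORY_MB = 8 * 1024  # assumption: "8 GB" is read as 8192 MB


@dataclass(frozen=True)
class SandboxResources:
    vcpus: int = 1        # table default
    memory_mb: int = 256  # table default

    def __post_init__(self) -> None:
        if not 1 <= self.vcpus <= 8:
            raise ValueError(f"vcpus must be between 1 and 8, got {self.vcpus}")
        if not 128 <= self.memory_mb <= MAX_MEMORY_MB:
            raise ValueError(
                f"memory_mb must be between 128 and {MAX_MEMORY_MB}, got {self.memory_mb}"
            )


defaults = SandboxResources()
print(defaults)
```

The overlay disk size is fixed, so it is deliberately not a field here.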
Concurrency limits
The FirecrackerManager enforces two semaphores to prevent resource exhaustion:
- Create semaphore: Maximum 1024 concurrent sandbox creations. Prevents OOM when many sandboxes start simultaneously.
- Exec semaphore: Set to 4 × CPU cores (e.g., 128 slots on a 32-core machine). Limits concurrent fork+exec calls to reduce CPU contention during parallel sandbox creation.
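The two limits above can be illustrated with a pair of bounded semaphores. This Python sketch is not the FirecrackerManager implementation; the names create_sandbox, exec_slot_count, and the spawn callback are hypothetical, and nesting the exec semaphore inside the create semaphore is an assumption about how the two interact.

```python
import os
import threading

MAX_CONCURRENT_CREATES = 1024  # limit stated in the text


def exec_slot_count(cpu_cores: int) -> int:
    """Exec semaphore size: 4 x CPU cores, as described in the text."""
    return 4 * cpu_cores


create_sem = threading.BoundedSemaphore(MAX_CONCURRENT_CREATES)
exec_sem = threading.BoundedSemaphore(exec_slot_count(os.cpu_count() or 1))


def create_sandbox(spawn):
    """Run `spawn` (a hypothetical callable that launches the VM process)."""
    with create_sem:      # caps in-flight creations to bound memory use
        with exec_sem:    # held only around the fork+exec step
            handle = spawn()
        # Remaining setup would continue here without holding an exec slot.
        return handle


print(exec_slot_count(32))
```

Keeping the exec semaphore scoped to the spawn call means a slow later setup step does not block other sandboxes from launching.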
envd: the in-VM daemon
envd is a Go binary that starts as PID 1 inside every Firecracker VM. It exposes a ConnectRPC server on port 49983 (accessible via the TAP/veth bridge from the host) with three service groups:
| Service | Methods |
|---|---|
| Filesystem | Read, Write, WriteBatch, ListDir, Exists, Info, Remove, Rename, MakeDir, Watch |
| Process | Start, Wait, Kill, SendStdin, List |
| PTY | Create, Kill, SendStdin, Resize |
Snapshot mechanism
When create_snapshot() is called:
- The VM is briefly paused (CRIU-style memory dump)
- The memory image and overlay diff are written to GCS (GCP) or S3 (AWS)
- The VM resumes
When a sandbox is restored from a snapshot:
- A fresh network slot is claimed
- The memory image and overlay are downloaded and placed in the sandbox directory
- Firecracker restores the VM from the memory image (~30 ms)
- envd is already running — no boot phase