Use case
Let an agent scrape a single site and synthesize a summary, with an airtight guarantee that it cannot reach any other host. A compromised page or prompt injection cannot make the agent call home, because the edge proxy simply won't connect it there.
Template
python — Python + pip; everything else is installed inside the sandbox at runtime.
Run it
Security policy — the star of this recipe
allow_out is the only way out of the sandbox. Any request to
any other host returns a connection failure inside the VM. Try
this: change the agent’s instructions to curl evil.example and
watch the tool call fail — there’s nothing the agent can do from
inside the VM to reach it.
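The allowlist semantics can be modeled in a few lines. This is a sketch of the decision the edge proxy makes, not its actual implementation, and the allow_out value here is a hypothetical single-host policy:

```python
from urllib.parse import urlparse

ALLOW_OUT = ["example.com"]  # hypothetical allow_out policy for this recipe

def egress_allowed(url, allow_out=ALLOW_OUT):
    """Model of the edge proxy's decision: connect only if the request's
    hostname is on the allowlist. Simplified -- no subdomain or port logic."""
    host = urlparse(url).hostname
    return host in allow_out

print(egress_allowed("https://example.com/page"))  # True
print(egress_allowed("https://evil.example/x"))    # False -- connection refused in the VM
```

Anything that falls into the False branch is what the agent experiences as a plain connection failure.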
Env isolation
Passing USER_AGENT via env means the agent sets the scraper UA
from a value the host controls — even if an attacker tries to
inject a different UA through the prompt, the instruction tells
the agent to read $USER_AGENT, not to compose one.
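Inside the VM that pattern is one line: read the injected variable, never prompt content. The helper name below is hypothetical:

```python
import os

def request_headers(env=None):
    """Build request headers from the sandbox-injected USER_AGENT.
    The agent reads $USER_AGENT verbatim; it never composes a UA
    from prompt content. (request_headers is a hypothetical helper.)"""
    env = os.environ if env is None else env
    return {"User-Agent": env["USER_AGENT"]}
```

A missing USER_AGENT raises KeyError here on purpose: failing loudly beats silently scraping with a default UA.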
What the agent does
- printenv TARGET_URL USER_AGENT SCRAPER_ID (logged to /workspace/run.log).
- pip install beautifulsoup4 lxml requests.
- Fetch $TARGET_URL with $USER_AGENT, extract the top 5 items, dump JSON to /workspace/results.json.
- Return the JSON.
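The extract-and-dump step might look like the sketch below. For self-containment it uses the stdlib html.parser rather than the beautifulsoup4 the recipe installs, and it assumes "items" means the page's h2 headings — a stand-in for whatever the real target exposes:

```python
import json
from html.parser import HTMLParser

class ItemExtractor(HTMLParser):
    """Collects the text of <h2> headings -- a hypothetical stand-in
    for the 'top 5 items' on the real target page."""
    def __init__(self):
        super().__init__()
        self.items = []
        self._in_h2 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False

    def handle_data(self, data):
        if self._in_h2 and data.strip():
            self.items.append(data.strip())

def top_items(html, n=5):
    parser = ItemExtractor()
    parser.feed(html)
    return parser.items[:n]

sample = "<h2>First</h2><h2>Second</h2><p>ad</p><h2>Third</h2>"
print(json.dumps(top_items(sample)))  # → ["First", "Second", "Third"]
```

In the recipe, that JSON string is what lands in /workspace/results.json and comes back as the tool result.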
Filesystem isolation
The pip install cache, the BeautifulSoup dep tree, the raw HTML,
and results.json all live in the sandbox’s overlay. Nothing is
ever written to your host. The next run of this script gets a
fresh VM — no lingering cache or cookies.
Full source
See cookbook/openai_agents_web_scraper.py in the repo.