Data analyst agent

Use case

You have a dataset and want an agent to run the whole analysis (load, summarize, visualize, report) without any of the analysis tooling running on your host machine. The agent gets a fresh microVM, installs pandas and matplotlib inside it, and works behind a locked-down network (only the OpenAI API, PyPI, and one dataset host are reachable) with PII redaction on every outbound call.

Template

python: ships with Python 3.11 and pip, with common scientific packages available on request. The PII scanner and injection defense run at the sandbox’s edge proxy.

Run it

export DECLAW_API_KEY=dcl_...
export DECLAW_DOMAIN=api.declaw.ai
export OPENAI_API_KEY=sk-...

python cookbook/examples/openai-agents-data-analyst/main.py

Security policy

SecurityPolicy(
    pii=PIIConfig(enabled=True, action="redact", rehydrate_response=True),
    injection_defense=InjectionDefenseConfig(enabled=True, sensitivity="high"),
    network=NetworkPolicy(
        allow_out=[
            "api.openai.com",
            "pypi.org",
            "files.pythonhosted.org",
            "raw.githubusercontent.com",
        ],
    ),
)
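To make the effect of allow_out concrete, here is a minimal sketch of how an egress allowlist of this shape could be evaluated. It assumes exact hostname matching, which the policy above does not confirm (wildcards, ports, and IP handling are not covered), and is_allowed is a hypothetical helper for illustration, not part of the Declaw SDK.

```python
from urllib.parse import urlsplit

# Hosts copied from the allow_out list in the policy above.
ALLOW_OUT = frozenset({
    "api.openai.com",
    "pypi.org",
    "files.pythonhosted.org",
    "raw.githubusercontent.com",
})


def is_allowed(url: str, allow_out: frozenset[str] = ALLOW_OUT) -> bool:
    """Return True if the URL's host is on the allowlist.

    Exact-match semantics are an assumption made for this sketch.
    """
    host = urlsplit(url).hostname  # lowercased by urlsplit; None when no host is present
    return host is not None and host in allow_out
```

Under these assumptions, the agent's curl to raw.githubusercontent.com and its pip downloads would pass, while a request to any other host would be refused at the proxy.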

rehydrate_response=True matters here: the analyst’s pandas output can contain PII, which the scanner redacts on outbound calls, and the model’s reply may echo those placeholders back. The edge proxy restores the originals before the sandbox receives the response, so the agent’s code sees a normal API response rather than a pile of REDACTED_* tokens.
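The round trip can be illustrated with a toy redact/rehydrate pair. This is only a sketch of the idea, not the edge proxy's implementation: it handles email addresses alone, and the REDACTED_EMAIL_n token format, the regex, and both function names are assumptions made for the example.

```python
import re

# Deliberately simple email pattern; a real scanner covers far more PII types.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
TOKEN_RE = re.compile(r"REDACTED_EMAIL_\d+")


def redact(text: str) -> tuple[str, dict[str, str]]:
    """Replace each distinct email with a placeholder; return text and token map."""
    mapping: dict[str, str] = {}  # placeholder -> original value
    seen: dict[str, str] = {}     # original value -> placeholder

    def _sub(match: re.Match) -> str:
        original = match.group(0)
        if original not in seen:
            token = f"REDACTED_EMAIL_{len(seen) + 1}"
            seen[original] = token
            mapping[token] = original
        return seen[original]

    return EMAIL_RE.sub(_sub, text), mapping


def rehydrate(text: str, mapping: dict[str, str]) -> str:
    """Swap known placeholders in a response back to their original values."""
    # Unknown tokens are left untouched rather than guessed at.
    return TOKEN_RE.sub(lambda m: mapping.get(m.group(0), m.group(0)), text)
```

With rehydration off, the second step would be skipped and the sandbox would receive the reply with the placeholders still in it.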

What the agent does

printenv to confirm the sandbox-provided config variables.
curl the CSV into /workspace/data.csv.
pip install pandas matplotlib.
Generate a script that loads, summarizes, plots, and writes /workspace/report.md.
Return the report path. The Python driver then reads /workspace/report.md back through the sandbox API.
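The script the agent generates in the fourth step might look roughly like the sketch below. The actual script is written by the model at run time, so treat this as an illustration only: the column names Country, Date, and Confirmed, the function names, and the exact report line format are all assumptions.

```python
import pandas as pd


def top_countries(df: pd.DataFrame, n: int = 5) -> pd.Series:
    """Latest confirmed count per country, largest first (column names assumed)."""
    latest = df.sort_values("Date").groupby("Country")["Confirmed"].last()
    return latest.sort_values(ascending=False).head(n)


def build_report(df: pd.DataFrame, n: int = 5) -> str:
    """Render the Markdown summary as a string."""
    top = top_countries(df, n)
    lines = [
        "# COVID-19 time series summary",
        f"- Rows: {len(df)}",
        f"- Columns: {', '.join(map(str, df.columns))}",
        f"- Top {len(top)} countries by latest confirmed case count:",
    ]
    for rank, (country, count) in enumerate(top.items(), start=1):
        lines.append(f"  {rank}. {country}: {int(count)}")
    return "\n".join(lines) + "\n"


def save_chart(top: pd.Series, path: str) -> None:
    """Write a bar chart of the top countries to a PNG file."""
    import matplotlib
    matplotlib.use("Agg")  # headless backend; the sandbox has no display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    top.plot.bar(ax=ax)
    ax.set_ylabel("Confirmed cases")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)


def main(csv_path: str = "/workspace/data.csv") -> str:
    """Load the CSV, write the chart and report, and return the report path."""
    data = pd.read_csv(csv_path, parse_dates=["Date"])
    save_chart(top_countries(data), "/workspace/top5.png")
    report_path = "/workspace/report.md"
    with open(report_path, "w") as fh:
        fh.write(build_report(data))
    return report_path
```

Keeping the summary logic in build_report, separate from file I/O and plotting, makes it easy to check against a small in-memory DataFrame before running it on the full dataset.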

Expected output

== agent output ==
/workspace/report.md

== /workspace/report.md ==
# COVID-19 time series summary
- Rows: ...
- Columns: ...
- Top 5 countries by latest confirmed case count:
  1. ...
(+ bar chart at /workspace/top5.png)

Why filesystem isolation matters here

Every artifact (downloaded CSV, pip cache, plot, report) lives in a fresh overlay that’s discarded when client.delete(session=...) runs. The next caller gets a clean VM with none of this caller’s state. You don’t need to pre-provision scratch directories or clean them up — the sandbox lifecycle handles it.

Full source

See cookbook/examples/openai-agents-data-analyst/main.py in the repo.
