Use case

Hand a dataset and a training spec to an agent, get back metrics and a plot. The code-interpreter template exists for exactly this: scikit-learn, pandas, numpy, matplotlib, and seaborn come pre-installed, so training starts in under a second with no pip install detour.

Template

code-interpreter — rich scientific Python stack. Start this for anything data-science-shaped.

Run it

export DECLAW_API_KEY=dcl_...
export DECLAW_DOMAIN=api.declaw.ai
export OPENAI_API_KEY=sk-...

python cookbook/openai_agents_ml_model.py

Security policy

SecurityPolicy(
    injection_defense=InjectionDefenseConfig(enabled=True, sensitivity="medium"),
    network=NetworkPolicy(allow_out=["api.openai.com"]),
)
No PII scanning here because the sklearn demo datasets (iris, wine, digits) don’t contain any. For real workloads, add PIIConfig(enabled=True, rehydrate_response=True). Network egress is OpenAI-only. Training on a local dataset doesn’t need any other host; if someone tries to prompt-inject the agent into calling an exfiltration endpoint, the connection will simply fail.

Env isolation

envs={
    "MODEL_FAMILY": "logistic_regression",
    "RANDOM_SEED": "42",
    "DATASET": "iris",
}
Training runs are parameterized entirely by env. Re-running with a different MODEL_FAMILY takes an env change, not a prompt change, so the agent’s instructions stay deterministic.
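
A minimal sketch of that parameterization, using only the standard library and scikit-learn. The ESTIMATORS mapping and the defaults are illustrative choices, not part of the template:

```python
import os

from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Resolve training parameters from the env the sandbox injects.
# Defaults mirror the envs block above.
model_family = os.environ.get("MODEL_FAMILY", "logistic_regression")
random_seed = int(os.environ.get("RANDOM_SEED", "42"))
dataset = os.environ.get("DATASET", "iris")

# Map the env value to an estimator factory; an unknown value raises
# a KeyError instead of silently training the wrong model.
ESTIMATORS = {
    "logistic_regression": lambda: LogisticRegression(
        max_iter=1000, random_state=random_seed
    ),
    "random_forest": lambda: RandomForestClassifier(random_state=random_seed),
}
model = ESTIMATORS[model_family]()
```

Keeping the mapping explicit means a typo in MODEL_FAMILY fails at startup rather than mid-run.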

What the agent does

  1. printenv MODEL_FAMILY RANDOM_SEED DATASET.
  2. Load the dataset via sklearn.datasets.load_{iris|wine|digits}.
  3. Train the selected family (logistic_regression or random_forest) with seed $RANDOM_SEED.
  4. Run 5-fold cross-validation.
  5. Save /workspace/confusion.png and /workspace/metrics.json (accuracy_mean, accuracy_std, classes, n_samples).
  6. Return cat /workspace/metrics.json.

Expected output

{
  "accuracy_mean": 0.9733,
  "accuracy_std":  0.0249,
  "classes":       ["setosa", "versicolor", "virginica"],
  "n_samples":     150
}

Full source

See cookbook/openai_agents_ml_model.py in the repo.