Use case

Hand a dataset and a training spec to an agent, get back metrics and a plot. The code-interpreter template exists for exactly this: scikit-learn, pandas, numpy, matplotlib, and seaborn come pre-installed, so training starts in under a second with no pip install detour.

Template

code-interpreter — rich scientific Python stack. Start this for anything data-science-shaped.

Run it

export DECLAW_API_KEY=dcl_...
export DECLAW_DOMAIN=api.declaw.ai
export OPENAI_API_KEY=sk-...

python cookbook/openai_agents_ml_model.py

Security policy

SecurityPolicy(
    injection_defense=InjectionDefenseConfig(enabled=True, sensitivity="medium"),
    network=NetworkPolicy(allow_out=["api.openai.com"]),
)
No PII scanning here because the sklearn demo datasets (iris, wine, digits) don’t contain any. For real workloads, add PIIConfig(enabled=True, rehydrate_response=True). Network egress is OpenAI-only. Training on a local dataset doesn’t need any other host; if someone tries to prompt-inject the agent into calling an exfiltration endpoint, the connection will simply fail.

Env isolation

envs={
    "MODEL_FAMILY": "logistic_regression",
    "RANDOM_SEED": "42",
    "DATASET": "iris",
}
Training runs are parameterized entirely by env. Re-running with a different MODEL_FAMILY takes an env change, not a prompt change, so the agent’s instructions stay deterministic.
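
A minimal sketch of that parameterization, using only the standard library and scikit-learn. The ESTIMATORS mapping and the defaults are illustrative choices, not part of the template:

```python
import os

from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Resolve training parameters from the env the sandbox injects.
# Defaults mirror the envs block above.
model_family = os.environ.get("MODEL_FAMILY", "logistic_regression")
random_seed = int(os.environ.get("RANDOM_SEED", "42"))
dataset = os.environ.get("DATASET", "iris")

# Map the env value to an estimator factory; an unknown value raises
# a KeyError instead of silently training the wrong model.
ESTIMATORS = {
    "logistic_regression": lambda: LogisticRegression(
        max_iter=1000, random_state=random_seed
    ),
    "random_forest": lambda: RandomForestClassifier(random_state=random_seed),
}
model = ESTIMATORS[model_family]()
```

Keeping the mapping explicit means a typo in MODEL_FAMILY fails at startup rather than mid-run.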

What the agent does

  1. printenv MODEL_FAMILY RANDOM_SEED DATASET.
  2. Load the dataset via sklearn.datasets.load_{iris|wine|digits}.
  3. Train the selected family (logistic_regression or random_forest) with seed $RANDOM_SEED.
  4. Run 5-fold cross-validation.
  5. Save /workspace/confusion.png and /workspace/metrics.json (accuracy_mean, accuracy_std, classes, n_samples).
  6. Return cat /workspace/metrics.json.

Expected output

{
  "accuracy_mean": 0.9733,
  "accuracy_std":  0.0249,
  "classes":       ["setosa", "versicolor", "virginica"],
  "n_samples":     150
}

Full source

See cookbook/openai_agents_ml_model.py in the repo.