When an agent handles data that may contain PII, there are two distinct egress points to protect:
  1. The prompt itself leaving the agent process and going to the LLM — OpenAI sees whatever you send it. Use PIIHandler.anonymize to strip PII before the call and PIIHandler.deanonymize to reconstruct the original values on the way back.
  2. Outbound HTTP from the sandbox when the agent’s tool code calls external APIs — enforced by SecurityPolicy.pii at the sandbox’s edge proxy; the LLM-generated code never gets a chance to leak PII through a curl or requests.post.
Both paths use declaw’s guardrails service under the hood; this cookbook shows them working together.
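The core mechanic is the same in both layers: replace each PII value with a stable token, keep the token-to-value map on the trusted side, and swap the values back on the return path. Here is a toy, regex-based sketch of that round trip — not declaw's implementation (PIIHandler delegates detection to the guardrails service), just the shape of the contract:

```python
import re
import uuid

# Toy detectors; the real guardrails service does proper PII detection.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def anonymize(text: str) -> tuple[str, dict[str, str]]:
    """Replace each detected PII value with REDACTED_<KIND>_<token>."""
    rmap: dict[str, str] = {}
    for kind, pattern in PATTERNS.items():
        for value in set(pattern.findall(text)):
            token = f"REDACTED_{kind}_{uuid.uuid4().hex[:8]}"
            rmap[token] = value
            text = text.replace(value, token)
    return text, rmap


def deanonymize(text: str, rmap: dict[str, str]) -> str:
    """Swap tokens back to the original values using the local map."""
    for token, value in rmap.items():
        text = text.replace(token, value)
    return text


goal = "Email alice@acme.com about case 123-45-6789."
anon, rmap = anonymize(goal)
assert "alice@acme.com" not in anon and "123-45-6789" not in anon
assert deanonymize(anon, rmap) == goal  # lossless round trip
```

The important property is that the map never leaves your process: the model (layer 1) or the external API (layer 2) only ever sees tokens.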

What you’ll learn

  • Anonymizing a user goal with PIIHandler before it goes to the LLM
  • Setting PIIConfig(action="redact", rehydrate_response=True) so the sandbox-side guardrails redact in transit and rehydrate on the return path
  • Rehydrating the model’s final output back to the original PII

Prerequisites

export DECLAW_API_KEY="your-api-key"
export DECLAW_DOMAIN="your-declaw-instance.example.com:8080"
export OPENAI_API_KEY="sk-..."
pip install "declaw[openai]"

Code

import asyncio
import os
import sys

from agents import Runner
from agents.run import RunConfig
from agents.sandbox import SandboxAgent, SandboxRunConfig

from declaw.openai import (
    DeclawSandboxClient,
    DeclawSandboxClientOptions,
    PIIConfig,
    PIIHandler,
    SecurityPolicy,
)


USER_GOAL = (
    "Our customer is Alice (email: alice@acme.com, SSN 123-45-6789). "
    "Write her case details to /workspace/case.txt, then tell me the byte count."
)


async def main() -> None:
    # --- Layer 1: anonymize the prompt before it reaches the model ---
    pii = PIIHandler()
    (anonymized_goal,), rmap = pii.anonymize([USER_GOAL])

    print("original goal:")
    print(" ", USER_GOAL)
    print("anonymized goal (what the LLM sees):")
    print(" ", anonymized_goal)
    print(f"redaction map entries: {len(rmap)}")

    # --- Layer 2: sandbox with edge-proxy PII scanning ---
    options = DeclawSandboxClientOptions(
        template="base",
        timeout=180,
        security=SecurityPolicy(
            pii=PIIConfig(
                enabled=True,
                action="redact",
                rehydrate_response=True,  # keep the sandbox program's view intact
            ),
        ),
    )

    client = DeclawSandboxClient()
    session = await client.create(options=options)
    try:
        agent = SandboxAgent(
            name="pii-demo",
            model="gpt-5.4",
            instructions="You are a customer-ops agent. Use the bash tool as instructed.",
        )
        result = await Runner.run(
            agent,
            anonymized_goal,
            run_config=RunConfig(sandbox=SandboxRunConfig(session=session)),
        )

        # Rehydrate the model's output so downstream customer-facing
        # code sees the original PII again.
        final = pii.deanonymize(result.final_output, rmap)
        print("\n== final (rehydrated) ==")
        print(final)
    finally:
        await client.delete(session)


if __name__ == "__main__":
    asyncio.run(main())

What happens under the hood

| Step | Where | What gets scanned |
| --- | --- | --- |
| `PIIHandler.anonymize([goal])` | Your agent process | The goal string. `alice@acme.com` → `REDACTED_EMAIL_<token>`, `123-45-6789` → `REDACTED_SSN_<token>`. The map stays local. |
| `Runner.run(agent, anonymized_goal)` | Your agent process | OpenAI sees only the anonymized text. |
| Agent tool call `bash("echo … > /workspace/case.txt")` | Declaw sandbox | The command runs inside the VM. The output returned to the agent loop is whatever the command printed — in this example there is no network egress, so the sandbox-side PII policy doesn't fire. |
| Agent tool call `curl -X POST https://crm.internal/?email=alice@acme.com` (if the agent writes code that calls an external API) | Declaw edge proxy | The request is scanned by the guardrails service. The outbound request is rewritten with redacted tokens before it reaches `crm.internal`; the response is rehydrated on the way back so the sandbox program sees real values. The audit log records `pii_redactions`. |
| `pii.deanonymize(result.final_output, rmap)` | Your agent process | The model's response may contain tokens like `REDACTED_EMAIL_…`; the map restores the originals. |
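The curl row above can be pictured as a middlebox: the proxy redacts PII on the way out and rehydrates the response on the way in, so code running inside the sandbox never notices. A toy stand-in (hypothetical helper names — the real enforcement is declaw's guardrails service at the sandbox edge):

```python
import re

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def proxy_roundtrip(outbound_body: str, upstream) -> str:
    """Toy edge proxy: redact PII going out, rehydrate the response."""
    rmap: dict[str, str] = {}

    def redact(m: re.Match) -> str:
        token = f"REDACTED_SSN_{len(rmap)}"
        rmap[token] = m.group(0)
        return token

    redacted = SSN.sub(redact, outbound_body)  # what the external API receives
    response = upstream(redacted)              # upstream only ever sees tokens
    for token, value in rmap.items():          # rehydrate on the return path
        response = response.replace(token, value)
    return response                            # sandbox program sees real values


# Fake upstream that records and echoes what it received.
seen = []
resp = proxy_roundtrip("ssn=123-45-6789", lambda b: seen.append(b) or f"stored {b}")
assert "123-45-6789" not in seen[0]  # the external API never saw the SSN
assert "123-45-6789" in resp         # but the sandbox code still does
```

This is what `rehydrate_response=True` buys you: the agent's generated code works with real values end to end, while the wire only ever carries tokens.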

Why both layers?

Layer 1 alone isn’t enough if the agent’s code is allowed to call external APIs from inside the sandbox — the model might reconstruct PII or emit new PII in its generated code. Layer 2 catches that at the network edge regardless of what the model produced. Layer 2 alone isn’t enough if you don’t want OpenAI’s servers to see customer PII in the prompt. Running them together gives full coverage.