Security Policy

from declaw import (
    SecurityPolicy,
    PIIConfig, PIIType, RedactionAction,
    InjectionDefenseConfig, InjectionAction, InjectionSensitivity,
    ToxicityConfig,
    CodeSecurityConfig,
    InvisibleTextConfig,
    NetworkPolicy,
    TransformationRule, TransformDirection,
    AuditConfig, AuditEntry,
    EnvSecurityConfig, SecureEnvVar,
    SandboxNetworkOpts, ALL_TRAFFIC,
)

A SecurityPolicy is passed to Sandbox.create() via the security parameter. It composes PII detection, injection defense, toxicity / code-security / invisible-text scanners, network policy, transformation rules, audit logging, and environment variable security into a single object.

SecurityPolicy

from declaw import Sandbox, SecurityPolicy, PIIConfig, InjectionDefenseConfig

policy = SecurityPolicy(
    pii=PIIConfig(enabled=True, action="redact"),
    injection_defense=InjectionDefenseConfig(enabled=True, action="block"),
    audit=True,
)

sbx = Sandbox.create(security=policy, api_key="key", domain="host:8080")

pii

PIIConfig

default:"PIIConfig()"

PII detection and redaction configuration. See PIIConfig.

injection_defense

bool | InjectionDefenseConfig

default:"False"

Prompt injection defense. Pass True to enable with defaults, or an InjectionDefenseConfig for custom settings. See InjectionDefenseConfig.

transformations

list[TransformationRule]

default:"[]"

List of regex-based request/response body transformations. See TransformationRule.

network

NetworkPolicy | None

default:"None"

Network allowlist/denylist policy. See NetworkPolicy.

audit

bool | AuditConfig

default:"False"

Audit logging. Pass True to enable with defaults, or an AuditConfig to set it explicitly. Retention and body logging are not configurable per sandbox. See AuditConfig.

toxicity

ToxicityConfig | None

default:"None"

Toxicity scanner for outbound HTTP request bodies. See ToxicityConfig.

code_security

CodeSecurityConfig | None

default:"None"

Code-security scanner for outbound HTTP request bodies. See CodeSecurityConfig.

invisible_text

InvisibleTextConfig | None

default:"None"

Invisible-Unicode scanner for outbound HTTP request bodies. See InvisibleTextConfig.

env_security

EnvSecurityConfig

default:"EnvSecurityConfig()"

Environment variable masking in audit logs. See EnvSecurityConfig.

Properties

Property	Type	Description
`policy.injection_config`	`InjectionDefenseConfig`	Resolved config regardless of whether a `bool` or object was passed.
`policy.audit_config`	`AuditConfig`	Resolved audit config.
`policy.requires_tls_interception`	`bool`	`True` if PII, injection defense, or any transformations are enabled.
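
The resolved properties can be pictured with a small stand-in. This is a sketch, not declaw's implementation: `StubInjectionConfig`, `StubPolicy`, and the `pii_enabled` flag are hypothetical names, and the logic assumes only what the table states (a `bool` or a config object goes in, a config object always comes out, and TLS interception is needed when PII, injection defense, or transformations are on).

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass
class StubInjectionConfig:
    """Hypothetical stand-in for InjectionDefenseConfig."""
    enabled: bool = False
    action: str = "log_only"


@dataclass
class StubPolicy:
    """Hypothetical stand-in for SecurityPolicy (reduced to three fields)."""
    pii_enabled: bool = False  # simplification of the pii field
    injection_defense: Union[bool, StubInjectionConfig] = False
    transformations: list = field(default_factory=list)

    @property
    def injection_config(self) -> StubInjectionConfig:
        # A bool is expanded into a config object; an object is returned as-is.
        if isinstance(self.injection_defense, bool):
            return StubInjectionConfig(enabled=self.injection_defense)
        return self.injection_defense

    @property
    def requires_tls_interception(self) -> bool:
        # True as soon as any body-level feature is switched on.
        return (
            self.pii_enabled
            or self.injection_config.enabled
            or bool(self.transformations)
        )


print(StubPolicy(injection_defense=True).injection_config)
print(StubPolicy().requires_tls_interception)
```

Reading `injection_config` rather than `injection_defense` means calling code never has to branch on whether a `bool` or an object was supplied.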

Methods

Method	Returns	Description
`policy.to_dict()`	`dict`	Serialize to a JSON-compatible dict.
`policy.to_json()`	`str`	Serialize to a JSON string.
`SecurityPolicy.from_dict(data)`	`SecurityPolicy`	Deserialize from a dict.

PIIConfig

Configure detection and handling of personally identifiable information in outbound HTTP traffic.

from declaw import PIIConfig, PIIType, RedactionAction

pii = PIIConfig(
    enabled=True,
    types=[PIIType.EMAIL, PIIType.CREDIT_CARD, PIIType.SSN],
    action=RedactionAction.REDACT.value,
    rehydrate_response=True,
)

enabled

bool

default:"False"

Whether PII scanning is active.

types

list[str]

default:"all types"

PII types to scan for. Defaults to all PIIType values. Valid values are the string values of PIIType.

action

str

default:"'redact'"

Action to take when PII is detected. One of 'redact', 'block', 'log_only'.

rehydrate_response

bool

default:"True"

When True, the security proxy replaces redaction tokens in API responses with the original values so the agent sees real data in replies.

domains

list[str] | None

default:"None"

Limit PII scanning to requests targeting these domains. None means scan all domains.

`PIIType` enum

class PIIType(str, Enum):
    SSN         = "ssn"
    CREDIT_CARD = "credit_card"
    EMAIL       = "email"
    PHONE       = "phone"
    PERSON_NAME = "person_name"
    API_KEY     = "api_key"
    ADDRESS     = "address"
    IP_ADDRESS  = "ip_address"

`RedactionAction` enum

class RedactionAction(str, Enum):
    REDACT   = "redact"    # Replace with a placeholder token
    BLOCK    = "block"     # Reject the request entirely (HTTP 403)
    LOG_ONLY = "log_only"  # Log the detection but forward unchanged
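
To make the interaction between 'redact' and rehydrate_response concrete, here is a minimal sketch of the round trip. It is an illustration under stated assumptions, not the proxy's actual logic: the email regex, the `[EMAIL_n]` token format, and the `redact` / `rehydrate` helpers are all hypothetical.

```python
import re

# Deliberately simple email pattern, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(text: str) -> tuple[str, dict[str, str]]:
    """Replace each email with a placeholder token and remember the mapping."""
    mapping: dict[str, str] = {}

    def _swap(match: re.Match) -> str:
        token = f"[EMAIL_{len(mapping) + 1}]"  # hypothetical token format
        mapping[token] = match.group(0)
        return token

    return EMAIL_RE.sub(_swap, text), mapping


def rehydrate(text: str, mapping: dict[str, str]) -> str:
    """Put the original values back wherever a token appears in a response."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


# Outbound: the upstream API only ever sees the token.
outbound, tokens = redact("Contact alice@example.com about the invoice.")
print(outbound)

# Inbound: the agent sees the real value again.
reply = rehydrate("I will email [EMAIL_1] today.", tokens)
print(reply)
```

With rehydrate_response=False the second step would be skipped and the agent would see the token in the reply.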

InjectionDefenseConfig

Detect prompt injection attempts in outbound HTTP request bodies and either block or log them.

from declaw import InjectionDefenseConfig, InjectionAction, InjectionSensitivity

injection = InjectionDefenseConfig(
    enabled=True,
    action=InjectionAction.LOG_ONLY.value,
    sensitivity=InjectionSensitivity.MEDIUM.value,
    threshold=0.8,
)

enabled

bool

default:"False"

Whether injection defense is active.

action

str

default:"'log_only'"

Action when injection is detected. One of 'block' (HTTP 403) or 'log_only' (forward unchanged, record detection in the audit log).

sensitivity

str

default:"'medium'"

Preset sensitivity tier. One of 'low', 'medium', 'high'. Adjusts the scanner’s detection aggressiveness independently of threshold.

threshold

float

default:"0.8"

Confidence threshold between 0.0 and 1.0. Requests with a score above this value trigger the configured action. Lower values increase sensitivity.

domains

list[str] | None

default:"None"

Limit injection scanning to these domains. None means scan all.

`InjectionAction` enum

class InjectionAction(str, Enum):
    BLOCK    = "block"     # Reject the request (HTTP 403)
    LOG_ONLY = "log_only"  # Log and forward unchanged

`InjectionSensitivity` enum

class InjectionSensitivity(str, Enum):
    LOW    = "low"
    MEDIUM = "medium"
    HIGH   = "high"
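
The threshold and action fields combine as sketched below. This is a hedged illustration: the `decide` function and its return labels are hypothetical, and the effect of sensitivity is left out because the text above does not specify how it adjusts scoring.

```python
def decide(score: float, threshold: float = 0.8, action: str = "log_only") -> str:
    """Return what happens to a request with the given detection score."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if score <= threshold:
        return "forward"          # score is not above the threshold: no detection
    if action == "block":
        return "reject_403"       # detection with action='block'
    return "forward_and_log"      # detection with action='log_only'


print(decide(0.85))                                   # default action
print(decide(0.85, action="block"))
print(decide(0.78, threshold=0.75, action="block"))   # lower threshold catches more
```

Lowering the threshold from 0.8 to 0.75 turns a request scored at 0.78 from forwarded into rejected, which is what "lower values increase sensitivity" means in practice.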

ToxicityConfig

Scan outbound HTTP request bodies for toxic content (harassment, hate speech, etc.).

from declaw import ToxicityConfig

toxicity = ToxicityConfig(
    enabled=True,
    threshold=0.9,
    action="block",
)

enabled

bool

default:"False"

Whether toxicity scanning is active.

threshold

float

default:"0.9"

Confidence threshold between 0.0 and 1.0. Requests scoring above this value trigger the configured action.

action

str

default:"'block'"

Action when toxic content is detected. One of 'block' (HTTP 403) or 'log_only'.

CodeSecurityConfig

Detect suspicious or unsafe code in outbound HTTP request bodies.

from declaw import CodeSecurityConfig

code = CodeSecurityConfig(
    enabled=True,
    threshold=0.6,
    action="log_only",
    excluded_languages=["markdown", "plaintext"],
)

enabled

bool

default:"False"

Whether code-security scanning is active.

threshold

float

default:"0.6"

Confidence threshold between 0.0 and 1.0.

action

str

default:"'log_only'"

Action when suspicious code is detected. One of 'block' (HTTP 403) or 'log_only'.

excluded_languages

list[str] | None

default:"None"

Languages to exclude from scanning. Useful when content is intentionally code but already in a trusted context.

InvisibleTextConfig

Detect invisible or control Unicode characters (often used to smuggle prompt instructions past the model) in outbound HTTP request bodies.

from declaw import InvisibleTextConfig

invisible = InvisibleTextConfig(
    enabled=True,
    action="strip",
)

enabled

bool

default:"False"

Whether invisible-text scanning is active.

action

str

default:"'strip'"

Action when invisible characters are detected. One of 'block' (HTTP 403), 'strip' (remove the characters and forward), or 'log_only'.
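
The 'strip' action can be approximated as follows. Which code points the scanner treats as invisible is not specified here, so this sketch assumes the Unicode "Cf" (format) category, which covers zero-width spaces, zero-width joiners, and bidirectional controls. `find_invisible` and `strip_invisible` are hypothetical helper names.

```python
import unicodedata


def find_invisible(text: str) -> list[str]:
    """List the code points that fall in Unicode category Cf."""
    return [f"U+{ord(ch):04X}" for ch in text if unicodedata.category(ch) == "Cf"]


def strip_invisible(text: str) -> str:
    """Drop every Cf character and keep the rest unchanged."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


# A zero-width space and a zero-width joiner hidden in the middle of a prompt.
payload = "Summarize this\u200b\u200d file"
print(find_invisible(payload))
print(strip_invisible(payload))
```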

NetworkPolicy

Network allowlist and denylist for outbound traffic from the sandbox. Set this on SecurityPolicy.network to apply it alongside other security controls.

from declaw import NetworkPolicy, ALL_TRAFFIC

# Allow pypi.org, subdomains of github.com, and 8.8.8.8; deny everything else
network = NetworkPolicy(
    allow_out=["pypi.org", "*.github.com", "8.8.8.8"],
    deny_out=[ALL_TRAFFIC],
    allow_public_traffic=False,
)

allow_out

list[str]

default:"[]"

Destinations to allow. Accepts IP addresses, CIDR blocks (e.g. "10.0.0.0/8"), and domain names with optional wildcard prefix (e.g. "*.github.com").

deny_out

list[str]

default:"[]"

Destinations to deny. Accepts IP addresses and CIDR blocks only (domains not accepted in deny rules).

allow_public_traffic

bool

default:"True"

Whether to allow all public traffic by default. Set to False when using allow_out to build an allowlist.

mask_request_host

str | None

default:"None"

Replace the Host header in all outbound requests with this value. Used for routing through a reverse proxy.

`ALL_TRAFFIC` constant

ALL_TRAFFIC: str = "0.0.0.0/0"

Use deny_out=[ALL_TRAFFIC] to block all outbound traffic.
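
How an allowlist built from allow_out plus deny_out=[ALL_TRAFFIC] might be evaluated is sketched below. The precedence (allow rules win over deny rules) is an assumption inferred from the allowlist example above, and `_matches` / `is_allowed` are hypothetical helpers built on the standard library, not declaw functions.

```python
import ipaddress
from fnmatch import fnmatch

ALL_TRAFFIC = "0.0.0.0/0"  # same value as the documented constant


def _matches(rule: str, dest: str) -> bool:
    """True if dest (an IP address or hostname) is covered by rule."""
    try:
        network = ipaddress.ip_network(rule, strict=False)
    except ValueError:
        # Not an IP or CIDR, so treat the rule as a domain pattern.
        return fnmatch(dest.lower(), rule.lower())
    if network.prefixlen == 0:
        return True  # assumption: 0.0.0.0/0 covers every destination
    try:
        return ipaddress.ip_address(dest) in network
    except ValueError:
        return False  # a hostname never matches a specific IP/CIDR rule


def is_allowed(dest, allow_out, deny_out, allow_public_traffic=True):
    """Decide whether outbound traffic to dest would pass."""
    # Assumption: an explicit allow rule takes precedence over any deny rule.
    if any(_matches(rule, dest) for rule in allow_out):
        return True
    if any(_matches(rule, dest) for rule in deny_out):
        return False
    return allow_public_traffic


allow = ["pypi.org", "*.github.com", "8.8.8.8"]
deny = [ALL_TRAFFIC]
for dest in ["pypi.org", "api.github.com", "8.8.8.8", "example.com", "1.1.1.1"]:
    print(dest, is_allowed(dest, allow, deny, allow_public_traffic=False))
```

In this sketch "*.github.com" matches subdomains only, not the bare "github.com"; whether the real matcher behaves the same way is not stated in this reference.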

`SandboxNetworkOpts`

SandboxNetworkOpts is the lower-level equivalent used directly in Sandbox.create(network=...). It has the same fields as NetworkPolicy using snake_case attribute names.

from declaw import SandboxNetworkOpts, ALL_TRAFFIC

network = SandboxNetworkOpts(
    allow_out=["pypi.org"],
    deny_out=[ALL_TRAFFIC],
    allow_public_traffic=False,
)

TransformationRule

Regex-based text transformation applied to outbound request bodies, inbound response bodies, or both.

from declaw import TransformationRule, TransformDirection

# Strip Bearer tokens from outbound requests
rule = TransformationRule(
    match=r"Bearer [A-Za-z0-9\-_\.]+",
    replace="Bearer [REDACTED]",
    direction=TransformDirection.OUTBOUND.value,
)

match

str

required

Python-compatible regular expression. Must be a valid regex pattern.

replace

str

required

Replacement string. Supports Python re.sub back-references (e.g. \1).

direction

str

default:"'outbound'"

Direction to apply the rule. One of 'outbound', 'inbound', 'both'.

`TransformDirection` enum

class TransformDirection(str, Enum):
    OUTBOUND = "outbound"   # Apply to requests leaving the sandbox
    INBOUND  = "inbound"    # Apply to responses entering the sandbox
    BOTH     = "both"       # Apply in both directions

Methods

Method	Returns	Description
`rule.applies_to(direction)`	`bool`	Whether the rule applies in the given direction string.
`rule.apply(text)`	`str`	Apply the regex substitution to `text`.
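
The two methods can be illustrated with a stand-in class. `StubRule` is hypothetical and only mirrors the documented fields; it assumes `apply` is a plain `re.sub` call and that a rule with direction 'both' applies to either direction.

```python
import re
from dataclasses import dataclass


@dataclass
class StubRule:
    """Hypothetical stand-in mirroring TransformationRule's documented fields."""
    match: str
    replace: str
    direction: str = "outbound"

    def applies_to(self, direction: str) -> bool:
        # 'both' covers outbound and inbound; otherwise directions must be equal.
        return self.direction in ("both", direction)

    def apply(self, text: str) -> str:
        # Back-references such as \1 in `replace` are expanded by re.sub.
        return re.sub(self.match, self.replace, text)


rule = StubRule(match=r"(user)=\w+", replace=r"\1=[hidden]", direction="both")
print(rule.applies_to("inbound"))
print(rule.apply("user=alice&page=2"))
```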

AuditConfig

Toggle whether lifecycle and security events for the sandbox are shipped to Declaw’s audit log.

from declaw import AuditConfig

# Opt out of audit logging for this sandbox
audit = AuditConfig(enabled=False)

enabled

bool

default:"True"

When True (the default), the orchestrator records the sandbox’s lifecycle events (create, kill, pause, resume, snapshot) and security decisions (egress allow/block) to the audit log. Set to False to suppress all audit events for the sandbox.

Audit log retention is a platform-wide setting (currently a 7-day rolling window) and is not configurable per sandbox. Request and response body logging is not exposed to callers.

`AuditEntry`

@dataclass
class AuditEntry:
    timestamp: datetime.datetime
    method: str
    url: str
    status_code: int = 0
    pii_redactions: int = 0
    injection_blocks: int = 0
    transformations_applied: int = 0
    direction: str = "outbound"
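
As a sketch of how these fields might be consumed, the snippet below totals the counters across a list of entries. How entries are retrieved from the audit log is not covered in this reference, so the list is built by hand; the `AuditEntry` class is a local copy of the definition above and `summarize` is a hypothetical helper.

```python
import datetime
from dataclasses import dataclass


@dataclass
class AuditEntry:  # local copy of the fields listed above, for illustration only
    timestamp: datetime.datetime
    method: str
    url: str
    status_code: int = 0
    pii_redactions: int = 0
    injection_blocks: int = 0
    transformations_applied: int = 0
    direction: str = "outbound"


def summarize(entries):
    """Total the per-request counters across a list of entries."""
    return {
        "requests": len(entries),
        "pii_redactions": sum(e.pii_redactions for e in entries),
        "injection_blocks": sum(e.injection_blocks for e in entries),
        "transformations_applied": sum(e.transformations_applied for e in entries),
    }


now = datetime.datetime(2025, 1, 1, 12, 0, 0)
entries = [
    AuditEntry(now, "POST", "https://api.example.com/v1/chat", 200, pii_redactions=2),
    AuditEntry(now, "POST", "https://api.example.com/v1/chat", 403, injection_blocks=1),
]
summary = summarize(entries)
print(summary)
```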

EnvSecurityConfig

Control how environment variable values are masked in audit logs.

from declaw import EnvSecurityConfig

env_sec = EnvSecurityConfig(
    mask_patterns=["*_KEY", "*_SECRET", "*_TOKEN", "*_PASSWORD", "API_KEY"],
    auto_mask_in_audit=True,
)

mask_patterns

list[str]

Glob patterns matched against uppercase environment variable names. Variables matching any pattern are masked as *** in audit logs.

auto_mask_in_audit

bool

default:"True"

Automatically redact matching variable values in all audit log entries.
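
The glob matching described above can be sketched with the standard library. This assumes that variable names are uppercased before matching and that a hit replaces the whole value with ***; `mask_env` is a hypothetical helper, not part of declaw.

```python
from fnmatch import fnmatchcase

MASK_PATTERNS = ["*_KEY", "*_SECRET", "*_TOKEN", "*_PASSWORD", "API_KEY"]


def mask_env(env: dict[str, str], patterns=MASK_PATTERNS) -> dict[str, str]:
    """Return a copy of env with values of matching variables replaced by ***."""
    masked = {}
    for name, value in env.items():
        # Assumption: the name is uppercased first, then matched case-sensitively.
        hit = any(fnmatchcase(name.upper(), pattern) for pattern in patterns)
        masked[name] = "***" if hit else value
    return masked


env = {"OPENAI_API_KEY": "sk-abc", "HOME": "/root", "db_password": "hunter2"}
print(mask_env(env))
```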

`SecureEnvVar`

Pass sensitive environment variables without leaking values in logs:

from declaw import SecureEnvVar

var = SecureEnvVar(key="OPENAI_API_KEY", value="sk-...", secret=True)
var.to_safe_dict()  # {"key": "OPENAI_API_KEY", "value": "***", "secret": True}

Full policy example

from declaw import (
    Sandbox, SecurityPolicy,
    PIIConfig, PIIType,
    InjectionDefenseConfig, InjectionAction,
    NetworkPolicy, ALL_TRAFFIC,
    TransformationRule, TransformDirection,
    AuditConfig,
)

policy = SecurityPolicy(
    pii=PIIConfig(
        enabled=True,
        types=[PIIType.EMAIL, PIIType.SSN, PIIType.CREDIT_CARD],
        action="redact",
        rehydrate_response=True,
    ),
    injection_defense=InjectionDefenseConfig(
        enabled=True,
        action=InjectionAction.BLOCK.value,
        threshold=0.75,
    ),
    network=NetworkPolicy(
        allow_out=["api.openai.com", "pypi.org"],
        deny_out=[ALL_TRAFFIC],
        allow_public_traffic=False,
    ),
    transformations=[
        TransformationRule(
            match=r"sk-[A-Za-z0-9]+",
            replace="sk-[REDACTED]",
            direction=TransformDirection.OUTBOUND.value,
        ),
    ],
    audit=AuditConfig(enabled=True),
)

sbx = Sandbox.create(
    security=policy,
    api_key="your-api-key",
    domain="104.198.24.180:8080",
)
