Declaw provides a layered security model. Every sandbox has a SecurityPolicy that defines exactly what kinds of outbound traffic are allowed, what PII gets redacted, and which requests get audited. All enforcement happens transparently in the security proxy running inside the sandbox — your agent code requires no modifications.

SecurityPolicy structure

from declaw import (
    Sandbox,  # used by Sandbox.create() below
    SecurityPolicy, PIIConfig, InjectionDefenseConfig,
    NetworkPolicy, TransformationRule, AuditConfig,
    EnvSecurityConfig, ALL_TRAFFIC,
)

policy = SecurityPolicy(
    pii=PIIConfig(
        enabled=True,
        types=["ssn", "credit_card", "email", "phone_number"],
        action="redact",
        rehydrate_response=True,
    ),
    injection_defense=InjectionDefenseConfig(
        enabled=True,
        action="block",
        threshold=0.8,
    ),
    network=NetworkPolicy(
        allow_out=["*.openai.com", "pypi.org"],
        deny_out=[ALL_TRAFFIC],
    ),
    transformations=[
        TransformationRule(
            direction="outbound",
            match=r"Authorization:\s*Bearer\s+sk-\w+",
            replace="Authorization: Bearer [REDACTED]",
        ),
    ],
    audit=AuditConfig(enabled=True),
    env=EnvSecurityConfig(
        mask_patterns=["*_KEY", "*_SECRET", "*_TOKEN"],
    ),
)

sbx = Sandbox.create(security=policy)

Enforcement pipeline

All outbound traffic from the sandbox passes through a 6-stage pipeline before reaching the internet. On the response path, the stages run in a different order: injection scanning first, then inbound transformation rules, then PII rehydration (restoring original values from the session redaction map), and finally audit logging.

Stage descriptions

Stage 1 (IP/CIDR filtering): IP and CIDR rules are enforced at the kernel level via iptables. This is the fastest path, with no userspace proxy overhead for purely IP-based rules. deny_out entries become DROP rules; allow_out IP/CIDR entries become ACCEPT rules with higher priority.
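
The precedence described above (ACCEPT rules for allow_out are evaluated before DROP rules for deny_out) can be modelled in a few lines. This is a minimal sketch in plain Python, not the actual iptables rule set: the function name `egress_verdict` is hypothetical, a catch-all deny is written as the literal `0.0.0.0/0`, and the default verdict when no rule matches is an assumption.

```python
import ipaddress


def egress_verdict(dest_ip, allow_cidrs, deny_cidrs):
    """Return "ACCEPT" or "DROP" for a destination IP (illustrative model only)."""
    ip = ipaddress.ip_address(dest_ip)

    # allow_out entries are checked first, mirroring their higher priority.
    for cidr in allow_cidrs:
        if ip in ipaddress.ip_network(cidr, strict=False):
            return "ACCEPT"

    # deny_out entries are consulted only when no allow rule matched.
    for cidr in deny_cidrs:
        if ip in ipaddress.ip_network(cidr, strict=False):
            return "DROP"

    # Assumption: traffic matching no rule at all is allowed.
    return "ACCEPT"


allow = ["10.0.0.0/8"]
deny = ["0.0.0.0/0"]  # assumed catch-all, standing in for a deny-everything entry
print(egress_verdict("10.0.0.5", allow, deny))
print(egress_verdict("8.8.8.8", allow, deny))
```

With an allow list plus a catch-all deny, only destinations inside the allowed ranges get through, which is the allowlist pattern used in the policy examples on this page.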
Stage 2 (domain filtering): When domain names appear in allow_out or deny_out, all TCP traffic is redirected through the per-namespace TCP proxy. The proxy inspects the TLS SNI field (port 443) or HTTP Host header (port 80) to determine the destination domain. Wildcard patterns like *.openai.com are supported.
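
To make the wildcard behaviour concrete, the sketch below matches a hostname (as it would be read from SNI or the Host header) against allow_out patterns using glob semantics from the standard library. The helper `domain_allowed` is hypothetical, and glob matching is an assumption: the proxy's real matcher may differ, for example in whether `*.openai.com` also covers the bare `openai.com`.

```python
from fnmatch import fnmatchcase


def domain_allowed(host, allow_out, deny_all=True):
    """Check a hostname against allow_out patterns (illustrative model only)."""
    # Normalise case and strip a trailing dot from fully qualified names.
    host = host.lower().rstrip(".")
    for pattern in allow_out:
        if fnmatchcase(host, pattern.lower()):
            return True
    # With a catch-all deny in place, anything not explicitly allowed is rejected.
    return not deny_all


allow_out = ["*.openai.com", "pypi.org"]
for host in ["api.openai.com", "pypi.org", "example.com", "openai.com"]:
    print(host, domain_allowed(host, allow_out))
```

Under these glob semantics the pattern `*.openai.com` requires a literal `.openai.com` suffix, so lookalike hosts such as `evil-openai.com` do not match.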
Stage 3 (TLS interception): When PII scanning or transformation rules are enabled, the proxy performs TLS interception at the edge proxy. A per-sandbox CA certificate is generated at sandbox creation and injected into the VM trust store. The proxy terminates TLS, inspects the plaintext body, and re-encrypts to the real destination. This stage is skipped entirely when no body inspection is needed.
Stage 4 (PII and injection scanning): Scans request and response bodies for PII and prompt injection patterns.
PII scanning: Regex patterns cover structured PII (SSN, credit card with Luhn validation, email, phone). When the optional Guardrails Service is deployed, it adds ML-based NER for unstructured PII. Three actions are available: redact (replace with a token), block (reject the request), or log_only (pass through and audit). The redaction map is stored per-session so response bodies can be rehydrated.
Injection defense: Configurable sensitivity threshold. When the Guardrails Service is deployed, the qualifire/prompt-injection-sentinel model scores the content. Actions: block (reject) or log (pass through and audit).
On the response path, injection scanning runs first, then the Transform Engine applies inbound rules, and finally PII rehydration restores the original values. This ensures transforms operate on redacted content, not raw PII.
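
The redact-then-rehydrate cycle can be sketched with the standard library alone. Everything below is illustrative: the regexes, the `[SSN_1]`-style token format, and the function names are assumptions, not Declaw's actual patterns or tokens. The sketch covers only SSNs and Luhn-checked card numbers.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_ok(number):
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact(text, redaction_map):
    """Replace detected PII with tokens and record originals in redaction_map."""
    def swap(kind, value):
        token = f"[{kind}_{len(redaction_map) + 1}]"  # hypothetical token format
        redaction_map[token] = value
        return token

    text = SSN_RE.sub(lambda m: swap("SSN", m.group()), text)
    # Card-like digit runs are redacted only when they pass the Luhn check.
    text = CARD_RE.sub(
        lambda m: swap("CARD", m.group()) if luhn_ok(m.group()) else m.group(),
        text,
    )
    return text


def rehydrate(text, redaction_map):
    """Restore original values in a response that echoes redaction tokens."""
    for token, value in redaction_map.items():
        text = text.replace(token, value)
    return text


session_map = {}
outbound = redact("SSN 123-45-6789, card 4111 1111 1111 1111", session_map)
print(outbound)
print(rehydrate("Echo: " + outbound, session_map))
```

Because the map is keyed by token, any inbound processing that runs before rehydration sees only the tokens, which matches the ordering rationale given above.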
Stage 5 (Transform Engine): Applies TransformationRule regex patterns to request or response bodies. Rules are direction-aware: outbound rules apply to requests, inbound rules apply to responses. Useful for stripping API keys from outbound headers or removing injection patterns from inbound content.
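
Direction-aware rule application amounts to a filtered sequence of regex substitutions. The sketch below reuses the Authorization pattern from the policy example at the top of this page; the `Rule` dataclass is a hypothetical stand-in for TransformationRule, and `apply_rules` is not an SDK function.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical stand-in mirroring TransformationRule's three fields."""
    direction: str  # "outbound" or "inbound"
    match: str      # regex pattern
    replace: str    # replacement text


def apply_rules(payload, direction, rules):
    """Apply, in order, every rule whose direction matches the traffic direction."""
    for rule in rules:
        if rule.direction == direction:
            payload = re.sub(rule.match, rule.replace, payload)
    return payload


rules = [
    Rule(
        direction="outbound",
        match=r"Authorization:\s*Bearer\s+sk-\w+",
        replace="Authorization: Bearer [REDACTED]",
    ),
]

request = "Authorization: Bearer sk-abc123\r\nHost: api.openai.com"
print(apply_rules(request, "outbound", rules))
print(apply_rules(request, "inbound", rules))  # unchanged: no inbound rules
```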
Stage 6 (audit logging): Records lifecycle events (vm_created, vm_killed, …) and, when audit is enabled, network decisions (egress_allowed, egress_blocked) to the platform audit log. Request/response bodies are not recorded. Retention is 7 days platform-wide; opt out per sandbox with AuditConfig(enabled=False).

Composability

Each security component is independent. You can enable any combination:
# Network-only policy (no body scanning)
policy = SecurityPolicy(
    network=NetworkPolicy(allow_out=["pypi.org"], deny_out=[ALL_TRAFFIC]),
)

# PII-only (scan all traffic, no network restriction)
policy = SecurityPolicy(
    pii=PIIConfig(enabled=True, types=["ssn", "credit_card"]),
)

# Full stack
policy = SecurityPolicy(
    pii=PIIConfig(enabled=True, types=["ssn", "email", "credit_card"]),
    injection_defense=True,
    network=NetworkPolicy(allow_out=["*.openai.com"], deny_out=[ALL_TRAFFIC]),
    transformations=[...],
    audit=True,
)
TLS interception (Stage 3) activates automatically when pii.enabled=True or transformations are configured. It remains off when only network policies or audit logging are used, so there is no TLS overhead for pure network restriction use cases.
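
The activation rule stated above can be restated as a small predicate, which may help when reasoning about whether a given policy incurs TLS overhead. The function name and its arguments are hypothetical and not part of the declaw SDK; it encodes only the two conditions named in the text.

```python
def tls_interception_needed(pii_enabled=False, transformations=()):
    """True when body inspection is required, per the rule described above."""
    return bool(pii_enabled) or len(transformations) > 0


# Network-only policy: no body inspection, so no TLS interception.
print(tls_interception_needed())
# PII scanning enabled: interception activates.
print(tls_interception_needed(pii_enabled=True))
# Any transformation rule also activates it.
print(tls_interception_needed(transformations=["some-rule"]))
```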

Shorthand forms

Several fields accept shorthand values for common configurations:
# injection_defense=True is equivalent to InjectionDefenseConfig(enabled=True)
policy = SecurityPolicy(injection_defense=True)

# audit=True is equivalent to AuditConfig(enabled=True)
policy = SecurityPolicy(audit=True)

# network dict is equivalent to NetworkPolicy(**dict)
policy = SecurityPolicy(
    network={"allow_out": ["pypi.org"], "deny_out": [ALL_TRAFFIC]}
)
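
One way to picture how such shorthands could be normalised is a small coercion helper that turns a bool or dict into the full config object. This is a sketch of the general pattern, not Declaw's implementation: `AuditCfg`, `NetPolicy`, and `coerce` are hypothetical stand-ins defined here so the example runs without the SDK.

```python
from dataclasses import dataclass, field


@dataclass
class AuditCfg:
    """Hypothetical stand-in for AuditConfig."""
    enabled: bool = False


@dataclass
class NetPolicy:
    """Hypothetical stand-in for NetworkPolicy."""
    allow_out: list = field(default_factory=list)
    deny_out: list = field(default_factory=list)


def coerce(value, cls):
    """Normalise a shorthand value into an instance of cls."""
    if isinstance(value, cls):
        return value                # already a full config object
    if isinstance(value, bool):
        return cls(enabled=value)   # True/False shorthand (cls must accept `enabled`)
    if isinstance(value, dict):
        return cls(**value)         # dict shorthand, expanded as keyword arguments
    raise TypeError(f"cannot build {cls.__name__} from {type(value).__name__}")


print(coerce(True, AuditCfg))
print(coerce({"allow_out": ["pypi.org"]}, NetPolicy))
```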

Security sections

| Page | What it covers |
| --- | --- |
| PII Redaction | PIIConfig, detection types, redact/block/log actions |
| Prompt Injection Defense | InjectionDefenseConfig, sensitivity thresholds, ML model |
| Network Policies | NetworkPolicy, domain filtering, CIDR rules |
| Transformation Rules | TransformationRule, regex patterns, directions |
| Audit Logging | AuditConfig, AuditEntry, retrieval |
| Env Secrets | EnvSecurityConfig, SecureEnvVar, masking patterns |
| Guardrails Service | ML-powered scanning, Presidio, HuggingFace, deployment |