Declaw provides a layered security model. Every sandbox has a SecurityPolicy that defines exactly what kinds of outbound traffic are allowed, what PII gets redacted, and which requests get audited. All enforcement happens transparently in the security proxy running inside the microVM — your agent code requires no modifications.

SecurityPolicy structure

from declaw import (
    Sandbox, SecurityPolicy, PIIConfig, InjectionDefenseConfig,
    NetworkPolicy, TransformationRule, AuditConfig,
    EnvSecurityConfig, ALL_TRAFFIC,
)

policy = SecurityPolicy(
    pii=PIIConfig(
        enabled=True,
        types=["ssn", "credit_card", "email", "phone_number"],
        action="redact",
        rehydrate_response=True,
    ),
    injection_defense=InjectionDefenseConfig(
        enabled=True,
        action="block",
        threshold=0.8,
    ),
    network=NetworkPolicy(
        allow_out=["*.openai.com", "pypi.org"],
        deny_out=[ALL_TRAFFIC],
    ),
    transformations=[
        TransformationRule(
            direction="outbound",
            match=r"Authorization:\s*Bearer\s+sk-\w+",
            replace="Authorization: Bearer [REDACTED]",
        ),
    ],
    audit=AuditConfig(enabled=True),
    env=EnvSecurityConfig(
        mask_patterns=["*_KEY", "*_SECRET", "*_TOKEN"],
    ),
)

sbx = Sandbox.create(security=policy)

Enforcement pipeline

All outbound traffic from the sandbox passes through a 7-stage pipeline before reaching the internet.

Stage descriptions

Stage 1 (IP/CIDR filtering): IP and CIDR rules are enforced at the kernel level via iptables. This is the fastest path, with no userspace proxy overhead for purely IP-based rules. deny_out entries become DROP rules; allow_out IP/CIDR entries become ACCEPT rules with higher priority.
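As a rough illustration of how IP/CIDR entries could translate into kernel rules, the sketch below builds iptables command strings from allow_out and deny_out lists. The helper names, the OUTPUT chain, and the use of 0.0.0.0/0 to stand in for ALL_TRAFFIC are assumptions made for this example, not Declaw's actual implementation.

```python
import ipaddress


def is_ip_or_cidr(entry: str) -> bool:
    """Return True if the entry parses as an IP address or CIDR block."""
    try:
        ipaddress.ip_network(entry, strict=False)
        return True
    except ValueError:
        return False


def build_iptables_rules(allow_out, deny_out, chain="OUTPUT"):
    """Build iptables commands; the chain name is a hypothetical choice."""
    rules = []
    # ACCEPT rules are appended first. iptables stops at the first matching
    # terminating rule, so earlier rules take priority over later ones.
    for cidr in filter(is_ip_or_cidr, allow_out):
        rules.append(f"iptables -A {chain} -d {cidr} -j ACCEPT")
    # Domain entries are skipped here; only IP/CIDR entries become DROP rules.
    for cidr in filter(is_ip_or_cidr, deny_out):
        rules.append(f"iptables -A {chain} -d {cidr} -j DROP")
    return rules


rules = build_iptables_rules(["1.1.1.1", "pypi.org"], ["0.0.0.0/0"])
for rule in rules:
    print(rule)
```

Domain entries such as pypi.org are ignored by this helper because they cannot be expressed as IP rules; they are handled by the proxy instead.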
Stage 2 (domain filtering): When domain names appear in allow_out or deny_out, all TCP traffic is redirected through the per-namespace TCP proxy. The proxy inspects the TLS SNI field (port 443) or HTTP Host header (port 80) to determine the destination domain. Wildcard patterns like *.openai.com are supported.
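The domain check can be pictured as glob matching against the hostname taken from the SNI field or Host header. The sketch below uses Python's fnmatch for this; the function names, the deny_all flag standing in for ALL_TRAFFIC, and the choice that allow entries win over deny entries are assumptions for illustration only.

```python
from fnmatch import fnmatch


def host_matches(host: str, patterns) -> bool:
    """Return True if the hostname matches any glob-style pattern."""
    host = host.lower().rstrip(".")
    return any(fnmatch(host, pattern.lower()) for pattern in patterns)


def is_allowed(host: str, allow_out, deny_out, deny_all: bool = False) -> bool:
    """Decide whether a connection to host may proceed (hypothetical logic)."""
    if host_matches(host, allow_out):
        return True  # assumed: allow entries take priority over deny entries
    if deny_all or host_matches(host, deny_out):
        return False
    return True  # no rule matched, so traffic passes


print(is_allowed("api.openai.com", ["*.openai.com"], [], deny_all=True))
print(is_allowed("example.com", ["*.openai.com"], [], deny_all=True))
```

Note that with plain glob matching, *.openai.com does not match the bare domain openai.com. Whether Declaw treats the apex domain as a match is not stated in this page.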
Stage 3 (TLS interception): When PII scanning or transformation rules are enabled, the proxy performs MITM TLS interception. A per-sandbox CA certificate is generated at sandbox creation and injected into the VM trust store. The proxy terminates TLS, inspects the plaintext body, and re-encrypts to the real destination. This stage is skipped entirely when no body inspection is needed.
Stage 4 (PII scanning): The proxy scans request bodies for the PII types configured in PIIConfig. Regex patterns cover structured PII (SSN, credit card with Luhn validation, email, phone). When the optional Guardrails Service is deployed, it adds ML-based NER for unstructured PII. Three actions are available: redact (replace with a token), block (reject the request), or log_only (pass through and audit). The redaction map is stored per session so that response bodies can be rehydrated.
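To make the redact and rehydrate cycle concrete, here is a minimal sketch of regex-based detection with Luhn validation and a redaction map. The regex patterns, the token format (for example [SSN_1]), and the function names are hypothetical; the real scanner's patterns and tokens may differ.

```python
import re


def luhn_valid(number: str) -> bool:
    """Check a card number with the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


# Illustrative patterns only; not the patterns Declaw ships.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b\d(?:[ -]?\d){12,18}\b"),
}


def redact(text: str, types):
    """Replace detected PII with tokens and return (text, redaction map)."""
    mapping = {}
    for pii_type in types:
        def substitute(match, pii_type=pii_type):
            value = match.group(0)
            if pii_type == "credit_card" and not luhn_valid(value):
                return value  # digits that fail Luhn are left untouched
            token = f"[{pii_type.upper()}_{len(mapping) + 1}]"
            mapping[token] = value
            return token
        text = PATTERNS[pii_type].sub(substitute, text)
    return text, mapping


def rehydrate(text: str, mapping) -> str:
    """Restore original values in a response body using the redaction map."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


body = "SSN 123-45-6789, card 4111 1111 1111 1111, order 1234567890123"
redacted, mapping = redact(body, ["ssn", "credit_card"])
print(redacted)
print(rehydrate(redacted, mapping) == body)
```

The order number in the sample has 13 digits but fails the Luhn check, so it is not treated as a card number. This is the kind of false positive that checksum validation is meant to avoid.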
Stage 5 (prompt injection defense): The proxy scans both outbound requests and inbound responses for prompt injection patterns, using a configurable sensitivity threshold. When the Guardrails Service is deployed, the qualifire/prompt-injection-sentinel model scores the content. Two actions are available: block (reject) or log (pass through and audit).
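The interaction between the score, the threshold, and the configured action might look like the sketch below. The function name is hypothetical, and treating a score equal to the threshold as a detection is an assumption; this page does not say whether the comparison is inclusive.

```python
def injection_verdict(score: float, threshold: float = 0.8, action: str = "block") -> str:
    """Map a model score to a verdict: 'allow', 'block', or 'log'."""
    if score < threshold:
        return "allow"  # below the sensitivity threshold: pass through
    # At or above the threshold (assumed inclusive), apply the configured action.
    return "block" if action == "block" else "log"


print(injection_verdict(0.93))
print(injection_verdict(0.50))
print(injection_verdict(0.93, action="log"))
```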
Stage 6 (transformation rules): The proxy applies TransformationRule regex patterns to request or response bodies. Rules are direction-aware: outbound rules apply to requests, inbound rules apply to responses. This is useful for stripping API keys from outbound headers or removing injection patterns from inbound content.
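Direction-aware rewriting can be sketched with plain re.sub calls. The Rule dataclass and apply_rules function below are stand-ins written for this example and are not the SDK's TransformationRule; only the regex and replacement string are taken from the policy example on this page.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical stand-in for a transformation rule."""
    direction: str  # "outbound" (requests) or "inbound" (responses)
    match: str      # regular expression to search for
    replace: str    # replacement text


def apply_rules(text: str, direction: str, rules) -> str:
    """Apply every rule whose direction matches the traffic direction."""
    for rule in rules:
        if rule.direction == direction:
            text = re.sub(rule.match, rule.replace, text)
    return text


rules = [
    Rule(
        direction="outbound",
        match=r"Authorization:\s*Bearer\s+sk-\w+",
        replace="Authorization: Bearer [REDACTED]",
    ),
]

request = "POST /v1/chat\nAuthorization: Bearer sk-abc123\n"
print(apply_rules(request, "outbound", rules))
print(apply_rules(request, "inbound", rules) == request)
```

The second call leaves the text unchanged because the only rule is outbound, which mirrors the direction-aware behavior described above.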
Stage 7 (audit logging): The proxy records every intercepted request and response with metadata: timestamp, source, destination, method, status, redaction events, injection detections. Audit entries are streamed to the host orchestrator and can be retrieved via the SDK's get_audit_log() method.
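For a sense of how audit data might be consumed, the sketch below models an entry with the metadata fields listed above and filters for entries that recorded a redaction or injection detection. The AuditRecord class, its field names, and the sample values are hypothetical; the real AuditEntry type returned by get_audit_log() may be shaped differently.

```python
from dataclasses import dataclass, field


@dataclass
class AuditRecord:
    """Hypothetical shape of one audit entry, based on the listed metadata."""
    timestamp: str
    source: str
    destination: str
    method: str
    status: int
    redactions: list = field(default_factory=list)
    injection_detections: list = field(default_factory=list)


def flagged(entries):
    """Return entries that recorded a redaction or an injection detection."""
    return [e for e in entries if e.redactions or e.injection_detections]


# Made-up sample data for illustration.
entries = [
    AuditRecord("2025-01-01T00:00:00Z", "sandbox", "api.openai.com", "POST", 200,
                redactions=["ssn"]),
    AuditRecord("2025-01-01T00:00:05Z", "sandbox", "pypi.org", "GET", 200),
]
for entry in flagged(entries):
    print(entry.destination, entry.redactions)
```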

Composability

Each security component is independent. You can enable any combination:
# Network-only policy (no body scanning)
policy = SecurityPolicy(
    network=NetworkPolicy(allow_out=["pypi.org"], deny_out=[ALL_TRAFFIC]),
)

# PII-only (scan all traffic, no network restriction)
policy = SecurityPolicy(
    pii=PIIConfig(enabled=True, types=["ssn", "credit_card"]),
)

# Full stack
policy = SecurityPolicy(
    pii=PIIConfig(enabled=True, types=["ssn", "email", "credit_card"]),
    injection_defense=True,
    network=NetworkPolicy(allow_out=["*.openai.com"], deny_out=[ALL_TRAFFIC]),
    transformations=[...],
    audit=True,
)
TLS interception (Stage 3) activates automatically when pii.enabled=True or transformations are configured. It remains off when only network policies or audit logging are used, so there is no TLS overhead for pure network restriction use cases.
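The activation rule stated above reduces to a simple predicate. The function below is a hypothetical restatement of that rule for clarity, not code from the proxy.

```python
def needs_tls_interception(pii_enabled: bool = False, transformations=()) -> bool:
    """Return True when body inspection requires terminating TLS."""
    return bool(pii_enabled) or len(transformations) > 0


# Network-only or audit-only policies: no interception.
print(needs_tls_interception())
# PII scanning enabled: interception is required.
print(needs_tls_interception(pii_enabled=True))
# Any transformation rule configured: interception is required.
print(needs_tls_interception(transformations=["rule"]))
```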

Shorthand forms

Several fields accept shorthand values for common configurations:
# injection_defense=True is equivalent to InjectionDefenseConfig(enabled=True)
policy = SecurityPolicy(injection_defense=True)

# audit=True is equivalent to AuditConfig(enabled=True)
policy = SecurityPolicy(audit=True)

# network dict is equivalent to NetworkPolicy(**dict)
policy = SecurityPolicy(
    network={"allow_out": ["pypi.org"], "deny_out": [ALL_TRAFFIC]}
)
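
One way such shorthand values could be normalized is sketched below. The AuditSettings and NetworkSettings classes and the coerce helper are hypothetical stand-ins written for this example; they are not the SDK's classes, and the SDK's own coercion logic may differ.

```python
from dataclasses import dataclass, field


@dataclass
class AuditSettings:
    """Hypothetical stand-in for a config class with an enabled flag."""
    enabled: bool = False


@dataclass
class NetworkSettings:
    """Hypothetical stand-in for a network config class."""
    allow_out: list = field(default_factory=list)
    deny_out: list = field(default_factory=list)


def coerce(value, cls):
    """Turn a shorthand value (bool or dict) into an instance of cls."""
    if isinstance(value, cls):
        return value  # already a full config object
    if isinstance(value, bool):
        return cls(enabled=value)  # True/False toggles the feature
    if isinstance(value, dict):
        return cls(**value)  # dict keys become constructor arguments
    raise TypeError(f"cannot build {cls.__name__} from {type(value).__name__}")


print(coerce(True, AuditSettings))
print(coerce({"allow_out": ["pypi.org"]}, NetworkSettings))
```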

Security sections

| Page | What it covers |
| --- | --- |
| PII Redaction | PIIConfig, detection types, redact/block/log actions |
| Prompt Injection Defense | InjectionDefenseConfig, sensitivity thresholds, ML model |
| Network Policies | NetworkPolicy, domain filtering, CIDR rules |
| Transformation Rules | TransformationRule, regex patterns, directions |
| Audit Logging | AuditConfig, AuditEntry, retrieval |
| Env Secrets | EnvSecurityConfig, SecureEnvVar, masking patterns |
| Guardrails Service | ML-powered scanning, Presidio, HuggingFace, deployment |