What You’ll Learn
- How to upload a multi-file Python project into a sandbox
- How to run unittest inside the sandbox and capture output
- How to parse test results (pass/fail, test count, failure count) from stdout
- How to demonstrate a regression by uploading a buggy version and re-running tests
- The sandbox-as-CI-runner pattern for safe, isolated test execution
Prerequisites
- Declaw running locally or in the cloud (see Deployment)
- DECLAW_API_KEY and DECLAW_DOMAIN set in your environment
This example is available in Python. TypeScript support coming soon.
Code Walkthrough
1. Define the project files
Both the module under test and the test file are defined as Python strings in the driver script and later uploaded to the sandbox. Only two of the test methods are shown here; the Expected Output section lists five:
CALCULATOR_MODULE = """\
class Calculator:
    def add(self, a: float, b: float) -> float:
        return a + b

    def subtract(self, a: float, b: float) -> float:
        return a - b

    def multiply(self, a: float, b: float) -> float:
        return a * b

    def divide(self, a: float, b: float) -> float:
        if b == 0:
            raise ValueError("Cannot divide by zero")
        return a / b
"""

TEST_CALCULATOR = """\
import unittest
from calculator import Calculator

class TestCalculator(unittest.TestCase):
    def setUp(self):
        self.calc = Calculator()

    def test_add(self):
        self.assertEqual(self.calc.add(2, 3), 5)

    def test_divide_by_zero(self):
        with self.assertRaises(ValueError):
            self.calc.divide(1, 0)
"""
2. Upload files and run the test suite
from declaw import Sandbox

sbx = Sandbox.create(template="python", timeout=300)
try:
    sbx.files.write("/home/user/project/calculator.py", CALCULATOR_MODULE)
    sbx.files.write("/home/user/project/test_calculator.py", TEST_CALCULATOR)
    result = sbx.commands.run(
        "cd /home/user/project && python3 -m unittest test_calculator -v 2>&1"
    )
    print(result.stdout)
    print(f"Exit code: {result.exit_code}")
finally:
    sbx.kill()
The 2>&1 redirect merges stderr into stdout so that the test output is captured in result.stdout. By default, unittest's text runner writes its whole report, including the final summary, to stderr.
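To see why the redirect matters, the following local sketch runs a one-test suite with plain `subprocess` (no sandbox involved) and captures the two streams separately. The file name `test_demo.py` and the helper `run_unittest_streams` are made up for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# A throwaway one-test suite, written to a temporary directory below.
TEST_SOURCE = """\
import unittest

class TestDemo(unittest.TestCase):
    def test_truth(self):
        self.assertTrue(True)
"""


def run_unittest_streams() -> tuple[str, str]:
    """Run the tiny suite and return (stdout, stderr) captured separately."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "test_demo.py").write_text(TEST_SOURCE)
        proc = subprocess.run(
            [sys.executable, "-m", "unittest", "test_demo", "-v"],
            cwd=tmp,              # so that `test_demo` is importable
            capture_output=True,  # keep the two streams apart
            text=True,
        )
    return proc.stdout, proc.stderr


stdout, stderr = run_unittest_streams()
print(f"stdout is empty: {stdout == ''}")
print(f"stderr has the summary: {'Ran 1 test' in stderr}")
```

Without `2>&1`, a caller that reads only stdout would see none of the report, which is why the sandbox command merges the streams before capture.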
3. Parse test results
def parse_test_output(stdout: str) -> dict:
    """Parse unittest output and return a summary dict."""
    lines = stdout.strip().splitlines()
    summary = {"passed": False, "total_tests": 0, "failures": 0, "errors": 0}
    for line in lines:
        if line.startswith("Ran "):
            # e.g. "Ran 5 tests in 0.001s"
            summary["total_tests"] = int(line.split()[1])
        if line.startswith("OK"):
            # Final status line of a green run: "OK" or "OK (skipped=1)"
            summary["passed"] = True
        if line.startswith("FAILED"):
            # e.g. "FAILED (failures=2, errors=1)"
            summary["passed"] = False
            for key in ("failures", "errors"):
                if f"{key}=" in line:
                    summary[key] = int(
                        line.split(f"{key}=")[1].split(")")[0].split(",")[0]
                    )
    return summary
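As an alternative to the string splitting above, the same summary can be extracted with regular expressions. This is a minimal sketch, not the example's own parser: the name `parse_unittest_summary` and the extra `skipped` field are assumptions added for illustration.

```python
import re

COUNT_KEYS = ("failures", "errors", "skipped")


def parse_unittest_summary(output: str) -> dict:
    """Extract counts from unittest's 'Ran N tests' line and final status line."""
    summary = {"passed": False, "total_tests": 0,
               "failures": 0, "errors": 0, "skipped": 0}

    ran = re.search(r"^Ran (\d+) tests?", output, re.MULTILINE)
    if ran:
        summary["total_tests"] = int(ran.group(1))

    # The status line is "OK", "OK (skipped=1)" or "FAILED (failures=2, errors=1)".
    status = re.search(r"^(OK|FAILED)(?: \((.*)\))?[ \t]*$", output, re.MULTILINE)
    if status:
        summary["passed"] = status.group(1) == "OK"
        # Split on ", " so that "expected failures=1" is not read as "failures".
        for part in (status.group(2) or "").split(", "):
            key, _, value = part.partition("=")
            if key in COUNT_KEYS and value.isdigit():
                summary[key] = int(value)
    return summary


SAMPLE = """\
test_add ... FAIL
test_divide ... ok

Ran 5 tests in 0.001s

FAILED (failures=2, errors=1)
"""
print(parse_unittest_summary(SAMPLE))
```

Anchoring both patterns to the start of a line keeps test names or log messages that merely contain "OK" from being mistaken for the status line.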
4. Inject a bug and re-run
The example demonstrates a failing build by uploading a buggy version of the calculator:
# Run this inside the same try block as step 2, before sbx.kill() is called.
buggy_module = CALCULATOR_MODULE.replace(
    "return a + b",
    "return a - b  # BUG: subtraction instead of addition"
)
sbx.files.write("/home/user/project/calculator.py", buggy_module)

result2 = sbx.commands.run(
    "cd /home/user/project && python3 -m unittest test_calculator -v 2>&1"
)
report2 = parse_test_output(result2.stdout)
print(f"Status: {'PASS' if report2['passed'] else 'FAIL'}")
print(f"Failures: {report2['failures']}")
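Steps 2 and 4 repeat the same upload-then-run cycle, so it can be factored into one helper. The sketch below uses only the calls already shown in this walkthrough (`sbx.files.write`, `sbx.commands.run`, `result.exit_code`, `result.stdout`). The helper name `upload_and_test` is hypothetical, and the stand-in sandbox exists only so that the snippet runs without a Declaw deployment.

```python
from types import SimpleNamespace
from typing import Any

PROJECT_DIR = "/home/user/project"
TEST_COMMAND = f"cd {PROJECT_DIR} && python3 -m unittest test_calculator -v 2>&1"


def upload_and_test(sbx: Any, files: dict[str, str]) -> tuple[int, str]:
    """Write each file into the project directory, run the suite,
    and return (exit_code, combined output)."""
    for name, content in files.items():
        sbx.files.write(f"{PROJECT_DIR}/{name}", content)
    result = sbx.commands.run(TEST_COMMAND)
    return result.exit_code, result.stdout


# Stand-in sandbox: it records writes and returns a canned result.
# It is NOT part of the Declaw SDK; pass the real `sbx` in practice.
written: dict[str, str] = {}
fake_sbx = SimpleNamespace(
    files=SimpleNamespace(
        write=lambda path, content: written.update({path: content})
    ),
    commands=SimpleNamespace(
        run=lambda cmd: SimpleNamespace(exit_code=1, stdout="FAILED (failures=1)")
    ),
)

exit_code, output = upload_and_test(fake_sbx, {"calculator.py": "# buggy module"})
print(exit_code, sorted(written))
```

With the real sandbox, the clean run would pass both project files and the buggy run would pass only the modified `calculator.py`, since the test file is already in place.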
Expected Output
--- Creating Sandbox ---
Sandbox created: sbx-abc123
--- Uploading Project Files ---
Uploaded: calculator.py
Uploaded: test_calculator.py
--- Running Test Suite ---
Test output:
test_add ... ok
test_divide ... ok
test_divide_by_zero ... ok
test_multiply ... ok
test_subtract ... ok
Ran 5 tests in 0.001s
OK
Exit code: 0
--- Test Report ---
Status: PASS
Total tests: 5
Failures: 0
Errors: 0
--- Injecting a Bug and Re-running ---
Uploaded buggy calculator.py
Test output:
test_add ... FAIL
...
FAILED (failures=2)
Exit code: 1
--- Buggy Test Report ---
Status: FAIL
Total tests: 5
Failures: 2
Errors: 0
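The two reports above can also be compared programmatically to flag a regression. The following is a small sketch under the assumption that both reports are dicts shaped like the one returned by parse_test_output; the function name `find_regression` is hypothetical.

```python
def find_regression(baseline: dict, candidate: dict) -> list[str]:
    """Return human-readable reasons why `candidate` is worse than `baseline`."""
    reasons = []
    if baseline["passed"] and not candidate["passed"]:
        reasons.append("suite went from PASS to FAIL")
    for key in ("failures", "errors"):
        delta = candidate[key] - baseline[key]
        if delta > 0:
            reasons.append(f"{key} increased by {delta}")
    if candidate["total_tests"] < baseline["total_tests"]:
        reasons.append("fewer tests ran than in the baseline")
    return reasons


# Values copied from the two reports in the expected output.
clean = {"passed": True, "total_tests": 5, "failures": 0, "errors": 0}
buggy = {"passed": False, "total_tests": 5, "failures": 2, "errors": 0}

for reason in find_regression(clean, buggy):
    print(reason)
```

An empty list means the candidate run is no worse than the baseline, which makes the result easy to turn into a CI pass/fail decision.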
Why Use Declaw for CI
Running tests directly on a CI runner (GitHub Actions, CircleCI, Jenkins) means:
- Untrusted test code can access the runner’s environment variables, credentials, and filesystem
- A compromised dependency in the test suite can exfiltrate CI secrets
- A runaway test process can consume all runner resources and block other jobs
With Declaw:
- Each test run gets its own isolated Firecracker microVM with no host access
- Add allow_internet_access=False to prevent network access during tests
- Add SecurityPolicy with PII scanning to prevent credential exfiltration even if a test makes outbound calls
- Sandboxes are destroyed after each run — no state leaks between runs