What You’ll Learn
- How to call the Anthropic Messages API and feed the response directly into a sandbox
- Structuring prompts to get code-only output (no markdown, no explanation)
- Sandbox-per-task isolation: fresh microVM for each code generation request
- Demo mode: run the full workflow without an Anthropic API key
Prerequisites
- Declaw running locally or in the cloud (see Deployment)
- DECLAW_API_KEY and DECLAW_DOMAIN set in your environment
- ANTHROPIC_API_KEY set in your environment (optional — demo mode runs without it)
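For example, with placeholder values (substitute your own credentials and domain):

```shell
# Placeholder values; substitute your own credentials and domain.
export DECLAW_API_KEY="dc_your_key_here"
export DECLAW_DOMAIN="declaw.example.com"

# Optional: enables live mode; omit it to run the demo workflow.
export ANTHROPIC_API_KEY="sk-ant-your_key_here"
```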
This example is available in Python. TypeScript support coming soon.
Code Walkthrough
Live mode (requires ANTHROPIC_API_KEY)
import anthropic
from declaw import Sandbox

client = anthropic.Anthropic()

task = "Write a Python script that computes the Fibonacci sequence up to the 20th number."

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": (
                "You are a code generation agent. Given a task, respond "
                "ONLY with Python code that accomplishes the task and "
                "prints the results. No markdown fences, no explanation.\n\n"
                f"Task: {task}"
            ),
        },
    ],
)

code = strip_code_fences(message.content[0].text)

sbx = Sandbox.create(template="python", timeout=300)
try:
    sbx.files.write("/tmp/generated.py", code)
    result = sbx.commands.run("python3 /tmp/generated.py", timeout=30)
    print(result.stdout)
finally:
    sbx.kill()
The Anthropic Messages API returns a list of content blocks. For text output, access message.content[0].text.
Prompt engineering for code-only output
The prompt tells Claude to respond only with Python code. Two key points:
- State "ONLY with Python code" explicitly.
- Add "No markdown fences, no explanation" — Claude sometimes wraps code in ```python blocks regardless.

Always call strip_code_fences() as a defensive measure, even when the prompt says not to include fences.
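One possible implementation of such a helper (a minimal sketch, not part of the Declaw SDK):

```python
import re

def strip_code_fences(text: str) -> str:
    """Remove a leading/trailing Markdown code fence if present."""
    text = text.strip()
    # Drop an opening fence such as ``` or ```python
    text = re.sub(r"^```[a-zA-Z0-9_+-]*\n", "", text)
    # Drop a trailing closing fence
    text = re.sub(r"\n```$", "", text)
    return text
```

The function is a no-op on fence-free input, so it is safe to apply unconditionally.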
Demo mode — text readability analysis
The demo mode runs a pre-written analysis script that computes Flesch Reading Ease for a sample text. The snippet below is abridged; the full script also prints the word, sentence, and syllable counts shown in the expected output:
import textwrap

from declaw import Sandbox

simulated_code = textwrap.dedent("""\
    import re

    text = (
        "The quick brown fox jumps over the lazy dog. "
        "Pack my box with five dozen liquor jugs."
    )
    sentences = [s.strip() for s in re.split(r'[.!?]+', text) if s.strip()]
    words = text.split()
    syllable_count = sum(
        max(1, len(re.findall(r'[aeiouy]+', w.lower())))
        for w in words
    )
    avg_sentence_length = round(len(words) / len(sentences), 2)
    avg_syllables_per_word = round(syllable_count / len(words), 2)
    flesch = round(206.835 - 1.015 * avg_sentence_length - 84.6 * avg_syllables_per_word, 1)
    print(f"Flesch Reading Ease: {flesch}")
""")

sbx = Sandbox.create(template="python", timeout=300)
try:
    sbx.files.write("/tmp/generated.py", simulated_code)
    result = sbx.commands.run("python3 /tmp/generated.py", timeout=30)
    print(result.stdout)
finally:
    sbx.kill()
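Note that the expected output below comes from the full demo script, which analyzes a longer sample (4 sentences, 36 words); the abridged snippet scores its own two-sentence text differently. Walking through the snippet's arithmetic on that text:

```python
import re

# Reproduce the snippet's counts for its two-sentence sample text.
text = (
    "The quick brown fox jumps over the lazy dog. "
    "Pack my box with five dozen liquor jugs."
)
sentences = [s.strip() for s in re.split(r'[.!?]+', text) if s.strip()]
words = text.split()
syllables = sum(max(1, len(re.findall(r'[aeiouy]+', w.lower()))) for w in words)

avg_sentence_length = round(len(words) / len(sentences), 2)  # 17 / 2  = 8.5
avg_syllables_per_word = round(syllables / len(words), 2)    # 22 / 17 = 1.29
flesch = round(206.835 - 1.015 * avg_sentence_length
               - 84.6 * avg_syllables_per_word, 1)
print(flesch)  # 89.1
```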
Expected Output (demo mode)
Agent in Sandbox (Anthropic) Example
============================================================
ANTHROPIC_API_KEY not set — running in demo mode.
--- Demo: Simulating Anthropic Claude agent workflow ---
Simulated task: Analyze text and compute readability statistics
Creating sandbox and executing...
Sandbox created: sbx-abc123
Output:
=== Text Readability Analysis ===
Sentences: 4
Words: 36
Characters (alpha): 151
Syllables (approx): 46
Avg word length: 4.19
Avg sentence length: 9.0
Avg syllables/word: 1.28
Flesch Reading Ease: 67.2
Readability: Easy to read
Sandbox sbx-abc123 killed.
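The demo banner above comes from a simple environment check. The mode dispatch can be sketched as (helper name is illustrative):

```python
import os

def choose_mode(env=os.environ) -> str:
    """Return "live" when an Anthropic key is configured, else "demo"."""
    return "live" if env.get("ANTHROPIC_API_KEY") else "demo"
```

The example then runs the live workflow or the pre-written demo script depending on the result.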
Comparison with OpenAI Example
The pattern is identical to the OpenAI example — only the API client and model differ. This makes it easy to swap providers:
| Provider | Client | Model |
|---|---|---|
| Anthropic | `anthropic.Anthropic()` | `claude-sonnet-4-20250514` |
| OpenAI | `openai.OpenAI()` | `gpt-4o-mini` |
The sandbox creation and execution steps are the same regardless of which LLM you use.
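A sketch of what the swap looks like in practice: two thin adapter functions sharing one prompt template (the names are illustrative, not part of either SDK):

```python
PROMPT = (
    "You are a code generation agent. Given a task, respond "
    "ONLY with Python code that accomplishes the task and "
    "prints the results. No markdown fences, no explanation.\n\n"
    "Task: {task}"
)

def complete_anthropic(prompt: str, model: str = "claude-sonnet-4-20250514") -> str:
    import anthropic  # imported lazily so the other adapter works without it
    msg = anthropic.Anthropic().messages.create(
        model=model,
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text

def complete_openai(prompt: str, model: str = "gpt-4o-mini") -> str:
    import openai
    resp = openai.OpenAI().chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def generate(complete, task: str) -> str:
    """Provider-agnostic entry point: pass either adapter (or any callable)."""
    return complete(PROMPT.format(task=task))
```

The rest of the workflow (strip fences, write to the sandbox, run, kill) consumes only the returned string, so it never needs to know which provider produced the code.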