What You’ll Learn

  • Connecting to a local LLM via the OpenAI client with a base_url override
  • Checking if the local LLM server is reachable before attempting requests
  • Stripping markdown code fences from LLM responses before execution
  • Writing generated code into a Declaw sandbox filesystem
  • Executing the code securely with sbx.commands.run()
  • Falling back gracefully to demo mode when the LLM server is not available

Prerequisites

  • Declaw instance running and DECLAW_API_KEY / DECLAW_DOMAIN set
  • A running local LLM server (optional — the example runs in demo mode without it)
Install the dependencies:
pip install declaw python-dotenv openai
Start a local LLM server. With Ollama:
ollama serve
ollama pull llama3.2
Set LOCAL_LLM_URL and LOCAL_LLM_MODEL in .env if your setup differs from the defaults (http://localhost:11434/v1 and llama3.2).
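A minimal .env for this example might look like the following (the values shown are the defaults mentioned above; adjust them to match your server):

```shell
LOCAL_LLM_URL=http://localhost:11434/v1
LOCAL_LLM_MODEL=llama3.2
```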

Code Walkthrough

This example is available in Python. TypeScript support coming soon.

1. Check if the local server is reachable

Before making inference requests, probe the /models endpoint that most OpenAI-compatible servers expose:
import os
import urllib.request
import urllib.error

def is_llm_reachable(base_url: str) -> bool:
    try:
        req = urllib.request.Request(f"{base_url}/models", method="GET")
        urllib.request.urlopen(req, timeout=5)
        return True
    except (urllib.error.URLError, OSError):
        return False

base_url = os.environ.get("LOCAL_LLM_URL", "http://localhost:11434/v1")
if not is_llm_reachable(base_url):
    print(f"Local LLM at {base_url} is not reachable.")
    demo_mode()  # fall back to the pre-written demo (step 4)
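
The probe fails fast on connection errors, which is what makes the fallback cheap. A quick sanity check against a port that is almost certainly not serving anything (port 1, chosen here purely as a dead endpoint) exercises the failure path:

```python
import urllib.request
import urllib.error

def is_llm_reachable(base_url: str) -> bool:
    # Same helper as above: probe the /models endpoint most
    # OpenAI-compatible servers expose.
    try:
        req = urllib.request.Request(f"{base_url}/models", method="GET")
        urllib.request.urlopen(req, timeout=5)
        return True
    except (urllib.error.URLError, OSError):
        return False

# A closed port is refused immediately, so this returns False
print(is_llm_reachable("http://localhost:1/v1"))
```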

2. Connect via OpenAI client with base_url override

Any OpenAI-compatible server works — Ollama, vLLM, LM Studio, llama.cpp server, and more. Pass api_key="not-needed" since local servers typically skip authentication:
import openai
import os

base_url = os.environ.get("LOCAL_LLM_URL", "http://localhost:11434/v1")
model = os.environ.get("LOCAL_LLM_MODEL", "llama3.2")

client = openai.OpenAI(base_url=base_url, api_key="not-needed")

response = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "system",
            "content": (
                "You are a Python code interpreter. When asked a question, "
                "respond ONLY with Python code that computes the answer and "
                "prints it. No markdown, no explanation, just code."
            ),
        },
        {
            "role": "user",
            "content": "Write a function that checks if a string is a palindrome and test it with 5 examples",
        },
    ],
    temperature=0,
)

3. Strip code fences and execute in a sandbox

Models often wrap their output in markdown fences even when instructed not to, so strip them before writing the file into the sandbox:

from declaw import Sandbox

def strip_code_fences(code: str) -> str:
    code = code.strip()
    if code.startswith("```"):
        code = "\n".join(code.split("\n")[1:])
    if code.endswith("```"):
        code = "\n".join(code.split("\n")[:-1])
    return code.strip()

code = strip_code_fences(response.choices[0].message.content or "")

sbx = Sandbox.create(template="python", timeout=300)
try:
    sbx.files.write("/tmp/solution.py", code)
    result = sbx.commands.run("python3 /tmp/solution.py", timeout=30)
    print(result.stdout)
finally:
    sbx.kill()
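
The helper handles both fenced and plain responses; a quick sanity check (reproducing the helper defined above) shows the fence and its language tag are removed while bare code passes through unchanged:

```python
def strip_code_fences(code: str) -> str:
    code = code.strip()
    if code.startswith("```"):
        # Drop the opening fence line, including any language tag
        code = "\n".join(code.split("\n")[1:])
    if code.endswith("```"):
        # Drop the closing fence line
        code = "\n".join(code.split("\n")[:-1])
    return code.strip()

print(strip_code_fences("```python\nprint('hi')\n```"))  # → print('hi')
print(strip_code_fences("print('hi')"))                  # → print('hi')
```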

4. Demo mode (no LLM server needed)

When no local LLM server is reachable, the example executes a pre-written script instead, exercising the same sandbox workflow end to end:

code = """\
import json

data = {
    "name": "Declaw Sandbox",
    "version": "1.0",
    "features": ["secure execution", "file I/O", "network policies"],
}

print("Sandbox Info:")
print(json.dumps(data, indent=2))

# Simple matrix multiplication
matrix_a = [[1, 2], [3, 4]]
matrix_b = [[5, 6], [7, 8]]
result = [
    [sum(a * b for a, b in zip(row_a, col_b))
     for col_b in zip(*matrix_b)]
    for row_a in matrix_a
]
print(f"\\nMatrix multiplication result: {result}")
"""

sbx = Sandbox.create(template="python", timeout=300)
try:
    sbx.files.write("/tmp/demo.py", code)
    result = sbx.commands.run("python3 /tmp/demo.py", timeout=30)
    print(result.stdout)
finally:
    sbx.kill()
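
The matrix product in the demo script relies on zip(*matrix_b) to transpose the right-hand matrix so that each col_b is a column; run outside the sandbox, the same comprehension produces the result shown in the expected output below:

```python
matrix_a = [[1, 2], [3, 4]]
matrix_b = [[5, 6], [7, 8]]

# zip(*matrix_b) yields the columns (5, 7) and (6, 8);
# each entry is the dot product of a row of A with a column of B.
result = [
    [sum(a * b for a, b in zip(row_a, col_b))
     for col_b in zip(*matrix_b)]
    for row_a in matrix_a
]
print(result)  # → [[19, 22], [43, 50]]
```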

Expected Output

============================================================
Local LLM Code Interpreter with Declaw Sandbox
============================================================

LLM endpoint: http://localhost:11434/v1
LLM model: llama3.2

Local LLM at http://localhost:11434/v1 is not reachable. Showing demo mode.

To use a local LLM, start Ollama or another OpenAI-compatible server:
  ollama serve
  ollama pull llama3.2

--- Demo: Running pre-written code in Declaw sandbox ---

Sandbox created: sbx_abc123
  Output:
Sandbox Info:
{
  "name": "Declaw Sandbox",
  "version": "1.0",
  "features": [
    "secure execution",
    "file I/O",
    "network policies"
  ]
}

Matrix multiplication result: [[19, 22], [43, 50]]

Sandbox killed.