Proprietary chat platforms bundle their coding agent capabilities behind closed ecosystems. Terminal-native agents like Claude Code and Codex are powerful but leave non-developers behind. Lathe occupies the space between: the conversational interface of a web chat, paired with the full sandbox capabilities of a coding agent.
The model writes and runs code in a sandbox, right inside your chat conversation. No terminal needed. Users just talk.
Full Linux VM with bash, file editing, git, package managers, and persistent state across sessions. Not a toy: a real machine.
Works with any model your Open WebUI instance can reach, open-weight or proprietary. No vendor lock-in on the intelligence.
Every user gets their own isolated sandbox. Admins install once; users don't configure anything. Built for shared deployments.
Thirteen tools, same caliber as the best proprietary agents. Each one runs against a persistent Linux sandbox that follows you anywhere your browser does.
Built-in manual. The model calls this to learn the sandbox model, workflows, and gotchas before its first real tool use.
Run any shell command. Install packages, compile code, run tests, manage git repos.
Read files with line numbers. Supports offset/limit for large files.
Create or overwrite files. Parent directories are created automatically.
Exact string replacement in files. Safer than rewriting the whole file.
Search for files by pattern. Returns a hierarchical listing that collapses dense directories.
Search file contents by regex. Returns matches grouped by file with line numbers.
Persistent Python REPL. Variables, imports, and definitions survive across calls within the same conversation. Built for iterative data exploration.
One-step file browser ("dufs"), browser IDE ("code-server"), public URL for any web service ("http:<port>"), or time-limited SSH command ("ssh").
Hand off a multi-step task to a sub-agent that works the same sandbox in its own context window and reports back with a summary.
Load project context: always lists the directory, plus any agent instructions and skills it finds in the repo.
Wipe your sandbox VM and start fresh. Persistent volume data survives. Requires explicit confirmation.
Wrap up a long session. The model writes a structured handoff document you paste into a new conversation to pick up where you left off.
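The exact-replacement behavior of the edit tool can be pictured like this. A minimal sketch, not Lathe's implementation: requiring the old string to match exactly once is what makes this safer than rewriting the whole file.

```python
from pathlib import Path

def edit(path: str, old: str, new: str) -> None:
    """Replace an exact string once; refuse missing or ambiguous matches."""
    p = Path(path)
    text = p.read_text()
    if text.count(old) != 1:
        raise ValueError("old must appear exactly once in the file")
    p.write_text(text.replace(old, new, 1))
```

An ambiguous match (the target string appearing twice) is rejected instead of silently editing the wrong spot.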
Some tasks take many steps: cloning a repo, reading a dozen files, running tests, fixing failures, re-running. If the main agent does all of that inline, every intermediate result accumulates in the conversation, crowding out the parts you care about.
delegate() hands that work to a sub-agent: same model, same
sandbox, but its own context window. The sub-agent works autonomously
through however many steps it needs (up to 30), then sends a concise
summary back as a single tool result.
Each delegated job runs in the foreground for a configurable timeout (default 30 seconds). If the sub-agent finishes quickly, the result comes back immediately. If it takes longer, the job moves to the background automatically, and the main agent stays responsive to you while the sub-agent keeps working. Background results are written to sidecar files in the sandbox that the main agent can check later.
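The foreground-then-background pattern can be sketched with a thread and a bounded join. This is an illustration of the idea, not Lathe's actual code; the function name and return convention are assumptions.

```python
import threading

def run_with_timeout(job, timeout=30.0):
    """Wait up to `timeout` seconds for `job`; if it's still running,
    let it continue in the background and return None so the caller
    can check for the result later (Lathe uses sidecar files)."""
    result = {}
    thread = threading.Thread(target=lambda: result.update(value=job()))
    thread.start()
    thread.join(timeout)  # blocks at most `timeout` seconds
    if thread.is_alive():
        return None  # job keeps working in the background
    return result["value"]
```

A fast job returns its value immediately; a slow one yields `None` and the main agent stays responsive.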
The sub-agent gets bash, read, write,
edit, glob, and grep. It can't
expose URLs, destroy the sandbox, or spawn
further sub-agents. Pass context_files to inject project
docs directly into the sub-agent's prompt so it doesn't waste steps
on orientation.
interpret() runs Python code in a persistent session
tied to your conversation. Variables, imports, and function definitions
survive across calls: define a DataFrame in one message, filter it in
the next, plot it in a third. No re-importing, no re-loading.
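The persistence model can be pictured as a single namespace reused across calls, roughly like this (a minimal sketch, not Lathe's implementation):

```python
import contextlib
import io

class PersistentRepl:
    """One shared namespace; each run() call sees everything defined before."""
    def __init__(self):
        self.namespace: dict = {}

    def run(self, code: str) -> str:
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self.namespace)  # reuses the same namespace every call
        return buffer.getvalue()

repl = PersistentRepl()
repl.run("total = 2 + 2")             # state survives this call
print(repl.run("print(total * 10)"))  # 40
```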
This is the closest Lathe gets to ChatGPT's code interpreter experience. The model writes Python, executes it, sees the output, and iterates, all without leaving the chat. State resets when you start a new conversation (your sandbox files still persist, only the in-memory REPL session is per-chat).
Iterative exploration, data analysis, incremental computation. State carries over between calls. Pure Python.
Shell commands, package installs, git, running servers, anything that isn't Python or needs process isolation.
For anything long-running, use bash() instead. The REPL runs synchronous Python: use asyncio.run() if you need await. Anything that prompts for input (input()) will hang, so provide values in code.

Real prompts you can send to a model with Lathe enabled. The model figures out which tools to call.
env_vars in your user settings (e.g. {"GITHUB_TOKEN":"ghp_..."}). Every bash command gets them; the model never sees the values.
ChatGPT's code interpreter resets every conversation. Lathe doesn't. Your sandbox is yours: files, installed packages, git repos, running servers. Start a conversation Monday, come back Wednesday, everything is where you left it.
The sandbox stops after ~15 minutes of idle, then archives about an hour
later. Either way, the next tool call wakes it transparently: your
files and installed packages survive both. Only running processes are lost.
If you want a clean slate, ask the model to destroy it.
Your files persist, but conversation context doesn't. Every new chat
starts cold. When a session runs long, ask the model to
handoff: it writes a structured summary you paste into a fresh
conversation, and the new agent picks up with full access to your sandbox.
Lathe takes a filesystem snapshot on the first tool call of each conversation, so it can detect what changed during the session. If a delegated task finishes in the background, a completion notice is automatically prepended to the next tool result the model sees, so it can reference background work without being explicitly told about it.
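The change-detection idea is essentially snapshot-and-diff. A toy version (assumed mechanics, not Lathe's actual code) might hash every file at session start and compare later:

```python
import hashlib
from pathlib import Path

def snapshot(root: str) -> dict[str, str]:
    # Map each file path to a hash of its contents.
    return {
        str(p): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in Path(root).rglob("*")
        if p.is_file()
    }

def changed_paths(before: dict[str, str], after: dict[str, str]) -> set[str]:
    # Files added, removed, or modified between two snapshots.
    return {p for p in before.keys() | after.keys() if before.get(p) != after.get(p)}
```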
Most users need zero configuration. But if you work with APIs or private repos, you can set environment variables that get injected into every shell command:
Values are never shown to the model. They're exported in the shell preamble with single-quoted values to prevent expansion.
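The single-quoting can be sketched with Python's shlex.quote, which emits a safely quoted string whenever the raw value contains shell-special characters. Illustrative only; the exact preamble Lathe writes may differ.

```python
import shlex

def build_preamble(env_vars: dict[str, str]) -> str:
    # One export line per variable; quoting blocks expansion of $, spaces, etc.
    return "\n".join(
        f"export {name}={shlex.quote(value)}"
        for name, value in env_vars.items()
    )

print(build_preamble({"API_KEY": "s3cr3t $value"}))
# export API_KEY='s3cr3t $value'
```

Inside single quotes the shell treats `$value` as literal text, so the secret is never expanded or logged by accident.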
Lathe is a single Python file (lathe.py) deployed as an
Open WebUI Tool.
It needs a Daytona API key and a
deployment label. Configure these as admin Valves after install.
See the README for full deployment instructions, valve reference, and testing guide.