The open-source agent harness
Substructure handles durability, orchestration, and real-time streaming.
Agents are just stateless HTTP endpoints. Write them in any language, deploy them anywhere.
The decision loop
Your worker is a pure function. Substructure sends a trigger and the current state. Your worker makes decisions, executes tools, and returns actions. Substructure handles everything else.
Trigger
A decision needs to be made or a tool needs to be executed. The server packages it with the current agent state and sends it to your worker.
Decide
Your worker receives the trigger, runs your agent logic, and returns a list of actions along with updated state.
Act
The server processes each action: calling the LLM, dispatching tool executions back to the worker, or spawning sub-agents.
Loop
Each completed action generates the next trigger and the loop continues. The worker returns done to end it.
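To make the loop concrete, here is a minimal sketch of a worker's decide step as a pure function. The `Trigger`, `Action`, and `AgentState` shapes and the `decide` function are hypothetical names invented for illustration; they are not Substructure's actual wire format or SDK.

```typescript
// Hypothetical shapes for illustration only; not Substructure's wire format.
type Message = { role: "user" | "assistant" | "tool"; content: string };
type AgentState = { messages: Message[] };

type Trigger =
  | { type: "message"; message: Message }
  | { type: "llm_result"; content: string; toolCall?: { name: string; args: string } }
  | { type: "tool_result"; name: string; output: string };

type Action =
  | { type: "call_llm"; messages: Message[] }
  | { type: "execute_tool"; name: string; args: string }
  | { type: "done"; result: string };

type Decision = { actions: Action[]; state: AgentState };

// Decide: a pure function of (trigger, state). Nothing is kept between calls.
function decide(trigger: Trigger, state: AgentState): Decision {
  switch (trigger.type) {
    case "message": {
      // A new user message arrived: record it and ask for an LLM call.
      const messages: Message[] = [...state.messages, trigger.message];
      return { actions: [{ type: "call_llm", messages }], state: { messages } };
    }
    case "llm_result": {
      // The LLM answered: either request a tool execution or finish.
      const messages: Message[] = [
        ...state.messages,
        { role: "assistant", content: trigger.content },
      ];
      const action: Action = trigger.toolCall
        ? { type: "execute_tool", name: trigger.toolCall.name, args: trigger.toolCall.args }
        : { type: "done", result: trigger.content };
      return { actions: [action], state: { messages } };
    }
    case "tool_result": {
      // A tool finished: record its output and go back to the LLM.
      const messages: Message[] = [
        ...state.messages,
        { role: "tool", content: trigger.output },
      ];
      return { actions: [{ type: "call_llm", messages }], state: { messages } };
    }
  }
}
```

Because `decide` holds nothing between calls, any worker instance can answer the next trigger as long as it is handed the state returned by the previous one.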
Durable state with stateless code
Workers don't hold state between calls. Instead, the server sends the current state with every decision request, and saves whatever the worker sends back.
For simple agents, inline the full state object. For larger workloads, pass a reference key and keep the state in your own storage. Either way, the server treats what you return as opaque bytes.
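The two options can be sketched as follows. This is only an illustration under assumptions: `StatePayload`, `saveState`, `loadState`, and the in-memory `store` are hypothetical names, and a JSON string stands in for the opaque bytes the server would persist.

```typescript
// Hypothetical sketch: the state returned to the server is either the full
// object or a small reference to a record kept in the worker's own storage.
type FullState = { messages: string[] };
type StatePayload =
  | { kind: "inline"; value: FullState }
  | { kind: "ref"; key: string };

// Stand-in for external storage (a database or object store in practice).
const store = new Map<string, FullState>();

// Build the payload to hand back: inline by default, a reference if a key is given.
function saveState(current: FullState, key?: string): StatePayload {
  if (key === undefined) return { kind: "inline", value: current };
  store.set(key, current);
  return { kind: "ref", key };
}

// Recover the full state from whatever payload arrives with the next request.
function loadState(payload: StatePayload): FullState {
  if (payload.kind === "inline") return payload.value;
  const found = store.get(payload.key);
  if (found === undefined) throw new Error(`no state stored under ${payload.key}`);
  return found;
}

// The server never inspects the payload; it only stores and returns it.
const serialize = (payload: StatePayload): string => JSON.stringify(payload);
const deserialize = (raw: string): StatePayload => JSON.parse(raw) as StatePayload;
```

With a reference key, the payload the server persists stays small no matter how large the conversation grows; the trade-off is that the worker's own storage has to be durable too.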
messages: 1 ✓ persisted
messages: 2, pending_tools: 1 ✓ persisted
messages: 3, pending_tools: 0 ✓ persisted
messages: 4 ✓ persisted
Real-time client connections
Clients stream every event as it happens: LLM tokens, tool calls, and sub-agent progress, all in real time. Disconnect and reconnect mid-session without losing a single event.
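One common way to get lossless reconnects is to number every event and let the client resume from the last number it saw. The sketch below illustrates that general technique only; `EventLog`, `StreamEvent`, and `readAfter` are hypothetical names, and nothing here is taken from Substructure's implementation.

```typescript
// Hypothetical sketch: each event gets a sequence number and is appended to a log.
type StreamEvent = { seq: number; type: string; data: string };

class EventLog {
  private events: StreamEvent[] = [];

  // Append an event and assign it the next sequence number (starting at 1).
  append(type: string, data: string): StreamEvent {
    const entry: StreamEvent = { seq: this.events.length + 1, type, data };
    this.events.push(entry);
    return entry;
  }

  // Return every event after the last sequence number the client saw.
  readAfter(lastSeq: number): StreamEvent[] {
    return this.events.filter((e) => e.seq > lastSeq);
  }
}

const log = new EventLog();
log.append("llm_token", "Sunny");
log.append("tool_call", "getWeather");

// The client reads the first two events, remembers its position, then disconnects.
let lastSeq = 0;
for (const e of log.readAfter(lastSeq)) lastSeq = e.seq;

// Events produced while the client is away stay in the log.
log.append("llm_token", " and clear");

// On reconnect the client sends lastSeq and receives only what it missed.
const missed = log.readAfter(lastSeq);
```

The client only has to remember one number; the log on the server side is what makes the replay possible.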
See it in action
substructure start --worker-url http://localhost:4444

// Worker: define the agent and serve it over HTTP on port 4444.
// getWeather is a tool assumed to be defined elsewhere in the worker.
const agent = defineAgent("weather-agent")
  .use(state())
  .use(systemMessage("You are a helpful weather assistant."))
  .use(messageHistory())
  .use(tools({ getWeather }))
  .use(llmLoop({
    request: { model: "anthropic/claude-opus-4.6-fast" },
    llm_client: "openrouter",
    retry: { timeout_secs: 120, max_retries: 3 },
  }));
const worker = new Worker([agent]);
Bun.serve({ port: 4444, fetch: worker.fetchHandler() });

// Client (a separate process): submit a message and stream the events.
// apiKey is assumed to be defined elsewhere.
const client = new BackendClient({ url: "http://localhost:9000", apiKey });
const stream = client.submit({
  agentId: "weather-agent",
  payload: {
    type: "message",
    message: { role: "user", content: "What's the weather in SF?" },
  },
  auth: { tenant_id: "default", sub: "user-1" },
});
for await (const event of stream) {
  console.log(event.payload.type);
}
const result = await stream.result;
console.log(result.data);
What Substructure handles
- Durable agent sessions
- Crash recovery and automatic retries
- Agent state persistence
- LLM call orchestration
- Tool call dispatch
- Sub-agent lifecycle management
- Cost tracking per session
- Real-time event streaming to clients
- Resumable client connections
- JWT auth for browser clients
- API key auth for backends
- Client-side tool execution
- Human-in-the-loop interrupts
- Embeddable or standalone server
Just the harness
The server handles durability, retries, and state so your worker doesn't have to. Your worker just runs logic and executes tool calls. Everything lives in version control and deploys however you want.