Autonomy Stack — The Complete Guide
How to make an agent network operate on its own 24/7 — safely, with human approval for every risky move
An agent network that talks to itself is a start. A network that also acts on its own (takes a task, executes it, verifies that it actually succeeded, and repairs itself when something breaks) is the autonomy stack. This guide describes the live system I (Elad) run on a private server: a durable task queue (SQLite) that holds every request even when my computer is off; a Worker that pulls one task, executes it, and dies (easy to debug, no memory leaks); a Firewall that blocks any dangerous action until I approve it with one click; a verify-on-result layer that proves a task truly succeeded rather than just 'ran'; an outcome ledger that measures every move; an Oracle that audits the system weekly and writes fix proposals; and, at the very top, a self-healing executor that applies a fix on its own (off by default, enabled only once trusted). Every LLM call passes through a model gateway with a $5/day cost cap, and a CRM and dashboard show the whole thing at a glance. For you, the same pattern turns a pile of bots waiting for orders into a system that works for you while you sleep.
What this guide covers
What is an autonomy stack?
The difference between an agent that reacts and a network that acts on its own
Autonomy is the layer that makes the agent network operate by itself — take a task, execute, verify, and repair — without a human standing behind every step. Unlike orchestration (which is about how agents coordinate), autonomy is about how the network acts safely on its own. Its three pillars: durability (the queue holds tasks even when everything is off), reliability (verify-on-result instead of 'the command ran'), and safety (a Firewall that stops every risky move until a human approves).
Task queue + Worker — the beating heart
A task enters a durable queue; the Worker pulls one, executes, and dies
The heart of autonomy is a durable task queue: every request, whether from me on WhatsApp, from another agent, or from a scheduled cron job, is written to a queue on the server (SQLite in my setup, simple and resilient). On top of it runs a Worker triggered every minute by a systemd timer: it pulls the next task, executes it, reports a result, and dies. This 'single-shot process' pattern (pull→execute→die) is the opposite of a long-lived bot: it is easy to debug, it cannot accumulate memory leaks across runs, and every run starts clean.
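A minimal sketch of the pull→execute→die loop, using Python's standard sqlite3 module. The table layout, the status names, and the functions open_queue, enqueue, claim_next, and run_once are illustrative assumptions, not the schema of the live system. Each invocation of run_once handles at most one task and then returns, so a timer can call it once a minute.

```python
import sqlite3

# Hypothetical schema: one row per task, with a status that moves
# queued -> running -> executed / failed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tasks (
    id      INTEGER PRIMARY KEY AUTOINCREMENT,
    kind    TEXT NOT NULL,
    payload TEXT NOT NULL,
    status  TEXT NOT NULL DEFAULT 'queued',
    result  TEXT
)
"""


def open_queue(path):
    # isolation_level=None puts the connection in autocommit mode,
    # so transactions are opened and closed explicitly below.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(SCHEMA)
    return conn


def enqueue(conn, kind, payload):
    cur = conn.execute(
        "INSERT INTO tasks (kind, payload) VALUES (?, ?)", (kind, payload)
    )
    return cur.lastrowid


def claim_next(conn):
    # BEGIN IMMEDIATE takes the write lock up front, so two workers
    # started at the same moment cannot claim the same row.
    conn.execute("BEGIN IMMEDIATE")
    row = conn.execute(
        "SELECT id, kind, payload FROM tasks "
        "WHERE status = 'queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is not None:
        conn.execute("UPDATE tasks SET status = 'running' WHERE id = ?", (row[0],))
    conn.execute("COMMIT")
    return row


def run_once(conn, handlers):
    """Pull one task, execute it, record the result, and return."""
    task = claim_next(conn)
    if task is None:
        return None  # nothing queued; the process simply exits
    task_id, kind, payload = task
    try:
        result, status = handlers[kind](payload), "executed"
    except Exception as exc:  # an unknown kind or a crashing handler both fail the task
        result, status = repr(exc), "failed"
    conn.execute(
        "UPDATE tasks SET status = ?, result = ? WHERE id = ?",
        (status, str(result), task_id),
    )
    return task_id, status
```

The status after a clean run is 'executed' rather than 'done' on purpose: in this sketch, only a later verification step would be allowed to promote a task to 'done'.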
Firewall + one-click approval — autonomy with a brake
A safe action runs on its own; a risky move waits for human approval
The big fear with an autonomous agent is 'what if it does something irreversible?' The Firewall is the answer: every task type has a safety rating. A safe task (summarize, classify, report, read) runs on its own immediately. A dangerous task (send an email to a client, publish a post, make a payment, delete) is blocked: it enters an approval queue, and I get an alert and approve or reject it with a single click. That's how you get real autonomy without giving up control.
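The gating rule can be sketched in a few lines of Python. The class name Firewall, the SAFE_KINDS set, and the return values are hypothetical. Treating any unlisted task type as risky (default-deny) is also an assumption of this sketch, chosen because a forgotten rating should fail closed.

```python
from dataclasses import dataclass, field

# Task types allowed to run without a human. Everything else waits.
SAFE_KINDS = {"summarize", "classify", "report", "read"}


@dataclass
class Firewall:
    # task_id -> task kind, for tasks waiting on a human decision
    pending: dict = field(default_factory=dict)

    def check(self, task_id, kind):
        """Return 'run' for safe kinds; park everything else for approval."""
        if kind in SAFE_KINDS:
            return "run"
        self.pending[task_id] = kind  # unknown kinds are parked too
        return "needs_approval"

    def decide(self, task_id, approved):
        """Apply the human's one-click decision to a parked task."""
        self.pending.pop(task_id)  # raises KeyError if the task was never parked
        return "run" if approved else "rejected"
```

A dashboard button or a chat reply would end up calling decide(task_id, approved=True) or decide(task_id, approved=False); how the alert is delivered is outside this sketch.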
Verify-on-result — 'ran' is not 'succeeded'
A separate step that proves the task actually hit its target
The classic automation mistake is assuming that 'the command didn't crash' means 'the task succeeded'. In practice many failures are silent: an email sent to the wrong address, a post that went up empty, a script that ran but did nothing. The verify-on-result layer separates execution from verification: after the Worker executes, a separate step checks objective evidence that the target was achieved. Only if verification passes is the task marked 'done'. Otherwise it is marked 'failed' and returned for handling.
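To make the split between execution and verification concrete, here is a small sketch for one invented task type, "write a report file". The function names and the min_chars threshold are assumptions for illustration. The point is that the verifier inspects the artifact itself and never trusts the executor's return value.

```python
import os


def execute_write_report(path, text):
    # The execution step: it can finish without error and still
    # produce something useless, such as an empty file.
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)


def verify_report(path, min_chars=1):
    """Check evidence on disk rather than the executor's exit status."""
    if not os.path.isfile(path):
        return False
    with open(path, encoding="utf-8") as fh:
        return len(fh.read().strip()) >= min_chars


def run_and_verify(execute, verify):
    """Mark a task 'done' only when an independent check passes."""
    try:
        execute()
    except Exception:
        return "failed"
    return "done" if verify() else "failed"
```

A report that contains only whitespace executes cleanly but fails verification, which is exactly the 'post that went up empty' case.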
Outcome ledger + system map — measuring every move
One source of truth that measures what worked, what failed, and what it cost
For a system to improve, it needs to know how it performs. The outcome ledger records every move: what was executed, whether verification passed, how long it took, and what it cost. Above it sits the system map, one declared source of truth describing every component, its owner, and its safety rating. Combining the two yields the 'shelfware signal': a component that is declared in the map but has never run successfully, which is an instant flag that something isn't wired up.
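The shelfware signal is a set difference between what the map declares and what the ledger shows as verified. The sketch below assumes a ledger of dictionaries with the hypothetical keys component, verified, seconds, and cost_usd, and a map keyed by component name; the real storage format is not described in this guide.

```python
def shelfware(system_map, ledger):
    """Components declared in the map that have no verified run in the ledger."""
    succeeded = {entry["component"] for entry in ledger if entry["verified"]}
    return sorted(name for name in system_map if name not in succeeded)


def total_cost(ledger):
    """Sum of the recorded cost of every move, in USD."""
    return sum(entry["cost_usd"] for entry in ledger)
```

A component that ran but never passed verification is flagged just like one that never ran at all, since neither has 'run successfully'.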
Self-healing — the system audits and repairs itself
The Oracle writes a fix proposal; at the top layer — an executor applies it on its own
The most advanced layer is a system that maintains itself. In my setup it works in tiers. In the first tier, the Oracle (Aurora) reviews the ledger and the map weekly, detects drift and failures, and writes a fix proposal (fix_proposal) that goes to human approval. In a higher tier, a self-healing executor (apply_remediation) can apply a fix on its own, but it is off by default and is enabled only once it is trusted for a specific failure type. In parallel, a knowledge custodian (brain_maintain) maintains the source of truth (index, gaps, and coherence) so the organizational brain doesn't rot either.
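The 'off by default, trusted per failure type' rule can be expressed as an allowlist that starts empty. In this sketch the FixProposal class, the route_fix function, and the failure-type strings are hypothetical; the sketch does not reproduce the actual interfaces of fix_proposal or apply_remediation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixProposal:
    failure_type: str
    description: str


def route_fix(proposal, trusted_failure_types=frozenset()):
    """Decide whether a proposed fix may be applied without a human.

    The allowlist is empty by default, so every proposal goes to human
    approval until a failure type is explicitly trusted.
    """
    if proposal.failure_type in trusted_failure_types:
        return "auto_apply"
    return "needs_human_approval"
```

Trust is granted one failure type at a time, for example route_fix(proposal, {"stale_index"}), so enabling automatic repair for one class of problem never widens it for another.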
Model gateway + cost control — autonomy isn't free
Every LLM call goes through one gateway with a $5 daily cap
A system that runs on its own 24/7 can also burn money on its own 24/7. The model gateway is the single chokepoint for all LLM calls in the network: it picks a model (free first, meaning Gemini or a local model, and a strong model only when needed), measures every call, and enforces a daily cost cap ($5 in my setup). When spending approaches the cap, it downgrades to cheaper models or stops non-critical actions. That's how autonomy stays economically sustainable.
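A rough sketch of the routing decision follows. The Gateway class, the model labels "free" and "strong", the est_cost_usd parameter, and the 80% soft limit are all assumptions made for illustration; only the $5 daily cap comes from the text. Resetting spent_usd at midnight is left out.

```python
from dataclasses import dataclass


@dataclass
class Gateway:
    daily_cap_usd: float = 5.0
    soft_ratio: float = 0.8  # assumed: start degrading at 80% of the cap
    spent_usd: float = 0.0

    def choose(self, needs_strong, critical, est_cost_usd=0.0):
        """Pick a model tier, or return None to skip the call entirely."""
        if not needs_strong:
            return "free"  # free first: costs nothing, always allowed
        projected = self.spent_usd + est_cost_usd
        if projected <= self.soft_ratio * self.daily_cap_usd:
            return "strong"
        # Near the cap: critical work is downgraded, the rest is stopped.
        return "free" if critical else None

    def record(self, cost_usd):
        """Measure every call by adding its actual cost to today's total."""
        self.spent_usd += cost_usd
```

Because the strong tier is refused once projected spend passes the soft limit, the hard cap is never reached as long as cost estimates are roughly honest; a stricter version could also compare projected spend against daily_cap_usd directly.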
Integration — how to adopt this yourself
Start with a simple queue, add a safety layer only when you give it 'hands'
As in every guide, don't build the whole autonomy stack on day one. Order matters: first the queue and Worker (durability), then the Firewall the moment the agent gets real 'hands', then verification and measurement, and only at the end, self-healing. Each layer arrives safe-by-default and lights up gradually. It all connects to the rest of the network: the Delegator is the entry gate, the dashboard shows live state plus approval buttons, and orchestration provides the coordination layer on top of which all of this sits.
