Every agent request flows through five steps in single-digit milliseconds. The same architecture that lets Visa decide whether your card swipe will breach your credit limit before the merchant ships the goods — applied to autonomous compute spend.
A single authorization round trip touches three services. The Enforcement Proxy sits in your network path; the Budget Engine holds the ledger invariant; the provider sees a clean, scoped call. If anything in the chain is unhealthy, the proxy refuses to admit the request, so no spend leaks through.
The onboarding object is the only thing the security team has to review. It's intentionally small: a few caps, a tier label, and a pinned lane. Everything dynamic happens at run time and is recorded against this frozen configuration.
# agents/support-triage.yaml
agent: support-triage
cost_center: customer-support
priority: P0
caps:
  per_run: $5.00
  monthly: $60,000
  yearly: $700,000
lane:
  provider: anthropic-bedrock
  family: claude
  region: us-east-1
tiers: [S, M, L, XL]
rules:
  - if pace > 90% then floor: L
Two services run in the background. The Compute Price Signal aggregates published rates, contract pricing, and observed billing. The Ledger keeps the running balance per agent, per cap. They reconcile on every settlement; a divergence over $0.01 triggers an alarm.
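The reconciliation check can be pictured with a small sketch. Everything named here (`Settlement`, `priced_cents`, `divergences`) is hypothetical, amounts are kept in integer cents, and "over $0.01" is read as strictly more than one cent apart.

```python
from dataclasses import dataclass

TOLERANCE_CENTS = 1  # the $0.01 threshold from the text, in integer cents


@dataclass(frozen=True)
class Settlement:
    agent: str
    ledger_cents: int  # cost the Ledger recorded for the run
    priced_cents: int  # cost implied by the Compute Price Signal (assumed field)


def divergences(settlements):
    """Return settlements whose two cost figures differ by more than the tolerance."""
    return [
        s for s in settlements
        if abs(s.ledger_cents - s.priced_cents) > TOLERANCE_CENTS
    ]


sample = [
    Settlement("support-triage", 412, 412),  # identical: fine
    Settlement("support-triage", 500, 501),  # exactly one cent apart: tolerated
    Settlement("support-triage", 230, 233),  # three cents apart: alarm
]
alarms = divergences(sample)
```

Integer cents avoid floating-point rounding, which matters when the alarm threshold is itself a single cent.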
sum(holds) + sum(settles) ≤ cap
every hold settles or expires within 30s
every settle releases the unused hold portion
append-only · hash-chained · externally verifiable
per-agent & per-workspace balance reconciled
Every authorization runs through the same five-rule ladder, in order. The first matching rule wins. The cascade is auditable: the receipt records which rule fired and why. There is no machine-learning model making this call; correctness matters more than cleverness.
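A first-match-wins ladder is simple to sketch. The five rules below are illustrative placeholders, not the actual ladder, which this page does not enumerate; the `Request` and `Receipt` shapes are likewise assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    est_cents: int     # worst-case cost of the run
    per_run_cap: int
    monthly_left: int
    yearly_left: int
    pace: float        # fraction of the monthly cap already consumed


@dataclass(frozen=True)
class Receipt:
    decision: str
    rule: str
    reason: str


# (rule name, predicate, decision, reason), evaluated top to bottom.
# Hypothetical rules chosen only to show the cascade.
RULES = [
    ("per_run_cap", lambda r: r.est_cents > r.per_run_cap, "deny", "estimate exceeds per-run cap"),
    ("yearly_cap", lambda r: r.est_cents > r.yearly_left, "deny", "estimate exceeds yearly remainder"),
    ("monthly_cap", lambda r: r.est_cents > r.monthly_left, "deny", "estimate exceeds monthly remainder"),
    ("pace", lambda r: r.pace > 0.90, "throttle", "pace above 90%"),
    ("default", lambda r: True, "admit", "no earlier rule matched"),
]


def authorize(req):
    """Return the receipt for the first rule that matches."""
    for name, matches, decision, reason in RULES:
        if matches(req):
            return Receipt(decision, name, reason)
    raise AssertionError("unreachable: the last rule always matches")


too_big = authorize(Request(600, 500, 10_000, 100_000, 0.95))
hot = authorize(Request(400, 500, 10_000, 100_000, 0.95))
normal = authorize(Request(400, 500, 10_000, 100_000, 0.50))
```

Because the rules are plain data evaluated in order, the receipt can name the rule that fired without any extra bookkeeping.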
Before the proxy forwards the call to the provider, it places a hold on the ledger for the maximum dollar amount the run could spend. The provider sees a clean, scoped call. When the call settles, the actual cost is captured and the unused portion of the hold is released. There is no path where two concurrent runs consume the same dollar.
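The hold, settle, and release lifecycle above can be sketched in a few lines. This is a single-cap toy with hypothetical names (`Ledger`, `hold`, `settle`), not the Budget Engine's interface; amounts are integer cents.

```python
import itertools


class Ledger:
    """Toy ledger for one cap: open holds plus settled spend never exceed the cap."""

    def __init__(self, cap_cents):
        self.cap = cap_cents
        self.settled = 0
        self.holds = {}  # hold id -> held cents
        self._ids = itertools.count(1)

    def committed(self):
        return self.settled + sum(self.holds.values())

    def hold(self, max_cents):
        """Reserve the worst-case cost; return a hold id, or None if it would breach the cap."""
        if self.committed() + max_cents > self.cap:
            return None
        hold_id = next(self._ids)
        self.holds[hold_id] = max_cents
        return hold_id

    def settle(self, hold_id, actual_cents):
        """Capture the actual cost and return the released (unused) portion."""
        held = self.holds[hold_id]
        if not 0 <= actual_cents <= held:
            raise ValueError("actual cost must fall within the hold")
        del self.holds[hold_id]
        self.settled += actual_cents
        return held - actual_cents


ledger = Ledger(cap_cents=1000)
first = ledger.hold(500)             # admitted: 500 of 1000 reserved
second = ledger.hold(600)            # refused: 500 + 600 would exceed the cap
released = ledger.settle(first, 320) # 180 cents go back to the budget
third = ledger.hold(600)             # now admitted: 320 settled + 600 held
```

The second request is refused while the first hold is open, even though the first run ends up spending far less than it reserved. That is the cost of holding the worst case up front.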
Every release runs a 10,000-request concurrency invariant test against the Budget Engine: random tiers, random caps, random failure injection. The build does not ship unless sum(authorized_spend) ≤ cap holds with zero exceptions.
PASS · 10,000 / 10,000 · 0 over-cap admits
PASS · 0 lost holds · 0 double-counts
PASS · ledger reconciles to the cent
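A scaled-down version of such an invariant test might look like the sketch below. The `BudgetEngine` class is a toy stand-in guarded by one lock, and the request mix, failure rate, and worker count are assumptions made for illustration.

```python
import random
import threading
from concurrent.futures import ThreadPoolExecutor


class BudgetEngine:
    """Toy stand-in for the Budget Engine: one cap, one lock."""

    def __init__(self, cap_cents):
        self.cap = cap_cents
        self.held = 0
        self.settled = 0
        self.max_committed = 0  # high-water mark of held + settled
        self._lock = threading.Lock()

    def hold(self, cents):
        with self._lock:
            if self.held + self.settled + cents > self.cap:
                return False
            self.held += cents
            self.max_committed = max(self.max_committed, self.held + self.settled)
            return True

    def finish(self, held_cents, actual_cents):
        """Settle the actual cost (0 for a failed run) and release the rest of the hold."""
        with self._lock:
            self.held -= held_cents
            self.settled += actual_cents


def run_invariant_test(requests=10_000, cap_cents=500_000, workers=32, seed=7):
    rng = random.Random(seed)
    # Pre-generate inputs so worker threads never share the RNG.
    plan = [(rng.randint(1, 500), rng.random(), rng.random() < 0.1)
            for _ in range(requests)]
    engine = BudgetEngine(cap_cents)

    def one_run(item):
        max_cents, used_fraction, fails = item
        if not engine.hold(max_cents):
            return 0  # refused at authorization, nothing spent
        actual = 0 if fails else int(max_cents * used_fraction)
        engine.finish(max_cents, actual)
        return actual

    with ThreadPoolExecutor(max_workers=workers) as pool:
        spent = sum(pool.map(one_run, plan))
    return engine, spent


engine, spent = run_invariant_test()
assert engine.max_committed <= engine.cap  # never over cap at any instant
assert engine.held == 0                    # no lost holds
assert engine.settled == spent             # ledger matches what the runs reported
```

Tracking the high-water mark inside the lock checks the invariant at every admission, not just at the end of the run.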
The receipt is the leaf. Period statements roll receipts up by cost center, team, agent, and lane. There is no separate billing dataset; finance and engineering look at the same ledger.
No "billing batch job". Statements aggregate; they don't compute.
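Rolling receipts up into a statement is then a pure aggregation. The sketch below assumes a hypothetical `Receipt` shape with one field per roll-up dimension; the team and lane values are made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Receipt:
    cost_center: str
    team: str
    agent: str
    lane: str
    settled_cents: int


def statement(receipts, *dims):
    """Sum settled cents grouped by the given receipt fields."""
    totals = defaultdict(int)
    for r in receipts:
        key = tuple(getattr(r, d) for d in dims)
        totals[key] += r.settled_cents
    return dict(totals)


receipts = [
    Receipt("customer-support", "triage", "support-triage", "claude/us-east-1", 320),
    Receipt("customer-support", "triage", "support-triage", "claude/us-east-1", 180),
    Receipt("customer-support", "billing", "refund-bot", "claude/us-east-1", 250),
]
by_cost_center = statement(receipts, "cost_center")
by_team_agent = statement(receipts, "team", "agent")
```

Every figure in a statement is a sum over receipts, so any total can be traced back to the leaves that produced it.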
Every hold is placed in a single ledger transaction against the per-run, monthly, and yearly caps. Two requests cannot both succeed against the same remaining dollar — the ledger is the serialization point. HOLD is the same primitive a card network uses to authorize a swipe before the goods ship.
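One way to picture a hold checked against several caps in one transaction is the SQLite sketch below. The table layout, the `place_hold` helper, and the pre-seeded usage figures are all assumptions made for illustration; the storage engine behind the real ledger is not described here.

```python
import sqlite3

PER_RUN_CAP = 500  # $5.00 in cents, from the example config

# isolation_level=None puts the connection in autocommit mode,
# so the explicit BEGIN/COMMIT below define the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE caps (scope TEXT PRIMARY KEY, cap_cents INTEGER, used_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO caps VALUES (?, ?, ?)",
    [("monthly", 6_000_000, 5_999_800),   # 200 cents left this month (made-up usage)
     ("yearly", 70_000_000, 20_000_000)],
)


def place_hold(conn, cents):
    """Reserve cents against every cap, or against none of them."""
    if cents > PER_RUN_CAP:
        return False
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    try:
        cur = conn.execute(
            "UPDATE caps SET used_cents = used_cents + ? "
            "WHERE used_cents + ? <= cap_cents",
            (cents, cents),
        )
        if cur.rowcount != 2:        # at least one cap had no room
            conn.execute("ROLLBACK")
            return False
        conn.execute("COMMIT")
        return True
    except Exception:
        conn.execute("ROLLBACK")
        raise


first = place_hold(conn, 300)    # refused: monthly cap has only 200 left
second = place_hold(conn, 200)   # admitted: fits every cap
too_big = place_hold(conn, 600)  # refused: over the per-run cap
usage = dict(conn.execute("SELECT scope, used_cents FROM caps"))
```

The refused 300-cent hold would have fit the yearly cap, but the rollback undoes that partial update, so neither counter moves.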
If the Budget Engine is unhealthy — degraded, partitioned, recovering — the proxy 503s the request rather than forwarding it. We chose this default deliberately: an outage of Tarmac is a paused fleet, not a runaway bill. Customers can override per agent if they prefer fail-open for P0 work.
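The fail-closed default reduces to a small decision function. The `Health` states mirror the ones named above; the function name, the `fail_open` flag, and the returned labels are hypothetical.

```python
from enum import Enum


class Health(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    PARTITIONED = "partitioned"
    RECOVERING = "recovering"


def proxy_action(health, authorized, fail_open=False):
    """Decide what the proxy does with one request.

    Returns "forward", "deny", or "503". The labels are illustrative.
    """
    if health is not Health.HEALTHY:
        # The Budget Engine cannot vouch for the ledger: fail closed
        # unless this agent has explicitly opted into fail-open.
        return "forward" if fail_open else "503"
    return "forward" if authorized else "deny"


default_outage = proxy_action(Health.PARTITIONED, authorized=None)
p0_override = proxy_action(Health.PARTITIONED, authorized=None, fail_open=True)
over_budget = proxy_action(Health.HEALTHY, authorized=False)
```

Keeping the override as an explicit per-agent flag means fail-open is something a team opts into, never something an outage switches on.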
Every authorization, hold, throttle, settle, and release is an append-only, hash-chained entry. The ledger is externally verifiable: a SOC 2 Type II auditor can replay a period from the chain without trusting Tarmac to render it.
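A hash chain of this kind can be verified with nothing but the entries themselves. The sketch below assumes SHA-256 over a canonical JSON encoding and an all-zero genesis hash; the hash function and entry format actually used are not stated on this page.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry's "prev"


def entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with a canonical form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain, payload):
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify(chain):
    """Recompute every link; any edited, dropped, or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


chain = []
append(chain, {"event": "hold", "agent": "support-triage", "cents": 500})
append(chain, {"event": "settle", "agent": "support-triage", "cents": 320})
append(chain, {"event": "release", "agent": "support-triage", "cents": 180})

ok_before = verify(chain)
chain[1]["payload"]["cents"] = 1  # tamper with a settled amount
ok_after = verify(chain)
```

An auditor holding only the entries can run `verify` independently, which is what makes the chain checkable without trusting whoever rendered the statement.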
Design partners point one team's fleet at the proxy and have authorization receipts inside an hour. The first statement lands at month-end. Both surfaces, end to end.