Tarmac / How it works
Admit · decide · enforce · settle — every run, every dollar

Tarmac runs like a card network. An atomic authorization hold is placed before any compute is spent.

Every agent request is admitted, decided, enforced, and settled, and the authorization decision typically returns in single-digit milliseconds (4ms at p50, 14ms at p99). It is the same architecture that lets Visa decide whether your card swipe will breach your credit limit before the merchant ships the goods, applied to autonomous compute spend.

// 00 · The full path

From agent intent to settled receipt. What happens in the ~14ms (p99) before your run starts.

A single authorization round trip touches three services. The Enforcement Proxy sits in your network path; the Budget Engine maintains the ledger invariants; the provider sees a clean, scoped call. If anything in the chain is unhealthy, the proxy refuses to admit the request, so no unauthorized spend leaks through.

  • Agent ↔ Proxy · POST /v1/messages · request · estimated max cost
  • Proxy ↔ Budget Engine · HOLD $4.10 max · authorize against ledger
  • Agent ↔ Proxy · tier: M (throttled) · decision returned · 6ms
  • Proxy ↔ Budget Engine · ADMIT $4.10 held · cap fit · receipt #r-9c2
  • Agent ↔ Proxy · forward · scoped token to Anthropic Bedrock
  • Proxy ↔ Provider · 200 OK · stream tokens back
  • Agent ↔ Proxy · stream · response + usage metering
  • Proxy ↔ Budget Engine · SETTLE $1.20 · release $2.90 · receipt closed
// p50 · hold
4ms
Authorization decision against the live ledger.
// p99 · hold
14ms
Including price-signal lookup & rule evaluation.
// p50 · settle
3ms
Releases unused portion of the hold; closes the receipt.
// proxy overhead
7ms
Round-trip added per call. Streamed responses are unaffected.
// availability target
99.95%
Monthly SLO. Fail-closed: outage = no spend, not over-spend.
// The five steps

Onboard. Watch. Decide. Enforce. Report. Each one with a single, testable contract.

// Step 01 Onboard

Encode the budget, the priority tier, and the lane — once.

The onboarding object is the only thing the security team has to review. It's intentionally small: a few caps, a tier label, and a pinned lane. Everything dynamic happens at run time and is recorded against this frozen configuration.

// What you set
  • Workspace + cost-center mapping
  • Per-run cap · the most a single run may spend
  • Monthly cap · hard limit for the calendar period
  • Yearly cap · multi-period safety net
  • Agent priority tier · P0 / P1 / P2 / P3
  • Pinned lane · model family + provider + region
  • Allowed tier range within the lane · S → XL
// Example · agent config
# agents/support-triage.yaml
agent: support-triage
cost_center: customer-support
priority: P0

caps:
  per_run: $5.00
  monthly: $60,000
  yearly:  $700,000

lane:
  provider: anthropic-bedrock
  family:   claude
  region:   us-east-1
  tiers:    [S, M, L, XL]

rules:
  - "if pace > 90% then floor: L"
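
To make the "small, reviewable" claim concrete, here is a minimal sketch of how a config like the one above might be checked before it is frozen. The `AgentConfig` class, the `parse_usd` helper, and the specific checks are illustrative assumptions, not Tarmac's actual schema or validator.

```python
from dataclasses import dataclass
from decimal import Decimal

TIER_ORDER = ["S", "M", "L", "XL"]        # lightest to heaviest, per the S → XL range
PRIORITIES = {"P0", "P1", "P2", "P3"}


def parse_usd(text: str) -> Decimal:
    """Turn a string such as '$60,000' into Decimal('60000')."""
    return Decimal(text.replace("$", "").replace(",", ""))


@dataclass(frozen=True)                   # frozen: the object cannot change after review
class AgentConfig:
    agent: str
    priority: str
    per_run: Decimal
    monthly: Decimal
    yearly: Decimal
    tiers: tuple

    def validate(self) -> list:
        """Return a list of problems; an empty list means the config is acceptable."""
        problems = []
        if self.priority not in PRIORITIES:
            problems.append(f"unknown priority {self.priority!r}")
        # Assumed rule: each cap must fit inside the next larger one.
        if not (Decimal(0) < self.per_run <= self.monthly <= self.yearly):
            problems.append("caps must satisfy 0 < per_run <= monthly <= yearly")
        if not self.tiers or any(t not in TIER_ORDER for t in self.tiers):
            problems.append(f"tiers must be a non-empty subset of {TIER_ORDER}")
        return problems


config = AgentConfig(
    agent="support-triage",
    priority="P0",
    per_run=parse_usd("$5.00"),
    monthly=parse_usd("$60,000"),
    yearly=parse_usd("$700,000"),
    tiers=("S", "M", "L", "XL"),
)
```

Calling `config.validate()` on the example returns an empty list; a config with an unknown priority or a per-run cap above the monthly cap returns one message per problem.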
// Step 02 Watch

The Price Signal and the Ledger run continuously, and they are reconciled on every settlement.

Two services run in the background. The Compute Price Signal aggregates published rates, contract pricing, and observed billing. The Ledger keeps the running balance per agent, per cap. They reconcile on every settlement; a divergence over $0.01 triggers an alarm.

// The Compute Price Signal
  • Continuous aggregation across providers
  • Customer contract rates & volume discounts
  • Token-normalized across models & reasoning tiers
  • Drift detection · alarms on published-rate changes
  • Hot-reload to the decision engine
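
As an illustration of how a token-normalized price table could feed the "estimated max cost" that a hold is sized against, here is a hedged sketch. The `RATES` table holds placeholder numbers, not real provider prices, and `worst_case_cost` is a hypothetical helper; the source does not describe the actual estimation formula.

```python
from decimal import Decimal, ROUND_UP

# Placeholder rates in USD per million tokens. These are NOT real provider prices.
RATES = {
    "S": {"input": Decimal("1.00"), "output": Decimal("5.00")},
    "XL": {"input": Decimal("15.00"), "output": Decimal("75.00")},
}
MILLION = Decimal(1_000_000)


def worst_case_cost(tier: str, input_tokens: int, max_output_tokens: int) -> Decimal:
    """Upper bound on a run's cost, assuming every allowed output token is used."""
    rate = RATES[tier]
    cost = (input_tokens * rate["input"] + max_output_tokens * rate["output"]) / MILLION
    # Round up to the cent so the estimate never understates the hold.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_UP)
```

Rounding up rather than to nearest is a deliberate choice in this sketch: an estimate that is a fraction of a cent too low would let the settled cost exceed the hold.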
// Ledger invariants — continuously enforced
  • sum(holds) + sum(settles) ≤ cap
  • every hold settles or expires within 30s
  • every settle releases the unused hold portion
  • append-only · hash-chained · externally verifiable
  • per-agent & per-workspace balance reconciled
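
The first three invariants above can be expressed as a small checker. This is a sketch under assumptions: the `Hold` record and its field names are invented for illustration, and a real ledger would check these inside the same transaction that writes each entry.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional

HOLD_TTL_SECONDS = 30


@dataclass
class Hold:
    amount: Decimal                          # maximum the run may spend
    placed_at: float                         # seconds on some monotonic clock
    settled_cost: Optional[Decimal] = None   # None if never settled
    closed_at: Optional[float] = None        # settle or expiry time; None while open


def check_invariants(holds, cap: Decimal, now: float) -> list:
    """Return a list of violations; an empty list means all checks pass."""
    violations = []
    open_total = sum((h.amount for h in holds if h.closed_at is None), Decimal(0))
    settled_total = sum(
        (h.settled_cost for h in holds if h.settled_cost is not None), Decimal(0)
    )
    # sum(holds) + sum(settles) <= cap
    if open_total + settled_total > cap:
        violations.append("committed spend exceeds cap")
    for h in holds:
        closed = h.closed_at if h.closed_at is not None else now
        # every hold settles or expires within 30s
        if closed - h.placed_at > HOLD_TTL_SECONDS:
            violations.append("hold open longer than 30s")
        # a settle can only release, never exceed, the held amount
        if h.settled_cost is not None and h.settled_cost > h.amount:
            violations.append("settled more than was held")
    return violations
```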
// Step 03 Decide

The decision is a deterministic rule cascade — not a heuristic.

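Every authorization runs through the same five-rule ladder, in order. The first matching rule wins. The cascade is auditable: the receipt records which rule fired and why. There is no machine-learning model making this call; correctness matters more than cleverness. Note that "hold" as a rule outcome means the run is held back and not admitted, which is distinct from the authorization hold placed on the ledger in Step 04.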

01 · if per-run cap < minimum-tier price → hold
  no tier in the lane can be served for the requested run
  hold reason: per-run-cap
02 · if monthly remaining < minimum-tier estimate → hold
  the cheapest valid tier still doesn't fit the remaining budget
  hold reason: monthly-cap
03 · if agent.priority < rebalance-threshold(pace) → throttle
  tier drops to the heaviest level that protects higher-priority work
  throttle reason: priority-rebalance
04 · if per-run cap < preferred-tier estimate → throttle
  lane offers a lighter tier that fits the per-run cap
  throttle reason: per-run-tier-down
05 · otherwise → admit · heaviest tier that fits
  pick the highest tier whose worst-case estimate still satisfies both caps
  admit tier: XL · L · M · S
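
The ladder above can be sketched as a plain function with no state and no randomness. Several details here are assumptions made for illustration: the `Decision` tuple, the `estimates` mapping of worst-case cost per tier, and especially `rebalance_ceiling`, which stands in for the outcome of `rebalance-threshold(pace)` because the source does not define how that threshold is computed.

```python
from decimal import Decimal
from typing import NamedTuple, Optional

TIER_ORDER = ["S", "M", "L", "XL"]   # lightest to heaviest


class Decision(NamedTuple):
    outcome: str             # "hold", "throttle", or "admit"
    tier: Optional[str]      # None when the run is held
    rule: str
    reason: str


def decide(per_run_cap, monthly_remaining, estimates, preferred, rebalance_ceiling=None):
    """Evaluate the five rules in order; the first match wins.

    rebalance_ceiling is None when rule 03 does not apply, otherwise the heaviest
    tier this agent may use while higher-priority work is protected (assumed input).
    """
    tiers = [t for t in TIER_ORDER if t in estimates]
    cheapest = min(estimates[t] for t in tiers)
    if per_run_cap < cheapest:                                      # rule 01
        return Decision("hold", None, "01", "per-run-cap")
    if monthly_remaining < cheapest:                                # rule 02
        return Decision("hold", None, "02", "monthly-cap")
    budget = min(per_run_cap, monthly_remaining)
    fits = [t for t in tiers if estimates[t] <= budget]             # never empty here
    if rebalance_ceiling is not None:                               # rule 03
        limit = TIER_ORDER.index(rebalance_ceiling)
        allowed = [t for t in fits if TIER_ORDER.index(t) <= limit] or fits[:1]
        return Decision("throttle", allowed[-1], "03", "priority-rebalance")
    if per_run_cap < estimates[preferred]:                          # rule 04
        return Decision("throttle", fits[-1], "04", "per-run-tier-down")
    return Decision("admit", fits[-1], "05", "heaviest-fit")        # rule 05
```

Because the function depends only on its arguments, the same inputs always produce the same rule and reason, which is what lets a receipt be checked after the fact.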
// Step 04 Enforce

Atomic authorization holds — the same primitive a card network uses.

Before the proxy forwards the call to the provider, it places a hold on the ledger for the maximum dollar amount the run could spend. The provider sees a clean, scoped call. When the call settles, the actual cost is captured and the unused portion of the hold is released. There is no path where two concurrent runs consume the same dollar.

// The hold lifecycle
  • PLACE · hold the full max cost against the caps
  • ADMIT · proxy forwards to the provider with a scoped token
  • SETTLE · provider usage captured · final cost reconciled
  • RELEASE · unused hold returned to the caps
  • EXPIRE · stuck holds auto-release at 30s · alarm raised
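
The lifecycle above can be sketched with an in-memory ledger guarded by a lock. This is a simplified illustration, not the Budget Engine: `CapLedger` and its methods are hypothetical, it tracks a single cap rather than per-run, monthly, and yearly caps together, and a production ledger would use a durable transactional store.

```python
import threading
from decimal import Decimal


class CapLedger:
    def __init__(self, cap: Decimal):
        self.cap = cap
        self.settled = Decimal(0)
        self.open_holds = {}              # hold_id -> (amount, placed_at)
        self._lock = threading.Lock()     # the single serialization point
        self._next_id = 0

    def place(self, amount: Decimal, now: float):
        """PLACE: reserve the full max cost, or return None if it does not fit."""
        with self._lock:
            held = sum((a for a, _ in self.open_holds.values()), Decimal(0))
            if self.settled + held + amount > self.cap:
                return None
            self._next_id += 1
            self.open_holds[self._next_id] = (amount, now)
            return self._next_id

    def settle(self, hold_id: int, actual_cost: Decimal) -> Decimal:
        """SETTLE + RELEASE: capture the actual cost and return the released amount."""
        with self._lock:
            amount, _ = self.open_holds.pop(hold_id)
            charged = min(actual_cost, amount)   # never capture more than was held
            self.settled += charged
            return amount - charged

    def expire(self, now: float, ttl: float = 30.0) -> list:
        """EXPIRE: drop holds older than ttl seconds and return their ids."""
        with self._lock:
            stale = [h for h, (_, t) in self.open_holds.items() if now - t > ttl]
            for h in stale:
                del self.open_holds[h]
            return stale
```

With a $10.00 cap, two $4.10 holds fit and a third is refused; settling the first at $1.92 releases $2.18, which matches the receipt arithmetic used elsewhere on this page.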
// The CI release gate

Concurrency-invariant test

Every release runs a 10,000-request concurrency invariant test against the Budget Engine: random tiers, random caps, random failure injection. The build does not ship unless sum(authorized_spend) ≤ cap holds with zero exceptions.

PASS · 10,000 / 10,000 · 0 over-cap admits
PASS · 0 lost holds · 0 double-counts
PASS · ledger reconciles to the cent
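
A scaled-down version of such a gate can be sketched with a thread pool hammering a lock-protected budget. Everything named here (`Budget`, `run_invariant_test`, the cap, the worker count) is an assumption for illustration, and the sketch omits the failure injection and tier randomization the real test is described as including.

```python
import random
import threading
from concurrent.futures import ThreadPoolExecutor

CAP_CENTS = 50_000      # $500.00, an arbitrary cap chosen for this sketch
REQUESTS = 10_000


class Budget:
    def __init__(self, cap_cents: int):
        self.cap_cents = cap_cents
        self.authorized_cents = 0
        self._lock = threading.Lock()

    def try_hold(self, amount_cents: int) -> bool:
        # Check and increment under one lock so no two requests claim the same cent.
        with self._lock:
            if self.authorized_cents + amount_cents > self.cap_cents:
                return False
            self.authorized_cents += amount_cents
            return True


def run_invariant_test(seed: int = 0):
    """Fire REQUESTS concurrent holds; return (sum of admitted amounts, ledger total)."""
    rng = random.Random(seed)
    amounts = [rng.randint(1, 500) for _ in range(REQUESTS)]
    budget = Budget(CAP_CENTS)
    with ThreadPoolExecutor(max_workers=32) as pool:
        results = list(pool.map(budget.try_hold, amounts))
    admitted = sum(a for a, ok in zip(amounts, results) if ok)
    return admitted, budget.authorized_cents
```

The gate condition is then two comparisons: the admitted total must equal what the ledger recorded (no lost holds, no double counts), and that total must not exceed the cap.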

// Step 05 Report

Every run gets an immutable receipt. Statements are aggregations — not "recalculations".

The receipt is the leaf. Period statements roll receipts up by cost center, team, agent, and lane. There is no separate billing dataset; finance and engineering look at the same ledger.

RECEIPT · rcpt_9c2f4a · SETTLED
Agent: support-triage · P0
Cost center: customer-support
Lane: anthropic · claude · us-east-1
Tier resolved: XL · claude-opus-4
Rule fired: 05 · admit · heaviest fit

Hold placed: $4.10 · 14:22:07.214
Settled cost: $1.92 · 14:22:11.832
Released: $2.18 back to monthly cap

Total · billed: $1.92
// What aggregates roll up to
  • Receipts → agent rollup · daily, weekly, monthly
  • Agent rollup → cost-center rollup · with allocation rules
  • Cost-center rollup → workspace statement · monthly close
  • Statement → GL export · NetSuite / QuickBooks / Sage / Workday
  • All views read from the append-only ledger · never reconciled by hand

No "billing batch job". Statements aggregate; they don't compute.
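
The idea that statements only aggregate can be shown in a few lines: each rollup is a sum over receipts grouped by one key. The receipt dictionaries and the `rollup` helper below are illustrative assumptions; the second agent and cost center are invented sample data, not from a real statement.

```python
from collections import defaultdict
from decimal import Decimal

# Sample receipts. The field names and the "invoice-matcher" rows are hypothetical.
receipts = [
    {"agent": "support-triage", "cost_center": "customer-support", "settled": Decimal("1.92")},
    {"agent": "support-triage", "cost_center": "customer-support", "settled": Decimal("1.20")},
    {"agent": "invoice-matcher", "cost_center": "finance-ops", "settled": Decimal("0.75")},
]


def rollup(rows, key: str) -> dict:
    """Sum settled cost per distinct value of `key`; no prices are recomputed."""
    totals = defaultdict(Decimal)     # Decimal() is 0, so missing keys start at zero
    for row in rows:
        totals[row[key]] += row["settled"]
    return dict(totals)


by_agent = rollup(receipts, "agent")
by_cost_center = rollup(receipts, "cost_center")
```

Because every view is a sum over the same receipts, the totals of any two rollups are equal by construction, which is the property that removes the need for manual reconciliation.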

// What makes prevention possible

Three invariants. Every property that follows hangs on these.

// 01 · Atomicity

One hold, one dollar.

Every hold is placed in a single ledger transaction against the per-run, monthly, and yearly caps. Two requests cannot both succeed against the same remaining dollar — the ledger is the serialization point. HOLD is the same primitive a card network uses to authorize a swipe before the goods ship.

// 02 · Fail-closed

No authorization, no admit.

If the Budget Engine is unhealthy — degraded, partitioned, recovering — the proxy 503s the request rather than forwarding it. We chose this default deliberately: an outage of Tarmac is a paused fleet, not a runaway bill. Customers can override per agent if they prefer fail-open for P0 work.
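
The fail-closed default and its per-agent override reduce to a small routing decision. This is a sketch under assumptions: the `Action` enum and the `route` function are hypothetical names, and how the proxy actually detects an unhealthy Budget Engine is not described in the source.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward to provider"
    REFUSE_503 = "return 503, do not forward"
    DENY = "return the budget decision, do not forward"


def route(engine_healthy: bool, authorized: bool, fail_open: bool = False) -> Action:
    """Decide what the proxy does with one request.

    fail_open models the per-agent override; it defaults to False (fail-closed).
    """
    if not engine_healthy:
        # No authorization is possible, so the default is to refuse.
        return Action.FORWARD if fail_open else Action.REFUSE_503
    return Action.FORWARD if authorized else Action.DENY
```

Note that with the override enabled, a request forwarded during an outage has no hold behind it, so the spend guarantee does not cover that window.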

// 03 · Append-only

Ledger is the system of record.

Every authorization, hold, throttle, settle, and release is an append-only, hash-chained entry. Statements aggregate; they don't compute. The ledger is externally verifiable — a SOC 2 Type II auditor can replay a period from the chain without trusting Tarmac to render it.
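
A hash chain of this kind can be sketched with SHA-256 from the standard library. The entry layout, the genesis value, and the use of canonical JSON are assumptions for illustration; the source does not specify the hash function or the encoding Tarmac uses.

```python
import hashlib
import json

GENESIS = "0" * 64   # assumed starting value for the first entry's "prev" field


def _digest(prev_hash: str, payload: dict) -> str:
    body = json.dumps(payload, sort_keys=True)   # stable encoding of the payload
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list, payload: dict) -> None:
    """Append a payload, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev_hash, "payload": payload, "hash": _digest(prev_hash, payload)})


def verify(chain: list) -> bool:
    """Replay the chain from the start; any edited or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A verifier needs only the entries themselves to run `verify`, which is what makes the replay independent of the system that produced them.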

// See it on real traffic

Run a paid pilot.
Watch your own authorizations.

Design partners point one team's fleet at the proxy and have authorization receipts inside an hour. The first statement lands at month-end. You see both surfaces, receipts and statements, end to end.

// The architecture in one paragraph

Proxy in the path. Ledger as source of truth. Five-rule cascade. Fail-closed default. Append-only audit. Done.

  • Authorization decision · p99 14ms
  • Proxy overhead · 7ms / call
  • SLO · 99.95% monthly availability
  • Concurrency-invariant test · CI release gate
  • Zero customer credentials at rest