Article IV: The Fiscal Ceilings
The Law of Boundaries
Soft budgets are suggestions. Hard ceilings are architecture. A system without enforced spending limits is not production-ready — it is a metered utility with no circuit breaker.
Section 1: The Session Cap
Every distinct workflow or agent loop must initialize with a hard token budget.
Budget initialization is not a configuration afterthought. It is the first operation of every session:
session = create_session(
user_id="usr_123",
feature="document-analysis",
token_budget=50_000,
usd_ceiling=0.50
)No budget, no session. No exceptions for "internal testing" in production environments.
Section 2: Graceful Degradation
When a session hits 90% of its budget, it must force a final summary or cleanly terminate — rather than throwing a mid-generation out-of-funds error.
Users should never see a provider 429 or a raw "insufficient credits" exception. They should see a completed, degraded response:
- At 90% — inject a system directive: "Conclude your response. Budget nearly exhausted."
- At 100% — return the best partial result available with a clear status:
completed_degraded - Never — abort mid-sentence, mid-tool-call, or mid-thought without recovery
Hard stops that surface infrastructure failures to end users are constitutional violations.
Section 3: Usage-Based Syncing
Local state limits must sync seamlessly with external credit-based billing ledgers to prevent unbilled overages.
Your internal session budget and your billing provider's credit balance are two views of the same constraint. They must stay synchronized:
- Pre-authorize spend against the billing ledger before starting expensive operations
- Reconcile local counters with provider billing on a defined interval (≤ 5 minutes for production)
- Block new sessions when billing credits are exhausted, not when the monthly invoice arrives
The worst failure mode in AI FinOps is discovering unbilled overages in a invoice 30 days after the damage was done.
Previous: Article III: Context Window Sovereignty · Next: Article V: Prompt Schema Standards