Article II: The Routing Mandate
The Law of Model Selection
Frontier models are a privileged resource, not a default endpoint. Treating GPT-4-class models as model: "default" is the architectural equivalent of using a cargo plane for grocery delivery.
Section 1: The Baseline Assumption
All tasks default to the smallest capable model.
The routing layer — not the application developer — owns the initial model selection. Application code requests a capability (e.g., classify-intent, generate-summary), and the router maps it to the cheapest model that meets the SLA.
Prohibited patterns:
- Hardcoding
gpt-4in application code without a routing abstraction - Using frontier models for classification, extraction, or formatting tasks
- "We'll switch to a cheaper model later" as a launch strategy
Section 2: The Proof of Complexity
Routing to a frontier model requires programmatic justification — a classifier flag, complexity score, or escalation signal indicating that cheaper models have been attempted or assessed as insufficient.
Acceptable justification signals:
- A complexity classifier returned
score > threshold - A cheaper model attempt failed quality checks (with the failure logged)
- The user explicitly selected a "high quality" tier (with corresponding billing)
Unacceptable justification:
- Developer preference
- "The prompt is long"
- Absence of a routing layer entirely
Every frontier model call must be auditable. If you cannot produce the justification log for a given inference, the call should not have happened.
Section 3: The Fallback Protocol
If a fast or cheap model fails, the system cascades upward — but never defaults upward.
Escalation is a response to failure, not a starting position. The cascade must be:
- Attempt cheapest capable model
- On quality failure, log the failure and escalate one tier
- Repeat until success or hard ceiling reached
- If ceiling reached, degrade gracefully (see Article IV) — do not silently burn budget on frontier models
Defaulting upward — starting with the most expensive model "to be safe" — is a constitutional violation regardless of outcome quality.
Previous: Article I: Immutable Metering · Next: Article III: Context Window Sovereignty