Reference
API reference
Endpoints, request/response shapes, headers
RealityRouter exposes standard OpenAI-compatible endpoints. Any client built for the OpenAI API works — just change the base URL.
Base URL
With the router running locally, point your client at:
http://localhost:8000/v1
API key. Your client must send anAuthorizationheader, but the value isn't validated. The router authenticates to upstream providers using keys in~/.reality_router/.env.
POST /v1/chat/completions
Chat-based interactions. Same request shape as OpenAI:
curl http://localhost:8000/v1/chat/completions \
-H "Authorization: Bearer anything" \
-H "Content-Type: application/json" \
-d '{
"model": "auto",
"messages": [
{"role": "user", "content": "Refactor this function..."}
],
"tools": [...],
"temperature": 0.2
}'
The model parameter
"auto"— let the router decide per-query. Recommended."gpt-5.4-thinking","claude-opus-4", etc. — pin to a specific model. Useful for debugging or when you have hard constraints."tier:cheap","tier:flagship"— bias the router toward a price tier without naming a specific model.
Response shape
Same as OpenAI's chat.completion response, with an additional x-rr-* set of headers exposing the routing decision:
X-RR-Model— which model handled the request.X-RR-Strategy—single-shotorsequential.X-RR-Attempts— how many models were tried before success.X-RR-Cost— estimated USD cost of this request.X-RR-Saved— estimated USD saved vs the always-flagship counterfactual.
POST /v1/completions
Legacy text completion. Same shape and headers as the OpenAI equivalent. Generally prefer /v1/chat/completions for new code.
GET /.well-known/agent-card.json
Standard agent-discovery endpoint for A2A communication. Returns the router's name, version, capabilities, and supported protocols.
curl http://localhost:8000/.well-known/agent-card.json
GET /health
Liveness check. Returns 200 OK with a small JSON body when the router is up:
{
"status": "ok",
"uptime_s": 3812,
"models_active": 7,
"models_quarantined": 1
}
GET /metrics/dashboard
The web dashboard UI. Renders the live view of spend, calibration, and per-agent activity. See the Dashboard guide.
Error codes
200— success. Response body matches OpenAI's shape.400— malformed request (invalid JSON, missing required fields).429— concurrency limit exceeded on all eligible models. Retry after a moment.502 Bad Gateway— infrastructure failure with the upstream provider (timeout, 500, invalid key). InspectX-RR-Upstream-Errorfor details.503— all configured models are quarantined by the circuit breaker. Wait for recovery or check provider status.
Quality failures don't return errors. If a model returns broken JSON, truncated output, or an AI refusal, the router treats that as a quality failure and silently escalates to a better model. Your client only sees the validated response.