Skip to content

Inference proxy

The proxy is the inference entry point. It mirrors the upstream OpenAI/Anthropic API surface, with your token in the URL path.

https://api.usepod.ai/proxy/<token>/... # Anthropic-compatible
https://api.usepod.ai/proxy/<token>/v1/... # OpenAI-compatible

Common endpoints behind the proxy:

EndpointSurface
/v1/chat/completionsOpenAI chat completions (streaming and non-streaming)
/v1/messagesAnthropic messages
/v1/modelsModel listing
HeaderRequiredMeaning
Content-Type: application/jsonyesStandard JSON body
X-Pod-Max-Price-InputnoMax price per million input tokens (USDC microunits)
X-Pod-Max-Price-OutputnoMax price per million output tokens (USDC microunits)

The Authorization / api_key your SDK sends is ignored — auth is the token in the path. See Spend controls for the price headers.

HeaderMeaning
X-Balance-RemainingRemaining token balance after this request
X-Pod-RouteWhich source served the request: marketplace, BYOK relay, or centralized
  • A token with no balance is rejected before any upstream call.
  • An unknown token returns 401.
  • In marketplace-only mode, no provider at your price returns a dedicated no-provider result rather than billing at a higher price.

Streaming responses are relayed byte-for-byte from the selected source; token usage is extracted from the stream for settlement.