Authorization: Bearer. See the Inference API docs for provider-specific parameters.
Supports both managed (Lava’s API keys) and unmanaged (bring your own credentials) mode.
Quick Start
Chat Completions
Target URL:https://api.inference.net/v1/chat/completions
| Content Type | application/json |
| Streaming | Yes (set stream: true in request body) |
| Model | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| google/gemma-3-27b-instruct/bf-16 | $0.30 | $0.40 |
| meta-llama/llama-3.2-11b-instruct/fp-16 | $0.055 | $0.055 |
| meta-llama/llama-3.1-8b-instruct/fp-8 | $0.03 | $0.03 |
| meta-llama/llama-3.2-3b-instruct/fp-16 | $0.02 | $0.02 |
| meta-llama/llama-3.2-1b-instruct/fp-16 | $0.01 | $0.01 |
Next Steps
All Providers
Browse all supported AI providers
Forward Proxy
Learn how to construct proxy URLs and authenticate requests