About
Fireworks AI is a cloud platform for running, tuning, and deploying generative AI models. It provides serverless inference, on-demand GPU deployments, and fine-tuning workflows for open models, with model access exposed through an API and developer tooling.
The pricing page lists $1 in free credits for serverless inference, along with per-token pricing for text and vision models, per-audio-minute pricing for speech-to-text, per-step or per-image pricing for image generation, and per-GPU-hour pricing for on-demand deployments. The platform is aimed at developers building code assistants, chatbots, RAG systems, multimodal apps, and other model-backed applications.
- Serverless inference with postpaid billing
- $1 in free credits at signup
- Per-token text and vision pricing
- Speech-to-text billed per audio minute
- Image generation billed per step or image
- On-demand GPU deployments billed per GPU hour
- Fine-tuning and reinforcement tuning support
- Open model library with API access
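Because each product line is billed in a different unit, it can help to put a planned workload into one estimate. The following is a minimal sketch, not an official calculator. The `Rates` and `Usage` types, the function names, and every rate in the example are hypothetical placeholders rather than Fireworks AI prices, and the sketch assumes image generation is billed per step. Substitute the figures from the pricing page before relying on the result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Unit prices in USD. All values passed in are supplied by the caller."""
    usd_per_million_tokens: float
    usd_per_audio_minute: float
    usd_per_image_step: float
    usd_per_gpu_hour: float


@dataclass(frozen=True)
class Usage:
    """Planned usage, one field per billing unit."""
    tokens: int = 0
    audio_seconds: float = 0.0
    image_steps: int = 0
    gpu_hours: float = 0.0


def estimate_cost(usage: Usage, rates: Rates) -> float:
    """Sum the cost of each billing unit into one USD estimate."""
    return (
        usage.tokens / 1_000_000 * rates.usd_per_million_tokens
        + usage.audio_seconds / 60 * rates.usd_per_audio_minute
        + usage.image_steps * rates.usd_per_image_step
        + usage.gpu_hours * rates.usd_per_gpu_hour
    )


def remaining_credit(credit_usd: float, usage: Usage, rates: Rates) -> float:
    """Credit left after the estimated usage, never below zero."""
    return max(0.0, credit_usd - estimate_cost(usage, rates))


if __name__ == "__main__":
    # HYPOTHETICAL rates for illustration only; not actual Fireworks AI pricing.
    example_rates = Rates(
        usd_per_million_tokens=0.50,
        usd_per_audio_minute=0.006,
        usd_per_image_step=0.001,
        usd_per_gpu_hour=4.00,
    )
    planned = Usage(tokens=1_000_000, audio_seconds=600, image_steps=30)
    print(f"Estimated cost: ${estimate_cost(planned, example_rates):.2f}")
    print(f"Credit left from $1: ${remaining_credit(1.00, planned, example_rates):.2f}")
```

The `gpu_hours` field is included so the same function covers on-demand deployments. Whether free credits apply to anything beyond serverless inference is not stated here, so treat `remaining_credit` as a rough planning aid only.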
What's included in the free tier
- Access to serverless inference with $1 in free credits.
- Use of text and vision models within the free credits.
- Use of speech-to-text models within the free credits.
- Use of image generation models within the free credits.
- Use of embeddings models within the free credits.
See Fireworks AI pricing for current limits.