About

Fireworks AI is a cloud platform for running, tuning, and deploying generative AI models. It provides serverless inference, on-demand GPU deployments, and fine-tuning workflows for open models, with model access exposed through an API and developer tooling.

The pricing page lists $1 in free credits for serverless inference, along with per-token pricing for text and vision models, per-second pricing for speech-to-text, per-step and per-image pricing for image generation, and per-GPU-hour pricing for on-demand deployments. The platform is aimed at developers building code assistants, chatbots, RAG systems, multimodal apps, and other model-backed applications.

  • Serverless inference with postpaid billing
  • $1 in free credits at signup
  • Per-token text and vision pricing
  • Speech-to-text priced per audio minute, metered per second
  • Image generation billed per step or image
  • On-demand GPU deployments billed per GPU-hour
  • Fine-tuning and reinforcement tuning support
  • Open model library with API access

What's included in the free tier

  • Access to serverless inference with $1 in free credits.
  • Use of text and vision models within the free credits.
  • Use of speech-to-text models within the free credits.
  • Use of image generation models within the free credits.
  • Use of embedding models within the free credits.
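
To get a rough sense of how far the $1 credit goes under per-token pricing, the arithmetic can be sketched as below. This is a minimal illustration, not an official calculator: the function name `tokens_covered` is hypothetical, and the rate of $0.20 per million tokens is a placeholder chosen for the example, not a price quoted from the Fireworks AI pricing page. Substitute the rate listed for the model you plan to use.

```python
from decimal import Decimal

# Free credit amount stated in the listing above.
FREE_CREDIT_USD = Decimal("1.00")

# ASSUMPTION: placeholder rate in USD per 1M tokens, for illustration only.
# It is not a quoted Fireworks AI price; check the pricing page for real rates.
PLACEHOLDER_USD_PER_MILLION_TOKENS = Decimal("0.20")


def tokens_covered(credit_usd: Decimal, usd_per_million_tokens: Decimal) -> int:
    """Return the number of whole tokens a credit balance covers at a flat rate."""
    if usd_per_million_tokens <= 0:
        raise ValueError("rate must be positive")
    # credit / (rate per token), where rate per token = rate per million / 1,000,000
    return int(credit_usd * 1_000_000 / usd_per_million_tokens)


if __name__ == "__main__":
    covered = tokens_covered(FREE_CREDIT_USD, PLACEHOLDER_USD_PER_MILLION_TOKENS)
    print(f"${FREE_CREDIT_USD} covers about {covered:,} tokens at the placeholder rate")
```

The sketch uses `Decimal` rather than floats so that small currency amounts do not pick up rounding noise. It assumes a single flat rate; if a model is priced differently for input and output tokens, each would need its own calculation.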