About
Together AI is a cloud platform for running open-source models through serverless, batch, dedicated, and dedicated container inference. It also offers accelerated compute via GPU clusters, sandbox environments, managed storage, and fine-tuning and evaluation tools for customizing models.
The pricing page shows usage-based pricing rather than a free plan:
- Serverless inference: priced per 1M tokens
- Dedicated inference: from $3.99 per hour for 1x H100 80GB
- GPU clusters: from $3.49 per hour
- Sandbox compute: billed per vCPU and GiB of RAM
- Managed storage: $0.16 per GiB/month
- Fine-tuning: priced per 1M tokens, with model-specific minimum charges
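A rough sketch of how these usage-based rates compose into a monthly bill. The hourly ($3.99 for 1x H100 80GB) and storage ($0.16 per GiB/month) figures are the ones quoted above; the per-1M-token rate is a hypothetical placeholder, since serverless rates vary by model.

```python
def monthly_cost(
    tokens: int,                   # total tokens processed this month
    rate_per_1m_tokens: float,     # hypothetical, varies by model
    dedicated_hours: float = 0.0,  # 1x H100 80GB at $3.99/hour
    storage_gib: float = 0.0,      # managed storage at $0.16/GiB/month
) -> float:
    """Estimate a monthly bill from Together AI's listed usage rates."""
    inference = tokens / 1_000_000 * rate_per_1m_tokens
    dedicated = dedicated_hours * 3.99
    storage = storage_gib * 0.16
    return round(inference + dedicated + storage, 2)

# 50M tokens at a hypothetical $0.60/1M, 10 dedicated hours, 100 GiB stored:
print(monthly_cost(50_000_000, 0.60, dedicated_hours=10, storage_gib=100))  # → 85.9
```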
- Serverless inference APIs
- Batch inference workloads
- Dedicated GPU endpoints
- GPU clusters on demand
- Sandbox development environments
- Managed storage for model data
- Fine-tuning and evaluations
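To make the serverless inference item above concrete, here is a minimal sketch of building a chat completion request, assuming Together AI's OpenAI-compatible endpoint; the endpoint path and model name are illustrative assumptions, not verified values.

```python
import json

# Assumed OpenAI-compatible chat completions endpoint.
API_URL = "https://api.together.xyz/v1/chat/completions"

def build_request(model: str, prompt: str, max_tokens: int = 128) -> str:
    """Serialize a chat completion payload. Send it with any HTTP client,
    adding an Authorization: Bearer <API key> header."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return json.dumps(payload)

# Model name is a placeholder; pick any model listed on the platform.
body = build_request("meta-llama/Llama-3-8b-chat-hf", "Hello!")
print(json.loads(body)["messages"][0]["role"])  # → user
```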
What's included in the free tier
Details aren't itemized for this entry yet; check the pricing page below for the latest.
See Together AI pricing for current limits.