About
Deep Infra is an AI inference and machine learning infrastructure platform that provides API access to hosted models, GPU instances, and dedicated clusters. The service exposes model endpoints across text generation, embeddings, speech, image, and video categories, with pricing based on per-token, per-minute, per-image, or instance-hour usage depending on the workload.
The pricing page lists usage-based billing with no free allowance for the core hosted inference service. It also covers dedicated custom LLM deployments on A100, H100, H200, B200, and B300 GPUs, billed at per-minute granularity, plus dedicated instances and clusters available through sales contact.
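The billing math above reduces to simple unit arithmetic. The sketch below illustrates how per-token and per-minute charges compose; the rates used are hypothetical placeholders, not Deep Infra's published prices.

```python
# Usage-based billing sketch. All rates below are hypothetical examples,
# not actual Deep Infra pricing.

def token_cost(input_tokens, output_tokens, in_rate_per_m, out_rate_per_m):
    """Cost in USD for a per-token-billed text model.

    Rates are expressed per 1M tokens, the convention for token pricing.
    """
    return input_tokens / 1e6 * in_rate_per_m + output_tokens / 1e6 * out_rate_per_m

def dedicated_gpu_cost(minutes, rate_per_hour):
    """Per-minute-granularity billing for a dedicated GPU deployment."""
    return minutes / 60 * rate_per_hour

# 2M input + 0.5M output tokens at hypothetical $0.10 / $0.40 per 1M tokens
print(round(token_cost(2_000_000, 500_000, 0.10, 0.40), 4))   # 0.4

# 90 minutes on a dedicated GPU at a hypothetical $2.40/hour
print(round(dedicated_gpu_cost(90, 2.40), 2))                 # 3.6
```

The same pattern extends to per-image or instance-hour workloads: each is a unit count multiplied by a unit rate.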
- Hosted model APIs across multiple modalities
- Per-token, per-minute, per-image, and instance-hour billing
- GPU instances and dedicated clusters
- 256k to 1M token contexts on select models
- SOC 2 and ISO 27001 certified
- US-based data centers
- 200 concurrent request limit
What's included in the free tier
There is no free allowance for the core hosted inference service; see Deep Infra pricing for current limits.