DeepInfra hosts a large number of the most popular machine learning models. You can find the full list at deepinfra.com/models, conveniently split into categories by functionality. We are constantly adding more: DeepInfra is usually among the first to host a new model once it is available, and offers the best prices for open-source model inference.

Model pages

Each model has a dedicated page where you can:
  • Try it out interactively
  • See its API documentation
  • Grab ready-to-use code examples

Private models

We also support deploying custom models on DeepInfra infrastructure. Run your own fine-tuned or trained-from-scratch LLM on dedicated A100/H100/H200/B200/B300 GPUs.

Specifying model versions

Some models have more than one version available. You can infer against a particular version using the {"model": "MODEL_NAME:VERSION", ...} format. You can also infer against a deployment directly using {"model": "deploy_id:DEPLOY_ID", ...}. This is especially useful for Custom LLMs: you can start inferring before the deployment finishes, and before you have the model name + version pair.
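As a sketch, both forms only change the "model" field of the request body; everything else stays the same. The model name, version, and deploy ID below are placeholder values, and the request would be sent with your DeepInfra API token in the Authorization header:

```python
# Build chat-completion request bodies that pin a model version or
# target a deployment by its deploy ID. Values below are placeholders.
import json

def build_payload(model: str, prompt: str) -> dict:
    """Build a minimal chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Pin a specific version with the MODEL_NAME:VERSION format.
versioned = build_payload("meta-llama/Meta-Llama-3-8B-Instruct:SOME_VERSION", "Hello")

# Target a deployment directly with the deploy_id:DEPLOY_ID format,
# e.g. while a Custom LLM deployment is still in progress.
by_deploy = build_payload("deploy_id:MY_DEPLOY_ID", "Hello")

# You would POST either body (json.dumps(...)) to the inference
# endpoint with an "Authorization: Bearer <token>" header.
print(json.dumps(versioned))
```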

Model deprecation

Because the AI field moves quickly, newer and better models are released every day. Occasionally we have to deprecate older models to maintain quality and affordability. When a model is deprecated:
  • You’ll receive at least 1 week’s advance notice before the deprecation date
  • Your applications won’t break — after deprecation, inference requests are automatically forwarded to a recommended replacement model
  • Recent users of the model receive an email notification that includes the deprecation date
You can browse the current list of available models at deepinfra.com/models.

Suggest a model

If you think there is a model that we should run, let us know at info@deepinfra.com. We read every email.