DeepInfra exposes an Anthropic-compatible Messages API. This means tools that target the Anthropic API — Claude Code, the Anthropic Python and TypeScript SDKs, and any framework with an Anthropic adapter — can point at DeepInfra and use open-source models.

Endpoint

https://api.deepinfra.com/anthropic
Two endpoints are available:
  • POST /anthropic/v1/messages: create a message (chat completion)
  • POST /anthropic/v1/messages/count_tokens: count tokens for a message request

Authentication

Both standard Anthropic authentication methods are supported:
  • Authorization: Bearer $DEEPINFRA_TOKEN
  • x-api-key: $DEEPINFRA_TOKEN
You can also pass anthropic-version and anthropic-beta headers as needed.
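If you are calling the endpoint with a plain HTTP client rather than the SDK, the two header forms look like this. A minimal sketch: the fallback token string is a placeholder so the snippet runs without a real key, and the anthropic-version value is an example.

```python
import os

# Read the token from the environment; the fallback is a placeholder only.
token = os.environ.get("DEEPINFRA_TOKEN", "di-example-token")

# Either header form authenticates the same request:
bearer_headers = {"Authorization": f"Bearer {token}"}
api_key_headers = {"x-api-key": token}

# Optional Anthropic headers pass through as-is:
full_headers = {
    **api_key_headers,
    "anthropic-version": "2023-06-01",
    "Content-Type": "application/json",
}
```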

Using the Anthropic SDK

pip install anthropic
import os

import anthropic

client = anthropic.Anthropic(
    base_url="https://api.deepinfra.com/anthropic",
    api_key=os.environ["DEEPINFRA_TOKEN"],
)

message = client.messages.create(
    model="deepseek-ai/DeepSeek-V3",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)

print(message.content[0].text)

Using with Claude Code

Claude Code can use DeepInfra as its backend. To keep your normal Claude Code setup untouched, add a dedicated shell function to your ~/.bashrc or ~/.zshrc:
deepinfra() {
  export ANTHROPIC_BASE_URL=https://api.deepinfra.com/anthropic
  export ANTHROPIC_AUTH_TOKEN=$DEEPINFRA_TOKEN
  export ANTHROPIC_MODEL=deepseek-ai/DeepSeek-V3.1-Terminus
  export ANTHROPIC_DEFAULT_HAIKU_MODEL=Qwen/Qwen3-30B-A3B
  export CLAUDE_CODE_MAX_OUTPUT_TOKENS=16384
  claude "$@"
}
Then run deepinfra instead of claude to launch Claude Code via DeepInfra. Your regular claude command stays unchanged.

Model override environment variables

Claude Code uses model aliases (opus, sonnet, haiku) internally. You can remap each alias to a DeepInfra model using these environment variables:
  • ANTHROPIC_MODEL: the primary model Claude Code uses for all tasks (e.g. deepseek-ai/DeepSeek-V3.1-Terminus)
  • ANTHROPIC_DEFAULT_OPUS_MODEL: model used for the opus alias (complex reasoning), e.g. deepseek-ai/DeepSeek-R1
  • ANTHROPIC_DEFAULT_SONNET_MODEL: model used for the sonnet alias (daily coding), e.g. deepseek-ai/DeepSeek-V3.1-Terminus
  • ANTHROPIC_DEFAULT_HAIKU_MODEL: model used for the haiku alias and background tasks (tab completions, commit messages), e.g. Qwen/Qwen3-30B-A3B
  • CLAUDE_CODE_SUBAGENT_MODEL: model used for subagents (parallel background tasks), e.g. Qwen/Qwen3-30B-A3B
A more complete example with all overrides:
deepinfra() {
  export ANTHROPIC_BASE_URL=https://api.deepinfra.com/anthropic
  export ANTHROPIC_AUTH_TOKEN=$DEEPINFRA_TOKEN
  export ANTHROPIC_MODEL=deepseek-ai/DeepSeek-V3.1-Terminus
  export ANTHROPIC_DEFAULT_OPUS_MODEL=deepseek-ai/DeepSeek-R1
  export ANTHROPIC_DEFAULT_SONNET_MODEL=deepseek-ai/DeepSeek-V3.1-Terminus
  export ANTHROPIC_DEFAULT_HAIKU_MODEL=Qwen/Qwen3-30B-A3B
  export CLAUDE_CODE_SUBAGENT_MODEL=Qwen/Qwen3-30B-A3B
  export CLAUDE_CODE_MAX_OUTPUT_TOKENS=16384
  claude "$@"
}
ANTHROPIC_DEFAULT_HAIKU_MODEL is used for lightweight background tasks like tab completions and commit messages. Pick a fast, cheap model here to keep costs low. The older ANTHROPIC_SMALL_FAST_MODEL variable is deprecated — use ANTHROPIC_DEFAULT_HAIKU_MODEL instead.

Streaming

Streaming works the same as the Anthropic API: pass stream=True (Python) or "stream": true (JSON/cURL), or use the SDK's streaming helper:
with client.messages.stream(
    model="deepseek-ai/DeepSeek-V3",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a short poem about open source."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
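If you consume the raw server-sent event stream instead of the SDK helper, visible text arrives in content_block_delta events carrying a text_delta. A minimal parsing sketch; the sample lines below are illustrative of the event shape, not captured output:

```python
import json

def extract_text_deltas(sse_lines):
    """Pull text fragments out of Anthropic-style SSE `data:` lines.

    Only content_block_delta events with a text_delta carry visible text;
    other event types (message_start, ping, message_stop, ...) are skipped.
    """
    texts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        event = json.loads(line[len("data: "):])
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                texts.append(delta["text"])
    return texts

# Sample lines in the shape the Messages API streams:
sample = [
    'event: content_block_delta',
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Open "}}',
    'event: content_block_delta',
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "source."}}',
    'event: message_stop',
    'data: {"type": "message_stop"}',
]

print("".join(extract_text_deltas(sample)))  # prints "Open source."
```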

Token counting

Count the tokens in a message request before sending it:
curl "https://api.deepinfra.com/anthropic/v1/messages/count_tokens" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $DEEPINFRA_TOKEN" \
  -d '{
      "model": "deepseek-ai/DeepSeek-V3",
      "messages": [
        {
          "role": "user",
          "content": "Hello, how are you?"
        }
      ]
    }'
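The same request from Python, as a sketch that only prepares the request pieces without sending anything (the helper name and the placeholder fallback token are illustrative):

```python
import json
import os

def build_count_tokens_request(model, messages, token=None):
    """Prepare the URL, headers, and JSON body for the count_tokens endpoint.

    Returns pieces ready to hand to any HTTP client; no request is sent.
    """
    token = token or os.environ.get("DEEPINFRA_TOKEN", "di-example-token")
    url = "https://api.deepinfra.com/anthropic/v1/messages/count_tokens"
    headers = {"Content-Type": "application/json", "x-api-key": token}
    body = json.dumps({"model": model, "messages": messages})
    return url, headers, body

url, headers, body = build_count_tokens_request(
    "deepseek-ai/DeepSeek-V3",
    [{"role": "user", "content": "Hello, how are you?"}],
)
```

The Anthropic Python SDK also exposes this endpoint directly as client.messages.count_tokens(model=..., messages=...), using the client configured earlier.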

Notes

  • You are running open-source models via the Anthropic protocol, not Anthropic’s Claude models.
  • Model names use DeepInfra identifiers (e.g. deepseek-ai/DeepSeek-V3), not Anthropic model names.
  • Some Anthropic-specific features may not be supported. Standard message creation, streaming, and token counting work as expected.