Track Claude API costs per agent — with hard budget caps Anthropic doesn't give you.
PromptCost is a drop-in proxy for the Anthropic API. Swap api.anthropic.com for api.promptcost.io/anthropic, add two headers, and every Claude call gets tagged: cost, input/output tokens, latency, model — broken down by agent. Plus the one feature Anthropic doesn't offer: hard caps that block at 429.
Free forever · No card · Indie plan $9/mo for first 50 users
The Anthropic-specific gap
Anthropic's billing UI gives you usage alerts and a Usage API. What it doesn't give you is the thing teams actually want: a hard cap that stops requests the moment the budget is hit. There's no equivalent of "fail at $X." If a Claude-powered agent goes into a loop, you get an alert email while the bill keeps growing.
PromptCost adds that layer. The proxy maintains a per-agent spend counter, and once an agent's monthly cap is hit, the request is rejected with a 429 — Anthropic never sees it, and you don't pay for it.
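The cap check can be pictured with a small sketch. This is illustrative only, not PromptCost's actual implementation — the `BudgetGuard` name and its methods are invented for this example:

```python
# Illustrative sketch of a per-agent spend counter with a hard cap.
# admit() returning False corresponds to the proxy answering 429
# without forwarding the request to Anthropic.
from dataclasses import dataclass, field

@dataclass
class BudgetGuard:
    caps_usd: dict                                  # agent name -> monthly cap in USD
    spent_usd: dict = field(default_factory=dict)   # running spend per agent

    def admit(self, agent: str) -> bool:
        """False once an agent's cap is exhausted (-> HTTP 429)."""
        cap = self.caps_usd.get(agent)
        if cap is None:
            return True                             # no cap configured
        return self.spent_usd.get(agent, 0.0) < cap

    def record(self, agent: str, cost_usd: float) -> None:
        self.spent_usd[agent] = self.spent_usd.get(agent, 0.0) + cost_usd

guard = BudgetGuard(caps_usd={"lead-scorer": 5.00})
guard.record("lead-scorer", 4.99)
print(guard.admit("lead-scorer"))   # True — still under the $5 cap
guard.record("lead-scorer", 0.02)
print(guard.admit("lead-scorer"))   # False — proxy would return 429
```

The key property is that the counter lives in the proxy, so a looping agent is cut off before the request ever reaches Anthropic.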
Setup in 60 seconds
Get a PromptCost key
Sign up free at admin.promptcost.io, create a workspace, and generate an sk-pc- key.
Swap the endpoint
# Before
POST https://api.anthropic.com/v1/messages

# After
POST https://api.promptcost.io/anthropic/v1/messages
Add the headers
x-api-key: sk-ant-••••••••••    # your Anthropic key
anthropic-version: 2023-06-01
cg-key: sk-pc-••••3f9a          # your PromptCost key
cg-agent: "lead-scorer"         # your agent name
Content-Type: application/json
The body is identical to Anthropic's API. Same JSON shape, same model names, same parameters.
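For the raw-HTTP path, a minimal Python sketch of the request (the model id and prompt are placeholders; only the URL and the two cg-* headers differ from a direct Anthropic call):

```python
import json

url = "https://api.promptcost.io/anthropic/v1/messages"
headers = {
    "x-api-key": "sk-ant-...",            # your Anthropic key
    "anthropic-version": "2023-06-01",
    "cg-key": "sk-pc-...",                # your PromptCost key
    "cg-agent": "lead-scorer",            # tag for the dashboard breakdown
    "content-type": "application/json",
}
body = {                                  # unchanged Anthropic schema
    "model": "claude-haiku-4-5",          # placeholder model id
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello, Claude"}],
}
payload = json.dumps(body)
# send with any HTTP client, e.g. the stdlib:
# urllib.request.Request(url, payload.encode(), headers)
```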
Use the Anthropic SDK if you want
from anthropic import Anthropic
client = Anthropic(
base_url="https://api.promptcost.io/anthropic",
api_key="sk-ant-••••••••••",
default_headers={
"cg-key": "sk-pc-••••3f9a",
"cg-agent": "lead-scorer",
},
)
Supported models
- Claude Haiku 4.5 — cheapest, fastest
- Claude Sonnet 4.6 — balanced (the workhorse for most agents)
- Claude Opus 4.7 — most capable
- Plus Claude 3.5 Haiku, Claude 3.5 Sonnet, Claude 3 Opus and other supported versions
Pricing tables are updated automatically when Anthropic changes rates.
What you get
- Per-agent cost breakdown — tag any request with a name and see it grouped on the dashboard.
- Hard budget cap — set a USD limit per agent. PromptCost returns 429 before forwarding to Anthropic. You don't pay for blocked calls.
- Tool use tracked — function calls and tool definitions are counted in token usage.
- Streaming responses work natively — the SSE stream passes through untouched; usage is logged once the stream completes.
- Prompt caching cost — cache reads and writes are priced separately, exactly as Anthropic bills them.
- Zero key storage — your Anthropic key passes through as a header.
- ~5–15ms overhead — async logging never blocks the response path.
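To illustrate the streaming point above: Anthropic's SSE format carries input_tokens in the opening message_start event and the final output_tokens in message_delta, which is why complete usage can only be logged after the stream ends. A simplified stdlib sketch (real streams carry more event types and fields):

```python
# Recover token usage from a completed Anthropic SSE stream.
# input_tokens arrive in message_start; output_tokens in message_delta.
import json

def usage_from_sse(events: list[str]) -> dict:
    usage = {}
    for raw in events:
        if not raw.startswith("data: "):
            continue
        event = json.loads(raw[len("data: "):])
        if event["type"] == "message_start":
            usage["input_tokens"] = event["message"]["usage"]["input_tokens"]
        elif event["type"] == "message_delta":
            usage["output_tokens"] = event["usage"]["output_tokens"]
    return usage

sample = [
    'data: {"type": "message_start", "message": {"usage": {"input_tokens": 21}}}',
    'data: {"type": "content_block_delta"}',
    'data: {"type": "message_delta", "usage": {"output_tokens": 84}}',
]
print(usage_from_sse(sample))
```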
FAQ
Is prompt caching tracked correctly?
Yes. Anthropic returns separate cache_read_input_tokens and cache_creation_input_tokens; PromptCost prices them at the correct discounted/premium rates.
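To make the arithmetic concrete, here is a sketch with illustrative rates. Anthropic's real prices vary by model; the 10% read and 25%-premium write multipliers reflect Anthropic's published cache pricing at the time of writing, but treat the numbers as assumptions, not PromptCost's live pricing table:

```python
# Illustrative per-million-token rates — NOT live pricing.
BASE_INPUT = 3.00                   # $/MTok, assumed base input rate
CACHE_READ = BASE_INPUT * 0.10      # discounted cache-read rate
CACHE_WRITE = BASE_INPUT * 1.25     # premium cache-write rate

def input_cost_usd(usage: dict) -> float:
    """Price the three input-token buckets Anthropic reports separately."""
    return (
        usage.get("input_tokens", 0) * BASE_INPUT
        + usage.get("cache_read_input_tokens", 0) * CACHE_READ
        + usage.get("cache_creation_input_tokens", 0) * CACHE_WRITE
    ) / 1_000_000

cost = input_cost_usd({
    "input_tokens": 400,
    "cache_read_input_tokens": 9_000,
    "cache_creation_input_tokens": 600,
})
print(cost)   # ≈ 0.00615
```

Note how a request that is mostly cache reads costs far less than its raw token count suggests — which is exactly why the three buckets must be priced separately.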
Does this work with the Messages Batches API?
Yes — batch requests are tracked at Anthropic's discounted batch rate.
Tool use / function calling?
Tracked. Both the tool schemas (input tokens) and the tool call arguments (output tokens) count toward usage.
What about Vision (image input)?
Image tokens are counted exactly as Anthropic bills them.
Does the proxy support extended thinking / reasoning models?
Yes. Reasoning tokens are billed at the model's output rate, same as Anthropic.