Track OpenAI and Claude costs from Python — without installing another SDK.
PromptCost works with the official openai and anthropic Python SDKs natively. No extra package. No decorators. No instrumentation. Just set base_url and pass two headers — the same per-agent cost tracking and hard budget caps you'd get from any HTTP client.
Free forever · No card · Indie plan $9/mo for first 50 users
Why a proxy beats SDK instrumentation
Most LLM observability tools in Python ask you to install a package, wrap your client, or sprinkle decorators. They work, but they couple your code to the tracker. If you switch tools later, you're rewriting code.
PromptCost is just an HTTP endpoint. Your code stays vanilla — the official OpenAI and Anthropic SDKs you already use. Switch to or away from PromptCost by changing base_url, nothing else.
OpenAI (Python SDK)
Configure the client once
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.promptcost.io/openai/v1",
    api_key="sk-••••••••••",  # your OpenAI key
    default_headers={
        "cg-key": "sk-pc-••••3f9a",
        "cg-agent": "lead-scorer",
    },
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
```
Every call from this client is automatically tagged to the lead-scorer agent. Switch agents per-call by passing extra_headers={"cg-agent": "..."} to create().
Anthropic (Python SDK)
Same pattern
```python
from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.promptcost.io/anthropic",
    api_key="sk-ant-••••••••••",
    default_headers={
        "cg-key": "sk-pc-••••3f9a",
        "cg-agent": "email-drafter",
    },
)

response = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=1000,
    messages=[{"role": "user", "content": "Draft an email"}],
)
```
Per-call agent tagging
If you have multiple agents in one process, pass the agent name per request:
```python
# OpenAI
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[...],
    extra_headers={"cg-agent": "summarizer"},
)

# Anthropic
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1000,
    messages=[...],
    extra_headers={"cg-agent": "summarizer"},
)
```
Async clients
Both AsyncOpenAI and AsyncAnthropic work identically:
```python
from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="https://api.promptcost.io/openai/v1",
    api_key="sk-••••••••••",
    default_headers={"cg-key": "sk-pc-...", "cg-agent": "my-agent"},
)

async def run():
    return await client.chat.completions.create(...)
```
Streaming
SSE streaming passes through unchanged. Token usage is logged when the stream completes:
```python
with client.messages.stream(
    model="claude-haiku-4-5",
    max_tokens=1000,
    messages=[...],
) as stream:
    for chunk in stream.text_stream:
        print(chunk, end="")
```
Handling 429 (budget exceeded)
When an agent hits its monthly cap, PromptCost returns a 429 with a structured body. The SDKs surface this as RateLimitError — handle it like any rate limit, except this one means "you ran out of budget":
```python
from openai import RateLimitError

try:
    response = client.chat.completions.create(...)
except RateLimitError as e:
    # e.body can be None on some errors, so guard before .get()
    body = e.body if isinstance(e.body, dict) else {}
    if body.get("error") == "budget_exceeded":
        # your agent hit its monthly cap
        notify_slack(body["agent"], body["message"])
    else:
        raise
```
What you get
- Per-agent cost breakdown in the dashboard — every Python agent on its own row.
- Hard budget cap per agent — set monthly USD cap, get a 429 when exceeded.
- Full request log — model, tokens in/out, cost, latency, timestamp.
- Works with sync, async, streaming, tool use, vision and prompt caching — anything the official SDKs support.
- Zero extra dependencies — no pip install promptcost. Just openai or anthropic.
FAQ
Does this work with LangChain / LlamaIndex?
Yes — they use the underlying SDKs. Set base_url and headers on the client you pass in. For LangChain's ChatOpenAI, use openai_api_base and default_headers.
Does this work with Instructor / Pydantic-based clients?
Yes. Instructor wraps the OpenAI client; configure that client with the proxy base_url and headers, then pass it to Instructor.
What about Azure OpenAI?
Azure is on the roadmap but not yet supported. For now PromptCost proxies OpenAI's direct API and Anthropic's direct API.