Track OpenAI API costs per agent — without the runaway-bill anxiety.
PromptCost is a drop-in proxy for the OpenAI API. Swap api.openai.com for api.promptcost.io/openai, add two headers, and every call is tagged with cost, tokens, latency, and model, broken down by agent. Hard budget caps return 429 before a request is ever forwarded.
Free forever · No card · Indie plan $9/mo for first 50 users
Why OpenAI's own dashboard isn't enough
OpenAI gives you a usage dashboard and a Usage API. Both are useful, but both have real limits when you're running multiple agents:
- The dashboard lags. By the time a runaway loop shows up, the damage is done. Several developers have reported $30K–$72K incidents where the spike registered hours after the fact.
- "Hard limits" aren't hard. OpenAI removed real-time hard caps in 2024 in favor of alerts and delayed enforcement. Your "monthly limit" emails you. It does not block requests in real time.
- No native per-agent attribution. If twelve agents share one API key, the dashboard shows you one total, not who spent what.
Setup in 60 seconds
Get a PromptCost key
Sign up free at admin.promptcost.io, create a workspace, and generate an sk-pc- key.
Swap the endpoint
# Before
POST https://api.openai.com/v1/chat/completions

# After
POST https://api.promptcost.io/openai/v1/chat/completions
All paths under /v1/* are proxied as-is — chat completions, embeddings, audio, batch, etc.
Add the headers
Authorization: Bearer sk-••••••••••   # your OpenAI key
cg-key: sk-pc-••••3f9a                # your PromptCost key
cg-agent: lead-scorer                 # your agent name
Content-Type: application/json
The body is identical to OpenAI's API. Nothing else changes.
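If you prefer not to pull in an SDK, a minimal raw-HTTP sketch using only the Python standard library looks like this. The key values and the payload are placeholders; substitute your own.

```python
import json
import urllib.request

# Placeholder credentials -- substitute your real keys.
OPENAI_KEY = "sk-your-openai-key"
PROMPTCOST_KEY = "sk-pc-your-promptcost-key"


def build_request(agent: str, payload: dict) -> urllib.request.Request:
    """Build a chat-completions request routed through the PromptCost proxy."""
    return urllib.request.Request(
        "https://api.promptcost.io/openai/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {OPENAI_KEY}",
            "cg-key": PROMPTCOST_KEY,       # attributes the spend to your workspace
            "cg-agent": agent,              # per-agent cost breakdown
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("lead-scorer", {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello"}],
})
# urllib.request.urlopen(req) would send it; omitted to keep the sketch offline.
```

The request body is exactly what you would send to api.openai.com; only the host, path prefix, and the two cg-* headers differ.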
Use the OpenAI SDK if you want
from openai import OpenAI
client = OpenAI(
base_url="https://api.promptcost.io/openai/v1",
api_key="sk-••••••••••",
default_headers={
"cg-key": "sk-pc-••••3f9a",
"cg-agent": "lead-scorer",
},
)
Works with the official Python and Node SDKs out of the box — both expose base_url and custom headers.
Supported models
All current OpenAI models on the Chat Completions, Responses, Embeddings, and Audio endpoints are supported. Pricing tables are updated automatically when OpenAI changes rates.
- Chat: GPT-4o, GPT-4o mini, GPT-4 Turbo, o1, o1-mini, o3, o3-mini
- Embeddings: text-embedding-3-large, text-embedding-3-small
- Audio: whisper-1, gpt-4o-mini-transcribe, gpt-4o-mini-tts
- Batch API: all of the above at half cost
What you get
- Per-agent cost breakdown — tag any request with a name and see it grouped on the dashboard.
- Hard budget cap — set a USD limit per agent. PromptCost returns 429 before forwarding. You don't pay for blocked calls.
- Streaming responses work natively — SSE passes through; usage logs after stream completion.
- Function/tool calls tracked — input + output tokens both counted; tool definitions included in cost.
- Zero key storage — your OpenAI key passes through as a header.
- ~5–15ms overhead — async logging never blocks the response path.
FAQ
Does this work with Assistants API / threads?
Yes. The proxy supports the Assistants and Responses APIs. Each run is logged as one cost-tracked event.
What about the Batch API?
Yes — batch requests are tracked at the discounted batch rate.
Function calls and tool use?
Tracked. The proxy logs input tokens (including the tool schemas) and output tokens (including tool call arguments).
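To make the accounting concrete, here is an illustrative sketch of how a per-call cost falls out of a response's usage block. The per-million-token rates below are placeholder numbers, not PromptCost's live pricing tables, and the half-price batch discount mirrors the Batch API note above.

```python
# Illustrative only: placeholder rates, not live OpenAI pricing.
# PromptCost computes this server-side from maintained pricing tables.
RATES_PER_TOKEN = {
    "example-model": {"input": 0.15 / 1_000_000, "output": 0.60 / 1_000_000},
}


def estimate_cost(model: str, usage: dict, batch: bool = False) -> float:
    """Estimate USD cost from a response's usage block.

    Tool schemas count toward prompt_tokens and tool-call arguments toward
    completion_tokens, so both sides of a function call are priced.
    """
    r = RATES_PER_TOKEN[model]
    cost = (usage["prompt_tokens"] * r["input"]
            + usage["completion_tokens"] * r["output"])
    return cost / 2 if batch else cost  # Batch API runs at half rate


cost = estimate_cost(
    "example-model", {"prompt_tokens": 1_000, "completion_tokens": 1_000}
)
```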
Will OpenAI see this as a different IP?
Yes — requests reach OpenAI from PromptCost's infrastructure. If you use IP-restricted keys, allowlist api.promptcost.io's outbound range (provided in the dashboard).