Claude API Cost Calculator

Calculate Anthropic Claude API costs for Claude Sonnet 4, Opus, Haiku, and Claude 3.5 models.

Inputs: model, input tokens per call, output tokens per call, API calls per month

Claude Sonnet 4 — Input: $0.003/1K tokens · Output: $0.015/1K tokens

Per call:              $0.010500
Monthly total:         $10.5000
Per 1M input tokens:   $3.00
Per 1M output tokens:  $15.00

Understanding Claude API pricing

Claude's API is priced per token, the basic unit of text the model processes. A token averages roughly 4 characters or 0.75 words of English text. Pricing has two components: input tokens (the text you send to the model, including system prompts and conversation history) and output tokens (the text the model generates in response). The input rate is lower than the output rate; for Claude Sonnet 4 it is one fifth ($3 versus $15 per 1M tokens).

Cost = (Input tokens × Input price per 1M / 1,000,000) + (Output tokens × Output price per 1M / 1,000,000)

Example: Claude Sonnet 4 ($3 input / $15 output per 1M tokens)
  Send 500 tokens, receive 200 tokens:
  Cost = (500 × $3/1,000,000) + (200 × $15/1,000,000)
       = $0.0015 + $0.0030 = $0.0045 per call
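
The worked example above can be reproduced with a few lines of Python. This is a minimal sketch: the function name cost_per_call and its parameter names are illustrative, and the rates are the Claude Sonnet 4 figures quoted in the example, which may have changed since.

```python
def cost_per_call(input_tokens: int, output_tokens: int,
                  input_price_per_1m: float, output_price_per_1m: float) -> float:
    """Return the estimated cost in USD of a single API call."""
    input_cost = input_tokens * input_price_per_1m / 1_000_000
    output_cost = output_tokens * output_price_per_1m / 1_000_000
    return input_cost + output_cost


# Rates taken from the example: $3 input / $15 output per 1M tokens.
cost = cost_per_call(500, 200, input_price_per_1m=3.00, output_price_per_1m=15.00)
print(f"${cost:.4f} per call")  # $0.0045 per call
```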

Reducing API costs: practical strategies

Choose the right model

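Claude Haiku is several times cheaper than Sonnet per token (roughly 4× for Claude 3.5 Haiku and 12× for Claude 3 Haiku at 2025 list prices; verify at anthropic.com/pricing). Use Haiku for simple classification, routing, and extraction tasks; save Sonnet and Opus for complex reasoning.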

Compress system prompts

System prompts are charged as input tokens on every API call. A 500-token prompt sent 10,000 times/day adds 5 million input tokens per day, about $15/day at Claude Sonnet 4's $3 per 1M input rate. Audit and trim your prompts regularly.
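
To see what a system prompt costs over a day, a rough sketch like the following can help. The function name is hypothetical, the $3 per 1M input rate is the Claude Sonnet 4 figure quoted earlier on this page, and the 300-token trimmed prompt is only an illustrative assumption.

```python
def daily_prompt_overhead(prompt_tokens: int, calls_per_day: int,
                          input_price_per_1m: float) -> tuple[int, float]:
    """Return (tokens per day, USD per day) spent resending a system prompt."""
    tokens_per_day = prompt_tokens * calls_per_day
    usd_per_day = tokens_per_day * input_price_per_1m / 1_000_000
    return tokens_per_day, usd_per_day


# The 500-token prompt sent 10,000 times/day from the text above.
tokens, usd = daily_prompt_overhead(500, 10_000, input_price_per_1m=3.00)
print(f"{tokens:,} tokens/day, ${usd:.2f}/day")  # 5,000,000 tokens/day, $15.00/day

# Illustrative assumption: the same prompt trimmed to 300 tokens.
trimmed_tokens, trimmed_usd = daily_prompt_overhead(300, 10_000, input_price_per_1m=3.00)
print(f"saving ${usd - trimmed_usd:.2f}/day")  # saving $6.00/day
```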

Use prompt caching

Anthropic’s prompt caching feature allows frequently repeated context (system prompts, documents) to be cached — reducing input token costs by up to 90%.
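
A best-case estimate of the saving might look like the sketch below. It assumes every call hits the cache, applies the full 90% reduction to the cached portion, and ignores any surcharge for writing to the cache, so real savings will be lower. The function name, the token counts, and the $3 per 1M input rate are illustrative assumptions.

```python
def input_cost_with_caching(cached_tokens: int, fresh_tokens: int,
                            input_price_per_1m: float,
                            cache_discount: float = 0.90) -> float:
    """Best-case input cost in USD for one call when cached_tokens hit the cache."""
    cached_cost = cached_tokens * input_price_per_1m * (1 - cache_discount) / 1_000_000
    fresh_cost = fresh_tokens * input_price_per_1m / 1_000_000
    return cached_cost + fresh_cost


# Assumed workload: 2,000 tokens of repeated context plus 500 tokens of new input.
without_cache = input_cost_with_caching(0, 2_500, input_price_per_1m=3.00)
with_cache = input_cost_with_caching(2_000, 500, input_price_per_1m=3.00)
print(f"without caching: ${without_cache:.4f}")  # $0.0075
print(f"with caching:    ${with_cache:.4f}")     # $0.0021
```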

Limit output length

Set max_tokens to the minimum needed for your use case. Output cost scales linearly with response length, and max_tokens only truncates a response at the cap, so also instruct the model in the prompt to answer briefly.

Batch processing

The Batch API offers 50% cost reduction for requests that don’t require real-time responses — ideal for data processing, document analysis, and bulk generation.
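
The effect of moving part of a workload to batch processing can be estimated as follows. This is a sketch under stated assumptions: the function name and the batch_share parameter are hypothetical, the 50% discount is the figure quoted above, and the $0.0045 per-call cost comes from the earlier Claude Sonnet 4 example.

```python
def monthly_cost_with_batch(calls_per_month: int, cost_per_call: float,
                            batch_share: float, batch_discount: float = 0.50) -> float:
    """Monthly USD cost when a share of calls is sent through the discounted batch route."""
    batch_calls = calls_per_month * batch_share
    realtime_calls = calls_per_month - batch_calls
    return (realtime_calls * cost_per_call
            + batch_calls * cost_per_call * (1 - batch_discount))


all_realtime = monthly_cost_with_batch(1_000, 0.0045, batch_share=0.0)
mostly_batch = monthly_cost_with_batch(1_000, 0.0045, batch_share=0.6)
print(f"${all_realtime:.2f} vs ${mostly_batch:.2f}")  # $4.50 vs $3.15
```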

Cache results client-side

For identical inputs, cache the API response. A question asked 1,000 times should only be sent to the API once.
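
One way to sketch this is an in-memory cache keyed on a hash of the request. Everything named here is illustrative: make_cached_caller is a hypothetical helper, fake_api stands in for a real API call, and the model string is a placeholder rather than a confirmed model identifier.

```python
import hashlib
import json
from typing import Callable


def make_cached_caller(call_api: Callable[[str, str], str]) -> Callable[[str, str], str]:
    """Wrap call_api(model, prompt) so identical inputs reach the API only once."""
    cache: dict[str, str] = {}

    def cached_call(model: str, prompt: str) -> str:
        # Key on everything that affects the response.
        key = hashlib.sha256(json.dumps([model, prompt]).encode("utf-8")).hexdigest()
        if key not in cache:
            cache[key] = call_api(model, prompt)
        return cache[key]

    return cached_call


# Stand-in for a real API call (hypothetical); records each prompt it receives.
call_log: list[str] = []


def fake_api(model: str, prompt: str) -> str:
    call_log.append(prompt)
    return f"answer to: {prompt}"


ask = make_cached_caller(fake_api)
for _ in range(1_000):
    ask("placeholder-model", "What is a token?")
print(len(call_log))  # 1
```

Reusing a stored response only makes sense when the same answer is acceptable every time; a production cache would also need an expiry policy and persistent storage.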

Token counting rule of thumb

English text:    ~1 token per 4 characters  (~0.75 words)
Code:            ~1 token per 3–4 characters
Non-English:     varies; CJK often 1 char = 1–2 tokens

Estimating tokens:
  Tweet (280 chars)                    ≈ 70 tokens
  Blog post (1,000 words)              ≈ 1,300 tokens
  Book chapter (5,000 words)           ≈ 6,500 tokens
  Claude context window (200K tokens)  ≈ 150,000 words ≈ 300 pages
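
These rules of thumb translate into a quick estimator. The sketch below applies the 4 characters per token and 0.75 words per token ratios from the table; the function names are hypothetical. The results are approximations for English text, and the word-based figures in the table above are rounded versions of them. Exact counts require the provider's own tokenizer.

```python
def estimate_tokens_from_chars(text: str, chars_per_token: float = 4.0) -> int:
    """Approximate token count from character length (English text)."""
    return round(len(text) / chars_per_token)


def estimate_tokens_from_words(word_count: int, words_per_token: float = 0.75) -> int:
    """Approximate token count from a word count (English text)."""
    return round(word_count / words_per_token)


print(estimate_tokens_from_chars("x" * 280))  # 70
print(estimate_tokens_from_words(1_000))      # 1333
print(estimate_tokens_from_words(5_000))      # 6667
```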

Frequently asked questions

How is Claude API pricing calculated?

You are charged per token for both input (your prompt) and output (the response), with different rates per model. This tool estimates cost from your token counts.

What is a token?

A token is a chunk of text — roughly 4 characters or 0.75 words of English. Both your prompt and the model's reply are measured in tokens.

Why are output tokens more expensive than input?

Generating text is more computationally demanding than reading it, so most providers price output tokens higher than input tokens.

How can I reduce my Claude API costs?

Trim unnecessary context from prompts, cap the output length, choose a smaller model where it suffices, and reuse cached context when supported.

Formula / How it works

Cost = (Input tokens / 1000 × input rate) + (Output tokens / 1000 × output rate)

Rates are per 1K tokens (as of 2025). Verify at anthropic.com/pricing as prices may change.
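
As a sketch, the per-1K formula and the monthly total can be computed as below. The function name is hypothetical. The inputs (1,000 input tokens, 500 output tokens, 1,000 calls per month) are an assumption, chosen because with the Claude Sonnet 4 rates of $0.003 and $0.015 per 1K tokens they reproduce the $0.010500 per call and $10.5000 monthly figures shown at the top of this page.

```python
def monthly_cost(input_tokens: int, output_tokens: int, calls_per_month: int,
                 input_rate_per_1k: float, output_rate_per_1k: float) -> tuple[float, float]:
    """Return (cost per call, cost per month) in USD using per-1K-token rates."""
    per_call = (input_tokens / 1000 * input_rate_per_1k
                + output_tokens / 1000 * output_rate_per_1k)
    return per_call, per_call * calls_per_month


per_call, monthly = monthly_cost(1_000, 500, 1_000,
                                 input_rate_per_1k=0.003, output_rate_per_1k=0.015)
print(f"Per call:      ${per_call:.6f}")  # $0.010500
print(f"Monthly total: ${monthly:.4f}")   # $10.5000
```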