
Choose the right AI model
across 22 providers
before you go into production.

Private by design · runs entirely in your browser

Just a taster — mid tier only. The full calculator does every tier, batch & cache savings, replies and your real word counts →

Compare your real workload across 22 leading AI providers and understand cost, quality, speed, context limits and the real trade-offs before you deploy. No sign-up. No prompts stored. No pasted text sent anywhere.

Pick any of 22 providers to compare

What you're actually paying for

You: Wait, what even is a token?

6 words / 8 tokens / $0.000024 ↓ prompt

—

0 words / 0 tokens / $0.00 ↑ reply

total this exchange / 86 tokens / $0.001194

Under two thousandths of a dollar, or roughly a tenth of a cent. But at a million messages a day, that is $1,194 every day. Scale is everything, which is what the tool below shows.
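The figures above are consistent with an input rate of $3 and an output rate of $15 per million tokens, with 78 of the 86 tokens in the reply. Those rates and the 78-token split are inferred from the displayed totals, not stated on the page, so treat this as a minimal sketch of the arithmetic:

```python
def exchange_cost(prompt_tokens, reply_tokens, input_rate, output_rate):
    """Dollar cost of one exchange; rates are dollars per million tokens."""
    return (prompt_tokens * input_rate + reply_tokens * output_rate) / 1_000_000

# Assumed values, inferred from the totals shown above (not published rates).
INPUT_RATE = 3.0    # $ per 1M input tokens
OUTPUT_RATE = 15.0  # $ per 1M output tokens

prompt_only = exchange_cost(8, 0, INPUT_RATE, OUTPUT_RATE)
full_exchange = exchange_cost(8, 78, INPUT_RATE, OUTPUT_RATE)

print(f"prompt:           ${prompt_only:.6f}")
print(f"exchange:         ${full_exchange:.6f}")
print(f"1M exchanges/day: ${full_exchange * 1_000_000:,.0f}")
```

The reply accounts for almost all of the cost here because it has more tokens and each one is billed at the higher output rate.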

Tap a content type live

📧

Email ~300 words / ~400 tokens

— ↓ input

— ↑ output

— batch 50% off

Three pricing modes to compare

Standard

Full rate

Batch API

50% off

Cache hit

90% off input

Compare every provider: one fast table when you want the full market →

Current focus

Pricing Anthropic against the market — switch provider below or start with your workload.

22 providers tracked

Nightly price checks

500× rate spread shown

Every reply re-reads the entire conversation

—

Context re-read (grows)

Output written (fixed)

Now try it with your own content

Type a word count, paste any text, or pick a content type. Drag the conversation slider. Toggle Batch and Cache. Everything you just saw — but live, with your numbers.

—

Common questions

What does it actually cost to run an AI API call?

It depends on the model and how much text goes in and out. TokenScale shows real examples. Writing the whole of The Hobbit costs about $0.06 on Gemini Flash-Lite, one of the cheapest tiers tracked. The same job costs far more on a frontier model. Pick a content size and a provider to see the live figure.

Why does the output price matter more than the input price?

Most providers charge more for the tokens a model generates than for the tokens you send. Output often costs three to five times more than input. A long answer to a short question can cost more than it looks. TokenScale splits every price into input and output so the gap is visible.

How much do AI prices differ between providers?

A lot. Across the 22 providers TokenScale tracks, input rates span roughly 500× — from $0.02 to $10 per million tokens — and even models aimed at similar work are routinely more than thirty times apart. Choosing the right provider for a task can cut the bill by an order of magnitude. The comparison table sorts every provider cheapest first.

What is the cheapest way to run a large AI workload?

Three things lower the bill. Pick a low-cost provider, use Batch mode where it is offered for fifty percent off, and reuse cached context for up to ninety percent off input. TokenScale lets you toggle Standard, Batch and Cache to see each saving. Open-weight models on inference hosts are usually the cheapest of all.
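A rough sketch of how the three modes could change a bill. The rates and token volumes are hypothetical, Batch is assumed to halve both input and output, and Cache is modelled as the best case where every input token is a cache hit at 90% off:

```python
# Multipliers applied to (input, output) cost in each pricing mode.
# Assumption: Batch discounts both sides; Cache discounts input only.
MODE_MULTIPLIERS = {
    "standard": (1.0, 1.0),
    "batch": (0.5, 0.5),
    "cache": (0.1, 1.0),
}

def workload_cost(input_tokens, output_tokens, input_rate, output_rate, mode="standard"):
    """Dollar cost of a workload; rates are dollars per million tokens."""
    in_mult, out_mult = MODE_MULTIPLIERS[mode]
    input_cost = input_tokens * input_rate * in_mult
    output_cost = output_tokens * output_rate * out_mult
    return (input_cost + output_cost) / 1_000_000

# Hypothetical workload: 1M input tokens, 200K output tokens at $3 / $15 per 1M.
costs = {
    mode: workload_cost(1_000_000, 200_000, 3.0, 15.0, mode)
    for mode in MODE_MULTIPLIERS
}
for mode, cost in costs.items():
    print(f"{mode:>8}: ${cost:.2f}")
```

In this example the cache saving is smaller than the headline 90% because the output half of the bill is untouched, which is why the ratio of input to output matters when choosing a mode.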

Is TokenScale free?

Yes. TokenScale is free, with no sign-up and no account. It runs entirely in your browser. Nothing you type is sent anywhere or stored.

How often is the pricing updated?

Every night. An automated check reads each provider's pricing and records it, building a verified price history you can scroll back through. The latest verified date is shown on the site.

4.1 ★★★★☆ AI Critics

Energy lens — set a word count above

—

of 100W light

Query

—

vs one kettle boil

Kettle Index

—

@ 1M users/day

Carbon

—

gCO₂e per query

Estimated from token count, split ~80% input / 20% output. Output decode uses ~3–5× more energy per token than input prefill (Luccioni et al. 2023). Output-heavy tasks (code generation, long-form writing) may use up to 2× more than shown.
Energy per 1K tokens, in / out: Lite ~0.0001 / 0.0004 Wh · Mid ~0.0003 / 0.0012 Wh · Pro ~0.0008 / 0.0035 Wh.
MoE models (DeepSeek, some open-hosted) are adjusted to ~30% of the dense equivalent (DeepSeek-V3 report 2024). The range is an illustrative band, not a confidence interval; it reflects hardware generation, PUE 1.1–1.6, and batch size (IEA 2024). Carbon uses a provider-specific grid mix (IEA 2023): DeepSeek ~580, US providers ~380, global fallback 436 gCO₂/kWh. The idea started while watching a kettle boil: one litre takes ~0.1 kWh, enough for 30–300 queries like this.
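The note above can be turned into a small estimator. This is only a sketch: the coefficients and units are copied exactly as printed, and the way they are combined (80/20 token split, flat MoE factor, grid lookup) is an assumption about the method, not the tool's actual code:

```python
# Wh per 1K tokens as (input, output), copied from the note above.
ENERGY_WH_PER_1K = {
    "lite": (0.0001, 0.0004),
    "mid": (0.0003, 0.0012),
    "pro": (0.0008, 0.0035),
}
# Grid carbon intensity in gCO2 per kWh, copied from the note above.
GRID_G_PER_KWH = {"deepseek": 580, "us": 380, "global": 436}

def query_energy_wh(total_tokens, tier="mid", moe=False):
    """Estimated energy for one query, in watt-hours."""
    in_rate, out_rate = ENERGY_WH_PER_1K[tier]
    input_tokens = 0.8 * total_tokens   # assumed 80% input
    output_tokens = 0.2 * total_tokens  # assumed 20% output
    wh = (input_tokens * in_rate + output_tokens * out_rate) / 1000
    return wh * 0.3 if moe else wh      # MoE adjusted to ~30% of dense

def query_carbon_g(energy_wh, grid="global"):
    """Estimated grams of CO2 for one query."""
    return energy_wh / 1000 * GRID_G_PER_KWH[grid]

energy = query_energy_wh(1000, tier="mid")
print(f"energy: {energy:.6f} Wh")
print(f"carbon: {query_carbon_g(energy):.6f} g")
```

With these coefficients the output side contributes as much energy as the input side in the mid tier, even though it is only a fifth of the tokens.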

type tokens · set word count above · paste text below

⚡ Live Token Counter ⚡ paste or type to count tokens & cost

⚠ Exceeds 200K limit — must chunk across multiple API calls.

you're on

—

total cost

—

tap a column header to switch tiers & see your cost change

① SMALL

Lite · fastest · cheapest

↓

② MID

Mid · balanced · default

↓

③ FLAGSHIP

Pro · flagship · powerful

↓

① ② ③ three tiers · tap a column to switch

—

↓ input

—

↑ output

—

= total

—

↓ input

—

↑ output

—

= total

—

↓ input

—

↑ output

—

= total

—

One model, every tier — this provider has a single model, so all three pricing tiers are identical.

💡

Switch mode to see your savings

Batch API cuts 50% · Cache hit cuts 90% on input

26 Jul · No price moves overnight · 22 providers verified · History →

TokenScale

Market

22 providers. Ranked by rate.

— lowest rate

—× most vs least

3 providers compared

Every rate is the provider’s own published price, per million tokens, re-checked nightly. Small · Mid · Flagship = each provider’s small, mid & large model — the actual model name is always shown.

Tier Provider
Provider · Model	Rate /1M ↑	Cost · vs you

Standard rates · switch mode below

Prices apply current pricing mode · your provider highlighted · drag slider to see costs

Tier
Provider · Model	Rate /1M	Cost · vs you

Standard rates · switch mode below

Fixed A→Z order · rows stay in place as you switch tiers

Tier

tap a provider to compare

step lines · list price per 1M tokens · hover a line to spotlight it · click it for the provider chart · verified nightly

📖 The Novel Index · checked every night

What it costs to make an AI write a whole novel

The lowest price we find across all 22 providers to generate every one of the 95,356 words in The Hobbit — re-priced each night from each provider's own published rates.

A plain yardstick for how far the price of AI writing has fallen.
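One way such an index could be computed, as a hedged sketch. The 4/3 tokens-per-word ratio is taken from the page's own "~300 words / ~400 tokens" example; the host names and output rates below are hypothetical placeholders, not any provider's published prices:

```python
HOBBIT_WORDS = 95_356

def words_to_tokens(words):
    """Approximate token count, assuming ~4 tokens per 3 words."""
    return round(words * 4 / 3)

def generation_cost(words, output_rate):
    """Dollar cost to generate the text; rate is dollars per 1M output tokens."""
    return words_to_tokens(words) * output_rate / 1_000_000

# Hypothetical output rates in dollars per 1M tokens.
output_rates = {"host_a": 0.40, "host_b": 0.10, "host_c": 2.50}

novel_costs = {name: generation_cost(HOBBIT_WORDS, rate)
               for name, rate in output_rates.items()}
cheapest = min(novel_costs, key=novel_costs.get)

print(f"tokens: {words_to_tokens(HOBBIT_WORDS):,}")
print(f"lowest: {cheapest} at ${novel_costs[cheapest]:.4f}")
```

Only the output rate is used because the index prices text the model writes, not text it reads.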

Today's lowest

$0.0038

See how we get this →

The Manager's Notepad

July 2026 · Notable price changes

📋 See the full price-change log →

July opened with a reckoning: Claude Fable 5 returned — to Anthropic's price list at $10/$50, and to this board, re-auditing all 66 tiers in one sweep. 40 exact, 26 fixed, every miss published.

Jul 1 Anthropic Pro (Fable 5 returns — retakes the slot; Opus 4.8 stays listed) $5→$10 in · $25→$50 out it's back

Jul 1 xAI Grok Lite (4.1 Fast retired — old slugs bill 6×) $0.20→$1.00 in · $0.50→$2.00 out model swap

Jul 1 Mistral AI Flagship (Large 3 — now cheaper than Medium) $2.00→$0.50 in · $6.00→$1.50 out model swap

Jul 1 Audit ×22 Phantoms out: K2.6 Turbo · M2.7 Pro · Behemoth 40/66 exact · 24 tiers fixed audit

Verified nightly · 1 July 2026 · Bilton Projects

👆 Tap any row to explore the full price chart

✓ Verified monthly by Bilton Projects · June 2026

⚡ Compare providers: 💻 Shift+click rows · 📱 hold ½ sec — build a list, export CSV

pick something familiar

Costs compound — watch every reply re-read everything Drag the slider · see context dominate · understand your bill ▾

each reply re-reads everything · costs compound fast

THE COMPOUNDING EFFECT

Every reply re-reads
the entire conversation

Each time the AI responds, it processes all previous messages from scratch, not just your latest question. By reply 5, you have paid for 15 context re-reads in total (1 + 2 + 3 + 4 + 5). By reply 10, context dominates your bill.
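The count of 15 follows from a simple model in which reply k re-reads k equal-sized chunks of context. A minimal sketch under that assumption, with hypothetical token sizes:

```python
def cumulative_rereads(replies):
    """Total context chunks re-read after a number of replies: 1 + 2 + ... + n."""
    return sum(range(1, replies + 1))

def conversation_tokens(replies, context_tokens_per_turn, output_tokens_per_reply):
    """Return (total input tokens, total output tokens) for a conversation."""
    input_total = context_tokens_per_turn * cumulative_rereads(replies)
    output_total = output_tokens_per_reply * replies
    return input_total, output_total

print(cumulative_rereads(5))    # 15
print(cumulative_rereads(10))   # 55

# Hypothetical sizes: each turn adds 400 tokens of context, each reply writes 400.
input_total, output_total = conversation_tokens(10, 400, 400)
print(input_total, output_total)  # 22000 4000
```

Input grows with the square of the conversation length while output grows linearly, so context ends up dominating long conversations even when each message is short.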

↓ drag the reply slider to watch costs stack in real time

context re-read (grows) output written (fixed)

Set your scenario

Replies in conversation Reply 1 of 10

reply +10 +100

↓ Context re-read each reply = the growing blue bar

words ×2 +1K

↑ Output written per reply = the red bar

words ×2 +250

Model

Blue = context re-read (grows) Red = output written (fixed)

— in : out ratio —

per-call cost × volume = real budget · see the scale

PRODUCTION REALITY CHECK

Fractions of a cent become
real budget at scale

A $0.0001 per-call cost looks free. At 1,000 calls/day it's $3/month — manageable. At 100K calls/day it's $300/month. Set your actual volume and discover where AI fits in your budget.
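The same arithmetic as a short sketch. The function name and return shape are illustrative only; the 30-day month and 365-day year match the labels used by the calculator:

```python
def production_budget(cost_per_call, calls_per_day):
    """Scale a per-call dollar cost to daily, monthly and yearly totals."""
    per_day = cost_per_call * calls_per_day
    return {
        "per_day": per_day,
        "per_month": per_day * 30,   # 30-day month
        "per_year": per_day * 365,   # 365-day year
    }

small = production_budget(0.0001, 1_000)
large = production_budget(0.0001, 100_000)

print(f"1K calls/day:   ${small['per_month']:.2f}/month")
print(f"100K calls/day: ${large['per_month']:.2f}/month")
print(f"100K calls/day: ${large['per_year']:,.2f}/year")
```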

↓ drag the sliders to reveal your production number

API calls per day production volume

1,000 calls +10K +100K

Output words per call avg. reply length

300 words +250 +1000

Per call

—

enter a word count

Per day

—

Per month

—

30 days

Per year

—

365 days

Everything you need to understand AI model pricing, providers and how to choose.
