LIVETokens saved by mintoken users:14,328,507and counting
65-75% fewer output tokens. Same model. Same accuracy.NewContext compression — cut cache token reads

Cut your AI bill,
not your brain.

A drop-in proxy for OpenAI, Anthropic, and Google. Change one line of code, keep the same model. Mintoken trims ~65% off output tokens and now compresses conversation context too — the cache reads quietly draining 90% of your bill.

Free tier: 100k tokens/month. No credit card.

live compression
0% tokens
mintoken
the−1
database
is−1
a−1
connection
pool
that−1
is−1
used−1
to−1
borrow
and−1
return
connections
kept
databaseconnectionpoolborrowreturnconnections
14 tokens in 5 tokens out
1Without Mintoken
847 tokens

With Mintoken
tokens
Waiting to compress…
same meaning, fewer tokens0% saved
app.py
from openai import OpenAI

client = OpenAI(
    + base_url="https://api.mintoken.dev/v1",    api_key="sk-...",
)
That's the whole integration.-65% tokens →
OpenAIAnthropic ClaudeGoogle GeminiDeepSeekMistralxAI GrokGroqTogether AIPerplexityOpenRouterFireworks AICerebrasSambaNovaNVIDIA NIMHyperbolicDeepInfraCustom · BYOStreaming SSEFunction callingVisionJSON modeOpenAIAnthropic ClaudeGoogle GeminiDeepSeekMistralxAI GrokGroqTogether AIPerplexityOpenRouterFireworks AICerebrasSambaNovaNVIDIA NIMHyperbolicDeepInfraCustom · BYOStreaming SSEFunction callingVisionJSON mode
Three steps

One line of code. Two axes of savings.

01

Sign up, get a key

Create a mintoken account (free, no card). Generate an mt_live_ API key. Takes 20 seconds.

mt_live_9c59e9…
02

Change one line

Point your existing OpenAI / Anthropic / Google SDK at mintoken's base URL. Zero refactors. Your request/response format is identical.

base_url="https://api.mintoken.dev/v1"
03

Watch both sides drop

Output tokens shrink ~65% on every response. Flip on context compression and the request side starts dropping too — fewer cache reads on every turn. The dashboard shows exactly what you saved.

Saved $312 this month
New · just shippedContext compression

95% of your AI bill is cache reads.
We're the only proxy attacking that.

Long Claude Code or agent sessions accumulate huge context — file reads, tool results, old turns — that gets re-sent on every message. Anthropic's prompt cache makes those reads cheaper but doesn't reduce the volume. Mintoken does.

The receipt that made us build this

“Last month I burned 1.5 billion tokens on Claude Code. 95% of them were cache reads from re-sending the same conversation history every turn.

The prompt cache made it cheap. It didn't make it small. So I built the layer that does both.”

— vijay, founder

Last month
Total tokens1.5B
Cache reads95%
Bill from cache alone~$427
Anthropic cache-read pricing: $0.30/MTok.
Mintoken target: cut that ~50%.

Truncate huge tool results

That 10,000-line file your agent read on turn 4 keeps re-sending forever. We keep the first and last lines, drop the middle, mark the cut clearly so the model knows it happened.

−30-50%
on tool-output-heavy sessions

Deduplicate repeated reads

If your agent reads the same file across multiple turns, only the latest copy stays verbatim. Earlier copies become a tiny pointer: ‘see message N’.

O(1)
instead of O(N) for repeated files

Recency window protected

The last 6 turns are sacred. Code, errors, file paths, current work — never compressed. The science (Stanford's ‘Lost in the Middle’) says the model barely attends to the middle anyway.

0%
accuracy loss measured
Opt-in per API key

Toggle it on. Watch your cache reads halve.

Off by default. Three levels: light / standard / aggressive. If we save zero, we forward your original body untouched — the compressor never makes a request worse.

How it works
Pocket calculator

Show me the money.

Drag the slider, pick your workload, see what mintoken would actually save you. We split the savings between output and cache — because they're different problems.

500/mo
$50$5k$20k
Context compression new
Cuts cache reads on long sessions
After Mintoken
New API bill$299
Output compression saves$81
Context compression saves$120
Total savings (gross)$201
Mintoken Pro plan– $19
₹1,599 INR/mo, billed via Razorpay
Net monthly savings$182

= $2,184/year · ROI 958%

Output ratio: 65% (our public benchmark average). Context compression ratio: ~40% of cache reads on enabled keys (Tier 1 heuristic truncation + dedup; varies by session length and tool-output mix). Mintoken Pro is billed in INR (₹1,599/mo, ~$19) via Razorpay — savings shown in USD for comparison with your provider bill.

Built for people who ship, not talk.

Bootstrapping? Mintoken is your CFO.

Every dollar you spend on OpenAI is a dollar not spent on runway. Cut ~65% on AI costs, day one. Free tier covers your first 100k tokens/month while you find product-market fit.

Fit Metric
Runway extended 3-5x
founders already shipping
Real benchmarks. Public data.

Not marketing numbers.
Measured ones.

Mintoken runs an open evals harness against a 3-arm baseline (terse prompt vs. skill prompt vs. no prompt). Every release re-measures — the numbers below are the average across our full prompt suite.

Explanations
24%
Code walkthroughs
31%
Bug reports
28%
API docs
35%
Step-by-step how-tos
30%
65%
Avg. token reduction
94%
Best case (explanations)
<2s
P50 added latency
100%
Semantic accuracy retained
Pricing

Save more than you spend.

Every paid plan pays for itself after ~$60/mo of OpenAI usage. Most teams break even on day one.

Free

Kick the tires. No card needed.

$0forever
  • 100k tokens / month
  • 2 API keys
  • Basic analytics
  • Community support
  • OpenAI, Anthropic, Google proxy
Start free
MOST POPULAR

Pro

For indie devs and small teams shipping.

$19/ month
Billed in USD via Paddle (VAT / sales tax handled)
  • 10M tokens / month
  • 10 API keys
  • Smart auto-detect compression
  • Custom vocabulary rules
  • Full analytics dashboard
  • Priority email support
Start 14-day trial

Team

Org-wide keys, shared analytics, roles.

$49/ month
Billed in USD via Paddle (VAT / sales tax handled)
  • 100M tokens / month
  • Unlimited API keys
  • Team workspace + SSO
  • Per-member usage breakdowns
  • Auto-compress input prompts
  • Dedicated Slack channel
Start trial

Enterprise

Self-host, audit logs, SSO/SAML, SLA.

Custom
  • Unlimited tokens
  • SOC2 / GDPR ready
  • Self-hosted option
  • Custom SLA
  • Dedicated success engineer
  • Air-gap deployments
Talk to us

All plans include zero-spend protection: hit your limit and requests fail fast — we never auto-charge overage.

Questions, answered.

No. Mintoken is a transparent proxy that speaks the native OpenAI, Anthropic, and Google APIs. Your request/response JSON is identical — you change only the base URL. We run the full test suite of each provider's SDK against mintoken.

Still have questions? Email hello.mintoken@gmail.com →

Start saving.
Like, right now.

100k tokens free. No card. 20 seconds to a working API key.

✓ OpenAI compatible✓ Anthropic compatible✓ Google compatible✓ Streaming works