Cut your AI bill,
not your brain.
A drop-in proxy for OpenAI, Anthropic, and Google. Change one line of code, keep the same model. Mintoken trims ~65% off output tokens and now compresses conversation context too — the cache reads quietly draining 90% of your bill.
Free tier: 100k tokens/month. No credit card.
from openai import OpenAI
client = OpenAI(
+ base_url="https://api.mintoken.dev/v1", api_key="sk-...",
)One line of code. Two axes of savings.
Sign up, get a key
Create a mintoken account (free, no card). Generate an mt_live_ API key. Takes 20 seconds.
mt_live_9c59e9…Change one line
Point your existing OpenAI / Anthropic / Google SDK at mintoken's base URL. Zero refactors. Your request/response format is identical.
base_url="https://api.mintoken.dev/v1"Watch both sides drop
Output tokens shrink ~65% on every response. Flip on context compression and the request side starts dropping too — fewer cache reads on every turn. The dashboard shows exactly what you saved.
Saved $312 this month95% of your AI bill is cache reads.
We're the only proxy attacking that.
Long Claude Code or agent sessions accumulate huge context — file reads, tool results, old turns — that gets re-sent on every message. Anthropic's prompt cache makes those reads cheaper but doesn't reduce the volume. Mintoken does.
“Last month I burned 1.5 billion tokens on Claude Code. 95% of them were cache reads from re-sending the same conversation history every turn.
The prompt cache made it cheap. It didn't make it small. So I built the layer that does both.”
— vijay, founder
Mintoken target: cut that ~50%.
Truncate huge tool results
That 10,000-line file your agent read on turn 4 keeps re-sending forever. We keep the first and last lines, drop the middle, mark the cut clearly so the model knows it happened.
Deduplicate repeated reads
If your agent reads the same file across multiple turns, only the latest copy stays verbatim. Earlier copies become a tiny pointer: ‘see message N’.
Recency window protected
The last 6 turns are sacred. Code, errors, file paths, current work — never compressed. The science (Stanford's ‘Lost in the Middle’) says the model barely attends to the middle anyway.
Toggle it on. Watch your cache reads halve.
Off by default. Three levels: light / standard / aggressive. If we save zero, we forward your original body untouched — the compressor never makes a request worse.
Show me the money.
Drag the slider, pick your workload, see what mintoken would actually save you. We split the savings between output and cache — because they're different problems.
= $2,184/year · ROI 958%
Output ratio: 65% (our public benchmark average). Context compression ratio: ~40% of cache reads on enabled keys (Tier 1 heuristic truncation + dedup; varies by session length and tool-output mix). Mintoken Pro is billed in INR (₹1,599/mo, ~$19) via Razorpay — savings shown in USD for comparison with your provider bill.
Built for people who ship, not talk.
Bootstrapping? Mintoken is your CFO.
Every dollar you spend on OpenAI is a dollar not spent on runway. Cut ~65% on AI costs, day one. Free tier covers your first 100k tokens/month while you find product-market fit.
Not marketing numbers.
Measured ones.
Mintoken runs an open evals harness against a 3-arm baseline (terse prompt vs. skill prompt vs. no prompt). Every release re-measures — the numbers below are the average across our full prompt suite.
Save more than you spend.
Every paid plan pays for itself after ~$60/mo of OpenAI usage. Most teams break even on day one.
Free
Kick the tires. No card needed.
- 100k tokens / month
- 2 API keys
- Basic analytics
- Community support
- OpenAI, Anthropic, Google proxy
Pro
For indie devs and small teams shipping.
- 10M tokens / month
- 10 API keys
- Smart auto-detect compression
- Custom vocabulary rules
- Full analytics dashboard
- Priority email support
Team
Org-wide keys, shared analytics, roles.
- 100M tokens / month
- Unlimited API keys
- Team workspace + SSO
- Per-member usage breakdowns
- Auto-compress input prompts
- Dedicated Slack channel
Enterprise
Self-host, audit logs, SSO/SAML, SLA.
- Unlimited tokens
- SOC2 / GDPR ready
- Self-hosted option
- Custom SLA
- Dedicated success engineer
- Air-gap deployments
All plans include zero-spend protection: hit your limit and requests fail fast — we never auto-charge overage.
Questions, answered.
Still have questions? Email hello.mintoken@gmail.com →
Start saving.
Like, right now.
100k tokens free. No card. 20 seconds to a working API key.