Mintoken CLI
A local-proxy CLI that sits between your AI coding tool — Claude Code, Codex, Cursor, Continue.dev, your own LangChain agent — and the upstream LLM. Dedupes repeated tool results, truncates the boring middle of giant logs, saves 40-70% of input tokens on long agent sessions. Same model output, no SDK changes, one env var swap.
Unlike the cloud proxy at api.mintoken.dev, the CLI runs on localhost. Your Claude Code OAuth token (or any API key) is forwarded upstream unchanged and never leaves your machine. Compression happens in your own process, in memory, in milliseconds.

Why use the CLI vs the cloud proxy
Both offer the same compression brain. The difference is where the auth token lives:
- Cloud proxy (api.mintoken.dev) — best for apps where you control the auth (your own backend, your own SaaS). The API key is your `mt_live_…` and you supply the upstream provider key per-request via the `X-Provider-Key` header.
- Local CLI (this page) — best for tools that authenticate via OAuth or store credentials locally (Claude Code, Codex, Cursor). The proxy on localhost forwards their existing auth header unchanged. No new keys, no token-handling concerns.
Install
# Install from PyPI (when published)
pip install mintoken-cli
# Or install from source while we're shipping fast
git clone https://github.com/Vijay-2005/mintoken
cd mintoken/mintoken-cli
pip install -e .

Quickstart — Claude Code + Codex side by side
One CLI, one port, both providers. Run it once and point both tools at it.
# (optional) save your mintoken key so the dashboard tracks savings
mintoken login --key mt_live_xxxxx
# start the local proxy
mintoken proxy
# in another shell, point Claude Code at it
export ANTHROPIC_BASE_URL="http://127.0.0.1:8788"
claude
# or point Codex at it
export OPENAI_BASE_URL="http://127.0.0.1:8788/v1"
codex

On first start you'll see an orange-bordered banner with the listening URL, your selected compression level, and the masked mintoken key. Both providers are routed on a single port (default 8788): Claude Code hits /v1/messages, Codex hits /v1/chat/completions or /v1/responses, and the proxy routes each request to the right upstream.
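The path-based routing described above can be sketched in a few lines. This is illustrative only, not the CLI's actual source — the paths come from the docs, the dispatch logic is an assumption:

```python
# Illustrative sketch of path-based routing — not mintoken's real code.
ANTHROPIC_UPSTREAM = "https://api.anthropic.com"
OPENAI_UPSTREAM = "https://api.openai.com"

def route(path: str) -> str:
    """Map an incoming request path to the upstream provider base URL."""
    if path.startswith("/v1/messages"):
        return ANTHROPIC_UPSTREAM  # Claude Code traffic
    if path.startswith(("/v1/chat/completions", "/v1/responses")):
        return OPENAI_UPSTREAM     # Codex / OpenAI SDK traffic
    raise ValueError(f"unrecognized path: {path}")

print(route("/v1/messages"))          # → https://api.anthropic.com
print(route("/v1/chat/completions"))  # → https://api.openai.com
```

Because routing is keyed on the request path, both tools can share one port without any per-tool configuration beyond the base URL.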
How it works
Every request your tool makes goes localhost-first:
Claude Code / Codex / your app
│
│ BASE_URL → localhost
▼
┌─────────────────────────┐
│ mintoken-cli │
│ on localhost │
│ │
│ • dedupes tool calls │
│ • truncates logs │
│ • reports stats * │
└────────────┬────────────┘
│
│ forwards with same auth header
▼
api.anthropic.com / api.openai.com
* optional → api.mintoken.dev for dashboard analytics

Compression runs before the upstream provider sees the request. Anthropic and OpenAI bill you for the smaller body — which directly translates to fewer tokens consumed against your Claude Code 5-hour window or your OpenAI per-token bill.
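The "truncates logs" step above — keeping the informative head and tail of an oversized tool result and dropping the boring middle — can be sketched like this. A minimal illustration, not mintoken's real algorithm; the function name and sizes are assumptions:

```python
# Sketch of "truncate the boring middle" — illustrative only.
def truncate_middle(text: str, max_chars: int = 2000, keep: int = 500) -> str:
    """Keep the head and tail of an oversized blob; replace the middle."""
    if len(text) <= max_chars:
        return text  # small results pass through untouched
    omitted = len(text) - 2 * keep
    marker = f"\n…[{omitted} chars truncated]…\n"
    return text[:keep] + marker + text[-keep:]

log = "x" * 10_000
short = truncate_middle(log)
print(len(short))  # far smaller than the original 10,000 chars
```

The model still sees how the tool call started and how it ended — usually the parts that matter — while the bulk of a giant log never reaches the upstream provider.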
Compression levels
Pass --level to mintoken proxy or set it in ~/.mintoken/config.json.
- light — engages above 8,000 input tokens. Truncates tool results larger than 4,000 tokens. Last 8 turns untouched. Conservative, almost zero risk.
- standard (default) — engages above 4,000 input tokens. Truncates tool results larger than 2,000 tokens. Last 6 turns untouched. The production sweet spot.
- aggressive — engages above 2,000 input tokens. Truncates tool results larger than 1,000 tokens. Last 4 turns untouched. Maximum savings; may compress content the model might still want.
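The three levels above can be read as a small table of thresholds. The numbers come straight from the docs; the field names and helper are illustrative assumptions:

```python
# Per-level thresholds from the docs, expressed as data (field names assumed).
LEVELS = {
    "light":      {"engage_tokens": 8000, "truncate_tokens": 4000, "recency_turns": 8},
    "standard":   {"engage_tokens": 4000, "truncate_tokens": 2000, "recency_turns": 6},
    "aggressive": {"engage_tokens": 2000, "truncate_tokens": 1000, "recency_turns": 4},
}

def should_compress(input_tokens: int, level: str = "standard") -> bool:
    """Below the engage threshold the proxy is a transparent passthrough."""
    return input_tokens > LEVELS[level]["engage_tokens"]

print(should_compress(3000))                # False — under the standard threshold
print(should_compress(3000, "aggressive"))  # True — aggressive engages at 2,000
```

Each step down the list engages earlier, truncates more, and protects fewer recent turns — which is why aggressive saves the most but carries the most risk.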
Verify it's working
The proxy attaches three headers to every response it returns, so you can see compression engage on each call without leaving your terminal:
HTTP/1.1 200 OK
content-type: application/json
x-mintoken-cli-tokens-before: 9027
x-mintoken-cli-tokens-after: 2947
x-mintoken-cli-tokens-saved: 6080

Or open your dashboard — savings flow there in real time once you've set --key. CLI traffic shows up under endpoint labels like cli-messages and cli-chat-completions.
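The header values are simple arithmetic you can sanity-check yourself — saved is before minus after, and the ratio gives the per-request savings percentage:

```python
# Checking the numbers from the example headers above.
before, after = 9027, 2947
saved = before - after
print(saved)                    # 6080, matching x-mintoken-cli-tokens-saved
print(f"{saved / before:.0%}")  # ≈ 67% of input tokens removed on this call
```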
Use it with your own SDK code
Any code that talks to OpenAI or Anthropic via their official SDK can use the CLI proxy by changing one line:
from openai import OpenAI
client = OpenAI(
base_url="http://127.0.0.1:8788/v1", # ← local proxy
api_key="sk-proj-...", # ← your real OpenAI key
)
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "hello"}],
)

The proxy preserves the request and response shape exactly — your app behaves identically, just with a smaller bill.
What never gets touched
- The system prompt
- The most recent N messages (recency window — protected per level)
- Mid-flight tool_use/tool_result pairs (no orphan calls)
- Conversations under the per-level minimum threshold
If compression somehow produces a bigger body than the original (rare edge case where the truncation marker overhead exceeds savings), the proxy returns the original untouched. Compression can never increase what you send upstream.
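That size guard is worth making concrete. A minimal sketch of the idea — illustrative, not the proxy's actual code:

```python
# Sketch of the "never send a bigger body" guard described above.
def safe_compress(body: str, compress) -> str:
    """Apply a compressor, but fall back to the original if it didn't shrink."""
    compressed = compress(body)
    return compressed if len(compressed) < len(body) else body

# Degenerate compressor whose marker overhead exceeds its savings:
bloating = lambda s: s + "\n…[0 chars truncated]…"
original = "short body"
print(safe_compress(original, bloating) == original)  # True — original wins
```

The comparison happens after compression, so even a pathological input can only ever pass through unchanged, never grow.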
Privacy
- Your auth token (OAuth or API key) is forwarded upstream unchanged through localhost. It never leaves your machine and is never inspected by the proxy.
- Request bodies are compressed in memory. Prompts, code, file contents — never logged, never written to disk.
- The only thing sent to api.mintoken.dev is a small metric record per request: model name, token counts, duration, compression result. No content. No headers. No prompts.
- If you don't set a mintoken key, no telemetry is sent at all — fully local mode.
CLI reference
mintoken proxy
Start the local HTTP proxy.
- `--port` / `-p` — local port. Default `8788`.
- `--host` — bind address. Default `127.0.0.1`. Don't expose to `0.0.0.0` on a shared network.
- `--key` — mintoken API key. Reads from the `MINTOKEN_KEY` env var or `~/.mintoken/config.json` if unset. Optional — without it the proxy runs in fully-local mode.
- `--level` / `-l` — `light` · `standard` · `aggressive`. Default `standard`.
- `--anthropic-upstream` — override the Anthropic base URL (e.g. point to a different region or self-hosted relay). Default `https://api.anthropic.com`.
- `--openai-upstream` — same for OpenAI. Default `https://api.openai.com`.
mintoken login
Save your mintoken key to ~/.mintoken/config.json so you don't have to pass --key on every run. Pass --key mt_live_… non-interactively, or omit it for an interactive prompt.
mintoken status
Print current config (key prefix, level, upstream URLs). Useful to confirm which key the proxy will use before starting it.
mintoken version
Print the installed CLI version.
Environment variables
- `MINTOKEN_KEY` — your `mt_live_…` key. Used for telemetry only.
- `MINTOKEN_TELEMETRY_URL` — override the telemetry endpoint (used in self-hosted deployments). Default: `https://api.mintoken.dev/v1/cli-telemetry`.
- `ANTHROPIC_BASE_URL` — set this to `http://127.0.0.1:8788` to route Claude Code through the proxy.
- `OPENAI_BASE_URL` — set this to `http://127.0.0.1:8788/v1` to route Codex / OpenAI clients.
Troubleshooting
My tool says "authentication failed"
The proxy forwards your auth header unchanged — if the upstream provider rejects it, that's an upstream problem, not a proxy problem. Confirm by setting ANTHROPIC_BASE_URL back to its default and retrying. If your tool works without the proxy but fails with it, open an issue with the response status code from the headers.
My dashboard shows no CLI savings
- Did you pass --key or run mintoken login? Without a key, telemetry stays local.
- Is the dashboard hitting the right account? CLI telemetry is tagged to the user_id behind the mt_live_… key.
- Are your conversations big enough to trigger compression? Below the per-level minimum the proxy is a transparent passthrough.
Port 8788 is already in use
Pass --port 9000 (or any free port) and update your tool's BASE_URL env var to match. The proxy doesn't need a privileged port.
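If you'd rather not guess at a free port, you can ask the OS for one — plain Python, nothing mintoken-specific:

```python
# Bind to port 0 and the OS assigns a free ephemeral port.
import socket

def free_port() -> int:
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

port = free_port()
print(port)  # then: mintoken proxy --port <that number>
```

Note the small race: the port is released when the function returns, so grab it and start the proxy promptly.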
Open source
The CLI is MIT-licensed and lives in the same monorepo as the cloud proxy at github.com/Vijay-2005/mintoken under mintoken-cli/. PRs and issues welcome.