API reference

Text compression

POST /v1/compress takes a blob of text and returns a compressed version plus a validation report. Useful for pre-processing long system prompts, RAG chunks, or any static content you'll reuse many times.

When to use this

The proxy endpoints compress output. This endpoint compresses input. Reach for it when you have:

A long system prompt that ships with every request — compress once, cache the result, save tokens on every call.
RAG chunks retrieved before sending to the LLM — trim filler before they consume your context window.
Documentation or reference material you're feeding the model as context.
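The "compress once, cache the result" pattern can be sketched as a small in-memory cache keyed by a hash of the original text. This is a minimal illustration, not part of the Mintoken API: the CompressionCache class and the compress_fn callable are hypothetical names, and compress_fn stands in for whatever function you use to call this endpoint.

```python
import hashlib


class CompressionCache:
    """In-memory cache keyed by the SHA-256 hash of the original text.

    Hypothetical helper for illustration; not part of the Mintoken API.
    """

    def __init__(self, compress_fn):
        # compress_fn: any callable that maps original text -> compressed text,
        # for example a thin wrapper around POST /v1/compress.
        self._compress_fn = compress_fn
        self._store = {}

    def get(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            # First time this exact text is seen: pay for one compression call.
            self._store[key] = self._compress_fn(text)
        return self._store[key]
```

Because the key is derived from the content, editing the system prompt produces a new key and triggers a fresh compression. For a multi-process deployment you would likely swap the dict for a shared store.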

Not for conversational input

Don't compress user-generated chat messages — you lose nuance and intent. This endpoint is for your own static content where you control the original.

Request

curl https://api.mintoken.dev/v1/compress \
  -H "Authorization: Bearer mt_live_xxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "When you are designing a REST API, you need to think carefully...",
    "run_validation": true
  }'

Parameters

text (required) — the Markdown or plain text to compress.
run_validation (default true) — after compression, check that code blocks, URLs, file paths, and headings were preserved. Adds ~150ms to the response.
provider_key (optional) — override the server-side LLM key used for compression. Most users leave this blank.
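The same request can be made from Python using only the standard library. This is a sketch under a few assumptions: provider_key is sent in the JSON body alongside the other parameters, the 30-second timeout is an arbitrary choice, and the function names are illustrative.

```python
import json
import urllib.request

API_URL = "https://api.mintoken.dev/v1/compress"


def build_compress_request(text, api_key, run_validation=True, provider_key=None):
    """Build (but do not send) a POST /v1/compress request."""
    payload = {"text": text, "run_validation": run_validation}
    if provider_key is not None:
        # Assumption: provider_key travels in the JSON body like the other parameters.
        payload["provider_key"] = provider_key
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def compress(text, api_key, **options):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_compress_request(text, api_key, **options)
    # Timeout value is an arbitrary choice for this sketch.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling compress("When you are designing a REST API...", "mt_live_xxxxx") returns the response object described in the next section. Non-2xx responses raise urllib.error.HTTPError, which you may want to catch.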

Response

{
  "compressed": "When designing REST API, carefully consider auth...",
  "original_tokens": 87,
  "compressed_tokens": 39,
  "savings_percent": 55.2,
  "valid": true,
  "errors": [],
  "warnings": []
}

compressed — the compressed text, ready to use.
original_tokens / compressed_tokens — token counts computed with tiktoken using OpenAI's o200k_base encoding.
savings_percent — the percentage reduction in token count. In the example above, (87 − 39) / 87 ≈ 55.2%.
valid — true if validation passed (all code blocks, URLs, file paths, and headings preserved).
errors — fatal validation failures (e.g. a code block disappeared). Mintoken retries up to twice to fix these; if the array is non-empty, retries failed and you may want to skip using the compressed version.
warnings — non-fatal issues (e.g. a URL slightly rephrased).
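One way to act on these fields is to fall back to the original text whenever errors is non-empty. The sketch below assumes the response has already been decoded into a dict. The helper names are illustrative, and the savings formula is inferred from the sample response rather than stated by the API.

```python
def choose_text(original, response):
    """Return the text to send to the model, given a decoded /v1/compress response."""
    if response.get("errors"):
        # Fatal validation failures survived the retries: keep the original.
        return original
    return response["compressed"]


def savings_percent(original_tokens, compressed_tokens):
    """Percentage reduction in tokens, rounded to one decimal place.

    Assumption: this matches how the API computes savings_percent.
    """
    return round(100 * (original_tokens - compressed_tokens) / original_tokens, 1)
```

Warnings do not trigger the fallback here. If rephrased URLs matter for your use case, you may want to inspect the warnings array before using the compressed text.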

Limits

Max input size: 100,000 characters (~25k tokens). For larger inputs, split the text into chunks and compress each one separately.
Plan quotas apply — every compression counts toward your monthly token allowance (input + output tokens used during the compression call).
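Splitting an oversized input could look like the sketch below, which packs paragraphs into chunks of at most 100,000 characters. The function name and the choice to split on blank lines are assumptions. A single paragraph longer than the limit is cut at a fixed offset, which can break a code block or URL, so check the validation report for those chunks.

```python
MAX_CHARS = 100_000  # documented input limit for POST /v1/compress


def split_into_chunks(text, max_chars=MAX_CHARS):
    """Split text into chunks of at most max_chars, preferring paragraph breaks."""
    chunks = []
    current = ""
    for paragraph in text.split("\n\n"):
        if not paragraph:
            continue  # skip empty paragraphs produced by runs of blank lines
        # Hard-split any single paragraph that exceeds the limit on its own.
        pieces = [paragraph[i:i + max_chars] for i in range(0, len(paragraph), max_chars)]
        for piece in pieces:
            candidate = piece if not current else current + "\n\n" + piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                # Adding this piece would overflow: close the current chunk.
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request and the compressed results joined with blank lines. Every chunk counts toward your quota separately.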
