Tokenminning in Cline

Cline is an open-source VS Code agent extension. Every turn resends your prompt, custom instructions, MCP tool schemas, attached files, and growing task history. Most spend comes from auto-approved tool loops, heavy MCP, and marathon threads—not from one verbose reply.

Work through the sections below in order. For the general technique stack, see Where to start. For underlying patterns, see Context hygiene, Model routing, and Prompt hygiene. If you route through OpenRouter, read Tokenminning with OpenRouter for gateway-level caps and model lists.

Quick checklist

Check your provider dashboard (Anthropic, OpenRouter, etc.) or Cline’s task cost summary after a heavy week.
Use a mid-tier or fast model for routine edits. Reserve frontier models for tasks that actually need them.
Shorten Custom Instructions and .clinerules — one concern per file, glob-scoped where possible.
Disable MCP servers you are not using this week.
Turn off auto-approve for destructive tools until you trust the task scope.
Start a new task per goal—not one thread across unrelated work.

Typical impact when you follow the list: 40–70% savings from routing routine work to cheaper models; 20–50% less input from trimming rules and MCP; 30–60% less context growth from shorter tasks. Benchmark against your own provider bill, because your provider and default model will differ from anyone else’s.

How Cline bills a request

Cline is bring-your-own-key (BYOK): you supply API keys for Anthropic, OpenAI, OpenRouter, or other providers. There is no bundled subscription pool, so every token bills at provider rates (plus any OpenRouter gateway fee).

Each agent step sends:

Your prompt and any @ file or folder attachments
Custom Instructions (global and workspace)
.clinerules files when matched
MCP tool schemas for every enabled server
Prior messages and tool outputs in the current task

Cline shows approximate cost per task in the UI when the provider returns usage metadata. OpenRouter and Anthropic include token counts in responses; treat your provider dashboard as billing truth.
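To make the per-step payload concrete, the sketch below sums the components listed above and reports each one's share of a single step's input. Every token count is a made-up illustrative number, and `StepInput` is a hypothetical name, not part of Cline.

```python
from dataclasses import dataclass, asdict


@dataclass
class StepInput:
    """Token counts for one agent step (all values are illustrative assumptions)."""
    prompt_and_attachments: int
    custom_instructions: int
    clinerules: int
    mcp_schemas: int
    task_history: int

    def total(self) -> int:
        # Every component in the list above is resent on each step.
        return sum(asdict(self).values())

    def shares(self) -> dict[str, float]:
        # Fraction of the step's input contributed by each component.
        total = self.total()
        if total == 0:
            return {name: 0.0 for name in asdict(self)}
        return {name: count / total for name, count in asdict(self).items()}


# Hypothetical numbers for a step late in a long task.
step = StepInput(
    prompt_and_attachments=1_200,
    custom_instructions=800,
    clinerules=1_500,
    mcp_schemas=6_000,
    task_history=10_500,
)

# Print components from largest to smallest share.
for name, share in sorted(step.shares().items(), key=lambda kv: -kv[1]):
    print(f"{name:<24} {share:6.1%}")
```

With these assumed numbers, history and MCP schemas dominate and the prompt itself is a small slice, which is the pattern the rest of this guide targets.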

1. Measure first

Where to look:

Provider dashboard — Anthropic, OpenAI, OpenRouter Activity, etc.
Cline task history — cost hints per completed task when available
Manual notes: tooling in the style of opencode stats does not apply to Cline, so export or note heavy weeks manually

After a heavy week, check whether spend is input-heavy (MCP, rules, attachments) or output-heavy (frontier model, long tool loops). That tells you which section below to prioritize.
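One way to run that check is to convert a week's token totals into dollars and compare the two sides. This is a minimal sketch: the function name is hypothetical, and the prices in the example are placeholders, not any provider's real rates.

```python
def spend_split(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> tuple[float, float, str]:
    """Return (input_cost, output_cost, label) in USD for one period.

    Prices are per million tokens; copy them from your provider's pricing page.
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_mtok
    output_cost = output_tokens / 1_000_000 * output_price_per_mtok
    label = "input-heavy" if input_cost >= output_cost else "output-heavy"
    return input_cost, output_cost, label


# Placeholder totals and prices for one heavy week (assumptions, not real rates).
input_cost, output_cost, label = spend_split(
    input_tokens=40_000_000,
    output_tokens=1_500_000,
    input_price_per_mtok=3.0,
    output_price_per_mtok=15.0,
)
print(f"input ${input_cost:.2f}, output ${output_cost:.2f}: {label}")
```

An input-heavy result points at rules, MCP, and attachments; an output-heavy result points at model choice and tool loops.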

2. Match the model to the task

See Cline’s model picker and your provider’s pricing page. This is Cline’s version of Model routing: default cheap, escalate only on failure.

Start here:

Fast / haiku-tier — log checks, renames, single-file fixes
Mid-tier (Sonnet, GPT-4.1) — multi-file refactors, most agent work
Frontier (Opus, o3, thinking) — deep debugging or novel design only

Costs more than you expect:

Same frontier model for every @-mention question
Plan mode followed by Act mode on the same task without narrowing scope
Reasoning models with long internal chains
OpenRouter :nitro or throughput-sorted routing when price would suffice

If you use OpenRouter, set explicit model IDs and a models fallback array—see Tokenminning with OpenRouter.
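The "default cheap, escalate only on failure" policy can be written down as a small loop. In Cline you apply it by hand through the model picker; the code below only models the decision. The tier labels and the `attempt` callback are hypothetical.

```python
from typing import Callable

# Hypothetical tier labels, cheapest first; substitute the model IDs your provider lists.
TIERS = ["fast", "mid", "frontier"]


def run_with_escalation(
    task: str,
    attempt: Callable[[str, str], bool],
    start: str = "fast",
) -> str:
    """Try the task on the starting tier and move up one tier only after a failure.

    `attempt(tier, task)` is supplied by the caller and returns True on success.
    Returns the tier that succeeded, or raises if every tier failed.
    """
    for tier in TIERS[TIERS.index(start):]:
        if attempt(tier, task):
            return tier
    raise RuntimeError(f"all tiers failed for task: {task!r}")


def fake_attempt(tier: str, task: str) -> bool:
    # Stand-in for "did the diff pass review?"; here refactors fail on the fast tier.
    return not (tier == "fast" and "refactor" in task)


print(run_with_escalation("rename a variable", fake_attempt))
print(run_with_escalation("refactor auth module", fake_attempt))
```

The point of the design is that the frontier tier is only reached after two cheaper attempts have failed, so it is never the default.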

3. Trim what rides along every request

Input bloat in Cline usually comes from configuration and MCP—not your prompt text alone.

Custom instructions and `.clinerules`

Global Custom Instructions inject into every task. Workspace .clinerules compound the same way Cursor rules do.

Keep instructions short — build commands, architecture one-liners, non-obvious conventions
Split rules by concern; avoid a single monolithic rules file
Do not duplicate the same content in Custom Instructions, .clinerules, and CLAUDE.md
Reference linter configs by path instead of pasting style guides
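To see which rule files are worth trimming, a rough size report helps. The sketch below assumes `.clinerules` sits in the workspace root as either a single file or a folder of rule files, and it uses a four-characters-per-token heuristic rather than a real tokenizer, so treat the counts as relative, not exact.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # rough heuristic (assumption), not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return len(text) // CHARS_PER_TOKEN


def rules_report(workspace: Path) -> list[tuple[str, int]]:
    """List rule files under the workspace with estimated tokens, largest first."""
    rules = workspace / ".clinerules"
    if rules.is_dir():
        files = [p for p in rules.rglob("*") if p.is_file()]
    elif rules.is_file():
        files = [rules]
    else:
        files = []
    report = [
        (p.relative_to(workspace).as_posix(), estimate_tokens(p.read_text(encoding="utf-8")))
        for p in files
    ]
    return sorted(report, key=lambda item: -item[1])


if __name__ == "__main__":
    # Run from the repository root to rank rule files by estimated size.
    for path, tokens in rules_report(Path(".")):
        print(f"{tokens:>7}  {path}")
```

Files at the top of the report are the first candidates for shortening or for replacing pasted content with a path reference.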

MCP servers

Each enabled MCP server adds tool schemas to context—even when no tool is called.

Disable servers you are not using this week
One narrow, task-specific server beats five overlapping ones
Prefer built-in file read/search over MCP wrappers when the repo is local
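The cost of an idle server can be approximated from the size of its tool definitions. The sketch below serializes tool definitions in the MCP shape (`name`, `description`, `inputSchema`) and multiplies by the number of turns. The example tool, the four-characters-per-token heuristic, and the assumption that schemas are sent roughly as serialized JSON are all illustrative; how Cline actually formats them may differ.

```python
import json

CHARS_PER_TOKEN = 4  # rough heuristic (assumption), not a real tokenizer


def schema_tokens(tools: list[dict]) -> int:
    """Estimate tokens for one server's tool schemas from their JSON length."""
    return len(json.dumps(tools)) // CHARS_PER_TOKEN


def overhead_per_task(servers: dict[str, list[dict]], turns: int) -> dict[str, int]:
    """Estimated schema tokens each enabled server adds over a task of `turns` steps."""
    return {name: schema_tokens(tools) * turns for name, tools in servers.items()}


# Hypothetical server with one tool definition.
servers = {
    "issues": [
        {
            "name": "search_issues",
            "description": "Search issues by free-text query.",
            "inputSchema": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        }
    ],
}

for name, tokens in overhead_per_task(servers, turns=20).items():
    print(f"{name}: ~{tokens} schema tokens over 20 turns, even if no tool is called")
```

Real servers often expose many tools with long descriptions, so the per-turn figure is usually far larger than this one-tool example.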

Auto-approve and Act mode

Auto-approved tool calls remove friction but multiply spend when the agent loops.

Require approval for bash, write, and browser tools until the task is scoped
Use Plan first; switch to Act only after you agree on files and approach
Set a mental step budget—if the agent exceeds ~10 tool rounds, stop and restate scope

`@` attachments

Attaching folders or large files front-loads tokens on every subsequent turn.

Prefer a focused prompt: “fix spacing in Navbar.tsx only”
Use single-file @ references instead of entire packages
Let the agent search when it needs more context

See Context hygiene for the general just-in-time retrieval pattern.

New task per goal

Start a new Cline task when you finish one goal and begin another, when you switch models, or when the agent loops on a stuck problem.
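A simple model shows why splitting helps. Assume each turn resends a fixed overhead plus a history that grows by the same amount every turn; both figures below are assumptions, and the model ignores prompt caching and any context truncation.

```python
def cumulative_input(turns: int, fixed_overhead: int, growth_per_turn: int) -> int:
    """Total input tokens over a task whose history grows linearly.

    Turn k (0-based) sends fixed_overhead + k * growth_per_turn tokens.
    """
    return sum(fixed_overhead + k * growth_per_turn for k in range(turns))


# One 40-turn thread versus the same work split into two 20-turn tasks.
one_thread = cumulative_input(turns=40, fixed_overhead=8_000, growth_per_turn=1_500)
two_tasks = 2 * cumulative_input(turns=20, fixed_overhead=8_000, growth_per_turn=1_500)

saving = 1 - two_tasks / one_thread
print(f"one thread: {one_thread:,} input tokens")
print(f"two tasks:  {two_tasks:,} input tokens ({saving:.0%} less)")
```

The fixed overhead is paid either way; the saving comes entirely from the history term, which grows with the square of the task length.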

4. Write tighter prompts

Cline-specific versions of Prompt hygiene:

Too broad:


Fix this bug. Also review the whole auth system and suggest improvements.

Scoped:


Fix ONLY the null check in auth/login.ts line 42.
No explanations. Max 1 file changed.

Batch related fixes in one message instead of five separate agent turns. Reject bad diffs early—each revision cycle is another output bill.

5. Set spending guardrails

Cline does not enforce your inference budget. You set the limits.

Set spending caps on your provider or OpenRouter API keys
Use separate API keys per project or environment
Disable auto-approve on destructive tools in shared repos
Review provider Activity weekly after heavy agent sessions
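For the weekly review, a short script can flag keys that are close to their cap or have no cap at all. This is only a review aid under stated assumptions: the key labels and dollar figures are hypothetical, the spend numbers would come from your provider dashboard, and the enforced limit still has to be set on the provider side.

```python
def keys_needing_attention(
    spend_usd: dict[str, float],
    caps_usd: dict[str, float],
    warn_at: float = 0.8,
) -> list[str]:
    """Return key labels that have no cap or have spent at least warn_at of their cap."""
    flagged = []
    for label, spent in spend_usd.items():
        cap = caps_usd.get(label)
        if cap is None:
            flagged.append(label)  # a key without a cap is worth flagging on its own
        elif spent >= warn_at * cap:
            flagged.append(label)
    return sorted(flagged)


# Hypothetical per-project keys, with spend copied by hand from a provider dashboard.
spend = {"webapp-dev": 42.0, "infra-scripts": 5.0, "scratch": 12.0}
caps = {"webapp-dev": 50.0, "infra-scripts": 25.0}

print(keys_needing_attention(spend, caps))
```

Separate keys per project are what make this breakdown possible; with one shared key, the dashboard shows a single total.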

For metering and caps in products you ship, see Article I and Article IV.

Troubleshooting

High input — Custom Instructions, .clinerules, or MCP bloat. Shorten rules; disable unused MCP.

High output — frontier model, auto-approved loops, or many revision cycles. Cheaper model; approval gates; tighter prompts.

Task never finishes — scope too broad or auto-approve on bash. New task with acceptance criteria; require approval.

Spike after enabling MCP — tool schemas attach every turn. Disable unused servers.

OpenRouter bill higher than Cline estimate — gateway fee plus provider rate. Check OpenRouter Activity for the model that actually ran.

When Cline optimization is not enough

Trimming Cline configuration does not fix production agent loops in your own API routes. If customer-facing features dominate spend, instrument with per-feature tags and apply Context hygiene, Prompt caching, and Output and RAG. Narev provides normalized USD across providers if you need cross-provider cost math.