Skip to Content
IDEsCline

Tokenminning in Cline

Cline is an open-source VS Code agent extension. Every turn resends your prompt, custom instructions, MCP tool schemas, attached files, and growing task history. Most spend comes from auto-approved tool loops, heavy MCP, and marathon threads—not from one verbose reply.

Work through the sections below in order. For the general technique stack, see Where to start. For underlying patterns, see Context hygiene, Model routing, and Prompt hygiene. If you route through OpenRouter, read that guide for gateway-level caps and model lists.

Quick checklist

  1. Check your provider dashboard (Anthropic, OpenRouter, etc.) or Cline’s task cost summary after a heavy week.
  2. Use a mid-tier or fast model for routine edits. Reserve frontier models for tasks that actually need them.
  3. Shorten Custom Instructions and .clinerules — one concern per file, glob-scoped where possible.
  4. Disable MCP servers you are not using this week.
  5. Turn off auto-approve for destructive tools until you trust the task scope.
  6. Start a new task per goal—not one thread across unrelated work.

Typical impact when you follow the list: 40–70% savings routing routine work to cheaper models; 20–50% on input by trimming rules and MCP; 30–60% less context growth from shorter tasks. Benchmark on your own provider bill—your API key and default model will differ from anyone else’s.

How Cline bills a request

Cline is BYOK: you supply API keys for Anthropic, OpenAI, OpenRouter, or other providers. There is no bundled subscription pool—every token bills at provider rates (plus any OpenRouter gateway fee).

Each agent step sends:

  • Your prompt and any @ file or folder attachments
  • Custom Instructions (global and workspace)
  • .clinerules files when matched
  • MCP tool schemas for every enabled server
  • Prior messages and tool outputs in the current task

Cline shows approximate cost per task in the UI when the provider returns usage metadata. OpenRouter and Anthropic include token counts in responses; treat your provider dashboard as billing truth.

1. Measure first

Where to look:

  • Provider dashboard — Anthropic, OpenAI, OpenRouter Activity , etc.
  • Cline task history — cost hints per completed task when available
  • opencode stats-style tooling does not apply; export or note heavy weeks manually

After a heavy week, check whether spend is input-heavy (MCP, rules, attachments) or output-heavy (frontier model, long tool loops). That tells you which section below to prioritize.

2. Match the model to the task

See Cline’s model picker and your provider’s pricing page. This is Cline’s version of Model routing: default cheap, escalate only on failure.

Start here:

  • Fast / haiku-tier — log checks, renames, single-file fixes
  • Mid-tier (Sonnet, GPT-4.1) — multi-file refactors, most agent work
  • Frontier (Opus, o3, thinking) — deep debugging or novel design only

Costs more than you expect:

  • Same frontier model for every @-mention question
  • Plan mode followed by Act mode on the same task without narrowing scope
  • Reasoning models with long internal chains
  • OpenRouter :nitro or throughput-sorted routing when price would suffice

If you use OpenRouter, set explicit model IDs and a models fallback array—see Tokenminning with OpenRouter.

3. Trim what rides along every request

Input bloat in Cline usually comes from configuration and MCP—not your prompt text alone.

Custom instructions and .clinerules

Global Custom Instructions inject into every task. Workspace .clinerules compound the same way Cursor rules do.

  • Keep instructions short — build commands, architecture one-liners, non-obvious conventions
  • Split rules by concern; avoid one megabyte-style rules file
  • Do not duplicate the same content in Custom Instructions, .clinerules, and CLAUDE.md
  • Reference linter configs by path instead of pasting style guides

MCP servers

Each enabled MCP server adds tool schemas to context—even when no tool is called.

  • Disable servers you are not using this week
  • One narrow, task-specific server beats five overlapping ones
  • Prefer built-in file read/search over MCP wrappers when the repo is local

Auto-approve and Act mode

Auto-approved tool calls remove friction but multiply spend when the agent loops.

  • Require approval for bash, write, and browser tools until the task is scoped
  • Use Plan first; switch to Act only after you agree on files and approach
  • Set a mental step budget—if the agent exceeds ~10 tool rounds, stop and restate scope

@ attachments

Attaching folders or large files front-loads tokens on every subsequent turn.

  • Prefer a focused prompt: “fix spacing in Navbar.tsx only”
  • Use single-file @ references instead of entire packages
  • Let the agent search when it needs more context

See Context hygiene for the general just-in-time retrieval pattern.

New task per goal

Start a new Cline task when you finish one goal and begin another, when you switch models, or when the agent loops on a stuck problem.

4. Write tighter prompts

Cline-specific versions of Prompt hygiene:

Too broad:

Fix this bug. Also review the whole auth system and suggest improvements.

Scoped:

Fix ONLY the null check in auth/login.ts line 42. No explanations. Max 1 file changed.

Batch related fixes in one message instead of five separate agent turns. Reject bad diffs early—each revision cycle is another output bill.

5. Set spending guardrails

Cline does not enforce your inference budget. You set the limits.

  • Set spending caps on your provider or OpenRouter API keys 
  • Use separate API keys per project or environment
  • Disable auto-approve on destructive tools in shared repos
  • Review provider Activity weekly after heavy agent sessions

For metering and caps in products you ship, see Article I and Article IV.

Troubleshooting

High input — Custom Instructions, .clinerules, or MCP bloat. Shorten rules; disable unused MCP.

High output — frontier model, auto-approved loops, or many revision cycles. Cheaper model; approval gates; tighter prompts.

Task never finishes — scope too broad or auto-approve on bash. New task with acceptance criteria; require approval.

Spike after enabling MCP — tool schemas attach every turn. Disable unused servers.

OpenRouter bill higher than Cline estimate — gateway fee plus provider rate. Check OpenRouter Activity  for the model that actually ran.

When Cline optimization is not enough

Trimming Cline configuration does not fix production agent loops in your own API routes. If customer-facing features dominate spend, instrument with per-feature tags and apply Context hygiene, Prompt caching, and Output and RAG. Narev  provides normalized USD across providers if you need cross-provider cost math.

Last updated on