Introduction
Welcome to Tokenminning — a community wiki for understanding, measuring, and optimizing LLM token usage.
Whether you are building AI products, running FinOps for AI, or just trying to understand why your API bill keeps climbing, this wiki is a starting point for practical knowledge about token economics.
What is tokenminning?
In the context of large language models, tokenminning is the work of extracting value from every token you send and receive:
- Understanding how providers price input, output, cache, and reasoning tokens
- Measuring usage across models, features, and customers
- Optimizing prompts, context windows, and model selection to reduce cost without sacrificing quality
Modern AI workloads consume far more tokens than early chat apps did. Longer context windows, agentic workflows, and test-time compute can turn what used to be a 500-token query into a 50,000-token request. Tokenminning helps you stay ahead of that inflation.
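To make the arithmetic concrete, here is a minimal sketch of per-request cost across the four token categories listed above. Everything in it is an assumption for illustration: the `Usage` class, the `request_cost` helper, and especially the prices, which are made-up placeholders and not any provider's real rates. It also assumes cached and reasoning tokens are counted and billed as separate categories, which varies by provider.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Usage:
    """Token counts for one request (hypothetical structure)."""
    input_tokens: int = 0
    cached_input_tokens: int = 0
    output_tokens: int = 0
    reasoning_tokens: int = 0

    def total(self) -> int:
        # Sum every token category on the request.
        return sum(getattr(self, f.name) for f in fields(self))


# Placeholder prices in USD per million tokens. NOT real provider rates.
PRICES_PER_MILLION = {
    "input_tokens": 3.00,
    "cached_input_tokens": 0.30,
    "output_tokens": 15.00,
    "reasoning_tokens": 15.00,
}


def request_cost(usage: Usage, prices: dict = PRICES_PER_MILLION) -> float:
    """Return the cost in USD of one request under the given price table."""
    return sum(
        getattr(usage, name) * price / 1_000_000
        for name, price in prices.items()
    )


# A short chat-style query: 500 tokens in total.
simple = Usage(input_tokens=400, output_tokens=100)

# An agentic request with a long, mostly cached context: 50,000 tokens in total.
agentic = Usage(
    input_tokens=10_000,
    cached_input_tokens=30_000,
    output_tokens=2_000,
    reasoning_tokens=8_000,
)

for label, usage in [("simple", simple), ("agentic", agentic)]:
    print(f"{label}: {usage.total()} tokens, ${request_cost(usage):.4f}")
```

Under these placeholder prices the 500-token request costs about $0.0027 and the 50,000-token request about $0.189, roughly 70 times more. The token count grew 100 times, but the cost grew less because most of the added input was billed at the cheaper cached rate, which is why the mix of token categories matters as much as the total.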
Start here
- The Tokenminning Constitution — engineering law for production AI stacks
- Concepts — deeper dives into token economics
In the news
- Tokenminning mentioned in The New York Times — June 18, 2026
Community
Tokenminning is maintained by Narev. Join the community: