LLM Token Counter and Context Budget Planner - Estimate Tokens and Visualize Context Window Usage

First Published: 2026-06-24
Last Updated: 2026-07-04

Estimate how many tokens your prompt, conversation history, and tool definitions use, then see at a glance whether the input plus your reserved output fits inside a model's context window.

All processing is performed entirely in your browser using client-side JavaScript. No data is transmitted to any server, and no API calls are made. Your text never leaves your device.

⚠️ IMPORTANT DISCLAIMER:

This tool is provided "AS IS" without any warranties of any kind.
Token counts shown are estimates produced in your browser and may differ from a model's actual tokenizer.
For exact counts, use the provider's official token counting API.
The author accepts no responsibility for decisions based on these estimates.
By using this tool, you accept full responsibility for any outcomes.

This tool uses client-side JavaScript for all processing. No data is transmitted to servers, no files are uploaded online, all processing happens locally in your browser. Once loaded, this tool continues to work even without an internet connection. For more details, please refer to our Web Tools Disclaimer.

Model and Output Reserve

Model

Context window: 0 tokens · Max output: 0 tokens
Tokenizer: -

Output tokens to reserve

4,096

Range: 0 to 0 tokens (model max output)

Input

Text0

Characters: 0 Words: 0 Estimated tokens: 0

System prompt0 tok Tool definitions (JSON)0 tok

Conversation history0 tok

Context Budget

0Used (input + reserve)

0Context window

0%Utilization

0Remaining

Estimated input total: 0 tokens

What-if

Drop the last N history turns

Shorten the system prompt by 0%

Load from File

Drop a .txt file (plain text) or a .json messages array here, or click to browse.

Examples:

About This Tool

A token is the unit a language model actually processes - a chunk of text the tokenizer treats as one piece, not a word or a character. The context window is the total token budget a model can consider at once, and it is shared by everything you send (system prompt, tool definitions, conversation history) and the space reserved for the model's response. When the input plus the reserved output approaches the window limit, requests fail or history has to be trimmed. This planner estimates the token cost of each part and shows how it all fits.

How the estimate works (and why it is an estimate): An exact Claude token count can only be produced by the provider's server-side token counting API; it cannot be reproduced in the browser, and this tool never sends your text anywhere. Instead, it approximates tokens from the character count using ratios derived from each model's officially published "words / unicode characters per token" figures, grouped by tokenizer family. This is deliberately not a GPT-style byte-pair tokenizer such as tiktoken or gpt-tokenizer: those are built for OpenAI models and meaningfully miscount Claude tokens, which would give a false sense of precision. The estimate also does not model the small per-message and per-tool structural overhead the real API adds, so treat the result as a planning approximation with a margin.

Context window and max output limits below were taken from the official Anthropic models documentation as of 2026-06-23 and can change. Max output values are for the synchronous Messages API.

Model	Model ID	Context window	Max output	Tokenizer
Claude Fable 5	claude-fable-5	1,000,000	128,000	Opus 4.7+
Claude Opus 4.8	claude-opus-4-8	1,000,000	128,000	Opus 4.7+
Claude Sonnet 5	claude-sonnet-5	1,000,000	128,000	Opus 4.7+
Claude Opus 4.7	claude-opus-4-7	1,000,000	128,000	Opus 4.7+
Claude Opus 4.6	claude-opus-4-6	1,000,000	128,000	Pre-4.7
Claude Sonnet 4.6	claude-sonnet-4-6	1,000,000	128,000	Pre-4.7
Claude Sonnet 4.5	claude-sonnet-4-5	200,000	64,000	Pre-4.7
Claude Opus 4.5	claude-opus-4-5	200,000	64,000	Pre-4.7
Claude Haiku 4.5	claude-haiku-4-5	200,000	64,000	Pre-4.7
Claude Opus 4.1 (deprecated)	claude-opus-4-1	200,000	32,000	Pre-4.7

Source: Anthropic - Models overview. Some platforms differ (for example, Claude Opus 4.8 has a 200,000-token context window on Microsoft Foundry). For programmatic, always-current limits, query the Models API; for exact counts, use the token counting API.

How to Use

Pick a model to load its context window and max output limit.
Set the output reserve - how many tokens you want to keep free for the model's response (capped at the model's max output).
Enter your input. Use Plain Text for a single prompt, or Messages (roles) to break it into system prompt, tool definitions, and conversation turns. Counts update as you type.
Read the budget bar. The stacked bar and the numbers show input plus reserved output against the window, with a warning if you exceed it.
Run a what-if. In Messages mode, drop the last N turns or shorten the system prompt to see how much room that frees.
Load a file or an example, then Copy Summary to export the breakdown as text.

FAQ

Is this an exact token count?
No. These are browser-side estimates derived from character counts and per-model ratios. They will not exactly match the tokenizer. For an exact number, use the provider's official token counting API (it runs server-side and reports the real input_tokens).

What counts toward the context window?
Everything the model has to read or produce in a single request: the system prompt, all tool / function definitions, the entire conversation history (every prior user and assistant turn, including tool calls and results), and the space reserved for the response. The input and the reserved output share one budget, so they must fit together inside the window.

Why does my message use more tokens than it has words?
Tokens are sub-word units. Common words may be a single token, but rare words, code, punctuation, whitespace, and non-English text often split into several tokens each. As a rough guide, English text runs well under one token per character, while CJK text and code are denser, so a word can cost more than one token.

Why do different models show different token counts for the same text?
Models use different tokenizers. The tokenizer introduced with Claude Opus 4.7 produces roughly 30% more tokens for the same text than earlier Claude tokenizers, so this tool groups models into tokenizer families and applies different ratios accordingly. Always re-check counts when you switch model families.

Does my text get sent anywhere?
No. The tool makes no network requests and no API calls. All estimation runs locally in your browser, and nothing is uploaded or stored on a server. You can use it offline once the page has loaded.

Related Tools

Word Character Counter Tool - count words, characters, and bytes with real-time analysis.
JSON Formatter Tool - format and validate the JSON tool definitions you paste here.
MCP Server Config Builder and Validator Tool - build and validate MCP server configurations.
Agent Skills Validator and Security Scanner Tool - validate Agent Skills definitions.
llms.txt Generator and Validator Tool - generate and validate llms.txt files.
JWT Decoder Tool - decode and inspect JSON Web Tokens.
Prompt Caching Breakpoint Planner Tool - plan cache_control breakpoints and prefix reuse for Claude prompt caching.
JSONL Conversation and Transcript Viewer - read and search the LLM chat logs and agent transcripts you are budgeting tokens for.

Important Notes

Token counts shown here are estimates produced in your browser and may differ from a model's actual tokenizer. For exact counts, use the provider's official token counting API. Context window and output limits were taken from official documentation as of 2026-06-23 and can change.
The estimate does not include the small structural overhead the real API adds per message and per tool definition, so actual counts are usually slightly higher than shown.
This tool does not display pricing or cost figures; for current pricing, consult the provider's official pricing page.
No data is transmitted, no API calls are made, and nothing is stored. All processing is client-side.

References:
Tech Blog with curated related content
Web Tools Collection

Written by Hidekazu Konishi

hidekazu-konishi.com