LLM Price Comparator

Compare pricing across 19 AI language models from 7 providers

Frequently Asked Questions

How often are prices updated?

We update pricing data weekly from official provider documentation. Prices shown are per million tokens and may not include volume discounts.

What's the difference between input and output pricing?

Input tokens are your prompt — they're cheaper. Output tokens are the model's response — they cost 2-5x more because generation is computationally expensive.
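To see how that split plays out, here is a minimal sketch of per-request cost with separate input and output rates. The token counts and prices are illustrative assumptions, not any provider's actual rates:

```python
def request_cost(input_tokens, output_tokens,
                 input_price_per_m, output_price_per_m):
    """Cost in dollars for one request; prices are quoted per million tokens."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# Example: a 1,000-token prompt with a 500-token reply, at an assumed
# $0.50/M input and $1.50/M output (a 3x output multiplier):
cost = request_cost(1_000, 500, 0.50, 1.50)
print(f"${cost:.6f}")  # $0.001250
```

Note that even though the prompt here is twice as long as the reply, the output tokens account for most of the cost.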

Which model is cheapest?

Gemini 2.5 Flash-Lite and GPT-4.1 nano are the most affordable at $0.075/M and $0.10/M input tokens, respectively. Sort by input cost to see the full ranking.

Which model gives best value for money?

Depends on your task. For general tasks: Gemini 2.5 Flash-Lite or GPT-4o mini. For complex reasoning: GPT-4.1 or Claude 4 Sonnet. For cost-sensitive bulk work: GPT-4.1 nano.

What is context window size?

The maximum combined length of input + output tokens. Llama 4 Scout leads with 10M tokens, Gemini supports 1M. Most OpenAI models support 128K-200K.
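Because the window covers input and output combined, a long prompt shrinks the room left for the response. A quick sketch of that check, using an assumed 128K window:

```python
def fits_context(input_tokens, max_output_tokens, context_window):
    """True if the prompt plus the reserved output budget fits the window."""
    return input_tokens + max_output_tokens <= context_window

# With a 128K context window and 4,000 tokens reserved for output:
print(fits_context(120_000, 4_000, 128_000))  # True
print(fits_context(126_000, 4_000, 128_000))  # False: prompt too long
```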

Are these API prices or subscription prices?

These are API (pay-per-use) prices. Subscription plans (like ChatGPT Plus or Claude Pro) have fixed monthly fees with usage limits.

Do prices include fine-tuning?

No. Fine-tuning has separate pricing. These are standard inference prices for using pre-trained models via API.

What about DeepSeek's pricing?

DeepSeek offers competitive pricing: DeepSeek V3 at $0.27/M input and R1 at $0.55/M input. Both are available via DeepSeek's API platform.

Is Llama free?

Llama itself is open-source and free to self-host. The prices shown are for hosted API access. Self-hosting costs depend on your compute infrastructure.

How do I estimate my monthly cost?

Use our Token Counter to estimate tokens per request, then multiply by your expected monthly request volume. Cost = (total tokens / 1M) × price per M tokens.
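The formula above can be sketched as a small calculator. The tokens-per-request, volume, and price values below are illustrative assumptions; plug in your own estimates from the Token Counter:

```python
def monthly_cost(tokens_per_request, requests_per_month, price_per_m_tokens):
    """Estimated monthly cost: (total tokens / 1M) x price per M tokens."""
    total_tokens = tokens_per_request * requests_per_month
    return (total_tokens / 1_000_000) * price_per_m_tokens

# Example: 2,000 tokens per request, 50,000 requests/month,
# at an assumed blended rate of $0.60/M tokens:
print(f"${monthly_cost(2_000, 50_000, 0.60):,.2f}")  # $60.00
```

For a tighter estimate, run the calculation twice, once with your input token count and input price and once with your output token count and output price, then add the two results.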