What should I enter for average input and output tokens?
Use production logs when available. If the feature is not live yet, sample realistic prompts and expected responses, then estimate both median and high-end token usage before choosing a model.
Estimate AI API spend from token usage, pricing, retry behavior, cache strategy, and human review time. Use this calculator before choosing a model, pricing a feature, or committing to an agent workflow.
Paste the current input and output token prices from your model provider, then estimate average token usage per request from production logs or a representative sample of prompts.
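One way to turn a sample of requests into calculator inputs is to summarize the token counts you observed. The sketch below is illustrative only: the function name `summarize_token_counts` and the sample numbers are hypothetical, and the token counts are assumed to come from your own logs or from your provider's tokenizer.

```python
import math
import statistics

def summarize_token_counts(counts):
    """Return the mean, median, and nearest-rank 95th percentile of token counts."""
    if not counts:
        raise ValueError("need at least one sample")
    ordered = sorted(counts)
    # Nearest-rank percentile: the smallest sample with at least 95% of
    # all samples at or below it.
    rank = math.ceil(0.95 * len(ordered))
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
    }

# Hypothetical input-token counts from ten sampled requests.
sample_input_tokens = [820, 910, 760, 1340, 880, 2950, 905, 870, 1010, 795]
summary = summarize_token_counts(sample_input_tokens)
print(summary)
```

A single long prompt pulls the mean well above the median here, which is why it can help to run the calculator once with a typical value and once with a high-end value.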
The calculator accounts for retry overhead, input cache savings, output tokens, and optional human review time so you can compare API cost with operational cost.
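The arithmetic behind an estimate of this kind can be sketched as follows. This is not the calculator's actual formula, which the page does not publish. It is a minimal model under stated assumptions: prices are quoted per million tokens, cached input tokens are billed at a separate discounted rate, retries add a fixed fraction of extra calls, and only a share of requests is reviewed by a person. Every field name and number is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class CostInputs:
    requests_per_month: int
    avg_input_tokens: float
    avg_output_tokens: float
    input_price_per_mtok: float         # USD per million uncached input tokens
    cached_input_price_per_mtok: float  # USD per million cached input tokens
    output_price_per_mtok: float        # USD per million output tokens
    cache_hit_fraction: float           # share of input tokens served from cache
    retry_rate: float                   # extra calls per request (0.10 = 10% more)
    review_fraction: float              # share of requests a human reviews
    review_minutes: float               # minutes spent per reviewed request
    reviewer_hourly_rate: float         # USD per hour

def estimate_monthly_cost(c: CostInputs) -> dict:
    # Retries multiply the number of billed calls, not the number of requests.
    calls = c.requests_per_month * (1 + c.retry_rate)
    cached = c.avg_input_tokens * c.cache_hit_fraction
    uncached = c.avg_input_tokens - cached
    input_cost = calls * (
        uncached * c.input_price_per_mtok
        + cached * c.cached_input_price_per_mtok
    ) / 1_000_000
    output_cost = calls * c.avg_output_tokens * c.output_price_per_mtok / 1_000_000
    # Review is assumed to happen once per reviewed request, regardless of retries.
    review_cost = (
        c.requests_per_month * c.review_fraction
        * (c.review_minutes / 60) * c.reviewer_hourly_rate
    )
    api_cost = input_cost + output_cost
    return {"api": api_cost, "review": review_cost, "total": api_cost + review_cost}

# Made-up inputs for illustration; replace with your provider's current prices.
example = CostInputs(
    requests_per_month=100_000, avg_input_tokens=1000, avg_output_tokens=500,
    input_price_per_mtok=2.0, cached_input_price_per_mtok=0.5,
    output_price_per_mtok=8.0, cache_hit_fraction=0.5, retry_rate=0.10,
    review_fraction=0.02, review_minutes=3, reviewer_hourly_rate=40.0,
)
result = estimate_monthly_cost(example)
print({k: round(v, 2) for k, v in result.items()})
```

With these made-up inputs, reviewing 2% of requests costs about $4,000 a month against roughly $578 of API spend, which illustrates why the operational side is worth modeling alongside token prices.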
Provider prices change frequently. This calculator is a planning tool, not a price guarantee. Verify commercial terms on the official provider site before buying.
Calculator FAQ
Retries, JSON repair calls, fallback model calls, and agent step recovery can multiply the real number of model calls behind one user action. A low visible request count can still create a high bill when failures repeat.
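The multiplication described above can be estimated with a small model. The sketch assumes each call fails independently with the same probability and that a step is retried until it succeeds or an attempt limit is reached. Real failures are often correlated, so treat the result as a lower bound. The function names and numbers are hypothetical.

```python
def expected_attempts(failure_rate: float, max_attempts: int) -> float:
    """Expected calls for one step, stopping at first success or max_attempts."""
    if not 0 <= failure_rate <= 1:
        raise ValueError("failure_rate must be between 0 and 1")
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # Attempt k+1 only happens if the previous k attempts all failed,
    # which has probability failure_rate ** k.
    return sum(failure_rate ** k for k in range(max_attempts))

def calls_per_user_action(steps: int, failure_rate: float, max_attempts: int) -> float:
    """Expected model calls behind one user action made of several steps."""
    return steps * expected_attempts(failure_rate, max_attempts)

# Hypothetical agent: 6 steps, 20% of calls fail, up to 3 attempts per step.
calls = calls_per_user_action(steps=6, failure_rate=0.2, max_attempts=3)
print(round(calls, 2))
```

In this example one visible user action corresponds to roughly 7.4 billed calls, so a request count taken from the product dashboard would understate usage by that factor.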
Include human review time for customer-visible or high-risk workflows, where it can exceed token cost. Counting review overhead helps compare a cheaper model against the operational cost of correcting weak outputs.
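A short worked comparison shows how review overhead can reverse a price ranking. The two models, their per-request API costs, and their review rates are invented for illustration. The only assumption is that a reviewed request costs the reviewer's time on top of the API call.

```python
def cost_per_request(api_cost: float, review_rate: float,
                     review_minutes: float, hourly_rate: float) -> float:
    """API cost plus the expected human review cost for one request (USD)."""
    return api_cost + review_rate * (review_minutes / 60) * hourly_rate

# Hypothetical cheaper model: low API cost, but 10% of outputs need correction.
cheap = cost_per_request(api_cost=0.002, review_rate=0.10,
                         review_minutes=4, hourly_rate=45)

# Hypothetical stronger model: higher API cost, but only 2% need correction.
strong = cost_per_request(api_cost=0.015, review_rate=0.02,
                          review_minutes=4, hourly_rate=45)

print(f"cheap: ${cheap:.3f}  strong: ${strong:.3f}")
```

Under these assumptions the cheaper model costs about $0.30 per request once review is counted and the stronger model about $0.075, even though its API price is several times higher.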
The estimate is not a price guarantee. Provider prices, discounts, cache behavior, and billing units can change. Use this calculator as a planning model, then verify current commercial terms on the official provider site.