Usage assumptions
Model mix
Use current public provider prices. This calculator is intentionally transparent and conservative; it does not promise savings or include every provider-specific billing edge case.
Why provider invoices are not enough
Provider invoices tell you what was spent. They usually do not tell you why spend moved. Production AI apps need customer/workspace attribution, feature-route tags, retry and fallback logging, and budget controls before one failed loop turns into an opaque monthly bill.
- Which customer or workspace generated the spend?
- Which feature route created the spend?
- Did retries, fallbacks, or batch jobs amplify the bill?
- Are hard caps in place before runaway jobs continue?
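The questions above can be answered from per-request usage records once each request carries attribution tags. The following is a minimal sketch, assuming a hypothetical record shape with `customer_id`, `feature_route`, `attempt_kind` (one of "primary", "retry", or "fallback"), and `cost_usd` fields; none of these names come from a specific provider or from FerryAPI.

```python
from collections import defaultdict


def attribute_spend(records):
    """Group spend by (customer, feature route) and by attempt kind.

    Each record is a dict with the hypothetical keys customer_id,
    feature_route, attempt_kind, and cost_usd.
    """
    by_owner = defaultdict(float)  # (customer_id, feature_route) -> USD
    by_kind = defaultdict(float)   # attempt_kind -> USD
    for record in records:
        owner = (record["customer_id"], record["feature_route"])
        by_owner[owner] += record["cost_usd"]
        by_kind[record["attempt_kind"]] += record["cost_usd"]
    return dict(by_owner), dict(by_kind)
```

With records tagged this way, a jump in the `retry` or `fallback` totals shows whether amplification, rather than organic traffic, moved the bill.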
Cost controls to add before scale
- Per-key budgets and hard caps.
- Customer/workspace-level attribution.
- Feature-route tags on every request.
- Retry and fallback logging.
- Usage export for billing and analytics.
- Cheaper model routing for summaries, cleanup, classification, and low-risk tasks.
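Two of the controls above, per-key hard caps and cheaper routing for low-risk tasks, can be sketched in a few lines. This is an illustrative sketch only: `KeyBudget`, `BudgetExceeded`, `pick_model`, and the model names `cheap-model` and `premium-model` are hypothetical and do not describe any provider's or FerryAPI's actual interface.

```python
class BudgetExceeded(Exception):
    """Raised when a request would push a key past its hard cap."""


class KeyBudget:
    """Tracks spend for one API key against a hard cap (in-memory sketch)."""

    def __init__(self, hard_cap_usd):
        self.hard_cap_usd = hard_cap_usd
        self.spent_usd = 0.0

    def reserve(self, estimated_cost_usd):
        # Refuse the request before it runs, so a runaway job stops at the cap.
        if self.spent_usd + estimated_cost_usd > self.hard_cap_usd:
            raise BudgetExceeded(
                f"cap {self.hard_cap_usd} USD would be exceeded"
            )
        self.spent_usd += estimated_cost_usd


# Task kinds treated as low-risk here mirror the list above.
LOW_RISK_TASKS = {"summary", "cleanup", "classification"}


def pick_model(task_kind):
    """Route low-risk tasks to a cheaper (hypothetical) model name."""
    return "cheap-model" if task_kind in LOW_RISK_TASKS else "premium-model"
```

Checking the cap before dispatch, not after the invoice arrives, is the design choice that matters; a production version would also need persistent, concurrency-safe counters.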
Formula transparency
base_input_cost = monthly_requests * avg_input_tokens * traffic_share * input_price_per_1m / 1_000_000
base_output_cost = monthly_requests * avg_output_tokens * traffic_share * output_price_per_1m / 1_000_000
retry_adjusted_cost = base_cost * (1 + retry_rate)
fallback_overhead = base_cost * fallback_invocation_rate
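The formulas can be expressed as a small function for one model in the mix. This is a sketch under two stated assumptions that the formulas themselves do not spell out: that `base_cost` is the sum of `base_input_cost` and `base_output_cost`, and that the estimated total is `retry_adjusted_cost` plus `fallback_overhead`. The `ModelShare` type and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ModelShare:
    name: str
    traffic_share: float        # fraction of requests sent to this model, 0..1
    input_price_per_1m: float   # USD per 1M input tokens
    output_price_per_1m: float  # USD per 1M output tokens


def estimate_model_cost(monthly_requests, avg_input_tokens, avg_output_tokens,
                        model, retry_rate, fallback_invocation_rate):
    """Apply the calculator formulas to a single model in the mix."""
    base_input_cost = (monthly_requests * avg_input_tokens * model.traffic_share
                       * model.input_price_per_1m / 1_000_000)
    base_output_cost = (monthly_requests * avg_output_tokens * model.traffic_share
                        * model.output_price_per_1m / 1_000_000)
    # Assumption: base_cost combines input and output cost.
    base_cost = base_input_cost + base_output_cost
    retry_adjusted_cost = base_cost * (1 + retry_rate)
    fallback_overhead = base_cost * fallback_invocation_rate
    return {
        "base_cost": base_cost,
        "retry_adjusted_cost": retry_adjusted_cost,
        "fallback_overhead": fallback_overhead,
        # Assumption: the total adds fallback overhead to the retry-adjusted cost.
        "estimated_total": retry_adjusted_cost + fallback_overhead,
    }
```

For a full model mix, call the function once per model and sum the `estimated_total` values; the traffic shares would normally add up to 1.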
FerryAPI provides an OpenAI-compatible API layer for teams that want lower-cost model access with usage visibility and operational controls. Use one familiar API shape while routing work across supported models, tracking usage, and keeping cost-sensitive workflows from becoming a single opaque provider invoice.