Raw token counts are an abstraction that most product decisions do not operate on. Knowing a request consumed 1,200 tokens tells you nothing actionable; knowing it cost $0.014 does. Dollar cost estimation unlocks concrete decisions: which feature is too expensive to offer on the free plan, which user is consuming 10x the cohort average, whether a model swap will save $300/month, and at what traffic level a cost-per-user budget is breached. NIST AI RMF MEASURE 2.5 requires quantitative monitoring of AI operational costs; dollar estimates are that quantification.
Low severity: without cost estimation, token counts stay abstract and cost-driven product decisions are delayed, but this does not cause operational failure on its own.
Add a cost calculator utility in src/lib/ai/cost-calculator.ts that converts token counts to dollar amounts using model pricing constants, then call it immediately after logging token usage.
// src/lib/ai/cost-calculator.ts
// Prices in USD per 1M tokens; update these constants when providers change rates.
const PRICE_PER_1M_TOKENS = {
  "gpt-4o": { input: 2.50, output: 10.00 },
  "gpt-4o-mini": { input: 0.15, output: 0.60 },
  "claude-3-5-sonnet-20241022": { input: 3.00, output: 15.00 },
} as const;

export function estimateCost(
  model: keyof typeof PRICE_PER_1M_TOKENS,
  promptTokens: number,
  completionTokens: number
): number {
  const p = PRICE_PER_1M_TOKENS[model];
  return (promptTokens * p.input + completionTokens * p.output) / 1_000_000;
}
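As a quick sanity check on the rate math, here is a worked example (a sketch; the token counts are arbitrary, and the rates simply restate the gpt-4o figures above):

```typescript
// gpt-4o: $2.50 per 1M input tokens, $10.00 per 1M output tokens.
const inputRate = 2.50;
const outputRate = 10.00;
const promptTokens = 1_000;
const completionTokens = 200;

// Same arithmetic as estimateCost above:
// (1,000 * 2.50 + 200 * 10.00) / 1,000,000 = 0.0045
const costUsd = (promptTokens * inputRate + completionTokens * outputRate) / 1_000_000;
console.log(costUsd.toFixed(4)); // "0.0045"
```

A request of this shape costs well under a cent, which is exactly why per-request dollar figures only become decision-relevant once aggregated per user, per feature, or per month.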
Store cost_usd alongside token counts in ai_request_logs. Verify by querying the table and confirming the values are plausible for the model used.
ID: ai-token-optimization.caching-cost.cost-estimation-per-request
Severity: low
What to look for: Look for logic that converts token counts into dollar amounts using model pricing rates. This could be a utility function (calculateCost, estimateCost), a derived column in the request logs table, or an admin dashboard metric. Check if pricing constants are defined anywhere in the codebase (e.g., GPT4O_INPUT_PRICE_PER_1M = 2.50). Count all instances found and enumerate each.
Pass criteria: The system computes or stores a dollar cost estimate alongside token counts, making it possible to answer "how much did this session cost?" or "what is our cost per feature?". At least 1 implementation must be confirmed.
Fail criteria: Only raw token counts are tracked (or nothing is tracked), with no financial cost derivation. Dollar costs are not available without manual calculation using external pricing tables.
Skip (N/A) when: The project is a hobby project or personal tool where cost management is not a concern. Also skip if the project uses a flat-rate API plan where per-request cost is meaningless.
Signal: No production deployment signals (no vercel.json, netlify.toml), single developer, no user accounts.
Detail on fail: "No cost estimation per request — token counts logged but financial impact invisible"
Remediation: Token counts are abstract. Dollar amounts drive product decisions: pricing plans, feature cost analysis, user quotas, and abuse detection.
// src/lib/ai/cost-calculator.ts
const PRICE_PER_1M_TOKENS = {
  "gpt-4o": { input: 2.50, output: 10.00 },
  "gpt-4o-mini": { input: 0.15, output: 0.60 },
  "claude-3-5-sonnet-20241022": { input: 3.00, output: 15.00 },
} as const;

export function estimateCost(
  model: keyof typeof PRICE_PER_1M_TOKENS,
  promptTokens: number,
  completionTokens: number
): number {
  const prices = PRICE_PER_1M_TOKENS[model];
  return (promptTokens * prices.input + completionTokens * prices.output) / 1_000_000;
}
Then include the estimate in logs:
const costUsd = estimateCost("gpt-4o", result.usage.promptTokens, result.usage.completionTokens);
await db.insert("ai_request_logs").values({ ..., cost_usd: costUsd });
Verify by querying the logs table for cost_usd values and confirming they are plausible for the model used.
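The plausibility check can also be automated. Below is a minimal sketch, assuming a fetched log row exposes model, promptTokens, completionTokens, and cost_usd fields (the row shape, helper name isCostPlausible, and 1% tolerance are illustrative, not part of the rule):

```typescript
// Prices in USD per 1M tokens, matching the calculator's constants.
const PRICE_PER_1M_TOKENS: Record<string, { input: number; output: number }> = {
  "gpt-4o": { input: 2.50, output: 10.00 },
  "gpt-4o-mini": { input: 0.15, output: 0.60 },
  "claude-3-5-sonnet-20241022": { input: 3.00, output: 15.00 },
};

// Hypothetical shape of a row from ai_request_logs.
interface LogRow {
  model: string;
  promptTokens: number;
  completionTokens: number;
  cost_usd: number;
}

// A logged cost is "plausible" if it matches a recomputation from the
// current pricing table within a small relative tolerance.
function isCostPlausible(row: LogRow, tolerance = 0.01): boolean {
  const p = PRICE_PER_1M_TOKENS[row.model];
  if (!p) return false; // unknown model: flag for manual review
  const expected =
    (row.promptTokens * p.input + row.completionTokens * p.output) / 1_000_000;
  return Math.abs(row.cost_usd - expected) <= expected * tolerance;
}

console.log(
  isCostPlausible({ model: "gpt-4o", promptTokens: 1_000, completionTokens: 200, cost_usd: 0.0045 })
); // true
```

Running this over a sample of recent rows turns "are the values plausible?" into a repeatable check rather than a one-off manual query.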