fix: use raw context window for auto-compact percentage display (#748)

Problem:
After auto-compaction with DeepSeek models (e.g., deepseek-chat), the status line displayed ~16% remaining until next auto-compact, but users expected ~30% (since compaction reduces usage to roughly half of the full 128k context).

Root cause:
calculateTokenWarningState() used the auto-compaction threshold (effectiveContextWindow - 13k buffer) as the denominator for percentLeft.

For DeepSeek-chat:
- Raw context: 128,000
- Effective: 119,808 (128k - 8,192 output reservation)
- Threshold: 106,808 (effective - 13k buffer)

At 90k usage:
- Old: (106,808 - 90k) / 106,808 ≈ 16%
- Expected: (128,000 - 90k) / 128,000 ≈ 30%

Fix:
Change percentLeft calculation to use raw context window from getContextWindowForModel() as denominator, while keeping threshold-based warnings/triggers unchanged. This makes the displayed percentage show remaining capacity relative to the model's full context size.

Impact:
- UI now shows correct % of total context remaining
- Auto-compaction trigger point unchanged (still ~90% of effective window)
- All other threshold calculations unaffected

Testing:
- Manual verification: DeepSeek-chat at 90k tokens shows 30% remaining (was 16%)
- Manual verification: Threshold still triggers at ~106k tokens
- Build succeeds: npm run build
- No breaking changes: Callers only depend on percentLeft for display; threshold logic unchanged

Fixes the user-reported discrepancy for DeepSeek and other OpenAI-compatible models.
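
The percentages quoted above can be reproduced with a short standalone sketch. It uses only the figures from this commit message; the constant names and the percentRemaining() helper are hypothetical and are not part of autoCompact.ts.

```typescript
// Figures quoted in the commit message for deepseek-chat.
// Constant names are illustrative, not the identifiers used in the codebase.
const RAW_CONTEXT_WINDOW = 128_000
const OUTPUT_RESERVATION = 8_192
const AUTOCOMPACT_BUFFER = 13_000

const effectiveWindow = RAW_CONTEXT_WINDOW - OUTPUT_RESERVATION // 119,808
const autoCompactThreshold = effectiveWindow - AUTOCOMPACT_BUFFER // 106,808

// Hypothetical helper: applies the same rounding and clamping as the diff,
// with the denominator passed in so both variants can be compared.
function percentRemaining(denominator: number, tokenUsage: number): number {
  return Math.max(
    0,
    Math.round(((denominator - tokenUsage) / denominator) * 100),
  )
}

const tokenUsage = 90_000

// Before this commit: threshold as denominator.
const oldPercentLeft = percentRemaining(autoCompactThreshold, tokenUsage)

// After this commit: raw context window as denominator.
const newPercentLeft = percentRemaining(RAW_CONTEXT_WINDOW, tokenUsage)

console.log({ effectiveWindow, autoCompactThreshold, oldPercentLeft, newPercentLeft })
```

With 90,000 tokens in use, the threshold-based variant rounds 15.7% up to 16, and the raw-window variant rounds 29.7% up to 30, matching the numbers reported above.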
2026-04-18 20:55:41 -04:00
parent 002a8f1f6d
commit 55c5f262a9
1 changed file with 6 additions and 1 deletion
--- a/src/services/compact/autoCompact.ts
+++ b/src/services/compact/autoCompact.ts
@@ -110,9 +110,14 @@ export function calculateTokenWarningState(
    ? autoCompactThreshold
    : getEffectiveContextWindowSize(model)

+  // Use the raw context window (without output reservation) for the percentage
+  // display, so users see remaining context relative to the model's full capacity.
+  // The threshold (which subtracts buffer) should only affect when we warn/compact,
+  // not what percentage we display.
+  const rawContextWindow = getContextWindowForModel(model, getSdkBetas())
  const percentLeft = Math.max(
    0,
-    Math.round(((threshold - tokenUsage) / threshold) * 100),
+    Math.round(((rawContextWindow - tokenUsage) / rawContextWindow) * 100),
  )

  const warningThreshold = threshold - WARNING_THRESHOLD_BUFFER_TOKENS
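
The hunk above separates two concerns: the displayed percentage now depends on the raw context window, while warnings and compaction still key off the threshold. The following is a minimal sketch of that split, not the actual function body. WindowSizes, DisplayState, and sketchWarningState() are hypothetical names, and the `>=` comparison against the threshold is an assumption, since the trigger logic is not shown in this diff.

```typescript
// Hypothetical inputs; in the real code these come from
// getContextWindowForModel() and the threshold selection shown in the hunk.
interface WindowSizes {
  rawContextWindow: number
  threshold: number
}

interface DisplayState {
  percentLeft: number
  isAboveThreshold: boolean
}

function sketchWarningState(tokenUsage: number, sizes: WindowSizes): DisplayState {
  // Display: remaining share of the raw window, rounded and clamped at 0.
  const percentLeft = Math.max(
    0,
    Math.round(
      ((sizes.rawContextWindow - tokenUsage) / sizes.rawContextWindow) * 100,
    ),
  )
  // Trigger: still compared against the threshold (comparison operator assumed).
  const isAboveThreshold = tokenUsage >= sizes.threshold
  return { percentLeft, isAboveThreshold }
}

// deepseek-chat figures from the commit message.
const deepseek: WindowSizes = { rawContextWindow: 128_000, threshold: 106_808 }

const atThreshold = sketchWarningState(106_808, deepseek)
const overfull = sketchWarningState(130_000, deepseek)

console.log(atThreshold, overfull)
```

Under these assumptions, usage equal to the threshold yields percentLeft of 17 rather than 0, because (128,000 - 106,808) / 128,000 ≈ 16.6%. The Math.max clamp only comes into play once usage exceeds the raw window itself.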