feat(api): compress old tool_result content for small-context providers (#801)

* feat(api): compress old tool_result content for small-context providers

Adds a shim-layer pass that tiers tool_result content by age on
providers
  with small effective context windows (Copilot gpt-4o 128k, Mistral,
  Ollama). Recent turns remain full; mid-tier results are truncated to
2k
  chars; older results are replaced with a stub that preserves tool name
  and arguments so the model can re-invoke if needed.

  Tier sizes auto-tune via getEffectiveContextWindowSize, same
calculation
  used by auto-compact. Reuses COMPACTABLE_TOOLS and
  TOOL_RESULT_CLEARED_MESSAGE to complement (not duplicate)
microCompact.
  Configurable via /config toolHistoryCompressionEnabled.

  Addresses active-session context accumulation on Copilot where
  microCompact's time-based trigger never fires, which surfaces as
  "tools appearing in a loop" and prompt_too_long errors after ~15
turns.

* fix: config tool history
This commit is contained in:
viudes
2026-04-21 06:36:26 -03:00
committed by GitHub
parent 64582c119d
commit a6a3de5ac1
9 changed files with 1179 additions and 5 deletions

View File

@@ -1,4 +1,5 @@
import { APIError } from '@anthropic-ai/sdk'
import { compressToolHistory } from './compressToolHistory.js'
import { fetchWithProxyRetry } from './fetchWithProxyRetry.js'
import type {
ResolvedCodexCredentials,
@@ -484,13 +485,15 @@ export async function performCodexRequest(options: {
defaultHeaders: Record<string, string>
signal?: AbortSignal
}): Promise<Response> {
const input = convertAnthropicMessagesToResponsesInput(
const compressedMessages = compressToolHistory(
options.params.messages as Array<{
role?: string
message?: { role?: string; content?: unknown }
content?: unknown
}>,
options.request.resolvedModel,
)
const input = convertAnthropicMessagesToResponsesInput(compressedMessages)
const body: Record<string, unknown> = {
model: options.request.resolvedModel,
input: input.length > 0