fix: resolve 12 bugs across API, MCP, agent tools, web search, and context overflow (#674)

* fix: resolve 12 bugs across API, MCP, agent tools, web search, and context overflow

API fixes:
- Fix Gemini 400 error: delete 'store: false' field for Gemini endpoints (was globally injected, Gemini rejects unknown fields)
- Fix session timeout 500 errors after ~25min: add 120s idle timeout on SSE stream readers in openaiShim and codexShim to detect dead connections and trigger withRetry reconnection
- Fix context overflow 500 errors: add handler in errors.ts for 500 responses caused by oversized conversation context (too many tokens), surfacing user-friendly message with recovery actions instead of raw 'API Error: 500'

Agent loop fix:
- Fix premature task completion: detect continuation signals like 'so now I have to do it' in assistant text without tool calls and inject a meta nudge to force the agent to continue

Web search improvements:
- Increase result counts: Bing/Tavily/Exa/Firecrawl from 10→15, Mojeek/You/Jina from default→10 (explicit), max_uses 8→15

MCP fixes:
- Reduce default tool timeout from ~27.8 hours to 5 minutes (tools no longer hang indefinitely on unresponsive servers)
- Add retry logic (3 attempts) for tools/list fetch failures (prevents all MCP tools from silently disappearing on timeout)
- Add abort signal check in URL elicitation retry loop
- Improve MCP error messages with server and tool name context

Agent tool fixes:
- Fix SendMessage race condition: double-check task status before auto-resuming stopped agents to prevent duplicate registration
- Fix auto-compact circuit breaker gap: when auto-compact fails 3+ consecutive times, proactively block oversized context BEFORE the API call instead of letting it 500. Clear message with recovery instructions (/new, /compact, rewind).

Tests: 850 total, 0 failures (25 new bugfix tests)

* fix: address all 4 review blockers + 6 additional issues from PR #674

Blockers (from Vasanthdev2004 review):

1. Continuation nudge infinite loop: no loop guard
   Added continuationNudgeCount to State, capped at MAX_CONTINUATION_NUDGES (3). Counter increments on each nudge, resets on tool execution (next_turn).

2. Continuation signal regexes too broad: high false-positive rate
   Tightened all patterns to require explicit action verbs. Added completion marker check (done/finished/completed/summary). Broad patterns only fire on messages <80 chars.

3. BUGFIXES.md in repo root: scope contamination
   Removed. PR description already contains this info.

4. AgentTool dump state cleanup is comment-only, not a bug fix
   Wrapped clearInvokedSkillsForAgent and clearDumpState in individual try/catch blocks so one failure doesn't prevent the other.

Additional issues:

5+6. readWithTimeout ignores AbortSignal, timer leak on abort
   Added optional signal param to openaiStreamToAnthropic, codexStreamToAnthropic, collectCodexCompletedResponse, readSseEvents. Added abort listener that clears idle timer so AbortError surfaces cleanly instead of spurious idle timeout.

7. MCP error format change breaks consumers
   Reverted human-readable message to original errorDetails format. Moved server/tool context to telemetryMessage param only.

10. AgentTool test broken by comment change
   Updated test assertions to match new defensive cleanup text + try/catch.

12. Mojeek test regex dangerously broad
   Tightened to match searchParams.set('t', '10') specifically.

14. linkup.ts in providerCounts test: no result count field
   Removed from providers list (uses depth param, not result count).

15. Error message overlap between errors.ts and query.ts
   Prefixed errorDetails with 'Context overflow (500):' to distinguish.

Tests: 851 pass, 0 fail

---------

Co-authored-by: openclaude-bot <bot@openclaude.ai>
Co-authored-by: Fix Bot <fix@openclaude.dev>
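
Note on the idle timeout (API fixes, items 5+6): the shim changes are not part of the hunks shown here, so the following is only a minimal sketch of the idea, not the code in openaiShim or codexShim. The names `readWithIdleTimeout`, `IdleTimeoutError`, and `ChunkReader` are hypothetical. It assumes a reader with a `read()` method shaped like a standard stream reader, and shows one way to reject a stalled read after an idle period while letting an abort clear the timer so the caller sees an AbortError rather than a spurious timeout.

```typescript
// Hypothetical sketch; names and shapes are assumptions for illustration.
type ReadResult<T> = { done: boolean; value?: T }
type ChunkReader<T> = { read(): Promise<ReadResult<T>> }

class IdleTimeoutError extends Error {
  constructor(idleMs: number) {
    super(`stream idle for ${idleMs}ms`)
    this.name = 'IdleTimeoutError'
  }
}

function abortError(): Error {
  return Object.assign(new Error('The operation was aborted'), {
    name: 'AbortError',
  })
}

function readWithIdleTimeout<T>(
  reader: ChunkReader<T>,
  idleMs: number,
  signal?: AbortSignal,
): Promise<ReadResult<T>> {
  // An already-aborted signal fails fast without starting a timer.
  if (signal?.aborted) return Promise.reject(abortError())

  return new Promise((resolve, reject) => {
    // Every exit path clears the timer and detaches the abort listener,
    // so neither outlives the read.
    const cleanup = () => {
      clearTimeout(timer)
      signal?.removeEventListener('abort', onAbort)
    }
    const onAbort = () => {
      cleanup()
      reject(abortError())
    }
    const timer = setTimeout(() => {
      cleanup()
      reject(new IdleTimeoutError(idleMs))
    }, idleMs)
    signal?.addEventListener('abort', onAbort, { once: true })

    reader.read().then(
      result => {
        cleanup()
        resolve(result)
      },
      err => {
        cleanup()
        reject(err)
      },
    )
  })
}
```

In a stream loop this would stand in for a bare `reader.read()`, for example `await readWithIdleTimeout(reader, 120_000, signal)` using the 120s value from the commit message, with the surrounding retry wrapper deciding whether an `IdleTimeoutError` warrants a reconnect.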
2026-04-14 16:29:53 +05:30
parent 1741f32cb7
commit 25ce2ca7bf
18 changed files with 647 additions and 27 deletions
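
Note on the continuation nudge (blockers 1 and 2): the decision made inline in queryLoop can be reasoned about as a pure function of the last assistant text and the current nudge count. The sketch below is an illustration under that assumption, not the committed code: `shouldNudge` is a hypothetical helper, and the patterns are a reduced subset of the regexes in the diff that follows. It shows the three guards in order: the cap, the completion-marker check, and the short-message gate on the broad pattern.

```typescript
// Hypothetical helper; a reduced subset of the patterns used in query.ts.
const MAX_CONTINUATION_NUDGES = 3

const ACTION =
  '(do|create|write|edit|update|fix|implement|add|run|check|make|build|set up)'

// Patterns that apply to messages of any length.
const CONTINUATION_SIGNALS: RegExp[] = [
  new RegExp(`\\bnow i('ll| will) ${ACTION}\\b`),
  new RegExp(`\\blet me (go ahead and |now )?${ACTION}\\b`),
]

// Broad pattern, only considered for short messages (<80 chars).
const SHORT_ONLY_SIGNALS: RegExp[] = [
  new RegExp(`\\bi('ll| will| need to| have to| must) (now )?${ACTION}\\b`),
]

const COMPLETION_MARKERS =
  /\b(done|finished|completed|complete|summary|that's all|all set|hope this helps|let me know if)\b/

function shouldNudge(text: string, nudgeCount: number): boolean {
  // Loop guard: stop nudging once the cap is reached.
  if (nudgeCount >= MAX_CONTINUATION_NUDGES) return false

  const lower = text.toLowerCase()

  // A completion marker anywhere in the text suppresses the nudge.
  if (COMPLETION_MARKERS.test(lower)) return false

  const signals =
    lower.length < 80
      ? [...CONTINUATION_SIGNALS, ...SHORT_ONLY_SIGNALS]
      : CONTINUATION_SIGNALS
  return signals.some(re => re.test(lower))
}
```

For example, `shouldNudge('Let me now run the tests.', 0)` is true, while the same text with a count of 3 is false, and `shouldNudge('Let me explain how this works.', 0)` is false because "explain" is not in the action-verb list.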
--- a/src/query.ts
+++ b/src/query.ts
@@ -160,6 +160,7 @@ function* yieldMissingToolResultBlocks(
 * rules, ye will be punished with an entire day of debugging and hair pulling.
 */
 const MAX_OUTPUT_TOKENS_RECOVERY_LIMIT = 3
+const MAX_CONTINUATION_NUDGES = 3

 /**
 * Is this a max_output_tokens error message? If so, the streaming loop should
@@ -209,6 +210,10 @@ type State = {
  pendingToolUseSummary: Promise<ToolUseSummaryMessage | null> | undefined
  stopHookActive: boolean | undefined
  turnCount: number
+  // Count of consecutive continuation nudges within the current turn.
+  // Capped at MAX_CONTINUATION_NUDGES to prevent infinite nudge loops
+  // when the model keeps matching continuation signals without tool calls.
+  continuationNudgeCount: number
  // Why the previous iteration continued. Undefined on first iteration.
  // Lets tests assert recovery paths fired without inspecting message contents.
  transition: Continue | undefined
@@ -272,6 +277,7 @@ async function* queryLoop(
    maxOutputTokensRecoveryCount: 0,
    hasAttemptedReactiveCompact: false,
    turnCount: 1,
+    continuationNudgeCount: 0,
    pendingToolUseSummary: undefined,
    transition: undefined,
  }
@@ -645,6 +651,35 @@ async function* queryLoop(
      }
    }

+    // Safety net: when auto-compact's circuit breaker has tripped (3+
+    // consecutive failures), the normal blocking check above is gated on
+    // reactiveCompact. If reactiveCompact is enabled but also fails, or is
+    // disabled, the oversized context goes straight to the API and gets a
+    // 500. This check closes that gap: if compaction is exhausted and
+    // context is still over the autocompact threshold, block immediately
+    // with a clear message instead of burning an API call that will 500.
+    if (
+      tracking?.consecutiveFailures !== undefined &&
+      tracking.consecutiveFailures >= 3 &&
+      isAutoCompactEnabled()
+    ) {
+      const model = toolUseContext.options.mainLoopModel
+      const tokenUsage = tokenCountWithEstimation(messagesForQuery) - snipTokensFreed
+      const { isAboveAutoCompactThreshold } = calculateTokenWarningState(
+        tokenUsage,
+        model,
+      )
+      if (isAboveAutoCompactThreshold) {
+        yield createAssistantAPIErrorMessage({
+          content:
+            'The conversation has exceeded the context limit and automatic compaction has failed. ' +
+            'Press esc twice to go up a few messages and try again, or start a new session with /new.',
+          error: 'invalid_request',
+        })
+        return { reason: 'blocking_limit' }
+      }
+    }
+
    let attemptWithFallback = true

    queryCheckpoint('query_api_loop_start')
@@ -1102,6 +1137,7 @@ async function* queryLoop(
              pendingToolUseSummary: undefined,
              stopHookActive: undefined,
              turnCount,
+              continuationNudgeCount: state.continuationNudgeCount,
              transition: {
                reason: 'collapse_drain_retry',
                committed: drained.committed,
@@ -1155,6 +1191,7 @@ async function* queryLoop(
            pendingToolUseSummary: undefined,
            stopHookActive: undefined,
            turnCount,
+            continuationNudgeCount: state.continuationNudgeCount,
            transition: { reason: 'reactive_compact_retry' },
          }
          state = next
@@ -1210,6 +1247,7 @@ async function* queryLoop(
            pendingToolUseSummary: undefined,
            stopHookActive: undefined,
            turnCount,
+            continuationNudgeCount: state.continuationNudgeCount,
            transition: { reason: 'max_output_tokens_escalate' },
          }
          state = next
@@ -1238,6 +1276,7 @@ async function* queryLoop(
            pendingToolUseSummary: undefined,
            stopHookActive: undefined,
            turnCount,
+            continuationNudgeCount: state.continuationNudgeCount,
            transition: {
              reason: 'max_output_tokens_recovery',
              attempt: maxOutputTokensRecoveryCount + 1,
@@ -1295,6 +1334,7 @@ async function* queryLoop(
          pendingToolUseSummary: undefined,
          stopHookActive: true,
          turnCount,
+          continuationNudgeCount: state.continuationNudgeCount,
          transition: { reason: 'stop_hook_blocking' },
        }
        state = next
@@ -1331,6 +1371,7 @@ async function* queryLoop(
            pendingToolUseSummary: undefined,
            stopHookActive: undefined,
            turnCount,
+            continuationNudgeCount: state.continuationNudgeCount,
            transition: { reason: 'token_budget_continuation' },
          }
          continue
@@ -1350,6 +1391,77 @@ async function* queryLoop(
        }
      }

+      // Continuation nudge: detect when the model signals intent to continue
+      // (e.g., "so now I have to do it", "let me now...", "I'll need to...")
+      // but returned no tool calls. This prevents premature task completion.
+      //
+      // Guard: capped at MAX_CONTINUATION_NUDGES to prevent infinite loops
+      // when the model keeps matching signals without ever calling tools.
+      if (
+        assistantMessages.length > 0 &&
+        turnCount < (maxTurns ?? Infinity) &&
+        state.continuationNudgeCount < MAX_CONTINUATION_NUDGES
+      ) {
+        const lastAssistant = assistantMessages.at(-1)
+        if (lastAssistant?.type === 'assistant') {
+          const lastText = lastAssistant.message.content
+            .filter((b): b is { type: 'text'; text: string } => b.type === 'text')
+            .map(b => b.text)
+            .join(' ')
+            .toLowerCase()
+
+          // Tightened patterns: require explicit action verbs and exclude
+          // common explanatory phrasing to reduce false positives.
+          const continuationSignals = [
+            // Only match "so now I/let me/we" followed by an action verb
+            /\bso now (i|let me|we) (need to|have to|should|must|will) (do|create|write|edit|update|fix|implement|add|run|check|make|build|set up)\b/,
+            // "now I'll" + action (not "now I'll explain" etc.)
+            /\bnow i('ll| will) (do|create|write|edit|update|fix|implement|add|run|check|make|build|set up|go|proceed)\b/,
+            // "let me" + action (not "let me think/explain/show")
+            /\blet me (go ahead and |now )?(do|create|write|edit|update|fix|implement|add|run|check|make|build|set up|proceed)\b/,
+            // "I'll/I need to/I have to" + action, only if message is short (<80 chars)
+            ...(lastText.length < 80
+              ? [/\b(i('ll| will| need to| have to| must) (now )?(do|create|write|edit|update|fix|implement|add|run|check|make|build|set up))\b/]
+              : []),
+            // "time to" + action
+            /\btime to (do|create|write|edit|update|fix|implement|add|run|check|make|build|get started|begin)\b/,
+            // "next, I'll/let me" + action, only if message is short
+            ...(lastText.length < 80
+              ? [/\bnext,?\s+(i('ll| will)|let me|i need to) (do|create|write|edit|update|fix|implement|add|run|check|make|build)\b/]
+              : []),
+          ]
+
+          // Don't nudge if the text contains completion markers
+          const completionMarkers = /\b(done|finished|completed|complete|summary|that's all|that is all|all set|hope this helps|let me know if)\b/
+          if (completionMarkers.test(lastText)) {
+            // Model signaled completion — don't nudge
+          } else if (continuationSignals.some(re => re.test(lastText))) {
+            logForDebugging(
+              `Continuation nudge triggered (${state.continuationNudgeCount + 1}/${MAX_CONTINUATION_NUDGES}): model said "${lastText.slice(-120)}" without tool calls`,
+            )
+            const nudge = createUserMessage({
+              content: 'Continue with the task. Use the appropriate tools to proceed.',
+              isMeta: true,
+            })
+            const next: State = {
+              messages: [...messagesForQuery, ...assistantMessages, nudge],
+              toolUseContext,
+              autoCompactTracking: tracking,
+              maxOutputTokensRecoveryCount: 0,
+              hasAttemptedReactiveCompact: false,
+              maxOutputTokensOverride: undefined,
+              pendingToolUseSummary: undefined,
+              stopHookActive: undefined,
+              turnCount,
+              continuationNudgeCount: state.continuationNudgeCount + 1,
+              transition: { reason: 'continuation_nudge' },
+            }
+            state = next
+            continue
+          }
+        }
+      }
+
      return { reason: 'completed' }
    }

@@ -1715,6 +1827,7 @@ async function* queryLoop(
      turnCount: nextTurnCount,
      maxOutputTokensRecoveryCount: 0,
      hasAttemptedReactiveCompact: false,
+      continuationNudgeCount: 0,
      pendingToolUseSummary: nextPendingToolUseSummary,
      maxOutputTokensOverride: undefined,
      stopHookActive,