Commit Graph

9 Commits

Author SHA1 Message Date
nehan
80a00acc2c feat(api): classify openai-compatible provider failures (#708)
* feat(api): classify openai-compatible provider failures

* Update src/services/api/providerConfig.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/errors.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* feat(api): harden openai-compatible diagnostics and env fallback

* Update src/services/api/openaiShim.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/openaiShim.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/errors.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/errors.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix openaiShim duplicate requests and diagnostics

* remove unused url from http failure classifier

* dedupe env diagnostic warnings

* Remove hardcoded URLs from OpenAI error tests

Removed hardcoded URLs from network failure classification tests.

* Update providerConfig.envDiagnostics.test.ts

* fix(openai-shim): return successful responses and restore localhost classifier tests

* Update src/services/api/openaiShim.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/openaiShim.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/openaiShim.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-17 18:01:40 +08:00
FluxLuFFy
25ce2ca7bf fix: resolve 12 bugs across API, MCP, agent tools, web search, and context overflow (#674)
* fix: resolve 12 bugs across API, MCP, agent tools, web search, and context overflow

API fixes:
- Fix Gemini 400 error: delete 'store: false' field for Gemini endpoints
  (was globally injected, Gemini rejects unknown fields)
- Fix session timeout 500 errors after ~25min: add 120s idle timeout
  on SSE stream readers in openaiShim and codexShim to detect dead
  connections and trigger withRetry reconnection
- Fix context overflow 500 errors: add handler in errors.ts for 500
  responses caused by oversized conversation context (too many tokens),
  surfacing user-friendly message with recovery actions instead of raw
  'API Error: 500'

Agent loop fix:
- Fix premature task completion: detect continuation signals like
  'so now I have to do it' in assistant text without tool calls and
  inject a meta nudge to force the agent to continue

Web search improvements:
- Increase result counts: Bing/Tavily/Exa/Firecrawl from 10→15,
  Mojeek/You/Jina from default→10 (explicit), max_uses 8→15

MCP fixes:
- Reduce default tool timeout from ~27.8 hours to 5 minutes
  (tools no longer hang indefinitely on unresponsive servers)
- Add retry logic (3 attempts) for tools/list fetch failures
  (prevents all MCP tools from silently disappearing on timeout)
- Add abort signal check in URL elicitation retry loop
- Improve MCP error messages with server and tool name context

Agent tool fixes:
- Fix SendMessage race condition: double-check task status before
  auto-resuming stopped agents to prevent duplicate registration
- Fix auto-compact circuit breaker gap: when auto-compact fails 3+
  consecutive times, proactively block oversized context BEFORE the
  API call instead of letting it 500. Clear message with recovery
  instructions (/new, /compact, rewind).

Tests: 850 total, 0 failures (25 new bugfix tests)

* fix: address all 4 review blockers + 6 additional issues from PR #674

Blockers (from Vasanthdev2004 review):

1. Continuation nudge infinite loop — no loop guard
   Added continuationNudgeCount to State, capped at MAX_CONTINUATION_NUDGES (3).
   Counter increments on each nudge, resets on tool execution (next_turn).

2. Continuation signal regexes too broad — high false-positive rate
   Tightened all patterns to require explicit action verbs. Added completion
   marker check (done/finished/completed/summary). Broad patterns only fire
   on messages <80 chars.

3. BUGFIXES.md in repo root — scope contamination
   Removed. PR description already contains this info.

4. AgentTool dump state cleanup is comment-only, not a bug fix
   Wrapped clearInvokedSkillsForAgent and clearDumpState in individual
   try/catch blocks so one failure doesn't prevent the other.

Additional issues:

5+6. readWithTimeout ignores AbortSignal, timer leak on abort
   Added optional signal param to openaiStreamToAnthropic,
   codexStreamToAnthropic, collectCodexCompletedResponse, readSseEvents.
   Added abort listener that clears idle timer so AbortError surfaces
   cleanly instead of spurious idle timeout.

7. MCP error format change breaks consumers
   Reverted human-readable message to original errorDetails format.
   Moved server/tool context to telemetryMessage param only.

10. AgentTool test broken by comment change
   Updated test assertions to match new defensive cleanup text + try/catch.

12. Mojeek test regex dangerously broad
   Tightened to match searchParams.set('t', '10') specifically.

14. linkup.ts in providerCounts test — no result count field
   Removed from providers list (uses depth param, not result count).

15. Error message overlap between errors.ts and query.ts
   Prefixed errorDetails with 'Context overflow (500):' to distinguish.

Tests: 851 pass, 0 fail

---------

Co-authored-by: openclaude-bot <bot@openclaude.ai>
Co-authored-by: Fix Bot <fix@openclaude.dev>
2026-04-14 18:59:53 +08:00
CRABHIVE
4ac7367733 fix: include retry timing in 429 error messages (#366)
## Summary

- Extract retry-after header from 429 API errors and include timing
  guidance in the user-facing error message
- Previously, non-quota 429 errors showed a generic message with no
  guidance on when to retry, only a link to status.anthropic.com

## Impact

- user-facing impact: 429 error messages now tell users when to retry
  instead of just linking to a status page
- developer/maintainer impact: none

## Testing

- [x] `bun run build`
- [ ] `bun run smoke`
- [ ] focused tests: error formatting is pure string construction,
  verified via build + manual inspection

## Notes

- provider/model path tested: applies to all providers returning 429
- screenshots attached (if UI changed): n/a
- follow-up work or known limitations: 529 errors could get similar
  treatment in a follow-up

https://claude.ai/code/session_01D7kprMn4c66a5WrZscF7rv

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-06 06:36:14 +08:00
Anandan
0d27ca596a Neutralize remaining internal-only diagnostic labels (#359)
This pass rewrites a small set of ant-only diagnostic and UI labels to
neutral internal wording while leaving command definitions, flags, and
runtime logic untouched. It focuses on internal debug output, dead UI
branches, and noninteractive headings rather than broader product text.

Constraint: Label cleanup only; do not change command semantics or ant-only logic gates
Rejected: Renaming ant-only command descriptions in main.tsx | broader UX surface better handled in a separate reviewed pass
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Remaining ANT-ONLY hits are mostly command descriptions and intentionally deferred user-facing strings
Tested: bun run build
Tested: bun run smoke
Tested: bun run verify:privacy
Tested: bun run test:provider
Tested: bun run test:provider-recommendation
Not-tested: Full repo typecheck (upstream baseline remains noisy)

Co-authored-by: anandh8x <test@example.com>
2026-04-04 23:50:15 +05:30
KRATOS
75d2543854 fix: remove internal Anthropic tooling from external build (#345)
Remove debug systems, employee detection, and internal logging
that have no function in a community fork.

Changes:
- Remove logPermissionContextForAnts import and calls (main.tsx, compact.ts)
  Reads Kubernetes namespace and container IDs from internal infra paths.
  Dead code for all external users.

- Remove createDumpPromptsFetch import and gate (query.ts)
  Internal prompt dump system for employee debugging.
  Replace gate with unconditional undefined — normal fetch always used.

- Remove stripSignatureBlocks ant-only block (query.ts)
  Was behind USER_TYPE === 'ant' guard, never ran for external users.

- Hardcode isAnt: false (query/config.ts)
  Employee detection flag has no place in a community fork.
  config.gates.isAnt had exactly one consumer (dumpPromptsFetch, now removed).

- Gut logClassifierResultForAnts body (bashPermissions.ts)
  Replace with empty no-op. Still called from 4 sites, zero execution.
  Remove ANT-ONLY comments describing internal security model.

- Gate status.anthropic.com behind firstParty check (errors.ts)
  429 error hint now only shown when using Anthropic directly.
  Third-party provider users see a generic capacity message.

Build: passes
Typecheck: clean (no new errors)
Tests: 196 pass, same 6 pre-existing failures unrelated to these changes
2026-04-04 21:23:17 +05:30
KRATOS
27e6505bfd hardening: isolate third-party paths and clean external-build metadata (#311)
* hardening: isolate third-party paths and clean external-build metadata

* fix: restore external feedback flow and make privacy check portable
2026-04-04 14:22:33 +08:00
Juan Camilo
c66b859342 fix: provider-aware error messages and skip Anthropic key approval for 3P
1. errors.ts: Add getCustomOffSwitchMessage() that returns a
   provider-neutral message for 3P users instead of the hardcoded
   "Opus is experiencing high load, please use /model to switch to
   Sonnet" which is misleading for OpenAI/Gemini/Ollama users.
   The original constant is preserved for backward-compatible string
   matching in error handlers.

2. Onboarding.tsx: Skip the "approve API key" step when a 3P provider
   is active. Previously, having ANTHROPIC_API_KEY in the environment
   (e.g., from a previous Anthropic setup) triggered an irrelevant
   Anthropic key approval UI even when using Gemini or OpenAI.
2026-04-02 16:23:12 +02:00
Juan Camilo
d430ddd568 fix: prevent ANTHROPIC_API_KEY from interfering with Gemini provider auth
Two fixes for issue #133 where setting ANTHROPIC_API_KEY=dummy alongside
CLAUDE_CODE_USE_GEMINI=1 causes "Invalid API key" errors:

1. auth.ts: In the CI branch of getAnthropicApiKeyWithSource(), the
   ANTHROPIC_API_KEY value was returned without checking isUsing3PServices().
   A dummy key leaked into the Anthropic key resolution pipeline even when
   Gemini was the active provider. Now guards with isUsing3PServices().

2. errors.ts: The x-api-key error handler surfaced "Invalid API key" for
   any provider. Added getAPIProvider() === 'firstParty' guard so 3P users
   see the real underlying error instead of a misleading auth message.

Note: The cli.tsx Gemini validation fix (originally part of this PR) was
independently implemented in PR #121 and is already on main.
2026-04-02 15:40:07 +02:00
did:key:z6MkqDnb7Siv3Cwj7pGJq4T5EsUisECqR8KpnDLwcaZq5TPr
d2542c9a62 asdf
Squash the current repository state back into one baseline commit while
preserving the README reframing and repository contents.

Constraint: User explicitly requested a single squashed commit with subject "asdf"
Confidence: high
Scope-risk: broad
Reversibility: clean
Directive: This commit intentionally rewrites published history; coordinate before future force-pushes
Tested: git status clean; local history rewritten to one commit; force-pushed main to origin and instructkr
Not-tested: Fresh clone verification after push
2026-03-31 03:34:03 -07:00