Commit Graph

9 Commits

Author SHA1 Message Date
Kevin Codex
531e3f1059 feat(tools): resilient web search and fetch across all providers (#836)
- Add exponential backoff retry to DuckDuckGo adapter (3 attempts with
  jitter) to handle transient rate-limiting and connection errors.
- Add native fetch() fallback in WebFetch when axios hangs with custom
  DNS lookup in bundled contexts.
- Prevent broken native-path fallback for web search on OpenAI shim
  providers (minimax, moonshot, nvidia-nim, etc.) that do not support
  Anthropic's web_search_20250305 tool.
- Cherry-pick existing fixes:
  - a48bd56: cover codex/minimax/nvidia-nim in getSmallFastModel()
  - 31f0b68: 45s budget + raw-markdown fallback for secondary model
  - 446c1e8: sparse Codex /responses payload parsing
  - ae3f0b2: echo reasoning_content on assistant tool-call messages
- Fix domainCheck.test.ts mock modules to include isFirstPartyAnthropicBaseUrl
  and isGithubNativeAnthropicMode exports.

Co-authored-by: OpenClaude <openclaude@gitlawb.com>
2026-04-23 01:14:00 +08:00
guanjiawei
2ff5710329 fix retry Codex and OpenAI fetches via proxy-aware helper (#720) 2026-04-16 21:42:14 +08:00
FluxLuFFy
25ce2ca7bf fix: resolve 12 bugs across API, MCP, agent tools, web search, and context overflow (#674)
* fix: resolve 12 bugs across API, MCP, agent tools, web search, and context overflow

API fixes:
- Fix Gemini 400 error: delete 'store: false' field for Gemini endpoints
  (was globally injected, Gemini rejects unknown fields)
- Fix session timeout 500 errors after ~25min: add 120s idle timeout
  on SSE stream readers in openaiShim and codexShim to detect dead
  connections and trigger withRetry reconnection
- Fix context overflow 500 errors: add handler in errors.ts for 500
  responses caused by oversized conversation context (too many tokens),
  surfacing user-friendly message with recovery actions instead of raw
  'API Error: 500'

Agent loop fix:
- Fix premature task completion: detect continuation signals like
  'so now I have to do it' in assistant text without tool calls and
  inject a meta nudge to force the agent to continue

Web search improvements:
- Increase result counts: Bing/Tavily/Exa/Firecrawl from 10→15,
  Mojeek/You/Jina from default→10 (explicit), max_uses 8→15

MCP fixes:
- Reduce default tool timeout from ~27.8 hours to 5 minutes
  (tools no longer hang indefinitely on unresponsive servers)
- Add retry logic (3 attempts) for tools/list fetch failures
  (prevents all MCP tools from silently disappearing on timeout)
- Add abort signal check in URL elicitation retry loop
- Improve MCP error messages with server and tool name context

Agent tool fixes:
- Fix SendMessage race condition: double-check task status before
  auto-resuming stopped agents to prevent duplicate registration
- Fix auto-compact circuit breaker gap: when auto-compact fails 3+
  consecutive times, proactively block oversized context BEFORE the
  API call instead of letting it 500. Clear message with recovery
  instructions (/new, /compact, rewind).

Tests: 850 total, 0 failures (25 new bugfix tests)

* fix: address all 4 review blockers + 6 additional issues from PR #674

Blockers (from Vasanthdev2004 review):

1. Continuation nudge infinite loop — no loop guard
   Added continuationNudgeCount to State, capped at MAX_CONTINUATION_NUDGES (3).
   Counter increments on each nudge, resets on tool execution (next_turn).

2. Continuation signal regexes too broad — high false-positive rate
   Tightened all patterns to require explicit action verbs. Added completion
   marker check (done/finished/completed/summary). Broad patterns only fire
   on messages <80 chars.

3. BUGFIXES.md in repo root — scope contamination
   Removed. PR description already contains this info.

4. AgentTool dump state cleanup is comment-only, not a bug fix
   Wrapped clearInvokedSkillsForAgent and clearDumpState in individual
   try/catch blocks so one failure doesn't prevent the other.

Additional issues:

5+6. readWithTimeout ignores AbortSignal, timer leak on abort
   Added optional signal param to openaiStreamToAnthropic,
   codexStreamToAnthropic, collectCodexCompletedResponse, readSseEvents.
   Added abort listener that clears idle timer so AbortError surfaces
   cleanly instead of spurious idle timeout.

7. MCP error format change breaks consumers
   Reverted human-readable message to original errorDetails format.
   Moved server/tool context to telemetryMessage param only.

10. AgentTool test broken by comment change
   Updated test assertions to match new defensive cleanup text + try/catch.

12. Mojeek test regex dangerously broad
   Tightened to match searchParams.set('t', '10') specifically.

14. linkup.ts in providerCounts test — no result count field
   Removed from providers list (uses depth param, not result count).

15. Error message overlap between errors.ts and query.ts
   Prefixed errorDetails with 'Context overflow (500):' to distinguish.

Tests: 851 pass, 0 fail

---------

Co-authored-by: openclaude-bot <bot@openclaude.ai>
Co-authored-by: Fix Bot <fix@openclaude.dev>
2026-04-14 18:59:53 +08:00
FluxLuFFy
32fbd0c7b4 fix: custom web search — WEB_URL_TEMPLATE not recognized, timeout too short, silent native fallback (#537)
* fix: custom web search — WEB_URL_TEMPLATE not recognized, timeout too short, silent native fallback

1. custom.ts: Add WEB_URL_TEMPLATE to isConfigured() so the custom provider
   is recognized when configured via URL template alone.

2. custom.ts: Bump DEFAULT_TIMEOUT_SECONDS from 15s to 120s.
   Self-hosted search APIs (SearXNG, internal) commonly need 30-90s.

3. WebSearchTool.ts: When an explicit adapter is selected via
   WEB_SEARCH_PROVIDER=custom, do not silently fall through to the
   native Anthropic path on adapter errors or 0-hit results.
   - 0 hits: return directly (no fallback)
   - Error: throw the real error (no fallback)
   - Auto mode: existing fallback behavior preserved

* fix: tighten auto-mode adapter fallback — only swallow transient errors

Address review feedback: in auto mode, only fall through to native on
transient errors (network failure, timeout, HTTP 5xx). Config and
guardrail errors (SSRF, HTTPS, bad URL, header allowlist, etc.) now
surface properly instead of being silently swallowed.

---------

Co-authored-by: FluxLuFFy <fluxluffy@users.noreply.github.com>
2026-04-09 20:41:58 +08:00
FluxLuFFy
4ad6bc50c1 refactor: provider adapter system + 7 new search providers (bug-fixed) (#512)
* refactor: provider adapter system + 7 new search providers

Architecture:
- Each search backend is a small adapter implementing SearchProvider
- 12 providers: custom, tavily, exa, you, jina, bing, mojeek, linkup, firecrawl, duckduckgo + native
- WEB_SEARCH_PROVIDER controls selection: auto (fallback chain) or specific provider
- Auth always in headers, never in query strings

Bug fixes from review feedback:
- Fix applyDomainFilters catch block: keep hits with malformed URLs on blocked_domains
  (can't confirm blocked), drop on allowed_domains (can't confirm allowed)
- Add safeHostname() helper: safely extract hostname from URLs without throwing
- Replace unsafe new URL(r.url).hostname in 7 providers with safeHostname()
- Remove dead code: buildAllHeaders, buildAuthHeaders, parseExtraHeaders from types.ts
- Fix WEB_PARMS typo: consistently use WEB_QUERY_PARAM everywhere
- AbortSignal forwarded to fetch() in all 12 providers
- DuckDuckGo: wrap dynamic import in try/catch for graceful error
- Exa: remove double domain filtering (server-side already)
- runSearch(): aggregate all provider errors instead of throwing only the last one
- Retry logic: check numeric status code directly, retry 5xx/network, skip 4xx

Test coverage (44 tests, all passing):
- types.test.ts: safeHostname, normalizeHit, applyDomainFilters (20 tests)
- index.test.ts: getProviderMode, getProviderChain, getAvailableProviders (13 tests)
- custom.test.ts: extractHits flexible response parsing (11 tests)

Co-authored-by: FluxLuFFy <195792511+FluxLuFFy@users.noreply.github.com>

* security: add guardrails to custom search provider (Option B)

- HTTPS-only by default (opt-out: WEB_CUSTOM_ALLOW_HTTP=true)
- Private/localhost IPs blocked by default (opt-out: WEB_CUSTOM_ALLOW_PRIVATE=true)
- Header allowlist: only known-safe headers allowed unless WEB_CUSTOM_ALLOW_ARBITRARY_HEADERS=true
- Configurable timeout in seconds (WEB_CUSTOM_TIMEOUT_SEC, default 15)
- Configurable POST body limit (WEB_CUSTOM_MAX_BODY_KB, default 300)
- Removed max URL size restriction
- Audit log warning on first custom search call
- Updated .env.example and README_SEARCH_PROVIDERS.md with all new options

* fix: remove custom provider from auto chain (Option 1)

Remove customProvider from the auto fallback chain so it is only
available when WEB_SEARCH_PROVIDER=custom is explicitly selected.

Changes:
- Remove customProvider from ALL_PROVIDERS array in providers/index.ts
- Add 3 new tests verifying custom is excluded from auto chain
- Update README_SEARCH_PROVIDERS.md: auto priority, mode table, note
- Update .env.example: auto priority comment, custom mode annotation

All 47 tests pass (44 existing + 3 new).

Co-Authored-By: @Vasanthdev2004

* fix: address review blockers (routing, abort, config check, domain matching)

1. Native/Codex routing precedence in auto mode
   shouldUseAdapterProvider() now checks if native/first-party/vertex/foundry
   or Codex paths are available before falling back to adapter providers.
   Auto mode: native paths take precedence; adapter is fallback only.

2. AbortError stops provider chain immediately
   runSearch() now checks for AbortError/aborted signal before continuing
   the fallback chain. Cancelled searches don't create extra outbound requests.

3. Explicit provider mode fails fast on missing credentials
   runSearch() validates isConfigured() for explicit modes before attempting
   requests. Throws clear error: 'Search provider "X" is not configured.'

4. Domain filter exact-or-subdomain matching (fixes suffix collision)
   New hostMatchesDomain() helper: exact match or .subdomain match.
   badexample.com no longer matches example.com.

5. Tests: 56 pass (9 new) covering all 4 fixes

Co-Authored-By: @Vasanthdev2004

---------

Co-authored-by: Claude Fix <fix@openclaude.local>
Co-authored-by: FluxLuFFy <195792511+FluxLuFFy@users.noreply.github.com>
Co-authored-by: bot <bot@openclaw.ai>
2026-04-09 02:51:25 +08:00
Meetpatel006
e5c9a6f629 Enable Free DDG WebSearch For Non-Claude Models (#234)
* added duck duck go for websearch tools that allowed free searching

* update readme

* Replace @phukon/duckduckgo-search with duck-duck-scrape and fix Firecrawl routing priority, and add DDG error handling

* refactor: streamline DuckDuckGo search fallback to use Firecrawl directly on rate limit

* docs: update README to clarify DuckDuckGo web search fallback and its limitations with TOS
2026-04-04 09:21:54 +08:00
Leonardo Grigorio
ac4efae870 feat: add Firecrawl backend for WebSearch and WebFetch tools
WebSearch is currently disabled for all non-Anthropic providers (OpenAI
shim, DeepSeek, Ollama, etc.) because those providers have no native
search backend. This adds Firecrawl as a fallback that activates when
FIRECRAWL_API_KEY is set, unlocking web search for every model
openclaude supports.

WebFetch uses basic HTTP + Turndown for HTML-to-markdown conversion,
which fails silently on JS-rendered SPAs and bot-protected pages.
Firecrawl scrape replaces the fetch layer when FIRECRAWL_API_KEY is set,
returning clean markdown that handles dynamic content correctly.

Changes:
- WebSearchTool: add runFirecrawlSearch() using @mendable/firecrawl-js,
  respects allowed_domains (post-filter) and blocked_domains (-site: operators),
  includes result snippets alongside links. shouldUseFirecrawl() ensures
  firstParty/Vertex/Foundry/Codex providers keep their native backends.
- WebFetchTool: add scrapeWithFirecrawl(), drops into the existing
  applyPromptToMarkdown() pipeline so prompt processing is unchanged.
- Remove "Web search is only available in the US" restriction from
  prompt when Firecrawl is active (it works globally).
2026-04-02 12:18:20 -03:00
erdemozyol
cec3629017 fix: support codex web tools and non-git agents 2026-04-02 14:08:22 +03:00
did:key:z6MkqDnb7Siv3Cwj7pGJq4T5EsUisECqR8KpnDLwcaZq5TPr
d2542c9a62 asdf
Squash the current repository state back into one baseline commit while
preserving the README reframing and repository contents.

Constraint: User explicitly requested a single squashed commit with subject "asdf"
Confidence: high
Scope-risk: broad
Reversibility: clean
Directive: This commit intentionally rewrites published history; coordinate before future force-pushes
Tested: git status clean; local history rewritten to one commit; force-pushed main to origin and instructkr
Not-tested: Fresh clone verification after push
2026-03-31 03:34:03 -07:00