The previous `isPrivateHostname` used a list of regexes against
`URL.hostname`. Several literal-address forms slipped past it:
- IPv4-mapped IPv6 `[::ffff:127.0.0.1]` (WHATWG URL normalizes to
`[::ffff:7f00:1]`, which no regex matched) — lets callers reach
loopback and other private v4 via an IPv6 literal.
- ULA `fc00::/7` (e.g. `[fc00::1]`) — not covered.
- Link-local `fe80::/10` (e.g. `[fe80::1]`) — not covered.
- IPv4 `169.254.0.0/16` (cloud metadata, including 169.254.169.254),
`100.64.0.0/10` (CGNAT), and the full `0.0.0.0/8` — not covered.
- The IPv6 regex `/^\[::1?\]$/` required brackets. `URL.hostname`
returns the bracketed form, so `[::]` and `[::1]` were already caught;
this part was not a gap.
WHATWG `new URL(...)` already normalizes short-form / numeric / hex /
octal IPv4 to dotted-quad before we see it, so those cases were in fact
handled — the remaining gaps were IPv6 and a few missing v4 ranges.
Replace the regex list with:
- a dotted-quad IPv4 parser + int range check covering 0/8, 10/8,
100.64/10, 127/8, 169.254/16, 172.16/12, 192.168/16;
- a small IPv6 parser (handles `::` compression and embedded v4 suffix)
+ a byte-range check covering `::`, `::1`, IPv4-mapped (recursing
into the v4 classifier), IPv4-compatible, `fc00::/7`, `fe80::/10`,
and `fec0::/10`.
Export `isPrivateHostname` and add unit tests covering every bypass
listed above plus public-address negatives.
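The IPv4 half of the new check can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `parseDottedQuad`, `isPrivateIPv4`, and `mappedIPv4` are hypothetical names, and `mappedIPv4` is a narrow shortcut for the normalized IPv4-mapped form only, not the full IPv6 parser described above.

```typescript
// Hypothetical sketch: names and structure are illustrative only.

// [base address, prefix length] for each blocked IPv4 range listed above.
const PRIVATE_V4_RANGES: ReadonlyArray<readonly [string, number]> = [
  ['0.0.0.0', 8],
  ['10.0.0.0', 8],
  ['100.64.0.0', 10],
  ['127.0.0.0', 8],
  ['169.254.0.0', 16],
  ['172.16.0.0', 12],
  ['192.168.0.0', 16],
]

// Parse a dotted-quad string into an unsigned 32-bit integer, or null if malformed.
function parseDottedQuad(host: string): number | null {
  const parts = host.split('.')
  if (parts.length !== 4) return null
  let value = 0
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null
    const octet = Number(part)
    if (octet > 255) return null
    value = value * 256 + octet // plain arithmetic avoids signed 32-bit bit ops
  }
  return value
}

// True when the address falls inside any of the blocked ranges.
function isPrivateIPv4(host: string): boolean {
  const addr = parseDottedQuad(host)
  if (addr === null) return false
  return PRIVATE_V4_RANGES.some(([base, prefix]) => {
    const baseInt = parseDottedQuad(base)!
    const size = 2 ** (32 - prefix)
    return addr >= baseInt && addr < baseInt + size
  })
}

// WHATWG URL serializes IPv4-mapped literals as "[::ffff:hhhh:hhhh]".
// Recover the embedded dotted quad so it can go through the v4 classifier.
function mappedIPv4(hostname: string): string | null {
  const m = /^\[::ffff:([0-9a-f]{1,4}):([0-9a-f]{1,4})\]$/i.exec(hostname)
  if (!m) return null
  const hi = parseInt(m[1], 16)
  const lo = parseInt(m[2], 16)
  return [hi >> 8, hi & 0xff, lo >> 8, lo & 0xff].join('.')
}
```

With these helpers, the normalized hostname `[::ffff:7f00:1]` maps back to `127.0.0.1`, which the v4 range check then rejects as loopback.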
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
`fetchWithRetry` created a fresh `AbortController` per attempt and did:
    signal?.addEventListener('abort', () => controller.abort(), { once: true })
The listener was never removed. Consequences:
- On retry, a second listener was attached to the caller's signal,
each closing over a different controller.
- After a successful fetch, the listener remained on the caller's
signal indefinitely, referencing a controller whose work was done.
For a long-lived caller signal this is a slow leak.
- The `{ once: true }` only helps if the signal actually fires — on
non-aborted signals the listener stays attached forever.
Replace the manual controller + timer + listener dance with
`AbortSignal.any([signal, AbortSignal.timeout(ms)])`, which the
codebase already uses elsewhere (see src/services/mcp/xaa.ts). This:
- has no user-code listener to leak,
- gives each attempt a fresh independent timeout,
- cleanly distinguishes caller-initiated abort from timeout via
`signal.aborted` vs `timeoutSignal.aborted` before rewriting the
error as "Custom search timed out after Ns".
Also resets `lastStatus` per attempt so a 5xx on attempt 0 can't leak
into attempt 1's retry decision, and collapses the two redundant
retry branches (`lastStatus >= 500` and `lastStatus === undefined`)
into one.
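A minimal sketch of the per-attempt pattern described above, assuming a runtime that provides `AbortSignal.any` and `AbortSignal.timeout`. The helper name `runWithTimeout` and its parameters are hypothetical; the actual `fetchWithRetry` may be structured differently.

```typescript
// Hypothetical sketch: `runWithTimeout` is an illustrative name, not the real helper.
async function runWithTimeout<T>(
  attempt: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
  callerSignal?: AbortSignal,
): Promise<T> {
  // A fresh timeout signal per attempt; no listener is attached in user code.
  const timeoutSignal = AbortSignal.timeout(timeoutMs)
  const signal = callerSignal
    ? AbortSignal.any([callerSignal, timeoutSignal])
    : timeoutSignal
  try {
    return await attempt(signal)
  } catch (err) {
    // Caller-initiated aborts propagate unchanged; only timeouts are rewritten.
    if (!callerSignal?.aborted && timeoutSignal.aborted) {
      throw new Error(`Custom search timed out after ${timeoutMs / 1000}s`)
    }
    throw err
  }
}
```

A call site would look something like `runWithTimeout(signal => fetch(url, { signal }), 15_000, callerSignal)`, invoked once per retry attempt so each attempt gets its own timeout.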
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: custom web search — WEB_URL_TEMPLATE not recognized, timeout too short, silent native fallback
1. custom.ts: Add WEB_URL_TEMPLATE to isConfigured() so the custom provider
is recognized when configured via URL template alone.
2. custom.ts: Bump DEFAULT_TIMEOUT_SECONDS from 15s to 120s.
Self-hosted search APIs (SearXNG, internal) commonly need 30-90s.
3. WebSearchTool.ts: When an explicit adapter is selected via
WEB_SEARCH_PROVIDER=custom, do not silently fall through to the
native Anthropic path on adapter errors or 0-hit results.
- 0 hits: return directly (no fallback)
- Error: throw the real error (no fallback)
- Auto mode: existing fallback behavior preserved
* fix: tighten auto-mode adapter fallback — only swallow transient errors
Address review feedback: in auto mode, only fall through to native on
transient errors (network failure, timeout, HTTP 5xx). Config and
guardrail errors (SSRF, HTTPS, bad URL, header allowlist, etc.) now
surface properly instead of being silently swallowed.
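One way to express the transient-versus-permanent split is a small classifier like the one below. It is a sketch under assumptions: `HttpStatusError` and `isTransientError` are hypothetical names, and the real code may detect network failures and timeouts differently.

```typescript
// Hypothetical sketch: these names do not come from the codebase.
class HttpStatusError extends Error {
  status: number
  constructor(status: number) {
    super(`HTTP ${status}`)
    this.name = 'HttpStatusError'
    this.status = status
  }
}

// Only transient failures may fall through to the native path in auto mode.
function isTransientError(err: unknown): boolean {
  if (err instanceof HttpStatusError) return err.status >= 500
  if (err instanceof TypeError) return true // fetch() reports network failures as TypeError
  const name = err instanceof Error ? err.name : ''
  return name === 'TimeoutError' // rejection reason produced by AbortSignal.timeout()
}
```

Anything the classifier rejects (for example an SSRF or header allowlist error raised as a plain `Error`) would be rethrown rather than swallowed.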
---------
Co-authored-by: FluxLuFFy <fluxluffy@users.noreply.github.com>
* refactor: provider adapter system + 7 new search providers
Architecture:
- Each search backend is a small adapter implementing SearchProvider
- 12 providers: custom, tavily, exa, you, jina, bing, mojeek, linkup, firecrawl, duckduckgo + native
- WEB_SEARCH_PROVIDER controls selection: auto (fallback chain) or specific provider
- Auth always in headers, never in query strings
Bug fixes from review feedback:
- Fix applyDomainFilters catch block: keep hits with malformed URLs on blocked_domains
(can't confirm blocked), drop on allowed_domains (can't confirm allowed)
- Add safeHostname() helper: safely extract hostname from URLs without throwing
- Replace unsafe new URL(r.url).hostname in 7 providers with safeHostname()
- Remove dead code: buildAllHeaders, buildAuthHeaders, parseExtraHeaders from types.ts
- Fix WEB_PARMS typo: consistently use WEB_QUERY_PARAM everywhere
- AbortSignal forwarded to fetch() in all 12 providers
- DuckDuckGo: wrap dynamic import in try/catch for graceful error
- Exa: remove double domain filtering (server-side already)
- runSearch(): aggregate all provider errors instead of throwing only the last one
- Retry logic: check numeric status code directly, retry 5xx/network, skip 4xx
Test coverage (44 tests, all passing):
- types.test.ts: safeHostname, normalizeHit, applyDomainFilters (20 tests)
- index.test.ts: getProviderMode, getProviderChain, getAvailableProviders (13 tests)
- custom.test.ts: extractHits flexible response parsing (11 tests)
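The malformed-URL handling from the bug-fix list can be sketched as follows. The shapes here are assumptions: the `Hit` type and the options object are hypothetical, and domain matching is simplified to exact hostname equality for brevity.

```typescript
// Hypothetical sketch; the signatures in types.ts may differ.
interface Hit {
  url: string
}

// Return the hostname, or undefined when the URL cannot be parsed.
function safeHostname(url: string): string | undefined {
  try {
    return new URL(url).hostname
  } catch {
    return undefined
  }
}

function applyDomainFilters<T extends Hit>(
  hits: T[],
  opts: { allowed?: string[]; blocked?: string[] },
): T[] {
  return hits.filter(hit => {
    const host = safeHostname(hit.url)
    if (opts.blocked?.length) {
      // Malformed URL: cannot confirm it is blocked, so keep it.
      if (host !== undefined && opts.blocked.includes(host)) return false
    }
    if (opts.allowed?.length) {
      // Malformed URL: cannot confirm it is allowed, so drop it.
      if (host === undefined || !opts.allowed.includes(host)) return false
    }
    return true
  })
}
```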
Co-authored-by: FluxLuFFy <195792511+FluxLuFFy@users.noreply.github.com>
* security: add guardrails to custom search provider (Option B)
- HTTPS-only by default (opt-out: WEB_CUSTOM_ALLOW_HTTP=true)
- Private/localhost IPs blocked by default (opt-out: WEB_CUSTOM_ALLOW_PRIVATE=true)
- Header allowlist: only known-safe headers allowed unless WEB_CUSTOM_ALLOW_ARBITRARY_HEADERS=true
- Configurable timeout in seconds (WEB_CUSTOM_TIMEOUT_SEC, default 15)
- Configurable POST body limit (WEB_CUSTOM_MAX_BODY_KB, default 300)
- Removed max URL size restriction
- Audit log warning on first custom search call
- Updated .env.example and README_SEARCH_PROVIDERS.md with all new options
* fix: remove custom provider from auto chain (Option 1)
Remove customProvider from the auto fallback chain so it is only
available when WEB_SEARCH_PROVIDER=custom is explicitly selected.
Changes:
- Remove customProvider from ALL_PROVIDERS array in providers/index.ts
- Add 3 new tests verifying custom is excluded from auto chain
- Update README_SEARCH_PROVIDERS.md: auto priority, mode table, note
- Update .env.example: auto priority comment, custom mode annotation
All 47 tests pass (44 existing + 3 new).
Co-Authored-By: @Vasanthdev2004
* fix: address review blockers (routing, abort, config check, domain matching)
1. Native/Codex routing precedence in auto mode
shouldUseAdapterProvider() now checks if native/first-party/vertex/foundry
or Codex paths are available before falling back to adapter providers.
Auto mode: native paths take precedence; adapter is fallback only.
2. AbortError stops provider chain immediately
runSearch() now checks for AbortError/aborted signal before continuing
the fallback chain. Cancelled searches don't create extra outbound requests.
3. Explicit provider mode fails fast on missing credentials
runSearch() validates isConfigured() for explicit modes before attempting
requests. Throws clear error: 'Search provider "X" is not configured.'
4. Domain filter exact-or-subdomain matching (fixes suffix collision)
New hostMatchesDomain() helper: exact match or .subdomain match.
badexample.com no longer matches example.com.
5. Tests: 56 pass (9 new) covering all 4 fixes
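The exact-or-subdomain rule from item 4 can be sketched in a few lines. Only the helper name comes from this commit; the body is an assumption, including the case-insensitive comparison.

```typescript
// Sketch of the matching rule: exact match, or a subdomain ending in "." + domain.
function hostMatchesDomain(host: string, domain: string): boolean {
  const h = host.toLowerCase()
  const d = domain.toLowerCase()
  return h === d || h.endsWith('.' + d)
}
```

Requiring the leading dot is what prevents `badexample.com` from matching `example.com`, while `api.example.com` still matches.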
Co-Authored-By: @Vasanthdev2004
---------
Co-authored-by: Claude Fix <fix@openclaude.local>
Co-authored-by: FluxLuFFy <195792511+FluxLuFFy@users.noreply.github.com>
Co-authored-by: bot <bot@openclaw.ai>
Inline base64 source maps had been checked into tracked src files. This strips those comments from the repository without changing runtime behavior or adding ongoing guardrails, per the requested one-time cleanup scope.
Constraint: Keep this change limited to tracked source cleanup only
Rejected: Add CI/source verification guard | user requested one-time cleanup only
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If these directives reappear, fix the producing transform instead of reintroducing repo-side cleanup code
Tested: rg -n "sourceMappingURL" ., bun run smoke, bun run verify:privacy, bun run test:provider, npm run test:provider-recommendation
Not-tested: bun run typecheck (repository has many pre-existing unrelated failures)
Co-authored-by: anandh8x <test@example.com>
* Add DuckDuckGo as a free search backend for the WebSearch tool
* Update README
* Replace @phukon/duckduckgo-search with duck-duck-scrape, fix Firecrawl routing priority, and add DDG error handling
* refactor: streamline DuckDuckGo search fallback to use Firecrawl directly on rate limit
* docs: update README to clarify DuckDuckGo web search fallback and its limitations with TOS
WebSearch is currently disabled for all non-Anthropic providers (OpenAI
shim, DeepSeek, Ollama, etc.) because those providers have no native
search backend. This adds Firecrawl as a fallback that activates when
FIRECRAWL_API_KEY is set, unlocking web search for every model
openclaude supports.
WebFetch uses basic HTTP + Turndown for HTML-to-markdown conversion,
which fails silently on JS-rendered SPAs and bot-protected pages.
Firecrawl scrape replaces the fetch layer when FIRECRAWL_API_KEY is set,
returning clean markdown that handles dynamic content correctly.
Changes:
- WebSearchTool: add runFirecrawlSearch() using @mendable/firecrawl-js,
respects allowed_domains (post-filter) and blocked_domains (-site: operators),
includes result snippets alongside links. shouldUseFirecrawl() ensures
firstParty/Vertex/Foundry/Codex providers keep their native backends.
- WebFetchTool: add scrapeWithFirecrawl(), drops into the existing
applyPromptToMarkdown() pipeline so prompt processing is unchanged.
- Remove "Web search is only available in the US" restriction from
prompt when Firecrawl is active (it works globally).
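The blocked-domain handling mentioned above (appending `-site:` operators to the query) might look roughly like this. `buildBlockedQuery` is a hypothetical helper name, and the real `runFirecrawlSearch()` may assemble the query differently.

```typescript
// Hypothetical sketch: append one "-site:" operator per blocked domain.
function buildBlockedQuery(query: string, blockedDomains: string[] = []): string {
  const exclusions = blockedDomains.map(domain => `-site:${domain}`)
  return [query, ...exclusions].join(' ')
}
```

Allowed domains are not encoded in the query in this sketch, since the text above describes them as a post-filter on the returned results.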
Squash the current repository state back into one baseline commit while
preserving the README reframing and repository contents.
Constraint: User explicitly requested a single squashed commit with subject "asdf"
Confidence: high
Scope-risk: broad
Reversibility: clean
Directive: This commit intentionally rewrites published history; coordinate before future force-pushes
Tested: git status clean; local history rewritten to one commit; force-pushed main to origin and instructkr
Not-tested: Fresh clone verification after push