The OPENAI_CONTEXT_WINDOWS/OPENAI_MAX_OUTPUT_TOKENS tables only contained
the `github:copilot:<model>` namespaced form used when talking directly to
Copilot via /onboard-github. When OpenClaude is pointed at a LiteLLM proxy
(which routes Copilot using the standard `github_copilot/<model>` convention),
the lookup missed and fell back to the conservative 8k default — causing the
compaction loop to fire repeatedly on every tick and blocking requests
before they left the client with repeated "not in context window table"
warnings on stderr.
Mirror the 11 active Copilot models with LiteLLM-style keys in both tables.
No behavior change for users of /onboard-github since namespaced entries
remain untouched and `lookupByKey` picks exact matches first.
The previous `isPrivateHostname` used a list of regexes against
`URL.hostname`. Several literal-address forms slipped past it:
- IPv4-mapped IPv6 `[::ffff:127.0.0.1]` (WHATWG URL normalizes to
`[::ffff:7f00:1]`, which no regex matched) — lets callers reach
loopback and other private v4 via an IPv6 literal.
- ULA `fc00::/7` (e.g. `[fc00::1]`) — not covered.
- Link-local `fe80::/10` (e.g. `[fe80::1]`) — not covered.
- IPv4 `169.254.0.0/16` (cloud metadata, including 169.254.169.254),
`100.64.0.0/10` (CGNAT), and the full `0.0.0.0/8` — not covered.
- The IPv6 regex `/^\[::1?\]$/` also required brackets, but `URL.hostname`
returns bracketed form anyway, so this part happened to work.
WHATWG `new URL(...)` already normalizes short-form / numeric / hex /
octal IPv4 to dotted-quad before we see it, so those cases were in fact
handled — the remaining gaps were IPv6 and a few missing v4 ranges.
Replace the regex list with:
- a dotted-quad IPv4 parser + int range check covering 0/8, 10/8,
100.64/10, 127/8, 169.254/16, 172.16/12, 192.168/16;
- a small IPv6 parser (handles `::` compression and embedded v4 suffix)
+ a byte-range check covering `::`, `::1`, IPv4-mapped (recursing
into the v4 classifier), IPv4-compatible, `fc00::/7`, `fe80::/10`,
and `fec0::/10`.
Export `isPrivateHostname` and add unit tests covering every bypass
listed above plus public-address negatives.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
`fetchWithRetry` created a fresh `AbortController` per attempt and did:
signal?.addEventListener('abort', () => controller.abort(), { once: true })
The listener was never removed. Consequences:
- On retry, a second listener was attached to the caller's signal,
each closing over a different controller.
- After a successful fetch, the listener remained on the caller's
signal indefinitely, referencing a controller whose work was done.
For a long-lived caller signal this is a slow leak.
- The `{ once: true }` only helps if the signal actually fires — on
non-aborted signals the listener stays attached forever.
Replace the manual controller + timer + listener dance with
`AbortSignal.any([signal, AbortSignal.timeout(ms)])`, which the
codebase already uses elsewhere (see src/services/mcp/xaa.ts). This:
- has no user-code listener to leak,
- gives each attempt a fresh independent timeout,
- cleanly distinguishes caller-initiated abort from timeout via
`signal.aborted` vs `timeoutSignal.aborted` before rewriting the
error as "Custom search timed out after Ns".
Also resets `lastStatus` per attempt so a 5xx on attempt 0 can't leak
into attempt 1's retry decision, and collapses the two redundant
retry branches (`lastStatus >= 500` and `lastStatus === undefined`)
into one.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
React 19's react-reconciler@0.33 mutation path calls commitUpdate with
(instance, type, oldProps, newProps, fiber), but our Ink host config
still expected an updatePayload from prepareUpdate. That left mounted
ink-* nodes with stale onKeyDown, tabIndex, and textStyles, making menu
navigation and highlights appear stuck until remount.
Diff old/new props directly inside commitUpdate and add regression tests
covering in-place updates for ink-box handlers/attributes and ink-text
styles.
* fix: report cache reads in streaming and correct cost calculation
Fix two bugs in how the OpenAI-to-Anthropic shim handles cached tokens:
1. codexShim: streaming message_delta missing cache_read_input_tokens
The codexStreamToAnthropic() function builds the final message_delta
usage object inline (not through makeUsage()), and only included
input_tokens and output_tokens. cache_read_input_tokens was always 0,
so /cost never showed cache reads for Responses API models (GPT-5+).
Also fix makeUsage() to read input_tokens_details.cached_tokens and
prompt_tokens_details.cached_tokens for the non-streaming path.
2. Both shims: cost double-counting from convention mismatch
OpenAI includes cached tokens in input_tokens/prompt_tokens (i.e.,
input_tokens = uncached + cached). Anthropic treats input_tokens as
uncached only. The cost formula was:
cost = input_tokens * inputRate + cache_read * cacheRate
This double-counts cached tokens. Fix by subtracting cached from
input during the conversion:
input_tokens = prompt_tokens - cached_tokens
In practice this was inflating reported costs by ~2x for sessions
with high cache hit rates (which is most sessions, since Copilot
auto-caches server-side).
Fixes#515
* fix: omit zero cache read/write fields from /cost output
Only show "cache read" and "cache write" in /cost per-model usage when
the value is > 0. Providers like GitHub Copilot never report
cache_creation_input_tokens (the server manages its own cache), so
showing "0 cache write" on every line is misleading — it implies caching
is not working when it actually is.
Before:
claude-haiku: 2.6k input, 151 output, 39.8k cache read, 0 cache write ($0.04)
After:
claude-haiku: 2.6k input, 151 output, 39.8k cache read ($0.04)
---------
Co-authored-by: Zartris <14197299+Zartris@users.noreply.github.com>
Set store: false in the request body for both the Chat Completions path
and the /responses fallback path in openaiShim.ts.
The codexShim (Responses API primary path) already sets store: false.
The Chat Completions path and the /responses fallback in openaiShim were
missing it.
store: false tells the API provider not to persist conversation data for
model training, logging, or other non-operational purposes. This is a
privacy measure — it does not affect caching or functionality.
Note: Whether third-party proxies (e.g. GitHub Copilot) honour this
parameter is provider-dependent, but setting it is a reasonable default
for user privacy.
Co-authored-by: Zartris <14197299+Zartris@users.noreply.github.com>
Add context_window and max_output_tokens entries for all models available
through the GitHub Copilot proxy (Claude, GPT, Gemini, Grok), sourced from
https://api.githubcopilot.com/models.
Models are namespaced as "github:copilot:<model>" to avoid collisions with
the same model names served by other providers (which may have different
limits). A new lookupByKey() helper and qualified-key lookup in
lookupByModel() ensures the correct limits are selected when
OPENAI_MODEL=github:copilot.
Without this, Claude models on Copilot would use default context/output
limits that may not match the proxy's actual constraints, causing 400 errors
like "max_tokens is too large".
Related: #515
Co-authored-by: Zartris <14197299+Zartris@users.noreply.github.com>
Treat profile-managed env as restart state rather than explicit user intent so saved OpenAI-compatible profiles can replace stale Ollama values on startup and persist correctly across restarts.
Co-authored-by: Claude Opus 4.6 <noreply@openclaude.dev>
* Stop canonical Anthropic headers from leaking into 3P shim requests
The remaining blocker from PR #268 was that canonical Anthropic headers such as
`anthropic-version` and `anthropic-beta` could still ride through supported 3P
paths even after the earlier x-anthropic/x-claude scrubber work. This tightens
header filtering inside the shim itself so direct defaultHeaders, env-driven
client setup, providerOverride routing, and per-request header injection all
share the same scrubber.
Constraint: Preserve non-Anthropic custom headers and provider auth while stripping only Anthropic/OpenClaude-internal headers from 3P requests
Rejected: Rely on client.ts filtering alone | direct shim construction and per-request headers would still leave gaps
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep header scrubbing centralized in the shim so new call paths do not reopen 3P leakage bugs
Tested: bun test src/services/api/openaiShim.test.ts src/services/api/client.test.ts src/utils/context.test.ts
Tested: bun run test:provider
Tested: bun run build && node dist/cli.mjs --version
Not-tested: bun run typecheck (repository baseline currently fails in many unrelated files)
* Keep OpenAI client tests from restoring undefined env as strings
The new header-leak regression tests in client.test.ts restored environment
variables via direct assignment, which can leave literal "undefined" strings in
process.env when the original value was unset. This switches the teardown over
to the same restore helper pattern already used in openaiShim.test.ts.
Constraint: Keep the fix limited to test hygiene without altering runtime behavior
Rejected: Restore only the two env vars Copilot called out | using one helper for all test env restores is simpler and less error-prone
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Use restore helpers for env teardown in tests so unset values stay deleted instead of becoming the string "undefined"
Tested: bun test src/services/api/client.test.ts src/services/api/openaiShim.test.ts src/utils/context.test.ts
Not-tested: Full provider suite (unchanged runtime path)
* Prevent GitHub Codex requests from forwarding unsanitized Anthropic headers
A base-sync with upstream exposed a separate GitHub+Codex transport branch
that still merged per-request headers raw before adding Copilot headers.
This keeps the filter aligned across Codex-family paths and adds explicit
regression tests for GitHub Codex routing, including providerOverride.
Constraint: Must not push or modify GitHub state while validating the reviewer concern
Rejected: Leave the GitHub Codex path unchanged | runtime repro showed anthropic-* headers still leaked after the upstream sync
Confidence: high
Scope-risk: narrow
Directive: Keep header scrubbing consistent across every Codex-family transport branch when provider routing changes
Tested: bun test src/services/api/openaiShim.test.ts
Tested: bun test src/services/api/client.test.ts src/services/api/codexShim.test.ts src/services/api/providerConfig.github.test.ts
Tested: bun run build
Not-tested: Full repository test suite
Treat default select focus as initial state so /theme and first-run previews follow keyboard navigation again.
Co-authored-by: anandh8x <test@example.com>
Add a /cache-probe slash command for debugging prompt caching behaviour
on OpenAI-compatible providers (GitHub Copilot, OpenAI direct).
The command sends two identical API requests in sequence and compares the
raw server response usage stats, showing:
- Input/output token counts
- Cache read tokens (from prompt_tokens_details or input_tokens_details)
- Latency for each request
- Cache hit rate percentage
Usage:
/cache-probe # test default model
/cache-probe claude-sonnet-4 # test specific model
/cache-probe gpt-5.4 --no-key # test without prompt_cache_key
The --no-key flag omits prompt_cache_key/prompt_cache_retention/store to
test whether the server does content-based auto-caching (it does on
GitHub Copilot).
This is a debugging/diagnostic tool, not intended for regular use. It was
instrumental in discovering that:
1. Copilot auto-caches server-side based on content hash
2. prompt_cache_key is ignored by the proxy
3. The streaming path was not reporting cached tokens
Only enabled when the provider is OpenAI or GitHub (not for firstParty
Anthropic which has different caching semantics).
Related: #515
Co-authored-by: Zartris <14197299+Zartris@users.noreply.github.com>
* feat: add AutoFix config schema and reader module
Implements AutoFixConfigSchema (Zod v4) with validation for lint/test
commands, maxRetries (0-10, default 3), and timeout (1000-300000ms,
default 30000). Adds getAutoFixConfig helper that returns null for
disabled or invalid configs. All 9 unit tests pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: add autoFix runner with lint/test command execution
Implements AutoFixRunner (Task 2) - executes lint and test shell commands
sequentially, short-circuits on lint failure, handles timeouts, and
produces structured AutoFixResult with AI-friendly error summaries.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: add autoFix field to SettingsSchema with integration tests
Integrates AutoFixConfigSchema into SettingsSchema so autoFix settings
are validated at the settings layer. Adds two integration tests verifying
that valid configs are accepted and invalid configs (enabled with no
commands) are rejected.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: add autoFix hook integration helpers (Task 4)
Implements shouldRunAutoFix and buildAutoFixContext functions used by
the PostToolUse hook to determine when to run auto-fix and format
errors as AI-readable context for injection.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: wire autoFix into PostToolUse hook flow (Task 5)
Add auto-fix lint/test check after existing PostToolUse hooks in
runPostToolUseHooks. When autoFix is configured in settings, runs
lint/test commands after file_edit/file_write tools and yields
errors as hook_additional_context for the model to act on.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add /auto-fix slash command
Adds the /auto-fix prompt command that helps users configure autoFix settings
(lint/test commands, maxRetries, timeout) in .claude/settings.json.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: remove unused imports in autoFixRunner test
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address review feedback — enforce maxRetries, wire abort signal, use cross-platform shell
1. Enforce maxRetries: track auto-fix attempts per query chain in toolHooks.ts
and stop feeding errors back after the configured limit is reached.
2. Wire abort signal to subprocess: subscribe to AbortController signal in
runCommand() and kill the process tree on abort. Uses detached process
groups on Unix to ensure child processes are also terminated.
3. Replace hardcoded bash with shell:true: use Node's cross-platform shell
resolution instead of spawn('bash', ['-c', ...]) so auto-fix commands
work on Windows and non-bash environments.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: custom web search — WEB_URL_TEMPLATE not recognized, timeout too short, silent native fallback
1. custom.ts: Add WEB_URL_TEMPLATE to isConfigured() so the custom provider
is recognized when configured via URL template alone.
2. custom.ts: Bump DEFAULT_TIMEOUT_SECONDS from 15s to 120s.
Self-hosted search APIs (SearXNG, internal) commonly need 30-90s.
3. WebSearchTool.ts: When an explicit adapter is selected via
WEB_SEARCH_PROVIDER=custom, do not silently fall through to the
native Anthropic path on adapter errors or 0-hit results.
- 0 hits: return directly (no fallback)
- Error: throw the real error (no fallback)
- Auto mode: existing fallback behavior preserved
* fix: tighten auto-mode adapter fallback — only swallow transient errors
Address review feedback: in auto mode, only fall through to native on
transient errors (network failure, timeout, HTTP 5xx). Config and
guardrail errors (SSRF, HTTPS, bad URL, header allowlist, etc.) now
surface properly instead of being silently swallowed.
---------
Co-authored-by: FluxLuFFy <fluxluffy@users.noreply.github.com>
* refactor: provider adapter system + 7 new search providers
Architecture:
- Each search backend is a small adapter implementing SearchProvider
- 12 providers: custom, tavily, exa, you, jina, bing, mojeek, linkup, firecrawl, duckduckgo + native
- WEB_SEARCH_PROVIDER controls selection: auto (fallback chain) or specific provider
- Auth always in headers, never in query strings
Bug fixes from review feedback:
- Fix applyDomainFilters catch block: keep hits with malformed URLs on blocked_domains
(can't confirm blocked), drop on allowed_domains (can't confirm allowed)
- Add safeHostname() helper: safely extract hostname from URLs without throwing
- Replace unsafe new URL(r.url).hostname in 7 providers with safeHostname()
- Remove dead code: buildAllHeaders, buildAuthHeaders, parseExtraHeaders from types.ts
- Fix WEB_PARMS typo: consistently use WEB_QUERY_PARAM everywhere
- AbortSignal forwarded to fetch() in all 12 providers
- DuckDuckGo: wrap dynamic import in try/catch for graceful error
- Exa: remove double domain filtering (server-side already)
- runSearch(): aggregate all provider errors instead of throwing only the last one
- Retry logic: check numeric status code directly, retry 5xx/network, skip 4xx
Test coverage (44 tests, all passing):
- types.test.ts: safeHostname, normalizeHit, applyDomainFilters (20 tests)
- index.test.ts: getProviderMode, getProviderChain, getAvailableProviders (13 tests)
- custom.test.ts: extractHits flexible response parsing (11 tests)
Co-authored-by: FluxLuFFy <195792511+FluxLuFFy@users.noreply.github.com>
* security: add guardrails to custom search provider (Option B)
- HTTPS-only by default (opt-out: WEB_CUSTOM_ALLOW_HTTP=true)
- Private/localhost IPs blocked by default (opt-out: WEB_CUSTOM_ALLOW_PRIVATE=true)
- Header allowlist: only known-safe headers allowed unless WEB_CUSTOM_ALLOW_ARBITRARY_HEADERS=true
- Configurable timeout in seconds (WEB_CUSTOM_TIMEOUT_SEC, default 15)
- Configurable POST body limit (WEB_CUSTOM_MAX_BODY_KB, default 300)
- Removed max URL size restriction
- Audit log warning on first custom search call
- Updated .env.example and README_SEARCH_PROVIDERS.md with all new options
* fix: remove custom provider from auto chain (Option 1)
Remove customProvider from the auto fallback chain so it is only
available when WEB_SEARCH_PROVIDER=custom is explicitly selected.
Changes:
- Remove customProvider from ALL_PROVIDERS array in providers/index.ts
- Add 3 new tests verifying custom is excluded from auto chain
- Update README_SEARCH_PROVIDERS.md: auto priority, mode table, note
- Update .env.example: auto priority comment, custom mode annotation
All 47 tests pass (44 existing + 3 new).
Co-Authored-By: @Vasanthdev2004
* fix: address review blockers (routing, abort, config check, domain matching)
1. Native/Codex routing precedence in auto mode
shouldUseAdapterProvider() now checks if native/first-party/vertex/foundry
or Codex paths are available before falling back to adapter providers.
Auto mode: native paths take precedence; adapter is fallback only.
2. AbortError stops provider chain immediately
runSearch() now checks for AbortError/aborted signal before continuing
the fallback chain. Cancelled searches don't create extra outbound requests.
3. Explicit provider mode fails fast on missing credentials
runSearch() validates isConfigured() for explicit modes before attempting
requests. Throws clear error: 'Search provider "X" is not configured.'
4. Domain filter exact-or-subdomain matching (fixes suffix collision)
New hostMatchesDomain() helper: exact match or .subdomain match.
badexample.com no longer matches example.com.
5. Tests: 56 pass (9 new) covering all 4 fixes
Co-Authored-By: @Vasanthdev2004
---------
Co-authored-by: Claude Fix <fix@openclaude.local>
Co-authored-by: FluxLuFFy <195792511+FluxLuFFy@users.noreply.github.com>
Co-authored-by: bot <bot@openclaw.ai>
Root cause: IMAGE_MAX_WIDTH and IMAGE_MAX_HEIGHT were set to 2000 — exactly the APIs many-image dimension limit. Images resized to exactly 2000px would get rejected when the conversation accumulated enough images to trigger the API's many-image mode.
Fix: Changed both constants from 2000 to 1568 in src/constants/apiLimits.ts:42-43. This is the resolution the API internally downscales to anyway (documented in the API's encoding/full_encoding.py), so there is zero effective quality loss. All images are now safely below the many-image threshold.
export const IMAGE_MAX_WIDTH = 1568
export const IMAGE_MAX_HEIGHT = 1568
Impact: The single constant change propagates everywhere — imageResizer.ts uses IMAGE_MAX_WIDTH/IMAGE_MAX_HEIGHT for all resize decisions, and the error messages reference these constants dynamically. No other files need changes.
The select cursor highlight was broken because isDeepStrictEqual in
use-select-navigation.ts and use-multi-select-state.ts would fail when
options contained identity-unstable properties (JSX label elements,
function onChange callbacks, computed disabled booleans). This caused
the reset logic to fire on every re-render, resetting focusedValue
back to the first option.
Replace isDeepStrictEqual with optionsNavigateEqual which only compares
properties that affect navigation behavior: value, disabled, and type.
ReactNode labels and function callbacks are intentionally excluded as
they are identity-unstable but don't change navigation semantics.
Fixes#472
Co-authored-by: OpenClaude Worker 3 <worker-3@openclaude.local>
Fixes#430. In normalizeSchemaForOpenAI(), the strict branch was adding every
property key to required[], including optional ones. This caused providers like
Groq, Azure OpenAI, and others to reject valid tool calls with a 400 /
tool_use_failed error because the model correctly omits optional arguments but
the provider sees them as missing required fields.
Root cause: the strict branch used `[...existingRequired, ...allKeys]` instead
of `existingRequired.filter(k => k in normalizedProps)`. The Gemini branch
already had the correct logic.
Fix: align the strict branch with the Gemini branch — only keep properties that
were already marked required in the original schema. The additionalProperties:
false constraint is preserved as strict-mode providers still require it.
Add regression test covering the Read tool schema (file_path required,
offset/limit/pages optional).
* fix: defer startup plugin checks and suppress recommendation dialogs during startup window (issue #363)
Root cause: performStartupChecks() fires immediately on REPL mount,
triggering plugin loading which populates trackedFiles, which triggers
useLspPluginRecommendation to surface an LSP recommendation dialog.
Since promptTypingSuppressionActive is false before any user input,
getFocusedInputDialog() returns the dialog, unmounting PromptInput
entirely and making the CLI appear frozen.
Fix: Two-pronged approach:
1. Defer performStartupChecks by 1500ms and gate on
promptTypingSuppressionActive so startup checks dont run while
the user is typing or has early input buffered
2. Suppress lower-priority startup dialogs (LSP recommendation,
plugin hint, desktop upsell) until startupChecksStartedRef is
true, preventing them from stealing focus during the vulnerable
startup window
This also explains why --bare mode and disabling plugins work:
--bare mode skips plugin loading entirely, and disabling the
autoresearch plugin eliminates the LSP match, so lspRecommendation
stays null and PromptInput renders normally.
* fix: move startup checks effect after promptTypingSuppressionActive declaration
Fixes temporal dead zone warning flagged by code-quality bot.
promptTypingSuppressionActive is declared on line ~1340 but the
useEffect was on line ~800, causing a reference-before-declaration.
Also adds missing semicolons for style consistency.
* fix: gate startup checks on prompt readiness, not just a timeout (issue #363)
The previous approach used a fixed 1500ms timeout, but as gnanam1990
pointed out, if a user pauses for >1.5s before typing the timer can
still fire and recommendation dialogs can steal focus. This is a
timing mitigation, not a reliable fix.
New approach: gate startup checks on actual prompt readiness:
1. After first message submission (submitCount > 0) — always safe
2. After grace period (3s) elapsed AND user is idle — safe because
no dialog will interrupt an idle user who hasn't started typing
3. While user is actively typing — deferrred until they stop
This ensures startup checks never steal focus from a prompt the user
is about to type into, regardless of how long they pause before typing.
Also removes the old STARTUP_CHECK_DELAY_MS constant in favor of
STARTUP_GRACE_PERIOD_MS with clearer semantics.
* fix: move startup checks after submitCount declaration to avoid temporal dead zone
Code quality bot flagged that submitCount was used before its declaration.
Moved the entire startup checks block to after the submitCount useState
declaration. Also added nullish coalescing (submitCount ?? 0) per bot
suggestion.
* fix: gate startup checks strictly on first submission, remove grace period (issue #363)
As gnanam1990 pointed out, the 3s grace period still allows the failure
mode: if a user pauses for a few seconds before typing, startup checks
fire and recommendation dialogs steal focus. A grace period is still a
timing mitigation, not a reliable fix.
New approach: startup checks only run after the user has submitted their
first message (submitCount > 0). No grace period, no timeout. This
guarantees the prompt gets first interaction — no dialog can steal focus
before the user has actually used the CLI.
If the user never submits a message, startup checks never run. That's
acceptable because with no user interaction there's no need for plugin
installations or marketplace seeding.
---------
Co-authored-by: OpenClaude Worker 3 <worker-3@openclaude.local>
* update gitHub copilot API with offical client id and update model configurations
* test: add unit tests for exchangeForCopilotToken and enhance GitHub model normalization
* remove PAT token feature
* test(api): harden provider tests against env leakage
* Added back trimmed github auth token
* added auto refresh logic for auto token along with test
* fix: remove forked provider validation in cli.tsx and clear stale provider env vars in /onboard-github
* refactor: streamline environment variable handling in mergeUserSettingsEnv
* fix: clear stale provider env vars to ensure correct GH routing
* Remove internal-only tooling from the external build (#352)
* Remove internal-only tooling without changing external runtime contracts
This trims the lowest-risk internal-only surfaces first: deleted internal
modules are replaced by build-time no-op stubs, the bundled stuck skill is
removed, and the insights S3 upload path now stays local-only. The privacy
verifier is expanded and the remaining bundled internal Slack/Artifactory
strings are neutralized without broad repo-wide renames.
Constraint: Keep the first PR deletion-heavy and avoid mass rewrites of USER_TYPE, tengu, or claude_code identifiers
Rejected: One-shot DMCA cleanup branch | too much semantic risk for a first PR
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Treat full-repo typecheck as a baseline issue on this upstream snapshot; do not claim this commit introduced the existing non-Phase-A errors without isolating them first
Tested: bun run build
Tested: bun run smoke
Tested: bun run verify:privacy
Not-tested: Full repo typecheck (currently fails on widespread pre-existing upstream errors outside this change set)
* Keep minimal source shims so CI can import Phase A cleanup paths
The first PR removed internal-only source files entirely, but CI provider
and context tests import those modules directly from source rather than
through the build-time no-telemetry stubs. This restores tiny no-op source
shims so tests and local source imports resolve while preserving the same
external runtime behavior.
Constraint: GitHub Actions runs source-level tests in addition to bundled build/privacy checks
Rejected: Revert the entire deletion pass | unnecessary once the import contract is satisfied by small shims
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: For later cleanup phases, treat build-time stubs and source-test imports as separate compatibility surfaces
Tested: bun run build
Tested: bun run smoke
Tested: bun run verify:privacy
Tested: bun run test:provider
Tested: bun run test:provider-recommendation
Not-tested: Full repo typecheck (still noisy on this upstream snapshot)
---------
Co-authored-by: anandh8x <test@example.com>
* Reduce internal-only labeling noise in source comments (#355)
This pass rewrites comment-only ANT-ONLY markers to neutral internal-only
language across the source tree without changing runtime strings, flags,
commands, or protocol identifiers. The goal is to lower obvious internal
prose leakage while keeping the diff mechanically safe and easy to review.
Constraint: Phase B is limited to comments/prose only; runtime strings and user-facing labels remain deferred
Rejected: Broad search-and-replace across strings and command descriptions | too risky for a prose-only pass
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Remaining ANT-ONLY hits are mostly runtime/user-facing strings and should be handled separately from comment cleanup
Tested: bun run build
Tested: bun run smoke
Tested: bun run verify:privacy
Tested: bun run test:provider
Tested: bun run test:provider-recommendation
Not-tested: Full repo typecheck (upstream baseline remains noisy)
Co-authored-by: anandh8x <test@example.com>
* Neutralize internal Anthropic prose in explanatory comments (#357)
This is a small prose-only follow-up that rewrites clearly internal or
explanatory Anthropic comment language to neutral wording in a handful of
high-confidence files. It avoids runtime strings, flags, command labels,
protocol identifiers, and provider-facing references.
Constraint: Keep this pass narrowly scoped to comments/documentation only
Rejected: Broader Anthropic comment sweep across functional API/protocol references | too ambiguous for a safe prose-only PR
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Leave functional Anthropic references (API behavior, SDKs, URLs, provider labels, protocol docs) for separate reviewed passes
Tested: bun run build
Tested: bun run smoke
Tested: bun run verify:privacy
Tested: bun run test:provider
Tested: bun run test:provider-recommendation
Not-tested: Full repo typecheck (upstream baseline remains noisy)
Co-authored-by: anandh8x <test@example.com>
* Neutralize remaining internal-only diagnostic labels (#359)
This pass rewrites a small set of ant-only diagnostic and UI labels to
neutral internal wording while leaving command definitions, flags, and
runtime logic untouched. It focuses on internal debug output, dead UI
branches, and noninteractive headings rather than broader product text.
Constraint: Label cleanup only; do not change command semantics or ant-only logic gates
Rejected: Renaming ant-only command descriptions in main.tsx | broader UX surface better handled in a separate reviewed pass
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Remaining ANT-ONLY hits are mostly command descriptions and intentionally deferred user-facing strings
Tested: bun run build
Tested: bun run smoke
Tested: bun run verify:privacy
Tested: bun run test:provider
Tested: bun run test:provider-recommendation
Not-tested: Full repo typecheck (upstream baseline remains noisy)
Co-authored-by: anandh8x <test@example.com>
* Finish eliminating remaining ANT-ONLY source labels (#360)
This extends the label-only cleanup to the remaining internal-only command,
debug, and heading strings so the source tree no longer contains ANT-ONLY
markers. The pass still avoids logic changes and only renames labels shown
in internal or gated surfaces.
Constraint: Update the existing label-cleanup PR without widening scope into behavior changes
Rejected: Leave the last ANT-ONLY strings for a later pass | low-cost cleanup while the branch is already focused on labels
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: The next phase should move off label cleanup and onto a separately scoped logic or rebrand slice
Tested: bun run build
Tested: bun run smoke
Tested: bun run verify:privacy
Tested: bun run test:provider
Tested: bun run test:provider-recommendation
Not-tested: Full repo typecheck (upstream baseline remains noisy)
Co-authored-by: anandh8x <test@example.com>
* Stub internal-only recording and model capability helpers (#377)
This follow-up Phase C-lite slice replaces purely internal helper modules
with stable external no-op surfaces and collapses internal elevated error
logging to a no-op. The change removes additional USER_TYPE-gated helper
behavior without touching product-facing runtime flows.
Constraint: Keep this PR limited to isolated helper modules that are already external no-ops in practice
Rejected: Pulling in broader speculation or logging sink changes | less isolated and easier to debate during review
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Continue Phase C with similarly isolated helpers before moving into mixed behavior files
Tested: bun run build
Tested: bun run smoke
Tested: bun run verify:privacy
Tested: bun run test:provider
Tested: bun run test:provider-recommendation
Not-tested: Full repo typecheck (upstream baseline remains noisy)
Co-authored-by: anandh8x <test@example.com>
* Remove internal-only bundled skills and mock helpers (#376)
* Remove internal-only bundled skills and mock rate-limit behavior
This takes the next planned Phase C-lite slice by deleting bundled skills
that only ever registered for internal users and replacing the internal
mock rate-limit helper with a stable no-op external stub. The external
build keeps the same behavior while removing a concentrated block of
USER_TYPE-gated dead code.
Constraint: Limit this PR to isolated internal-only helpers and avoid bridge, oauth, or rebrand behavior
Rejected: Broad USER_TYPE cleanup across mixed runtime surfaces | too risky for the next medium-sized PR
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: The next cleanup pass should continue with similarly isolated USER_TYPE helpers before touching main.tsx or protocol-heavy code
Tested: bun run build
Tested: bun run smoke
Tested: bun run verify:privacy
Tested: bun run test:provider
Tested: bun run test:provider-recommendation
Not-tested: Full repo typecheck (upstream baseline remains noisy)
* Align internal-only helper removal with remaining user guidance
This follow-up fixes the mock billing stub to be a true no-op and removes
stale user-facing references to /verify and /skillify from the same PR.
It also leaves a clearer paper trail for review: the deleted verify skill
was explicitly ant-gated before removal, and the remaining mock helper
callers still resolve to safe no-op returns in the external build.
Constraint: Keep the PR focused on consistency fixes and reviewer-requested evidence, not new cleanup scope
Rejected: Leave stale guidance for a later PR | would make this branch internally inconsistent after skill removal
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: When deleting gated features, always sweep user guidance and coordinator prompts in the same pass
Tested: bun run build
Tested: bun run smoke
Tested: bun run verify:privacy
Tested: bun run test:provider
Tested: bun run test:provider-recommendation
Not-tested: Full repo typecheck (upstream baseline remains noisy; changed-file scan still shows only pre-existing tipRegistry errors outside edited lines)
* Clarify generic workflow wording after skill removal
This removes the last generic verification-skill wording that could still
be read as pointing at a deleted bundled command. The guidance now talks
about project workflows rather than a specific bundled verify skill.
Constraint: Keep the follow-up limited to reviewer-facing wording cleanup on the same PR
Rejected: Leave generic wording as-is | still too easy to misread after the explicit /verify references were removed
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: When removing bundled commands, scrub both explicit and generic references in the same branch
Tested: bun run build
Tested: bun run smoke
Not-tested: Additional checks unchanged by wording-only follow-up
---------
Co-authored-by: anandh8x <test@example.com>
* test(api): add GEMINI_AUTH_MODE to environment setup in tests
* test: isolate GitHub/Gemini credential tests with fresh module imports and explicit non-bare env setup to prevent cross-test mock/cache leaks
* fix: update GitHub Copilot base URL and model defaults for improved compatibility
* fix: enhance error handling in OpenAI API response processing
* fix: improve error handling for GitHub Copilot API responses and streamline error body consumption
* fix: enhance response handling in OpenAI API shim for better error reporting and support for streaming responses
* feat: enhance GitHub device flow with fresh module import and token validation improvements
* fix: separate Copilot API routing from GitHub Models, clear stale env vars, honor providerOverride.apiKey
* fix: route GitHub GPT-5/Codex to Copilot API, show all Copilot models in picker, clear stale env vars
* fix GitHub Models API regression
* feat: update GitHub authentication to require OAuth tokens, normalize model handling for Copilot and GitHub Models
* fix: update GitHub token validation to support OAuth tokens and improve endpoint type handling
---------
Co-authored-by: Anandan <anandan.8x@gmail.com>
Co-authored-by: anandh8x <test@example.com>
* fix: strip Anthropic-specific params from 3P provider paths
Three silent failure modes affecting all third-party provider users:
1. Thinking blocks serialized as <thinking> text corrupt multi-turn
context — strip them instead of converting to raw text tags.
2. Unknown models fall through to 200k context window default, so
auto-compact never triggers — use conservative 8k for unknown
3P models with a warning log.
3. Session resume with thinking blocks causes 400 or context corruption
on 3P providers — strip thinking/redacted_thinking content blocks
from deserialized messages when resuming against a non-Anthropic
provider.
Addresses findings 2, 3, and 5 from #248.
* test: align resume stripping expectation with orphan-thinking filter
* test: isolate provider env in conversation recovery tests
* test: move provider-sensitive resume coverage behind module mocks
* test: trim extra blank lines in conversation recovery test
Keep the focused provider-resume test diff clean so the regression branch stays easy to review.
Co-Authored-By: Claude Opus 4.6 <noreply@openclaude.dev>
---------
Co-authored-by: Claude Opus 4.6 <noreply@openclaude.dev>
* fix: restore Grep and Glob reliability on OpenAI paths
Preserve Grep and Glob pattern fields during OpenAI/Codex schema sanitization, and fall back to system ripgrep when the packaged binary is missing. This keeps search tool schemas intact and improves Linux usability for npm/source installs.
Co-Authored-By: Claude Opus 4.6 <noreply@openclaude.dev>
* test: clean up ripgrep fallback test helpers
Remove the unused ripgrepCommand import and normalize mocked builtin ripgrep paths so the test behaves consistently across platforms.
Co-Authored-By: Claude Opus 4.6 <noreply@openclaude.dev>
* test: remove duplicate Codex URI schema case
Drop the duplicated WebFetch URI-format test in codexShim.test.ts so test names stay unique and failures remain easier to read.
Co-Authored-By: Claude Opus 4.6 <noreply@openclaude.dev>
* test: stabilize ripgrep fallback coverage
Avoid fs/module mocking in ripgrep fallback tests by extracting the config selection logic into a pure helper. This preserves the fallback coverage while removing the test interaction that caused the narrowed Bun hang repro.
Co-Authored-By: Claude Opus 4.6 <noreply@openclaude.dev>
* test: tighten ripgrep and schema coverage
Align the ripgrep fallback test with the actual auto-fallback branch, clean up strict typing in schema sanitizer tests, and tighten ripgrep error narrowing for type safety.
Co-Authored-By: Claude Opus 4.6 <noreply@openclaude.dev>
---------
Co-authored-by: Claude Opus 4.6 <noreply@openclaude.dev>
* fix: address code scanning alerts
Parse Gemini hostnames instead of matching raw URL substrings, redact gRPC error logs, and harden the Finder drag-drop test escape helper so the flagged paths are fixed without regressing working behavior.
* Potential fix for pull request finding 'CodeQL / Clear-text logging of sensitive information'
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* fix: restore safe grpc error summaries
A later autofix commit removed the exported gRPC error summarizer while the new regression test still imported it. Restore the safe name/code-only summary so CI stays green without reintroducing clear-text logging.
* fix: keep grpc logging generic
Remove the stale helper/test pair and keep the gRPC startup and stream logs free of error-derived data so the CodeQL clear-text logging alert stays closed while the rest of the security fixes remain intact.
---------
Co-authored-by: OpenClaude Worker 3 <worker-3@openclaude.local>
Co-authored-by: Vasanth T <148849890+Vasanthdev2004@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
When a proxy is configured, configureGlobalAgents() loads undici to set a
global dispatcher. However, undici v7.24.6 requires Node.js >= 20.18.1 and
references globalThis.File at module evaluation time for webidl type assertions.
Node 18 lacks the File global, causing ReferenceError inside the bundled
__commonJS require chain, which deadlocks due to unresolved circular
dependencies in the module initialization.
Fix by polyfilling globalThis.File early in cli.tsx entrypoint, before any
undici code loads. Try node:buffer.File (available in Node 18.13+), fallback
to minimal Blob-based stub.
Fixes: bun run start hangs indefinitely when HTTP_PROXY/HTTPS_PROXY is set
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>