Two provider routing bugs that cause silent wrong-model failures:
1. model.ts: getUserSpecifiedModelSetting() read ANTHROPIC_MODEL ||
GEMINI_MODEL || OPENAI_MODEL with no provider check. A user
switching from Anthropic to OpenAI with ANTHROPIC_MODEL still set
would silently send the Anthropic model name to the OpenAI API.
Now gates each env var behind the active provider from
getAPIProvider().
2. providers.ts: isCodexModel() maintained a hardcoded list of 8 model
names that was missing gpt-5.4-mini and gpt-5.2 from the canonical
CODEX_ALIAS_MODELS table in providerConfig.ts. This caused a
split-brain: getAPIProvider() returned 'openai' while
resolveProviderRequest() selected 'codex_responses' transport.
Now delegates to the exported isCodexAlias() to keep both detection
systems in sync.
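The provider gating from item 1 can be sketched roughly as follows. This is a minimal illustration, not the actual model.ts code: the ModelProvider union, the lookup table, and the getUserSpecifiedModel name are assumptions, and the environment is passed in as a plain object so the function stays pure.

```typescript
// Hypothetical subset of provider names; the real getAPIProvider() may return more.
type ModelProvider = "firstParty" | "openai" | "gemini";

type ModelEnv = Record<string, string | undefined>;

// Each provider consults only its own model variable.
const MODEL_VAR_BY_PROVIDER: Record<ModelProvider, string> = {
  firstParty: "ANTHROPIC_MODEL",
  openai: "OPENAI_MODEL",
  gemini: "GEMINI_MODEL",
};

function getUserSpecifiedModel(
  provider: ModelProvider,
  env: ModelEnv,
): string | undefined {
  const value = env[MODEL_VAR_BY_PROVIDER[provider]];
  // Treat an empty string the same as an unset variable.
  return value ? value : undefined;
}
```

With ANTHROPIC_MODEL still set from an earlier session, an OpenAI run now sees only OPENAI_MODEL, so a stale Anthropic model name is not forwarded to the OpenAI API.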
* fix: OAuth tokens secure storage for Windows & Linux
* fix: OAuth tokens secure storage for Windows & Linux #215
* fix: disable experimental API betas by default to prevent 500 errors
Tool search (defer_loading), global cache scope, and context management
betas require internal Anthropic server-side support. External accounts
receive 500 Internal Server Error when these are sent.
Set CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=true by default in the CLI
entrypoint. Users with internal access can opt back in with =false.
Also includes: cache key stability fixes (Sonnet 1M latch, system-before-
messages key ordering, resume fingerprint isMeta skip), sideQuery default
cleanup, and /dream command.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: standardize API headers to Headers type and enable tengu feature flags by default
* fix: address PR review — dream lock, MCP betas guard, redundant Partial
- Call recordConsolidation() programmatically in /dream instead of
delegating to model prompt (unreliable)
- Add CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS guard to MCP entrypoint
(was only in CLI entrypoint, causing 500s in MCP server mode)
- Remove redundant ? markers from SecretValueSource Partial<{}> type
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
getPromptCachingEnabled() returned true for all providers including
Azure Foundry, OpenAI, Gemini, and GitHub. This caused cache_control
blocks to be injected into every request sent to 3P endpoints.
Azure Foundry's strict Responsible AI content filter treats unexpected
Anthropic-specific fields (cache_control: { type: "ephemeral" }) as
a jailbreak signal and rejects the request with a 400 error — even
for a simple prompt like "hi".
Fix: return false early when provider is not firstParty, bedrock, or
vertex — the only providers that understand and support prompt caching.
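A minimal sketch of the early return described above, assuming a hypothetical ApiProvider union and helper names (isPromptCachingEnabledFor, buildTextBlock) that are not taken from the codebase:

```typescript
// Provider names follow the commit message; the union itself is an assumption.
type ApiProvider =
  | "firstParty"
  | "bedrock"
  | "vertex"
  | "foundry"
  | "openai"
  | "gemini"
  | "github";

// Only these providers are treated as supporting prompt caching.
const PROVIDERS_WITH_PROMPT_CACHING: ReadonlySet<ApiProvider> =
  new Set<ApiProvider>(["firstParty", "bedrock", "vertex"]);

function isPromptCachingEnabledFor(provider: ApiProvider): boolean {
  return PROVIDERS_WITH_PROMPT_CACHING.has(provider);
}

interface TextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

// Hypothetical caller: attach cache_control only when the provider supports it.
function buildTextBlock(text: string, provider: ApiProvider): TextBlock {
  const block: TextBlock = { type: "text", text };
  if (isPromptCachingEnabledFor(provider)) {
    block.cache_control = { type: "ephemeral" };
  }
  return block;
}
```

For a provider such as foundry the field is omitted entirely rather than sent as null, so the request body carries no Anthropic-specific keys.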
Fixes #273
Related: #267 (Finding 1)
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: add agentModels and agentRouting to SettingsSchema
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add agentRouting module for per-agent provider resolution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: thread providerOverride through OpenAI shim for per-agent routing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: getAnthropicClient accepts providerOverride for agent routing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: thread providerOverride through Options and queryModel calls
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: thread providerOverride through query loop and ToolUseContext
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: resolve agent routing in runAgent and inject providerOverride
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add Agent Routing configuration guide to README
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add unit tests for resolveAgentProvider + plaintext api_key note
- 15 tests covering priority chain (name > subagentType > default > null)
- normalize() case-insensitive and hyphen/underscore equivalence
- Edge cases: null settings, missing config sections, non-existent model
- README note about api_key stored in plaintext
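The priority chain and key normalization covered by these tests could look roughly like the sketch below. The settings shape (agentRouting mapping agent keys to model names, agentModels mapping model names to base_url/api_key) is an assumption, as is returning null when the routed model has no agentModels entry; the real module may differ.

```typescript
// Hypothetical settings shape; the real SettingsSchema may differ.
interface AgentModelConfig {
  base_url: string;
  api_key: string;
}

interface RoutingSettings {
  agentModels?: Record<string, AgentModelConfig>;
  agentRouting?: Record<string, string>; // agent key -> model name
}

interface ResolvedProvider extends AgentModelConfig {
  model: string;
}

// Case-insensitive, with hyphens and underscores treated as equivalent.
function normalize(key: string): string {
  return key.toLowerCase().replace(/_/g, "-");
}

function resolveAgentProvider(
  name: string | undefined,
  subagentType: string | undefined,
  settings: RoutingSettings | null | undefined,
): ResolvedProvider | null {
  const routing = settings?.agentRouting;
  const models = settings?.agentModels;
  if (!routing || !models) return null;

  const normalizedRouting = new Map<string, string>();
  for (const [key, model] of Object.entries(routing)) {
    normalizedRouting.set(normalize(key), model);
  }

  // Priority: agent name > subagent type > "default" > null.
  for (const candidate of [name, subagentType, "default"]) {
    if (!candidate) continue;
    const model = normalizedRouting.get(normalize(candidate));
    if (model === undefined) continue;
    const config = models[model];
    // Assumption: a route pointing at an unknown model resolves to null.
    return config ? { model, ...config } : null;
  }
  return null;
}
```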
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* security: address code review — SSRF, credential leak, key collision
- base_url schema now uses z.string().url() for SSRF mitigation
- Strip auth headers (Authorization, x-api-key, api-key) from
defaultHeaders when providerOverride is active, preventing
Anthropic credentials from leaking to third-party endpoints
- Warn on duplicate normalized routing keys to prevent silent shadowing
- providerOverride.apiKey is never logged (verified via grep)
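The header stripping in the second bullet can be illustrated with a small helper. This is a sketch under the assumption that headers are available as a plain string record; the stripAuthHeaders name is hypothetical and the real code may operate on a Headers instance instead.

```typescript
// Header names listed in the commit message, compared case-insensitively.
const AUTH_HEADER_NAMES: ReadonlySet<string> = new Set([
  "authorization",
  "x-api-key",
  "api-key",
]);

// Returns a copy of the headers without credential-bearing entries.
function stripAuthHeaders(
  headers: Record<string, string>,
): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    if (!AUTH_HEADER_NAMES.has(name.toLowerCase())) {
      safe[name] = value;
    }
  }
  return safe;
}
```

Returning a copy rather than mutating the input keeps the original defaultHeaders intact for requests that still go to the first-party endpoint.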
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: 冯俊辉 <fengjunhui@shiyanjia.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add --provider CLI flag for multi-provider support
Adds a --provider flag that maps friendly provider names to the
environment variables the codebase uses for provider detection.
No more manual env-var configuration — users can now simply run:
openclaude --provider openai --model gpt-4o
openclaude --provider gemini --model gemini-2.0-flash
openclaude --provider ollama --model llama3.2
openclaude --provider bedrock
openclaude --provider vertex
Implementation details:
- providerFlag.ts: core logic — maps provider names to env vars,
uses ??= so explicit env vars always win over the flag defaults
- providerFlag.test.ts: 18 tests covering all 7 providers,
error messages, model passthrough, and env-var precedence
- cli.tsx: early fast-path (mirrors --bare pattern) — sets env
vars before Commander option-building and module constants run
- main.tsx: adds --provider to Commander option chain for --help
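The ??= precedence rule can be sketched as below. Only the openai and gemini mappings are shown, using variable names that appear elsewhere in this log; the table layout, the applyProviderFlag name, and the value "1" for the flag variable are assumptions, and the other providers are omitted rather than guessed.

```typescript
type FlagEnv = Record<string, string | undefined>;

// Illustrative subset of the provider table.
const PROVIDER_FLAG_VARS: Record<string, string> = {
  openai: "CLAUDE_CODE_USE_OPENAI",
  gemini: "CLAUDE_CODE_USE_GEMINI",
};

const PROVIDER_MODEL_VARS: Record<string, string> = {
  openai: "OPENAI_MODEL",
  gemini: "GEMINI_MODEL",
};

function applyProviderFlag(
  provider: string,
  model: string | undefined,
  env: FlagEnv,
): void {
  if (!(provider in PROVIDER_FLAG_VARS)) {
    throw new Error(`Unknown provider: ${provider}`);
  }
  // ??= assigns only when the variable is undefined (or null),
  // so a value the user exported explicitly is left alone.
  env[PROVIDER_FLAG_VARS[provider]] ??= "1";
  if (model !== undefined) {
    env[PROVIDER_MODEL_VARS[provider]] ??= model;
  }
}
```

Note that ??= does not overwrite an empty string, so a variable exported as "" also counts as explicitly set in this sketch.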
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: custom OPENAI_BASE_URL always wins over Codex model alias detection
When OPENAI_MODEL=gpt-5.4 (or gpt-5.4-mini) and a custom OPENAI_BASE_URL
is set (Azure, OpenRouter, etc), the transport was incorrectly forced to
codex_responses because gpt-5.4 is in CODEX_ALIAS_MODELS. This caused
requests to be sent with Codex auth instead of the user's API key,
resulting in 401 Unauthorized errors.
Fix: only use codex_responses when the base URL is explicitly the Codex
endpoint, OR when no custom base URL is set and the model is a Codex
alias. An explicit OPENAI_BASE_URL always takes priority over model-name
based Codex detection.
Verified locally: gpt-5.4 via OpenRouter now correctly shows
Provider=OpenRouter, Endpoint=https://openrouter.ai/api/v1 instead of
routing to chatgpt.com/backend-api/codex.
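The decision rule above can be sketched as a small pure function. The alias set is only the subset named in this log, the "chat_completions" transport name is a placeholder for whatever the non-Codex transport is called, and the trailing-slash handling is an assumption.

```typescript
// "codex_responses" comes from the commit message; "chat_completions" is a placeholder.
type Transport = "codex_responses" | "chat_completions";

// Subset of CODEX_ALIAS_MODELS mentioned in this log.
const CODEX_ALIASES: ReadonlySet<string> = new Set([
  "gpt-5.4",
  "gpt-5.4-mini",
  "gpt-5.2",
]);

const CODEX_BASE_URL = "https://chatgpt.com/backend-api/codex";

function isCodexBaseUrl(baseUrl: string): boolean {
  // Ignore trailing slashes when comparing.
  return baseUrl.replace(/\/+$/, "") === CODEX_BASE_URL;
}

function selectTransport(model: string, baseUrl?: string): Transport {
  if (baseUrl) {
    // An explicit base URL decides the transport, regardless of model name.
    return isCodexBaseUrl(baseUrl) ? "codex_responses" : "chat_completions";
  }
  // No custom base URL: fall back to model-name detection.
  return CODEX_ALIASES.has(model) ? "codex_responses" : "chat_completions";
}
```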
Fixes #200, #203
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
The Gemini provider uses Google's OpenAI-compatible endpoint
(generativelanguage.googleapis.com/v1beta/openai) but the client
routing condition in client.ts only checked CLAUDE_CODE_USE_OPENAI
and CLAUDE_CODE_USE_GITHUB — CLAUDE_CODE_USE_GEMINI was missing.
This caused every Gemini request to fall through to the Anthropic
client path. Since ANTHROPIC_API_KEY is not set when using Gemini,
the Anthropic SDK threw:
"Could not resolve authentication method. Expected either apiKey
or authToken to be set."
Fix: add CLAUDE_CODE_USE_GEMINI to the OpenAI shim routing condition
so Gemini requests correctly reach createOpenAIShimClient(), which
maps GEMINI_API_KEY → OPENAI_API_KEY and sets OPENAI_BASE_URL to
the Google endpoint.
Closes #176
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously getRateLimitResetDelayMs only read the Anthropic-specific
'anthropic-ratelimit-unified-reset' header (Unix timestamp), returning
null for every other provider. This meant OpenAI, GitHub, and Codex
users in persistent retry mode (CLAUDE_CODE_UNATTENDED_RETRY=1) always
fell back to dumb exponential backoff even when the server included an
exact reset time in the response headers.
This change makes the function provider-aware:
- firstParty (Anthropic): existing behaviour preserved — reads
'anthropic-ratelimit-unified-reset' Unix timestamp
- openai / codex / github: reads 'x-ratelimit-reset-requests' and
'x-ratelimit-reset-tokens' (OpenAI relative duration strings like
"1s", "6m0s", "1h30m0s"), picks the larger of the two so retries
don't fire before both token and request limits have reset
- bedrock / vertex / foundry / gemini: returns null (no standard
reset header for these providers)
Adds parseOpenAIDuration() as an exported helper to convert OpenAI's
duration format into milliseconds.
16 new tests covering all provider paths and edge cases.
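A possible shape for parseOpenAIDuration() and the "larger of the two" rule is sketched below. It is not the committed implementation: support for "ms" units and fractional values is an assumption beyond the three formats quoted above, and getOpenAIResetDelayMs is a hypothetical helper name.

```typescript
const UNIT_TO_MS: Record<string, number> = {
  h: 3_600_000,
  m: 60_000,
  s: 1_000,
  ms: 1,
};

// Parses strings such as "1s", "6m0s" or "1h30m0s" into milliseconds.
// Returns null when any part of the string is not a <number><unit> pair.
function parseOpenAIDuration(value: string): number | null {
  // "ms" is listed before "m" so that "20ms" is not read as 20 minutes.
  const pattern = /(\d+(?:\.\d+)?)(ms|h|m|s)/g;
  let total = 0;
  let consumed = 0;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(value)) !== null) {
    if (match.index !== consumed) return null; // unparsed text between parts
    total += parseFloat(match[1]) * UNIT_TO_MS[match[2]];
    consumed = pattern.lastIndex;
  }
  if (consumed === 0 || consumed !== value.length) return null;
  return total;
}

// Picks the larger reset delay so a retry waits for both limits to clear.
function getOpenAIResetDelayMs(
  headers: Record<string, string | undefined>,
): number | null {
  const delays = [
    headers["x-ratelimit-reset-requests"],
    headers["x-ratelimit-reset-tokens"],
  ]
    .map((raw) => (raw === undefined ? null : parseOpenAIDuration(raw)))
    .filter((ms): ms is number => ms !== null);
  return delays.length > 0 ? Math.max(...delays) : null;
}
```

For example, headers of "1s" for requests and "6m0s" for tokens yield a delay of 360000 ms, since the token limit resets later.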
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1. errors.ts: Add getCustomOffSwitchMessage() that returns a
provider-neutral message for 3P users instead of the hardcoded
"Opus is experiencing high load, please use /model to switch to
Sonnet" which is misleading for OpenAI/Gemini/Ollama users.
The original constant is preserved for backward-compatible string
matching in error handlers.
2. Onboarding.tsx: Skip the "approve API key" step when a 3P provider
is active. Previously, having ANTHROPIC_API_KEY in the environment
(e.g., from a previous Anthropic setup) triggered an irrelevant
Anthropic key approval UI even when using Gemini or OpenAI.
1. cli/update.ts: Block the update command for third-party providers.
The update mechanism downloads from Anthropic's GCS bucket, which
would silently replace the OpenClaude build (with the OpenAI shim)
with the upstream Claude Code binary (without it). Now shows an
actionable message directing users to rebuild from source.
2. codexShim.ts: Filter thinking blocks from assistant history, matching
the openaiShim behavior. Without this, thinking blocks were included
as plain text in assistant messages for the Codex transport but
excluded for the OpenAI transport — causing inconsistent history
when switching providers mid-session.
Three targeted fixes:
1. Replace Math.random() with crypto.randomUUID() for message and tool
call IDs in both openaiShim.ts and codexShim.ts. Math.random() is
not cryptographically secure, and its output is predictable in seeded
environments.
2. Anchor Azure endpoint detection to parsed hostname instead of raw
URL regex. Adds support for Azure AI Foundry (services.ai.azure.com)
alongside existing cognitiveservices and openai Azure endpoints.
Prevents SSRF-style bypass via path segments.
3. Surface content safety filter blocks to the user. When Gemini or
Azure returns finish_reason 'content_filter' or 'safety', emit a
visible text block '[Content blocked by provider safety filter]'
instead of silently returning empty/truncated content with
stop_reason 'end_turn'. Applied to both streaming and non-streaming.
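The hostname-anchored check from item 2 might look like the following sketch. The suffix list reflects the three endpoint families named above, with full domain suffixes filled in as an assumption; the isAzureEndpoint name is hypothetical.

```typescript
// Assumed domain suffixes for the three Azure endpoint families.
const AZURE_HOST_SUFFIXES: readonly string[] = [
  ".openai.azure.com",
  ".cognitiveservices.azure.com",
  ".services.ai.azure.com",
];

function isAzureEndpoint(baseUrl: string): boolean {
  let hostname: string;
  try {
    // Parse first so that only the host is inspected, never the path or query.
    hostname = new URL(baseUrl).hostname.toLowerCase();
  } catch {
    return false; // not a valid absolute URL
  }
  return AZURE_HOST_SUFFIXES.some((suffix) => hostname.endsWith(suffix));
}
```

A raw regex over the whole URL would accept something like https://evil.example/x.openai.azure.com/, because the Azure domain appears in the path; matching on the parsed hostname rejects it.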
Two fixes for issue #133 where setting ANTHROPIC_API_KEY=dummy alongside
CLAUDE_CODE_USE_GEMINI=1 causes "Invalid API key" errors:
1. auth.ts: In the CI branch of getAnthropicApiKeyWithSource(), the
ANTHROPIC_API_KEY value was returned without checking isUsing3PServices().
A dummy key leaked into the Anthropic key resolution pipeline even when
Gemini was the active provider. Now guards with isUsing3PServices().
2. errors.ts: The x-api-key error handler surfaced "Invalid API key" for
any provider. Added getAPIProvider() === 'firstParty' guard so 3P users
see the real underlying error instead of a misleading auth message.
Note: The cli.tsx Gemini validation fix (originally part of this PR) was
independently implemented in PR #121 and is already on main.
OpenAI returns cached token counts in usage.prompt_tokens_details.cached_tokens
but the shim hardcoded cache_read_input_tokens to 0. This made prompt
caching invisible to the cost tracker and session summary even when
OpenAI's automatic caching was actively reducing costs.
Changes:
- Extend OpenAIStreamChunk usage interface with prompt_tokens_details
- Map cached_tokens to cache_read_input_tokens in convertChunkUsage()
- Same fix in _convertNonStreamingResponse() for non-streaming path
- cache_creation_input_tokens remains 0 (OpenAI auto-caching has no
creation cost — it is free and automatic)
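The usage mapping can be sketched as below. The interface and function names are illustrative rather than copied from the shim, and passing prompt_tokens through unchanged as input_tokens is an assumption; the commit message does not say whether cached tokens are subtracted.

```typescript
// Relevant slice of an OpenAI usage payload.
interface OpenAIUsage {
  prompt_tokens: number;
  completion_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number };
}

// Anthropic-style usage fields consumed by the cost tracker.
interface ShimUsage {
  input_tokens: number;
  output_tokens: number;
  cache_read_input_tokens: number;
  cache_creation_input_tokens: number;
}

function convertUsage(usage: OpenAIUsage): ShimUsage {
  return {
    // Assumption: prompt_tokens is forwarded as-is.
    input_tokens: usage.prompt_tokens,
    output_tokens: usage.completion_tokens,
    // Previously hardcoded to 0; missing details still fall back to 0.
    cache_read_input_tokens: usage.prompt_tokens_details?.cached_tokens ?? 0,
    // OpenAI reports no separate cache-creation count.
    cache_creation_input_tokens: 0,
  };
}
```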
Addresses the most critical remaining issues in the provider shim layer,
building on top of #124 (recursive schema normalization + try/finally).
openaiShim.ts:
- Throw APIError via SDK factory instead of plain Error — enables retry
on 429/503 (was completely broken: zero retries for all 3P providers)
- Guard stop_reason !== null before emitting usage-only message_delta
(Azure/Groq send usage before finish_reason)
- Fix assistant content: join text parts instead of invalid as-string cast
(Mistral rejects array content on assistant role)
- Expose real HTTP Response in withResponse() for header inspection
- Skip stream_options for local providers (Ollama < 0.5 compatibility)
codexShim.ts:
- Throw APIError at all 4 throw sites (HTTP + 3 streaming errors)
- Add tool_choice 'none' mapping (was silently ignored)
- Forward is_error flag with Error: prefix (matching openaiShim)
- Set competing provider flags to undefined in updateSettingsForSource to ensure clean GitHub boot
- Fix resolveProviderRequest to default to github:copilot when OPENAI_MODEL is unset
- Hydrate secure tokens and managed settings in system-check.ts to prevent false negatives
- Add models:read scope to GitHub device flow