OpenAI returns cached token counts in usage.prompt_tokens_details.cached_tokens
but the shim hardcoded cache_read_input_tokens to 0. This made prompt
caching invisible to the cost tracker and session summary even when
OpenAI's automatic caching was actively reducing costs.
Changes:
- Extend OpenAIStreamChunk usage interface with prompt_tokens_details
- Map cached_tokens to cache_read_input_tokens in convertChunkUsage() (see the sketch after this list)
- Same fix in _convertNonStreamingResponse() for non-streaming path
- cache_creation_input_tokens remains 0 (OpenAI auto-caching has no
creation cost — it is free and automatic)
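A minimal sketch of the mapping; the OpenAI and Anthropic usage field names are the real APIs, while the surrounding types and function shape are illustrative:

```typescript
// Illustrative shape of the shim's converter; only the field names
// come from the respective APIs.
interface OpenAIUsage {
  prompt_tokens: number;
  completion_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number };
}

function convertChunkUsage(usage: OpenAIUsage) {
  return {
    input_tokens: usage.prompt_tokens,
    output_tokens: usage.completion_tokens,
    // Previously hardcoded to 0; now surfaces OpenAI's automatic cache hits.
    cache_read_input_tokens: usage.prompt_tokens_details?.cached_tokens ?? 0,
    // OpenAI auto-caching has no separate creation cost, so this stays 0.
    cache_creation_input_tokens: 0,
  };
}
```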
Replace raw === '1' || === 'true' comparisons with isEnvTruthy() in
context.ts for consistency with getAPIProvider() in providers.ts.
This also covers the newly added CLAUDE_CODE_USE_GITHUB provider.
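For reference, a sketch of the kind of helper this describes; the exact implementation in providers.ts may accept additional values:

```typescript
// Assumed implementation; only the '1'/'true' comparisons are from
// the change described above.
function isEnvTruthy(value: string | undefined): boolean {
  if (!value) return false;
  const v = value.trim().toLowerCase();
  return v === '1' || v === 'true';
}

// Before: process.env.CLAUDE_CODE_USE_GITHUB === '1' || ... === 'true'
// After:  isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB)
```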
Add native Gemini model entries (without the google/ prefix) to both the
context window and max output token tables. Corrects gemini-2.5-pro
and gemini-2.5-flash max output tokens to 65,536 (previously 8,192 and
32,768 respectively).
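Sketched as the kind of lookup tables described; the table identifiers and the 1M-token context window values are assumptions, only the max output corrections are from this change:

```typescript
// Illustrative table shapes; identifiers are assumed.
const CONTEXT_WINDOWS: Record<string, number> = {
  'gemini-2.5-pro': 1_048_576,   // assumed to match the google/-prefixed entry
  'gemini-2.5-flash': 1_048_576,
};

const MAX_OUTPUT_TOKENS: Record<string, number> = {
  'gemini-2.5-pro': 65_536,    // was 8,192
  'gemini-2.5-flash': 65_536,  // was 32,768
};
```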
Addresses the most critical remaining issues in the provider shim layer,
building on top of #124 (recursive schema normalization + try/finally).
openaiShim.ts:
- Throw APIError via the SDK factory instead of a plain Error, enabling retry
  on 429/503 (retries were completely broken: zero retries for all 3P
  providers; see the sketch after this list)
- Guard stop_reason !== null before emitting usage-only message_delta
(Azure/Groq send usage before finish_reason)
- Fix assistant content: join text parts instead of invalid as-string cast
(Mistral rejects array content on assistant role)
- Expose real HTTP Response in withResponse() for header inspection
- Skip stream_options for local providers (Ollama < 0.5 compatibility)
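A sketch of the retry-enabling change, assuming the SDK exposes an APIError.generate(status, error, message, headers) factory; the exact signature may vary by SDK version, and the function around it is illustrative:

```typescript
import { APIError } from '@anthropic-ai/sdk';

// The SDK's backoff logic only retries its own APIError class, so a
// plain Error on 429/503 produced zero retries.
async function assertOk(res: Response): Promise<void> {
  if (res.ok) return;
  const body = await res.json().catch(() => undefined);
  // APIError.generate is assumed here; see lead-in note.
  throw APIError.generate(
    res.status,
    body,
    `Upstream provider returned ${res.status}`,
    res.headers,
  );
}
```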
codexShim.ts:
- Throw APIError at all 4 throw sites (HTTP + 3 streaming errors)
- Add tool_choice 'none' mapping, which was previously silently ignored
  (see the sketch after this list)
- Forward is_error flag with Error: prefix (matching openaiShim)
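A sketch of the tool_choice mapping; the Anthropic-side 'none' type and OpenAI's 'none'/'required' values are the real APIs, while the function name and surrounding shape are illustrative:

```typescript
// Assumed converter shape; only the wire values are from the two APIs.
type AnthropicToolChoice =
  | { type: 'auto' }
  | { type: 'any' }
  | { type: 'none' }
  | { type: 'tool'; name: string };

function mapToolChoice(choice: AnthropicToolChoice) {
  switch (choice.type) {
    case 'auto': return 'auto';
    case 'any':  return 'required';
    case 'none': return 'none'; // previously fell through and was dropped
    case 'tool': return { type: 'function' as const, function: { name: choice.name } };
  }
}
```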
- Set competing provider flags to undefined in updateSettingsForSource to ensure a clean GitHub boot (sketched below)
- Fix resolveProviderRequest to default to github:copilot when OPENAI_MODEL is unset
- Hydrate secure tokens and managed settings in system-check.ts to prevent false negatives
- Add models:read scope to GitHub device flow
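An illustrative sketch of the flag clearing; CLAUDE_CODE_USE_GITHUB appears elsewhere in this changelog, but the competing flag names are assumptions:

```typescript
// Hypothetical settings shape; only the GitHub flag name is confirmed.
function updateSettingsForSource(settings: Record<string, unknown>) {
  return {
    ...settings,
    CLAUDE_CODE_USE_GITHUB: '1',
    // Explicitly undefined so a stale flag from another provider
    // cannot win during boot-time provider detection.
    CLAUDE_CODE_USE_OPENAI: undefined,   // assumed flag name
    CLAUDE_CODE_USE_GEMINI: undefined,   // assumed flag name
    CLAUDE_CODE_USE_OLLAMA: undefined,   // assumed flag name
  };
}
```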
Wait for failed MCP transport cleanup before command exit so targeted live checks do not crash on Windows.
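A sketch of the ordering fix; the names are illustrative, the point is that the command must not exit while transport teardown is still in flight:

```typescript
// Hypothetical command wrapper around the behavior described above.
async function runLiveCheck(transports: Array<{ close(): Promise<void> }>) {
  try {
    // ... run the targeted live checks ...
  } finally {
    // allSettled waits even for transports whose close() rejects,
    // instead of letting the process exit mid-cleanup on Windows.
    await Promise.allSettled(transports.map((t) => t.close()));
  }
}
```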
Co-Authored-By: Claude <noreply@anthropic.com>
Add the MCP doctor subcommand with text and JSON output, config-only mode, and scope filtering so users can diagnose MCP issues from the CLI.
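A hypothetical registration sketch using commander; this is not a confirmed CLI surface, and the flag names simply mirror the features described above:

```typescript
import { Command } from 'commander';

const mcp = new Command('mcp');
mcp
  .command('doctor')
  .description('diagnose MCP issues from the CLI')
  .option('--json', 'emit the structured report instead of text')
  .option('--config-only', 'analyze configuration without connecting to servers')
  .option('--scope <scope>', 'limit diagnosis to one config scope')
  .action(async (opts) => {
    // build the diagnostics report, then render it as text or JSON per opts
  });
```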
Co-Authored-By: Claude <noreply@anthropic.com>
Add the diagnostics core and report model for MCP health, scope, and config analysis. This creates the structured report used by both text and JSON doctor output.
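A hypothetical shape for the structured report; field names beyond the health/scope/config split described above are assumptions:

```typescript
// Assumed report model consumed by both text and JSON output.
interface McpDoctorReport {
  servers: Array<{
    name: string;
    scope: 'user' | 'project' | 'local';          // assumed scope values
    health: 'ok' | 'unreachable' | 'misconfigured'; // assumed states
    issues: string[];
  }>;
  configFiles: Array<{ path: string; valid: boolean }>;
}
```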
Co-Authored-By: Claude <noreply@anthropic.com>
The version kill-switch calls Anthropic's GrowthBook endpoint to
enforce a minimum version. This is currently safe for 3P users only
because isAnalyticsDisabled() returns true (disabling GrowthBook).
Adding an explicit provider guard makes this safety independent of the
analytics stub, so 3P users cannot be blocked by Anthropic's version
requirements if a future upstream merge changes that behavior.
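A sketch of the guard, assuming getAPIProvider() returns 'firstParty' for Anthropic users (as the migration note below suggests); everything else is the existing path:

```typescript
import { getAPIProvider } from './providers'; // assumed path

async function checkMinimumVersion(): Promise<void> {
  // Only Anthropic first-party users are subject to Anthropic's
  // GrowthBook-driven minimum-version enforcement.
  if (getAPIProvider() !== 'firstParty') return;
  // ... existing GrowthBook fetch and version comparison ...
}
```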
Add provider guard to migrateSonnet1mToSonnet45() so it only runs for
firstParty (Anthropic) users. Without this, a 3P user with
model='sonnet[1m]' would have it rewritten to an Anthropic-specific
alias that is invalid for OpenAI/Gemini/Ollama providers.
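A sketch of the guarded migration; the target alias is an assumption, and only the firstParty check is the change described:

```typescript
// Hypothetical settings shape; getAPIProvider() as in this changelog.
function migrateSonnet1mToSonnet45(settings: { model?: string }) {
  // 3P providers: leave model untouched to avoid an Anthropic-only alias.
  if (getAPIProvider() !== 'firstParty') return settings;
  if (settings.model === 'sonnet[1m]') {
    return { ...settings, model: 'sonnet-4.5' }; // assumed target alias
  }
  return settings;
}
```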
- Make getProviderLabel() switch exhaustive, with explicit openai/gemini
  arms instead of falling through to env-var checks in the default case
  (see the sketch below)
- Add clarifying comment on additionalProperties override in schema
normalization
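A sketch of the exhaustive switch; the provider union and labels are assembled from providers named elsewhere in this changelog and may not match the real type:

```typescript
// Assumed provider union; the exhaustiveness is the point.
type Provider = 'firstParty' | 'openai' | 'gemini' | 'ollama' | 'github';

function getProviderLabel(provider: Provider): string {
  switch (provider) {
    case 'firstParty': return 'Anthropic API';
    case 'openai':     return 'OpenAI API';
    case 'gemini':     return 'Gemini API';
    case 'ollama':     return 'Ollama';
    case 'github':     return 'GitHub Models';
    // No default arm: the compiler flags any newly added provider here.
  }
}
```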
Partially addresses #112. The streaming reader in openaiStreamToAnthropic
had no error handling: if an error occurred during streaming, the reader
lock was never released. Wrapped the while loop in try/finally to ensure
reader.releaseLock() is always called.
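A sketch of the fix, with the chunk conversion body elided:

```typescript
// reader comes from response.body.getReader() as in the streaming path.
async function pump(response: Response): Promise<void> {
  const reader = response.body!.getReader();
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // ... decode `value` and convert SSE chunks to Anthropic events ...
    }
  } finally {
    // Runs on both normal completion and mid-stream errors,
    // so the reader lock is never leaked.
    reader.releaseLock();
  }
}
```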
Partially addresses #39. The cost threshold dialog hardcoded
'Anthropic API' in the title, which is misleading for users on
OpenAI, Gemini, Ollama, or other providers. Now detects the active
provider via getAPIProvider() and shows the correct label.
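Illustratively, with getAPIProvider() real per this changelog and the title wording an assumption:

```typescript
// e.g. 'OpenAI API' instead of a hardcoded 'Anthropic API' label.
const title = `${getProviderLabel(getAPIProvider())} cost threshold reached`;
```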
Fixes #111. normalizeSchemaForOpenAI only processed the top-level
object schema, leaving nested objects untouched. OpenAI strict mode
rejects schemas where nested objects have properties not listed in
their required array, causing 400 errors on tools with nested params.
Now recurses into properties, items, and anyOf/oneOf/allOf combinators
(matching the pattern used by enforceStrictSchema in codexShim.ts).
Also adds additionalProperties: false to nested objects in strict mode.
Build verified passing.
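A sketch of the recursion as described, mirroring the enforceStrictSchema pattern; the exact implementation may differ:

```typescript
type Schema = Record<string, any>;

function normalizeSchemaForOpenAI(schema: Schema): Schema {
  const out = { ...schema };
  if (out.type === 'object' && out.properties) {
    // Recurse into each property, not just the top level.
    out.properties = Object.fromEntries(
      Object.entries(out.properties).map(
        ([k, v]) => [k, normalizeSchemaForOpenAI(v as Schema)],
      ),
    );
    // Strict mode: every property must be required, no extras allowed.
    out.required = Object.keys(out.properties);
    out.additionalProperties = false;
  }
  if (out.items) out.items = normalizeSchemaForOpenAI(out.items);
  for (const key of ['anyOf', 'oneOf', 'allOf'] as const) {
    if (Array.isArray(out[key])) out[key] = out[key].map(normalizeSchemaForOpenAI);
  }
  return out;
}
```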
- Introduced a new provider profile for Atomic Chat, allowing it to be used alongside existing providers.
- Updated `package.json` to include a new development script for launching Atomic Chat.
- Modified `smart_router.py` to recognize Atomic Chat as a local provider that does not require an API key.
- Enhanced provider discovery and launch scripts to handle Atomic Chat, including model listing and connection checks.
- Added tests to ensure proper environment setup and behavior for Atomic Chat profiles.
This update extends the application to support local LLMs via Atomic Chat alongside the existing providers; a hypothetical profile shape is sketched below.
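```typescript
// Hypothetical profile shape; key names and the endpoint are
// assumptions, since the actual profile lives alongside the other
// provider profiles.
const atomicChatProfile = {
  name: 'atomic-chat',
  baseUrl: 'http://localhost:PORT/v1', // local endpoint; port unknown here
  requiresApiKey: false, // local provider, mirrored in smart_router.py
  supportsModelListing: true, // discovery scripts enumerate local models
};
```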
- Introduced environment variable CLAUDE_CODE_USE_GITHUB to enable GitHub Models.
- Added checks for GITHUB_TOKEN or GH_TOKEN for authentication.
- Updated base URL handling to include GitHub Models default.
- Enhanced provider detection and error handling for GitHub Models (detection path sketched below).
- Updated relevant functions and components to accommodate the new provider.
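A sketch of the detection path, reusing the isEnvTruthy helper sketched earlier; the env vars are from this change, while the default base URL and function name are assumptions:

```typescript
// Hypothetical resolver; returns null when GitHub Models is not enabled.
function resolveGitHubModels() {
  if (!isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB)) return null;
  const token = process.env.GITHUB_TOKEN ?? process.env.GH_TOKEN;
  if (!token) {
    throw new Error('GitHub Models enabled but GITHUB_TOKEN/GH_TOKEN is not set');
  }
  return {
    // Assumed default; the shipped constant may differ.
    baseUrl: 'https://models.github.ai/inference',
    apiKey: token,
  };
}
```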
When using non-Anthropic providers (Ollama, Gemini, Codex), the
underlying call to queryHaiku for session title generation fails.
Previously, this caused the catch block to return null, leaving the
terminal tab permanently stuck on 'Claude Code'.
Now, when the API call fails, we gracefully derive a title locally from
the user's first message (first 7 words, sentence-cased), ensuring
users still see a meaningful session title in their terminal tab.
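A sketch of the fallback as described (first 7 words, sentence-cased); the function name is illustrative:

```typescript
function deriveLocalTitle(firstUserMessage: string): string {
  const words = firstUserMessage.trim().split(/\s+/).slice(0, 7).join(' ');
  // Sentence-case: capitalize the first character, leave the rest as-is.
  return words.charAt(0).toUpperCase() + words.slice(1);
}

// deriveLocalTitle('fix the flaky retry test in ci for windows runners')
//   -> 'Fix the flaky retry test in ci'
```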
Azure OpenAI API rejects the max_tokens parameter and requires
max_completion_tokens instead. This change ensures the conversion
is robust by validating that max_tokens is a positive number before
using it, preventing edge cases like null or "null" string values
from being incorrectly sent.
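A sketch of the guarded conversion; the request shape and function name are illustrative:

```typescript
function convertMaxTokens(req: { max_tokens?: unknown }) {
  const { max_tokens, ...rest } = req as Record<string, unknown>;
  // Azure rejects max_tokens, so the value moves to
  // max_completion_tokens, but only when it is a positive number.
  if (typeof max_tokens === 'number' && Number.isFinite(max_tokens) && max_tokens > 0) {
    return { ...rest, max_completion_tokens: max_tokens };
  }
  // null, undefined, or a 'null' string: drop the parameter entirely.
  return rest;
}
```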
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>