Models served through Ollama/vLLM with strict Jinja templates (Devstral,
Mistral, etc.) require strict user↔assistant role alternation and reject
requests with consecutive messages of the same role.
convertMessages() could produce consecutive user or assistant messages in
three scenarios: batched user input, text-only + tool_use assistant turns,
and tool result remainders followed by another user message.
Added a coalescing pass at the end of convertMessages() that merges
consecutive same-role messages (string concat or array concat), preserving
tool_calls on assistant messages. Tool and system messages are excluded
from coalescing as they have their own alternation rules.
Includes regression tests for both user and assistant coalescing.
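A minimal sketch of such a coalescing pass, for illustration only. The ChatMessage shape, the mergeContent helper and the newline separator for string content are assumptions, not the actual convertMessages() code.

```typescript
// Hypothetical, simplified message shape; the real shim types are richer.
type ContentPart = { type: string; [key: string]: unknown };
type ToolCall = {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
};

interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string | ContentPart[] | null;
  tool_calls?: ToolCall[];
}

// string + string is joined with a newline (assumed separator);
// any other combination is normalized to part arrays and concatenated.
function mergeContent(
  a: ChatMessage["content"],
  b: ChatMessage["content"],
): ChatMessage["content"] {
  if (a == null) return b;
  if (b == null) return a;
  if (typeof a === "string" && typeof b === "string") return `${a}\n${b}`;
  const toParts = (c: string | ContentPart[]): ContentPart[] =>
    typeof c === "string" ? [{ type: "text", text: c }] : c;
  return [...toParts(a), ...toParts(b)];
}

function coalesceMessages(messages: ChatMessage[]): ChatMessage[] {
  const out: ChatMessage[] = [];
  for (const msg of messages) {
    const prev = out[out.length - 1];
    // Tool and system messages are never merged.
    const mergeable = msg.role === "user" || msg.role === "assistant";
    if (prev && mergeable && prev.role === msg.role) {
      prev.content = mergeContent(prev.content, msg.content);
      // Keep the tool calls of both assistant messages.
      if (prev.tool_calls || msg.tool_calls) {
        prev.tool_calls = [...(prev.tool_calls ?? []), ...(msg.tool_calls ?? [])];
      }
    } else {
      out.push({ ...msg }); // shallow copy so the input objects are not mutated
    }
  }
  return out;
}
```

With this sketch, two batched user messages "a" and "b" become one user message "a\nb", while consecutive tool messages are left as they are.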
Fixes #202
* fix: disable experimental API betas by default to prevent 500 errors
Tool search (defer_loading), global cache scope, and context management
betas require internal Anthropic server-side support. External accounts
receive 500 Internal Server Error when these are sent.
Set CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=true by default in the CLI
entrypoint. Users with internal access can opt back in by setting it to false.
Also includes: cache key stability fixes (Sonnet 1M latch, system-before-
messages key ordering, resume fingerprint isMeta skip), sideQuery default
cleanup, and /dream command.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: standardize API headers to Headers type and enable tengu feature flags by default
* fix: address PR review — dream lock, MCP betas guard, redundant Partial
- Call recordConsolidation() programmatically in /dream instead of
delegating to model prompt (unreliable)
- Add CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS guard to MCP entrypoint
(was only in CLI entrypoint, causing 500s in MCP server mode)
- Remove redundant ? markers from SecretValueSource Partial<{}> type
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add agentModels and agentRouting to SettingsSchema
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add agentRouting module for per-agent provider resolution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: thread providerOverride through OpenAI shim for per-agent routing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: getAnthropicClient accepts providerOverride for agent routing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: thread providerOverride through Options and queryModel calls
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: thread providerOverride through query loop and ToolUseContext
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: resolve agent routing in runAgent and inject providerOverride
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add Agent Routing configuration guide to README
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add unit tests for resolveAgentProvider + plaintext api_key note
- 15 tests covering priority chain (name > subagentType > default > null)
- normalize() case-insensitive and hyphen/underscore equivalence
- Edge cases: null settings, missing config sections, non-existent model
- README note about api_key stored in plaintext
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* security: address code review — SSRF, credential leak, key collision
- base_url schema now uses z.string().url() for SSRF mitigation
- Strip auth headers (Authorization, x-api-key, api-key) from
defaultHeaders when providerOverride is active, preventing
Anthropic credentials from leaking to third-party endpoints
- Warn on duplicate normalized routing keys to prevent silent shadowing
- providerOverride.apiKey is never logged (verified via grep)
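The header stripping could look roughly like the sketch below. sanitizeDefaultHeaders and its boolean parameter are hypothetical names used for illustration; only the three header names come from the list above.

```typescript
// Header names from the commit text, compared in lower case because
// HTTP header names are case-insensitive.
const AUTH_HEADERS = new Set(["authorization", "x-api-key", "api-key"]);

// Hypothetical helper: returns a copy of `headers` without credential
// headers when a provider override is active, else the headers unchanged.
function sanitizeDefaultHeaders(
  headers: Record<string, string>,
  hasProviderOverride: boolean,
): Record<string, string> {
  if (!hasProviderOverride) return headers;
  const safe: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    if (!AUTH_HEADERS.has(name.toLowerCase())) {
      safe[name] = headers[name] as string;
    }
  }
  return safe;
}
```

Building a new object instead of deleting keys in place keeps the original defaultHeaders usable for requests that still go to Anthropic.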
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: 冯俊辉 <fengjunhui@shiyanjia.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three targeted fixes:
1. Replace Math.random() with crypto.randomUUID() for message and tool
call IDs in both openaiShim.ts and codexShim.ts. Math.random() is
not cryptographically secure and can be predictable in seeded environments.
2. Anchor Azure endpoint detection to parsed hostname instead of raw
URL regex. Adds support for Azure AI Foundry (services.ai.azure.com)
alongside existing cognitiveservices and openai Azure endpoints.
Prevents SSRF-style bypass via path segments.
3. Surface content safety filter blocks to the user. When Gemini or
Azure returns finish_reason 'content_filter' or 'safety', emit a
visible text block '[Content blocked by provider safety filter]'
instead of silently returning empty/truncated content with
stop_reason 'end_turn'. Applied to both streaming and non-streaming.
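For fix 2, a hostname-anchored check might look like the following sketch. isAzureEndpoint and the exact suffix list are assumptions derived from the endpoint names mentioned above, not the shim's actual code.

```typescript
// Assumed suffix list, based on the Azure endpoint families named above.
const AZURE_HOST_SUFFIXES = [
  ".openai.azure.com",
  ".cognitiveservices.azure.com",
  ".services.ai.azure.com",
];

function isAzureEndpoint(baseUrl: string): boolean {
  let hostname: string;
  try {
    // Only the parsed hostname is inspected, never the path or query.
    hostname = new URL(baseUrl).hostname.toLowerCase();
  } catch {
    return false; // not a parseable absolute URL
  }
  return AZURE_HOST_SUFFIXES.some((suffix) => hostname.endsWith(suffix));
}
```

A URL such as https://evil.example/openai.azure.com would match a raw regex over the whole string, but it is rejected here because its hostname is evil.example.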
OpenAI returns cached token counts in usage.prompt_tokens_details.cached_tokens
but the shim hardcoded cache_read_input_tokens to 0. This made prompt
caching invisible to the cost tracker and session summary even when
OpenAI's automatic caching was actively reducing costs.
Changes:
- Extend OpenAIStreamChunk usage interface with prompt_tokens_details
- Map cached_tokens to cache_read_input_tokens in convertChunkUsage()
- Same fix in _convertNonStreamingResponse() for non-streaming path
- cache_creation_input_tokens remains 0 (OpenAI auto-caching has no
creation cost — it is free and automatic)
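A minimal sketch of the mapping, assuming simplified usage types. convertUsage is a hypothetical stand-in for convertChunkUsage(); the sketch passes prompt_tokens through unchanged, and whether the real shim adjusts input_tokens for cached tokens is not stated here.

```typescript
// Subset of the OpenAI usage object that matters for this mapping.
interface OpenAIUsage {
  prompt_tokens: number;
  completion_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number };
}

// Subset of the Anthropic-style usage object the cost tracker reads.
interface AnthropicUsage {
  input_tokens: number;
  output_tokens: number;
  cache_read_input_tokens: number;
  cache_creation_input_tokens: number;
}

function convertUsage(usage: OpenAIUsage): AnthropicUsage {
  return {
    input_tokens: usage.prompt_tokens,
    output_tokens: usage.completion_tokens,
    // Previously hardcoded to 0; falls back to 0 when the provider
    // does not report prompt_tokens_details.
    cache_read_input_tokens: usage.prompt_tokens_details?.cached_tokens ?? 0,
    cache_creation_input_tokens: 0,
  };
}
```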
Addresses the most critical remaining issues in the provider shim layer,
building on top of #124 (recursive schema normalization + try/finally).
openaiShim.ts:
- Throw APIError via SDK factory instead of plain Error — enables retry
on 429/503 (was completely broken: zero retries for all 3P providers)
- Guard stop_reason !== null before emitting usage-only message_delta
(Azure/Groq send usage before finish_reason)
- Fix assistant content: join text parts instead of invalid as-string cast
(Mistral rejects array content on assistant role)
- Expose real HTTP Response in withResponse() for header inspection
- Skip stream_options for local providers (Ollama < 0.5 compatibility)
codexShim.ts:
- Throw APIError at all 4 throw sites (HTTP + 3 streaming errors)
- Add tool_choice 'none' mapping (was silently ignored)
- Forward is_error flag with Error: prefix (matching openaiShim)
- Make getProviderLabel() switch exhaustive with explicit openai/gemini
arms instead of falling through to env-var checks in default
- Add clarifying comment on additionalProperties override in schema
normalization
Partially addresses #112. The streaming reader in openaiStreamToAnthropic
had no error handling: if an error occurred during streaming, the reader
lock was never released. The while loop is now wrapped in try/finally so
that reader.releaseLock() is always called.
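The try/finally pattern, sketched against a minimal reader interface. ChunkReader and drain are hypothetical names; the real code reads from a stream reader inside openaiStreamToAnthropic.

```typescript
// Minimal subset of a stream reader, declared locally so the sketch
// does not depend on a particular runtime's stream types.
interface ChunkReader<T> {
  read(): Promise<{ done: boolean; value?: T }>;
  releaseLock(): void;
}

async function drain<T>(
  reader: ChunkReader<T>,
  onChunk: (chunk: T) => void,
): Promise<void> {
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      if (value !== undefined) onChunk(value);
    }
  } finally {
    // Runs on normal completion and when read() or onChunk throws.
    reader.releaseLock();
  }
}
```

The error still propagates to the caller; finally only guarantees that the lock is released first.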
Fixes #111. normalizeSchemaForOpenAI only processed the top-level
object schema, leaving nested objects untouched. OpenAI strict mode
rejects schemas where nested objects have properties not listed in
their required array, causing 400 errors on tools with nested params.
Now recurses into properties, items, and anyOf/oneOf/allOf combinators
(matching the pattern used by enforceStrictSchema in codexShim.ts).
Also adds additionalProperties: false to nested objects in strict mode.
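A sketch of the recursion for the strict case only. normalizeSchemaDeep, JsonSchema and isSchema are illustrative names, and the real normalizeSchemaForOpenAI may differ in detail.

```typescript
type JsonSchema = { [key: string]: unknown };

function isSchema(value: unknown): value is JsonSchema {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function normalizeSchemaDeep(schema: JsonSchema): JsonSchema {
  const out: JsonSchema = { ...schema };

  const properties = out["properties"];
  if (isSchema(properties)) {
    // Recurse into every property schema.
    const normalized: JsonSchema = {};
    for (const key of Object.keys(properties)) {
      const child = properties[key];
      normalized[key] = isSchema(child) ? normalizeSchemaDeep(child) : child;
    }
    out["properties"] = normalized;

    // Keep existing required entries and append any missing property keys.
    const declared = out["required"];
    const required: string[] = Array.isArray(declared)
      ? declared.filter((k): k is string => typeof k === "string")
      : [];
    for (const key of Object.keys(normalized)) {
      if (required.indexOf(key) === -1) required.push(key);
    }
    out["required"] = required;
    out["additionalProperties"] = false;
  }

  // Array item schemas.
  const items = out["items"];
  if (isSchema(items)) out["items"] = normalizeSchemaDeep(items);

  // Combinators: each branch is a schema of its own.
  for (const keyword of ["anyOf", "oneOf", "allOf"]) {
    const branches = out[keyword];
    if (Array.isArray(branches)) {
      out[keyword] = branches.map((b: unknown) =>
        isSchema(b) ? normalizeSchemaDeep(b) : b,
      );
    }
  }
  return out;
}
```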
Build verified passing.
- Introduced environment variable CLAUDE_CODE_USE_GITHUB to enable GitHub Models.
- Added checks for GITHUB_TOKEN or GH_TOKEN for authentication.
- Updated base URL handling to include GitHub Models default.
- Enhanced provider detection and error handling for GitHub Models.
- Updated relevant functions and components to accommodate the new provider.
Azure OpenAI API rejects the max_tokens parameter and requires
max_completion_tokens instead. This change ensures the conversion
is robust by validating that max_tokens is a positive number before
using it, preventing edge cases like null or "null" string values
from being incorrectly sent.
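The validation can be sketched as a small helper that returns either the converted field or nothing. toMaxCompletionTokens is a hypothetical name, and the finiteness check is an extra guard added in this sketch.

```typescript
// Returns the field to spread into the request body, or an empty object
// when max_tokens is not a usable positive number (null, "null", 0, NaN...).
function toMaxCompletionTokens(
  maxTokens: unknown,
): { max_completion_tokens?: number } {
  if (
    typeof maxTokens === "number" &&
    Number.isFinite(maxTokens) &&
    maxTokens > 0
  ) {
    return { max_completion_tokens: maxTokens };
  }
  return {};
}

// Example: the key is simply absent when the input is invalid.
const withLimit = { model: "example-model", ...toMaxCompletionTokens(4096) };
const withoutLimit = { model: "example-model", ...toMaxCompletionTokens("null") };
```

Omitting the key entirely, rather than sending null, leaves the provider's own default in effect.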
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two bugs in convertTools() caused Gemini's OpenAI-compatible endpoint
to reject tool schemas with 400 "schema requires unspecified property":
1. The Agent tool patch unconditionally pushed 'message' into required[]
even though 'message' is not a property of the Agent schema. Gemini
strictly validates that every key in required[] exists in properties.
2. normalizeSchemaForOpenAI() added all property keys to required[] for
OpenAI strict mode, but this conflicts with Gemini's stricter schema
validation which rejects required keys absent from properties.
Fix:
- Agent tool patch now only adds a key to required[] if it exists in
schema.properties (fixes the 'message' 400 error on Gemini)
- normalizeSchemaForOpenAI() accepts a strict flag: true for OpenAI
(promotes all property keys into required[]), false for Gemini
(filters required[] to only keys present in properties)
- convertTools() detects CLAUDE_CODE_USE_GEMINI and passes strict=false
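The effect of the strict flag on required[] can be sketched as follows, for the top level only. normalizeRequired and ObjectSchema are illustrative names rather than the real normalizeSchemaForOpenAI() signature.

```typescript
interface ObjectSchema {
  type?: string;
  properties?: Record<string, unknown>;
  required?: string[];
  [key: string]: unknown;
}

function normalizeRequired(schema: ObjectSchema, strict: boolean): ObjectSchema {
  if (!schema.properties) return schema; // nothing to reconcile
  const keys = Object.keys(schema.properties);
  const existing = schema.required ?? [];
  const required = strict
    ? // OpenAI strict mode: every property key must be required.
      [...existing, ...keys.filter((k) => existing.indexOf(k) === -1)]
    : // Gemini: required may only name keys that exist in properties.
      existing.filter((k) => keys.indexOf(k) !== -1);
  return { ...schema, required };
}
```

For an Agent-like schema with properties prompt and subagent_type and required ["prompt", "message"], strict=false yields ["prompt"], which drops the stray 'message' entry that Gemini rejected.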
Fixes #82
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Azure OpenAI and newer OpenAI models (o1, o3, o4...) reject `max_tokens`
with a 400 error and require `max_completion_tokens` instead.
Maps `params.max_tokens` → `max_completion_tokens` in the request body,
which is the current standard across OpenAI-compatible providers.
This commit addresses strict schema validation limitations when running subagents under OpenAI backend shims.
- Drops empty properties from payloads (like Record<string, string>) that break OpenAI's Structured Outputs validation.
- Handles edge cases for automated initial teams when subagents bypass standard creation routines.
- Omits parameters that GPT-5 derivatives do not support, such as temperature and top_p.
Some providers send an empty string as the first delta to signal
streaming start. The falsy check `if (delta.content)` treated "" as
absent, skipping content_block_start. The next delta with actual
content was emitted without it, violating the Anthropic protocol.
Changed to `delta.content != null` to distinguish between absent field
and empty string.
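A reduced sketch of the difference between the two checks. eventsFor and the event names as plain strings are simplifications; skipping the empty text delta is a choice made in this sketch, not necessarily what the shim does.

```typescript
interface Delta {
  content?: string | null;
}

type StreamEvent = "content_block_start" | "content_block_delta";

function eventsFor(deltas: Delta[]): StreamEvent[] {
  const events: StreamEvent[] = [];
  let started = false;
  for (const delta of deltas) {
    // `!= null` is true for "" but false for null and undefined,
    // whereas a plain truthiness check would also reject "".
    if (delta.content != null) {
      if (!started) {
        events.push("content_block_start");
        started = true;
      }
      if (delta.content !== "") events.push("content_block_delta");
    }
  }
  return events;
}
```

With an initial empty-string delta followed by "Hi", the block is opened on the first delta and the text arrives in an already started block.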
Relates to #42
Co-Authored-By: Juan Camilo <juancamilo.auriti@gmail.com>
OpenAI and Codex enforce strict JSON Schema validation — every key in
`properties` must also appear in `required`. Anthropic schemas often
mark fields as optional (omitted from `required`), which causes 400
errors on OpenAI/Codex endpoints.
Example: the Agent tool has `subagent_type` in `properties` but not
in `required`, producing:
"Invalid schema for function 'Agent': Missing 'subagent_type'
in required array"
Fix: add `normalizeSchemaForOpenAI()` in `convertTools()` that ensures
`required` is a superset of all `properties` keys before the schema is
sent to the API. Existing `required` entries are preserved; missing
ones are appended. Schemas without `properties` pass through unchanged.
Fixes #46.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Anthropic-to-OpenAI tool_choice mapping handled 'auto', 'any', and
'tool' but not 'none'. When 'none' was passed, the request was sent
without tool_choice, defaulting to 'auto' — the opposite of the
intended behavior (disable tool use).
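The complete mapping, sketched with simplified types. mapToolChoice is an illustrative name; the OpenAI-side values follow the Chat Completions tool_choice options.

```typescript
type AnthropicToolChoice =
  | { type: "auto" }
  | { type: "any" }
  | { type: "none" }
  | { type: "tool"; name: string };

type OpenAIToolChoice =
  | "auto"
  | "required"
  | "none"
  | { type: "function"; function: { name: string } };

function mapToolChoice(choice: AnthropicToolChoice): OpenAIToolChoice {
  switch (choice.type) {
    case "auto":
      return "auto";
    case "any":
      return "required"; // the model must call some tool
    case "none":
      return "none"; // previously unhandled, so the provider defaulted to auto
    case "tool":
      return { type: "function", function: { name: choice.name } };
  }
}
```

Writing the switch over a discriminated union lets the compiler flag any variant that is added later without a mapping.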
Relates to #30
Co-Authored-By: Juan Camilo <juancamilo.auriti@gmail.com>
When certain OpenAI-compatible APIs (LM Studio, some proxies) send
multiple stream chunks with finish_reason set, the finish block ran
multiple times — emitting content_block_stop and message_delta for
each one. Each content_block_stop caused claude.ts to create and yield
a new assistant message, making every response appear twice in the UI.
Fix: add hasProcessedFinishReason flag (same pattern as the existing
hasEmittedFinalUsage flag) so the finish block only executes once per
response regardless of how many chunks contain finish_reason.
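The guard can be illustrated with a reduced converter that turns chunks into event names. Chunk and convertChunks are hypothetical simplifications of the streaming code; only the hasProcessedFinishReason flag mirrors the fix.

```typescript
interface Chunk {
  content?: string;
  finish_reason?: string | null;
}

function convertChunks(chunks: Chunk[]): string[] {
  const events: string[] = [];
  let hasProcessedFinishReason = false;
  for (const chunk of chunks) {
    if (chunk.content) events.push(`text:${chunk.content}`);
    // Run the finish block only for the first chunk carrying finish_reason.
    if (chunk.finish_reason && !hasProcessedFinishReason) {
      hasProcessedFinishReason = true;
      events.push("content_block_stop", "message_delta");
    }
  }
  return events;
}
```

A provider that repeats finish_reason on two chunks now produces a single content_block_stop and message_delta pair instead of two.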
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds Google Gemini as a first-class provider using Gemini's OpenAI-compatible
endpoint, supporting gemini-2.0-flash, gemini-2.5-pro, and gemini-2.0-flash-lite
across all three model tiers (opus/sonnet/haiku).
- Add 'gemini' to APIProvider type with CLAUDE_CODE_USE_GEMINI env detection
- Map all 11 model configs to appropriate Gemini models per tier
- Route Gemini through existing OpenAI shim (generativelanguage.googleapis.com)
- Support GEMINI_API_KEY and GOOGLE_API_KEY for authentication
- Fix model display name to show actual Gemini model instead of Claude fallback
- Add Gemini support to provider-launch, provider-bootstrap, system-check scripts
- Add dev:gemini npm script for local development
Bootstrap: bun run profile:init -- --provider gemini --api-key <key>
Launch: bun run dev:gemini
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a new 'openai' API provider that translates Anthropic SDK calls to
OpenAI chat completions format, enabling Claude Code's full tool system
(bash, file read/write/edit, grep, glob, agents) with any OpenAI-compatible
model: GPT-4o, DeepSeek, Gemini, Llama, Ollama, OpenRouter, and 200+ more.
Set CLAUDE_CODE_USE_OPENAI=1, OPENAI_API_KEY, and OPENAI_MODEL to use.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>