13 Commits

Author SHA1 Message Date
Zartris
f4ac709fa6 fix: report cache reads in streaming and correct cost calculation (#577)
* fix: report cache reads in streaming and correct cost calculation

Fix two bugs in how the OpenAI-to-Anthropic shim handles cached tokens:

1. codexShim: streaming message_delta missing cache_read_input_tokens
   The codexStreamToAnthropic() function builds the final message_delta
   usage object inline (not through makeUsage()), and only included
   input_tokens and output_tokens. cache_read_input_tokens was always 0,
   so /cost never showed cache reads for Responses API models (GPT-5+).

   Also fix makeUsage() to read input_tokens_details.cached_tokens and
   prompt_tokens_details.cached_tokens for the non-streaming path.

2. Both shims: cost double-counting from convention mismatch
   OpenAI includes cached tokens in input_tokens/prompt_tokens (i.e.,
   input_tokens = uncached + cached). Anthropic treats input_tokens as
   uncached only. The cost formula was:
     cost = input_tokens * inputRate + cache_read * cacheRate
   This double-counts cached tokens. Fix by subtracting cached from
   input during the conversion:
     input_tokens = prompt_tokens - cached_tokens

   In practice this was inflating reported costs by ~2x for sessions
   with high cache hit rates (which is most sessions, since Copilot
   auto-caches server-side).
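The convention fix described in item 2 can be sketched as follows. This is a minimal illustration, not the shim's actual code: the interface shapes, `toAnthropicUsage`, and `inputSideCost` are hypothetical names, and only the fields named in the commit message are modeled.

```typescript
// Hypothetical shape: OpenAI-style usage, where prompt_tokens already
// includes the cached tokens.
interface OpenAIUsage {
  prompt_tokens: number;
  completion_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number };
}

// Hypothetical shape: Anthropic-style usage, where input_tokens counts
// uncached tokens only.
interface AnthropicUsage {
  input_tokens: number;
  output_tokens: number;
  cache_read_input_tokens: number;
}

// Convert between the two conventions by subtracting cached tokens.
function toAnthropicUsage(usage: OpenAIUsage): AnthropicUsage {
  const cached = usage.prompt_tokens_details?.cached_tokens ?? 0;
  return {
    // Clamp at zero in case a provider reports inconsistent numbers.
    input_tokens: Math.max(0, usage.prompt_tokens - cached),
    output_tokens: usage.completion_tokens,
    cache_read_input_tokens: cached,
  };
}

// Input-side part of the cost formula quoted above.
function inputSideCost(u: AnthropicUsage, inputRate: number, cacheRate: number): number {
  return u.input_tokens * inputRate + u.cache_read_input_tokens * cacheRate;
}
```

With 1000 prompt tokens of which 800 are cached, the converted usage bills 200 tokens at the input rate and 800 at the cache rate, instead of 1000 plus 800.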

Fixes #515

* fix: omit zero cache read/write fields from /cost output

Only show "cache read" and "cache write" in /cost per-model usage when
the value is > 0. Providers like GitHub Copilot never report
cache_creation_input_tokens (the server manages its own cache), so
showing "0 cache write" on every line is misleading — it implies caching
is not working when it actually is.

Before:
  claude-haiku:  2.6k input, 151 output, 39.8k cache read, 0 cache write ($0.04)

After:
  claude-haiku:  2.6k input, 151 output, 39.8k cache read ($0.04)
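One way to produce the "After" line is to build the list of fields and skip cache entries that are zero. The sketch below is an assumption about the approach, not the project's formatter: `formatUsageLine`, `compact`, and the `ModelUsage` shape are hypothetical, and spacing may differ from the real output.

```typescript
// Hypothetical per-model usage totals.
interface ModelUsage {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

// Abbreviate thousands, e.g. 2600 becomes "2.6k".
function compact(n: number): string {
  return n >= 1000 ? (n / 1000).toFixed(1) + "k" : String(n);
}

function formatUsageLine(model: string, usage: ModelUsage, cost: number): string {
  const parts = [`${compact(usage.input)} input`, `${compact(usage.output)} output`];
  // Only mention cache fields when the provider actually reported them.
  if (usage.cacheRead > 0) parts.push(`${compact(usage.cacheRead)} cache read`);
  if (usage.cacheWrite > 0) parts.push(`${compact(usage.cacheWrite)} cache write`);
  return `${model}: ${parts.join(", ")} ($${cost.toFixed(2)})`;
}
```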

---------

Co-authored-by: Zartris <14197299+Zartris@users.noreply.github.com>
2026-04-10 23:40:42 +08:00
Kevin Codex
42b121bd0d Fix/openclaude diagnostics settings (#483)
* fix: use openclaude paths in diagnostics and settings

* fix: strip leaked reasoning from assistant output

* fix: preserve legacy claude config compatibility

* fix: tighten path and reasoning compatibility

* fix: buffer streamed reasoning leak preambles

* test: cover openclaude migration and reasoning fixes

* test: isolate execFileNoThrow from cross-file mocks
2026-04-09 20:42:51 +08:00
step325
70cfa61582 fix: disable experimental API betas by default, reduce side query token usage, standardize Headers type (#281)
* fix: disable experimental API betas by default to prevent 500 errors

Tool search (defer_loading), global cache scope, and context management
betas require internal Anthropic server-side support. External accounts
receive 500 Internal Server Error when these are sent.

Set CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=true by default in the CLI
entrypoint. Users with internal access can opt back in with =false.
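A default-on flag with an explicit opt-out could look like the sketch below. The helper names are hypothetical, the environment is passed in as a plain record rather than read from `process.env`, and treating only the literal string "false" as the opt-out is an assumption about how the value is parsed.

```typescript
type Env = Record<string, string | undefined>;

const FLAG = "CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS";

// Set the default only when the user has not chosen a value.
function applyBetaDefault(env: Env): void {
  if (env[FLAG] === undefined) {
    env[FLAG] = "true";
  }
}

// Assumption: any value other than "false" keeps the betas disabled.
function experimentalBetasDisabled(env: Env): boolean {
  return env[FLAG] !== "false";
}
```

Calling `applyBetaDefault` early in an entrypoint leaves an explicit `=false` from the user untouched.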

Also includes: cache key stability fixes (Sonnet 1M latch, system-before-
messages key ordering, resume fingerprint isMeta skip), sideQuery default
cleanup, and /dream command.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: standardize API headers to Headers type and enable tengu feature flags by default

* fix: address PR review — dream lock, MCP betas guard, redundant Partial

- Call recordConsolidation() programmatically in /dream instead of
  delegating to model prompt (unreliable)
- Add CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS guard to MCP entrypoint
  (was only in CLI entrypoint, causing 500s in MCP server mode)
- Remove redundant ? markers from SecretValueSource Partial<{}> type

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 01:40:07 +08:00
Kevin Codex
a6ed57d0f4 Merge pull request #161 from auriti/fix/block-update-for-3p-providers
fix: block update command for 3P providers, align thinking block handling
2026-04-03 01:52:54 +08:00
Juan Camilo
1709f5c098 fix: block update command for 3P providers, align thinking block handling
1. cli/update.ts: Block the update command for third-party providers.
   The update mechanism downloads from Anthropic's GCS bucket, which
   would silently replace the OpenClaude build (with the OpenAI shim)
   with the upstream Claude Code binary (without it). Now shows an
   actionable message directing users to rebuild from source.

2. codexShim.ts: Filter thinking blocks from assistant history, matching
   the openaiShim behavior. Without this, thinking blocks were included
   as plain text in assistant messages for the Codex transport but
   excluded for the OpenAI transport — causing inconsistent history
   when switching providers mid-session.
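The filtering in item 2 amounts to keeping only text blocks when flattening assistant history. A minimal sketch, assuming a simplified block union (the real content types carry more variants and fields) and a hypothetical `assistantHistoryText` helper:

```typescript
// Simplified, hypothetical content block types.
interface TextBlock {
  type: "text";
  text: string;
}
interface ThinkingBlock {
  type: "thinking";
  thinking: string;
}
type ContentBlock = TextBlock | ThinkingBlock;

// Drop thinking blocks so they never reach the provider as plain text.
function assistantHistoryText(blocks: ContentBlock[]): string {
  return blocks
    .filter((b): b is TextBlock => b.type === "text")
    .map((b) => b.text)
    .join("");
}
```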
2026-04-02 16:18:10 +02:00
Juan Camilo
5d6443799a fix: crypto.randomUUID for IDs, Azure Foundry detection, safety filter visibility
Three targeted fixes:

1. Replace Math.random() with crypto.randomUUID() for message and tool
   call IDs in both openaiShim.ts and codexShim.ts. Math.random() is
   not cryptographically secure and is predictable in seeded environments.

2. Anchor Azure endpoint detection to parsed hostname instead of raw
   URL regex. Adds support for Azure AI Foundry (services.ai.azure.com)
   alongside existing cognitiveservices and openai Azure endpoints.
   Prevents SSRF-style bypass via path segments.

3. Surface content safety filter blocks to the user. When Gemini or
   Azure returns finish_reason 'content_filter' or 'safety', emit a
   visible text block '[Content blocked by provider safety filter]'
   instead of silently returning empty/truncated content with
   stop_reason 'end_turn'. Applied to both streaming and non-streaming.
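The hostname anchoring in item 2 can be illustrated with the standard `URL` parser. This is a sketch under assumptions: `isAzureEndpoint` is a hypothetical name, and the suffix list is inferred from the endpoint families the message mentions rather than copied from the code.

```typescript
// Assumed suffixes for the three Azure endpoint families named above.
const AZURE_HOST_SUFFIXES = [
  ".openai.azure.com",
  ".cognitiveservices.azure.com",
  ".services.ai.azure.com",
];

function isAzureEndpoint(baseUrl: string): boolean {
  let hostname: string;
  try {
    // Match on the parsed hostname only, never on the raw URL string.
    hostname = new URL(baseUrl).hostname.toLowerCase();
  } catch {
    return false; // not a parseable absolute URL
  }
  return AZURE_HOST_SUFFIXES.some((suffix) => hostname.endsWith(suffix));
}
```

Because the path is never inspected, a URL such as `https://evil.example/x.openai.azure.com` is not treated as Azure, which is the bypass a raw-URL regex can allow.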
2026-04-02 16:14:35 +02:00
sooth
5c4469fe81 fix: trim persisted tool results and sanitize MCP schemas
2026-04-02 09:20:40 -04:00
Kevin Codex
9f48bb4431 Merge pull request #135 from auriti/fix/shim-reliability-and-protocol-compliance
fix: shim reliability and protocol compliance overhaul
2026-04-02 21:15:44 +08:00
Juan Camilo
f4818dc213 fix: shim reliability and protocol compliance overhaul
Addresses the most critical remaining issues in the provider shim layer,
building on top of #124 (recursive schema normalization + try/finally).

openaiShim.ts:
- Throw APIError via SDK factory instead of plain Error — enables retry
  on 429/503 (was completely broken: zero retries for all 3P providers)
- Guard stop_reason !== null before emitting usage-only message_delta
  (Azure/Groq send usage before finish_reason)
- Fix assistant content: join text parts instead of invalid as-string cast
  (Mistral rejects array content on assistant role)
- Expose real HTTP Response in withResponse() for header inspection
- Skip stream_options for local providers (Ollama < 0.5 compatibility)

codexShim.ts:
- Throw APIError at all 4 throw sites (HTTP + 3 streaming errors)
- Add tool_choice 'none' mapping (was silently ignored)
- Forward is_error flag with Error: prefix (matching openaiShim)
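Why a plain Error defeats retries can be shown with a stand-in error type. The sketch below does not use the SDK's APIError or its factory; `StatusError` and `isRetryable` are hypothetical, and the retryable set is limited to the two status codes the message names.

```typescript
// Hypothetical stand-in for an SDK error that carries the HTTP status.
class StatusError extends Error {
  readonly status: number;
  constructor(status: number, message: string) {
    super(message);
    this.status = status;
  }
}

// Retry logic can only act on errors that expose a numeric status.
function isRetryable(err: unknown): boolean {
  if (typeof err !== "object" || err === null || !("status" in err)) {
    return false; // a plain Error has no status, so it is never retried
  }
  const status = (err as { status: unknown }).status;
  return status === 429 || status === 503;
}
```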
2026-04-02 14:41:40 +02:00
erdemozyol
cec3629017 fix: support codex web tools and non-git agents
2026-04-02 14:08:22 +03:00
step325
66f5981c45 fix(codex): Support Multi-Agent framework schemas for OpenAI/Codex endpoints
This commit addresses strict schema validation limitations when running subagents under OpenAI backend shims.

- Drops empty properties from payloads (like Record<string, string>) that break OpenAI's Structured Outputs validation.

- Handles edge cases for automated initial teams when subagents bypass standard creation routines.

- Stops sending unsupported experimental backend parameters like temperature and top_p for GPT-5 derivatives.
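The last bullet can be sketched as a request filter. Everything here is an assumption for illustration: `stripSamplingParams` is a hypothetical name, and detecting "GPT-5 derivatives" by a `gpt-5` model-name prefix is a guess, not the check the commit uses.

```typescript
// Hypothetical request shape; only the fields relevant here are typed.
interface RequestParams {
  model: string;
  temperature?: number;
  top_p?: number;
  [key: string]: unknown;
}

function stripSamplingParams(params: RequestParams): RequestParams {
  // Assumption: GPT-5 derivatives are identified by their model-name prefix.
  if (!params.model.toLowerCase().startsWith("gpt-5")) {
    return params;
  }
  // Copy first so the caller's object is left unchanged.
  const copy: RequestParams = { ...params };
  delete copy.temperature;
  delete copy.top_p;
  return copy;
}
```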
2026-04-01 19:47:26 +02:00
Daniel
372ba31c17 feat: enhance tool conversion to support strict mode based on schema validation
2026-04-01 22:55:56 +08:00
vp
cbeed0f76f Add Codex plan/spark provider support
2026-04-01 10:44:35 +03:00