Compare commits


2 Commits

Juan Camilo
a06ea87545 fix(security): remove unauthenticated file-based permission polling
Remove the legacy file-based permission polling from useSwarmPermissionPoller
that read from ~/.claude/teams/{name}/permissions/resolved/ — an unauthenticated
directory where any local process could forge approval files to auto-approve
tool uses for swarm teammates.

The file polling was dead code:
- The useSwarmPermissionPoller() hook was never mounted by any component
- resolvePermission() (the file writer) was never imported outside its module
- Permission responses are delivered exclusively via the mailbox system:
  Leader: sendPermissionResponseViaMailbox() → writeToMailbox()
  Worker: useInboxPoller → processMailboxPermissionResponse()

Changes:
- Remove file polling loop, processResponse(), and React hook imports from
  useSwarmPermissionPoller.ts (now a pure callback registry module)
- Mark 7 file-based functions as @deprecated in permissionSync.ts
- Add 4 regression tests verifying the removal

No exported functions removed — only deprecated. All 5 consumer modules
verified: they import only mailbox-based functions that remain unchanged.
2026-04-20 14:38:57 +02:00
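The "pure callback registry" shape this commit leaves behind can be sketched as follows. This is an illustrative sketch only; the type and function names are assumptions, not the actual exports of useSwarmPermissionPoller.ts.

```typescript
// Hypothetical sketch of a pure callback-registry module, as described in
// the commit message above. Names are illustrative, not the real API.
type PermissionDecision = { requestId: string; behavior: 'allow' | 'deny' }

const callbacks = new Map<string, (d: PermissionDecision) => void>()

// A worker registers interest in a pending permission request and gets
// back an unregister handle.
export function registerPermissionCallback(
  requestId: string,
  cb: (d: PermissionDecision) => void,
): () => void {
  callbacks.set(requestId, cb)
  return () => callbacks.delete(requestId)
}

// Only the mailbox poller (an in-process caller) delivers responses.
// Nothing is read from disk, so a local process cannot forge approvals.
export function deliverPermissionResponse(d: PermissionDecision): boolean {
  const cb = callbacks.get(d.requestId)
  if (!cb) return false // no listener registered; response is dropped
  callbacks.delete(d.requestId) // one-shot: a request resolves at most once
  cb(d)
  return true
}
```

The point of the shape: a response only reaches a callback through an explicit in-process delivery call, never through files that another local process could write.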
Juan Camilo
c0354e8699 fix(security): harden project settings trust boundary + MCP sanitization
- Sanitize MCP tool result text with recursivelySanitizeUnicode() to prevent
  Unicode injection via malicious MCP servers (tool definitions and prompts
  were already sanitized, but tool call results were not)
- Read sandbox.enabled only from trusted settings sources (user, local, flag,
  policy) — exclude projectSettings to prevent malicious repos from silently
  disabling the sandbox via .claude/settings.json
- Disable git hooks in plugin marketplace clone/pull/submodule operations
  with core.hooksPath=/dev/null to prevent code execution from cloned repos
- Remove ANTHROPIC_FOUNDRY_API_KEY from SAFE_ENV_VARS to prevent credential
  injection from project-scoped settings without trust verification
- Add ssrfGuardedLookup to WebFetch HTTP requests to block DNS rebinding
  attacks that could reach cloud metadata or internal services

Security: closes trust boundary gap where project settings could override
security-critical configuration. Follows the existing pattern established
by hasAllowBypassPermissionsMode() which already excludes projectSettings.

Co-authored-by: auriti <auriti@users.noreply.github.com>
2026-04-20 14:11:46 +02:00
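The DNS-rebinding item above follows a standard pattern: resolve the hostname once, reject private and link-local answers, and connect to the vetted address. Only the address check is sketched here; `ssrfGuardedLookup` is named in the commit, but everything below is an assumed illustration, not the project's code.

```typescript
// Assumed illustration of the range check behind an SSRF lookup guard.
// Rejecting these ranges before the HTTP client connects means a
// rebinding DNS answer cannot redirect a fetch to cloud metadata
// endpoints or internal services.
function isBlockedIPv4(ip: string): boolean {
  const [a, b] = ip.split('.').map(Number)
  return (
    a === 0 ||                          // "this network"
    a === 10 ||                         // 10.0.0.0/8 private
    a === 127 ||                        // loopback
    (a === 172 && b >= 16 && b <= 31) ||// 172.16.0.0/12 private
    (a === 192 && b === 168) ||         // 192.168.0.0/16 private
    (a === 169 && b === 254)            // link-local, incl. 169.254.169.254
  )
}
```

A real guard must also pin the connection to the checked address (resolve once, connect by IP), otherwise a second lookup re-opens the rebinding window.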
78 changed files with 390 additions and 9914 deletions


@@ -149,23 +149,6 @@ ANTHROPIC_API_KEY=sk-ant-your-key-here
 # Use a custom OpenAI-compatible endpoint (optional — defaults to api.openai.com)
 # OPENAI_BASE_URL=https://api.openai.com/v1
-# Fallback context window size (tokens) when the model is not found in the
-# built-in table (default: 128000). Increase this for models with larger
-# context windows (e.g. 200000 for Claude-sized contexts).
-# CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW=128000
-# Per-model context window overrides as a JSON object.
-# Takes precedence over the built-in table, so you can register new or
-# custom models without patching source.
-# Example: CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS={"my-corp/llm-v3":262144,"gpt-4o-mini":128000}
-# CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS=
-# Per-model maximum output token overrides as a JSON object.
-# Use this alongside CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS when your model
-# supports a different output limit than what the built-in table specifies.
-# Example: CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS={"my-corp/llm-v3":8192}
-# CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS=
 # -----------------------------------------------------------------------------
 # Option 3: Google Gemini

@@ -284,16 +267,6 @@ ANTHROPIC_API_KEY=sk-ant-your-key-here
 # Disable "Co-authored-by" line in git commits made by OpenClaude
 # OPENCLAUDE_DISABLE_CO_AUTHORED_BY=1
-# Disable strict tool schema normalization for non-Gemini providers
-# Useful when MCP tools with complex optional params (e.g. list[dict])
-# trigger "Extra required key ... supplied" errors from OpenAI-compatible endpoints
-# OPENCLAUDE_DISABLE_STRICT_TOOLS=1
-# Disable hidden <system-reminder> messages injected into tool output
-# Suppresses the file-read cyber-risk reminder and the todo/task tool nudges
-# Useful for users who want full transparency over what the model sees
-# OPENCLAUDE_DISABLE_TOOL_REMINDERS=1
 # Custom timeout for API requests in milliseconds (default: varies)
 # API_TIMEOUT_MS=60000

.gitignore

@@ -7,8 +7,6 @@ dist/
 .openclaude-profile.json
 reports/
 GEMINI.md
-CLAUDE.md
 package-lock.json
 /.claude
 coverage/
-agent.log


@@ -1,3 +1,3 @@
 {
-  ".": "0.6.0"
+  ".": "0.5.2"
 }


@@ -1,33 +1,5 @@
 # Changelog
-## [0.6.0](https://github.com/Gitlawb/openclaude/compare/v0.5.2...v0.6.0) (2026-04-22)
-### Features
-* add model caching and benchmarking utilities ([#671](https://github.com/Gitlawb/openclaude/issues/671)) ([2b15e16](https://github.com/Gitlawb/openclaude/commit/2b15e16421f793f954a92c53933a07094544b29d))
-* add thinking token extraction ([#798](https://github.com/Gitlawb/openclaude/issues/798)) ([268c039](https://github.com/Gitlawb/openclaude/commit/268c0398e4bf1ab898069c61500a2b3c226a0322))
-* **api:** compress old tool_result content for small-context providers ([#801](https://github.com/Gitlawb/openclaude/issues/801)) ([a6a3de5](https://github.com/Gitlawb/openclaude/commit/a6a3de5ac155fe9d00befbfcab98d439314effd8))
-* **api:** improve local provider reliability with readiness and self-healing ([#738](https://github.com/Gitlawb/openclaude/issues/738)) ([4cb963e](https://github.com/Gitlawb/openclaude/commit/4cb963e660dbd6ee438c04042700db05a9d32c59))
-* **api:** smart model routing primitive (cheap-for-simple, strong-for-hard) ([#785](https://github.com/Gitlawb/openclaude/issues/785)) ([e908864](https://github.com/Gitlawb/openclaude/commit/e908864da7e7c987a98053ac5d18d702e192db2b))
-* enable 15 additional feature flags in open build ([#667](https://github.com/Gitlawb/openclaude/issues/667)) ([6a62e3f](https://github.com/Gitlawb/openclaude/commit/6a62e3ff76ba9ba446b8e20cf2bb139ee76a9387))
-* native Anthropic API mode for Claude models on GitHub Copilot ([#579](https://github.com/Gitlawb/openclaude/issues/579)) ([fdef4a1](https://github.com/Gitlawb/openclaude/commit/fdef4a1b4ce218ded4937ca83b30acce7c726472))
-* **provider:** expose Atomic Chat in /provider picker with autodetect ([#810](https://github.com/Gitlawb/openclaude/issues/810)) ([ee19159](https://github.com/Gitlawb/openclaude/commit/ee19159c17b3de3b4a8b4a4541a6569f4261d54e))
-* **provider:** zero-config autodetection primitive ([#784](https://github.com/Gitlawb/openclaude/issues/784)) ([a5bfcbb](https://github.com/Gitlawb/openclaude/commit/a5bfcbbadf8e9a1fd42f3e103d295524b8da64b0))
-### Bug Fixes
-* **api:** ensure strict role sequence and filter empty assistant messages after interruption ([#745](https://github.com/Gitlawb/openclaude/issues/745) regression) ([#794](https://github.com/Gitlawb/openclaude/issues/794)) ([06e7684](https://github.com/Gitlawb/openclaude/commit/06e7684eb56df8e694ac784575e163641931c44c))
-* Collapse all-text arrays to string for DeepSeek compatibility ([#806](https://github.com/Gitlawb/openclaude/issues/806)) ([761924d](https://github.com/Gitlawb/openclaude/commit/761924daa7e225fe8acf41651408c7cae639a511))
-* **model:** codex/nvidia-nim/minimax now read OPENAI_MODEL env ([#815](https://github.com/Gitlawb/openclaude/issues/815)) ([4581208](https://github.com/Gitlawb/openclaude/commit/458120889f6ce54cc9f0b287461d5e38eae48a20))
-* **provider:** saved profile ignored when stale CLAUDE_CODE_USE_* in shell ([#807](https://github.com/Gitlawb/openclaude/issues/807)) ([13de4e8](https://github.com/Gitlawb/openclaude/commit/13de4e85df7f5fadc8cd15a76076374dc112360b))
-* rename .claude.json to .openclaude.json with legacy fallback ([#582](https://github.com/Gitlawb/openclaude/issues/582)) ([4d4fb28](https://github.com/Gitlawb/openclaude/commit/4d4fb2880e4d0e3a62d8715e1ec13d932e736279))
-* replace discontinued gemini-2.5-pro-preview-03-25 with stable gemini-2.5-pro ([#802](https://github.com/Gitlawb/openclaude/issues/802)) ([64582c1](https://github.com/Gitlawb/openclaude/commit/64582c119d5d0278195271379da4a68d59a89c1f)), closes [#398](https://github.com/Gitlawb/openclaude/issues/398)
-* **security:** harden project settings trust boundary + MCP sanitization ([#789](https://github.com/Gitlawb/openclaude/issues/789)) ([ae3b723](https://github.com/Gitlawb/openclaude/commit/ae3b723f3b297b49925cada4728f3174aee8bf12))
-* **test:** autoCompact floor assertion is flag-sensitive ([#816](https://github.com/Gitlawb/openclaude/issues/816)) ([c13842e](https://github.com/Gitlawb/openclaude/commit/c13842e91c7227246520955de6ae0636b30def9a))
-* **ui:** prevent provider manager lag by deferring sync I/O ([#803](https://github.com/Gitlawb/openclaude/issues/803)) ([85eab27](https://github.com/Gitlawb/openclaude/commit/85eab2751e7d351bb0ed6a3fe0e15461d241c9cb))
 ## [0.5.2](https://github.com/Gitlawb/openclaude/compare/v0.5.1...v0.5.2) (2026-04-20)


@@ -125,7 +125,7 @@ Advanced and source-build guides:
 | Codex OAuth | `/provider` | Opens ChatGPT sign-in in your browser and stores Codex credentials securely |
 | Codex | `/provider` | Uses existing Codex CLI auth, OpenClaude secure storage, or env credentials |
 | Ollama | `/provider`, env vars, or `ollama launch` | Local inference with no API key |
-| Atomic Chat | `/provider`, env vars, or `bun run dev:atomic-chat` | Local Model Provider; auto-detects loaded models |
+| Atomic Chat | advanced setup | Local Apple Silicon backend |
 | Bedrock / Vertex / Foundry | env vars | Additional provider integrations for supported environments |
 ## What Works


@@ -1,333 +0,0 @@
# Hook Chains (Self-Healing Agent Mesh MVP)
Hook Chains provide an event-driven recovery layer for important workflow failures.
When a matching hook event occurs, OpenClaude evaluates declarative rules and can dispatch remediation actions such as:
- `spawn_fallback_agent`
- `notify_team`
- `warm_remote_capacity`
## Disabled-By-Default Rollout
> **Rollout recommendation:** keep Hook Chains disabled until you validate rules in your environment.
>
> - Set top-level config to `"enabled": false` initially.
> - Enable per environment when ready.
> - Dispatch is gated by `feature('HOOK_CHAINS')`.
> - Env gate defaults to off unless `CLAUDE_CODE_ENABLE_HOOK_CHAINS=1` is set.
This keeps existing workflows unchanged while you tune guard windows and action behavior.
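Following that recommendation, a minimal starting config (valid per the schema documented later on this page) keeps the feature off and the ruleset empty:

```json
{
  "version": 1,
  "enabled": false,
  "rules": []
}
```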
## Feature Overview
Hook Chains are loaded from a deterministic config file and evaluated on dispatched hook events.
MVP runtime trigger wiring:
- `PostToolUseFailure` hooks dispatch Hook Chains with outcome `failed`.
- `TaskCompleted` hooks dispatch Hook Chains with outcome:
- `success` when completion hooks did not block.
- `failed` when completion hooks returned blocking errors or prevented continuation.
Default config path:
- `.openclaude/hook-chains.json`
Override path:
- `CLAUDE_CODE_HOOK_CHAINS_CONFIG_PATH=/abs/or/relative/path/to/hook-chains.json`
Global gate:
- `feature('HOOK_CHAINS')` must be enabled in the build
- `CLAUDE_CODE_ENABLE_HOOK_CHAINS=0|1` (defaults to disabled when unset)
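Taken together, the gates resolve roughly as in this sketch. The combination logic is an assumption from the bullets above; `feature()` is resolved at build time, so it appears here as a plain boolean input.

```typescript
// Assumed sketch of the two-level gate: both the build-time feature and
// the env var must be on; the env gate defaults to OFF when unset.
function hookChainsEnabled(
  buildFeatureOn: boolean, // result of feature('HOOK_CHAINS') at build time
  env: Record<string, string | undefined>,
): boolean {
  return buildFeatureOn && env['CLAUDE_CODE_ENABLE_HOOK_CHAINS'] === '1'
}
```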
## Safety Guarantees
The runtime is intentionally conservative:
- **Depth guard:** chain dispatch is blocked when `chainDepth >= maxChainDepth`.
- **Rule cooldown:** each rule can only re-fire after cooldown expires.
- **Dedup window:** identical event/action combinations are suppressed for a window.
- **Abort-safe behavior:** if the current signal is aborted, actions skip safely.
- **Policy-aware remote warm:** `warm_remote_capacity` skips when remote sessions are policy denied.
- **Bridge inactive no-op:** `warm_remote_capacity` safely skips when no active bridge handle exists.
- **Missing team context safety:** `notify_team` skips with structured reason if no team context/team file is available.
- **Fallback launcher safety:** `spawn_fallback_agent` fails with a structured reason when launch permissions/context are unavailable.
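The first three guards (depth, cooldown, dedup) can be sketched as a single predicate over per-rule bookkeeping. Names and state shape are assumptions for illustration; the real runtime's internals are not shown here.

```typescript
// Assumed sketch of the depth/cooldown/dedup guard checks listed above.
type GuardState = {
  chainDepth: number
  lastFiredAt: Map<string, number>   // ruleId -> last fire timestamp (ms)
  recentActions: Map<string, number> // event+action key -> timestamp (ms)
}

function canFire(
  state: GuardState,
  ruleId: string,
  actionKey: string,
  opts: { maxChainDepth: number; cooldownMs: number; dedupWindowMs: number },
  now: number,
): boolean {
  // Depth guard: block when the chain is already at its depth cap.
  if (state.chainDepth >= opts.maxChainDepth) return false
  // Rule cooldown: a rule can only re-fire after its cooldown expires.
  const last = state.lastFiredAt.get(ruleId)
  if (last !== undefined && now - last < opts.cooldownMs) return false
  // Dedup window: identical event/action combos are suppressed for a window.
  const dup = state.recentActions.get(actionKey)
  if (dup !== undefined && now - dup < opts.dedupWindowMs) return false
  return true
}
```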
## Configuration Schema Reference
Top-level object:
```json
{
"version": 1,
"enabled": true,
"maxChainDepth": 2,
"defaultCooldownMs": 30000,
"defaultDedupWindowMs": 30000,
"rules": []
}
```
### Top-Level Fields
| Field | Type | Required | Notes |
|---|---|---:|---|
| `version` | `1` | No | Defaults to `1`. |
| `enabled` | `boolean` | No | Global feature switch for this config file. |
| `maxChainDepth` | `integer` | No | Global depth guard (default `2`, max `10`). |
| `defaultCooldownMs` | `integer` | No | Default rule cooldown in ms (default `30000`). |
| `defaultDedupWindowMs` | `integer` | No | Default action dedup window in ms (default `30000`). |
| `rules` | `HookChainRule[]` | No | Defaults to `[]`. May be omitted or empty; when no rules are present, dispatch is a no-op and returns `enabled: false`. |
> **Note:** An empty ruleset is valid and can be used to keep Hook Chains configured but effectively disabled until rules are added.
### Rule Object (`HookChainRule`)
```json
{
"id": "task-failure-recovery",
"enabled": true,
"trigger": {
"event": "TaskCompleted",
"outcome": "failed"
},
"condition": {
"toolNames": ["Edit"],
"taskStatuses": ["failed"],
"errorIncludes": ["timeout", "permission denied"],
"eventFieldEquals": {
"meta.source": "scheduler"
}
},
"cooldownMs": 60000,
"dedupWindowMs": 30000,
"maxDepth": 2,
"actions": []
}
```
| Field | Type | Required | Notes |
|---|---|---:|---|
| `id` | `string` | Yes | Stable identifier used in telemetry/guards. |
| `enabled` | `boolean` | No | Per-rule switch. |
| `trigger.event` | `HookEvent` | Yes | Event name to match. |
| `trigger.outcome` | `"success"\|"failed"\|"timeout"\|"unknown"` | No | Single outcome matcher. |
| `trigger.outcomes` | `Outcome[]` | No | Multi-outcome matcher. Use either `outcome` or `outcomes`. |
| `condition` | `object` | No | Optional extra matching constraints. |
| `cooldownMs` | `integer` | No | Overrides global cooldown for this rule. |
| `dedupWindowMs` | `integer` | No | Overrides global dedup for this rule. |
| `maxDepth` | `integer` | No | Per-rule depth cap. |
| `actions` | `HookChainAction[]` | Yes | One or more actions to execute in order. |
### Condition Fields
| Field | Type | Notes |
|---|---|---|
| `toolNames` | `string[]` | Matches `tool_name` / `toolName` in event payload. |
| `taskStatuses` | `string[]` | Matches `task_status` / `taskStatus` / `status`. |
| `errorIncludes` | `string[]` | Case-insensitive substring match against `error` / `reason` / `message`. |
| `eventFieldEquals` | `Record<string, string\|number\|boolean>` | Dot-path equality against payload (example: `"meta.source": "scheduler"`). |
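A sketch of how `errorIncludes` and `eventFieldEquals` plausibly combine, per the table above. Helper names are assumptions, not the runtime's actual code.

```typescript
// Dot-path lookup, e.g. getByDotPath(payload, 'meta.source').
function getByDotPath(obj: unknown, path: string): unknown {
  return path.split('.').reduce<any>((o, k) => (o == null ? undefined : o[k]), obj)
}

// Assumed combination semantics: every present condition must match.
function matchesCondition(
  payload: Record<string, unknown>,
  cond: {
    errorIncludes?: string[]
    eventFieldEquals?: Record<string, string | number | boolean>
  },
): boolean {
  if (cond.errorIncludes) {
    // Case-insensitive substring match against error / reason / message.
    const err = String(
      payload['error'] ?? payload['reason'] ?? payload['message'] ?? '',
    ).toLowerCase()
    if (!cond.errorIncludes.some((s) => err.includes(s.toLowerCase()))) return false
  }
  for (const [path, want] of Object.entries(cond.eventFieldEquals ?? {})) {
    if (getByDotPath(payload, path) !== want) return false
  }
  return true
}
```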
### Actions
#### `spawn_fallback_agent`
```json
{
"type": "spawn_fallback_agent",
"id": "fallback-1",
"enabled": true,
"dedupWindowMs": 30000,
"description": "Fallback recovery for failed task",
"promptTemplate": "Recover task ${TASK_SUBJECT}. Event=${EVENT_NAME}, outcome=${OUTCOME}, error=${ERROR}. Payload=${PAYLOAD_JSON}",
"agentType": "general-purpose",
"model": "sonnet"
}
```
#### `notify_team`
```json
{
"type": "notify_team",
"id": "notify-ops",
"enabled": true,
"dedupWindowMs": 30000,
"teamName": "mesh-team",
"recipients": ["*"],
"summary": "Hook chain ${RULE_ID} fired",
"messageTemplate": "Event=${EVENT_NAME} outcome=${OUTCOME}\nTask=${TASK_ID}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}"
}
```
#### `warm_remote_capacity`
```json
{
"type": "warm_remote_capacity",
"id": "warm-bridge",
"enabled": true,
"dedupWindowMs": 60000,
"createDefaultEnvironmentIfMissing": false
}
```
## Complete Example Configs
### 1) Retry via Fallback Agent
```json
{
"version": 1,
"enabled": true,
"maxChainDepth": 2,
"defaultCooldownMs": 30000,
"defaultDedupWindowMs": 30000,
"rules": [
{
"id": "retry-task-via-fallback",
"trigger": {
"event": "TaskCompleted",
"outcome": "failed"
},
"cooldownMs": 60000,
"actions": [
{
"type": "spawn_fallback_agent",
"id": "spawn-retry-agent",
"description": "Retry failed task with fallback agent",
"promptTemplate": "A task failed. Recover it safely.\nTask=${TASK_SUBJECT}\nDescription=${TASK_DESCRIPTION}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}",
"agentType": "general-purpose",
"model": "sonnet"
}
]
}
]
}
```
### 2) Notify Only
```json
{
"version": 1,
"enabled": true,
"maxChainDepth": 2,
"defaultCooldownMs": 30000,
"defaultDedupWindowMs": 30000,
"rules": [
{
"id": "notify-on-tool-failure",
"trigger": {
"event": "PostToolUseFailure",
"outcome": "failed"
},
"condition": {
"toolNames": ["Edit", "Write", "Bash"]
},
"actions": [
{
"type": "notify_team",
"id": "notify-team-failure",
"recipients": ["*"],
"summary": "Tool failure detected",
"messageTemplate": "Tool failure detected.\nEvent=${EVENT_NAME} outcome=${OUTCOME}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}"
}
]
}
]
}
```
### 3) Combined Fallback + Notify + Bridge Warm
```json
{
"version": 1,
"enabled": true,
"maxChainDepth": 2,
"defaultCooldownMs": 45000,
"defaultDedupWindowMs": 30000,
"rules": [
{
"id": "full-recovery-chain",
"trigger": {
"event": "TaskCompleted",
"outcomes": ["failed", "timeout"]
},
"condition": {
"errorIncludes": ["timeout", "capacity", "connection"]
},
"cooldownMs": 90000,
"actions": [
{
"type": "spawn_fallback_agent",
"id": "fallback-agent",
"description": "Recover failed task execution",
"promptTemplate": "Recover failed task and produce a concise fix summary.\nTask=${TASK_SUBJECT}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}"
},
{
"type": "notify_team",
"id": "notify-team",
"recipients": ["*"],
"summary": "Recovery chain triggered",
"messageTemplate": "Recovery chain ${RULE_ID} fired.\nOutcome=${OUTCOME}\nTask=${TASK_SUBJECT}\nError=${ERROR}"
},
{
"type": "warm_remote_capacity",
"id": "warm-capacity",
"createDefaultEnvironmentIfMissing": false
}
]
}
]
}
```
## Template Variables
The following placeholders are supported by `promptTemplate`, `summary`, and `messageTemplate`:
- `${EVENT_NAME}`
- `${OUTCOME}`
- `${RULE_ID}`
- `${TASK_SUBJECT}`
- `${TASK_DESCRIPTION}`
- `${TASK_ID}`
- `${ERROR}`
- `${PAYLOAD_JSON}`
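Substitution of these placeholders can be sketched as a simple `${NAME}` replace. This is an assumed implementation, shown only to make the semantics concrete; here, unknown placeholders are left intact.

```typescript
// Assumed sketch: replace ${UPPER_SNAKE} placeholders from a variable map,
// leaving unrecognized placeholders untouched.
function renderTemplate(tpl: string, vars: Record<string, string>): string {
  return tpl.replace(/\$\{([A-Z_]+)\}/g, (whole, name: string) =>
    name in vars ? vars[name] : whole,
  )
}
```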
## Troubleshooting
### Rule never triggers
- Verify `trigger.event` and `trigger.outcome`/`trigger.outcomes` exactly match dispatched event data.
- Check `condition` filters (especially `toolNames` and `eventFieldEquals` dot-path keys).
- Confirm the config file is valid JSON and schema-valid.
### Actions show as skipped
Common skip reasons:
- `action disabled`
- `rule cooldown active ...`
- `dedup window active ...`
- `max chain depth reached ...`
- `No team context is available ...`
- `Team file not found ...`
- `Remote sessions are blocked by policy`
- `Bridge is not active; warm_remote_capacity is a safe no-op`
- `No fallback agent launcher is registered in runtime context`
### Config changes not reflected
- Loader uses memoization by file mtime/size.
- Ensure your editor writes the file fully and updates mtime.
- If needed, force reload from the caller side with `forceReloadConfig: true`.
### Existing workflows changed unexpectedly
- Set `"enabled": false` at top-level.
- Or globally disable with `CLAUDE_CODE_ENABLE_HOOK_CHAINS=0`.
- Re-enable gradually after validating one rule at a time.


@@ -1,6 +1,6 @@
 {
   "name": "@gitlawb/openclaude",
-  "version": "0.6.0",
+  "version": "0.5.2",
   "description": "Claude Code opened to any LLM — OpenAI, Gemini, DeepSeek, Ollama, and 200+ models",
   "type": "module",
   "bin": {


@@ -19,46 +19,30 @@ const version = pkg.version
 // Most Anthropic-internal features stay off; open-build features can be
 // selectively enabled here when their full source exists in the mirror.
 const featureFlags: Record<string, boolean> = {
-  // ── Disabled: require Anthropic infrastructure or missing source ─────
-  VOICE_MODE: false, // Push-to-talk STT via claude.ai OAuth endpoint
-  PROACTIVE: false, // Autonomous agent mode (missing proactive/ module)
-  KAIROS: false, // Persistent assistant/session mode (cloud backend)
-  BRIDGE_MODE: false, // Remote desktop bridge via CCR infrastructure
-  DAEMON: false, // Background daemon process (stubbed in open build)
-  AGENT_TRIGGERS: false, // Scheduled remote agent triggers
-  ABLATION_BASELINE: false, // A/B testing harness for eval experiments
-  CONTEXT_COLLAPSE: false, // Context collapsing optimization (stubbed)
-  COMMIT_ATTRIBUTION: false, // Co-Authored-By metadata in git commits
-  UDS_INBOX: false, // Unix Domain Socket inter-session messaging
-  BG_SESSIONS: false, // Background sessions via tmux (stubbed)
-  WEB_BROWSER_TOOL: false, // Built-in browser automation (source not mirrored)
-  CHICAGO_MCP: false, // Computer-use MCP (native Swift modules stubbed)
-  COWORKER_TYPE_TELEMETRY: false, // Telemetry for agent/coworker type classification
-  MCP_SKILLS: false, // Dynamic MCP skill discovery (src/skills/mcpSkills.ts not mirrored; enabling this causes "fetchMcpSkillsForClient is not a function" when MCP servers with resources connect — see #856)
-
-  // ── Enabled: upstream defaults ──────────────────────────────────────
-  COORDINATOR_MODE: true, // Multi-agent coordinator with worker delegation
-  BUILTIN_EXPLORE_PLAN_AGENTS: true, // Built-in Explore/Plan specialized subagents
-  BUDDY: true, // Buddy mode for paired programming
-  MONITOR_TOOL: true, // MCP server monitoring/streaming tool
-  TEAMMEM: true, // Team memory management
-  MESSAGE_ACTIONS: true, // Message action buttons in the UI
-
-  // ── Enabled: new activations ────────────────────────────────────────
-  DUMP_SYSTEM_PROMPT: true, // --dump-system-prompt CLI flag for debugging
-  CACHED_MICROCOMPACT: true, // Cache-aware tool result truncation optimization
-  AWAY_SUMMARY: true, // "While you were away" recap after 5min blur
-  TRANSCRIPT_CLASSIFIER: true, // Auto-approval classifier for safe tool uses
-  ULTRATHINK: true, // Deep thinking mode — type "ultrathink" to boost reasoning
-  TOKEN_BUDGET: true, // Token budget tracking with usage warnings
-  HISTORY_PICKER: true, // Enhanced interactive prompt history picker
-  QUICK_SEARCH: true, // Ctrl+G quick search across prompts
-  SHOT_STATS: true, // Shot distribution stats in session summary
-  EXTRACT_MEMORIES: true, // Auto-extract durable memories from conversations
-  FORK_SUBAGENT: true, // Implicit context-forking when omitting subagent_type
-  VERIFICATION_AGENT: true, // Built-in read-only agent for test/verification
-  PROMPT_CACHE_BREAK_DETECTION: true, // Detect & log unexpected prompt cache invalidations
-  HOOK_PROMPTS: true, // Allow tools to request interactive user prompts
+  VOICE_MODE: false,
+  PROACTIVE: false,
+  KAIROS: false,
+  BRIDGE_MODE: false,
+  DAEMON: false,
+  AGENT_TRIGGERS: false,
+  MONITOR_TOOL: true,
+  ABLATION_BASELINE: false,
+  DUMP_SYSTEM_PROMPT: false,
+  CACHED_MICROCOMPACT: false,
+  COORDINATOR_MODE: true,
+  BUILTIN_EXPLORE_PLAN_AGENTS: true,
+  CONTEXT_COLLAPSE: false,
+  COMMIT_ATTRIBUTION: false,
+  TEAMMEM: true,
+  UDS_INBOX: false,
+  BG_SESSIONS: false,
+  AWAY_SUMMARY: false,
+  TRANSCRIPT_CLASSIFIER: false,
+  WEB_BROWSER_TOOL: false,
+  MESSAGE_ACTIONS: true,
+  BUDDY: true,
+  CHICAGO_MCP: false,
+  COWORKER_TYPE_TELEMETRY: false,
 }
 // ── Pre-process: replace feature() calls with boolean literals ──────


@@ -1,47 +0,0 @@
import { existsSync, readFileSync } from 'fs'
import { join } from 'path'
import { expect, test } from 'bun:test'
// Regression guard for #856. Several build feature flags require source files
// that are not mirrored into the open build. When such a flag is set to `true`
// without the source present, the bundler falls back to a missing-module stub
// that only exports `default`, which causes runtime errors like
// `fetchMcpSkillsForClient is not a function` when downstream code reaches
// through the `require()` to a named export.
//
// This test fails fast at test-time if someone re-enables one of these flags
// without first mirroring the corresponding source file.
const BUILD_SCRIPT = join(import.meta.dir, 'build.ts')
const REPO_ROOT = join(import.meta.dir, '..')
type FlagGuard = {
flag: string
source: string // path relative to repo root
}
const FLAG_REQUIRES_SOURCE: FlagGuard[] = [
{ flag: 'MCP_SKILLS', source: 'src/skills/mcpSkills.ts' },
]
test('build feature flags are not enabled without their source files', () => {
const buildScript = readFileSync(BUILD_SCRIPT, 'utf-8')
for (const { flag, source } of FLAG_REQUIRES_SOURCE) {
const enabledRe = new RegExp(`^\\s*${flag}\\s*:\\s*true\\b`, 'm')
const isEnabled = enabledRe.test(buildScript)
const sourceExists = existsSync(join(REPO_ROOT, source))
if (isEnabled && !sourceExists) {
throw new Error(
`Feature flag ${flag} is enabled in scripts/build.ts, but its required source file "${source}" does not exist. ` +
`Enabling this flag without the source will cause runtime errors (missing named exports from the missing-module stub). ` +
`Either mirror the source file or set ${flag}: false.`,
)
}
// When the source IS present, the flag can be either true or false; either
// is fine. We only care about the "enabled but missing" combination.
expect(true).toBe(true)
}
})


@@ -50,23 +50,6 @@ describe('growthbook stub — local feature flag overrides', () => {
     expect(stub.getAllGrowthBookFeatures()).toEqual({})
   })
-  // ── Open-build defaults (_openBuildDefaults) ────────────────────
-  test('returns open-build default when flags file is absent', () => {
-    // tengu_passport_quail is in _openBuildDefaults as true; without a
-    // flags file the stub should return the open-build override, not
-    // the call-site defaultValue.
-    expect(stub.getFeatureValue_CACHED_MAY_BE_STALE('tengu_passport_quail', false)).toBe(true)
-    expect(stub.getFeatureValue_CACHED_MAY_BE_STALE('tengu_coral_fern', false)).toBe(true)
-  })
-  test('flags file overrides open-build defaults', () => {
-    // User-provided feature-flags.json takes priority over _openBuildDefaults.
-    writeFileSync(flagsFile, JSON.stringify({ tengu_passport_quail: false }))
-    expect(stub.getFeatureValue_CACHED_MAY_BE_STALE('tengu_passport_quail', true)).toBe(false)
-  })
   // ── Valid JSON object ────────────────────────────────────────────
   test('loads and returns values from a valid JSON file', () => {


@@ -40,151 +40,6 @@ import _os from 'node:os';
let _flags = undefined;
// ── Open-build GrowthBook overrides ───────────────────────────────────
// Override upstream defaultValue for runtime gates tied to build-time
// features. Only keys that DIFFER from upstream belong here — the
// catalog below is pure documentation and does NOT affect resolution.
//
// Priority: ~/.claude/feature-flags.json > _openBuildDefaults > defaultValue
//
// To override at runtime, create ~/.claude/feature-flags.json:
// { "tengu_some_flag": true }
const _openBuildDefaults = {
'tengu_sedge_lantern': true, // AWAY_SUMMARY — "while you were away" recap (upstream: false)
'tengu_hive_evidence': true, // VERIFICATION_AGENT — read-only test/verification agent (upstream: false)
'tengu_passport_quail': true, // EXTRACT_MEMORIES — enable memory extraction (upstream: false)
'tengu_coral_fern': true, // EXTRACT_MEMORIES — enable memory search in past context (upstream: false)
};
/* ── Known runtime feature keys (reference only) ───────────────────────
* This catalog does NOT participate in flag resolution. It documents
* the known GrowthBook keys and their upstream default values, scraped
* from src/ call sites. It is NOT exhaustive — new keys may be added
* upstream between catalog updates.
*
* Some keys have different defaults at different call sites — this is
* intentional upstream (the server unifies the value at runtime).
*
* To activate any of these, add them to ~/.claude/feature-flags.json
* or to _openBuildDefaults above.
*
* ── Reasoning & thinking ──────────────────────────────────────────────
* tengu_turtle_carbon = true ULTRATHINK deep thinking runtime gate
* tengu_thinkback = gate /thinkback replay command
*
* ── Agents & orchestration ────────────────────────────────────────────
* tengu_amber_flint = true Agent swarms coordination
* tengu_amber_stoat = true Built-in agent availability (Explore, Plan, etc.)
* tengu_agent_list_attach = true Attach file context to agent list
* tengu_auto_background_agents = false Auto-spawn background agents
* tengu_slim_subagent_claudemd = true Lighter ClaudeMD for subagents
* tengu_hive_evidence = false Verification agent / evidence tracking (4 call sites)
* tengu_ultraplan_model = model cfg ULTRAPLAN model selection (dynamic config)
*
* ── Memory & context ──────────────────────────────────────────────────
* tengu_passport_quail = false EXTRACT_MEMORIES main gate (isExtractModeActive)
* tengu_coral_fern = false EXTRACT_MEMORIES search in past context
* tengu_slate_thimble = false Memory dir paths (non-interactive sessions)
* tengu_herring_clock = true/false Team memory paths (varies by call site)
* tengu_bramble_lintel = null Extract memories throttle (null → every turn)
* tengu_sedge_lantern = false AWAY_SUMMARY "while you were away" recap
* tengu_session_memory = false Session memory service
* tengu_sm_config = {} Session memory config (dynamic)
* tengu_sm_compact_config = {} Session memory compaction config (dynamic)
* tengu_cobalt_raccoon = false Reactive compaction (suppress auto-compact)
* tengu_pebble_leaf_prune = false Session storage pruning
*
* ── Kairos & cron ─────────────────────────────────────────────────────
* tengu_kairos_brief = false Brief layout mode (KAIROS)
* tengu_kairos_brief_config = {} Brief config (dynamic)
* tengu_kairos_cron = true Cron scheduler enable
* tengu_kairos_cron_durable = true Durable (disk-persistent) cron tasks
* tengu_kairos_cron_config = {} Cron jitter config (dynamic)
*
* ── Bridge & remote (require Anthropic infra) ─────────────────────────
* tengu_ccr_bridge = false CCR bridge connection
* tengu_ccr_bridge_multi_session = gate Multi-session spawn mode
* tengu_ccr_mirror = false CCR session mirroring
* tengu_ccr_bundle_seed_enabled = gate Git bundle seeding for CCR
* tengu_ccr_bundle_max_bytes = null Bundle size limit (null → default)
* tengu_bridge_repl_v2 = false Environment-less REPL bridge v2
* tengu_bridge_repl_v2_cse_shim_enabled = true CSE→Session tag retag shim
* tengu_bridge_min_version = {min:'0'} Min CLI version for bridge (dynamic)
* tengu_bridge_initial_history_cap = 200 Initial history cap for bridge
* tengu_bridge_system_init = false Bridge system initialization
* tengu_cobalt_harbor = false Auto-connect CCR at startup
* tengu_cobalt_lantern = false Remote setup preconditions
* tengu_remote_backend = false Remote TUI backend
* tengu_surreal_dali = false Remote agent tasks / triggers
*
* ── Prompt & API ──────────────────────────────────────────────────────
* tengu_attribution_header = true Attribution header in API requests
* tengu_basalt_3kr = true MCP instructions delta
* tengu_slate_prism = true/false Message formatting (varies by call site)
* tengu_amber_prism = false Message content formatting
* tengu_amber_json_tools = false JSON format for tool schemas
* tengu_fgts = false API feature gates
* tengu_otk_slot_v1 = false One-time key slots for API auth
* tengu_cicada_nap_ms = 0 Background GrowthBook refresh throttle (ms)
* tengu_miraculo_the_bard = false Service initialization gate
* tengu_immediate_model_command = false Immediate /model command execution
* tengu_chomp_inflection = false Prompt suggestions after responses
* tengu_tool_pear = gate API betas for tool use
* tengu-off-switch = {act:false} Service kill switch (dynamic; uses dash)
*
* ── Permissions & security ────────────────────────────────────────────
* tengu_birch_trellis = true Bash auto-mode permissions config
* tengu_auto_mode_config = {} Auto-mode configuration (dynamic, many call sites)
* tengu_iron_gate_closed = true Permission iron gate (with refresh)
* tengu_destructive_command_warning = false Warning for destructive bash commands
* tengu_disable_bypass_permissions_mode = security Security killswitch (always false in open build)
*
* ── UI & UX ───────────────────────────────────────────────────────────
* tengu_willow_mode = 'off' REPL rendering mode
* tengu_terminal_panel = false Terminal panel keybinding
* tengu_terminal_sidebar = false Terminal sidebar in REPL/config
* tengu_marble_sandcastle = false Fast mode gate
* tengu_jade_anvil_4 = false Rate limit options UI ordering
* tengu_collage_kaleidoscope = true Native clipboard image paste (macOS)
* tengu_lapis_finch = false Plugin/hint recommendation
* tengu_lodestone_enabled = false Deep links claude-cli:// protocol
* tengu_copper_panda = false Skill improvement suggestions
* tengu_desktop_upsell = {} Desktop app upsell config (dynamic)
* tengu-top-of-feed-tip = {} Emergency tip of feed (dynamic; uses dash)
*
* ── File operations ───────────────────────────────────────────────────
* tengu_quartz_lantern = false File read/write dedup optimization
* tengu_moth_copse = false Attachments handling (variant A)
* tengu_marble_fox = false Attachments handling (variant B)
* tengu_scratch = gate Scratchpad filesystem access / coordinator
*
* ── MCP & plugins ─────────────────────────────────────────────────────
* tengu_harbor = false MCP channel allowlist verification
* tengu_harbor_permissions = false MCP channel permissions enforcement
* tengu_copper_bridge = false Chrome MCP bridge
* tengu_chrome_auto_enable = false Auto-enable Chrome MCP on startup
* tengu_glacier_2xr = false Enhanced tool search / ToolSearchTool
* tengu_malort_pedway = {} Computer-use (Chicago) config (dynamic)
*
* ── VSCode / IDE ──────────────────────────────────────────────────────
* tengu_quiet_fern = false VSCode browser support
* tengu_vscode_cc_auth = false VSCode in-band OAuth via claude_authenticate
* tengu_vscode_review_upsell = gate VSCode review upsell
* tengu_vscode_onboarding = gate VSCode onboarding experience
*
* ── Voice ─────────────────────────────────────────────────────────────
* tengu_amber_quartz_disabled = false VOICE_MODE kill-switch (false = voice allowed)
*
* ── Auto-updater (stubbed in open build) ──────────────────────────────
* tengu_version_config = {min:'0'} Min version enforcement (dynamic)
* tengu_max_version_config = {} Max version / deprecation config (dynamic)
*
* ── Telemetry & tracing ───────────────────────────────────────────────
* tengu_trace_lantern = false Beta session tracing
* tengu_chair_sermon = gate Analytics / message formatting gate
* tengu_strap_foyer = false Settings sync to cloud
*/
function _loadFlags() {
  if (_flags !== undefined) return;
  try {
@@ -200,7 +55,6 @@ function _loadFlags() {
function _getFlagValue(key, defaultValue) {
  _loadFlags();
  if (_flags != null && Object.hasOwn(_flags, key)) return _flags[key];
-  if (Object.hasOwn(_openBuildDefaults, key)) return _openBuildDefaults[key];
  return defaultValue;
}
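After this change the flag lookup reduces to a two-step resolution: the flags object loaded from disk wins, and anything absent from it falls through to the caller's default, with no `_openBuildDefaults` layer in between. A minimal standalone sketch of that behavior (the sample flag names and values are illustrative, not taken from a real build):

```javascript
// Sketch of the post-change lookup: loaded flags win, else the caller's
// default. The removed _openBuildDefaults fallback no longer participates.
const _flags = { tengu_kairos_cron: true, tengu_cicada_nap_ms: 0 };

function getFlagValue(key, defaultValue) {
  if (_flags != null && Object.hasOwn(_flags, key)) return _flags[key];
  return defaultValue;
}
```

Callers that previously relied on an open-build default now see their own literal default whenever a flag is missing from the loaded set.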


@@ -249,11 +249,6 @@ export type ToolUseContext = {
  /** When true, canUseTool must always be called even when hooks auto-approve.
   * Used by speculation for overlay file path rewriting. */
  requireCanUseTool?: boolean
-  /**
-   * Optional callback used by hook-chain fallback actions that launch
-   * AgentTool from hook runtime paths.
-   */
-  hookChainsCanUseTool?: CanUseToolFn
  messages: Message[]
  fileReadingLimits?: {
    maxTokens?: number


@@ -169,14 +169,6 @@ describe('Web search result count improvements', () => {
    expect(content).toMatch(/max_uses:\s*15/)
  })
-  test('codex web search path guarantees a non-empty result body', async () => {
-    const content = await file(
-      'tools/WebSearchTool/WebSearchTool.ts',
-    ).text()
-    expect(content).toContain("results.push('No results found.')")
-  })
})
// ---------------------------------------------------------------------------


@@ -1,56 +0,0 @@
-import type { ToolUseContext } from '../Tool.js'
-import type { Command } from '../types/command.js'
-import {
-  benchmarkModel,
-  benchmarkMultipleModels,
-  formatBenchmarkResults,
-  isBenchmarkSupported,
-} from '../utils/model/benchmark.js'
-import { getOllamaModelOptions } from '../utils/model/ollamaModels.js'
-async function runBenchmark(
-  model?: string,
-  context?: ToolUseContext,
-): Promise<void> {
-  if (!isBenchmarkSupported()) {
-    context?.stdout?.write(
-      'Benchmark not supported for this provider.\n' +
-        'Supported: OpenAI-compatible endpoints (Ollama, NVIDIA NIM, MiniMax)\n',
-    )
-    return
-  }
-  let modelsToBenchmark: string[]
-  if (model) {
-    modelsToBenchmark = [model]
-  } else {
-    const ollamaModels = getOllamaModelOptions()
-    modelsToBenchmark = ollamaModels.slice(0, 3).map((m) => m.value)
-  }
-  context?.stdout?.write(`Benchmarking ${modelsToBenchmark.length} model(s)...\n`)
-  const results = await benchmarkMultipleModels(
-    modelsToBenchmark,
-    (completed, total, result) => {
-      context?.stdout?.write(
-        `[${completed}/${total}] ${result.model}: ` +
-          `${result.success ? result.tokensPerSecond.toFixed(1) + ' tps' : 'FAILED'}\n`,
-      )
-    },
-  )
-  context?.stdout?.write('\n' + formatBenchmarkResults(results) + '\n')
-}
-export const benchmark: Command = {
-  name: 'benchmark',
-  async onExecute(context: ToolUseContext): Promise<void> {
-    const args = context.args ?? {}
-    const model = args.model as string | undefined
-    await runBenchmark(model, context)
-  },
-}


@@ -112,10 +112,8 @@ test('third-party provider branch opens the first-run provider manager', async (
  )
  expect(output).toContain('Set up provider')
-  // Use alphabetically-early sentinels so they remain visible in the
-  // 13-row test frame after the provider list was sorted A→Z.
  expect(output).toContain('Anthropic')
-  expect(output).toContain('Azure OpenAI')
-  expect(output).toContain('DeepSeek')
-  expect(output).toContain('Google Gemini')
+  expect(output).toContain('OpenAI')
+  expect(output).toContain('Ollama')
+  expect(output).toContain('LM Studio')
})


@@ -97,47 +97,6 @@ async function waitForCondition(
  throw new Error('Timed out waiting for ProviderManager test condition')
}
-// Provider list is sorted alphabetically by label in the preset picker, so
-// reaching a given provider takes more keypresses than it used to. Keep the
-// target-by-label indirection here so these tests survive future list edits
-// without further churn.
-//
-// Order matches ProviderManager.renderPresetSelection() when
-// canUseCodexOAuth === true (default in mocked tests).
-const PRESET_ORDER = [
-  'Alibaba Coding Plan',
-  'Alibaba Coding Plan (China)',
-  'Anthropic',
-  'Atomic Chat',
-  'Azure OpenAI',
-  'Codex OAuth',
-  'DeepSeek',
-  'Google Gemini',
-  'Groq',
-  'LM Studio',
-  'MiniMax',
-  'Mistral',
-  'Moonshot AI',
-  'NVIDIA NIM',
-  'Ollama',
-  'OpenAI',
-  'OpenRouter',
-  'Together AI',
-  'Custom',
-] as const
-async function navigateToPreset(
-  stdin: { write: (data: string) => void },
-  label: (typeof PRESET_ORDER)[number],
-): Promise<void> {
-  const index = PRESET_ORDER.indexOf(label)
-  if (index < 0) throw new Error(`Unknown preset label: ${label}`)
-  for (let i = 0; i < index; i++) {
-    stdin.write('j')
-    await Bun.sleep(25)
-  }
-}
function createDeferred<T>(): {
  promise: Promise<T>
  resolve: (value: T) => void
@@ -532,10 +491,11 @@ test('ProviderManager first-run Ollama preset auto-detects installed models', as
  await waitForFrameOutput(
    mounted.getOutput,
-    frame => frame.includes('Set up provider'),
+    frame => frame.includes('Set up provider') && frame.includes('Ollama'),
  )
-  await navigateToPreset(mounted.stdin, 'Ollama')
+  mounted.stdin.write('j')
+  await Bun.sleep(50)
  mounted.stdin.write('\r')
  const modelFrame = await waitForFrameOutput(
@@ -630,7 +590,12 @@ test('ProviderManager first-run Codex OAuth switches the current session after l
    frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
  )
-  await navigateToPreset(mounted.stdin, 'Codex OAuth')
+  mounted.stdin.write('j')
+  await Bun.sleep(25)
+  mounted.stdin.write('j')
+  await Bun.sleep(25)
+  mounted.stdin.write('j')
+  await Bun.sleep(25)
  mounted.stdin.write('\r')
  await waitForCondition(() => onDone.mock.calls.length > 0)
@@ -722,7 +687,12 @@ test('ProviderManager first-run Codex OAuth reports next-startup fallback when s
    frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
  )
-  await navigateToPreset(mounted.stdin, 'Codex OAuth')
+  mounted.stdin.write('j')
+  await Bun.sleep(25)
+  mounted.stdin.write('j')
+  await Bun.sleep(25)
+  mounted.stdin.write('j')
+  await Bun.sleep(25)
  mounted.stdin.write('\r')
  await waitForCondition(() => onDone.mock.calls.length > 0)
@@ -816,7 +786,12 @@ test('ProviderManager does not hijack a manual Codex profile when OAuth credenti
    frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
  )
-  await navigateToPreset(mounted.stdin, 'Codex OAuth')
+  mounted.stdin.write('j')
+  await Bun.sleep(25)
+  mounted.stdin.write('j')
+  await Bun.sleep(25)
+  mounted.stdin.write('j')
+  await Bun.sleep(25)
  mounted.stdin.write('\r')
  await waitForCondition(() => onDone.mock.calls.length > 0)


@@ -37,9 +37,7 @@ import {
  readGithubModelsTokenAsync,
} from '../utils/githubModelsCredentials.js'
import {
-  probeAtomicChatReadiness,
  probeOllamaGenerationReadiness,
-  type AtomicChatReadiness,
  type OllamaGenerationReadiness,
} from '../utils/providerDiscovery.js'
import {
@@ -71,7 +69,6 @@ type Screen =
  | 'menu'
  | 'select-preset'
  | 'select-ollama-model'
-  | 'select-atomic-chat-model'
  | 'codex-oauth'
  | 'form'
  | 'select-active'
@@ -92,16 +89,6 @@ type OllamaSelectionState =
    }
  | { state: 'unavailable'; message: string }
-type AtomicChatSelectionState =
-  | { state: 'idle' }
-  | { state: 'loading' }
-  | {
-      state: 'ready'
-      options: OptionWithDescription<string>[]
-      defaultValue?: string
-    }
-  | { state: 'unavailable'; message: string }
const FORM_STEPS: Array<{
  key: DraftField
  label: string
@@ -235,21 +222,6 @@ function getGithubProviderSummary(
  return `github-models · ${GITHUB_PROVIDER_DEFAULT_BASE_URL} · ${getGithubProviderModel(processEnv)} · ${credentialSummary}${activeSuffix}`
}
-function describeAtomicChatSelectionIssue(
-  readiness: AtomicChatReadiness,
-  baseUrl: string,
-): string {
-  if (readiness.state === 'unreachable') {
-    return `Could not reach Atomic Chat at ${redactUrlForDisplay(baseUrl)}. Start the Atomic Chat app first, or enter the endpoint manually.`
-  }
-  if (readiness.state === 'no_models') {
-    return 'Atomic Chat is running, but no models are loaded. Download and load a model inside the Atomic Chat app first, or enter details manually.'
-  }
-  return ''
-}
function describeOllamaSelectionIssue(
  readiness: OllamaGenerationReadiness,
  baseUrl: string,
@@ -384,12 +356,10 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
  const initialIsGithubActive = isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB)
  const initialHasGithubCredential = initialGithubCredentialSource !== 'none'
-  // Deferred initialization: useState initializers run synchronously during
-  // render, so getProviderProfiles() and getActiveProviderProfile() would block
-  // the UI on first mount (sync file I/O). Use empty initial values and load
-  // asynchronously in useEffect with queueMicrotask to keep UI responsive.
-  const [profiles, setProfiles] = React.useState<ProviderProfile[]>([])
-  const [activeProfileId, setActiveProfileId] = React.useState<string | undefined>()
+  const [profiles, setProfiles] = React.useState(() => getProviderProfiles())
+  const [activeProfileId, setActiveProfileId] = React.useState(
+    () => getActiveProviderProfile()?.id,
+  )
  const [githubProviderAvailable, setGithubProviderAvailable] = React.useState(
    () => isGithubProviderAvailable(initialGithubCredentialSource),
  )
@@ -423,88 +393,11 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
  const [ollamaSelection, setOllamaSelection] = React.useState<OllamaSelectionState>({
    state: 'idle',
  })
-  const [atomicChatSelection, setAtomicChatSelection] =
-    React.useState<AtomicChatSelectionState>({ state: 'idle' })
-  // Deferred initialization: useState initializers run synchronously during
-  // render, so getProviderProfiles() and getActiveProviderProfile() would block
-  // the UI (sync file I/O). Defer to queueMicrotask after first render.
-  // In test environment, skip defer to avoid timing issues with mocks.
-  const [isInitializing, setIsInitializing] = React.useState(
-    process.env.NODE_ENV !== 'test',
-  )
-  const [isActivating, setIsActivating] = React.useState(false)
-  const isRefreshingRef = React.useRef(false)
-  React.useEffect(() => {
-    // Skip deferred initialization in test environment (mocks are synchronous)
-    if (process.env.NODE_ENV === 'test') {
-      setProfiles(getProviderProfiles())
-      setActiveProfileId(getActiveProviderProfile()?.id)
-      setIsInitializing(false)
-      return
-    }
-    queueMicrotask(() => {
-      const profilesData = getProviderProfiles()
-      const activeId = getActiveProviderProfile()?.id
-      setProfiles(profilesData)
-      setActiveProfileId(activeId)
-      setIsInitializing(false)
-    })
-  }, [])
  const currentStep = FORM_STEPS[formStepIndex] ?? FORM_STEPS[0]
  const currentStepKey = currentStep.key
  const currentValue = draft[currentStepKey]
-  // Memoize menu options to prevent unnecessary re-renders when navigating
-  // the select menu. Without this, each arrow key press creates a new options
-  // array reference, causing Select to re-render and feel sluggish.
-  const hasProfiles = profiles.length > 0
-  const hasSelectableProviders = hasProfiles || githubProviderAvailable
-  const menuOptions = React.useMemo(
-    () => [
-      {
-        value: 'add',
-        label: 'Add provider',
-        description: 'Create a new provider profile',
-      },
-      {
-        value: 'activate',
-        label: 'Set active provider',
-        description: 'Switch the active provider profile',
-        disabled: !hasSelectableProviders,
-      },
-      {
-        value: 'edit',
-        label: 'Edit provider',
-        description: 'Update URL, model, or key',
-        disabled: !hasProfiles,
-      },
-      {
-        value: 'delete',
-        label: 'Delete provider',
-        description: 'Remove a provider profile',
-        disabled: !hasSelectableProviders,
-      },
-      ...(hasStoredCodexOAuthCredentials
-        ? [
-            {
-              value: 'logout-codex-oauth',
-              label: 'Log out Codex OAuth',
-              description: 'Clear securely stored Codex OAuth credentials',
-            },
-          ]
-        : []),
-      {
-        value: 'done',
-        label: 'Done',
-        description: 'Return to chat',
-      },
-    ],
-    [hasSelectableProviders, hasProfiles, hasStoredCodexOAuthCredentials],
-  )
  const refreshGithubProviderState = React.useCallback((): void => {
    const envCredentialSource = getGithubCredentialSourceFromEnv()
    const githubActive = isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB)
@@ -613,61 +506,12 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
    }
  }, [draft.baseUrl, screen])
-  React.useEffect(() => {
-    if (screen !== 'select-atomic-chat-model') {
-      return
-    }
-    let cancelled = false
-    setAtomicChatSelection({ state: 'loading' })
-    void (async () => {
-      const readiness = await probeAtomicChatReadiness({
-        baseUrl: draft.baseUrl,
-      })
-      if (readiness.state !== 'ready') {
-        if (!cancelled) {
-          setAtomicChatSelection({
-            state: 'unavailable',
-            message: describeAtomicChatSelectionIssue(readiness, draft.baseUrl),
-          })
-        }
-        return
-      }
-      if (!cancelled) {
-        setAtomicChatSelection({
-          state: 'ready',
-          defaultValue: readiness.models[0],
-          options: readiness.models.map(model => ({
-            label: model,
-            value: model,
-          })),
-        })
-      }
-    })()
-    return () => {
-      cancelled = true
-    }
-  }, [draft.baseUrl, screen])
  function refreshProfiles(): void {
-    // Defer sync I/O to next microtask to prevent UI freeze.
-    // getProviderProfiles() and getActiveProviderProfile() read config files
-    // synchronously, which can block the main thread on Windows (antivirus, disk cache).
-    // queueMicrotask ensures the current render completes first.
-    if (isRefreshingRef.current) return
-    isRefreshingRef.current = true
-    queueMicrotask(() => {
-      const nextProfiles = getProviderProfiles()
-      setProfiles(nextProfiles)
-      setActiveProfileId(getActiveProviderProfile()?.id)
-      refreshGithubProviderState()
-      refreshCodexOAuthCredentialState()
-      isRefreshingRef.current = false
-    })
+    const nextProfiles = getProviderProfiles()
+    setProfiles(nextProfiles)
+    setActiveProfileId(getActiveProviderProfile()?.id)
+    refreshGithubProviderState()
+    refreshCodexOAuthCredentialState()
  }
  function clearStartupProviderOverrideFromUserSettings(): string | null {
@@ -740,24 +584,12 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
  async function activateSelectedProvider(profileId: string): Promise<void> {
    let providerLabel = 'provider'
-    // Set loading state before sync I/O to keep UI responsive
-    setIsActivating(true)
-    setStatusMessage('Activating provider...')
    try {
-      // Defer sync I/O to next microtask - UI renders loading state first.
-      // setActiveProviderProfile(), activateGithubProvider(), and
-      // clearStartupProviderOverrideFromUserSettings() all perform sync file writes
-      // (saveGlobalConfig, saveProfileFile, updateSettingsForSource) which can
-      // block the main thread on Windows (antivirus, disk cache, NTFS metadata).
-      await new Promise<void>(resolve => queueMicrotask(resolve))
      if (profileId === GITHUB_PROVIDER_ID) {
        providerLabel = GITHUB_PROVIDER_LABEL
        const githubError = activateGithubProvider()
        if (githubError) {
          setErrorMessage(`Could not activate GitHub provider: ${githubError}`)
-          setIsActivating(false)
          returnToMenu()
          return
        }
@@ -773,7 +605,6 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
          mainLoopModel: GITHUB_PROVIDER_DEFAULT_MODEL,
        }))
        setStatusMessage(`Active provider: ${GITHUB_PROVIDER_LABEL}`)
-        setIsActivating(false)
        returnToMenu()
        return
      }
@@ -781,7 +612,6 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
      const active = setActiveProviderProfile(profileId)
      if (!active) {
        setErrorMessage('Could not change active provider.')
-        setIsActivating(false)
        returnToMenu()
        return
      }
@@ -829,12 +659,10 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
          ? `Active provider: ${active.name}. Warning: could not clear startup provider override (${settingsOverrideError}).`
          : `Active provider: ${active.name}`,
      )
-      setIsActivating(false)
      returnToMenu()
    } catch (error) {
      refreshProfiles()
      setStatusMessage(undefined)
-      setIsActivating(false)
      const detail = error instanceof Error ? error.message : String(error)
      setErrorMessage(`Could not finish activating ${providerLabel}: ${detail}`)
      returnToMenu()
@@ -958,12 +786,6 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
      return
    }
-    if (preset === 'atomic-chat') {
-      setAtomicChatSelection({ state: 'loading' })
-      setScreen('select-atomic-chat-model')
-      return
-    }
    setScreen('form')
  }
@@ -1039,86 +861,6 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
    returnToMenu()
  }
-  function renderAtomicChatSelection(): React.ReactNode {
-    if (
-      atomicChatSelection.state === 'loading' ||
-      atomicChatSelection.state === 'idle'
-    ) {
-      return (
-        <Box flexDirection="column" gap={1}>
-          <Text color="remember" bold>
-            Checking Atomic Chat
-          </Text>
-          <Text dimColor>Looking for loaded Atomic Chat models...</Text>
-        </Box>
-      )
-    }
-    if (atomicChatSelection.state === 'unavailable') {
-      return (
-        <Box flexDirection="column" gap={1}>
-          <Text color="remember" bold>
-            Atomic Chat setup
-          </Text>
-          <Text dimColor>{atomicChatSelection.message}</Text>
-          <Select
-            options={[
-              {
-                value: 'manual',
-                label: 'Enter manually',
-                description: 'Fill in the base URL and model yourself',
-              },
-              {
-                value: 'back',
-                label: 'Back',
-                description: 'Choose another provider preset',
-              },
-            ]}
-            onChange={(value: string) => {
-              if (value === 'manual') {
-                setFormStepIndex(0)
-                setCursorOffset(draft.name.length)
-                setScreen('form')
-                return
-              }
-              setScreen('select-preset')
-            }}
-            onCancel={() => setScreen('select-preset')}
-            visibleOptionCount={2}
-          />
-        </Box>
-      )
-    }
-    return (
-      <Box flexDirection="column" gap={1}>
-        <Text color="remember" bold>
-          Choose an Atomic Chat model
-        </Text>
-        <Text dimColor>
-          Pick one of the models loaded in Atomic Chat to save into a local
-          provider profile.
-        </Text>
-        <Select
-          options={atomicChatSelection.options}
-          defaultValue={atomicChatSelection.defaultValue}
-          defaultFocusValue={atomicChatSelection.defaultValue}
-          inlineDescriptions
-          visibleOptionCount={Math.min(8, atomicChatSelection.options.length)}
-          onChange={(value: string) => {
-            const nextDraft = {
-              ...draft,
-              model: value,
-            }
-            setDraft(nextDraft)
-            persistDraft(nextDraft)
-          }}
-          onCancel={() => setScreen('select-preset')}
-        />
-      </Box>
-    )
-  }
  function renderOllamaSelection(): React.ReactNode {
    if (ollamaSelection.state === 'loading' || ollamaSelection.state === 'idle') {
      return (
@@ -1249,35 +991,21 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
  function renderPresetSelection(): React.ReactNode {
    const canUseCodexOAuth = !isBareMode()
-    // Providers sorted alphabetically by label. `Custom` is pinned to the end
-    // because it's the catch-all / escape hatch — users scanning the list
-    // should always find known providers first. `Skip for now` (first-run
-    // only) comes last, after Custom.
    const options = [
-      {
-        value: 'dashscope-intl',
-        label: 'Alibaba Coding Plan',
-        description: 'Alibaba DashScope International endpoint',
-      },
-      {
-        value: 'dashscope-cn',
-        label: 'Alibaba Coding Plan (China)',
-        description: 'Alibaba DashScope China endpoint',
-      },
      {
        value: 'anthropic',
        label: 'Anthropic',
        description: 'Native Claude API (x-api-key auth)',
      },
      {
-        value: 'atomic-chat',
-        label: 'Atomic Chat',
-        description: 'Local Model Provider',
+        value: 'ollama',
+        label: 'Ollama',
+        description: 'Local or remote Ollama endpoint',
      },
      {
-        value: 'azure-openai',
-        label: 'Azure OpenAI',
-        description: 'Azure OpenAI endpoint (model=deployment name)',
+        value: 'openai',
+        label: 'OpenAI',
+        description: 'OpenAI API with API key',
      },
      ...(canUseCodexOAuth
        ? [
@@ -1289,6 +1017,11 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
          },
        ]
      : []),
+      {
+        value: 'moonshotai',
+        label: 'Moonshot AI',
+        description: 'Kimi OpenAI-compatible endpoint',
+      },
      {
        value: 'deepseek',
        label: 'DeepSeek',
@@ -1299,45 +1032,25 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
        label: 'Google Gemini',
        description: 'Gemini OpenAI-compatible endpoint',
      },
-      {
-        value: 'together',
-        label: 'Together AI',
-        description: 'Together chat/completions endpoint',
-      },
      {
        value: 'groq',
        label: 'Groq',
        description: 'Groq OpenAI-compatible endpoint',
      },
-      {
-        value: 'lmstudio',
-        label: 'LM Studio',
-        description: 'Local LM Studio endpoint',
-      },
-      {
-        value: 'minimax',
-        label: 'MiniMax',
-        description: 'MiniMax API endpoint',
-      },
      {
        value: 'mistral',
        label: 'Mistral',
        description: 'Mistral OpenAI-compatible endpoint',
      },
      {
-        value: 'moonshotai',
-        label: 'Moonshot AI',
-        description: 'Kimi OpenAI-compatible endpoint',
-      },
-      {
-        value: 'nvidia-nim',
-        label: 'NVIDIA NIM',
-        description: 'NVIDIA NIM endpoint',
-      },
-      {
-        value: 'ollama',
-        label: 'Ollama',
-        description: 'Local or remote Ollama endpoint',
-      },
-      {
-        value: 'openai',
-        label: 'OpenAI',
-        description: 'OpenAI API with API key',
+        value: 'azure-openai',
+        label: 'Azure OpenAI',
+        description: 'Azure OpenAI endpoint (model=deployment name)',
      },
      {
        value: 'openrouter',
@@ -1345,15 +1058,35 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
        description: 'OpenRouter OpenAI-compatible endpoint',
      },
      {
-        value: 'together',
-        label: 'Together AI',
-        description: 'Together chat/completions endpoint',
+        value: 'lmstudio',
+        label: 'LM Studio',
+        description: 'Local LM Studio endpoint',
+      },
+      {
+        value: 'dashscope-cn',
+        label: 'Alibaba Coding Plan (China)',
+        description: 'Alibaba DashScope China endpoint',
+      },
+      {
+        value: 'dashscope-intl',
+        label: 'Alibaba Coding Plan',
+        description: 'Alibaba DashScope International endpoint',
      },
      {
        value: 'custom',
        label: 'Custom',
        description: 'Any OpenAI-compatible provider',
      },
+      {
+        value: 'nvidia-nim',
+        label: 'NVIDIA NIM',
+        description: 'NVIDIA NIM endpoint',
+      },
+      {
+        value: 'minimax',
+        label: 'MiniMax',
+        description: 'MiniMax API endpoint',
+      },
      ...(mode === 'first-run'
        ? [
          {
@@ -1444,10 +1177,49 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
  }
  function renderMenu(): React.ReactNode {
-    // Use memoized menuOptions from component scope
    const hasProfiles = profiles.length > 0
    const hasSelectableProviders = hasProfiles || githubProviderAvailable
+    const options = [
+      {
+        value: 'add',
+        label: 'Add provider',
+        description: 'Create a new provider profile',
+      },
+      {
+        value: 'activate',
+        label: 'Set active provider',
+        description: 'Switch the active provider profile',
+        disabled: !hasSelectableProviders,
+      },
+      {
+        value: 'edit',
+        label: 'Edit provider',
+        description: 'Update URL, model, or key',
+        disabled: !hasProfiles,
+      },
+      {
+        value: 'delete',
+        label: 'Delete provider',
+        description: 'Remove a provider profile',
+        disabled: !hasSelectableProviders,
+      },
+      ...(hasStoredCodexOAuthCredentials
+        ? [
+            {
+              value: 'logout-codex-oauth',
+              label: 'Log out Codex OAuth',
+              description: 'Clear securely stored Codex OAuth credentials',
+            },
+          ]
+        : []),
+      {
+        value: 'done',
+        label: 'Done',
+        description: 'Return to chat',
+      },
+    ]
    return (
      <Box flexDirection="column" gap={1}>
        <Text color="remember" bold>
@@ -1484,7 +1256,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
        )}
        </Box>
        <Select
-          options={menuOptions}
+          options={options}
          onChange={(value: string) => {
            setErrorMessage(undefined)
            switch (value) {
@@ -1497,7 +1269,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
              }
              break
            case 'edit':
-              if (hasProfiles) {
+              if (profiles.length > 0) {
                setScreen('select-edit')
              }
              break
@@ -1554,7 +1326,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
          }}
          onCancel={() => closeWithCancelled('Provider manager closed')}
          defaultFocusValue={menuFocusValue}
-          visibleOptionCount={menuOptions.length}
+          visibleOptionCount={options.length}
        />
      </Box>
    )
@@ -1633,9 +1405,6 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
case 'select-ollama-model':
content = renderOllamaSelection()
break
case 'select-atomic-chat-model':
content = renderAtomicChatSelection()
break
case 'codex-oauth':
content = (
<CodexOAuthSetup
@@ -1793,21 +1562,5 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
break
}
-return (
-<Pane color="permission">
-{isInitializing ? (
-<Box flexDirection="column" gap={1}>
-<Text color="remember" bold>Loading providers...</Text>
-<Text dimColor>Reading provider profiles from disk.</Text>
-</Box>
-) : isActivating ? (
-<Box flexDirection="column" gap={1}>
-<Text color="remember" bold>Activating provider...</Text>
-<Text dimColor>Please wait while the provider is being configured.</Text>
-</Box>
-) : (
-content
-)}
-</Pane>
-)
+return <Pane color="permission">{content}</Pane>
}

View File

@@ -281,24 +281,6 @@ export function Config({
enabled: autoCompactEnabled
});
}
}, {
id: 'toolHistoryCompressionEnabled',
label: 'Tool history compression',
value: globalConfig.toolHistoryCompressionEnabled,
type: 'boolean' as const,
onChange(toolHistoryCompressionEnabled: boolean) {
saveGlobalConfig(current => ({
...current,
toolHistoryCompressionEnabled
}));
setGlobalConfig({
...getGlobalConfig(),
toolHistoryCompressionEnabled
});
logEvent('tengu_tool_history_compression_setting_changed', {
enabled: toolHistoryCompressionEnabled
});
}
}, {
id: 'spinnerTipsEnabled',
label: 'Show tips',
@@ -1176,9 +1158,6 @@ export function Config({
if (globalConfig.autoCompactEnabled !== initialConfig.current.autoCompactEnabled) {
formattedChanges.push(`${globalConfig.autoCompactEnabled ? 'Enabled' : 'Disabled'} auto-compact`);
}
if (globalConfig.toolHistoryCompressionEnabled !== initialConfig.current.toolHistoryCompressionEnabled) {
formattedChanges.push(`${globalConfig.toolHistoryCompressionEnabled ? 'Enabled' : 'Disabled'} tool history compression`);
}
if (globalConfig.respectGitignore !== initialConfig.current.respectGitignore) {
formattedChanges.push(`${globalConfig.respectGitignore ? 'Enabled' : 'Disabled'} respect .gitignore in file picker`);
}

View File

@@ -1,158 +0,0 @@
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
import { detectProvider } from './StartupScreen.js'
const ENV_KEYS = [
'CLAUDE_CODE_USE_OPENAI',
'CLAUDE_CODE_USE_GEMINI',
'CLAUDE_CODE_USE_GITHUB',
'CLAUDE_CODE_USE_BEDROCK',
'CLAUDE_CODE_USE_VERTEX',
'CLAUDE_CODE_USE_MISTRAL',
'OPENAI_BASE_URL',
'OPENAI_API_KEY',
'OPENAI_MODEL',
'GEMINI_MODEL',
'MISTRAL_MODEL',
'ANTHROPIC_MODEL',
'NVIDIA_NIM',
'MINIMAX_API_KEY',
]
const originalEnv: Record<string, string | undefined> = {}
beforeEach(() => {
for (const key of ENV_KEYS) {
originalEnv[key] = process.env[key]
delete process.env[key]
}
})
afterEach(() => {
for (const key of ENV_KEYS) {
if (originalEnv[key] === undefined) {
delete process.env[key]
} else {
process.env[key] = originalEnv[key]
}
}
})
function setupOpenAIMode(baseUrl: string, model: string): void {
process.env.CLAUDE_CODE_USE_OPENAI = '1'
process.env.OPENAI_BASE_URL = baseUrl
process.env.OPENAI_MODEL = model
process.env.OPENAI_API_KEY = 'test-key'
}
// --- Issue #855: aggregator URL must win over vendor-prefixed model name ---
describe('detectProvider — aggregator URL authoritative over model-name substring (#855)', () => {
test('OpenRouter + deepseek/deepseek-chat labels as OpenRouter', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'deepseek/deepseek-chat')
expect(detectProvider().name).toBe('OpenRouter')
})
test('OpenRouter + moonshotai/kimi-k2 labels as OpenRouter', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'moonshotai/kimi-k2')
expect(detectProvider().name).toBe('OpenRouter')
})
test('OpenRouter + mistralai/mistral-large labels as OpenRouter', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'mistralai/mistral-large')
expect(detectProvider().name).toBe('OpenRouter')
})
test('OpenRouter + meta-llama/llama-3.3 labels as OpenRouter', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'meta-llama/llama-3.3-70b-instruct')
expect(detectProvider().name).toBe('OpenRouter')
})
test('Together + deepseek-ai/DeepSeek-V3 labels as Together AI', () => {
setupOpenAIMode('https://api.together.xyz/v1', 'deepseek-ai/DeepSeek-V3')
expect(detectProvider().name).toBe('Together AI')
})
test('Together + meta-llama/Llama-3.3 labels as Together AI', () => {
setupOpenAIMode('https://api.together.xyz/v1', 'meta-llama/Llama-3.3-70B-Instruct-Turbo')
expect(detectProvider().name).toBe('Together AI')
})
test('Groq + deepseek-r1-distill-llama-70b labels as Groq', () => {
setupOpenAIMode('https://api.groq.com/openai/v1', 'deepseek-r1-distill-llama-70b')
expect(detectProvider().name).toBe('Groq')
})
test('Groq + llama-3.3-70b-versatile labels as Groq', () => {
setupOpenAIMode('https://api.groq.com/openai/v1', 'llama-3.3-70b-versatile')
expect(detectProvider().name).toBe('Groq')
})
test('Azure + any deepseek deployment labels as Azure OpenAI', () => {
setupOpenAIMode('https://my-resource.openai.azure.com/', 'deepseek-chat')
expect(detectProvider().name).toBe('Azure OpenAI')
})
})
// --- Direct vendor endpoints still label correctly (regression) ---
describe('detectProvider — direct vendor endpoints', () => {
test('api.deepseek.com labels as DeepSeek', () => {
setupOpenAIMode('https://api.deepseek.com/v1', 'deepseek-chat')
expect(detectProvider().name).toBe('DeepSeek')
})
test('api.moonshot.cn labels as Moonshot (Kimi)', () => {
setupOpenAIMode('https://api.moonshot.cn/v1', 'moonshot-v1-8k')
expect(detectProvider().name).toBe('Moonshot (Kimi)')
})
test('api.mistral.ai labels as Mistral', () => {
setupOpenAIMode('https://api.mistral.ai/v1', 'mistral-large-latest')
expect(detectProvider().name).toBe('Mistral')
})
test('default OpenAI URL + gpt-4o labels as OpenAI', () => {
setupOpenAIMode('https://api.openai.com/v1', 'gpt-4o')
expect(detectProvider().name).toBe('OpenAI')
})
})
// --- rawModel fallback for generic/custom endpoints ---
describe('detectProvider — rawModel fallback when URL is generic', () => {
test('custom proxy + deepseek-chat falls back to DeepSeek', () => {
setupOpenAIMode('https://my-proxy.internal/v1', 'deepseek-chat')
expect(detectProvider().name).toBe('DeepSeek')
})
test('custom proxy + kimi-k2 falls back to Moonshot (Kimi)', () => {
setupOpenAIMode('https://my-proxy.internal/v1', 'kimi-k2-instruct')
expect(detectProvider().name).toBe('Moonshot (Kimi)')
})
test('custom proxy + llama-3.3 falls back to Meta Llama', () => {
setupOpenAIMode('https://my-proxy.internal/v1', 'llama-3.3-70b')
expect(detectProvider().name).toBe('Meta Llama')
})
test('custom proxy + mistral-large falls back to Mistral', () => {
setupOpenAIMode('https://my-proxy.internal/v1', 'mistral-large-latest')
expect(detectProvider().name).toBe('Mistral')
})
})
// --- Explicit env flags win over URL heuristics ---
describe('detectProvider — explicit dedicated-provider env flags', () => {
test('NVIDIA_NIM=1 overrides aggregator URL', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'some-nim-model')
process.env.NVIDIA_NIM = '1'
expect(detectProvider().name).toBe('NVIDIA NIM')
})
test('MINIMAX_API_KEY overrides aggregator URL', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'any-model')
process.env.MINIMAX_API_KEY = 'test-key'
expect(detectProvider().name).toBe('MiniMax')
})
})

View File

@@ -83,7 +83,7 @@ const LOGO_CLAUDE = [
// ─── Provider detection ───────────────────────────────────────────────────────
-export function detectProvider(): { name: string; model: string; baseUrl: string; isLocal: boolean } {
+function detectProvider(): { name: string; model: string; baseUrl: string; isLocal: boolean } {
const useGemini = process.env.CLAUDE_CODE_USE_GEMINI === '1' || process.env.CLAUDE_CODE_USE_GEMINI === 'true'
const useGithub = process.env.CLAUDE_CODE_USE_GITHUB === '1' || process.env.CLAUDE_CODE_USE_GITHUB === 'true'
const useOpenAI = process.env.CLAUDE_CODE_USE_OPENAI === '1' || process.env.CLAUDE_CODE_USE_OPENAI === 'true'
@@ -117,34 +117,28 @@ export function detectProvider(): { name: string; model: string; baseUrl: string
const baseUrl = resolvedRequest.baseUrl
const isLocal = isLocalProviderUrl(baseUrl)
let name = 'OpenAI'
-// Explicit dedicated-provider env flags win.
-if (process.env.NVIDIA_NIM) name = 'NVIDIA NIM'
-else if (process.env.MINIMAX_API_KEY) name = 'MiniMax'
-else if (
-  resolvedRequest.transport === 'codex_responses' ||
-  baseUrl.includes('chatgpt.com/backend-api/codex')
-)
-  name = 'Codex'
-// Base URL is authoritative — must precede rawModel checks so aggregators
-// (OpenRouter/Together/Groq) aren't mislabelled as DeepSeek/Kimi/etc.
-// when routed to models whose IDs contain a vendor prefix. See issue #855.
-else if (/openrouter/i.test(baseUrl)) name = 'OpenRouter'
-else if (/together/i.test(baseUrl)) name = 'Together AI'
-else if (/groq/i.test(baseUrl)) name = 'Groq'
-else if (/azure/i.test(baseUrl)) name = 'Azure OpenAI'
-else if (/nvidia/i.test(baseUrl)) name = 'NVIDIA NIM'
-else if (/minimax/i.test(baseUrl)) name = 'MiniMax'
-else if (/moonshot/i.test(baseUrl)) name = 'Moonshot (Kimi)'
-else if (/deepseek/i.test(baseUrl)) name = 'DeepSeek'
-else if (/mistral/i.test(baseUrl)) name = 'Mistral'
-// rawModel fallback — fires only when base URL is generic/custom.
-else if (/nvidia/i.test(rawModel)) name = 'NVIDIA NIM'
-else if (/minimax/i.test(rawModel)) name = 'MiniMax'
-else if (/kimi/i.test(rawModel)) name = 'Moonshot (Kimi)'
-else if (/deepseek/i.test(rawModel)) name = 'DeepSeek'
-else if (/mistral/i.test(rawModel)) name = 'Mistral'
-else if (/llama/i.test(rawModel)) name = 'Meta Llama'
-else if (isLocal) name = getLocalOpenAICompatibleProviderLabel(baseUrl)
+if (/nvidia/i.test(baseUrl) || /nvidia/i.test(rawModel) || process.env.NVIDIA_NIM)
+  name = 'NVIDIA NIM'
+else if (/minimax/i.test(baseUrl) || /minimax/i.test(rawModel) || process.env.MINIMAX_API_KEY)
+  name = 'MiniMax'
+else if (resolvedRequest.transport === 'codex_responses' || baseUrl.includes('chatgpt.com/backend-api/codex'))
+  name = 'Codex'
+else if (/deepseek/i.test(baseUrl) || /deepseek/i.test(rawModel))
+  name = 'DeepSeek'
+else if (/openrouter/i.test(baseUrl))
+  name = 'OpenRouter'
+else if (/together/i.test(baseUrl))
+  name = 'Together AI'
+else if (/groq/i.test(baseUrl))
+  name = 'Groq'
+else if (/mistral/i.test(baseUrl) || /mistral/i.test(rawModel))
+  name = 'Mistral'
+else if (/azure/i.test(baseUrl))
+  name = 'Azure OpenAI'
+else if (/llama/i.test(rawModel))
+  name = 'Meta Llama'
+else if (isLocal)
+  name = getLocalOpenAICompatibleProviderLabel(baseUrl)
// Resolve model alias to actual model name + reasoning effort
let displayModel = resolvedRequest.resolvedModel
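The precedence this hunk changes can be distilled into a small standalone function. This is a hypothetical sketch, not the real `detectProvider`: `labelProvider`, its parameters, and the reduced label set are illustrative, and only the ordering (base URL checked before the model-name fallback, per issue #855) is taken from the diff and its tests.

```typescript
// Sketch of the #855 ordering: the base URL is authoritative, so an
// aggregator endpoint never inherits a vendor label from a prefixed
// model ID; the model-name fallback only fires for generic URLs.
function labelProvider(baseUrl: string, model: string): string {
  // 1. Base URL wins.
  if (/openrouter/i.test(baseUrl)) return 'OpenRouter'
  if (/together/i.test(baseUrl)) return 'Together AI'
  if (/groq/i.test(baseUrl)) return 'Groq'
  if (/azure/i.test(baseUrl)) return 'Azure OpenAI'
  if (/moonshot/i.test(baseUrl)) return 'Moonshot (Kimi)'
  if (/deepseek/i.test(baseUrl)) return 'DeepSeek'
  if (/mistral/i.test(baseUrl)) return 'Mistral'
  // 2. rawModel fallback for generic/custom endpoints.
  if (/kimi/i.test(model)) return 'Moonshot (Kimi)'
  if (/deepseek/i.test(model)) return 'DeepSeek'
  if (/mistral/i.test(model)) return 'Mistral'
  if (/llama/i.test(model)) return 'Meta Llama'
  return 'OpenAI'
}
```

With this ordering, `https://openrouter.ai/api/v1` plus `deepseek/deepseek-chat` labels as OpenRouter, while the same model ID behind a generic proxy URL falls back to DeepSeek, matching the removed test expectations.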

View File

@@ -823,11 +823,6 @@ function getFunctionResultClearingSection(model: string): string | null {
return null
}
const config = getCachedMCConfigForFRC()
if (!config) {
// External/stub builds return null from getCachedMCConfig — abort the
// section rather than trying to read .supportedModels off null.
return null
}
const isModelSupported = config.supportedModels?.some(pattern =>
model.includes(pattern),
)

View File

@@ -1,8 +1,5 @@
import { expect, test } from 'bun:test'
-import {
-  shouldHandleInputAsPaste,
-  supportsClipboardImageFallback,
-} from './usePasteHandler.ts'
+import { supportsClipboardImageFallback } from './usePasteHandler.ts'
test('supports clipboard image fallback on Windows', () => {
expect(supportsClipboardImageFallback('windows')).toBe(true)
@@ -23,42 +20,3 @@ test('does not support clipboard image fallback on WSL', () => {
test('does not support clipboard image fallback on unknown platforms', () => {
expect(supportsClipboardImageFallback('unknown')).toBe(false)
})
test('does not treat a bracketed paste as pending when no paste handlers are provided', () => {
expect(
shouldHandleInputAsPaste({
hasTextPasteHandler: false,
hasImagePasteHandler: false,
inputLength: 'kimi-k2.5'.length,
pastePending: false,
hasImageFilePath: false,
isFromPaste: true,
}),
).toBe(false)
})
test('treats bracketed text paste as pending when a text paste handler exists', () => {
expect(
shouldHandleInputAsPaste({
hasTextPasteHandler: true,
hasImagePasteHandler: false,
inputLength: 'kimi-k2.5'.length,
pastePending: false,
hasImageFilePath: false,
isFromPaste: true,
}),
).toBe(true)
})
test('treats image path paste as pending when only an image handler exists', () => {
expect(
shouldHandleInputAsPaste({
hasTextPasteHandler: false,
hasImagePasteHandler: true,
inputLength: 'C:\\Users\\jat\\image.png'.length,
pastePending: false,
hasImageFilePath: true,
isFromPaste: false,
}),
).toBe(true)
})
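The behavior these removed tests pin down can be exercised standalone. The sketch below reproduces the `shouldHandleInputAsPaste` predicate from this diff so it runs on its own; the `PASTE_THRESHOLD` value is an assumption for the sketch (the real constant lives in `usePasteHandler.ts`), and it does not affect the three cases below.

```typescript
const PASTE_THRESHOLD = 800 // assumed value for this sketch

// Predicate as shown in the diff: text-paste handling requires a text
// handler plus any paste signal; an image handler alone only reacts to
// an image file path.
function shouldHandleInputAsPaste(options: {
  hasTextPasteHandler: boolean
  hasImagePasteHandler: boolean
  inputLength: number
  pastePending: boolean
  hasImageFilePath: boolean
  isFromPaste: boolean
}): boolean {
  return (
    (options.hasTextPasteHandler &&
      (options.inputLength > PASTE_THRESHOLD ||
        options.pastePending ||
        options.hasImageFilePath ||
        options.isFromPaste)) ||
    (options.hasImagePasteHandler && options.hasImageFilePath)
  )
}
```

Note the asymmetry the tests assert: a bracketed paste with no registered handlers is not treated as pending at all, so short pasted strings like `kimi-k2.5` pass through as ordinary input.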

View File

@@ -35,24 +35,6 @@ type PasteHandlerProps = {
) => void
}
export function shouldHandleInputAsPaste(options: {
hasTextPasteHandler: boolean
hasImagePasteHandler: boolean
inputLength: number
pastePending: boolean
hasImageFilePath: boolean
isFromPaste: boolean
}): boolean {
return (
(options.hasTextPasteHandler &&
(options.inputLength > PASTE_THRESHOLD ||
options.pastePending ||
options.hasImageFilePath ||
options.isFromPaste)) ||
(options.hasImagePasteHandler && options.hasImageFilePath)
)
}
export function usePasteHandler({
onPaste,
onInput,
@@ -254,6 +236,11 @@ export function usePasteHandler({
// The keypress parser sets isPasted=true for content within bracketed paste.
const isFromPaste = event.keypress.isPasted
// If this is pasted content, set isPasting state for UI feedback
if (isFromPaste) {
setIsPasting(true)
}
// Handle large pastes (>PASTE_THRESHOLD chars)
// Usually we get one or two input characters at a time. If we
// get more than the threshold, the user has probably pasted.
@@ -281,7 +268,6 @@ export function usePasteHandler({
canFallbackToClipboardImage &&
onImagePaste
) {
setIsPasting(true)
checkClipboardForImage()
// Reset isPasting since there's no text content to process
setIsPasting(false)
@@ -289,17 +275,14 @@ export function usePasteHandler({
}
// Check if we should handle as paste (from bracketed paste, large input, or continuation)
-const shouldHandleAsPaste = shouldHandleInputAsPaste({
-  hasTextPasteHandler: Boolean(onPaste),
-  hasImagePasteHandler: Boolean(onImagePaste),
-  inputLength: input.length,
-  pastePending: pastePendingRef.current,
-  hasImageFilePath,
-  isFromPaste,
-})
+const shouldHandleAsPaste =
+  onPaste &&
+  (input.length > PASTE_THRESHOLD ||
+    pastePendingRef.current ||
+    hasImageFilePath ||
+    isFromPaste)
if (shouldHandleAsPaste) {
setIsPasting(true)
pastePendingRef.current = true
setPasteState(({ chunks, timeoutId }) => {
return {

View File

@@ -1217,7 +1217,7 @@ async function* queryModel(
cachedMCEnabled = featureEnabled && modelSupported
const config = getCachedMCConfig()
logForDebugging(
-  `Cached MC gate: enabled=${featureEnabled} modelSupported=${modelSupported} model=${options.model} supportedModels=${jsonStringify(config?.supportedModels)}`,
+  `Cached MC gate: enabled=${featureEnabled} modelSupported=${modelSupported} model=${options.model} supportedModels=${jsonStringify(config.supportedModels)}`,
)
} }

View File

@@ -8,7 +8,6 @@ import {
convertCodexResponseToAnthropicMessage,
convertToolsToResponsesTools,
} from './codexShim.js'
import { __test as webSearchToolTest } from '../../tools/WebSearchTool/WebSearchTool.js'
const tempDirs: string[] = []
const originalEnv = {
@@ -610,164 +609,6 @@ describe('Codex request translation', () => {
])
})
test('recovers Codex web search text and sources from sparse completed response', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
sources: [
{
title: 'OpenClaude repo',
url: 'https://github.com/example/openclaude',
},
],
},
{
type: 'message',
role: 'assistant',
content: [
{
type: 'text',
text: 'OpenClaude is available on GitHub.',
sources: [
{
title: 'Docs',
url: 'https://docs.example.com/openclaude',
},
],
},
],
},
],
},
'OpenClaude GitHub 2026',
0.42,
)
expect(output.results).toEqual([
'OpenClaude is available on GitHub.',
{
tool_use_id: 'codex-web-search',
content: [
{
title: 'OpenClaude repo',
url: 'https://github.com/example/openclaude',
},
{
title: 'Docs',
url: 'https://docs.example.com/openclaude',
},
],
},
])
})
test('falls back to a non-empty Codex web search result message', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{ output: [] },
'OpenClaude GitHub 2026',
0.11,
)
expect(output.results).toEqual(['No results found.'])
})
test('surfaces Codex web search failure reason with a message', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
status: 'failed',
error: { message: 'upstream search provider rate-limited' },
},
],
},
'OpenClaude GitHub 2026',
0.05,
)
expect(output.results).toEqual([
'Web search failed: upstream search provider rate-limited',
])
})
test('surfaces Codex web search failure reason nested under action.error', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
status: 'failed',
action: { error: { message: 'query blocked' } },
},
],
},
'OpenClaude GitHub 2026',
0.05,
)
expect(output.results).toEqual(['Web search failed: query blocked'])
})
test('handles Codex web search failure with no reason attached', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
status: 'failed',
},
],
},
'OpenClaude GitHub 2026',
0.05,
)
expect(output.results).toEqual(['Web search failed.'])
})
test('a failure item does not suppress sources from a later message item', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
status: 'failed',
error: { message: 'partial outage' },
},
{
type: 'message',
role: 'assistant',
content: [
{
type: 'output_text',
text: 'Partial results below.',
sources: [
{ title: 'Docs', url: 'https://docs.example.com/openclaude' },
],
},
],
},
],
},
'OpenClaude GitHub 2026',
0.05,
)
expect(output.results).toEqual([
'Web search failed: partial outage',
'Partial results below.',
{
tool_use_id: 'codex-web-search',
content: [
{ title: 'Docs', url: 'https://docs.example.com/openclaude' },
],
},
])
})
test('translates Codex SSE text stream into Anthropic events', async () => {
const responseText = [
'event: response.output_item.added',

View File

@@ -1,5 +1,4 @@
import { APIError } from '@anthropic-ai/sdk'
import { compressToolHistory } from './compressToolHistory.js'
import { fetchWithProxyRetry } from './fetchWithProxyRetry.js'
import type {
ResolvedCodexCredentials,
@@ -485,15 +484,13 @@ export async function performCodexRequest(options: {
defaultHeaders: Record<string, string>
signal?: AbortSignal
}): Promise<Response> {
-const compressedMessages = compressToolHistory(
+const input = convertAnthropicMessagesToResponsesInput(
options.params.messages as Array<{
role?: string
message?: { role?: string; content?: unknown }
content?: unknown
}>,
-  options.request.resolvedModel,
)
-const input = convertAnthropicMessagesToResponsesInput(compressedMessages)
const body: Record<string, unknown> = {
model: options.request.resolvedModel,
input: input.length > 0

View File

@@ -1,572 +0,0 @@
import { afterEach, beforeEach, expect, mock, test } from 'bun:test'
import { compressToolHistory, getTiers } from './compressToolHistory.js'
// Mock the two dependencies so tests are deterministic and don't read disk config.
const mockState = {
enabled: true,
effectiveWindow: 100_000,
}
mock.module('../../utils/config.js', () => ({
getGlobalConfig: () => ({
toolHistoryCompressionEnabled: mockState.enabled,
}),
}))
mock.module('../compact/autoCompact.js', () => ({
getEffectiveContextWindowSize: () => mockState.effectiveWindow,
}))
beforeEach(() => {
mockState.enabled = true
mockState.effectiveWindow = 100_000
})
afterEach(() => {
mockState.enabled = true
mockState.effectiveWindow = 100_000
})
type Block = Record<string, unknown>
type Msg = { role: string; content: Block[] | string }
function bigText(n: number): string {
return 'x'.repeat(n)
}
function buildToolExchange(id: number, resultLength: number): Msg[] {
return [
{
role: 'assistant',
content: [
{
type: 'tool_use',
id: `toolu_${id}`,
name: 'Read',
input: { file_path: `/path/to/file${id}.ts` },
},
],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: `toolu_${id}`,
content: bigText(resultLength),
},
],
},
]
}
function buildConversation(numToolExchanges: number, resultLength = 5_000): Msg[] {
const out: Msg[] = [{ role: 'user', content: 'Initial request' }]
for (let i = 0; i < numToolExchanges; i++) {
out.push(...buildToolExchange(i, resultLength))
}
return out
}
function getResultMessages(messages: Msg[]): Msg[] {
return messages.filter(
m => Array.isArray(m.content) && m.content.some((b: any) => b.type === 'tool_result'),
)
}
function getResultBlock(msg: Msg): Block {
return (msg.content as Block[]).find((b: any) => b.type === 'tool_result') as Block
}
function getResultText(msg: Msg): string {
const block = getResultBlock(msg)
const c = block.content
if (typeof c === 'string') return c
if (Array.isArray(c)) {
return c
.filter((b: any) => b.type === 'text')
.map((b: any) => b.text)
.join('\n')
}
return ''
}
// ---------- getTiers ----------
test('getTiers: < 16k window → recent=2, mid=3', () => {
expect(getTiers(8_000)).toEqual({ recent: 2, mid: 3 })
})
test('getTiers: 16k–32k → recent=3, mid=5', () => {
expect(getTiers(20_000)).toEqual({ recent: 3, mid: 5 })
})
test('getTiers: 32k–64k → recent=4, mid=8', () => {
expect(getTiers(48_000)).toEqual({ recent: 4, mid: 8 })
})
test('getTiers: 64k–128k (Copilot gpt-4o) → recent=5, mid=10', () => {
expect(getTiers(100_000)).toEqual({ recent: 5, mid: 10 })
})
test('getTiers: 128k–256k (Copilot Claude) → recent=8, mid=15', () => {
expect(getTiers(200_000)).toEqual({ recent: 8, mid: 15 })
})
test('getTiers: 256k–500k → recent=12, mid=25', () => {
expect(getTiers(400_000)).toEqual({ recent: 12, mid: 25 })
})
test('getTiers: ≥ 500k (gpt-4.1 1M) → recent=25, mid=50', () => {
expect(getTiers(1_000_000)).toEqual({ recent: 25, mid: 50 })
})
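Taken together, these assertions fully determine a window-to-tier mapping. A hypothetical reimplementation consistent with them (the exact boundary comparisons are inferred from the test inputs and names; the removed module may have drawn them differently):

```typescript
type Tiers = { recent: number; mid: number }

// Tier sizes scale with the effective context window: the last `recent`
// tool results stay untouched, the next `mid` get truncated, the rest
// are stubbed.
function getTiers(effectiveWindow: number): Tiers {
  if (effectiveWindow < 16_000) return { recent: 2, mid: 3 }
  if (effectiveWindow < 32_000) return { recent: 3, mid: 5 }
  if (effectiveWindow < 64_000) return { recent: 4, mid: 8 }
  if (effectiveWindow < 128_000) return { recent: 5, mid: 10 }
  if (effectiveWindow < 256_000) return { recent: 8, mid: 15 }
  if (effectiveWindow < 500_000) return { recent: 12, mid: 25 }
  return { recent: 25, mid: 50 }
}
```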
// ---------- master switch ----------
test('pass-through when toolHistoryCompressionEnabled is false', () => {
mockState.enabled = false
const messages = buildConversation(20)
const result = compressToolHistory(messages, 'gpt-4o')
expect(result).toBe(messages) // same reference (no transformation)
})
test('pass-through when total tool_results <= recent tier', () => {
// 100k effective → recent=5; only 4 exchanges → no compression
const messages = buildConversation(4)
const result = compressToolHistory(messages, 'gpt-4o')
expect(result).toBe(messages)
})
// ---------- per-tier behavior ----------
test('recent tier: tool_result content untouched', () => {
// 100k effective → recent=5, mid=10. With 6 exchanges, only the oldest is touched.
const messages = buildConversation(6, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// Last 5 should be untouched (full 5000 chars)
for (let i = resultMsgs.length - 5; i < resultMsgs.length; i++) {
expect(getResultText(resultMsgs[i]).length).toBe(5_000)
}
})
test('mid tier: long content truncated to MID_MAX_CHARS with marker', () => {
// 100k → recent=5, mid=10. 10 exchanges: 5 recent + 5 mid (none old).
const messages = buildConversation(10, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// First 5 are mid tier — should be truncated to ~2000 chars + marker
for (let i = 0; i < 5; i++) {
const text = getResultText(resultMsgs[i])
expect(text).toContain('[…truncated')
expect(text).toContain('chars from tool history]')
// Should be roughly 2000 chars + marker (under 2200)
expect(text.length).toBeLessThan(2_200)
expect(text.length).toBeGreaterThan(2_000)
}
})
test('mid tier: short content (< MID_MAX_CHARS) untouched', () => {
const messages = buildConversation(10, 500) // 500 < MID_MAX_CHARS
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
for (let i = 0; i < 5; i++) {
expect(getResultText(resultMsgs[i])).toBe(bigText(500))
}
})
test('old tier: content replaced with stub [name args={...} → N chars omitted]', () => {
// 100k → recent=5, mid=10, old=rest. 20 exchanges → 5 old + 10 mid + 5 recent.
const messages = buildConversation(20, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// First 5 are old tier — should be stubs
for (let i = 0; i < 5; i++) {
const text = getResultText(resultMsgs[i])
expect(text).toMatch(/^\[Read args=\{.*\} → 5000 chars omitted\]$/)
}
})
test('old tier: stub args truncated to 200 chars', () => {
const longArg = bigText(500)
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [
{
type: 'tool_use',
id: 'toolu_x',
name: 'Bash',
input: { command: longArg },
},
],
},
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'toolu_x', content: 'output' },
],
},
// Pad with enough recent exchanges to push the above into old tier
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
const text = getResultText(resultMsgs[0])
// Stub format: [Bash args=<json≤200chars> → N chars omitted]
// The args portion (between args= and →) must be ≤ 200 chars.
const argsMatch = text.match(/args=(.*?) →/)
expect(argsMatch).not.toBeNull()
expect(argsMatch![1].length).toBeLessThanOrEqual(200)
})
test('old tier: orphan tool_result (no matching tool_use) falls back to "tool"', () => {
const messages: Msg[] = [
{ role: 'user', content: 'start' },
// Orphan: tool_result without matching tool_use in history
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'orphan_id', content: 'data' },
],
},
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
const text = getResultText(resultMsgs[0])
expect(text).toMatch(/^\[tool args=\{\} → 4 chars omitted\]$/)
})
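The old-tier tests above pin an exact stub format: `[<name> args=<json, capped at 200 chars> → <N> chars omitted]`, with `tool` as the fallback name for an orphan tool_result. A hypothetical helper matching those assertions (`buildOldTierStub` and its signature are illustrative, not the removed implementation):

```typescript
// Build the old-tier replacement stub: keep the tool name and a capped
// echo of its arguments, drop the result body entirely.
function buildOldTierStub(
  toolName: string | undefined,
  input: Record<string, unknown>,
  omittedChars: number,
): string {
  // Echo at most 200 chars of the original tool_use arguments.
  const args = JSON.stringify(input).slice(0, 200)
  return `[${toolName ?? 'tool'} args=${args} → ${omittedChars} chars omitted]`
}
```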
// ---------- structural preservation ----------
test('tool_use blocks always preserved', () => {
const messages = buildConversation(20, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const useCount = (msgs: Msg[]) =>
msgs.reduce((sum, m) => {
if (!Array.isArray(m.content)) return sum
return sum + m.content.filter((b: any) => b.type === 'tool_use').length
}, 0)
expect(useCount(result as Msg[])).toBe(useCount(messages))
})
test('text blocks always preserved', () => {
const messages: Msg[] = [
{ role: 'user', content: 'first' },
{
role: 'assistant',
content: [
{ type: 'text', text: 'reasoning before tool' },
{ type: 'tool_use', id: 'toolu_1', name: 'Read', input: {} },
],
},
{
role: 'user',
content: [{ type: 'tool_result', tool_use_id: 'toolu_1', content: bigText(5000) }],
},
...buildConversation(20, 5_000).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const assistantMsg = (result as Msg[])[1]
const textBlock = (assistantMsg.content as Block[]).find((b: any) => b.type === 'text')
expect(textBlock).toEqual({ type: 'text', text: 'reasoning before tool' })
})
test('thinking blocks always preserved', () => {
const messages: Msg[] = [
{ role: 'user', content: 'first' },
{
role: 'assistant',
content: [
{ type: 'thinking', thinking: 'internal reasoning', signature: 'sig' },
{ type: 'tool_use', id: 'toolu_1', name: 'Read', input: {} },
],
},
{
role: 'user',
content: [{ type: 'tool_result', tool_use_id: 'toolu_1', content: bigText(5000) }],
},
...buildConversation(20, 5_000).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const assistantMsg = (result as Msg[])[1]
const thinking = (assistantMsg.content as Block[]).find((b: any) => b.type === 'thinking')
expect(thinking).toEqual({
type: 'thinking',
thinking: 'internal reasoning',
signature: 'sig',
})
})
test('non-array content (string) handled gracefully', () => {
const messages: Msg[] = [
{ role: 'user', content: 'plain string content' },
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
expect((result as Msg[])[0].content).toBe('plain string content')
})
test('empty content array handled gracefully', () => {
const messages: Msg[] = [
{ role: 'user', content: [] },
...buildConversation(20, 100).slice(1),
]
expect(() => compressToolHistory(messages, 'gpt-4o')).not.toThrow()
})
// ---------- message shape compatibility ----------
test('wrapped shape ({ message: { role, content } }) handled', () => {
type WrappedMsg = { message: { role: string; content: Block[] | string } }
const wrap = (m: Msg): WrappedMsg => ({ message: { role: m.role, content: m.content } })
const messages = buildConversation(20, 5_000).map(wrap)
const result = compressToolHistory(messages as any, 'gpt-4o')
// First wrapped tool-result message should have stub content (old tier)
const firstResultMsg = (result as WrappedMsg[]).find(
m =>
Array.isArray(m.message.content) &&
m.message.content.some((b: any) => b.type === 'tool_result'),
)
const block = (firstResultMsg!.message.content as Block[]).find(
(b: any) => b.type === 'tool_result',
) as Block
const text = ((block.content as Block[])[0] as any).text
expect(text).toMatch(/^\[Read args=.*→ 5000 chars omitted\]$/)
})
test('flat shape ({ role, content }) handled', () => {
const messages = buildConversation(20, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
expect(getResultText(resultMsgs[0])).toMatch(/^\[Read args=.*→ 5000 chars omitted\]$/)
})
// ---------- tier boundary correctness ----------
test('tier boundaries: 6 exchanges → 1 mid + 5 recent (recent=5)', () => {
const messages = buildConversation(6, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// Oldest: mid (truncated)
expect(getResultText(resultMsgs[0])).toContain('[…truncated')
// Last 5: untouched
for (let i = 1; i < 6; i++) {
expect(getResultText(resultMsgs[i]).length).toBe(5_000)
}
})
test('tier boundaries: 16 exchanges → 1 old + 10 mid + 5 recent', () => {
const messages = buildConversation(16, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// Oldest 1: stub (old tier)
expect(getResultText(resultMsgs[0])).toMatch(/^\[Read .*chars omitted\]$/)
// Next 10: mid (truncated)
for (let i = 1; i < 11; i++) {
expect(getResultText(resultMsgs[i])).toContain('[…truncated')
}
// Last 5: untouched
for (let i = 11; i < 16; i++) {
expect(getResultText(resultMsgs[i]).length).toBe(5_000)
}
})
test('large window (1M) with 30 exchanges: all untouched (recent=25 ≥ 30 - 5)', () => {
// ≥500k → recent=25, mid=50. 30 exchanges → 5 mid + 25 recent. None old.
mockState.effectiveWindow = 1_000_000
const messages = buildConversation(30, 5_000)
const result = compressToolHistory(messages, 'gpt-4.1')
const resultMsgs = getResultMessages(result)
// Last 25: untouched
for (let i = 5; i < 30; i++) {
expect(getResultText(resultMsgs[i]).length).toBe(5_000)
}
})
// ---------- attribute preservation ----------
test('is_error flag preserved in mid tier', () => {
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [{ type: 'tool_use', id: 'toolu_err', name: 'Bash', input: {} }],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'toolu_err',
is_error: true,
content: bigText(5_000),
},
],
},
// Pad with enough recent exchanges to push the above into MID tier
...buildConversation(10, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
const block = getResultBlock(resultMsgs[0]) as { is_error?: boolean; content: unknown }
expect(block.is_error).toBe(true)
expect(getResultText(resultMsgs[0])).toContain('[…truncated')
})
test('is_error flag preserved in old tier (stub)', () => {
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [{ type: 'tool_use', id: 'toolu_err', name: 'Bash', input: {} }],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'toolu_err',
is_error: true,
content: bigText(5_000),
},
],
},
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
const block = getResultBlock(resultMsgs[0]) as { is_error?: boolean; content: unknown }
expect(block.is_error).toBe(true)
expect(getResultText(resultMsgs[0])).toMatch(/^\[Bash .*chars omitted\]$/)
})
// ---------- COMPACTABLE_TOOLS filter ----------
test('non-compactable tool (e.g. Task/Agent) is NEVER compressed', () => {
// Build conversation where the OLDEST exchange uses a non-compactable tool name
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [
{ type: 'tool_use', id: 'task_1', name: 'Task', input: { goal: 'plan' } },
],
},
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'task_1', content: bigText(5_000) },
],
},
// Pad with 20 compactable exchanges to push Task into old tier
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// First tool_result is for Task (non-compactable) → must remain full
expect(getResultText(resultMsgs[0]).length).toBe(5_000)
expect(getResultText(resultMsgs[0])).not.toContain('chars omitted')
expect(getResultText(resultMsgs[0])).not.toContain('[…truncated')
})
test('mcp__ prefixed tools ARE compactable (matches microCompact behavior)', () => {
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [
{ type: 'tool_use', id: 'mcp_1', name: 'mcp__github__get_issue', input: {} },
],
},
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'mcp_1', content: bigText(5_000) },
],
},
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// MCP tool result is compressed (gets stub since it's in old tier)
expect(getResultText(resultMsgs[0])).toMatch(/^\[mcp__github__get_issue .*chars omitted\]$/)
})
// ---------- skip already-cleared blocks ----------
test('blocks already cleared by microCompact are NOT re-compressed', () => {
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [{ type: 'tool_use', id: 'cleared_1', name: 'Read', input: {} }],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'cleared_1',
content: '[Old tool result content cleared]', // microCompact's marker
},
],
},
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// Already-cleared marker survives untouched (no double processing)
expect(getResultText(resultMsgs[0])).toBe('[Old tool result content cleared]')
})
test('extra block attributes (e.g. cache_control) preserved across rewrites', () => {
const cacheControl = { type: 'ephemeral' }
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [{ type: 'tool_use', id: 'toolu_cc', name: 'Read', input: {} }],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'toolu_cc',
cache_control: cacheControl,
content: bigText(5_000),
},
],
},
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
const block = getResultBlock(resultMsgs[0]) as { cache_control?: unknown }
// The custom attribute survived the stub rewrite via ...block spread
expect(block.cache_control).toEqual(cacheControl)
})

View File

@@ -1,255 +0,0 @@
/**
* Compresses old tool_result content for stateless OpenAI-compatible providers
* (Copilot, Mistral, Ollama). Preserves all conversation structure — tool_use,
* tool_result pairing, text, thinking, and is_error all survive intact. Only
* the BULK text of older tool_results is shrunk to delay context saturation.
*
* Tier sizes scale with the model's effective context window via
* getEffectiveContextWindowSize() — same calculation used by auto-compact, so
* the two systems stay aligned.
*
* Complements (does not replace) microCompact.ts:
* - microCompact: time/cache-based, runs from query.ts, binary clear/keep,
* limited to Claude (cache editing) or idle gaps (time-based).
* - compressToolHistory: size-based, runs at the shim layer, tiered
* compression, covers the gap for active sessions on non-Claude providers.
*
* Reuses isCompactableTool from microCompact to avoid touching tools the
* project already classifies as unsafe to compress (e.g. Task, Agent).
* Skips blocks already cleared by microCompact (TOOL_RESULT_CLEARED_MESSAGE).
*
* Anthropic native bypasses both shims, so it is unaffected by this module.
*/
import { getEffectiveContextWindowSize } from '../compact/autoCompact.js'
import { isCompactableTool } from '../compact/microCompact.js'
import { TOOL_RESULT_CLEARED_MESSAGE } from '../../utils/toolResultStorage.js'
import { getGlobalConfig } from '../../utils/config.js'
// Mid-tier truncation budget. 2k chars ≈ 500 tokens, enough to preserve the
// shape of most tool outputs (file headers, command stderr, top grep hits)
// without ballooning context. Bump too high and the tier loses its purpose.
const MID_MAX_CHARS = 2_000
// Stub args budget. JSON.stringify of a typical tool input fits in 200 chars
// (file paths, short commands, small queries). Long inputs are rare and clamping
// here keeps the stub size bounded even when callers pass oversized arguments.
const STUB_ARGS_MAX_CHARS = 200
type AnyMessage = {
role?: string
message?: { role?: string; content?: unknown }
content?: unknown
}
type ToolResultBlock = {
type: 'tool_result'
tool_use_id?: string
is_error?: boolean
content?: unknown
}
type ToolUseBlock = {
type: 'tool_use'
id?: string
name?: string
input?: unknown
}
type Tiers = { recent: number; mid: number }
// Tier sizes scale with effective window. Targets roughly:
// - recent tier stays under ~25% of available window (full fidelity kept)
// - recent + mid tier stays under ~50% of available window (bounded bulk)
// - everything older collapses to ~15-token stubs
// Values assume ~5KB avg tool_result, which matches the Copilot default case
// (parallel_tool_calls=true means multiple Read/Bash outputs per turn). For
// ≥ 500k models the tiers are so generous that compression is effectively
// inert for any realistic session — see compressToolHistory.test.ts.
export function getTiers(effectiveWindow: number): Tiers {
if (effectiveWindow < 16_000) return { recent: 2, mid: 3 }
if (effectiveWindow < 32_000) return { recent: 3, mid: 5 }
if (effectiveWindow < 64_000) return { recent: 4, mid: 8 }
if (effectiveWindow < 128_000) return { recent: 5, mid: 10 }
if (effectiveWindow < 256_000) return { recent: 8, mid: 15 }
if (effectiveWindow < 500_000) return { recent: 12, mid: 25 }
return { recent: 25, mid: 50 }
}
function extractText(content: unknown): string {
if (typeof content === 'string') return content
if (Array.isArray(content)) {
return content
.filter(
(b: { type?: string; text?: string }) =>
b?.type === 'text' && typeof b.text === 'string',
)
.map((b: { text?: string }) => b.text ?? '')
.join('\n')
}
return ''
}
// Old-tier compression strategy. Replaces content entirely with a one-line
// metadata marker ~10× more token-efficient than a 500-char truncation AND
// unambiguous — partial truncations can look authoritative to the model. The
// stub format encodes tool name + args so the model can re-invoke the same
// tool if it needs the omitted output back.
function buildStub(
block: ToolResultBlock,
toolUsesById: Map<string, ToolUseBlock>,
): ToolResultBlock {
const original = extractText(block.content)
const toolUse = toolUsesById.get(block.tool_use_id ?? '')
const name = toolUse?.name ?? 'tool'
const args = toolUse?.input
? JSON.stringify(toolUse.input).slice(0, STUB_ARGS_MAX_CHARS)
: '{}'
return {
...block,
content: [
{
type: 'text',
text: `[${name} args=${args} → ${original.length} chars omitted]`,
},
],
}
}
// Mid-tier compression. The trailing marker is load-bearing: without it, the
// model can't distinguish "tool returned 2000 chars" from "tool returned 20k
// chars that we cut to 2000". Distinguishing those matters for the model's
// decision to re-invoke the tool.
function truncateBlock(
block: ToolResultBlock,
maxChars: number,
): ToolResultBlock {
const text = extractText(block.content)
if (text.length <= maxChars) return block
const omitted = text.length - maxChars
return {
...block,
content: [
{
type: 'text',
text: `${text.slice(0, maxChars)}\n[…truncated ${omitted} chars from tool history]`,
},
],
}
}
function getInner(msg: AnyMessage): { role?: string; content?: unknown } {
return (msg.message ?? msg) as { role?: string; content?: unknown }
}
function indexToolUses(messages: AnyMessage[]): Map<string, ToolUseBlock> {
const map = new Map<string, ToolUseBlock>()
for (const msg of messages) {
const content = getInner(msg).content
if (!Array.isArray(content)) continue
for (const b of content as Array<{ type?: string; id?: string }>) {
if (b?.type === 'tool_use' && b.id) {
map.set(b.id, b as ToolUseBlock)
}
}
}
return map
}
function indexToolResultMessages(messages: AnyMessage[]): number[] {
const indices: number[] = []
for (let i = 0; i < messages.length; i++) {
const inner = getInner(messages[i])
const role = inner.role ?? messages[i].role
const content = inner.content
if (
role === 'user' &&
Array.isArray(content) &&
content.some((b: { type?: string }) => b?.type === 'tool_result')
) {
indices.push(i)
}
}
return indices
}
function rewriteMessage<T extends AnyMessage>(
msg: T,
newContent: unknown[],
): T {
if (msg.message) {
return { ...msg, message: { ...msg.message, content: newContent } }
}
return { ...msg, content: newContent }
}
// microCompact.maybeTimeBasedMicrocompact may have already replaced old
// tool_result content with TOOL_RESULT_CLEARED_MESSAGE before we see it.
// Re-compressing produces a stub over a marker (e.g. `[Read args={} → 40
// chars omitted]`), wasteful and less informative than the canonical marker.
function isAlreadyCleared(block: ToolResultBlock): boolean {
const text = extractText(block.content)
return text === TOOL_RESULT_CLEARED_MESSAGE
}
function shouldCompressBlock(
block: ToolResultBlock,
toolUsesById: Map<string, ToolUseBlock>,
): boolean {
if (isAlreadyCleared(block)) return false
const toolUse = toolUsesById.get(block.tool_use_id ?? '')
// Unknown tool name (orphan tool_result with no matching tool_use) falls
// through to compression with a generic "tool" stub. Safer default: the
// original tool_use vanished so there's no downstream use for the output.
if (!toolUse?.name) return true
// Respect microCompact's curated safe-to-compress set (Read/Bash/Grep/…/
// mcp__*) so user-facing flow tools (Task, Agent, custom) stay intact.
return isCompactableTool(toolUse.name)
}
export function compressToolHistory<T extends AnyMessage>(
messages: T[],
model: string,
): T[] {
// Master kill-switch. Returns the original reference so callers skip a
// defensive copy when the feature is disabled.
if (!getGlobalConfig().toolHistoryCompressionEnabled) return messages
const tiers = getTiers(getEffectiveContextWindowSize(model))
const toolResultIndices = indexToolResultMessages(messages)
const total = toolResultIndices.length
// If every tool-result fits in the recent tier, no boundary crosses; return
// the same reference for the same copy-elision reason.
if (total <= tiers.recent) return messages
// O(1) lookup: messageIndex → tool-result position (0 = oldest). Replaces
// the naive Array.indexOf(i) that was O(n²) across the .map below.
const positionByIndex = new Map<number, number>()
for (let pos = 0; pos < toolResultIndices.length; pos++) {
positionByIndex.set(toolResultIndices[pos], pos)
}
const toolUsesById = indexToolUses(messages)
return messages.map((msg, i) => {
const pos = positionByIndex.get(i)
if (pos === undefined) return msg
const fromEnd = total - 1 - pos
if (fromEnd < tiers.recent) return msg
const inMidWindow = fromEnd < tiers.recent + tiers.mid
const content = getInner(msg).content as unknown[]
const newContent = content.map(block => {
const b = block as { type?: string }
if (b?.type !== 'tool_result') return block
const tr = block as ToolResultBlock
if (!shouldCompressBlock(tr, toolUsesById)) return block
return inMidWindow
? truncateBlock(tr, MID_MAX_CHARS)
: buildStub(tr, toolUsesById)
})
return rewriteMessage(msg, newContent)
})
}
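The tier assignment above boils down to one piece of arithmetic: a tool result's distance from the end of the conversation decides whether it stays full, gets truncated, or collapses to a stub. A minimal standalone sketch (not repo code; `tierOf` is a hypothetical helper, and the thresholds are copied from `getTiers` above) that reproduces the 15/10/5 split the tests assert for a 100k window with 30 exchanges:

```typescript
// Sketch only: reproduces the tier math from compressToolHistory.
// Thresholds copied from getTiers; tierOf is a hypothetical helper.
type Tiers = { recent: number; mid: number }

function getTiers(effectiveWindow: number): Tiers {
  if (effectiveWindow < 16_000) return { recent: 2, mid: 3 }
  if (effectiveWindow < 32_000) return { recent: 3, mid: 5 }
  if (effectiveWindow < 64_000) return { recent: 4, mid: 8 }
  if (effectiveWindow < 128_000) return { recent: 5, mid: 10 }
  if (effectiveWindow < 256_000) return { recent: 8, mid: 15 }
  if (effectiveWindow < 500_000) return { recent: 12, mid: 25 }
  return { recent: 25, mid: 50 }
}

// Classify tool-result position `pos` (0 = oldest) among `total` results.
function tierOf(pos: number, total: number, tiers: Tiers): 'recent' | 'mid' | 'old' {
  const fromEnd = total - 1 - pos
  if (fromEnd < tiers.recent) return 'recent'
  if (fromEnd < tiers.recent + tiers.mid) return 'mid'
  return 'old'
}

// 30 results on a 100k window (recent=5, mid=10):
const tiers = getTiers(100_000)
const labels = Array.from({ length: 30 }, (_, pos) => tierOf(pos, 30, tiers))
console.log(labels.filter(l => l === 'old').length)    // 15
console.log(labels.filter(l => l === 'mid').length)    // 10
console.log(labels.filter(l => l === 'recent').length) // 5
```

This matches the boundary tests above: positions 0..14 land in the old tier (stubs), 15..24 in the mid tier (truncated), and 25..29 stay untouched.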

View File

@@ -1,317 +0,0 @@
import { afterEach, beforeEach, expect, mock, test } from 'bun:test'
import { createOpenAIShimClient } from './openaiShim.js'
type FetchType = typeof globalThis.fetch
const originalFetch = globalThis.fetch
const originalEnv = {
OPENAI_BASE_URL: process.env.OPENAI_BASE_URL,
OPENAI_API_KEY: process.env.OPENAI_API_KEY,
OPENAI_MODEL: process.env.OPENAI_MODEL,
}
// Mock config + autoCompact so the shim sees deterministic state.
const mockState = {
enabled: true,
effectiveWindow: 100_000, // Copilot gpt-4o tier
}
mock.module('../../utils/config.js', () => ({
getGlobalConfig: () => ({
toolHistoryCompressionEnabled: mockState.enabled,
autoCompactEnabled: false,
}),
}))
mock.module('../compact/autoCompact.js', () => ({
getEffectiveContextWindowSize: () => mockState.effectiveWindow,
}))
type OpenAIShimClient = {
beta: {
messages: {
create: (
params: Record<string, unknown>,
options?: Record<string, unknown>,
) => Promise<unknown>
}
}
}
function bigText(n: number): string {
return 'A'.repeat(n)
}
function buildToolExchange(id: number, resultLength: number) {
return [
{
role: 'assistant',
content: [
{
type: 'tool_use',
id: `toolu_${id}`,
name: 'Read',
input: { file_path: `/path/to/file${id}.ts` },
},
],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: `toolu_${id}`,
content: bigText(resultLength),
},
],
},
]
}
function buildLongConversation(numExchanges: number, resultLength = 5_000) {
const out: Array<{ role: string; content: unknown }> = [
{ role: 'user', content: 'start the work' },
]
for (let i = 0; i < numExchanges; i++) {
out.push(...buildToolExchange(i, resultLength))
}
return out
}
function makeFakeResponse(): Response {
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'gpt-4o',
choices: [
{
message: { role: 'assistant', content: 'done' },
finish_reason: 'stop',
},
],
usage: { prompt_tokens: 8, completion_tokens: 2, total_tokens: 10 },
}),
{ headers: { 'Content-Type': 'application/json' } },
)
}
beforeEach(() => {
process.env.OPENAI_BASE_URL = 'http://example.test/v1'
process.env.OPENAI_API_KEY = 'test-key'
delete process.env.OPENAI_MODEL
mockState.enabled = true
mockState.effectiveWindow = 100_000
})
afterEach(() => {
if (originalEnv.OPENAI_BASE_URL === undefined) delete process.env.OPENAI_BASE_URL
else process.env.OPENAI_BASE_URL = originalEnv.OPENAI_BASE_URL
if (originalEnv.OPENAI_API_KEY === undefined) delete process.env.OPENAI_API_KEY
else process.env.OPENAI_API_KEY = originalEnv.OPENAI_API_KEY
if (originalEnv.OPENAI_MODEL === undefined) delete process.env.OPENAI_MODEL
else process.env.OPENAI_MODEL = originalEnv.OPENAI_MODEL
globalThis.fetch = originalFetch
})
async function captureRequestBody(
messages: Array<{ role: string; content: unknown }>,
model: string,
): Promise<Record<string, unknown>> {
let captured: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
captured = JSON.parse(String(init?.body))
return makeFakeResponse()
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model,
system: 'system prompt',
messages,
})
if (!captured) throw new Error('request not captured')
return captured
}
function getToolMessages(body: Record<string, unknown>): Array<{ content: string }> {
const messages = body.messages as Array<{ role: string; content: string }>
return messages.filter(m => m.role === 'tool')
}
function getAssistantToolCalls(body: Record<string, unknown>): unknown[] {
const messages = body.messages as Array<{
role: string
tool_calls?: unknown[]
}>
return messages
.filter(m => m.role === 'assistant' && Array.isArray(m.tool_calls))
.flatMap(m => m.tool_calls ?? [])
}
// ============================================================================
// BUG REPRO: without compression, full tool history is resent every turn
// ============================================================================
test('BUG REPRO: without compression, all 30 tool results are sent at full size', async () => {
mockState.enabled = false
const messages = buildLongConversation(30, 5_000)
const body = await captureRequestBody(messages, 'gpt-4o')
const toolMessages = getToolMessages(body)
const payloadSize = JSON.stringify(body).length
// All 30 tool results present, none truncated
expect(toolMessages.length).toBe(30)
for (const m of toolMessages) {
expect(m.content.length).toBeGreaterThanOrEqual(5_000)
expect(m.content).not.toContain('[…truncated')
expect(m.content).not.toContain('chars omitted')
}
// Total payload is large (~150KB raw) — this is the cost being paid every turn
expect(payloadSize).toBeGreaterThan(150_000)
})
// ============================================================================
// FIX: with compression, recent kept full, mid truncated, old stubbed
// ============================================================================
test('FIX: with compression on Copilot gpt-4o (tier 5/10/rest), 30 turns shrinks dramatically', async () => {
mockState.enabled = true
mockState.effectiveWindow = 100_000 // 64k-128k → recent=5, mid=10
const messages = buildLongConversation(30, 5_000)
const body = await captureRequestBody(messages, 'gpt-4o')
const toolMessages = getToolMessages(body)
const payloadSize = JSON.stringify(body).length
// Structure preserved: still 30 tool messages, no orphan tool_calls
expect(toolMessages.length).toBe(30)
expect(getAssistantToolCalls(body).length).toBe(30)
// Tier breakdown (oldest → newest):
// indices 0..14 → old tier (stubs)
// indices 15..24 → mid tier (truncated)
// indices 25..29 → recent (full)
for (let i = 0; i <= 14; i++) {
expect(toolMessages[i].content).toMatch(/^\[Read args=.*chars omitted\]$/)
}
for (let i = 15; i <= 24; i++) {
expect(toolMessages[i].content).toContain('[…truncated')
}
for (let i = 25; i <= 29; i++) {
expect(toolMessages[i].content.length).toBe(5_000)
expect(toolMessages[i].content).not.toContain('[…truncated')
expect(toolMessages[i].content).not.toContain('chars omitted')
}
// Significant reduction: from ~150KB to <60KB (10 mid×2KB + structure overhead)
expect(payloadSize).toBeLessThan(60_000)
})
// ============================================================================
// FIX: large-context model gets generous tiers — compression effectively inert
// ============================================================================
test('FIX: gpt-4.1 (1M context) with 25 exchanges keeps all full (recent tier=25)', async () => {
mockState.enabled = true
mockState.effectiveWindow = 1_000_000 // ≥500k → recent=25, mid=50
const messages = buildLongConversation(25, 5_000)
const body = await captureRequestBody(messages, 'gpt-4.1')
const toolMessages = getToolMessages(body)
expect(toolMessages.length).toBe(25)
for (const m of toolMessages) {
expect(m.content.length).toBe(5_000)
expect(m.content).not.toContain('[…truncated')
expect(m.content).not.toContain('chars omitted')
}
})
test('FIX: gpt-4.1 (1M context) with 30 exchanges → only first 5 mid-truncated', async () => {
mockState.enabled = true
mockState.effectiveWindow = 1_000_000 // recent=25, mid=50
const messages = buildLongConversation(30, 5_000)
const body = await captureRequestBody(messages, 'gpt-4.1')
const toolMessages = getToolMessages(body)
// 30 total: indices 0..4 mid, indices 5..29 recent
for (let i = 0; i < 5; i++) {
expect(toolMessages[i].content).toContain('[…truncated')
}
for (let i = 5; i < 30; i++) {
expect(toolMessages[i].content.length).toBe(5_000)
}
})
// ============================================================================
// FIX: stub preserves tool name and args — model can re-invoke if needed
// ============================================================================
test('FIX: stub format includes original tool name and arguments', async () => {
mockState.enabled = true
mockState.effectiveWindow = 100_000
const messages = buildLongConversation(30, 5_000)
const body = await captureRequestBody(messages, 'gpt-4o')
const toolMessages = getToolMessages(body)
const oldestStub = toolMessages[0].content
// Format: [<tool_name> args=<json> → <N> chars omitted]
expect(oldestStub).toMatch(/^\[Read /)
expect(oldestStub).toMatch(/file_path/)
expect(oldestStub).toMatch(/→ 5000 chars omitted\]$/)
})
// ============================================================================
// FIX: tool_use blocks (assistant tool_calls) are never modified
// ============================================================================
test('FIX: every tool_call retains its full id, name, and arguments', async () => {
mockState.enabled = true
mockState.effectiveWindow = 100_000
const messages = buildLongConversation(30, 5_000)
const body = await captureRequestBody(messages, 'gpt-4o')
const toolCalls = getAssistantToolCalls(body) as Array<{
id: string
function: { name: string; arguments: string }
}>
expect(toolCalls.length).toBe(30)
for (let i = 0; i < toolCalls.length; i++) {
expect(toolCalls[i].id).toBe(`toolu_${i}`)
expect(toolCalls[i].function.name).toBe('Read')
expect(JSON.parse(toolCalls[i].function.arguments)).toEqual({
file_path: `/path/to/file${i}.ts`,
})
}
})
// ============================================================================
// FIX: small-context provider (Mistral 32k) gets aggressive compression
// ============================================================================
test('FIX: 32k window (Mistral tier) → recent=3 keeps last 3 only', async () => {
mockState.enabled = true
mockState.effectiveWindow = 24_000 // 16k-32k → recent=3, mid=5
const messages = buildLongConversation(15, 3_000)
const body = await captureRequestBody(messages, 'mistral-large-latest')
const toolMessages = getToolMessages(body)
// 15 total: indices 0..6 old, 7..11 mid, 12..14 recent
for (let i = 0; i <= 6; i++) {
expect(toolMessages[i].content).toContain('chars omitted')
}
for (let i = 7; i <= 11; i++) {
expect(toolMessages[i].content).toContain('[…truncated')
}
for (let i = 12; i <= 14; i++) {
expect(toolMessages[i].content.length).toBe(3_000)
}
})
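The size bounds asserted in the BUG REPRO and FIX tests follow from simple arithmetic over the tier sizes. A back-of-envelope sketch (illustration only, not repo code; `STUB_CHARS` is a rough estimate of the one-line stub, everything else comes from the test fixtures and `MID_MAX_CHARS`):

```typescript
// Back-of-envelope check for the payload bounds asserted above:
// 30 tool results of 5,000 chars each on a 100k window (recent=5, mid=10).
const EXCHANGES = 30
const RESULT_CHARS = 5_000
const MID_MAX_CHARS = 2_000 // mid-tier truncation budget from compressToolHistory
const STUB_CHARS = 60       // rough size of a "[Read args=… → N chars omitted]" stub

// Uncompressed: every result resent in full, before JSON/structure overhead.
const uncompressed = EXCHANGES * RESULT_CHARS
console.log(uncompressed) // 150000

// Compressed: 15 old-tier stubs + 10 mid-tier truncations + 5 full recent results.
const compressed = 15 * STUB_CHARS + 10 * MID_MAX_CHARS + 5 * RESULT_CHARS
console.log(compressed) // 45900
```

The raw content alone accounts for the >150KB bound in the BUG REPRO test, and the ~46KB compressed estimate leaves room under the <60KB FIX-test bound for message structure and the uncompacted system prompt.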

View File

@@ -3216,516 +3216,4 @@ test('preserves valid tool_result and drops orphan tool_result', async () => {
const orphanMessage = toolMessages.find(m => m.tool_call_id === 'orphan_call_2')
expect(orphanMessage).toBeUndefined()
// Actually, the semantic message IS injected here because the user block with orphan
// tool result is converted to:
// 1. Tool result (valid_call_1) -> role 'tool'
// 2. User content ("What happened?") -> role 'user'
// This triggers the tool -> assistant injection.
const assistantMessages = messages.filter(m => m.role === 'assistant')
expect(assistantMessages.some(m => m.content === '[Tool execution interrupted by user]')).toBe(true)
})
test('drops empty assistant message when only thinking block was present and stripped', async () => {
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(JSON.stringify({
id: 'chatcmpl-1',
object: 'chat.completion',
created: 123456789,
model: 'mistral-large-latest',
choices: [{ message: { role: 'assistant', content: 'hi' }, finish_reason: 'stop' }],
usage: { prompt_tokens: 1, completion_tokens: 1, total_tokens: 2 }
}), { headers: { 'Content-Type': 'application/json' } })
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'mistral-large-latest',
messages: [
{ role: 'user', content: 'Initial' },
{ role: 'assistant', content: [{ type: 'thinking', thinking: 'I am thinking...', signature: 'sig' }] },
{ role: 'user', content: 'Interrupting query' },
],
max_tokens: 64,
stream: false,
})
const messages = requestBody?.messages as Array<Record<string, unknown>>
// The assistant msg is dropped because thinking is stripped.
// The two user messages are coalesced.
expect(messages.length).toBe(1)
expect(messages[0].role).toBe('user')
expect(String(messages[0].content)).toContain('Initial')
expect(String(messages[0].content)).toContain('Interrupting query')
})
test('injects semantic assistant message when tool result is followed by user message', async () => {
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(JSON.stringify({
id: 'chatcmpl-2',
object: 'chat.completion',
created: 123456789,
model: 'mistral-large-latest',
choices: [{ message: { role: 'assistant', content: 'hi' }, finish_reason: 'stop' }],
usage: { prompt_tokens: 1, completion_tokens: 1, total_tokens: 2 }
}), { headers: { 'Content-Type': 'application/json' } })
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'mistral-large-latest',
messages: [
{
role: 'assistant',
content: [{ type: 'tool_use', id: 'call_1', name: 'search', input: {} }]
},
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'call_1', content: 'Result' }
]
},
{ role: 'user', content: 'Next user query' },
],
max_tokens: 64,
stream: false,
})
const messages = requestBody?.messages as Array<Record<string, unknown>>
// Roles should be: assistant (tool_calls) -> tool -> assistant (semantic) -> user
const roles = messages.map(m => m.role)
expect(roles).toEqual(['assistant', 'tool', 'assistant', 'user'])
const semanticMsg = messages[2]
expect(semanticMsg.role).toBe('assistant')
expect(semanticMsg.content).toBe('[Tool execution interrupted by user]')
})
test('Moonshot: uses max_tokens (not max_completion_tokens) and strips store', async () => {
process.env.OPENAI_BASE_URL = 'https://api.moonshot.ai/v1'
process.env.OPENAI_API_KEY = 'sk-moonshot-test'
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'kimi-k2.6',
choices: [
{ message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
],
usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
}),
{ headers: { 'Content-Type': 'application/json' } },
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'kimi-k2.6',
system: 'you are kimi',
messages: [{ role: 'user', content: 'hi' }],
max_tokens: 256,
stream: false,
})
expect(requestBody?.max_tokens).toBe(256)
expect(requestBody?.max_completion_tokens).toBeUndefined()
expect(requestBody?.store).toBeUndefined()
})
test('Moonshot: echoes reasoning_content on assistant tool-call messages', async () => {
// Regression for: "API Error: 400 {"error":{"message":"thinking is enabled
// but reasoning_content is missing in assistant tool call message at index
// N"}}" when the agent sends a prior-turn assistant response back to Kimi.
// The thinking block captured from the inbound response must round-trip
// as reasoning_content on the outgoing echoed assistant message.
process.env.OPENAI_BASE_URL = 'https://api.moonshot.ai/v1'
process.env.OPENAI_API_KEY = 'sk-moonshot-test'
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'kimi-k2.6',
choices: [
{ message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
],
usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
}),
{ headers: { 'Content-Type': 'application/json' } },
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'kimi-k2.6',
system: 'you are kimi',
messages: [
{ role: 'user', content: 'check the logs' },
{
role: 'assistant',
content: [
{
type: 'thinking',
thinking: 'Need to inspect logs via Bash; running a cat.',
},
{ type: 'text', text: "I'll inspect the logs." },
{
type: 'tool_use',
id: 'call_bash_1',
name: 'Bash',
input: { command: 'cat /tmp/app.log' },
},
],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'call_bash_1',
content: 'log line 1\nlog line 2',
},
],
},
],
max_tokens: 256,
stream: false,
})
const messages = requestBody?.messages as Array<Record<string, unknown>>
const assistantWithToolCall = messages.find(
m => m.role === 'assistant' && Array.isArray(m.tool_calls),
)
expect(assistantWithToolCall).toBeDefined()
expect(assistantWithToolCall?.reasoning_content).toBe(
'Need to inspect logs via Bash; running a cat.',
)
})
test('non-Moonshot providers do NOT receive reasoning_content on assistant messages', async () => {
// Guard: only Moonshot opts in. DeepSeek/OpenRouter/etc. receive the
// outgoing assistant message without reasoning_content to avoid
// unknown-field rejections from strict servers.
process.env.OPENAI_BASE_URL = 'https://api.deepseek.com/v1'
process.env.OPENAI_API_KEY = 'sk-deepseek'
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'deepseek-chat',
choices: [
{ message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
],
usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
}),
{ headers: { 'Content-Type': 'application/json' } },
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'deepseek-chat',
system: 'test',
messages: [
{ role: 'user', content: 'hi' },
{
role: 'assistant',
content: [
{ type: 'thinking', thinking: 'thought' },
{ type: 'text', text: 'hello' },
{
type: 'tool_use',
id: 'call_1',
name: 'Bash',
input: { command: 'ls' },
},
],
},
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'call_1', content: 'files' },
],
},
],
max_tokens: 32,
stream: false,
})
const messages = requestBody?.messages as Array<Record<string, unknown>>
const assistantWithToolCall = messages.find(
m => m.role === 'assistant' && Array.isArray(m.tool_calls),
)
expect(assistantWithToolCall).toBeDefined()
expect(assistantWithToolCall?.reasoning_content).toBeUndefined()
})
test('Moonshot: cn host is also detected', async () => {
process.env.OPENAI_BASE_URL = 'https://api.moonshot.cn/v1'
process.env.OPENAI_API_KEY = 'sk-moonshot-test'
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'kimi-k2.6',
choices: [
{ message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
],
usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
}),
{ headers: { 'Content-Type': 'application/json' } },
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'kimi-k2.6',
system: 'you are kimi',
messages: [{ role: 'user', content: 'hi' }],
max_tokens: 256,
stream: false,
})
expect(requestBody?.store).toBeUndefined()
})
test('collapses multiple text blocks in tool_result to string for DeepSeek compatibility (issue #774)', async () => {
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'deepseek-reasoner',
choices: [
{
message: {
role: 'assistant',
content: 'done',
},
finish_reason: 'stop',
},
],
usage: {
prompt_tokens: 12,
completion_tokens: 4,
total_tokens: 16,
},
}),
{
headers: {
'Content-Type': 'application/json',
},
},
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'deepseek-reasoner',
system: 'test system',
messages: [
{ role: 'user', content: 'Run ls' },
{
role: 'assistant',
content: [
{
type: 'tool_use',
id: 'call_1',
name: 'Bash',
input: { command: 'ls' },
},
],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'call_1',
content: [
{ type: 'text', text: 'line one' },
{ type: 'text', text: 'line two' },
],
},
],
},
],
max_tokens: 64,
stream: false,
})
const messages = requestBody?.messages as Array<Record<string, unknown>>
const toolMessages = messages.filter(m => m.role === 'tool')
expect(toolMessages.length).toBe(1)
expect(toolMessages[0].tool_call_id).toBe('call_1')
expect(typeof toolMessages[0].content).toBe('string')
expect(toolMessages[0].content).toBe('line one\n\nline two')
})
test('collapses multiple text blocks into a single string for DeepSeek compatibility (issue #774)', async () => {
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'deepseek-reasoner',
choices: [
{
message: {
role: 'assistant',
content: 'done',
},
finish_reason: 'stop',
},
],
usage: {
prompt_tokens: 12,
completion_tokens: 4,
total_tokens: 16,
},
}),
{
headers: {
'Content-Type': 'application/json',
},
},
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'deepseek-reasoner',
system: 'test system',
messages: [
{
role: 'user',
content: [
{ type: 'text', text: 'Hello!' },
{ type: 'text', text: 'How are you?' },
],
},
],
max_tokens: 64,
stream: false,
})
const messages = requestBody?.messages as Array<Record<string, unknown>>
expect(messages.length).toBe(2) // system + user
expect(messages[1].role).toBe('user')
expect(typeof messages[1].content).toBe('string')
expect(messages[1].content).toBe('Hello!\n\nHow are you?')
})
test('preserves mixed text and image tool results as multipart content', async () => {
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'gpt-4o',
choices: [
{
message: {
role: 'assistant',
content: 'done',
},
finish_reason: 'stop',
},
],
usage: {
prompt_tokens: 12,
completion_tokens: 4,
total_tokens: 16,
},
}),
{
headers: {
'Content-Type': 'application/json',
},
},
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'gpt-4o',
system: 'test system',
messages: [
{ role: 'user', content: 'Show me' },
{
role: 'assistant',
content: [
{
type: 'tool_use',
id: 'call_1',
name: 'Bash',
input: { command: 'cat image.png' },
},
],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'call_1',
content: [
{ type: 'text', text: 'Here is the image:' },
{
type: 'image',
source: {
type: 'base64',
media_type: 'image/png',
data: 'iVBORw0KGgo=',
},
},
],
},
],
},
],
max_tokens: 64,
stream: false,
})
const messages = requestBody?.messages as Array<Record<string, unknown>>
const toolMessages = messages.filter(m => m.role === 'tool')
expect(toolMessages.length).toBe(1)
expect(Array.isArray(toolMessages[0].content)).toBe(true)
const content = toolMessages[0].content as Array<Record<string, unknown>>
expect(content.length).toBe(2)
expect(content[0].type).toBe('text')
expect(content[1].type).toBe('image_url')
}) })
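The Moonshot-specific behavior exercised by the tests above can be sketched standalone. This is a hedged illustration, not the shim's actual code: `isMoonshotBaseUrl` mirrors the helper visible in the diff for this file, while `reasoningContentFor` is a hypothetical name for the per-provider gating that echoes the captured thinking block back as `reasoning_content` only for Moonshot hosts.

```typescript
// Host allow-list, mirroring the MOONSHOT_API_HOSTS set in the shim diff.
const MOONSHOT_API_HOSTS = new Set(['api.moonshot.ai', 'api.moonshot.cn'])

function isMoonshotBaseUrl(baseUrl: string | undefined): boolean {
  if (!baseUrl) return false
  try {
    return MOONSHOT_API_HOSTS.has(new URL(baseUrl).hostname.toLowerCase())
  } catch {
    // Malformed URLs never match rather than throwing.
    return false
  }
}

interface AssistantBlock {
  type: string
  thinking?: string
}

// Hypothetical helper name: only Moonshot opts in, because other providers
// either ignore the extra field or strict-reject unknown fields.
function reasoningContentFor(
  baseUrl: string,
  blocks: AssistantBlock[],
): string | undefined {
  if (!isMoonshotBaseUrl(baseUrl)) return undefined
  const thinking = blocks.find(b => b.type === 'thinking')?.thinking
  return thinking && thinking.trim().length > 0 ? thinking : undefined
}
```

Keying the gate on the hostname (rather than a substring of the URL) avoids false positives from paths or query strings that merely contain "moonshot".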


@@ -46,7 +46,6 @@ import {
   type AnthropicUsage,
   type ShimCreateParams,
 } from './codexShim.js'
-import { compressToolHistory } from './compressToolHistory.js'
 import { fetchWithProxyRetry } from './fetchWithProxyRetry.js'
 import {
   getLocalProviderRetryBaseUrls,
@@ -67,8 +66,6 @@ import {
   normalizeToolArguments,
   hasToolFieldMapping,
 } from './toolArgumentNormalization.js'
-import { logApiCallStart, logApiCallEnd } from '../../utils/requestLogging.js'
-import { createStreamState, processStreamChunk, getStreamStats } from '../../utils/streamingOptimizer.js'
 
 type SecretValueSource = Partial<{
   OPENAI_API_KEY: string
@@ -84,10 +81,6 @@ const GITHUB_429_MAX_RETRIES = 3
 const GITHUB_429_BASE_DELAY_SEC = 1
 const GITHUB_429_MAX_DELAY_SEC = 32
 const GEMINI_API_HOST = 'generativelanguage.googleapis.com'
-const MOONSHOT_API_HOSTS = new Set([
-  'api.moonshot.ai',
-  'api.moonshot.cn',
-])
 
 const COPILOT_HEADERS: Record<string, string> = {
   'User-Agent': 'GitHubCopilotChat/0.26.7',
@@ -153,15 +146,6 @@ function hasGeminiApiHost(baseUrl: string | undefined): boolean {
   }
 }
 
-function isMoonshotBaseUrl(baseUrl: string | undefined): boolean {
-  if (!baseUrl) return false
-  try {
-    return MOONSHOT_API_HOSTS.has(new URL(baseUrl).hostname.toLowerCase())
-  } catch {
-    return false
-  }
-}
-
 function formatRetryAfterHint(response: Response): string {
   const ra = response.headers.get('retry-after')
   return ra ? ` (Retry-After: ${ra})` : ''
@@ -218,14 +202,6 @@ interface OpenAIMessage {
   }>
   tool_call_id?: string
   name?: string
-  /**
-   * Per-assistant-message chain-of-thought, attached when echoing an
-   * assistant message back to providers that require it (notably Moonshot:
-   * "thinking is enabled but reasoning_content is missing in assistant
-   * tool call message at index N" 400). Derived from the Anthropic thinking
-   * block captured when the original response was translated.
-   */
-  reasoning_content?: string
 }
 
 interface OpenAITool {
@@ -301,15 +277,6 @@ function convertToolResultContent(
     const text = parts[0].text ?? ''
     return isError ? `Error: ${text}` : text
   }
-
-  // Collapse arrays of only text blocks into a single string for DeepSeek
-  // compatibility (issue #774). DeepSeek rejects arrays in role: "tool" messages.
-  const allText = parts.every(p => p.type === 'text')
-  if (allText) {
-    const text = parts.map(p => p.text ?? '').join('\n\n')
-    return isError ? `Error: ${text}` : text
-  }
-
   if (isError && parts[0]?.type === 'text') {
     parts[0] = { ...parts[0], text: `Error: ${parts[0].text ?? ''}` }
   } else if (isError) {
@@ -368,14 +335,6 @@ function convertContentBlocks(
   if (parts.length === 0) return ''
   if (parts.length === 1 && parts[0].type === 'text') return parts[0].text ?? ''
-
-  // Collapse arrays of only text blocks into a single string for DeepSeek
-  // compatibility (issue #774).
-  const allText = parts.every(p => p.type === 'text')
-  if (allText) {
-    return parts.map(p => p.text ?? '').join('\n\n')
-  }
-
   return parts
 }
@@ -387,45 +346,19 @@ function isGeminiMode(): boolean {
 }
 
 function convertMessages(
-  messages: Array<{
-    role: string
-    message?: { role?: string; content?: unknown }
-    content?: unknown
-  }>,
+  messages: Array<{ role: string; message?: { role?: string; content?: unknown }; content?: unknown }>,
   system: unknown,
-  options?: { preserveReasoningContent?: boolean },
 ): OpenAIMessage[] {
-  const preserveReasoningContent = options?.preserveReasoningContent === true
   const result: OpenAIMessage[] = []
   const knownToolCallIds = new Set<string>()
-
-  // Pre-scan for all tool results in the history to identify valid tool calls
-  const toolResultIds = new Set<string>()
-  for (const msg of messages) {
-    const inner = msg.message ?? msg
-    const content = (inner as { content?: unknown }).content
-    if (Array.isArray(content)) {
-      for (const block of content) {
-        if (
-          (block as { type?: string }).type === 'tool_result' &&
-          (block as { tool_use_id?: string }).tool_use_id
-        ) {
-          toolResultIds.add((block as { tool_use_id: string }).tool_use_id)
-        }
-      }
-    }
-  }
 
   // System message first
   const sysText = convertSystemPrompt(system)
   if (sysText) {
     result.push({ role: 'system', content: sysText })
   }
 
-  for (let i = 0; i < messages.length; i++) {
-    const msg = messages[i]
-    const isLastInHistory = i === messages.length - 1
+  for (const msg of messages) {
     // Claude Code wraps messages in { role, message: { role, content } }
     const inner = msg.message ?? msg
     const role = (inner as { role?: string }).role ?? msg.role
@@ -434,12 +367,8 @@ function convertMessages(
     if (role === 'user') {
       // Check for tool_result blocks in user messages
       if (Array.isArray(content)) {
-        const toolResults = content.filter(
-          (b: { type?: string }) => b.type === 'tool_result',
-        )
-        const otherContent = content.filter(
-          (b: { type?: string }) => b.type !== 'tool_result',
-        )
+        const toolResults = content.filter((b: { type?: string }) => b.type === 'tool_result')
+        const otherContent = content.filter((b: { type?: string }) => b.type !== 'tool_result')
 
         // Emit tool results as tool messages, but ONLY if we have a matching tool_use ID.
         // Mistral/OpenAI strictly require tool messages to follow an assistant message with tool_calls.
@@ -454,9 +383,7 @@ function convertMessages(
               content: convertToolResultContent(tr.content, tr.is_error),
             })
           } else {
-            logForDebugging(
-              `Dropping orphan tool_result for ID: ${id} to prevent API error`,
-            )
+            logForDebugging(`Dropping orphan tool_result for ID: ${id} to prevent API error`)
           }
         }
@@ -476,12 +403,8 @@ function convertMessages(
     } else if (role === 'assistant') {
       // Check for tool_use blocks
       if (Array.isArray(content)) {
-        const toolUses = content.filter(
-          (b: { type?: string }) => b.type === 'tool_use',
-        )
-        const thinkingBlock = content.find(
-          (b: { type?: string }) => b.type === 'thinking',
-        )
+        const toolUses = content.filter((b: { type?: string }) => b.type === 'tool_use')
+        const thinkingBlock = content.find((b: { type?: string }) => b.type === 'thinking')
         const textContent = content.filter(
           (b: { type?: string }) => b.type !== 'tool_use' && b.type !== 'thinking',
         )
@@ -490,123 +413,70 @@ function convertMessages(
           role: 'assistant',
           content: (() => {
            const c = convertContentBlocks(textContent)
-            return typeof c === 'string'
-              ? c
-              : Array.isArray(c)
-                ? c.map((p: { text?: string }) => p.text ?? '').join('')
-                : ''
+            return typeof c === 'string' ? c : Array.isArray(c) ? c.map((p: { text?: string }) => p.text ?? '').join('') : ''
          })(),
         }
 
-        // Providers that validate reasoning continuity (Moonshot: "thinking
-        // is enabled but reasoning_content is missing in assistant tool call
-        // message at index N" 400) need the original chain-of-thought echoed
-        // back on each assistant message that carries a tool_call. We kept
-        // the thinking block on the Anthropic side; re-attach it here as the
-        // `reasoning_content` field on the outgoing OpenAI-shaped message.
-        // Gated per-provider because other endpoints either ignore the field
-        // (harmless) or strict-reject unknown fields (harmful).
-        if (preserveReasoningContent) {
-          const thinkingText = (thinkingBlock as { thinking?: string } | undefined)?.thinking
-          if (typeof thinkingText === 'string' && thinkingText.trim().length > 0) {
-            assistantMsg.reasoning_content = thinkingText
-          }
-        }
-
         if (toolUses.length > 0) {
-          const mappedToolCalls = toolUses
-            .map(
-              (tu: {
-                id?: string
-                name?: string
-                input?: unknown
-                extra_content?: Record<string, unknown>
-                signature?: string
-              }) => {
-                const id = tu.id ?? `call_${crypto.randomUUID().replace(/-/g, '')}`
-
-                // Only keep tool calls that have a corresponding result in the history,
-                // or if it's the last message (prefill scenario).
-                // Orphaned tool calls (e.g. from user interruption) cause 400 errors.
-                if (!toolResultIds.has(id) && !isLastInHistory) {
-                  return null
-                }
-
-                knownToolCallIds.add(id)
-                const toolCall: NonNullable<
-                  OpenAIMessage['tool_calls']
-                >[number] = {
-                  id,
-                  type: 'function' as const,
-                  function: {
-                    name: tu.name ?? 'unknown',
-                    arguments:
-                      typeof tu.input === 'string'
-                        ? tu.input
-                        : JSON.stringify(tu.input ?? {}),
-                  },
-                }
-
-                // Preserve existing extra_content if present
-                if (tu.extra_content) {
-                  toolCall.extra_content = { ...tu.extra_content }
-                }
-
-                // Handle Gemini thought_signature
-                if (isGeminiMode()) {
-                  // If the model provided a signature in the tool_use block itself (e.g. from a previous Turn/Step)
-                  // Use thinkingBlock.signature for ALL tool calls in the same assistant turn if available.
-                  // The API requires the same signature on every replayed function call part in a parallel set.
-                  const signature =
-                    tu.signature ?? (thinkingBlock as any)?.signature
-
-                  // Merge into existing google-specific metadata if present
-                  const existingGoogle =
-                    (toolCall.extra_content?.google as Record<
-                      string,
-                      unknown
-                    >) ?? {}
-                  toolCall.extra_content = {
-                    ...toolCall.extra_content,
-                    google: {
-                      ...existingGoogle,
-                      thought_signature:
-                        signature ?? 'skip_thought_signature_validator',
-                    },
-                  }
-                }
-
-                return toolCall
-              },
-            )
-            .filter((tc): tc is NonNullable<typeof tc> => tc !== null)
-          if (mappedToolCalls.length > 0) {
-            assistantMsg.tool_calls = mappedToolCalls
-          }
+          assistantMsg.tool_calls = toolUses.map(
+            (tu: {
+              id?: string
+              name?: string
+              input?: unknown
+              extra_content?: Record<string, unknown>
+              signature?: string
+            }) => {
+              const id = tu.id ?? `call_${crypto.randomUUID().replace(/-/g, '')}`
+              knownToolCallIds.add(id)
+
+              const toolCall: NonNullable<OpenAIMessage['tool_calls']>[number] = {
+                id,
+                type: 'function' as const,
+                function: {
+                  name: tu.name ?? 'unknown',
+                  arguments:
+                    typeof tu.input === 'string'
+                      ? tu.input
+                      : JSON.stringify(tu.input ?? {}),
+                },
+              }
+
+              // Preserve existing extra_content if present
+              if (tu.extra_content) {
+                toolCall.extra_content = { ...tu.extra_content }
+              }
+
+              // Handle Gemini thought_signature
+              if (isGeminiMode()) {
+                // If the model provided a signature in the tool_use block itself (e.g. from a previous Turn/Step)
+                // Use thinkingBlock.signature for ALL tool calls in the same assistant turn if available.
+                // The API requires the same signature on every replayed function call part in a parallel set.
+                const signature = tu.signature ?? (thinkingBlock as any)?.signature
+
+                // Merge into existing google-specific metadata if present
+                const existingGoogle = (toolCall.extra_content?.google as Record<string, unknown>) ?? {}
+                toolCall.extra_content = {
+                  ...toolCall.extra_content,
+                  google: {
+                    ...existingGoogle,
+                    thought_signature: signature ?? "skip_thought_signature_validator"
+                  }
+                }
+              }
+
+              return toolCall
+            },
+          )
         }
 
-        // Only push assistant message if it has content or tool calls.
-        // Stripped thinking-only blocks from user interruptions are empty and cause 400s.
-        if (assistantMsg.content || assistantMsg.tool_calls?.length) {
-          result.push(assistantMsg)
-        }
+        result.push(assistantMsg)
       } else {
-        const assistantMsg: OpenAIMessage = {
+        result.push({
          role: 'assistant',
          content: (() => {
            const c = convertContentBlocks(content)
-            return typeof c === 'string'
-              ? c
-              : Array.isArray(c)
-                ? c.map((p: { text?: string }) => p.text ?? '').join('')
-                : ''
+            return typeof c === 'string' ? c : Array.isArray(c) ? c.map((p: { text?: string }) => p.text ?? '').join('') : ''
          })(),
-        }
-        if (assistantMsg.content) {
-          result.push(assistantMsg)
-        }
+        })
       }
     }
   }
@@ -620,56 +490,25 @@ function convertMessages(
   for (const msg of result) {
     const prev = coalesced[coalesced.length - 1]
-    // Mistral/Devstral: 'tool' message must be followed by an 'assistant' message.
-    // If a 'tool' result is followed by a 'user' message, we must inject a semantic
-    // assistant response to satisfy the strict role sequence:
-    // ... -> assistant (calls) -> tool (results) -> assistant (semantic) -> user (next)
-    if (prev && prev.role === 'tool' && msg.role === 'user') {
-      coalesced.push({
-        role: 'assistant',
-        content: '[Tool execution interrupted by user]',
-      })
-    }
-    const lastAfterPossibleInjection = coalesced[coalesced.length - 1]
-    if (
-      lastAfterPossibleInjection &&
-      lastAfterPossibleInjection.role === msg.role &&
-      msg.role !== 'tool' &&
-      msg.role !== 'system'
-    ) {
-      const prevContent = lastAfterPossibleInjection.content
+    if (prev && prev.role === msg.role && msg.role !== 'tool' && msg.role !== 'system') {
+      const prevContent = prev.content
       const curContent = msg.content
       if (typeof prevContent === 'string' && typeof curContent === 'string') {
-        lastAfterPossibleInjection.content =
-          prevContent + (prevContent && curContent ? '\n' : '') + curContent
+        prev.content = prevContent + (prevContent && curContent ? '\n' : '') + curContent
      } else {
        const toArray = (
-          c:
-            | string
-            | Array<{ type: string; text?: string; image_url?: { url: string } }>
-            | undefined,
-        ): Array<{
-          type: string
-          text?: string
-          image_url?: { url: string }
-        }> => {
+          c: string | Array<{ type: string; text?: string; image_url?: { url: string } }> | undefined,
+        ): Array<{ type: string; text?: string; image_url?: { url: string } }> => {
          if (!c) return []
          if (typeof c === 'string') return c ? [{ type: 'text', text: c }] : []
          return c
        }
-        lastAfterPossibleInjection.content = [
-          ...toArray(prevContent),
-          ...toArray(curContent),
-        ]
+        prev.content = [...toArray(prevContent), ...toArray(curContent)]
      }
      if (msg.tool_calls?.length) {
-        lastAfterPossibleInjection.tool_calls = [
-          ...(lastAfterPossibleInjection.tool_calls ?? []),
-          ...msg.tool_calls,
-        ]
+        prev.tool_calls = [...(prev.tool_calls ?? []), ...msg.tool_calls]
      }
    } else {
      coalesced.push(msg)
@@ -884,7 +723,6 @@ async function* openaiStreamToAnthropic(
   let lastStopReason: 'tool_use' | 'max_tokens' | 'end_turn' | null = null
   let hasEmittedFinalUsage = false
   let hasProcessedFinishReason = false
-  const streamState = createStreamState()
 
   // Emit message_start
   yield {
@@ -1048,7 +886,6 @@ async function* openaiStreamToAnthropic(
           delta: { type: 'text_delta', text: visible },
         }
       }
-      processStreamChunk(streamState, delta.content)
     }
 
     // Tool calls
@@ -1068,7 +905,6 @@ async function* openaiStreamToAnthropic(
         const toolBlockIndex = contentBlockIndex
        const initialArguments = tc.function.arguments ?? ''
        const normalizeAtStop = hasToolFieldMapping(tc.function.name)
-        processStreamChunk(streamState, tc.function.arguments ?? '')
        activeToolCalls.set(tc.index, {
          id: tc.id,
          name: tc.function.name,
@@ -1266,20 +1102,6 @@ async function* openaiStreamToAnthropic(
     reader.releaseLock()
   }
 
-  const stats = getStreamStats(streamState)
-  if (stats.totalChunks > 0) {
-    logForDebugging(
-      JSON.stringify({
-        type: 'stream_stats',
-        model,
-        total_chunks: stats.totalChunks,
-        first_token_ms: stats.firstTokenMs,
-        duration_ms: stats.durationMs,
-      }),
-      { level: 'debug' },
-    )
-  }
-
   yield { type: 'message_stop' }
 }
@@ -1477,20 +1299,14 @@ class OpenAIShimMessages {
     params: ShimCreateParams,
     options?: { signal?: AbortSignal; headers?: Record<string, string> },
   ): Promise<Response> {
-    const compressedMessages = compressToolHistory(
+    const openaiMessages = convertMessages(
       params.messages as Array<{
         role: string
         message?: { role?: string; content?: unknown }
         content?: unknown
       }>,
-      request.resolvedModel,
+      params.system,
     )
-    const openaiMessages = convertMessages(compressedMessages, params.system, {
-      // Moonshot requires every assistant tool-call message to carry
-      // reasoning_content when its thinking feature is active. Echo it back
-      // from the thinking block we captured on the inbound response.
-      preserveReasoningContent: isMoonshotBaseUrl(request.baseUrl),
-    })
 
     const body: Record<string, unknown> = {
       model: request.resolvedModel,
@@ -1526,19 +1342,14 @@ class OpenAIShimMessages {
     const isGithubCopilot = isGithub && githubEndpointType === 'copilot'
     const isGithubModels = isGithub && (githubEndpointType === 'models' || githubEndpointType === 'custom')
-    const isMoonshot = isMoonshotBaseUrl(request.baseUrl)
 
-    if ((isGithub || isMistral || isLocal || isMoonshot) && body.max_completion_tokens !== undefined) {
+    if ((isGithub || isMistral || isLocal) && body.max_completion_tokens !== undefined) {
       body.max_tokens = body.max_completion_tokens
       delete body.max_completion_tokens
     }
 
     // mistral and gemini don't recognize body.store — Gemini returns 400
     // "Invalid JSON payload received. Unknown name 'store': Cannot find field."
-    // Moonshot (api.moonshot.ai/.cn) has not published support for the
-    // parameter either; strip it preemptively to avoid the same class of
-    // error on strict-parse providers.
-    if (isMistral || isGeminiMode() || isMoonshot) {
+    if (isMistral || isGeminiMode()) {
      delete body.store
    }
@@ -1764,12 +1575,6 @@ class OpenAIShimMessages {
     }
 
     let response: Response | undefined
-    const provider = request.baseUrl.includes('nvidia') ? 'nvidia-nim'
-      : request.baseUrl.includes('minimax') ? 'minimax'
-      : request.baseUrl.includes('localhost:11434') || request.baseUrl.includes('localhost:11435') ? 'ollama'
-      : request.baseUrl.includes('anthropic') ? 'anthropic'
-      : 'openai'
-    const { correlationId, startTime } = logApiCallStart(provider, request.resolvedModel)
     for (let attempt = 0; attempt < maxAttempts; attempt++) {
       try {
         response = await fetchWithProxyRetry(
@@ -1807,20 +1612,6 @@ class OpenAIShimMessages {
         }
 
         if (response.ok) {
-          let tokensIn = 0
-          let tokensOut = 0
-          // Skip clone() for streaming responses - it blocks until full body is received,
-          // defeating the purpose of streaming. Usage data is already sent via
-          // stream_options: { include_usage: true } and can be extracted from the stream.
-          if (!params.stream) {
-            try {
-              const clone = response.clone()
-              const data = await clone.json()
-              tokensIn = data.usage?.prompt_tokens ?? 0
-              tokensOut = data.usage?.completion_tokens ?? 0
-            } catch { /* ignore */ }
-          }
-          logApiCallEnd(correlationId, startTime, request.resolvedModel, 'success', tokensIn, tokensOut, false)
          return response
        }


@@ -1,191 +0,0 @@
import { describe, expect, test } from 'bun:test'
import {
routeModel,
type SmartRoutingConfig,
} from './smartModelRouting.ts'
const ENABLED: SmartRoutingConfig = {
enabled: true,
simpleModel: 'claude-haiku-4-5',
strongModel: 'claude-opus-4-7',
}
describe('routeModel — disabled / misconfigured', () => {
test('disabled config routes to strong', () => {
const decision = routeModel(
{ userText: 'hi' },
{ ...ENABLED, enabled: false },
)
expect(decision.model).toBe('claude-opus-4-7')
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('disabled')
})
test('missing simpleModel falls back to strong', () => {
const decision = routeModel(
{ userText: 'hi' },
{ ...ENABLED, simpleModel: '' },
)
expect(decision.model).toBe('claude-opus-4-7')
expect(decision.complexity).toBe('strong')
})
test('simpleModel === strongModel routes to strong (no-op)', () => {
const decision = routeModel(
{ userText: 'hi' },
{ ...ENABLED, simpleModel: 'claude-opus-4-7' },
)
expect(decision.model).toBe('claude-opus-4-7')
expect(decision.complexity).toBe('strong')
})
})
describe('routeModel — simple path', () => {
test('short greeting routes to simple', () => {
const decision = routeModel({ userText: 'thanks!', turnNumber: 5 }, ENABLED)
expect(decision.model).toBe('claude-haiku-4-5')
expect(decision.complexity).toBe('simple')
})
test('empty input routes to simple', () => {
const decision = routeModel({ userText: ' ' }, ENABLED)
expect(decision.model).toBe('claude-haiku-4-5')
expect(decision.complexity).toBe('simple')
})
test('mid-length chatter routes to simple', () => {
const decision = routeModel(
{ userText: 'yep looks good, go ahead', turnNumber: 10 },
ENABLED,
)
expect(decision.complexity).toBe('simple')
})
})
describe('routeModel — strong path', () => {
test('first turn always routes to strong, even when short', () => {
const decision = routeModel(
{ userText: 'fix the bug', turnNumber: 1 },
ENABLED,
)
expect(decision.model).toBe('claude-opus-4-7')
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('first turn')
})
test('code fence routes to strong', () => {
const decision = routeModel(
{
userText: 'change this:\n```\nfoo()\n```',
turnNumber: 5,
},
ENABLED,
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('code')
})
test('inline code span routes to strong', () => {
const decision = routeModel(
{ userText: 'rename `foo` to `bar`', turnNumber: 5 },
ENABLED,
)
expect(decision.complexity).toBe('strong')
})
test('reasoning keyword "plan" routes to strong even when short', () => {
const decision = routeModel(
{ userText: 'plan the refactor', turnNumber: 5 },
ENABLED,
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('keyword')
})
test('reasoning keyword "debug" routes to strong', () => {
const decision = routeModel(
{ userText: 'debug the test', turnNumber: 5 },
ENABLED,
)
expect(decision.complexity).toBe('strong')
})
test('"root cause" multi-word keyword routes to strong', () => {
const decision = routeModel(
{ userText: 'find the root cause', turnNumber: 5 },
ENABLED,
)
expect(decision.complexity).toBe('strong')
})
test('multi-paragraph input routes to strong', () => {
const decision = routeModel(
{
userText: 'first thought.\n\nsecond thought.',
turnNumber: 5,
},
ENABLED,
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('multi-paragraph')
})
test('over-long input routes to strong', () => {
const long = 'ok '.repeat(100) // ~300 chars, 100 words
const decision = routeModel(
{ userText: long, turnNumber: 5 },
ENABLED,
)
expect(decision.complexity).toBe('strong')
})
test('exactly at the boundary stays simple', () => {
const text = 'a'.repeat(160)
const decision = routeModel(
{ userText: text, turnNumber: 5 },
{ ...ENABLED, simpleMaxChars: 160, simpleMaxWords: 28 },
)
expect(decision.complexity).toBe('simple')
})
test('one char over the boundary routes to strong', () => {
const text = 'a'.repeat(161)
const decision = routeModel(
{ userText: text, turnNumber: 5 },
{ ...ENABLED, simpleMaxChars: 160, simpleMaxWords: 28 },
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('160 chars')
})
})
describe('routeModel — config overrides', () => {
test('custom simpleMaxChars is honored', () => {
const decision = routeModel(
{ userText: 'abcdefghijklmnop', turnNumber: 5 },
{ ...ENABLED, simpleMaxChars: 10 },
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('10 chars')
})
test('custom simpleMaxWords is honored', () => {
const decision = routeModel(
{ userText: 'one two three four five', turnNumber: 5 },
{ ...ENABLED, simpleMaxWords: 3 },
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('3 words')
})
})
describe('routeModel — reason strings', () => {
test('simple decisions include char + word counts', () => {
const decision = routeModel(
{ userText: 'sounds good', turnNumber: 5 },
ENABLED,
)
expect(decision.reason).toMatch(/\d+ chars, \d+ words/)
})
})

View File

@@ -1,215 +0,0 @@
/**
* Smart model routing — cheap-for-simple, strong-for-hard.
*
* For everyday short chatter ("ok", "thanks", "what does this do?") the
* incremental quality of Opus/GPT-5 over Haiku/Mini is negligible while the
* cost and latency are an order of magnitude worse. Smart routing opts a
* user into routing such "obviously simple" turns to a cheaper model while
* keeping the strong model for the anything-non-trivial path.
*
* This module is a pure primitive: it takes a turn description (the user's
* text + light context) and returns which model to use, based on config.
* It never reads env vars or state directly — caller supplies everything.
*
* Off by default. Users opt in via settings.smartRouting.enabled. Intent:
* make this a copy-paste-small config block rather than a hidden heuristic,
* so the tradeoff is visible and the user controls it.
*/
export type SmartRoutingConfig = {
enabled: boolean
/** Model to use for turns classified as "simple". */
simpleModel: string
/** Model to use for turns classified as "strong" (or when unsure). */
strongModel: string
/** Max characters in user input to qualify as "simple". Default 160. */
simpleMaxChars?: number
/** Max whitespace-separated words to qualify as "simple". Default 28. */
simpleMaxWords?: number
}
export type RoutingDecision = {
model: string
complexity: 'simple' | 'strong'
/** Human-readable reason — useful for the UI indicator and debug logs. */
reason: string
}
export type RoutingInput = {
/** The user's message text for this turn. */
userText: string
/**
* Optional: how many tool-use blocks the assistant has emitted in the
* recent conversation. High values correlate with "continue this work"
* follow-ups that can still be cheap, UNLESS the user also typed code
* or strong-keyword text.
*/
recentToolUses?: number
/**
* Optional: turn number within the current session (1-indexed). The first
* turn is often task-setup and benefits from the strong model even if
* short — a bare "build X" opens the whole task.
*/
turnNumber?: number
}
const DEFAULT_SIMPLE_MAX_CHARS = 160
const DEFAULT_SIMPLE_MAX_WORDS = 28
// Keywords that strongly suggest reasoning/planning/design work.
// Matching is word-boundary / case-insensitive. Must include enough anchors
// that short prompts like "plan the refactor" route to strong even under
// the char/word cutoff.
const STRONG_KEYWORDS = [
'plan',
'design',
'architect',
'architecture',
'refactor',
'debug',
'investigate',
'analyze',
'analyse',
'implement',
'optimize',
'optimise',
'review',
'audit',
'diagnose',
'root cause',
'root-cause',
'why does',
'why is',
'how should',
'why did',
'propose',
'trace',
'reproduce',
]
const STRONG_KEYWORD_RE = new RegExp(
`\\b(?:${STRONG_KEYWORDS.map(k => k.replace(/[-]/g, '[-\\s]')).join('|')})\\b`,
'i',
)
const CODE_FENCE_RE = /```[\s\S]*?```|`[^`\n]+`/
function countWords(text: string): number {
const trimmed = text.trim()
if (!trimmed) return 0
return trimmed.split(/\s+/).length
}
function hasMultiParagraph(text: string): boolean {
return /\n\s*\n/.test(text)
}
function hasCode(text: string): boolean {
return CODE_FENCE_RE.test(text)
}
function hasStrongKeyword(text: string): boolean {
return STRONG_KEYWORD_RE.test(text)
}
/**
* Decide whether to route to the simple or strong model based on heuristics.
* Returns the chosen model + a reason. When routing is disabled or both
* models match, the strong model is used (safe default).
*/
export function routeModel(
input: RoutingInput,
config: SmartRoutingConfig,
): RoutingDecision {
if (!config.enabled) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'smart-routing disabled',
}
}
if (!config.simpleModel || !config.strongModel) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'simpleModel or strongModel missing from config',
}
}
if (config.simpleModel === config.strongModel) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'simpleModel equals strongModel',
}
}
const text = input.userText ?? ''
const trimmed = text.trim()
if (!trimmed) {
// Empty input (e.g. resuming a tool-use chain) — cheap by default.
return {
model: config.simpleModel,
complexity: 'simple',
reason: 'empty user text',
}
}
// First turn of a session is task-setup — always use strong.
if (input.turnNumber === 1) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'first turn of session',
}
}
const maxChars = config.simpleMaxChars ?? DEFAULT_SIMPLE_MAX_CHARS
const maxWords = config.simpleMaxWords ?? DEFAULT_SIMPLE_MAX_WORDS
if (hasCode(trimmed)) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'contains code block or inline code',
}
}
if (hasStrongKeyword(trimmed)) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'contains reasoning/planning keyword',
}
}
if (hasMultiParagraph(trimmed)) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'multi-paragraph input',
}
}
if (trimmed.length > maxChars) {
return {
model: config.strongModel,
complexity: 'strong',
reason: `input > ${maxChars} chars`,
}
}
if (countWords(trimmed) > maxWords) {
return {
model: config.strongModel,
complexity: 'strong',
reason: `input > ${maxWords} words`,
}
}
return {
model: config.simpleModel,
complexity: 'simple',
reason: `short (${trimmed.length} chars, ${countWords(trimmed)} words)`,
}
}
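
For reference, the heuristics above collapse into a short precedence chain: first turn → code → strong keyword → multi-paragraph → size cutoffs, with "simple" only as the fall-through. The helper below is an illustrative condensation, not the module's exported API (the real routeModel also validates config and returns a model name plus reason); thresholds mirror the defaults above (160 chars / 28 words):

```typescript
// Condensed sketch of the routing precedence order — hypothetical helper,
// not the module's export. Keyword list abbreviated for illustration.
type Turn = { userText: string; turnNumber?: number }

const KEYWORDS = /\b(?:plan|design|refactor|debug|analy[sz]e|root[-\s]cause)\b/i
const CODE = /```[\s\S]*?```|`[^`\n]+`/

function classifyTurn(t: Turn, maxChars = 160, maxWords = 28): 'simple' | 'strong' {
  const text = t.userText.trim()
  if (!text) return 'simple'                          // empty → cheap by default
  if (t.turnNumber === 1) return 'strong'             // first turn is task setup
  if (CODE.test(text)) return 'strong'                // code fence or inline span
  if (KEYWORDS.test(text)) return 'strong'            // reasoning/planning keyword
  if (/\n\s*\n/.test(text)) return 'strong'           // multi-paragraph input
  if (text.length > maxChars) return 'strong'
  if (text.split(/\s+/).length > maxWords) return 'strong'
  return 'simple'
}
```

The ordering matters: keyword and code checks run before the size cutoffs, which is why a short "plan the refactor" still routes to the strong model in the tests above.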

View File

@@ -16,21 +16,12 @@ describe('getEffectiveContextWindowSize', () => {
     // 8k minus 20k summary reservation = -12k, causing infinite auto-compact.
     // Now the fallback is 128k and there's a floor, so effective is always
     // at least reservedTokensForSummary + buffer.
-    //
-    // The exact floor depends on the max-output-tokens slot-reservation cap
-    // (tengu_otk_slot_v1 GrowthBook flag). With cap enabled, the model's
-    // default output cap drops to CAPPED_DEFAULT_MAX_TOKENS (8k), so the
-    // summary reservation is 8k and the floor is 8k + 13k = 21k. With cap
-    // disabled it's 20k + 13k = 33k. Assert the worst case so the test is
-    // stable regardless of flag state in CI vs local.
     process.env.CLAUDE_CODE_USE_OPENAI = '1'
     try {
       const effective = getEffectiveContextWindowSize('some-unknown-3p-model')
       expect(effective).toBeGreaterThan(0)
-      // 21k = CAPPED_DEFAULT_MAX_TOKENS (8k) + AUTOCOMPACT_BUFFER_TOKENS (13k).
-      // Covers the anti-regression intent of issue #635 without assuming
-      // the GrowthBook flag state.
-      expect(effective).toBeGreaterThanOrEqual(21_000)
+      // Must be at least summary reservation (20k) + buffer (13k) = 33k
+      expect(effective).toBeGreaterThanOrEqual(33_000)
     } finally {
       delete process.env.CLAUDE_CODE_USE_OPENAI
     }

View File

@@ -38,7 +38,7 @@ export const TIME_BASED_MC_CLEARED_MESSAGE = '[Old tool result content cleared]'
 const IMAGE_MAX_TOKEN_SIZE = 2000
 // Only compact these built-in tools (MCP tools are also compactable via prefix match)
-export const COMPACTABLE_TOOLS = new Set<string>([
+const COMPACTABLE_TOOLS = new Set<string>([
   FILE_READ_TOOL_NAME,
   ...SHELL_TOOL_NAMES,
   GREP_TOOL_NAME,
@@ -51,7 +51,7 @@ export const COMPACTABLE_TOOLS = new Set<string>([
 const MCP_TOOL_PREFIX = 'mcp__'
-export function isCompactableTool(name: string): boolean {
+function isCompactableTool(name: string): boolean {
   return COMPACTABLE_TOOLS.has(name) || name.startsWith(MCP_TOOL_PREFIX)
 }

View File

@@ -223,49 +223,6 @@ export function bytesPerTokenForFileType(fileExtension: string): number {
   }
 }
-/**
- * Tokenizer ratio by model family.
- * Different models have different encodings.
- */
-export interface ModelTokenizerConfig {
-  modelFamily: string
-  bytesPerToken: number
-  supportsJson: boolean
-  supportsCode: boolean
-}
-export const MODEL_TOKENIZER_CONFIGS: ModelTokenizerConfig[] = [
-  { modelFamily: 'claude', bytesPerToken: 3.5, supportsJson: true, supportsCode: true },
-  { modelFamily: 'gpt-4', bytesPerToken: 4, supportsJson: true, supportsCode: true },
-  { modelFamily: 'gpt-3.5', bytesPerToken: 4, supportsJson: true, supportsCode: true },
-  { modelFamily: 'gemini', bytesPerToken: 3.5, supportsJson: true, supportsCode: true },
-  { modelFamily: 'llama', bytesPerToken: 3.8, supportsJson: true, supportsCode: true },
-  { modelFamily: 'deepseek', bytesPerToken: 3.5, supportsJson: true, supportsCode: true },
-  { modelFamily: 'minimax', bytesPerToken: 3.2, supportsJson: true, supportsCode: true },
-]
-/**
- * Get tokenizer config for a model.
- */
-export function getTokenizerConfig(model: string): ModelTokenizerConfig {
-  const lower = model.toLowerCase()
-  for (const config of MODEL_TOKENIZER_CONFIGS) {
-    if (lower.includes(config.modelFamily)) {
-      return config
-    }
-  }
-  return { modelFamily: 'unknown', bytesPerToken: 4, supportsJson: true, supportsCode: true }
-}
-/**
- * Get bytes-per-token ratio for a model.
- */
-export function getBytesPerTokenForModel(model: string): number {
-  return getTokenizerConfig(model).bytesPerToken
-}
 /**
  * Like {@link roughTokenCountEstimation} but uses a more accurate
  * bytes-per-token ratio when the file type is known.
@@ -284,106 +241,6 @@ export function roughTokenCountEstimationForFileType(
   )
 }
-/**
- * Content type classification for compression ratio.
- */
-export type ContentType =
-  | 'json' | 'code' | 'prose' | 'technical'
-  | 'list' | 'table' | 'mixed'
-/**
- * Compression ratio by content type.
- * Measured empirically - denser content = lower ratio.
- */
-export const COMPRESSION_RATIOS: Record<ContentType, { min: number; max: number; typical: number }> = {
-  json: { min: 1.5, max: 2.5, typical: 2 },
-  code: { min: 3, max: 4.5, typical: 3.5 },
-  prose: { min: 3.5, max: 4.5, typical: 4 },
-  technical: { min: 2.5, max: 3.5, typical: 3 },
-  list: { min: 2, max: 3, typical: 2.5 },
-  table: { min: 1.8, max: 2.8, typical: 2.2 },
-  mixed: { min: 3, max: 4, typical: 3.5 },
-}
-/**
- * Detect content type from content.
- */
-export function detectContentType(content: string): ContentType {
-  const trimmed = content.trim()
-  // JSON
-  if ((trimmed.startsWith('{') && trimmed.endsWith('}')) ||
-      (trimmed.startsWith('[') && trimmed.endsWith(']'))) {
-    try {
-      JSON.parse(trimmed)
-      return 'json'
-    } catch { /* not valid json */ }
-  }
-  // Table (tabs or consistent delimiters)
-  const lines = trimmed.split('\n')
-  if (lines.length > 2) {
-    const hasTabs = lines[0].includes('\t')
-    const hasCommas = lines[0].includes(',')
-    if (hasTabs || hasCommas) {
-      const consistent = lines.slice(1).every(l => l.includes('\t') || l.includes(','))
-      if (consistent) return 'table'
-    }
-  }
-  // List
-  if (/^[\d\-\*\•]/.test(trimmed) || /^[\d\-\*\•]/.test(lines[0])) {
-    return 'list'
-  }
-  // Code (high density of special chars)
-  const codeChars = (content.match(/[{}()\[\];=]/g) || []).length
-  const codeRatio = codeChars / content.length
-  if (codeRatio > 0.05) return 'code'
-  // Technical (has numbers and units)
-  if (/\d+\s*(px|em|rem|%|ms|s|kb|mb|gb)/i.test(content)) {
-    return 'technical'
-  }
-  // Prose (default - natural language)
-  return 'prose'
-}
-/**
- * Get compression ratio for content.
- */
-export function getCompressionRatio(content: string, type?: ContentType): { ratio: number; min: number; max: number } {
-  const detectedType = type ?? detectContentType(content)
-  const { min, max, typical } = COMPRESSION_RATIOS[detectedType]
-  // Adjust based on actual content length
-  // Shorter content = higher variance
-  const lengthBonus = content.length < 100 ? 0.5 : 0
-  return {
-    ratio: typical,
-    min: min + lengthBonus,
-    max: max + lengthBonus,
-  }
-}
-/**
- * Estimate tokens with confidence bounds.
- */
-export function estimateWithBounds(
-  content: string,
-  type?: ContentType,
-): { estimate: number; min: number; max: number } {
-  const { ratio, min: minRatio, max: maxRatio } = getCompressionRatio(content, type)
-  const estimate = roughTokenCountEstimation(content, ratio)
-  const min = roughTokenCountEstimation(content, maxRatio)
-  const max = roughTokenCountEstimation(content, minRatio)
-  return { estimate, min, max }
-}
 /**
  * Estimates token count for a Message object by extracting and analyzing its text content.
  * This provides a more reliable estimate than getTokenUsage for messages that may have been compacted.
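
The removed estimator's core arithmetic is simple: tokens ≈ length / bytes-per-token, with min/max bounds derived from the ratio range (a *lower* ratio means denser content and therefore *more* tokens, so the bounds swap). The sketch below is a hypothetical condensed version, not the module's exported API, and it assumes the estimator divides content length by the ratio:

```typescript
// Hypothetical condensed form of the removed estimateWithBounds(). The
// max-token bound uses the minimum bytes-per-token ratio and vice versa.
function estimateTokens(content: string, ratio: number): number {
  return Math.ceil(content.length / ratio)
}

function estimateBounds(
  content: string,
  minRatio: number,
  maxRatio: number,
  typical: number,
): { estimate: number; min: number; max: number } {
  return {
    estimate: estimateTokens(content, typical),
    min: estimateTokens(content, maxRatio), // highest ratio → fewest tokens
    max: estimateTokens(content, minRatio), // lowest ratio → most tokens
  }
}
```

With the prose ratios above (min 3.5, max 4.5, typical 4), a 40-character string estimates to 10 tokens with bounds [9, 12].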

View File

@@ -1,100 +0,0 @@
import { describe, expect, it } from 'bun:test'
import {
getTokenizerConfig,
getBytesPerTokenForModel,
detectContentType,
getCompressionRatio,
estimateWithBounds,
} from './tokenEstimation.js'
describe('Model Tokenizers', () => {
describe('getTokenizerConfig', () => {
it('returns config for claude models', () => {
const config = getTokenizerConfig('claude-sonnet-4-5-20250514')
expect(config.modelFamily).toBe('claude')
expect(config.bytesPerToken).toBe(3.5)
})
it('returns config for gpt models', () => {
const config = getTokenizerConfig('gpt-4')
expect(config.modelFamily).toBe('gpt-4')
expect(config.bytesPerToken).toBe(4)
})
it('returns default for unknown models', () => {
const config = getTokenizerConfig('unknown-model')
expect(config.modelFamily).toBe('unknown')
expect(config.bytesPerToken).toBe(4)
})
})
describe('getBytesPerTokenForModel', () => {
it('returns bytes per token for model', () => {
expect(getBytesPerTokenForModel('claude-opus-3-5-20250214')).toBe(3.5)
expect(getBytesPerTokenForModel('gpt-4o')).toBe(4)
expect(getBytesPerTokenForModel('deepseek-chat')).toBe(3.5)
expect(getBytesPerTokenForModel('minimax-M2.7')).toBe(3.2)
})
})
})
describe('Content Type Detection', () => {
describe('detectContentType', () => {
it('detects JSON', () => {
expect(detectContentType('{"key": "value"}')).toBe('json')
expect(detectContentType('[1, 2, 3]')).toBe('json')
})
it('detects code', () => {
expect(detectContentType('function test() { return 1 + 2; }')).toBe('code')
expect(detectContentType('const x = () => {}')).toBe('code')
})
it('detects prose', () => {
expect(detectContentType('This is a natural language response.')).toBe('prose')
expect(detectContentType('Hello world how are you?')).toBe('prose')
})
it('detects code-like technical', () => {
// Has both code chars and technical - higher code char ratio wins
expect(detectContentType('margin: 10px; padding: 5px;')).toBe('code')
})
it('detects list', () => {
expect(detectContentType('- item 1\n- item 2')).toBe('list')
expect(detectContentType('1. first\n2. second')).toBe('list')
})
it('detects prose by default', () => {
// Single column with newlines = prose
expect(detectContentType('a b c\n1 2 3')).toBe('prose')
})
})
})
describe('Compression Ratio', () => {
describe('getCompressionRatio', () => {
it('returns appropriate ratios', () => {
expect(getCompressionRatio('{"a":1}').ratio).toBe(2)
expect(getCompressionRatio('code here {} []').ratio).toBe(3.5)
expect(getCompressionRatio('Hello world').ratio).toBe(4)
})
})
describe('estimateWithBounds', () => {
it('returns estimate with bounds', () => {
const result = estimateWithBounds('Hello world')
expect(result.min).toBeLessThanOrEqual(result.estimate)
expect(result.max).toBeGreaterThanOrEqual(result.estimate)
expect(result.min).toBeLessThan(result.max)
})
it('handles JSON with tighter bounds', () => {
const result = estimateWithBounds('{"key": "value"}')
// JSON has smaller ratio range
expect(result.max).toBeLessThan(10)
})
})
})

View File

@@ -1241,7 +1241,6 @@ async function checkPermissionsAndCallTool(
{ {
...toolUseContext, ...toolUseContext,
toolUseId: toolUseID, toolUseId: toolUseID,
hookChainsCanUseTool: canUseTool,
userModified: permissionDecision.userModified ?? false, userModified: permissionDecision.userModified ?? false,
}, },
canUseTool, canUseTool,
@@ -1730,29 +1729,19 @@ async function checkPermissionsAndCallTool(
const hookMessages: MessageUpdateLazy< const hookMessages: MessageUpdateLazy<
AttachmentMessage | ProgressMessage<HookProgress> AttachmentMessage | ProgressMessage<HookProgress>
>[] = [] >[] = []
const hookChainsContext = toolUseContext as ToolUseContext & { for await (const hookResult of runPostToolUseFailureHooks(
hookChainsCanUseTool?: CanUseToolFn toolUseContext,
} tool,
hookChainsContext.hookChainsCanUseTool = canUseTool toolUseID,
try { messageId,
for await (const hookResult of runPostToolUseFailureHooks( processedInput,
toolUseContext, content,
tool, isInterrupt,
toolUseID, requestId,
messageId, mcpServerType,
processedInput, mcpServerBaseUrl,
content, )) {
isInterrupt, hookMessages.push(hookResult)
requestId,
mcpServerType,
mcpServerBaseUrl,
)) {
hookMessages.push(hookResult)
}
} finally {
if (hookChainsContext.hookChainsCanUseTool === canUseTool) {
delete hookChainsContext.hookChainsCanUseTool
}
} }
return [ return [

View File

@@ -284,7 +284,6 @@ export async function* runPostToolUseFailureHooks<Input extends AnyObject>(
     isInterrupt,
     permissionMode,
     toolUseContext.abortController.signal,
-    undefined,
   )) {
     try {
       // Check if we were aborted during hook execution

View File

@@ -733,9 +733,6 @@ export const CYBER_RISK_MITIGATION_REMINDER =
 const MITIGATION_EXEMPT_MODELS = new Set(['claude-opus-4-6'])
 function shouldIncludeFileReadMitigation(): boolean {
-  if (isEnvTruthy(process.env.OPENCLAUDE_DISABLE_TOOL_REMINDERS)) {
-    return false
-  }
   const shortName = getCanonicalName(getMainLoopModel())
   return !MITIGATION_EXEMPT_MODELS.has(shortName)
 }

View File

@@ -1,87 +0,0 @@
import { afterEach, beforeEach, expect, mock, test } from 'bun:test'
// Mock the Anthropic-API-side before importing the module under test, so
// queryHaiku resolves into whatever the individual test wants (slow, failing,
// or successful). We preserve every other export from claude.js so unrelated
// transitive imports still work.
const haikuMock = mock()
beforeEach(async () => {
haikuMock.mockReset()
const actual = await import('../../services/api/claude.js')
mock.module('../../services/api/claude.js', () => ({
...actual,
queryHaiku: haikuMock,
}))
})
afterEach(() => {
mock.restore()
})
async function runApply(markdown = 'Hello world.', signal?: AbortSignal): Promise<string> {
const nonce = `${Date.now()}-${Math.random()}`
const { applyPromptToMarkdown } =
await import(`./utils.js?ts=${nonce}`)
const ctrl = new AbortController()
return applyPromptToMarkdown(
'summarize',
markdown,
signal ?? ctrl.signal,
false,
false,
)
}
test('returns raw truncated markdown when queryHaiku throws', async () => {
haikuMock.mockImplementation(async () => {
throw new Error('MiniMax rejected the model name')
})
const output = await runApply('Gitlawb homepage content.')
expect(output).toContain('[Secondary-model summarization unavailable')
expect(output).toContain('Gitlawb homepage content.')
})
test('returns raw truncated markdown when queryHaiku simulates a timeout', async () => {
// Simulating raceWithTimeout's rejection path directly — we can't actually
// wait 45s in a test. The error shape matches what raceWithTimeout produces.
haikuMock.mockImplementation(async () => {
const err = new Error('Secondary-model summarization timed out after 45000ms')
;(err as NodeJS.ErrnoException).code = 'SECONDARY_MODEL_TIMEOUT'
throw err
})
const output = await runApply('Slow provider content.')
expect(output).toContain('[Secondary-model summarization unavailable')
expect(output).toContain('Slow provider content.')
})
test('returns the model response when queryHaiku succeeds', async () => {
haikuMock.mockImplementation(async () => ({
message: {
content: [{ type: 'text', text: 'This page is about GitLawb, an AI legal platform.' }],
},
}))
const output = await runApply('some page content')
expect(output).toBe('This page is about GitLawb, an AI legal platform.')
})
test('returns fallback when queryHaiku resolves with empty content', async () => {
haikuMock.mockImplementation(async () => ({ message: { content: [] } }))
const output = await runApply('some page content')
expect(output).toContain('[Secondary-model summarization unavailable')
expect(output).toContain('some page content')
})
test('propagates AbortError from the caller signal', async () => {
const ctrl = new AbortController()
haikuMock.mockImplementation(async () => {
ctrl.abort()
return new Promise(() => {})
})
await expect(runApply('content', ctrl.signal)).rejects.toThrow()
})

View File

@@ -20,11 +20,8 @@ afterEach(() => {
 describe('checkDomainBlocklist', () => {
   test('returns allowed without API call in OpenAI mode', async () => {
     process.env.CLAUDE_CODE_USE_OPENAI = '1'
-    const actual = await import('../../utils/model/providers.js')
     mock.module('../../utils/model/providers.js', () => ({
-      ...actual,
       getAPIProvider: () => 'openai',
-      isFirstPartyAnthropicBaseUrl: () => false,
     }))
     const getSpy = mock(() =>
       Promise.resolve({ status: 200, data: { can_fetch: true } }),
@@ -40,11 +37,8 @@ describe('checkDomainBlocklist', () => {
   test('returns allowed without API call in Gemini mode', async () => {
     process.env.CLAUDE_CODE_USE_GEMINI = '1'
-    const actual = await import('../../utils/model/providers.js')
     mock.module('../../utils/model/providers.js', () => ({
-      ...actual,
       getAPIProvider: () => 'gemini',
-      isFirstPartyAnthropicBaseUrl: () => false,
     }))
     const getSpy = mock(() =>
       Promise.resolve({ status: 200, data: { can_fetch: true } }),
@@ -63,11 +57,8 @@ describe('checkDomainBlocklist', () => {
     delete process.env.CLAUDE_CODE_USE_GEMINI
     delete process.env.CLAUDE_CODE_USE_GITHUB
-    const actual = await import('../../utils/model/providers.js')
     mock.module('../../utils/model/providers.js', () => ({
-      ...actual,
       getAPIProvider: () => 'firstParty',
-      isFirstPartyAnthropicBaseUrl: () => true,
     }))
     const getSpy = mock(() =>
       Promise.resolve({ status: 200, data: { can_fetch: true } }),

View File

@@ -275,76 +275,20 @@ export async function getWithPermittedRedirects(
   if (depth > MAX_REDIRECTS) {
     throw new Error(`Too many redirects (exceeded ${MAX_REDIRECTS})`)
   }
-  const axiosConfig = {
-    signal,
-    timeout: FETCH_TIMEOUT_MS,
-    maxRedirects: 0,
-    responseType: 'arraybuffer' as const,
-    maxContentLength: MAX_HTTP_CONTENT_LENGTH,
-    lookup: ssrfGuardedLookup,
-    headers: {
-      Accept: 'text/markdown, text/html, */*',
-      'User-Agent': getWebFetchUserAgent(),
-    },
-  }
   try {
-    return await axios.get(url, axiosConfig)
+    return await axios.get(url, {
+      signal,
+      timeout: FETCH_TIMEOUT_MS,
+      maxRedirects: 0,
+      responseType: 'arraybuffer',
+      maxContentLength: MAX_HTTP_CONTENT_LENGTH,
+      lookup: ssrfGuardedLookup,
+      headers: {
+        Accept: 'text/markdown, text/html, */*',
+        'User-Agent': getWebFetchUserAgent(),
+      },
+    })
   } catch (error) {
-    // Try native fetch as a fallback for timeout / network errors
-    // (Bun/Node bundled contexts occasionally hang with axios + custom lookup.)
-    const isTimeoutLike =
-      axios.isAxiosError(error) &&
-      (!error.response &&
-        (error.code === 'ECONNABORTED' ||
-          error.code === 'ETIMEDOUT' ||
-          error.message?.toLowerCase().includes('timeout')))
-    if (isTimeoutLike && !signal.aborted) {
-      try {
-        const fetchResponse = await fetch(url, {
-          signal,
-          redirect: 'manual',
-          headers: axiosConfig.headers,
-        })
-        // Handle redirects manually
-        if ([301, 302, 307, 308].includes(fetchResponse.status)) {
-          const redirectLocation = fetchResponse.headers.get('location')
-          if (!redirectLocation) {
-            throw new Error('Redirect missing Location header')
-          }
-          const redirectUrl = new URL(redirectLocation, url).toString()
-          if (redirectChecker(url, redirectUrl)) {
-            return getWithPermittedRedirects(
-              redirectUrl,
-              signal,
-              redirectChecker,
-              depth + 1,
-            )
-          } else {
-            return {
-              type: 'redirect' as const,
-              originalUrl: url,
-              redirectUrl,
-              statusCode: fetchResponse.status,
-            }
-          }
-        }
-        const arrayBuffer = await fetchResponse.arrayBuffer()
-        // Build an AxiosResponse-like shape so downstream code stays happy
-        return {
-          data: new Uint8Array(arrayBuffer),
-          status: fetchResponse.status,
-          statusText: fetchResponse.statusText,
-          headers: Object.fromEntries(fetchResponse.headers.entries()),
-          config: axiosConfig,
-          request: undefined,
-        } as unknown as AxiosResponse<ArrayBuffer>
-      } catch {
-        // Fall through to original error handling
-      }
-    }
     if (
       axios.isAxiosError(error) &&
       error.response &&
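
One detail worth noting in the removed fallback above: a redirect's Location header may be relative, so it is resolved against the request URL (`new URL(redirectLocation, url)`) before the permission check. That resolution is standard WHATWG URL behavior, sketched here in isolation (the function name is illustrative, not project code):

```typescript
// Resolve a Location header against the URL that produced the redirect.
// Relative locations ('/login', 'next/page') are valid per RFC 7231.
function resolveRedirect(requestUrl: string, location: string): string {
  return new URL(location, requestUrl).toString()
}
```

Resolving before the redirect-permission check matters: comparing the raw Location string against an allowlist would let a relative path dodge host-based checks.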
@@ -545,58 +489,6 @@ export async function getURLMarkdownContent(
     return entry
   }
-// Budget for the secondary-model summarization after fetch. If the small-
-// fast model is slow (e.g. a 200k-context third-party running a reasoning
-// pass over ~100KB of markdown), we'd rather fall back to raw truncated
-// markdown than hang the tool. Also keeps the worst-case WebFetch bounded
-// to FETCH_TIMEOUT_MS + SECONDARY_MODEL_TIMEOUT_MS regardless of provider.
-const SECONDARY_MODEL_TIMEOUT_MS = 45_000
-function raceWithTimeout<T>(
-  promise: Promise<T>,
-  timeoutMs: number,
-  signal: AbortSignal,
-): Promise<T> {
-  return new Promise<T>((resolve, reject) => {
-    const timer = setTimeout(() => {
-      const err = new Error(`Secondary-model summarization timed out after ${timeoutMs}ms`)
-      ;(err as NodeJS.ErrnoException).code = 'SECONDARY_MODEL_TIMEOUT'
-      reject(err)
-    }, timeoutMs)
-    const onAbort = () => {
-      clearTimeout(timer)
-      reject(new AbortError())
-    }
-    if (signal.aborted) {
-      clearTimeout(timer)
-      reject(new AbortError())
-      return
-    }
-    signal.addEventListener('abort', onAbort, { once: true })
-    promise.then(
-      value => {
-        clearTimeout(timer)
-        signal.removeEventListener('abort', onAbort)
-        resolve(value)
-      },
-      err => {
-        clearTimeout(timer)
-        signal.removeEventListener('abort', onAbort)
-        reject(err)
-      },
-    )
-  })
-}
-function buildFallbackMarkdownSummary(truncatedContent: string): string {
-  return [
-    '[Secondary-model summarization unavailable — returning raw fetched content.',
-    'This typically means the configured small-fast model took too long or errored.]',
-    '',
-    truncatedContent,
-  ].join('\n')
-}
 export async function applyPromptToMarkdown(
   prompt: string,
   markdownContent: string,
@@ -616,35 +508,18 @@ export async function applyPromptToMarkdown(
     prompt,
     isPreapprovedDomain,
   )
-  let assistantMessage
-  try {
-    assistantMessage = await raceWithTimeout(
-      queryHaiku({
-        systemPrompt: asSystemPrompt([]),
-        userPrompt: modelPrompt,
-        signal,
-        options: {
-          querySource: 'web_fetch_apply',
-          agents: [],
-          isNonInteractiveSession,
-          hasAppendSystemPrompt: false,
-          mcpTools: [],
-        },
-      }),
-      SECONDARY_MODEL_TIMEOUT_MS,
-      signal,
-    )
-  } catch (err) {
-    // User interrupts and SIGINTs still propagate. Everything else (timeout,
-    // provider-side error, unsupported model on third-party endpoint) falls
-    // back to raw markdown so the user still gets usable content rather than
-    // a hang. Log so it's visible in debug traces.
-    if (err instanceof AbortError || (err as Error)?.name === 'AbortError') {
-      throw err
-    }
-    logError(err)
-    return buildFallbackMarkdownSummary(truncatedContent)
-  }
+  const assistantMessage = await queryHaiku({
+    systemPrompt: asSystemPrompt([]),
+    userPrompt: modelPrompt,
+    signal,
+    options: {
+      querySource: 'web_fetch_apply',
+      agents: [],
+      isNonInteractiveSession,
+      hasAppendSystemPrompt: false,
+      mcpTools: [],
+    },
+  })

   // We need to bubble this up, so that the tool call throws, causing us to return
   // an is_error tool_use block to the server, and render a red dot in the UI.
@@ -659,5 +534,5 @@ export async function applyPromptToMarkdown(
       return contentBlock.text
     }
   }
-  return buildFallbackMarkdownSummary(truncatedContent)
+  return 'No response from model'
 }
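
The removed raceWithTimeout is a standard pattern: race the work against a timer and clear the timer on either outcome so no stray handle keeps the process alive. A minimal Promise.race-based sketch of the same idea (without the abort-signal wiring the original also handled; names are illustrative):

```typescript
// Minimal timeout race: resolves with the work's value, or rejects with a
// coded error after timeoutMs. clearTimeout in finally covers both outcomes.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      const err = new Error(`timed out after ${timeoutMs}ms`)
      ;(err as Error & { code?: string }).code = 'SECONDARY_MODEL_TIMEOUT'
      reject(err)
    }, timeoutMs)
  })
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer))
}
```

The original additionally listened on an AbortSignal so user interrupts rejected with AbortError rather than being misreported as timeouts.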

View File

@@ -203,61 +203,6 @@ function buildCodexWebSearchInstructions(): string {
].join(' ') ].join(' ')
} }
function pushCodexTextResult(
results: (SearchResult | string)[],
value: unknown,
): void {
if (typeof value !== 'string') return
const trimmed = value.trim()
if (trimmed) {
results.push(trimmed)
}
}
function addCodexSource(
sourceMap: Map<string, { title: string; url: string }>,
source: unknown,
): void {
if (typeof source?.url !== 'string' || !source.url) return
sourceMap.set(source.url, {
title:
typeof source.title === 'string' && source.title
? source.title
: source.url,
url: source.url,
})
}
function getCodexSources(item: Record<string, any>): unknown[] {
if (Array.isArray(item.action?.sources)) {
return item.action.sources
}
if (Array.isArray(item.sources)) {
return item.sources
}
if (Array.isArray(item.result?.sources)) {
return item.result.sources
}
return []
}
function extractCodexWebSearchFailure(item: Record<string, any>): string | undefined {
// Codex web_search_call items can carry a status field. When the tool
// call fails (rate limit, upstream error, model-side guardrail), the
// parser should surface a meaningful error rather than the generic
// "No results found." fallback. Shape observed across recent payloads:
// { type: 'web_search_call', status: 'failed', error: { message?: string } }
// { type: 'web_search_call', status: 'failed', action: { error?: { message?: string } } }
if (item?.status !== 'failed') return undefined
const reason =
(typeof item.error?.message === 'string' && item.error.message) ||
(typeof item.action?.error?.message === 'string' &&
item.action.error.message) ||
(typeof item.error === 'string' && item.error) ||
undefined
return reason ? `Web search failed: ${reason}` : 'Web search failed.'
}
function makeOutputFromCodexWebSearchResponse(
  response: Record<string, unknown>,
  query: string,
@@ -269,12 +214,18 @@ function makeOutputFromCodexWebSearchResponse(
  for (const item of output) {
    if (item?.type === 'web_search_call') {
-      const failure = extractCodexWebSearchFailure(item)
-      if (failure) {
-        results.push(failure)
-      }
-      for (const source of getCodexSources(item)) {
-        addCodexSource(sourceMap, source)
-      }
+      const sources = Array.isArray(item.action?.sources)
+        ? item.action.sources
+        : []
+      for (const source of sources) {
+        if (typeof source?.url !== 'string' || !source.url) continue
+        sourceMap.set(source.url, {
+          title:
+            typeof source.title === 'string' && source.title
+              ? source.title
+              : source.url,
+          url: source.url,
+        })
+      }
      continue
    }
@@ -284,12 +235,11 @@ function makeOutputFromCodexWebSearchResponse(
    }
    for (const part of item.content) {
-      if (part?.type === 'output_text' || part?.type === 'text') {
-        pushCodexTextResult(results, part.text)
-      }
-      for (const source of getCodexSources(part)) {
-        addCodexSource(sourceMap, source)
-      }
+      if (part?.type === 'output_text' && typeof part.text === 'string') {
+        const trimmed = part.text.trim()
+        if (trimmed) {
+          results.push(trimmed)
+        }
+      }
      const annotations = Array.isArray(part?.annotations)
@@ -297,13 +247,23 @@ function makeOutputFromCodexWebSearchResponse(
        : []
      for (const annotation of annotations) {
        if (annotation?.type !== 'url_citation') continue
-        addCodexSource(sourceMap, annotation)
+        if (typeof annotation.url !== 'string' || !annotation.url) continue
+        sourceMap.set(annotation.url, {
+          title:
+            typeof annotation.title === 'string' && annotation.title
+              ? annotation.title
+              : annotation.url,
+          url: annotation.url,
+        })
      }
    }
  }
-  if (results.length === 0) {
-    pushCodexTextResult(results, response.output_text)
-  }
+  if (results.length === 0 && typeof response.output_text === 'string') {
+    const trimmed = response.output_text.trim()
+    if (trimmed) {
+      results.push(trimmed)
+    }
+  }
  if (sourceMap.size > 0) {
@@ -313,10 +273,6 @@ function makeOutputFromCodexWebSearchResponse(
    })
  }
-  if (results.length === 0) {
-    results.push('No results found.')
-  }
  return {
    query,
    results,
@@ -324,10 +280,6 @@ function makeOutputFromCodexWebSearchResponse(
  }
}
-export const __test = {
-  makeOutputFromCodexWebSearchResponse,
-}
async function runCodexWebSearch(
  input: Input,
  signal: AbortSignal,
@@ -505,19 +457,6 @@ function shouldUseAdapterProvider(): boolean {
  return getAvailableProviders().length > 0
}
/**
* Returns true when the current provider has a working native or Codex
* web-search fallback after an adapter failure. OpenAI shim providers
* (moonshot, minimax, nvidia-nim, openai, github, etc.) do NOT support
* Anthropic's web_search_20250305 tool, so falling through to the native
* path silently produces "Did 0 searches".
*/
function hasNativeSearchFallback(): boolean {
if (isCodexResponsesWebSearchEnabled()) return true
const provider = getAPIProvider()
return provider === 'firstParty' || provider === 'vertex' || provider === 'foundry'
}
// ---------------------------------------------------------------------------
// Tool export
// ---------------------------------------------------------------------------
@@ -670,17 +609,6 @@ export const WebSearchTool = buildTool({
  // Auto mode: only fall through on transient errors (network, timeout, 5xx).
  // Config / guardrail errors (SSRF, HTTPS, bad URL, etc.) must surface.
  if (!isTransientError(err)) throw err
// No viable fallback for this provider — surface the adapter error
// instead of falling through to a broken native path.
if (!hasNativeSearchFallback()) {
const provider = getAPIProvider()
const errMsg = err instanceof Error ? err.message : String(err)
throw new Error(
`Web search is unavailable for provider "${provider}". ` +
`The search adapter failed (${errMsg}). ` +
`Try switching to a provider with built-in web search (e.g. Anthropic, Codex) or try again later.`,
)
}
  console.error(
    `[web-search] Adapter failed, falling through to native: ${err}`,
  )
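The auto-mode comment above distinguishes transient failures (worth falling through on) from config and guardrail errors (which must surface). A minimal sketch of such a classifier — the function name, error strings, and `status` field are illustrative assumptions, not the repository's actual `isTransientError` implementation:

```typescript
// Hypothetical sketch: treat network-level failures and 5xx responses as
// transient; everything else (SSRF, HTTPS, bad URL, guardrails) surfaces.
function isTransientErrorSketch(err: unknown): boolean {
  if (!(err instanceof Error)) return false
  const msg = err.message.toLowerCase()
  // Network-level failures are retryable.
  if (/econnreset|etimedout|timeout|socket hang up/.test(msg)) return true
  // HTTP errors: only 5xx counts as transient (status field is assumed).
  const status = (err as { status?: number }).status
  if (typeof status === 'number') return status >= 500 && status < 600
  // Config / guardrail errors fall through here and are re-thrown by the caller.
  return false
}
```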


@@ -1,44 +1,6 @@
import type { SearchInput, SearchProvider } from './types.js'
import { applyDomainFilters, type ProviderOutput } from './types.js'
// DuckDuckGo's HTML scraper aggressively blocks datacenter / repeat IPs with
// an "anomaly in the request" response. When that happens we surface an
// actionable error instead of the opaque scraper message so users know how
// to configure a working backend.
const DDG_ANOMALY_HINT =
'DuckDuckGo scraping is rate-limited from this network. ' +
'Configure a search backend with one of: ' +
'FIRECRAWL_API_KEY, TAVILY_API_KEY, EXA_API_KEY, YOU_API_KEY, ' +
'JINA_API_KEY, BING_API_KEY, MOJEEK_API_KEY, LINKUP_API_KEY — ' +
'or use an Anthropic / Vertex / Foundry provider for native web search.'
const MAX_RETRIES = 3
const INITIAL_BACKOFF_MS = 1000
function isAnomalyError(message: string): boolean {
return /anomaly in the request|likely making requests too quickly/i.test(
message,
)
}
function isRetryableDDGError(err: unknown): boolean {
if (!(err instanceof Error)) return false
const msg = err.message.toLowerCase()
return (
msg.includes('anomaly') ||
msg.includes('too quickly') ||
msg.includes('rate limit') ||
msg.includes('timeout') ||
msg.includes('econnreset') ||
msg.includes('etimedout') ||
msg.includes('econnaborted')
)
}
function sleep(ms: number): Promise<void> {
return new Promise(r => setTimeout(r, ms))
}
export const duckduckgoProvider: SearchProvider = {
  name: 'duckduckgo',
@@ -57,44 +19,22 @@ export const duckduckgoProvider: SearchProvider = {
      throw new Error('duck-duck-scrape package not installed. Run: npm install duck-duck-scrape')
    }
    if (signal?.aborted) throw new DOMException('Aborted', 'AbortError')
-    let lastErr: unknown
-    for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
-      if (signal?.aborted) throw new DOMException('Aborted', 'AbortError')
-      try {
-        // TODO: duck-duck-scrape doesn't accept AbortSignal — can't cancel in-flight searches
-        const response = await search(input.query, { safeSearch: SafeSearchType.STRICT })
-        const hits = applyDomainFilters(
-          response.results.map(r => ({
-            title: r.title || r.url,
-            url: r.url,
-            description: r.description ?? undefined,
-          })),
-          input,
-        )
-        return {
-          hits,
-          providerName: 'duckduckgo',
-          durationSeconds: (performance.now() - start) / 1000,
-        }
-      } catch (err) {
-        lastErr = err
-        const msg = err instanceof Error ? err.message : String(err)
-        if (isAnomalyError(msg)) {
-          throw new Error(DDG_ANOMALY_HINT)
-        }
-        if (!isRetryableDDGError(err) || attempt === MAX_RETRIES - 1) {
-          throw err
-        }
-        // Exponential backoff with jitter: 1s, 2s, 4s +/- 20%
-        const baseDelay = INITIAL_BACKOFF_MS * Math.pow(2, attempt)
-        const jitter = baseDelay * 0.2 * (Math.random() * 2 - 1)
-        await sleep(baseDelay + jitter)
-      }
-    }
-    throw lastErr
+    // TODO: duck-duck-scrape doesn't accept AbortSignal — can't cancel in-flight searches
+    const response = await search(input.query, { safeSearch: SafeSearchType.STRICT })
+    const hits = applyDomainFilters(
+      response.results.map(r => ({
+        title: r.title || r.url,
+        url: r.url,
+        description: r.description ?? undefined,
+      })),
+      input,
+    )
+    return {
+      hits,
+      providerName: 'duckduckgo',
+      durationSeconds: (performance.now() - start) / 1000,
+    }
  },
}
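The retry loop removed above used exponential backoff with ±20% jitter (1s, 2s, 4s base delays). Isolated as a standalone helper — the function name is hypothetical; the formula is taken verbatim from the removed code — the delay schedule is:

```typescript
// Delay schedule from the removed retry loop: base doubles each attempt,
// then a jitter term in [-20%, +20%] of the base is added.
const INITIAL_BACKOFF_MS = 1000

function backoffDelayMs(
  attempt: number,
  random: () => number = Math.random, // injectable for deterministic checks
): number {
  const baseDelay = INITIAL_BACKOFF_MS * Math.pow(2, attempt)
  const jitter = baseDelay * 0.2 * (random() * 2 - 1)
  return baseDelay + jitter
}
```

With `random` pinned to 0.5 the jitter term is zero, so attempts 0, 1, 2 yield exactly 1000, 2000, 4000 ms.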


@@ -244,7 +244,6 @@ export type GlobalConfig = {
  bypassPermissionsModeAccepted?: boolean
  hasUsedBackslashReturn?: boolean
  autoCompactEnabled: boolean // Controls whether auto-compact is enabled
- toolHistoryCompressionEnabled: boolean // Compress old tool_result content for small-context providers
  showTurnDuration: boolean // Controls whether to show turn duration message (e.g., "Cooked for 1m 6s")
  /**
   * @deprecated Use settings.env instead.
@@ -623,7 +622,6 @@ function createDefaultGlobalConfig(): GlobalConfig {
  verbose: false,
  editorMode: 'normal',
  autoCompactEnabled: true,
- toolHistoryCompressionEnabled: true,
  showTurnDuration: true,
  hasSeenTasksHint: false,
  hasUsedStash: false,
@@ -670,7 +668,6 @@ export const GLOBAL_CONFIG_KEYS = [
  'editorMode',
  'hasUsedBackslashReturn',
  'autoCompactEnabled',
- 'toolHistoryCompressionEnabled',
  'showTurnDuration',
  'diffTool',
  'env',


@@ -12,12 +12,7 @@ export const MODEL_CONTEXT_WINDOW_DEFAULT = 200_000
// Fallback context window for unknown 3P models. Must be large enough that
// the effective context (this minus output token reservation) stays positive,
// otherwise auto-compact fires on every message (issue #635).
-// Override via CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW env var to avoid
-// hardcoding when deploying models not yet in openaiContextWindows.ts.
-export const OPENAI_FALLBACK_CONTEXT_WINDOW = (() => {
-  const v = parseInt(process.env.CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW ?? '', 10)
-  return !isNaN(v) && v > 0 ? v : 128_000
-})()
+export const OPENAI_FALLBACK_CONTEXT_WINDOW = 128_000
// Maximum output tokens for compact operations
export const COMPACT_MAX_OUTPUT_TOKENS = 20_000
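The removed IIFE above follows a common parse-with-fallback pattern: accept a positive integer from an env var, otherwise use the hardcoded default. Extracted as a standalone helper (the helper name is hypothetical; the validation logic matches the removed code):

```typescript
// Parse a positive-integer override, falling back to a default when the
// value is unset, non-numeric, zero, or negative.
function positiveIntFromEnv(value: string | undefined, fallback: number): number {
  const v = parseInt(value ?? '', 10)
  return !isNaN(v) && v > 0 ? v : fallback
}

// e.g. positiveIntFromEnv(process.env.SOME_OVERRIDE, 128_000)
```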


@@ -1,357 +0,0 @@
import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
import { mkdtemp, rm, writeFile } from 'node:fs/promises'
import { tmpdir } from 'node:os'
import { join } from 'node:path'
type HookChainsModule = typeof import('./hookChains.js')
type ImportHarnessOptions = {
allowRemoteSessions?: boolean
teamFile?:
| {
name: string
members: Array<{ name: string }>
}
| null
teamName?: string
senderName?: string
replBridgeHandle?: unknown
}
const tempDirs: string[] = []
const originalHookChainsEnabled = process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
async function createConfigFile(config: unknown): Promise<string> {
const dir = await mkdtemp(join(tmpdir(), 'openclaude-hook-chains-int-'))
tempDirs.push(dir)
const filePath = join(dir, 'hook-chains.json')
await writeFile(filePath, JSON.stringify(config, null, 2), 'utf-8')
return filePath
}
async function importHookChainsHarness(
options: ImportHarnessOptions = {},
): Promise<{
mod: HookChainsModule
writeToMailboxSpy: ReturnType<typeof mock>
agentToolCallSpy: ReturnType<typeof mock>
}> {
mock.restore()
const allowRemoteSessions = options.allowRemoteSessions ?? true
const teamName = options.teamName ?? 'mesh-team'
const senderName = options.senderName ?? 'mesh-lead'
const replBridgeHandle = options.replBridgeHandle ?? null
const writeToMailboxSpy = mock(async () => {})
const agentToolCallSpy = mock(async () => ({
data: {
status: 'async_launched',
agentId: 'agent-fallback-1',
},
}))
mock.module('../services/analytics/index.js', () => ({
logEvent: () => {},
}))
mock.module('./telemetry/events.js', () => ({
logOTelEvent: async () => {},
}))
mock.module('../services/policyLimits/index.js', () => ({
isPolicyAllowed: () => allowRemoteSessions,
}))
mock.module('./swarm/teamHelpers.js', () => ({
readTeamFileAsync: async () => options.teamFile ?? null,
}))
mock.module('./teammateMailbox.js', () => ({
writeToMailbox: writeToMailboxSpy,
}))
mock.module('./teammate.js', () => ({
getAgentName: () => senderName,
getTeamName: () => teamName,
getTeammateColor: () => 'blue',
// Keep parity with the real module's surface so later tests that
// run after this file (mock.module is process-global and mock.restore
// does not undo module mocks in Bun) do not see undefined members.
isTeammate: () => false,
isPlanModeRequired: () => false,
getAgentId: () => undefined,
getParentSessionId: () => undefined,
}))
mock.module('../bridge/replBridgeHandle.js', () => ({
getReplBridgeHandle: () => replBridgeHandle,
}))
// Integration mock target requested in the task: fallback action can route
// through this mocked tool launcher from runtime callback wiring.
mock.module('../tools/AgentTool/AgentTool.js', () => ({
AgentTool: {
call: agentToolCallSpy,
},
}))
const mod = await import(`./hookChains.js?integration=${Date.now()}-${Math.random()}`)
return { mod, writeToMailboxSpy, agentToolCallSpy }
}
beforeEach(() => {
process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS = '1'
})
afterEach(async () => {
mock.restore()
if (originalHookChainsEnabled === undefined) {
delete process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
} else {
process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS = originalHookChainsEnabled
}
await Promise.all(
tempDirs.splice(0).map(dir => rm(dir, { recursive: true, force: true })),
)
})
describe('hookChains integration dispatch', () => {
test('end-to-end rule evaluation + action dispatch on TaskCompleted failure', async () => {
const { mod } = await importHookChainsHarness({
teamName: 'mesh-team',
senderName: 'mesh-lead',
teamFile: {
name: 'mesh-team',
members: [{ name: 'mesh-lead' }, { name: 'worker-a' }, { name: 'worker-b' }],
},
})
const configPath = await createConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'task-failure-recovery',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [
{ type: 'spawn_fallback_agent' },
{ type: 'notify_team' },
],
},
],
})
const spawnSpy = mock(async () => ({ launched: true, agentId: 'agent-e2e-1' }))
const notifySpy = mock(async () => ({ sent: true, recipientCount: 2 }))
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: {
task_id: 'task-001',
task_subject: 'Patch flaky build',
error: 'CI timeout',
},
},
runtime: {
onSpawnFallbackAgent: spawnSpy,
onNotifyTeam: notifySpy,
},
})
expect(result.enabled).toBe(true)
expect(result.matchedRuleIds).toEqual(['task-failure-recovery'])
expect(result.actionResults).toHaveLength(2)
expect(result.actionResults[0]?.status).toBe('executed')
expect(result.actionResults[1]?.status).toBe('executed')
expect(spawnSpy).toHaveBeenCalledTimes(1)
expect(notifySpy).toHaveBeenCalledTimes(1)
})
test('fallback spawn injects failure context into generated prompt', async () => {
const { mod, agentToolCallSpy } = await importHookChainsHarness()
const configPath = await createConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'fallback-context',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [
{
type: 'spawn_fallback_agent',
description: 'Fallback for failed task',
},
],
},
],
})
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: {
task_id: 'task-ctx-1',
task_subject: 'Repair migration guard',
task_description: 'Fix regression in check ordering',
error: 'Task failed after retry budget exhausted',
},
},
runtime: {
onSpawnFallbackAgent: async request => {
const { AgentTool } = await import('../tools/AgentTool/AgentTool.js')
await (AgentTool.call as unknown as (...args: unknown[]) => Promise<unknown>)({
prompt: request.prompt,
description: request.description,
run_in_background: request.runInBackground,
subagent_type: request.agentType,
model: request.model,
})
return { launched: true, agentId: 'agent-fallback-ctx' }
},
},
})
expect(result.actionResults[0]?.status).toBe('executed')
expect(agentToolCallSpy).toHaveBeenCalledTimes(1)
const callInput = agentToolCallSpy.mock.calls[0]?.[0] as {
prompt: string
description: string
run_in_background: boolean
}
expect(callInput.description).toBe('Fallback for failed task')
expect(callInput.run_in_background).toBe(true)
expect(callInput.prompt).toContain('Event: TaskCompleted')
expect(callInput.prompt).toContain('Outcome: failed')
expect(callInput.prompt).toContain('Task subject: Repair migration guard')
expect(callInput.prompt).toContain('Failure details: Task failed after retry budget exhausted')
})
test('notify_team dispatches mailbox writes when team exists and skips when absent', async () => {
const withTeam = await importHookChainsHarness({
teamName: 'mesh-a',
senderName: 'lead-a',
teamFile: {
name: 'mesh-a',
members: [{ name: 'lead-a' }, { name: 'worker-1' }, { name: 'worker-2' }],
},
})
const configPathWithTeam = await createConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'notify-existing-team',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'notify_team' }],
},
],
})
const withTeamResult = await withTeam.mod.dispatchHookChainsForEvent({
configPathOverride: configPathWithTeam,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-team-ok', error: 'boom' },
},
})
expect(withTeamResult.actionResults[0]?.status).toBe('executed')
expect(withTeam.writeToMailboxSpy).toHaveBeenCalledTimes(2)
const recipients = withTeam.writeToMailboxSpy.mock.calls.map(
call => call[0] as string,
)
expect(recipients.sort()).toEqual(['worker-1', 'worker-2'])
const withoutTeam = await importHookChainsHarness({
teamName: 'mesh-missing',
senderName: 'lead-missing',
teamFile: null,
})
const configPathWithoutTeam = await createConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'notify-missing-team',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'notify_team' }],
},
],
})
const withoutTeamResult = await withoutTeam.mod.dispatchHookChainsForEvent({
configPathOverride: configPathWithoutTeam,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-team-missing', error: 'boom' },
},
})
expect(withoutTeamResult.actionResults[0]?.status).toBe('skipped')
expect(withoutTeamResult.actionResults[0]?.reason).toContain('Team file not found')
expect(withoutTeam.writeToMailboxSpy).not.toHaveBeenCalled()
})
test('warm_remote_capacity is a safe no-op when bridge is inactive', async () => {
const { mod } = await importHookChainsHarness({
allowRemoteSessions: true,
replBridgeHandle: null,
})
const configPath = await createConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'bridge-warmup-noop',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'warm_remote_capacity' }],
},
],
})
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-warm-1' },
},
})
expect(result.actionResults).toHaveLength(1)
expect(result.actionResults[0]?.status).toBe('skipped')
expect(result.actionResults[0]?.reason).toContain('Bridge is not active')
})
})


@@ -1,476 +0,0 @@
import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
import { mkdtemp, rm, writeFile } from 'node:fs/promises'
import { tmpdir } from 'node:os'
import { join } from 'node:path'
type HookChainsModule = typeof import('./hookChains.js')
const tempDirs: string[] = []
const originalHookChainsEnabled = process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
async function makeConfigFile(config: unknown): Promise<string> {
const dir = await mkdtemp(join(tmpdir(), 'openclaude-hook-chains-'))
tempDirs.push(dir)
const filePath = join(dir, 'hook-chains.json')
await writeFile(filePath, JSON.stringify(config, null, 2), 'utf-8')
return filePath
}
async function importHookChainsModule(options?: {
allowRemoteSessions?: boolean
}): Promise<HookChainsModule> {
mock.restore()
const allowRemoteSessions = options?.allowRemoteSessions ?? true
mock.module('../services/analytics/index.js', () => ({
logEvent: () => {},
}))
mock.module('./telemetry/events.js', () => ({
logOTelEvent: async () => {},
}))
mock.module('../services/policyLimits/index.js', () => ({
isPolicyAllowed: () => allowRemoteSessions,
}))
return import(`./hookChains.js?test=${Date.now()}-${Math.random()}`)
}
beforeEach(() => {
process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS = '1'
})
afterEach(async () => {
mock.restore()
if (originalHookChainsEnabled === undefined) {
delete process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
} else {
process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS = originalHookChainsEnabled
}
await Promise.all(
tempDirs.splice(0).map(dir => rm(dir, { recursive: true, force: true })),
)
})
describe('hookChains schema validation', () => {
test('returns disabled config when env gate is unset', async () => {
delete process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
rules: [
{
id: 'env-gated-rule',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'spawn_fallback_agent' }],
},
],
})
const loaded = mod.loadHookChainsConfig({ pathOverride: configPath })
expect(loaded.exists).toBe(false)
expect(loaded.config.enabled).toBe(false)
expect(loaded.config.rules).toHaveLength(0)
})
test('loads valid config and memoizes by mtime/size', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 5000,
defaultDedupWindowMs: 5000,
rules: [
{
id: 'task-failure-fallback',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [
{
type: 'spawn_fallback_agent',
description: 'Fallback recovery agent',
},
],
},
],
})
const first = mod.loadHookChainsConfig({ pathOverride: configPath })
expect(first.exists).toBe(true)
expect(first.error).toBeUndefined()
expect(first.fromCache).toBe(false)
expect(first.config.enabled).toBe(true)
expect(first.config.rules).toHaveLength(1)
expect(first.config.rules[0]?.id).toBe('task-failure-fallback')
const second = mod.loadHookChainsConfig({ pathOverride: configPath })
expect(second.exists).toBe(true)
expect(second.error).toBeUndefined()
expect(second.fromCache).toBe(true)
expect(second.config.rules).toHaveLength(1)
})
test('accepts wrapped { hookChains: ... } config shape', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
hookChains: {
version: 1,
enabled: true,
rules: [
{
id: 'wrapped-shape',
trigger: { event: 'PostToolUseFailure', outcomes: ['failed'] },
actions: [{ type: 'notify_team' }],
},
],
},
})
const loaded = mod.loadHookChainsConfig({ pathOverride: configPath })
expect(loaded.error).toBeUndefined()
expect(loaded.config.enabled).toBe(true)
expect(loaded.config.rules[0]?.id).toBe('wrapped-shape')
})
test('returns disabled config for invalid schema', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
rules: [
{
id: 'invalid-rule',
trigger: {
event: 'TaskCompleted',
outcome: 'failed',
outcomes: ['failed'],
},
actions: [{ type: 'spawn_fallback_agent' }],
},
],
})
const loaded = mod.loadHookChainsConfig({ pathOverride: configPath })
expect(loaded.exists).toBe(true)
expect(loaded.error).toBeDefined()
expect(loaded.config.enabled).toBe(false)
expect(loaded.config.rules).toHaveLength(0)
})
})
describe('evaluateHookChainRules', () => {
test('matches by event + outcome + condition', async () => {
const mod = await importHookChainsModule()
const rules = [
{
id: 'post-tool-failure-rule',
trigger: { event: 'PostToolUseFailure', outcome: 'failed' },
condition: {
toolNames: ['Edit'],
errorIncludes: ['permission'],
eventFieldEquals: { 'meta.source': 'scheduler' },
},
actions: [{ type: 'spawn_fallback_agent' }],
},
]
const matches = mod.evaluateHookChainRules(rules as never, {
eventName: 'PostToolUseFailure',
outcome: 'failed',
payload: {
tool_name: 'Edit',
error: 'Permission denied by policy',
meta: { source: 'scheduler' },
},
})
expect(matches).toHaveLength(1)
expect(matches[0]?.rule.id).toBe('post-tool-failure-rule')
})
test('does not match when event/condition fail', async () => {
const mod = await importHookChainsModule()
const rules = [
{
id: 'rule-no-match',
trigger: { event: 'PostToolUseFailure', outcomes: ['failed'] },
condition: { toolNames: ['Write'] },
actions: [{ type: 'spawn_fallback_agent' }],
},
]
const wrongEvent = mod.evaluateHookChainRules(rules as never, {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { tool_name: 'Write' },
})
expect(wrongEvent).toHaveLength(0)
const wrongCondition = mod.evaluateHookChainRules(rules as never, {
eventName: 'PostToolUseFailure',
outcome: 'failed',
payload: { tool_name: 'Edit' },
})
expect(wrongCondition).toHaveLength(0)
})
})
describe('dispatchHookChainsForEvent guard logic', () => {
test('dedup skips duplicate event/action within dedup window', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 4,
defaultCooldownMs: 0,
defaultDedupWindowMs: 60_000,
rules: [
{
id: 'dedup-rule',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
cooldownMs: 0,
dedupWindowMs: 60_000,
actions: [{ id: 'spawn-1', type: 'spawn_fallback_agent' }],
},
],
})
const spawn = mock(async () => ({ launched: true, agentId: 'agent-1' }))
const first = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-123', error: 'boom' },
},
runtime: { onSpawnFallbackAgent: spawn },
})
const second = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-123', error: 'boom' },
},
runtime: { onSpawnFallbackAgent: spawn },
})
expect(first.actionResults[0]?.status).toBe('executed')
expect(second.actionResults[0]?.status).toBe('skipped')
expect(second.actionResults[0]?.reason).toContain('dedup')
expect(spawn).toHaveBeenCalledTimes(1)
})
test('cooldown skips second dispatch when rule cooldown is active', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 4,
defaultCooldownMs: 60_000,
defaultDedupWindowMs: 0,
rules: [
{
id: 'cooldown-rule',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
cooldownMs: 60_000,
dedupWindowMs: 0,
actions: [{ type: 'spawn_fallback_agent' }],
},
],
})
const spawn = mock(async () => ({ launched: true, agentId: 'agent-2' }))
const first = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-456' },
},
runtime: { onSpawnFallbackAgent: spawn },
})
const second = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-789' },
},
runtime: { onSpawnFallbackAgent: spawn },
})
expect(first.actionResults[0]?.status).toBe('executed')
expect(second.actionResults[0]?.status).toBe('skipped')
expect(second.actionResults[0]?.reason).toContain('cooldown')
expect(spawn).toHaveBeenCalledTimes(1)
})
test('depth limit blocks dispatch when chain depth reaches max', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 1,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'depth-rule',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'spawn_fallback_agent' }],
},
],
})
const spawn = mock(async () => ({ launched: true, agentId: 'agent-3' }))
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-depth' },
},
runtime: {
chainDepth: 1,
onSpawnFallbackAgent: spawn,
},
})
expect(result.enabled).toBe(true)
expect(result.matchedRuleIds).toHaveLength(0)
expect(result.actionResults).toHaveLength(0)
expect(spawn).not.toHaveBeenCalled()
})
})
describe('action dispatch skip scenarios', () => {
test('fails spawn_fallback_agent when launcher callback is missing', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'missing-launcher',
          trigger: { event: 'TaskCompleted', outcome: 'failed' },
          actions: [{ type: 'spawn_fallback_agent' }],
        },
      ],
    })
    const result = await mod.dispatchHookChainsForEvent({
      configPathOverride: configPath,
      event: {
        eventName: 'TaskCompleted',
        outcome: 'failed',
        payload: { task_id: 'task-missing-launcher' },
      },
      runtime: {},
    })
    expect(result.actionResults[0]?.status).toBe('failed')
    expect(result.actionResults[0]?.reason).toContain('launcher')
  })

  test('skips disabled action and does not execute callback', async () => {
    const mod = await importHookChainsModule()
    const configPath = await makeConfigFile({
      version: 1,
      enabled: true,
      maxChainDepth: 3,
      defaultCooldownMs: 0,
      defaultDedupWindowMs: 0,
      rules: [
        {
          id: 'disabled-action-rule',
          trigger: { event: 'TaskCompleted', outcome: 'failed' },
          actions: [
            {
              type: 'spawn_fallback_agent',
              enabled: false,
            },
          ],
        },
      ],
    })
    const spawn = mock(async () => ({ launched: true, agentId: 'agent-4' }))
    const result = await mod.dispatchHookChainsForEvent({
      configPathOverride: configPath,
      event: {
        eventName: 'TaskCompleted',
        outcome: 'failed',
        payload: { task_id: 'task-disabled' },
      },
      runtime: { onSpawnFallbackAgent: spawn },
    })
    expect(result.actionResults[0]?.status).toBe('skipped')
    expect(result.actionResults[0]?.reason).toContain('disabled')
    expect(spawn).not.toHaveBeenCalled()
  })

  test('skips warm_remote_capacity when policy denies remote sessions', async () => {
    const mod = await importHookChainsModule({ allowRemoteSessions: false })
    const configPath = await makeConfigFile({
      version: 1,
      enabled: true,
      maxChainDepth: 3,
      defaultCooldownMs: 0,
      defaultDedupWindowMs: 0,
      rules: [
        {
          id: 'policy-denied-remote-warm',
          trigger: { event: 'TaskCompleted', outcome: 'failed' },
          actions: [{ type: 'warm_remote_capacity' }],
        },
      ],
    })
    const warm = mock(async () => ({
      warmed: true,
      environmentId: 'env-123',
    }))
    const result = await mod.dispatchHookChainsForEvent({
      configPathOverride: configPath,
      event: {
        eventName: 'TaskCompleted',
        outcome: 'failed',
        payload: { task_id: 'task-policy-denied' },
      },
      runtime: { onWarmRemoteCapacity: warm },
    })
    expect(result.actionResults[0]?.status).toBe('skipped')
    expect(result.actionResults[0]?.reason).toContain('policy')
    expect(warm).not.toHaveBeenCalled()
  })
})
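The fixtures above all build the same config shape. As a standalone sketch, here is a minimal hook-chains config of the kind these tests feed to dispatchHookChainsForEvent (field names are taken directly from the test fixtures; nothing beyond them is implied about the real schema):

```typescript
// Minimal hook-chains config of the shape the tests above build via
// makeConfigFile(). Field names come straight from the test fixtures;
// treating it as JSON persisted to disk is an assumption for illustration.
const hookChainsConfig = {
  version: 1,
  enabled: true,
  maxChainDepth: 3,
  defaultCooldownMs: 0,
  defaultDedupWindowMs: 0,
  rules: [
    {
      id: 'fallback-on-failure',
      trigger: { event: 'TaskCompleted', outcome: 'failed' },
      actions: [{ type: 'spawn_fallback_agent', enabled: true }],
    },
  ],
}

// Round-trips cleanly as JSON, which is how the fixtures persist it.
const serialized = JSON.stringify(hookChainsConfig, null, 2)
```

A rule fires when an event with the matching name and outcome is dispatched; each action can be individually disabled, as the second test above verifies.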

File diff suppressed because it is too large


@@ -10,7 +10,6 @@ import { wrapSpawn } from './ShellCommand.js'
 import { TaskOutput } from './task/TaskOutput.js'
 import { getCwd } from './cwd.js'
 import { randomUUID } from 'crypto'
-import { feature } from 'bun:bundle'
 import { formatShellPrefixCommand } from './bash/shellPrefix.js'
 import {
   getHookEnvFilePath,
@@ -135,7 +134,6 @@ import { registerPendingAsyncHook } from './hooks/AsyncHookRegistry.js'
 import { enqueuePendingNotification } from './messageQueueManager.js'
 import {
   extractTextContent,
-  createAssistantMessage,
   getLastAssistantMessage,
   wrapInSystemReminder,
 } from './messages.js'
@@ -147,7 +145,6 @@
 import { createAttachmentMessage } from './attachments.js'
 import { all } from './generators.js'
 import { findToolByName, type Tools, type ToolUseContext } from '../Tool.js'
-import type { CanUseToolFn } from '../hooks/useCanUseTool.js'
 import { execPromptHook } from './hooks/execPromptHook.js'
 import type { Message, AssistantMessage } from '../types/message.js'
 import { execAgentHook } from './hooks/execAgentHook.js'
@@ -165,147 +162,9 @@ import type { AppState } from '../state/AppState.js'
 import { jsonStringify, jsonParse } from './slowOperations.js'
 import { isEnvTruthy } from './envUtils.js'
 import { errorMessage, getErrnoCode } from './errors.js'
-import { getAgentName, getTeamName, getTeammateColor } from './teammate.js'
-import type {
-  HookChainOutcome,
-  HookChainRuntimeContext,
-  SpawnFallbackAgentRequest,
-  SpawnFallbackAgentResponse,
-} from './hookChains.js'

 const TOOL_HOOK_EXECUTION_TIMEOUT_MS = 10 * 60 * 1000

-function normalizeFallbackAgentModel(
-  model: string | undefined,
-): 'sonnet' | 'opus' | 'haiku' | undefined {
-  if (model === 'sonnet' || model === 'opus' || model === 'haiku') {
-    return model
-  }
-  return undefined
-}
-
-async function launchFallbackAgentFromHookChains(
-  request: SpawnFallbackAgentRequest,
-  toolUseContext: ToolUseContext,
-  canUseTool: CanUseToolFn,
-): Promise<SpawnFallbackAgentResponse> {
-  try {
-    const { AgentTool } = await import('../tools/AgentTool/AgentTool.js')
-    const normalizedModel = normalizeFallbackAgentModel(request.model)
-    const result = await AgentTool.call(
-      {
-        prompt: request.prompt,
-        description: request.description,
-        run_in_background: true,
-        ...(request.agentType ? { subagent_type: request.agentType } : {}),
-        ...(normalizedModel ? { model: normalizedModel } : {}),
-      },
-      toolUseContext,
-      canUseTool,
-      createAssistantMessage({ content: [] }),
-    )
-    const data = result.data as
-      | {
-          status?: string
-          agentId?: string
-          agent_id?: string
-        }
-      | undefined
-    const status = data?.status
-    if (
-      status === 'async_launched' ||
-      status === 'completed' ||
-      status === 'remote_launched' ||
-      status === 'teammate_spawned'
-    ) {
-      return {
-        launched: true,
-        agentId: data?.agentId ?? data?.agent_id,
-      }
-    }
-    return {
-      launched: true,
-      reason:
-        status !== undefined
-          ? `Fallback launched with status ${status}`
-          : undefined,
-    }
-  } catch (error) {
-    return {
-      launched: false,
-      reason: `Fallback launch failed: ${errorMessage(error)}`,
-    }
-  }
-}
-
-async function dispatchHookChainFromHookRuntime(args: {
-  eventName: 'PostToolUseFailure' | 'TaskCompleted'
-  outcome: HookChainOutcome
-  payload: Record<string, unknown>
-  signal?: AbortSignal
-  toolUseContext?: ToolUseContext
-}): Promise<void> {
-  try {
-    if (!feature('HOOK_CHAINS')) {
-      return
-    }
-    const { dispatchHookChainsForEvent } = await import('./hookChains.js')
-    const runtime: HookChainRuntimeContext = {
-      signal: args.signal,
-      senderName: getAgentName() ?? undefined,
-      senderColor: getTeammateColor() ?? undefined,
-      teamName: getTeamName() ?? undefined,
-    }
-    const chainDepth = args.toolUseContext?.queryTracking?.depth
-    if (typeof chainDepth === 'number' && Number.isFinite(chainDepth)) {
-      runtime.chainDepth = chainDepth
-    }
-    const hookChainsCanUseTool = (
-      args.toolUseContext as
-        | (ToolUseContext & { hookChainsCanUseTool?: CanUseToolFn })
-        | undefined
-    )?.hookChainsCanUseTool
-    if (args.toolUseContext) {
-      runtime.onSpawnFallbackAgent = request => {
-        if (!hookChainsCanUseTool) {
-          return Promise.resolve({
-            launched: false,
-            reason:
-              'Fallback action requires canUseTool in this hook runtime context',
-          })
-        }
-        return launchFallbackAgentFromHookChains(
-          request,
-          args.toolUseContext!,
-          hookChainsCanUseTool,
-        )
-      }
-    }
-    await dispatchHookChainsForEvent({
-      event: {
-        eventName: args.eventName,
-        outcome: args.outcome,
-        payload: args.payload,
-      },
-      runtime,
-    })
-  } catch (error) {
-    logForDebugging(
-      `[hook-chains] Dispatch failed for ${args.eventName}: ${errorMessage(error)}`,
-    )
-  }
-}
-
 /**
  * SessionEnd hooks run during shutdown/clear and need a much tighter bound
  * than TOOL_HOOK_EXECUTION_TIMEOUT_MS. This value is used by callers as both
@@ -3643,11 +3502,9 @@ export async function* executePostToolUseFailureHooks<ToolInput>(
 ): AsyncGenerator<AggregatedHookResult> {
   const appState = toolUseContext.getAppState()
   const sessionId = toolUseContext.agentId ?? getSessionId()
-  const hasPostToolFailureHooks = hasHookForEvent(
-    'PostToolUseFailure',
-    appState,
-    sessionId,
-  )
+  if (!hasHookForEvent('PostToolUseFailure', appState, sessionId)) {
+    return
+  }

   const hookInput: PostToolUseFailureHookInput = {
     ...createBaseHookInput(permissionMode, undefined, toolUseContext),
@@ -3659,33 +3516,12 @@
     is_interrupt: isInterrupt,
   }
-  let blockingHookCount = 0
-
-  if (hasPostToolFailureHooks) {
-    for await (const result of executeHooks({
-      hookInput,
-      toolUseID,
-      matchQuery: toolName,
-      signal,
-      timeoutMs,
-      toolUseContext,
-    })) {
-      if (result.blockingError) {
-        blockingHookCount++
-      }
-      yield result
-    }
-  }
-  await dispatchHookChainFromHookRuntime({
-    eventName: 'PostToolUseFailure',
-    outcome: 'failed',
-    payload: {
-      ...hookInput,
-      hook_blocking_error_count: blockingHookCount,
-      hook_execution_skipped: !hasPostToolFailureHooks,
-    },
-    signal,
-    toolUseContext,
-  })
+  yield* executeHooks({
+    hookInput,
+    toolUseID,
+    matchQuery: toolName,
+    signal,
+    timeoutMs,
+    toolUseContext,
+  })
 }
@@ -3971,36 +3807,12 @@ export async function* executeTaskCompletedHooks(
     team_name: teamName,
   }
-  let blockingHookCount = 0
-  let preventedContinuation = false
-  for await (const result of executeHooks({
+  yield* executeHooks({
     hookInput,
     toolUseID: randomUUID(),
     signal,
     timeoutMs,
     toolUseContext,
-  })) {
-    if (result.blockingError) {
-      blockingHookCount++
-    }
-    if (result.preventContinuation) {
-      preventedContinuation = true
-    }
-    yield result
-  }
-  await dispatchHookChainFromHookRuntime({
-    eventName: 'TaskCompleted',
-    outcome:
-      blockingHookCount > 0 || preventedContinuation ? 'failed' : 'success',
-    payload: {
-      ...hookInput,
-      hook_blocking_error_count: blockingHookCount,
-      hook_prevented_continuation: preventedContinuation,
-    },
-    signal,
-    toolUseContext,
-  })
+  })
 }


@@ -75,7 +75,6 @@ import type {
 import { isAdvisorBlock } from './advisor.js'
 import { isAgentSwarmsEnabled } from './agentSwarmsEnabled.js'
 import { count } from './array.js'
-import { isEnvTruthy } from './envUtils.js'
 import {
   type Attachment,
   type HookAttachment,
@@ -3667,9 +3666,6 @@ Read the team config to discover your teammates' names. Check the task list peri
       ])
     }
     case 'todo_reminder': {
-      if (isEnvTruthy(process.env.OPENCLAUDE_DISABLE_TOOL_REMINDERS)) {
-        return []
-      }
       const todoItems = attachment.content
         .map((todo, index) => `${index + 1}. [${todo.status}] ${todo.content}`)
         .join('\n')
@@ -3690,9 +3686,6 @@ Read the team config to discover your teammates' names. Check the task list peri
       if (!isTodoV2Enabled()) {
        return []
       }
-      if (isEnvTruthy(process.env.OPENCLAUDE_DISABLE_TOOL_REMINDERS)) {
-        return []
-      }
       const taskItems = attachment.content
         .map(task => `#${task.id}. [${task.status}] ${task.subject}`)
         .join('\n')


@@ -1,205 +0,0 @@
/**
 * Model Benchmarking for OpenClaude
 *
 * Tests and compares model speed/quality for informed model selection.
 * Supports OpenAI-compatible, Ollama, Anthropic, Bedrock, Vertex.
 */
import { getAPIProvider } from './providers.js'

export interface BenchmarkResult {
  model: string
  provider: string
  firstTokenMs: number
  totalTokens: number
  tokensPerSecond: number
  success: boolean
  error?: string
}

const TEST_PROMPT = 'Write a short hello world in Python.'
const MAX_TOKENS = 50
const TIMEOUT_MS = 30000

function getBenchmarkEndpoint(): string | null {
  const provider = getAPIProvider()
  const baseUrl = process.env.OPENAI_BASE_URL
  // Check for Ollama (local)
  if (baseUrl?.includes('localhost:11434') || baseUrl?.includes('localhost:11435')) {
    return `${baseUrl}/chat/completions`
  }
  // OpenAI-compatible endpoints
  if (provider === 'openai' || provider === 'firstParty') {
    return `${baseUrl || 'https://api.openai.com/v1'}/chat/completions`
  }
  // NVIDIA NIM or MiniMax via OPENAI_BASE_URL
  if (baseUrl?.includes('nvidia') || baseUrl?.includes('minimax')) {
    return `${baseUrl}/chat/completions`
  }
  return null
}

function getBenchmarkAuthHeader(): string | null {
  const apiKey = process.env.OPENAI_API_KEY
  if (!apiKey) return null
  return `Bearer ${apiKey}`
}

export async function benchmarkModel(
  model: string,
  onChunk?: (text: string) => void,
): Promise<BenchmarkResult> {
  const endpoint = getBenchmarkEndpoint()
  const authHeader = getBenchmarkAuthHeader()
  if (!endpoint || !authHeader) {
    return {
      model,
      provider: getAPIProvider(),
      firstTokenMs: 0,
      totalTokens: 0,
      tokensPerSecond: 0,
      success: false,
      error: 'Benchmark not supported for this provider',
    }
  }
  const startTime = performance.now()
  let totalTokens = 0
  let firstTokenMs: number | null = null
  try {
    const response = await fetch(endpoint, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': authHeader,
      },
      body: JSON.stringify({
        model,
        messages: [{ role: 'user', content: TEST_PROMPT }],
        max_tokens: MAX_TOKENS,
        stream: true,
      }),
      signal: AbortSignal.timeout(TIMEOUT_MS),
    })
    if (!response.ok) {
      let errorMsg = `HTTP ${response.status}`
      try {
        const error = await response.json()
        errorMsg = error.error?.message || errorMsg
      } catch {
        // ignore
      }
      return {
        model,
        provider: getAPIProvider(),
        firstTokenMs: 0,
        totalTokens: 0,
        tokensPerSecond: 0,
        success: false,
        error: errorMsg,
      }
    }
    const reader = response.body?.getReader()
    if (!reader) {
      throw new Error('No response body')
    }
    const decoder = new TextDecoder()
    let buffer = ''
    while (true) {
      const { done, value } = await reader.read()
      if (done) break
      buffer += decoder.decode(value, { stream: true })
      const lines = buffer.split('\n')
      buffer = lines.pop() || ''
      for (const line of lines) {
        if (line.startsWith('data: ')) {
          const data = line.slice(6)
          if (data === '[DONE]') continue
          try {
            const json = JSON.parse(data)
            const content = json.choices?.[0]?.delta?.content
            if (content) {
              if (firstTokenMs === null) {
                firstTokenMs = performance.now() - startTime
              }
              totalTokens += content.length / 4
              onChunk?.(content)
            }
          } catch {
            // skip invalid JSON
          }
        }
      }
    }
    const totalMs = performance.now() - startTime
    const tokensPerSecond = totalMs > 0 ? (totalTokens / totalMs) * 1000 : 0
    return {
      model,
      provider: getAPIProvider(),
      firstTokenMs: firstTokenMs ?? 0,
      totalTokens,
      tokensPerSecond,
      success: true,
    }
  } catch (error) {
    return {
      model,
      provider: getAPIProvider(),
      firstTokenMs: 0,
      totalTokens: 0,
      tokensPerSecond: 0,
      success: false,
      error: error instanceof Error ? error.message : 'Unknown error',
    }
  }
}

export async function benchmarkMultipleModels(
  models: string[],
  onProgress?: (completed: number, total: number, result: BenchmarkResult) => void,
): Promise<BenchmarkResult[]> {
  const results: BenchmarkResult[] = []
  for (let i = 0; i < models.length; i++) {
    const result = await benchmarkModel(models[i])
    results.push(result)
    onProgress?.(i + 1, models.length, result)
  }
  return results
}

export function formatBenchmarkResults(results: BenchmarkResult[]): string {
  const header = 'Model'.padEnd(40) + 'TPS' + ' First Token' + ' Status'
  const divider = '-'.repeat(70)
  const rows = results
    .sort((a, b) => b.tokensPerSecond - a.tokensPerSecond)
    .map(r => {
      const name = r.model.length > 38 ? r.model.slice(0, 37) + '…' : r.model
      const tps = r.tokensPerSecond.toFixed(1).padStart(6)
      const first = r.firstTokenMs > 0 ? `${r.firstTokenMs.toFixed(0)}ms`.padStart(12) : 'N/A'.padStart(12)
      const status = r.success ? '✓' : '✗'
      return name.padEnd(40) + tps + ' ' + first + ' ' + status
    })
  return [header, divider, ...rows].join('\n')
}

export function isBenchmarkSupported(): boolean {
  const endpoint = getBenchmarkEndpoint()
  const authHeader = getBenchmarkAuthHeader()
  return endpoint !== null && authHeader !== null
}
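The streaming loop in benchmarkModel buffers partial lines, splits on newlines, and parses each `data:` payload, skipping `[DONE]`. The same parsing step pulled out as a self-contained sketch (the stream fragments below are made up for illustration):

```typescript
// Standalone sketch of the SSE line-parsing step used by benchmarkModel:
// append the chunk to the buffer, split on '\n', keep the trailing partial
// line for the next read, and pull delta content out of each 'data: ' line.
function parseSseChunk(
  buffer: string,
  chunk: string,
): { buffer: string; contents: string[] } {
  buffer += chunk
  const lines = buffer.split('\n')
  const rest = lines.pop() || ''
  const contents: string[] = []
  for (const line of lines) {
    if (!line.startsWith('data: ')) continue
    const data = line.slice(6)
    if (data === '[DONE]') continue
    try {
      const json = JSON.parse(data)
      const content = json.choices?.[0]?.delta?.content
      if (content) contents.push(content)
    } catch {
      // skip invalid JSON, as the original loop does
    }
  }
  return { buffer: rest, contents }
}

// Hypothetical fragments: a payload split across two network reads.
const step1 = parseSseChunk('', 'data: {"choices":[{"delta":{"content":"hel"}}]}\ndata: {"choi')
const step2 = parseSseChunk(step1.buffer, 'ces":[{"delta":{"content":"lo"}}]}\ndata: [DONE]\n')
// step1.contents is ['hel']; the partial line stays buffered until step2 yields ['lo'].
```

Keeping the trailing partial line in the buffer is what makes the loop safe against payloads that straddle chunk boundaries.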


@@ -20,7 +20,7 @@ export const OPENAI_MODEL_DEFAULTS = {
 // Override with GEMINI_MODEL env var.
 // ---------------------------------------------------------------------------
 export const GEMINI_MODEL_DEFAULTS = {
-  opus: 'gemini-2.5-pro', // most capable
+  opus: 'gemini-2.5-pro-preview-03-25', // most capable
   sonnet: 'gemini-2.0-flash', // balanced
   haiku: 'gemini-2.0-flash-lite', // fast & cheap
 } as const
@@ -112,7 +112,7 @@ export const CLAUDE_OPUS_4_CONFIG = {
   vertex: 'claude-opus-4@20250514',
   foundry: 'claude-opus-4',
   openai: 'gpt-4o',
-  gemini: 'gemini-2.5-pro',
+  gemini: 'gemini-2.5-pro-preview-03-25',
   github: 'github:copilot',
   codex: 'gpt-5.4',
   'nvidia-nim': 'nvidia/llama-3.1-nemotron-70b-instruct',
@@ -125,7 +125,7 @@ export const CLAUDE_OPUS_4_1_CONFIG = {
   vertex: 'claude-opus-4-1@20250805',
   foundry: 'claude-opus-4-1',
   openai: 'gpt-4o',
-  gemini: 'gemini-2.5-pro',
+  gemini: 'gemini-2.5-pro-preview-03-25',
   github: 'github:copilot',
   codex: 'gpt-5.4',
   'nvidia-nim': 'nvidia/llama-3.1-nemotron-70b-instruct',
@@ -138,7 +138,7 @@ export const CLAUDE_OPUS_4_5_CONFIG = {
   vertex: 'claude-opus-4-5@20251101',
   foundry: 'claude-opus-4-5',
   openai: 'gpt-4o',
-  gemini: 'gemini-2.5-pro',
+  gemini: 'gemini-2.5-pro-preview-03-25',
   github: 'github:copilot',
   codex: 'gpt-5.4',
   'nvidia-nim': 'nvidia/llama-3.1-nemotron-70b-instruct',
@@ -151,7 +151,7 @@ export const CLAUDE_OPUS_4_6_CONFIG = {
   vertex: 'claude-opus-4-6',
   foundry: 'claude-opus-4-6',
   openai: 'gpt-4o',
-  gemini: 'gemini-2.5-pro',
+  gemini: 'gemini-2.5-pro-preview-03-25',
   github: 'github:copilot',
   codex: 'gpt-5.4',
   'nvidia-nim': 'nvidia/llama-3.1-nemotron-70b-instruct',


@@ -1,199 +0,0 @@
import { afterEach, beforeEach, expect, mock, test } from 'bun:test'
import { saveGlobalConfig } from '../config.js'
import {
  getDefaultHaikuModel,
  getDefaultOpusModel,
  getDefaultSonnetModel,
  getSmallFastModel,
  getUserSpecifiedModelSetting,
} from './model.js'

const SAVED_ENV = {
  CLAUDE_CODE_USE_OPENAI: process.env.CLAUDE_CODE_USE_OPENAI,
  CLAUDE_CODE_USE_GEMINI: process.env.CLAUDE_CODE_USE_GEMINI,
  CLAUDE_CODE_USE_GITHUB: process.env.CLAUDE_CODE_USE_GITHUB,
  CLAUDE_CODE_USE_MISTRAL: process.env.CLAUDE_CODE_USE_MISTRAL,
  CLAUDE_CODE_USE_BEDROCK: process.env.CLAUDE_CODE_USE_BEDROCK,
  CLAUDE_CODE_USE_VERTEX: process.env.CLAUDE_CODE_USE_VERTEX,
  CLAUDE_CODE_USE_FOUNDRY: process.env.CLAUDE_CODE_USE_FOUNDRY,
  NVIDIA_NIM: process.env.NVIDIA_NIM,
  MINIMAX_API_KEY: process.env.MINIMAX_API_KEY,
  OPENAI_MODEL: process.env.OPENAI_MODEL,
  OPENAI_BASE_URL: process.env.OPENAI_BASE_URL,
  CODEX_API_KEY: process.env.CODEX_API_KEY,
  CHATGPT_ACCOUNT_ID: process.env.CHATGPT_ACCOUNT_ID,
}

function restoreEnv(key: keyof typeof SAVED_ENV): void {
  if (SAVED_ENV[key] === undefined) {
    delete process.env[key]
  } else {
    process.env[key] = SAVED_ENV[key]
  }
}

beforeEach(() => {
  // Other test files (notably modelOptions.github.test.ts) install a
  // persistent mock.module for './providers.js' that overrides getAPIProvider
  // globally. Without mock.restore() here, those overrides bleed into this
  // suite and the provider-kind branches we're testing become unreachable.
  mock.restore()
  delete process.env.CLAUDE_CODE_USE_OPENAI
  delete process.env.CLAUDE_CODE_USE_GEMINI
  delete process.env.CLAUDE_CODE_USE_GITHUB
  delete process.env.CLAUDE_CODE_USE_MISTRAL
  delete process.env.CLAUDE_CODE_USE_BEDROCK
  delete process.env.CLAUDE_CODE_USE_VERTEX
  delete process.env.CLAUDE_CODE_USE_FOUNDRY
  delete process.env.NVIDIA_NIM
  delete process.env.MINIMAX_API_KEY
  delete process.env.OPENAI_MODEL
  delete process.env.OPENAI_BASE_URL
  delete process.env.CODEX_API_KEY
  delete process.env.CHATGPT_ACCOUNT_ID
  saveGlobalConfig(current => ({
    ...current,
    model: undefined,
  }))
})

afterEach(() => {
  for (const key of Object.keys(SAVED_ENV) as Array<keyof typeof SAVED_ENV>) {
    restoreEnv(key)
  }
  saveGlobalConfig(current => ({
    ...current,
    model: undefined,
  }))
})

test('codex provider reads OPENAI_MODEL, not stale settings.model', () => {
  // Regression: switching from Moonshot (settings.model='kimi-k2.6' persisted
  // from that session) to the Codex profile. Codex profile correctly sets
  // OPENAI_MODEL=codexplan + base URL to chatgpt.com/backend-api/codex.
  // getUserSpecifiedModelSetting previously ignored env for 'codex' provider
  // and returned settings.model='kimi-k2.6', causing Codex's API to reject
  // the request: "The 'kimi-k2.6' model is not supported when using Codex".
  saveGlobalConfig(current => ({ ...current, model: 'kimi-k2.6' }))
  process.env.CLAUDE_CODE_USE_OPENAI = '1'
  process.env.OPENAI_BASE_URL = 'https://chatgpt.com/backend-api/codex'
  process.env.OPENAI_MODEL = 'codexplan'
  process.env.CODEX_API_KEY = 'codex-test'
  process.env.CHATGPT_ACCOUNT_ID = 'acct_test'
  const model = getUserSpecifiedModelSetting()
  expect(model).toBe('codexplan')
})

test('nvidia-nim provider reads OPENAI_MODEL, not stale settings.model', () => {
  saveGlobalConfig(current => ({ ...current, model: 'kimi-k2.6' }))
  process.env.NVIDIA_NIM = '1'
  process.env.CLAUDE_CODE_USE_OPENAI = '1'
  process.env.OPENAI_MODEL = 'nvidia/llama-3.1-nemotron-70b-instruct'
  const model = getUserSpecifiedModelSetting()
  expect(model).toBe('nvidia/llama-3.1-nemotron-70b-instruct')
})

test('minimax provider reads OPENAI_MODEL, not stale settings.model', () => {
  saveGlobalConfig(current => ({ ...current, model: 'kimi-k2.6' }))
  process.env.MINIMAX_API_KEY = 'minimax-test'
  process.env.CLAUDE_CODE_USE_OPENAI = '1'
  process.env.OPENAI_MODEL = 'MiniMax-M2.5'
  const model = getUserSpecifiedModelSetting()
  expect(model).toBe('MiniMax-M2.5')
})

test('openai provider still reads OPENAI_MODEL (regression guard)', () => {
  saveGlobalConfig(current => ({ ...current, model: 'stale-default' }))
  process.env.CLAUDE_CODE_USE_OPENAI = '1'
  process.env.OPENAI_MODEL = 'gpt-4o'
  const model = getUserSpecifiedModelSetting()
  expect(model).toBe('gpt-4o')
})

test('github provider still reads OPENAI_MODEL (regression guard)', () => {
  saveGlobalConfig(current => ({ ...current, model: 'stale-default' }))
  process.env.CLAUDE_CODE_USE_GITHUB = '1'
  process.env.OPENAI_MODEL = 'github:copilot'
  const model = getUserSpecifiedModelSetting()
  expect(model).toBe('github:copilot')
})

// ---------------------------------------------------------------------------
// Default model helpers — must not fall through to claude-haiku-4-5 etc. for
// OpenAI-shim providers whose endpoints don't speak Anthropic model names.
// Hitting that fallthrough caused WebFetch to hang for 60s on MiniMax/Codex
// because queryHaiku() shipped an unknown model id to the shim endpoint.
// ---------------------------------------------------------------------------
test('getSmallFastModel returns OPENAI_MODEL for MiniMax (regression: WebFetch hang)', () => {
  process.env.MINIMAX_API_KEY = 'minimax-test'
  process.env.OPENAI_MODEL = 'MiniMax-M2.5-highspeed'
  expect(getSmallFastModel()).toBe('MiniMax-M2.5-highspeed')
})

test('getSmallFastModel returns OPENAI_MODEL for Codex (regression)', () => {
  process.env.CLAUDE_CODE_USE_OPENAI = '1'
  process.env.OPENAI_BASE_URL = 'https://chatgpt.com/backend-api/codex'
  process.env.OPENAI_MODEL = 'codexspark'
  process.env.CODEX_API_KEY = 'codex-test'
  process.env.CHATGPT_ACCOUNT_ID = 'acct_test'
  expect(getSmallFastModel()).toBe('codexspark')
})

test('getSmallFastModel returns OPENAI_MODEL for NVIDIA NIM (regression)', () => {
  process.env.NVIDIA_NIM = '1'
  process.env.CLAUDE_CODE_USE_OPENAI = '1'
  process.env.OPENAI_MODEL = 'nvidia/llama-3.1-nemotron-70b-instruct'
  expect(getSmallFastModel()).toBe('nvidia/llama-3.1-nemotron-70b-instruct')
})

test('getDefaultOpusModel returns OPENAI_MODEL for MiniMax', () => {
  process.env.MINIMAX_API_KEY = 'minimax-test'
  process.env.OPENAI_MODEL = 'MiniMax-M2.7'
  expect(getDefaultOpusModel()).toBe('MiniMax-M2.7')
})

test('getDefaultSonnetModel returns OPENAI_MODEL for NVIDIA NIM', () => {
  process.env.NVIDIA_NIM = '1'
  process.env.CLAUDE_CODE_USE_OPENAI = '1'
  process.env.OPENAI_MODEL = 'nvidia/llama-3.1-nemotron-70b-instruct'
  expect(getDefaultSonnetModel()).toBe('nvidia/llama-3.1-nemotron-70b-instruct')
})

test('getDefaultHaikuModel returns OPENAI_MODEL for MiniMax', () => {
  process.env.MINIMAX_API_KEY = 'minimax-test'
  process.env.OPENAI_MODEL = 'MiniMax-M2.5-highspeed'
  expect(getDefaultHaikuModel()).toBe('MiniMax-M2.5-highspeed')
})

test('default helpers do not leak claude-* names to shim providers', () => {
  // Umbrella guard: for each OpenAI-shim provider, none of the default-model
  // helpers may return an Anthropic-branded model name. That was the source
  // of the WebFetch 60s hang — MiniMax received "claude-haiku-4-5" and sat
  // on the connection.
  process.env.MINIMAX_API_KEY = 'minimax-test'
  process.env.OPENAI_MODEL = 'MiniMax-M2.7'
  for (const fn of [
    getSmallFastModel,
    getDefaultOpusModel,
    getDefaultSonnetModel,
    getDefaultHaikuModel,
  ]) {
    const model = fn()
    expect(model.toLowerCase()).not.toContain('claude')
  }
})


@@ -52,25 +52,10 @@ export function getSmallFastModel(): ModelName {
   if (getAPIProvider() === 'openai') {
     return process.env.OPENAI_MODEL || 'gpt-4o-mini'
   }
-  // Codex provider — OPENAI_MODEL is always set for Codex profiles; only fall
-  // back to a codex-spark alias when an override env strips it.
-  if (getAPIProvider() === 'codex') {
-    return process.env.OPENAI_MODEL || 'codexspark'
-  }
   // For GitHub Copilot provider
   if (getAPIProvider() === 'github') {
     return process.env.OPENAI_MODEL || 'github:copilot'
   }
-  // NVIDIA NIM — OPENAI_MODEL carries the user's active NIM model; use a
-  // small Meta Llama variant as the conservative fallback.
-  if (getAPIProvider() === 'nvidia-nim') {
-    return process.env.OPENAI_MODEL || 'meta/llama-3.1-8b-instruct'
-  }
-  // MiniMax — OPENAI_MODEL carries the active MiniMax model; fall back to
-  // the fastest tier (M2.5-highspeed) when missing.
-  if (getAPIProvider() === 'minimax') {
-    return process.env.OPENAI_MODEL || 'MiniMax-M2.5-highspeed'
-  }
   return getDefaultHaikuModel()
 }
@@ -106,24 +91,11 @@ export function getUserSpecifiedModelSetting(): ModelSetting | undefined {
   const setting = normalizeModelSetting(settings.model)
   // Read the model env var that matches the active provider to prevent
   // cross-provider leaks (e.g. ANTHROPIC_MODEL sent to the OpenAI API).
-  //
-  // All OpenAI-shim providers (openai, codex, github, nvidia-nim, minimax)
-  // set CLAUDE_CODE_USE_OPENAI=1 + OPENAI_MODEL via
-  // applyProviderProfileToProcessEnv. Earlier this check only included
-  // openai/github — codex/nvidia-nim/minimax fell through to the stale
-  // settings.model, so switching from (say) Moonshot to Codex kept firing
-  // `kimi-k2.6` at the Codex endpoint and getting 400s.
   const provider = getAPIProvider()
-  const isOpenAIShimProvider =
-    provider === 'openai' ||
-    provider === 'codex' ||
-    provider === 'github' ||
-    provider === 'nvidia-nim' ||
-    provider === 'minimax'
   specifiedModel =
     (provider === 'gemini' ? process.env.GEMINI_MODEL : undefined) ||
     (provider === 'mistral' ? process.env.MISTRAL_MODEL : undefined) ||
-    (isOpenAIShimProvider ? process.env.OPENAI_MODEL : undefined) ||
+    (provider === 'openai' || provider === 'gemini' || provider === 'mistral' || provider === 'github' ? process.env.OPENAI_MODEL : undefined) ||
     (provider === 'firstParty' ? process.env.ANTHROPIC_MODEL : undefined) ||
     setting ||
     undefined
@@ -168,7 +140,7 @@
   }
   // Gemini provider
   if (getAPIProvider() === 'gemini') {
-    return process.env.GEMINI_MODEL || 'gemini-2.5-pro'
+    return process.env.GEMINI_MODEL || 'gemini-2.5-pro-preview-03-25'
   }
   // Mistral provider
   if (getAPIProvider() === 'mistral') {
@@ -186,14 +158,6 @@
   if (getAPIProvider() === 'github') {
     return process.env.OPENAI_MODEL || 'github:copilot'
   }
-  // NVIDIA NIM
-  if (getAPIProvider() === 'nvidia-nim') {
-    return process.env.OPENAI_MODEL || 'nvidia/llama-3.1-nemotron-70b-instruct'
-  }
-  // MiniMax — flagship tier for "opus"-equivalent.
-  if (getAPIProvider() === 'minimax') {
-    return process.env.OPENAI_MODEL || 'MiniMax-M2.7'
-  }
   // 3P providers (Bedrock, Vertex, Foundry) — kept as a separate branch
   // even when values match, since 3P availability lags firstParty and
   // these will diverge again at the next model launch.
@@ -228,14 +192,6 @@
   if (getAPIProvider() === 'github') {
     return process.env.OPENAI_MODEL || 'github:copilot'
   }
-  // NVIDIA NIM
-  if (getAPIProvider() === 'nvidia-nim') {
-    return process.env.OPENAI_MODEL || 'nvidia/llama-3.1-nemotron-70b-instruct'
-  }
-  // MiniMax — mid tier for "sonnet"-equivalent.
-  if (getAPIProvider() === 'minimax') {
-    return process.env.OPENAI_MODEL || 'MiniMax-M2.5'
-  }
   // Default to Sonnet 4.5 for 3P since they may not have 4.6 yet
   if (getAPIProvider() !== 'firstParty') {
     return getModelStrings().sonnet45
@@ -268,14 +224,6 @@
   if (getAPIProvider() === 'gemini') {
     return process.env.GEMINI_MODEL || 'gemini-2.0-flash-lite'
   }
-  // NVIDIA NIM
-  if (getAPIProvider() === 'nvidia-nim') {
-    return process.env.OPENAI_MODEL || 'meta/llama-3.1-8b-instruct'
-  }
-  // MiniMax — fastest tier for "haiku"-equivalent.
-  if (getAPIProvider() === 'minimax') {
-    return process.env.OPENAI_MODEL || 'MiniMax-M2.5-highspeed'
-  }
   // Haiku 4.5 is available on all platforms (first-party, Foundry, Bedrock, Vertex)
   return getModelStrings().haiku45
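The precedence that getUserSpecifiedModelSetting implements, provider-matched env var first and the persisted settings.model last, can be sketched as a pure function. pickModel below is an illustrative helper, not part of the codebase, and the provider list is trimmed to the branches visible in the hunk above:

```typescript
// Illustrative sketch of the provider-matched precedence in
// getUserSpecifiedModelSetting: only the env var belonging to the active
// provider is consulted, so a stale persisted model from a previous
// provider session cannot leak into requests to a different endpoint.
type Provider = 'gemini' | 'mistral' | 'openai' | 'github' | 'firstParty'

function pickModel(
  provider: Provider,
  env: Record<string, string | undefined>,
  settingsModel: string | undefined,
): string | undefined {
  return (
    (provider === 'gemini' ? env.GEMINI_MODEL : undefined) ||
    (provider === 'mistral' ? env.MISTRAL_MODEL : undefined) ||
    (provider === 'openai' || provider === 'github' ? env.OPENAI_MODEL : undefined) ||
    (provider === 'firstParty' ? env.ANTHROPIC_MODEL : undefined) ||
    settingsModel ||
    undefined
  )
}

// The provider's own env var wins over the persisted setting:
const picked = pickModel('openai', { OPENAI_MODEL: 'gpt-4o' }, 'kimi-k2.6') // 'gpt-4o'
```

When no provider-matched env var is set, the persisted setting is used as the fallback, which is the behavior the regression tests earlier in this compare exercise.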


@@ -1,30 +0,0 @@
import { describe, expect, it, beforeEach, afterEach, vi } from 'bun:test'
import { isModelCacheValid, getCachedModelsFromDisk, saveModelsToCache } from '../model/modelCache.js'

vi.mock('../model/ollamaModels.js', () => ({
  isOllamaProvider: vi.fn(() => true),
}))

describe('modelCache', () => {
  const mockModel = { value: 'llama3', label: 'Llama 3', description: 'Test model' }

  describe('isModelCacheValid', () => {
    it('returns false for non-existent cache', async () => {
      const result = await isModelCacheValid('ollama')
      expect(result).toBe(false)
    })
  })

  describe('getCachedModelsFromDisk', () => {
    it('returns null when no cache is available', async () => {
      const result = await getCachedModelsFromDisk()
      expect(result).toBeNull()
    })
  })

  describe('saveModelsToCache', () => {
    it('has saveModelsToCache function', () => {
      expect(typeof saveModelsToCache).toBe('function')
    })
  })
})


@@ -1,165 +0,0 @@
/**
* Model Caching for OpenClaude
*
* Caches model lists to disk for faster startup and offline access.
* Uses async fs operations to avoid blocking the event loop.
*/
import { access, readFile, writeFile, unlink } from 'node:fs/promises'
import { existsSync, mkdirSync } from 'node:fs'
import { join } from 'node:path'
import { homedir } from 'node:os'
import { getAPIProvider } from './providers.js'
const CACHE_VERSION = '1'
const CACHE_TTL_HOURS = 24
const CACHE_DIR_NAME = '.openclaude-model-cache'
interface ModelCache {
version: string
timestamp: number
provider: string
models: Array<{ value: string; label: string; description: string }>
}
function getCacheDir(): string {
const home = homedir()
const cacheDir = join(home, CACHE_DIR_NAME)
if (!existsSync(cacheDir)) {
// Must be synchronous: this helper is sync, so the fs/promises mkdir
// returned an unawaited promise and callers could race directory creation.
mkdirSync(cacheDir, { recursive: true })
}
return cacheDir
}
function getCacheFilePath(provider: string): string {
return join(getCacheDir(), `${provider}.json`)
}
function isOpenAICompatibleProvider(): boolean {
const baseUrl = process.env.OPENAI_BASE_URL || ''
return baseUrl.includes('localhost') || baseUrl.includes('nvidia') || baseUrl.includes('minimax') || getAPIProvider() === 'openai'
}
export async function isModelCacheValid(provider: string): Promise<boolean> {
const cachePath = getCacheFilePath(provider)
try {
await access(cachePath)
} catch {
return false
}
try {
const data = JSON.parse(await readFile(cachePath, 'utf-8')) as ModelCache
if (data.version !== CACHE_VERSION) {
return false
}
if (data.provider !== provider) {
return false
}
const ageHours = (Date.now() - data.timestamp) / (1000 * 60 * 60)
return ageHours < CACHE_TTL_HOURS
} catch {
return false
}
}
export async function getCachedModelsFromDisk<T>(): Promise<T[] | null> {
const provider = getAPIProvider()
const baseUrl = process.env.OPENAI_BASE_URL || ''
const isLocalOllama = baseUrl.includes('localhost:11434') || baseUrl.includes('localhost:11435')
const isNvidia = baseUrl.includes('nvidia') || baseUrl.includes('integrate.api.nvidia')
const isMiniMax = baseUrl.includes('minimax')
if (!isLocalOllama && !isNvidia && !isMiniMax && provider !== 'openai') {
return null
}
const cachePath = getCacheFilePath(provider)
if (!(await isModelCacheValid(provider))) {
return null
}
try {
const data = JSON.parse(await readFile(cachePath, 'utf-8')) as ModelCache
return data.models as T[]
} catch {
return null
}
}
export async function saveModelsToCache(
models: Array<{ value: string; label: string; description: string }>,
): Promise<void> {
const provider = getAPIProvider()
if (!provider) return
const cachePath = getCacheFilePath(provider)
const cacheData: ModelCache = {
version: CACHE_VERSION,
timestamp: Date.now(),
provider,
models,
}
try {
await writeFile(cachePath, JSON.stringify(cacheData, null, 2), 'utf-8')
} catch (error) {
console.warn('[ModelCache] Failed to save cache:', error)
}
}
export async function clearModelCache(provider?: string): Promise<void> {
if (provider) {
const cachePath = getCacheFilePath(provider)
try {
await unlink(cachePath)
} catch {
// ignore if doesn't exist
}
} else {
const cacheDir = getCacheDir()
try {
await unlink(join(cacheDir, 'ollama.json'))
await unlink(join(cacheDir, 'nvidia-nim.json'))
await unlink(join(cacheDir, 'minimax.json'))
} catch {
// ignore
}
}
}
export async function getModelCacheInfo(): Promise<{ provider: string; age: string } | null> {
const provider = getAPIProvider()
const cachePath = getCacheFilePath(provider)
try {
await access(cachePath)
} catch {
return null
}
try {
const data = JSON.parse(await readFile(cachePath, 'utf-8')) as ModelCache
const ageMs = Date.now() - data.timestamp
const ageHours = Math.floor(ageMs / (1000 * 60 * 60))
const ageMins = Math.floor((ageMs % (1000 * 60 * 60)) / (1000 * 60))
return {
provider: data.provider,
age: ageHours > 0 ? `${ageHours}h ${ageMins}m` : `${ageMins}m`,
}
} catch {
return null
}
}
export function isCacheAvailable(): boolean {
const baseUrl = process.env.OPENAI_BASE_URL || ''
const isLocalOllama = baseUrl.includes('localhost:11434') || baseUrl.includes('localhost:11435')
const isNvidia = baseUrl.includes('nvidia') || baseUrl.includes('integrate.api.nvidia')
const isMiniMax = baseUrl.includes('minimax')
return isLocalOllama || isNvidia || isMiniMax || getAPIProvider() === 'openai'
}
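The cache-validity rule above (matching version and provider, plus a 24-hour TTL) can be sketched in isolation. `CacheEntry`, `isEntryValid`, and the constants below are illustrative stand-ins for this sketch, not the module's real exports:

```typescript
// Minimal sketch of the validation in isModelCacheValid(): an entry is
// valid only if its version and provider match and it is younger than
// the TTL. Timestamps are epoch milliseconds, as in the cache file.
interface CacheEntry {
  version: string
  provider: string
  timestamp: number
}

const VERSION = '1'
const TTL_HOURS = 24

function isEntryValid(entry: CacheEntry, provider: string, now: number = Date.now()): boolean {
  if (entry.version !== VERSION) return false
  if (entry.provider !== provider) return false
  const ageHours = (now - entry.timestamp) / (1000 * 60 * 60)
  return ageHours < TTL_HOURS
}
```

Passing `now` explicitly keeps the age check deterministic under test, which the on-disk implementation gets implicitly from its `Date.now()` call.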

View File

@@ -219,17 +219,6 @@ const OPENAI_CONTEXT_WINDOWS: Record<string, number> = {
'kimi-k2.5': 262_144,
'glm-5': 202_752,
'glm-4.7': 202_752,
// Moonshot AI direct API (api.moonshot.ai/v1). Values from Moonshot's
// published model card — all K2 tier share 256K context. Prefix matching
// in lookupByKey catches variants like "kimi-k2.6-preview".
'kimi-k2.6': 262_144,
'kimi-k2': 131_072,
'kimi-k2-instruct': 131_072,
'kimi-k2-thinking': 262_144,
'moonshot-v1-8k': 8_192,
'moonshot-v1-32k': 32_768,
'moonshot-v1-128k': 131_072,
}
/**
@@ -402,62 +391,18 @@ const OPENAI_MAX_OUTPUT_TOKENS: Record<string, number> = {
'kimi-k2.5': 32_768,
'glm-5': 16_384,
'glm-4.7': 16_384,
// Moonshot AI direct API
'kimi-k2.6': 32_768,
'kimi-k2': 32_768,
'kimi-k2-instruct': 32_768,
'kimi-k2-thinking': 32_768,
'moonshot-v1-8k': 4_096,
'moonshot-v1-32k': 16_384,
'moonshot-v1-128k': 32_768,
}
function lookupByModel<T>(table: Record<string, T>, model: string): T | undefined {
// External context-window overrides loaded once at startup.
// Set CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS to a JSON object mapping model name
// → context-window token count to add or override entries without editing
// this file. Example:
// CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS='{"my-corp/llm-v2":200000}'
const OPENAI_EXTERNAL_CONTEXT_WINDOWS: Record<string, number> = (() => {
try {
const raw = process.env.CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS
if (raw) {
const parsed = JSON.parse(raw)
if (typeof parsed === 'object' && parsed !== null) return parsed as Record<string, number>
}
} catch { /* ignore malformed JSON */ }
return {}
})()
// External max-output-token overrides.
// Set CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS to a JSON object mapping model name
// → max output token count.
const OPENAI_EXTERNAL_MAX_OUTPUT_TOKENS: Record<string, number> = (() => {
try {
const raw = process.env.CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS
if (raw) {
const parsed = JSON.parse(raw)
if (typeof parsed === 'object' && parsed !== null) return parsed as Record<string, number>
}
} catch { /* ignore malformed JSON */ }
return {}
})()
function lookupByModel<T>(table: Record<string, T>, externalTable: Record<string, T>, model: string): T | undefined {
// Try provider-qualified key first: "{OPENAI_MODEL}:{model}" so that
// e.g. "github:copilot:claude-haiku-4.5" can have different limits than
// a bare "claude-haiku-4.5" served by another provider.
const providerModel = process.env.OPENAI_MODEL?.trim()
if (providerModel && providerModel !== model) {
const qualified = `${providerModel}:${model}`
// External table takes precedence over the built-in table.
const externalQualified = lookupByKey(externalTable, qualified)
if (externalQualified !== undefined) return externalQualified
const qualifiedResult = lookupByKey(table, qualified)
if (qualifiedResult !== undefined) return qualifiedResult
}
const externalResult = lookupByKey(externalTable, model)
if (externalResult !== undefined) return externalResult
return lookupByKey(table, model)
}
@@ -481,7 +426,7 @@ function lookupByKey<T>(table: Record<string, T>, model: string): T | undefined
* "gpt-4o-2024-11-20" resolve to the base "gpt-4o" entry.
*/
export function getOpenAIContextWindow(model: string): number | undefined {
return lookupByModel(OPENAI_CONTEXT_WINDOWS, OPENAI_EXTERNAL_CONTEXT_WINDOWS, model)
return lookupByModel(OPENAI_CONTEXT_WINDOWS, model)
}
/**
@@ -489,5 +434,5 @@ export function getOpenAIContextWindow(model: string): number | undefined {
* Returns undefined if the model is not in the table.
*/
export function getOpenAIMaxOutputTokens(model: string): number | undefined {
return lookupByModel(OPENAI_MAX_OUTPUT_TOKENS, OPENAI_EXTERNAL_MAX_OUTPUT_TOKENS, model)
return lookupByModel(OPENAI_MAX_OUTPUT_TOKENS, model)
}
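The removed override mechanism parsed a JSON object out of an env var at startup, silently ignoring malformed input so a bad override could never break launch. A self-contained sketch of that pattern (`parseEnvOverrides` is an illustrative name, not the module's):

```typescript
// Parse a JSON object of model-name → token-count overrides from an
// env var's raw value. Malformed JSON, non-objects, or an unset var
// all yield an empty table rather than throwing.
function parseEnvOverrides(raw: string | undefined): Record<string, number> {
  try {
    if (raw) {
      const parsed = JSON.parse(raw)
      if (typeof parsed === 'object' && parsed !== null) {
        return parsed as Record<string, number>
      }
    }
  } catch {
    // ignore malformed JSON
  }
  return {}
}
```

Usage mirrors the removed code: `parseEnvOverrides(process.env.CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS)` evaluated once into a module-level constant.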

View File

@@ -19,12 +19,7 @@ export function getAPIProvider(): APIProvider {
if (isEnvTruthy(process.env.NVIDIA_NIM)) {
return 'nvidia-nim'
}
// MiniMax is signalled by a real API key, not a '1'/'true' flag. Using
// isEnvTruthy() here silently treated every MiniMax user as 'firstParty'
// (or 'openai' once they set CLAUDE_CODE_USE_OPENAI via the profile),
// making every provider-kind-specific branch for 'minimax' elsewhere in
// the codebase unreachable. Presence check is the correct signal.
if (typeof process.env.MINIMAX_API_KEY === 'string' && process.env.MINIMAX_API_KEY.trim() !== '') {
if (isEnvTruthy(process.env.MINIMAX_API_KEY)) {
return 'minimax'
}
return isEnvTruthy(process.env.CLAUDE_CODE_USE_GEMINI)
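The distinction the comment above draws, flag vars checked for truthiness versus key vars checked for non-empty presence, reduces to two small predicates. These are illustrative sketches under the assumption that the real `isEnvTruthy` accepts at least `'1'`/`'true'`; they are not the module's exports:

```typescript
// Flag vars (e.g. NVIDIA_NIM) opt in with '1'/'true'. Key vars
// (e.g. MINIMAX_API_KEY) signal a provider by mere non-empty presence.
// Treating a key var as a flag makes a real key like 'mm-abc123' read
// as false, which is exactly the unreachable-branch bug fixed above.
function isEnvTruthy(value: string | undefined): boolean {
  if (typeof value !== 'string') return false
  const v = value.trim().toLowerCase()
  return v === '1' || v === 'true'
}

function isEnvPresent(value: string | undefined): boolean {
  return typeof value === 'string' && value.trim() !== ''
}
```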

View File

@@ -1,299 +0,0 @@
import { describe, expect, test } from 'bun:test'
import {
detectBestProvider,
detectLocalService,
detectProviderFromEnv,
} from './providerAutoDetect.ts'
// Hermetic env scan: always report "no Codex auth on disk" so tests don't
// depend on the dev machine's ~/.codex/auth.json state.
function scan(env: Record<string, string | undefined>) {
return detectProviderFromEnv({ env, hasCodexAuth: () => false })
}
describe('detectProviderFromEnv — priority order', () => {
test('ANTHROPIC_API_KEY wins over all others', () => {
expect(
scan({
ANTHROPIC_API_KEY: 'sk-ant-x',
OPENAI_API_KEY: 'sk-x',
GEMINI_API_KEY: 'gem-x',
}),
).toEqual({ kind: 'anthropic', source: 'ANTHROPIC_API_KEY set' })
})
test('CODEX_API_KEY beats OpenAI/Gemini/etc', () => {
expect(
scan({
CODEX_API_KEY: 'codex-x',
OPENAI_API_KEY: 'sk-x',
}),
).toEqual({ kind: 'codex', source: 'CODEX_API_KEY set' })
})
test('CHATGPT_ACCOUNT_ID alone is enough for Codex', () => {
expect(
scan({
CHATGPT_ACCOUNT_ID: 'acct-123',
}),
).toEqual({ kind: 'codex', source: 'CHATGPT_ACCOUNT_ID set' })
})
test('Codex auth file on disk is detected without any env', () => {
expect(
detectProviderFromEnv({ env: {}, hasCodexAuth: () => true }),
).toEqual({ kind: 'codex', source: '~/.codex/auth.json present' })
})
test('GITHUB_TOKEN wins over OpenAI', () => {
expect(
scan({
GITHUB_TOKEN: 'ghp-x',
OPENAI_API_KEY: 'sk-x',
}),
).toEqual({ kind: 'github', source: 'GITHUB_TOKEN set (GitHub Copilot)' })
})
test('GH_TOKEN is equivalent to GITHUB_TOKEN', () => {
expect(
scan({
GH_TOKEN: 'ghp-x',
}),
).toEqual({ kind: 'github', source: 'GH_TOKEN set (GitHub Copilot)' })
})
test('OPENAI_API_KEYS (plural) detected', () => {
expect(
scan({
OPENAI_API_KEYS: 'sk-a,sk-b',
}),
).toEqual({ kind: 'openai', source: 'OPENAI_API_KEYS set' })
})
test('OPENAI_API_KEY reports baseUrl when set', () => {
expect(
scan({
OPENAI_API_KEY: 'sk-x',
OPENAI_BASE_URL: 'https://openrouter.ai/api/v1',
}),
).toEqual({
kind: 'openai',
source: 'OPENAI_API_KEY set',
baseUrl: 'https://openrouter.ai/api/v1',
})
})
test('GEMINI_API_KEY detected', () => {
expect(scan({ GEMINI_API_KEY: 'gem-x' })).toEqual({
kind: 'gemini',
source: 'GEMINI_API_KEY set',
})
})
test('GOOGLE_API_KEY also detects Gemini', () => {
expect(scan({ GOOGLE_API_KEY: 'gk-x' })).toEqual({
kind: 'gemini',
source: 'GOOGLE_API_KEY set',
})
})
test('MISTRAL_API_KEY detected', () => {
expect(scan({ MISTRAL_API_KEY: 'mis-x' })).toEqual({
kind: 'mistral',
source: 'MISTRAL_API_KEY set',
})
})
test('MINIMAX_API_KEY detected', () => {
expect(scan({ MINIMAX_API_KEY: 'mm-x' })).toEqual({
kind: 'minimax',
source: 'MINIMAX_API_KEY set',
})
})
test('empty-string values are ignored', () => {
expect(
scan({
ANTHROPIC_API_KEY: '',
OPENAI_API_KEY: ' ',
GEMINI_API_KEY: 'gem-x',
}),
).toEqual({ kind: 'gemini', source: 'GEMINI_API_KEY set' })
})
test('no credentials → null', () => {
expect(scan({})).toBeNull()
})
})
describe('detectLocalService', () => {
test('returns Ollama when its /api/tags responds ok', async () => {
const fetchImpl = (async (input: URL | RequestInfo) => {
const url = typeof input === 'string' ? input : (input as URL).toString()
if (url.includes(':11434')) {
return new Response('{"models":[]}', { status: 200 })
}
return new Response('', { status: 404 })
}) as typeof fetch
const result = await detectLocalService({
env: {},
fetchImpl,
timeoutMs: 200,
})
expect(result?.kind).toBe('ollama')
expect(result?.baseUrl).toBe('http://localhost:11434')
})
test('Ollama wins over LM Studio even when both are reachable', async () => {
const fetchImpl = (async () => new Response('{}', { status: 200 })) as typeof fetch
const result = await detectLocalService({
env: {},
fetchImpl,
timeoutMs: 200,
})
expect(result?.kind).toBe('ollama')
})
test('falls back to LM Studio when Ollama is unreachable', async () => {
const fetchImpl = (async (input: URL | RequestInfo) => {
const url = typeof input === 'string' ? input : (input as URL).toString()
if (url.includes(':1234')) {
return new Response('{"data":[]}', { status: 200 })
}
return new Response('', { status: 404 })
}) as typeof fetch
const result = await detectLocalService({
env: {},
fetchImpl,
timeoutMs: 200,
})
expect(result?.kind).toBe('lm-studio')
expect(result?.baseUrl).toBe('http://localhost:1234')
})
test('returns null when no local services respond', async () => {
const fetchImpl = (async () =>
new Response('', { status: 500 })) as typeof fetch
const result = await detectLocalService({
env: {},
fetchImpl,
timeoutMs: 200,
})
expect(result).toBeNull()
})
test('honors OLLAMA_BASE_URL override', async () => {
const probedUrls: string[] = []
const fetchImpl = (async (input: URL | RequestInfo) => {
const url = typeof input === 'string' ? input : (input as URL).toString()
probedUrls.push(url)
return new Response('{"models":[]}', { status: 200 })
}) as typeof fetch
const result = await detectLocalService({
env: { OLLAMA_BASE_URL: 'http://10.0.0.5:11434' },
fetchImpl,
timeoutMs: 200,
})
expect(result?.baseUrl).toBe('http://10.0.0.5:11434')
expect(probedUrls).toContain('http://10.0.0.5:11434/api/tags')
})
test('probe timeout does not throw — returns null', async () => {
const fetchImpl = (async (_input: URL | RequestInfo, init?: RequestInit) => {
// Respect the caller's abort signal so the race with timeoutMs is fair.
return new Promise<Response>((resolve, reject) => {
const onAbort = () => reject(new Error('aborted'))
init?.signal?.addEventListener('abort', onAbort)
setTimeout(() => {
init?.signal?.removeEventListener('abort', onAbort)
resolve(new Response('ok'))
}, 500)
})
}) as typeof fetch
const result = await detectLocalService({
env: {},
fetchImpl,
timeoutMs: 50,
})
expect(result).toBeNull()
})
test('network errors do not throw', async () => {
const fetchImpl = (async () => {
throw new Error('ECONNREFUSED')
}) as typeof fetch
const result = await detectLocalService({
env: {},
fetchImpl,
timeoutMs: 200,
})
expect(result).toBeNull()
})
})
describe('detectBestProvider — orchestrator', () => {
test('env match short-circuits the local probe', async () => {
let probeCalled = false
const fetchImpl = (async () => {
probeCalled = true
return new Response('{}', { status: 200 })
}) as typeof fetch
const result = await detectBestProvider({
env: { ANTHROPIC_API_KEY: 'sk-ant' },
fetchImpl,
timeoutMs: 200,
hasCodexAuth: () => false,
})
expect(result?.kind).toBe('anthropic')
expect(probeCalled).toBe(false)
})
test('env miss falls through to local-service probe', async () => {
const fetchImpl = (async () => new Response('{}', { status: 200 })) as typeof fetch
const result = await detectBestProvider({
env: {},
fetchImpl,
timeoutMs: 200,
hasCodexAuth: () => false,
})
expect(result?.kind).toBe('ollama')
})
test('skipLocal prevents network probes', async () => {
let probeCalled = false
const fetchImpl = (async () => {
probeCalled = true
return new Response('{}', { status: 200 })
}) as typeof fetch
const result = await detectBestProvider({
env: {},
fetchImpl,
skipLocal: true,
hasCodexAuth: () => false,
})
expect(result).toBeNull()
expect(probeCalled).toBe(false)
})
test('completely empty environment returns null', async () => {
const fetchImpl = (async () => {
throw new Error('nothing reachable')
}) as typeof fetch
const result = await detectBestProvider({
env: {},
fetchImpl,
timeoutMs: 100,
hasCodexAuth: () => false,
})
expect(result).toBeNull()
})
})

View File

@@ -1,283 +0,0 @@
/**
* Zero-config provider autodetection.
*
* Scans the environment (API keys, OAuth tokens, stored credentials) and local
* network (Ollama, LM Studio) to pick the best provider for first-run users
* who have not explicitly configured one. Returns a structured detection
* result that callers can consume to build a launch-ready profile env, or
* null when nothing is detected — in which case the existing onboarding /
* picker flow should take over.
*
* Detection priority (first match wins):
* 1. ANTHROPIC_API_KEY → first-party Claude (most capable default)
* 2. Codex: CODEX_API_KEY, CHATGPT_ACCOUNT_ID, or valid ~/.codex/auth.json
* 3. GitHub Copilot: GITHUB_TOKEN or GH_TOKEN
* 4. OPENAI_API_KEY / OPENAI_API_KEYS
* 5. GEMINI_API_KEY or GOOGLE_API_KEY
* 6. MISTRAL_API_KEY
* 7. MINIMAX_API_KEY
* 8. Local Ollama reachable (default localhost:11434)
* 9. Local LM Studio reachable (default localhost:1234)
*
* Local-service probes are parallelized and cheap (short timeout, no
* request body). Env scans are synchronous and run first so we don't make
* network calls when a credential is already present.
*
* This module intentionally does NOT decide whether to apply the detection;
* callers should gate on hasExplicitProviderSelection() (providerProfile.ts)
* and the presence of a persisted profile file.
*/
import { existsSync } from 'fs'
import { homedir } from 'os'
import { join } from 'path'
export type DetectedProviderKind =
| 'anthropic'
| 'codex'
| 'github'
| 'openai'
| 'gemini'
| 'mistral'
| 'minimax'
| 'ollama'
| 'lm-studio'
export type DetectedProvider = {
kind: DetectedProviderKind
/** One-line human-readable reason, e.g. "ANTHROPIC_API_KEY set". */
source: string
/** Present when the detection already resolved a usable base URL. */
baseUrl?: string
/** Present when detection also narrowed down a specific model. */
model?: string
}
type EnvLike = NodeJS.ProcessEnv | Record<string, string | undefined>
function envHasNonEmpty(env: EnvLike, key: string): boolean {
const value = env[key]
return typeof value === 'string' && value.trim().length > 0
}
function firstSet(env: EnvLike, keys: readonly string[]): string | undefined {
for (const key of keys) {
if (envHasNonEmpty(env, key)) return key
}
return undefined
}
function defaultHasCodexAuthFile(): boolean {
const paths = [
process.env.CODEX_AUTH_PATH,
join(homedir(), '.codex', 'auth.json'),
]
return paths.some(p => p && existsSync(p))
}
export type DetectProviderFromEnvOptions = {
env?: EnvLike
/**
* Override Codex auth-file detection. Primarily for tests — the default
* implementation checks ~/.codex/auth.json and CODEX_AUTH_PATH on disk.
*/
hasCodexAuth?: () => boolean
}
/**
* Synchronous env-only scan. Returns the highest-priority env-provided
* provider, or null if nothing is present. Intentionally does not touch
* the network — fast path for the common case where a user has exported
* one of the standard API-key env vars.
*/
function isOptionsObject(
value: EnvLike | DetectProviderFromEnvOptions | undefined,
): value is DetectProviderFromEnvOptions {
if (!value || typeof value !== 'object') return false
if ('hasCodexAuth' in value && typeof value.hasCodexAuth === 'function') {
return true
}
if ('env' in value && typeof (value as { env?: unknown }).env === 'object') {
return true
}
return false
}
export function detectProviderFromEnv(
envOrOptions: EnvLike | DetectProviderFromEnvOptions = process.env,
): DetectedProvider | null {
const options: DetectProviderFromEnvOptions = isOptionsObject(envOrOptions)
? envOrOptions
: { env: envOrOptions as EnvLike }
const env = options.env ?? process.env
const hasCodexAuth = options.hasCodexAuth ?? defaultHasCodexAuthFile
if (envHasNonEmpty(env, 'ANTHROPIC_API_KEY')) {
return { kind: 'anthropic', source: 'ANTHROPIC_API_KEY set' }
}
if (
envHasNonEmpty(env, 'CODEX_API_KEY') ||
envHasNonEmpty(env, 'CHATGPT_ACCOUNT_ID') ||
envHasNonEmpty(env, 'CODEX_ACCOUNT_ID') ||
hasCodexAuth()
) {
const sourceEnv =
firstSet(env, ['CODEX_API_KEY', 'CHATGPT_ACCOUNT_ID', 'CODEX_ACCOUNT_ID'])
return {
kind: 'codex',
source: sourceEnv ? `${sourceEnv} set` : '~/.codex/auth.json present',
}
}
const githubKey = firstSet(env, ['GITHUB_TOKEN', 'GH_TOKEN'])
if (githubKey) {
return {
kind: 'github',
source: `${githubKey} set (GitHub Copilot)`,
}
}
const openaiKey = firstSet(env, ['OPENAI_API_KEYS', 'OPENAI_API_KEY'])
if (openaiKey) {
return {
kind: 'openai',
source: `${openaiKey} set`,
baseUrl: env.OPENAI_BASE_URL ?? env.OPENAI_API_BASE,
}
}
const geminiKey = firstSet(env, ['GEMINI_API_KEY', 'GOOGLE_API_KEY'])
if (geminiKey) {
return { kind: 'gemini', source: `${geminiKey} set` }
}
if (envHasNonEmpty(env, 'MISTRAL_API_KEY')) {
return { kind: 'mistral', source: 'MISTRAL_API_KEY set' }
}
if (envHasNonEmpty(env, 'MINIMAX_API_KEY')) {
return { kind: 'minimax', source: 'MINIMAX_API_KEY set' }
}
return null
}
type LocalProbe = {
kind: DetectedProviderKind
url: string
timeoutMs: number
source: string
baseUrl: string
}
const DEFAULT_LOCAL_PROBE_TIMEOUT_MS = 1200
async function probeReachable(
url: string,
timeoutMs: number,
fetchImpl: typeof fetch,
): Promise<boolean> {
const controller = new AbortController()
const timer = setTimeout(() => controller.abort(), timeoutMs)
try {
const response = await fetchImpl(url, {
method: 'GET',
signal: controller.signal,
})
return response.ok
} catch {
return false
} finally {
clearTimeout(timer)
}
}
/**
* Returns the highest-priority local service reachable from the host.
* Runs probes in parallel and picks by priority rather than first-response,
* so slow-but-preferred services still win over fast-but-lower-priority ones.
*/
export async function detectLocalService(options?: {
env?: EnvLike
fetchImpl?: typeof fetch
timeoutMs?: number
}): Promise<DetectedProvider | null> {
const env = options?.env ?? process.env
const fetchImpl = options?.fetchImpl ?? globalThis.fetch
const timeoutMs = options?.timeoutMs ?? DEFAULT_LOCAL_PROBE_TIMEOUT_MS
const ollamaBase = (env.OLLAMA_BASE_URL ?? 'http://localhost:11434').replace(
/\/+$/,
'',
)
const lmStudioBase = (env.LM_STUDIO_BASE_URL ?? 'http://localhost:1234').replace(
/\/+$/,
'',
)
const probes: LocalProbe[] = [
{
kind: 'ollama',
url: `${ollamaBase}/api/tags`,
timeoutMs,
source: `Ollama reachable at ${ollamaBase}`,
baseUrl: ollamaBase,
},
{
kind: 'lm-studio',
url: `${lmStudioBase}/v1/models`,
timeoutMs,
source: `LM Studio reachable at ${lmStudioBase}`,
baseUrl: lmStudioBase,
},
]
const results = await Promise.all(
probes.map(async probe => ({
probe,
reachable: await probeReachable(probe.url, probe.timeoutMs, fetchImpl),
})),
)
for (const { probe, reachable } of results) {
if (reachable) {
return {
kind: probe.kind,
source: probe.source,
baseUrl: probe.baseUrl,
}
}
}
return null
}
/**
* Orchestrator: env scan first (sync, free), then local-service probes
* (async, ~1-2s worst case) only if nothing was found in env.
*/
export async function detectBestProvider(options?: {
env?: EnvLike
fetchImpl?: typeof fetch
timeoutMs?: number
/** Skip local-service probes — useful for tests or offline smoke checks. */
skipLocal?: boolean
/** Override for Codex auth-file detection. See detectProviderFromEnv. */
hasCodexAuth?: () => boolean
}): Promise<DetectedProvider | null> {
const env = options?.env ?? process.env
const fromEnv = detectProviderFromEnv({
env,
hasCodexAuth: options?.hasCodexAuth,
})
if (fromEnv) return fromEnv
if (options?.skipLocal) return null
return detectLocalService({
env,
fetchImpl: options?.fetchImpl,
timeoutMs: options?.timeoutMs,
})
}
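The probe strategy documented above, run every probe in parallel but pick by declaration priority rather than first response, can be reduced to a few lines. `Probe`, `pickByPriority`, and `delay` below are illustrative for this sketch, not the module's API:

```typescript
// Run all reachability checks concurrently, then walk the results in
// declaration order, so a slow-but-preferred service (Ollama) still
// beats a fast-but-lower-priority one (LM Studio).
type Probe<T> = { value: T; reachable: () => Promise<boolean> }

async function pickByPriority<T>(probes: Probe<T>[]): Promise<T | null> {
  const results = await Promise.all(
    probes.map(async p => ({ value: p.value, ok: await p.reachable() })),
  )
  for (const r of results) {
    if (r.ok) return r.value
  }
  return null
}

// Small helper used to simulate a slow probe in the example below.
const delay = (ms: number) => new Promise<void>(res => setTimeout(res, ms))
```

Because `Promise.all` preserves input order, the linear scan afterwards is all the prioritization needed; no racing or cancellation logic is required for correctness.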

View File

@@ -81,15 +81,6 @@ test('detects common local openai-compatible providers by hostname', async () =>
).toBe('vLLM')
})
test('detects Moonshot (Kimi) from api.moonshot.ai hostname', async () => {
const { getLocalOpenAICompatibleProviderLabel } =
await loadProviderDiscoveryModule()
expect(
getLocalOpenAICompatibleProviderLabel('https://api.moonshot.ai/v1'),
).toBe('Moonshot (Kimi)')
})
test('falls back to a generic local openai-compatible label', async () => {
const { getLocalOpenAICompatibleProviderLabel } =
await loadProviderDiscoveryModule()
@@ -299,65 +290,3 @@ test('ollama generation readiness reports ready when chat probe succeeds', async
probeModel: 'llama3.1:8b',
})
})
test('atomic chat readiness reports unreachable when /v1/models is down', async () => {
const { probeAtomicChatReadiness } = await loadProviderDiscoveryModule()
const calledUrls: string[] = []
globalThis.fetch = mock(input => {
const url = typeof input === 'string' ? input : input.url
calledUrls.push(url)
return Promise.resolve(new Response('unavailable', { status: 503 }))
}) as typeof globalThis.fetch
await expect(
probeAtomicChatReadiness({ baseUrl: 'http://127.0.0.1:1337' }),
).resolves.toEqual({ state: 'unreachable' })
expect(calledUrls[0]).toBe('http://127.0.0.1:1337/v1/models')
})
test('atomic chat readiness reports no_models when server is reachable but empty', async () => {
const { probeAtomicChatReadiness } = await loadProviderDiscoveryModule()
globalThis.fetch = mock(() =>
Promise.resolve(
new Response(JSON.stringify({ data: [] }), {
status: 200,
headers: { 'Content-Type': 'application/json' },
}),
),
) as typeof globalThis.fetch
await expect(
probeAtomicChatReadiness({ baseUrl: 'http://127.0.0.1:1337' }),
).resolves.toEqual({ state: 'no_models' })
})
test('atomic chat readiness returns loaded model ids when ready', async () => {
const { probeAtomicChatReadiness } = await loadProviderDiscoveryModule()
globalThis.fetch = mock(() =>
Promise.resolve(
new Response(
JSON.stringify({
data: [
{ id: 'Qwen3_5-4B_Q4_K_M' },
{ id: 'llama-3.1-8b-instruct' },
],
}),
{
status: 200,
headers: { 'Content-Type': 'application/json' },
},
),
),
) as typeof globalThis.fetch
await expect(
probeAtomicChatReadiness({ baseUrl: 'http://127.0.0.1:1337' }),
).resolves.toEqual({
state: 'ready',
models: ['Qwen3_5-4B_Q4_K_M', 'llama-3.1-8b-instruct'],
})
})

View File

@@ -197,10 +197,6 @@ export function getLocalOpenAICompatibleProviderLabel(baseUrl?: string): string
if (host.includes('minimax') || haystack.includes('minimax')) {
return 'MiniMax'
}
// Moonshot AI (Kimi) direct API
if (host.includes('moonshot') || haystack.includes('moonshot') || haystack.includes('kimi')) {
return 'Moonshot (Kimi)'
}
} catch {
// Fall back to the generic label when the base URL is malformed.
}
@@ -302,24 +298,6 @@ export async function listAtomicChatModels(
}
}
export type AtomicChatReadiness =
| { state: 'unreachable' }
| { state: 'no_models' }
| { state: 'ready'; models: string[] }
export async function probeAtomicChatReadiness(options?: {
baseUrl?: string
}): Promise<AtomicChatReadiness> {
if (!(await hasLocalAtomicChat(options?.baseUrl))) {
return { state: 'unreachable' }
}
const models = await listAtomicChatModels(options?.baseUrl)
if (models.length === 0) {
return { state: 'no_models' }
}
return { state: 'ready', models }
}
export async function benchmarkOllamaModel(
modelName: string,
baseUrl?: string,
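The removed `probeAtomicChatReadiness` collapsed two probes into a three-state result. The shape generalizes to any OpenAI-compatible model server; `probeReadiness` and its `listModels` parameter below are illustrative stand-ins for the real reachability and model-listing probes:

```typescript
// Three-state readiness: unreachable (probe failed), no_models
// (server up but nothing loaded), ready (with the loaded model ids).
type Readiness =
  | { state: 'unreachable' }
  | { state: 'no_models' }
  | { state: 'ready'; models: string[] }

async function probeReadiness(
  listModels: () => Promise<string[] | null>, // null = server unreachable
): Promise<Readiness> {
  const models = await listModels()
  if (models === null) return { state: 'unreachable' }
  if (models.length === 0) return { state: 'no_models' }
  return { state: 'ready', models }
}
```

A discriminated union like this lets callers exhaustively switch on `state` instead of juggling a nullable model list plus a separate reachable flag.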

View File

@@ -572,64 +572,31 @@ test('buildStartupEnvFromProfile leaves explicit provider selections untouched',
assert.equal(env.OPENAI_API_KEY, undefined)
})
test('buildStartupEnvFromProfile preserves plural-profile env when the legacy file is stale', async () => {
test('buildStartupEnvFromProfile lets saved startup profile override profile-managed env', async () => {
// Regression: a user saves a provider via /provider (plural system).
// addProviderProfile does NOT sync the legacy .openclaude-profile.json,
// so the legacy file retains whatever it had from an earlier setup (e.g.
// OpenAI defaults). At startup, applyActiveProviderProfileFromConfig()
// correctly applies the active plural profile (Moonshot) first, marking
// env with CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED=1. The legacy-file
// load must NOT overwrite that env — it previously did, surfacing as
// "banner shows the wrong provider / model".
const processEnv = {
CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED: '1',
CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID: 'saved_moonshot',
CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID: 'saved_ollama',
CLAUDE_CODE_USE_OPENAI: '1',
OPENAI_BASE_URL: 'https://api.moonshot.ai/v1',
OPENAI_BASE_URL: 'http://localhost:11434/v1',
OPENAI_MODEL: 'kimi-k2.6',
OPENAI_MODEL: 'llama3.1:8b',
}
const env = await buildStartupEnvFromProfile({
// Stale legacy file — points at SambaNova, but user's active plural
// profile is Moonshot and was just applied.
persisted: profile('openai', {
OPENAI_API_KEY: 'sk-stale',
OPENAI_API_KEY: 'sk-persisted',
OPENAI_MODEL: 'Meta-Llama-3.1-70B-Instruct',
OPENAI_BASE_URL: 'https://api.sambanova.ai/v1',
}),
processEnv,
})
assert.equal(env, processEnv)
assert.equal(env.OPENAI_BASE_URL, 'https://api.moonshot.ai/v1')
assert.equal(env.OPENAI_MODEL, 'kimi-k2.6')
// Plural markers are retained — downstream code uses them to verify the
// env still belongs to the profile it was applied from.
assert.equal(env.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED, '1')
assert.equal(env.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID, 'saved_moonshot')
})
test('buildStartupEnvFromProfile falls back to legacy file when plural system has not applied', async () => {
// Counter-example: first-run user with only the legacy file (no plural
// active profile yet). The legacy file is the correct source, so the
// load must proceed as before.
const processEnv = {
CLAUDE_CODE_USE_OPENAI: '1',
}
const env = await buildStartupEnvFromProfile({
persisted: profile('openai', {
OPENAI_API_KEY: 'sk-legacy',
OPENAI_MODEL: 'gpt-4o',
OPENAI_BASE_URL: 'https://api.openai.com/v1',
}),
processEnv,
})
assert.notEqual(env, processEnv)
assert.equal(env.OPENAI_API_KEY, 'sk-legacy')
assert.equal(env.CLAUDE_CODE_USE_OPENAI, '1')
assert.equal(env.OPENAI_BASE_URL, 'https://api.openai.com/v1')
assert.equal(env.OPENAI_API_KEY, 'sk-persisted')
assert.equal(env.OPENAI_MODEL, 'gpt-4o')
assert.equal(env.OPENAI_MODEL, 'Meta-Llama-3.1-70B-Instruct')
assert.equal(env.OPENAI_BASE_URL, 'https://api.sambanova.ai/v1')
assert.equal(env.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED, undefined)
assert.equal(env.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID, undefined)
})
test('buildStartupEnvFromProfile treats explicit falsey provider flags as user intent', async () => {

View File

@@ -841,35 +841,43 @@ export async function buildStartupEnvFromProfile(options?: {
const processEnv = options?.processEnv ?? process.env
const persisted = options?.persisted ?? loadProfileFile()
// Saved /provider profiles should still win over provider-manager env that was
// auto-applied during startup. Only an explicit shell/flag provider selection
// should bypass the persisted startup profile.
//
const profileManagedEnv = processEnv.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED === '1' const profileManagedEnv = processEnv.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED === '1'
// The legacy single-profile file (~/.openclaude-profile.json) is a // If the user explicitly selected a provider via env, allow it to bypass
// first-run / fallback mechanism. The newer plural provider-profile // the persisted profile only when we can prove it was managed by the
// system (`/provider` presets + activeProviderProfileId in config) is // persisted profile env itself.
// applied earlier in the bootstrap via applyActiveProviderProfileFromConfig
// and signals completion with CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED=1.
// //
// If the plural system has already set env, trust it — do NOT overlay the // Practically: on initial startup, provider routing env vars can already
// legacy file. addProviderProfile() does not sync the legacy file, so a // be present due to earlier auto-application steps. We should still apply
// stale legacy file (e.g. OpenAI defaults from an earlier manual setup) // the persisted profile rather than returning early.
// would otherwise overwrite the correct plural env and surface as the
// "banner shows gpt-4o / api.openai.com even though my saved profile is
// Moonshot" bug.
if (profileManagedEnv) {
return processEnv
}
if (!persisted) { if (!persisted) {
return processEnv return processEnv
} }
const launchProcessEnv = profileManagedEnv
? (() => {
const cleanedEnv = { ...processEnv }
for (const key of PROFILE_ENV_KEYS) {
delete cleanedEnv[key]
}
delete cleanedEnv.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED
delete cleanedEnv.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID
return cleanedEnv
})()
: processEnv
return buildLaunchEnv({ return buildLaunchEnv({
profile: persisted.profile, profile: persisted.profile,
persisted, persisted,
goal: goal:
options?.goal ?? options?.goal ??
normalizeRecommendationGoal(processEnv.OPENCLAUDE_PROFILE_GOAL), normalizeRecommendationGoal(processEnv.OPENCLAUDE_PROFILE_GOAL),
processEnv, processEnv: launchProcessEnv,
getOllamaChatBaseUrl: getOllamaChatBaseUrl:
options?.getOllamaChatBaseUrl ?? getOllamaChatBaseUrl, options?.getOllamaChatBaseUrl ?? getOllamaChatBaseUrl,
resolveOllamaDefaultModel: options?.resolveOllamaDefaultModel, resolveOllamaDefaultModel: options?.resolveOllamaDefaultModel,

View File

@@ -256,83 +256,6 @@ describe('applyActiveProviderProfileFromConfig', () => {
expect(process.env.OPENAI_MODEL).toBe('qwen2.5:3b')
})
test('applies active profile when a bare CLAUDE_CODE_USE_OPENAI flag is stale (no BASE_URL/MODEL)', async () => {
// Regression: a leftover `CLAUDE_CODE_USE_OPENAI=1` in the shell with no
// paired OPENAI_BASE_URL / OPENAI_MODEL is not a real explicit selection
// — it's a stale export. The previous guard treated it as intent and
// skipped the saved profile, causing the startup banner to show hardcoded
// defaults (gpt-4o @ api.openai.com) instead of the user's active
// profile.
const { applyActiveProviderProfileFromConfig } =
await importFreshProviderProfileModules()
process.env.CLAUDE_CODE_USE_OPENAI = '1'
delete process.env.OPENAI_BASE_URL
delete process.env.OPENAI_API_BASE
delete process.env.OPENAI_MODEL
const applied = applyActiveProviderProfileFromConfig({
providerProfiles: [
buildProfile({
id: 'saved_moonshot',
baseUrl: 'https://api.moonshot.ai/v1',
model: 'kimi-k2.6',
}),
],
activeProviderProfileId: 'saved_moonshot',
} as any)
expect(applied?.id).toBe('saved_moonshot')
expect(process.env.OPENAI_BASE_URL).toBe('https://api.moonshot.ai/v1')
expect(process.env.OPENAI_MODEL).toBe('kimi-k2.6')
})
test('still respects complete shell selection with USE flag + BASE_URL', async () => {
// Counter-example: when the user really did set both the flag AND a
// concrete BASE_URL, that IS explicit intent and wins over the saved
// profile. This preserves the original "explicit startup wins" semantic.
const { applyActiveProviderProfileFromConfig } =
await importFreshProviderProfileModules()
process.env.CLAUDE_CODE_USE_OPENAI = '1'
process.env.OPENAI_BASE_URL = 'http://192.168.1.1:8080/v1'
delete process.env.OPENAI_MODEL
const applied = applyActiveProviderProfileFromConfig({
providerProfiles: [
buildProfile({
id: 'saved_moonshot',
baseUrl: 'https://api.moonshot.ai/v1',
model: 'kimi-k2.6',
}),
],
activeProviderProfileId: 'saved_moonshot',
} as any)
expect(applied).toBeUndefined()
expect(process.env.OPENAI_BASE_URL).toBe('http://192.168.1.1:8080/v1')
})
test('still respects complete shell selection with USE flag + MODEL', async () => {
const { applyActiveProviderProfileFromConfig } =
await importFreshProviderProfileModules()
process.env.CLAUDE_CODE_USE_OPENAI = '1'
process.env.OPENAI_MODEL = 'gpt-4o-mini'
delete process.env.OPENAI_BASE_URL
const applied = applyActiveProviderProfileFromConfig({
providerProfiles: [
buildProfile({
id: 'saved_moonshot',
baseUrl: 'https://api.moonshot.ai/v1',
model: 'kimi-k2.6',
}),
],
activeProviderProfileId: 'saved_moonshot',
} as any)
expect(applied).toBeUndefined()
expect(process.env.OPENAI_MODEL).toBe('gpt-4o-mini')
})
test('does not override explicit startup selection when profile marker is stale', async () => {
const { applyActiveProviderProfileFromConfig } =
await importFreshProviderProfileModules()
@@ -527,18 +450,6 @@ describe('getProviderPresetDefaults', () => {
expect(defaults.baseUrl).toBe('http://localhost:11434/v1')
expect(defaults.model).toBe('llama3.1:8b')
})
test('atomic-chat preset defaults to a local Atomic Chat endpoint', async () => {
const { getProviderPresetDefaults } = await importFreshProviderProfileModules()
delete process.env.OPENAI_MODEL
const defaults = getProviderPresetDefaults('atomic-chat')
expect(defaults.provider).toBe('openai')
expect(defaults.name).toBe('Atomic Chat')
expect(defaults.baseUrl).toBe('http://127.0.0.1:1337/v1')
expect(defaults.requiresApiKey).toBe(false)
})
})
describe('setActiveProviderProfile', () => {

View File

@@ -33,7 +33,6 @@ export type ProviderPreset =
| 'custom'
| 'nvidia-nim'
| 'minimax'
| 'atomic-chat'
export type ProviderProfileInput = {
provider?: ProviderProfile['provider']
@@ -286,15 +285,6 @@ export function getProviderPresetDefaults(
apiKey: process.env.MINIMAX_API_KEY ?? '',
requiresApiKey: true,
}
case 'atomic-chat':
return {
provider: 'openai',
name: 'Atomic Chat',
baseUrl: 'http://127.0.0.1:1337/v1',
model: process.env.OPENAI_MODEL ?? 'local-model',
apiKey: '',
requiresApiKey: false,
}
case 'ollama':
default:
return {
@@ -332,58 +322,6 @@ function hasProviderSelectionFlags(
)
}
/**
* A "complete" explicit provider selection = a USE flag AND at least one
* concrete config value that tells us WHERE to route (a base URL) or WHAT
* to run (a model id). A bare `CLAUDE_CODE_USE_OPENAI=1` with nothing else
* is almost always a stale shell export from a previous session, not real
* intent — and if we respect it, we skip the user's saved active profile
* and fall back to hardcoded defaults (gpt-4o / api.openai.com), which is
* the exact bug users report as "my saved provider isn't picked up".
*
* Used to gate whether saved-profile env should override shell state at
* startup. The weaker `hasProviderSelectionFlags` is still used for the
* anthropic-profile conflict check (any flag is a conflict for
* first-party anthropic) and for alignment fingerprinting.
*/
function hasCompleteProviderSelection(
processEnv: NodeJS.ProcessEnv = process.env,
): boolean {
if (!hasProviderSelectionFlags(processEnv)) return false
if (processEnv.CLAUDE_CODE_USE_OPENAI !== undefined) {
return (
trimOrUndefined(processEnv.OPENAI_BASE_URL) !== undefined ||
trimOrUndefined(processEnv.OPENAI_API_BASE) !== undefined ||
trimOrUndefined(processEnv.OPENAI_MODEL) !== undefined
)
}
if (processEnv.CLAUDE_CODE_USE_GEMINI !== undefined) {
return (
trimOrUndefined(processEnv.GEMINI_BASE_URL) !== undefined ||
trimOrUndefined(processEnv.GEMINI_MODEL) !== undefined ||
trimOrUndefined(processEnv.GEMINI_API_KEY) !== undefined ||
trimOrUndefined(processEnv.GOOGLE_API_KEY) !== undefined
)
}
if (processEnv.CLAUDE_CODE_USE_MISTRAL !== undefined) {
return (
trimOrUndefined(processEnv.MISTRAL_BASE_URL) !== undefined ||
trimOrUndefined(processEnv.MISTRAL_MODEL) !== undefined ||
trimOrUndefined(processEnv.MISTRAL_API_KEY) !== undefined
)
}
if (processEnv.CLAUDE_CODE_USE_GITHUB !== undefined) {
return (
trimOrUndefined(processEnv.GITHUB_TOKEN) !== undefined ||
trimOrUndefined(processEnv.GH_TOKEN) !== undefined ||
trimOrUndefined(processEnv.OPENAI_MODEL) !== undefined
)
}
// Bedrock / Vertex / Foundry signal cloud-provider routing in env; treat
// the flag alone as complete (these paths rely on ambient AWS/GCP creds).
return true
}
function hasConflictingProviderFlagsForProfile(
processEnv: NodeJS.ProcessEnv,
profile: ProviderProfile,
@@ -626,15 +564,9 @@ export function applyActiveProviderProfileFromConfig(
processEnv[PROFILE_ENV_APPLIED_FLAG] === '1' &&
trimOrUndefined(processEnv[PROFILE_ENV_APPLIED_ID]) === activeProfile.id
if (!options?.force && (hasCompleteProviderSelection(processEnv) || processEnv[PROFILE_ENV_APPLIED_FLAG] === '1')) {
if (!options?.force && (hasProviderSelectionFlags(processEnv) || processEnv[PROFILE_ENV_APPLIED_FLAG] === '1')) {
// Respect explicit startup provider intent. Auto-heal only when this
// exact active profile previously applied the current env.
// NOTE: we gate on hasCompleteProviderSelection (flag + concrete config)
// rather than hasProviderSelectionFlags alone. A bare CLAUDE_CODE_USE_*=1
// with no BASE_URL/MODEL is almost always a stale shell export, not
// intent — respecting it would skip the saved profile and fall through
// to hardcoded provider defaults, which surfaces as "my saved provider
// isn't being picked up at startup".
if (!isCurrentEnvProfileManaged) {
return undefined
}

View File

@@ -1,86 +0,0 @@
import { describe, expect, it, beforeEach } from 'bun:test'
import {
createCorrelationId,
logApiCallStart,
logApiCallEnd,
} from './requestLogging.js'
describe('requestLogging', () => {
describe('createCorrelationId', () => {
it('returns a non-empty string', () => {
const id = createCorrelationId()
expect(id).toBeTruthy()
expect(typeof id).toBe('string')
})
it('returns unique IDs', () => {
const id1 = createCorrelationId()
const id2 = createCorrelationId()
expect(id1).not.toBe(id2)
})
})
describe('logApiCallStart', () => {
it('returns correlation ID and start time', () => {
const result = logApiCallStart('openai', 'gpt-4o')
expect(result.correlationId).toBeTruthy()
expect(result.startTime).toBeGreaterThan(0)
})
it('logs without throwing', () => {
expect(() => logApiCallStart('ollama', 'llama3')).not.toThrow()
})
})
describe('logApiCallEnd', () => {
it('logs success without throwing', () => {
const { correlationId, startTime } = logApiCallStart('openai', 'gpt-4o')
expect(() =>
logApiCallEnd(
correlationId,
startTime,
'gpt-4o',
'success',
100,
50,
false,
),
).not.toThrow()
})
it('logs error without throwing', () => {
const { correlationId, startTime } = logApiCallStart('openai', 'gpt-4o')
expect(() =>
logApiCallEnd(
correlationId,
startTime,
'gpt-4o',
'error',
0,
0,
false,
undefined,
undefined,
'Network error',
),
).not.toThrow()
})
it('logs with all parameters without throwing', () => {
const { correlationId, startTime } = logApiCallStart('openai', 'gpt-4o')
expect(() =>
logApiCallEnd(
correlationId,
startTime,
'gpt-4o',
'success',
100,
50,
true,
'error message',
{ provider: 'openai' },
),
).not.toThrow()
})
})
})

View File

@@ -1,89 +0,0 @@
/**
* Structured Request Logging
*
* Uses existing logForDebugging for structured logging.
*/
import { randomUUID } from 'crypto'
import { logForDebugging } from './debug.js'
export interface RequestLog {
correlationId: string
timestamp: number
provider: string
model: string
duration: number
status: 'success' | 'error'
tokensIn: number
tokensOut: number
error?: string
streaming: boolean
}
export function createCorrelationId(): string {
return randomUUID()
}
export function logApiCallStart(
provider: string,
model: string,
): { correlationId: string; startTime: number } {
const correlationId = createCorrelationId()
const startTime = Date.now()
logForDebugging(
JSON.stringify({
type: 'api_call_start',
correlationId,
provider,
model,
timestamp: startTime,
}),
{ level: 'debug' },
)
return { correlationId, startTime }
}
export function logApiCallEnd(
correlationId: string,
startTime: number,
model: string,
status: 'success' | 'error',
tokensIn: number,
tokensOut: number,
streaming: boolean,
firstTokenMs?: number,
totalChunks?: number,
error?: string,
): void {
const duration = Date.now() - startTime
const logData: Record<string, unknown> = {
type: status === 'error' ? 'api_call_error' : 'api_call_end',
correlationId,
model,
duration_ms: duration,
status,
tokens_in: tokensIn,
tokens_out: tokensOut,
streaming,
}
if (firstTokenMs !== undefined) {
logData.first_token_ms = firstTokenMs
}
if (totalChunks !== undefined) {
logData.total_chunks = totalChunks
}
if (error) {
logData.error = error
}
logForDebugging(
JSON.stringify(logData),
{ level: status === 'error' ? 'error' : 'debug' },
)
}

View File

@@ -1,61 +0,0 @@
import { describe, expect, it, beforeEach } from 'bun:test'
import {
createStreamState,
processStreamChunk,
flushStreamBuffer,
getStreamStats,
} from './streamingOptimizer.js'
describe('streamingOptimizer', () => {
let state: ReturnType<typeof createStreamState>
beforeEach(() => {
state = createStreamState()
})
describe('createStreamState', () => {
it('creates initial state with zero counts', () => {
expect(state.chunkCount).toBe(0)
expect(state.firstTokenTime).toBeNull()
expect(state.startTime).toBeGreaterThan(0)
})
})
describe('processStreamChunk', () => {
it('tracks first token time on first chunk', () => {
processStreamChunk(state, 'hello')
expect(state.firstTokenTime).not.toBeNull()
expect(state.chunkCount).toBe(1)
})
it('increments chunk count', () => {
processStreamChunk(state, 'chunk1')
processStreamChunk(state, 'chunk2')
expect(state.chunkCount).toBe(2)
})
})
describe('getStreamStats', () => {
it('returns zero values for empty stream', () => {
const stats = getStreamStats(state)
expect(stats.totalChunks).toBe(0)
expect(stats.firstTokenMs).toBeNull()
expect(stats.durationMs).toBeGreaterThanOrEqual(0)
})
it('returns correct stats after processing chunks', () => {
processStreamChunk(state, 'test')
const stats = getStreamStats(state)
expect(stats.totalChunks).toBe(1)
expect(stats.firstTokenMs).toBeGreaterThanOrEqual(0)
expect(stats.durationMs).toBeGreaterThanOrEqual(0)
})
})
describe('flushStreamBuffer', () => {
it('returns empty string (no-op)', () => {
const result = flushStreamBuffer(state)
expect(result).toBe('')
})
})
})

View File

@@ -1,51 +0,0 @@
/**
* Streaming Stats Tracker
*
* Observational stats tracking for streaming responses.
* No buffering - purely tracks metrics for monitoring.
*/
export interface StreamStats {
totalChunks: number
firstTokenMs: number | null
durationMs: number
}
export interface StreamState {
chunkCount: number
firstTokenTime: number | null
startTime: number
}
export function createStreamState(): StreamState {
return {
chunkCount: 0,
firstTokenTime: null,
startTime: Date.now(),
}
}
export function processStreamChunk(state: StreamState, _chunk: string): void {
if (state.firstTokenTime === null) {
state.firstTokenTime = Date.now()
}
state.chunkCount++
}
export function flushStreamBuffer(_state: StreamState): string {
return '' // No-op - kept for API compatibility
}
export function getStreamStats(state: StreamState): StreamStats {
const now = Date.now()
const firstTokenMs = state.firstTokenTime
? now - state.firstTokenTime
: null
const durationMs = now - state.startTime
return {
totalChunks: state.chunkCount,
firstTokenMs,
durationMs,
}
}

View File

@@ -1,106 +0,0 @@
import { describe, expect, it } from 'bun:test'
import { ThinkingTokenAnalyzer } from './thinkingTokenExtractor.js'
describe('ThinkingTokenAnalyzer', () => {
describe('extract', () => {
it('extracts thinking and output separately', () => {
const message = {
type: 'assistant',
message: {
content: [
{ type: 'thinking', thinking: 'Let me think about this...' },
{ type: 'text', text: 'Here is my answer.' },
],
},
} as any
const result = ThinkingTokenAnalyzer.extract(message)
expect(result.thinking).toBeGreaterThan(0)
expect(result.output).toBeGreaterThan(0)
expect(result.total).toBe(result.thinking + result.output)
})
it('handles no thinking', () => {
const message = {
type: 'assistant',
message: {
content: [{ type: 'text', text: 'Hello world' }],
},
} as any
const result = ThinkingTokenAnalyzer.extract(message)
expect(result.thinking).toBe(0)
expect(result.output).toBeGreaterThan(0)
})
it('handles redacted thinking', () => {
const message = {
type: 'assistant',
message: {
content: [
{ type: 'redacted_thinking', data: '[thinking hidden]' },
{ type: 'text', text: 'Answer here.' },
],
},
} as any
const result = ThinkingTokenAnalyzer.extract(message)
expect(result.thinking).toBeGreaterThan(0)
expect(result.output).toBeGreaterThan(0)
})
})
describe('analyze', () => {
it('calculates percentages', () => {
const message = {
type: 'assistant',
message: {
content: [
{ type: 'thinking', thinking: 'Thinking1 Thinking2 Thinking3' },
{ type: 'text', text: 'Output1 Output2' },
],
},
} as any
const analysis = ThinkingTokenAnalyzer.analyze(message)
expect(analysis.hasThinking).toBe(true)
expect(analysis.thinkingPercentage).toBeGreaterThan(0)
expect(analysis.outputPercentage).toBeGreaterThan(0)
expect(analysis.reasoningComplexity).toBeTruthy()
})
})
describe('hasSignificantThinking', () => {
it('detects significant thinking', () => {
const message = {
type: 'assistant',
message: {
content: [
{ type: 'thinking', thinking: 'x'.repeat(500) },
{ type: 'text', text: 'short' },
],
},
} as any
expect(ThinkingTokenAnalyzer.hasSignificantThinking(message, 20)).toBe(true)
})
it('rejects minimal thinking', () => {
const message = {
type: 'assistant',
message: {
content: [
{ type: 'thinking', thinking: 'a' },
{ type: 'text', text: 'much longer output text here with more content' },
],
},
} as any
expect(ThinkingTokenAnalyzer.hasSignificantThinking(message, 20)).toBe(false)
})
})
})

View File

@@ -1,192 +0,0 @@
/**
* Thinking Token Extractor - Production-grade thinking token analysis
*
* Extracts and analyzes thinking tokens from assistant messages.
* Provides detailed breakdown, statistics, and insights.
*/
import { roughTokenCountEstimation } from '../services/tokenEstimation.js'
import { jsonStringify } from './slowOperations.js'
import type { AssistantMessage, Message } from '../types/message.js'
export interface ThinkingBlock {
type: 'thinking' | 'redacted_thinking'
content: string
tokens: number
}
export interface OutputBlock {
type: 'text' | 'tool_use'
content: string
tokens: number
}
export interface ThinkingTokenBreakdown {
thinking: number
output: number
total: number
thinkingBlocks: ThinkingBlock[]
outputBlocks: OutputBlock[]
}
export interface ThinkingAnalysis {
hasThinking: boolean
thinkingPercentage: number
outputPercentage: number
blockCount: number
avgThinkingBlockSize: number
avgOutputBlockSize: number
totalTextLength: number
reasoningComplexity: 'low' | 'medium' | 'high'
}
export class ThinkingTokenAnalyzer {
/**
* Extract detailed thinking vs output breakdown
*/
static extract(message: AssistantMessage): ThinkingTokenBreakdown {
const thinkingBlocks: ThinkingBlock[] = []
const outputBlocks: OutputBlock[] = []
let thinking = 0
let output = 0
for (const block of message.message.content) {
if (block.type === 'thinking') {
const tokens = roughTokenCountEstimation(block.thinking)
thinking += tokens
thinkingBlocks.push({
type: 'thinking',
content: block.thinking,
tokens,
})
} else if (block.type === 'redacted_thinking') {
const tokens = roughTokenCountEstimation(block.data)
thinking += tokens
thinkingBlocks.push({
type: 'redacted_thinking',
content: block.data,
tokens,
})
} else if (block.type === 'text') {
const tokens = roughTokenCountEstimation(block.text)
output += tokens
outputBlocks.push({
type: 'text',
content: block.text,
tokens,
})
} else if (block.type === 'tool_use') {
const content = jsonStringify(block.input)
const tokens = roughTokenCountEstimation(content)
output += tokens
outputBlocks.push({
type: 'tool_use',
content,
tokens,
})
}
}
return {
thinking,
output,
total: thinking + output,
thinkingBlocks,
outputBlocks,
}
}
/**
* Simple extraction for quick use
*/
static extractSimple(message: AssistantMessage): ThinkingTokenBreakdown {
return this.extract(message)
}
/**
* Analyze thinking patterns and provide insights
*/
static analyze(message: AssistantMessage): ThinkingAnalysis {
const breakdown = this.extract(message)
const { thinking, output, total, thinkingBlocks, outputBlocks } = breakdown
const hasThinking = thinking > 0
const thinkingPercentage = total > 0 ? (thinking / total) * 100 : 0
const outputPercentage = total > 0 ? (output / total) * 100 : 0
const avgThinkingBlockSize = thinkingBlocks.length > 0
? thinkingBlocks.reduce((sum, b) => sum + b.tokens, 0) / thinkingBlocks.length
: 0
const avgOutputBlockSize = outputBlocks.length > 0
? outputBlocks.reduce((sum, b) => sum + b.tokens, 0) / outputBlocks.length
: 0
const totalTextLength = [...thinkingBlocks, ...outputBlocks].reduce(
(sum, b) => sum + b.content.length,
0,
)
// Complexity based on thinking percentage and block count
let reasoningComplexity: 'low' | 'medium' | 'high' = 'low'
if (thinkingPercentage > 30 || thinkingBlocks.length > 5) {
reasoningComplexity = 'high'
} else if (thinkingPercentage > 10 || thinkingBlocks.length > 2) {
reasoningComplexity = 'medium'
}
return {
hasThinking,
thinkingPercentage: Math.round(thinkingPercentage * 10) / 10,
outputPercentage: Math.round(outputPercentage * 10) / 10,
blockCount: thinkingBlocks.length + outputBlocks.length,
avgThinkingBlockSize: Math.round(avgThinkingBlockSize),
avgOutputBlockSize: Math.round(avgOutputBlockSize),
totalTextLength,
reasoningComplexity,
}
}
/**
* Check if message has significant thinking
*/
static hasSignificantThinking(
message: AssistantMessage,
thresholdPercent = 20,
): boolean {
const analysis = this.analyze(message)
return analysis.thinkingPercentage >= thresholdPercent
}
/**
* Get thinking-only messages from an array
*/
static filterThinkingMessages(messages: Message[]): AssistantMessage[] {
return messages
.filter((m): m is AssistantMessage => m.type === 'assistant')
.filter(m => this.hasSignificantThinking(m))
}
/**
* Calculate total thinking tokens across messages
*/
static totalThinkingTokens(messages: Message[]): number {
return messages
.filter((m): m is AssistantMessage => m.type === 'assistant')
.reduce((sum, m) => sum + this.extract(m).thinking, 0)
}
}
/**
* Legacy export for backward compatibility
*/
export function extractThinkingTokens(
message: AssistantMessage,
): { thinking: number; output: number; total: number } {
const result = ThinkingTokenAnalyzer.extract(message)
return {
thinking: result.thinking,
output: result.output,
total: result.total,
}
}

View File

@@ -1,69 +0,0 @@
import { describe, expect, it } from 'bun:test'
import { extractThinkingTokens } from './tokens.js'
describe('extractThinkingTokens', () => {
it('extracts thinking and output separately', () => {
const message = {
type: 'assistant',
message: {
content: [
{ type: 'thinking', thinking: 'Let me think about this...' },
{ type: 'text', text: 'Here is my answer.' },
],
},
} as any
const result = extractThinkingTokens(message)
expect(result.thinking).toBeGreaterThan(0)
expect(result.output).toBeGreaterThan(0)
expect(result.total).toBe(result.thinking + result.output)
})
it('handles no thinking', () => {
const message = {
type: 'assistant',
message: {
content: [{ type: 'text', text: 'Hello world' }],
},
} as any
const result = extractThinkingTokens(message)
expect(result.thinking).toBe(0)
expect(result.output).toBeGreaterThan(0)
})
it('handles redacted thinking', () => {
const message = {
type: 'assistant',
message: {
content: [
{ type: 'redacted_thinking', data: '[thinking hidden]' },
{ type: 'text', text: 'Answer here.' },
],
},
} as any
const result = extractThinkingTokens(message)
expect(result.thinking).toBeGreaterThan(0)
expect(result.output).toBeGreaterThan(0)
})
it('handles tool use', () => {
const message = {
type: 'assistant',
message: {
content: [
{ type: 'tool_use', id: 'tool_1', name: 'bash', input: { cmd: 'echo test' } },
{ type: 'text', text: 'Ran command.' },
],
},
} as any
const result = extractThinkingTokens(message)
expect(result.output).toBeGreaterThan(0)
})
})

View File

@@ -1,84 +0,0 @@
import { describe, expect, it, beforeEach } from 'bun:test'
import { TokenUsageTracker } from './tokenAnalytics.js'
describe('TokenUsageTracker', () => {
let tracker: TokenUsageTracker
beforeEach(() => {
tracker = new TokenUsageTracker(100)
})
it('records token usage', () => {
tracker.record({
input_tokens: 1000,
output_tokens: 500,
cache_read_input_tokens: 200,
cache_creation_input_tokens: 100,
model: 'claude-sonnet-4-5-20250514',
})
expect(tracker.size).toBe(1)
})
it('calculates analytics', () => {
tracker.record({
input_tokens: 1000,
output_tokens: 500,
model: 'claude-sonnet-4-5-20250514',
})
tracker.record({
input_tokens: 2000,
output_tokens: 300,
model: 'claude-sonnet-4-5-20250514',
})
const analytics = tracker.getAnalytics()
expect(analytics.totalRequests).toBe(2)
expect(analytics.totalInputTokens).toBe(3000)
expect(analytics.totalOutputTokens).toBe(800)
expect(analytics.averageInputPerRequest).toBe(1500)
expect(analytics.averageOutputPerRequest).toBe(400)
})
it('tracks cache hit rate', () => {
tracker.record({
input_tokens: 1000,
output_tokens: 500,
cache_read_input_tokens: 500, // 33% cache
model: 'claude-sonnet-4-5-20250514',
})
const analytics = tracker.getAnalytics()
expect(analytics.cacheHitRate).toBeGreaterThan(0)
})
it('tracks most used model', () => {
tracker.record({ input_tokens: 1000, output_tokens: 100, model: 'sonnet' })
tracker.record({ input_tokens: 1000, output_tokens: 100, model: 'sonnet' })
tracker.record({ input_tokens: 1000, output_tokens: 100, model: 'opus' })
expect(tracker.getAnalytics().mostUsedModel).toBe('sonnet')
})
it('respects max entries limit', () => {
const smallTracker = new TokenUsageTracker(3)
smallTracker.record({ input_tokens: 1, output_tokens: 1, model: 'a' })
smallTracker.record({ input_tokens: 2, output_tokens: 2, model: 'b' })
smallTracker.record({ input_tokens: 3, output_tokens: 3, model: 'c' })
smallTracker.record({ input_tokens: 4, output_tokens: 4, model: 'd' })
smallTracker.record({ input_tokens: 5, output_tokens: 5, model: 'e' })
expect(smallTracker.size).toBe(3)
})
it('clears history', () => {
tracker.record({ input_tokens: 1000, output_tokens: 100, model: 'test' })
tracker.clear()
expect(tracker.size).toBe(0)
})
})

View File

@@ -1,211 +0,0 @@
/**
* Token Analytics - Historical token usage tracking and analysis
*
* Tracks token usage patterns over time for cost optimization
* and capacity planning.
*/
import type { BetaUsage as Usage } from '@anthropic-ai/sdk/resources/beta/messages/messages.mjs'
export interface TokenUsageEntry {
timestamp: number
inputTokens: number
outputTokens: number
cacheReadTokens: number
cacheCreationTokens: number
model: string
}
export interface TokenAnalytics {
totalRequests: number
totalInputTokens: number
totalOutputTokens: number
totalCacheRead: number
totalCacheCreation: number
averageInputPerRequest: number
averageOutputPerRequest: number
cacheHitRate: number
mostUsedModel: string
requestsLastHour: number
requestsLastDay: number
}
/**
* Historical Token Analytics Tracker
*
* Tracks token usage patterns over time for analytics,
* cost optimization, and capacity planning.
*/
export class TokenUsageTracker {
private history: TokenUsageEntry[] = []
private readonly maxEntries: number
constructor(maxEntries = 1000) {
this.maxEntries = maxEntries
}
/**
* Record a token usage event from API response.
*/
record(usage: {
input_tokens: number
output_tokens: number
cache_read_input_tokens?: number
cache_creation_input_tokens?: number
model: string
}): void {
const entry: TokenUsageEntry = {
timestamp: Date.now(),
inputTokens: usage.input_tokens,
outputTokens: usage.output_tokens,
cacheReadTokens: usage.cache_read_input_tokens ?? 0,
cacheCreationTokens: usage.cache_creation_input_tokens ?? 0,
model: usage.model,
}
this.history.push(entry)
if (this.history.length > this.maxEntries) {
this.history = this.history.slice(-this.maxEntries)
}
}
/**
* Get analytics summary for all recorded usage.
*/
getAnalytics(): TokenAnalytics {
if (this.history.length === 0) {
return {
totalRequests: 0,
totalInputTokens: 0,
totalOutputTokens: 0,
totalCacheRead: 0,
totalCacheCreation: 0,
averageInputPerRequest: 0,
averageOutputPerRequest: 0,
cacheHitRate: 0,
mostUsedModel: 'unknown',
requestsLastHour: 0,
requestsLastDay: 0,
}
}
const now = Date.now()
const hourAgo = now - 60 * 60 * 1000
const dayAgo = now - 24 * 60 * 60 * 1000
let totalInput = 0
let totalOutput = 0
let totalCacheRead = 0
let totalCacheCreation = 0
const modelCounts = new Map<string, number>()
let requestsLastHour = 0
let requestsLastDay = 0
for (const entry of this.history) {
totalInput += entry.inputTokens
totalOutput += entry.outputTokens
totalCacheRead += entry.cacheReadTokens
totalCacheCreation += entry.cacheCreationTokens
modelCounts.set(entry.model, (modelCounts.get(entry.model) ?? 0) + 1)
if (entry.timestamp >= hourAgo) requestsLastHour++
if (entry.timestamp >= dayAgo) requestsLastDay++
}
let mostUsedModel = 'unknown'
let maxCount = 0
for (const [model, count] of modelCounts) {
if (count > maxCount) {
maxCount = count
mostUsedModel = model
}
}
const totalRequests = this.history.length
const totalCache = totalCacheRead + totalCacheCreation
const totalTokens = totalInput + totalOutput + totalCache
const cacheHitRate = totalTokens > 0 ? (totalCacheRead / totalTokens) * 100 : 0
return {
totalRequests,
totalInputTokens: totalInput,
totalOutputTokens: totalOutput,
totalCacheRead,
totalCacheCreation,
averageInputPerRequest: Math.round(totalInput / totalRequests),
averageOutputPerRequest: Math.round(totalOutput / totalRequests),
cacheHitRate: Math.round(cacheHitRate),
mostUsedModel,
requestsLastHour,
requestsLastDay,
}
}
/**
* Get recent entries within time window.
*/
getRecent(windowMs: number): TokenUsageEntry[] {
const cutoff = Date.now() - windowMs
return this.history.filter(e => e.timestamp >= cutoff)
}
/**
* Get entries for a specific model
*/
getByModel(model: string): TokenUsageEntry[] {
return this.history.filter(e => e.model === model)
}
/**
* Calculate cost estimate (approximate)
*/
estimateCost(): { input: number; output: number; cache: number } {
const analytics = this.getAnalytics()
// Approximate pricing (adjust as needed)
const inputCost = analytics.totalInputTokens * 0.00015
const outputCost = analytics.totalOutputTokens * 0.0006
const cacheCost = analytics.totalCacheRead * 0.000075
return {
input: Math.round(inputCost * 100) / 100,
output: Math.round(outputCost * 100) / 100,
cache: Math.round(cacheCost * 100) / 100,
}
}
/**
* Clear history.
*/
clear(): void {
this.history = []
}
/**
* Get history size.
*/
get size(): number {
return this.history.length
}
/**
* Export history as JSON
*/
export(): string {
return JSON.stringify(this.history, null, 2)
}
/**
* Import history from JSON
*/
import(json: string): void {
try {
const entries = JSON.parse(json) as TokenUsageEntry[]
this.history = entries.slice(-this.maxEntries)
} catch {
// Invalid JSON, ignore
}
}
}
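
For reference, a minimal standalone sketch of the aggregation that `getAnalytics()` performs in the class above. The entry shape mirrors `TokenUsageEntry`; the `summarize` helper is illustrative only, not part of the shipped module, and computes the cache hit rate the same way: cache-read tokens over all tokens, as a rounded percentage.

```typescript
interface TokenUsageEntry {
  timestamp: number
  inputTokens: number
  outputTokens: number
  cacheReadTokens: number
  cacheCreationTokens: number
  model: string
}

// Aggregate totals across history and derive the cache hit rate,
// guarding the division for an empty history.
function summarize(history: TokenUsageEntry[]) {
  let totalInput = 0
  let totalOutput = 0
  let totalCacheRead = 0
  let totalCacheCreation = 0
  for (const e of history) {
    totalInput += e.inputTokens
    totalOutput += e.outputTokens
    totalCacheRead += e.cacheReadTokens
    totalCacheCreation += e.cacheCreationTokens
  }
  const totalTokens = totalInput + totalOutput + totalCacheRead + totalCacheCreation
  return {
    totalTokens,
    cacheHitRate: totalTokens > 0 ? Math.round((totalCacheRead / totalTokens) * 100) : 0,
  }
}

const summary = summarize([
  {
    timestamp: Date.now(),
    inputTokens: 100,
    outputTokens: 50,
    cacheReadTokens: 800,
    cacheCreationTokens: 50,
    model: 'claude-sonnet-4',
  },
])
// totalTokens = 1000, cacheHitRate = 80
```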


@@ -1,5 +1,5 @@
 import type { BetaUsage as Usage } from '@anthropic-ai/sdk/resources/beta/messages/messages.mjs'
-import { roughTokenCountEstimation, roughTokenCountEstimationForMessages } from '../services/tokenEstimation.js'
+import { roughTokenCountEstimationForMessages } from '../services/tokenEstimation.js'
 import type { AssistantMessage, Message } from '../types/message.js'
 import { SYNTHETIC_MESSAGES, SYNTHETIC_MODEL } from './messages.js'
 import { jsonStringify } from './slowOperations.js'
@@ -198,198 +198,6 @@ export function getAssistantMessageContentLength(
return contentLength
}
/**
* Extract thinking tokens from an assistant message.
* Returns breakdown of thinking vs output tokens.
*/
export function extractThinkingTokens(
message: AssistantMessage,
): { thinking: number; output: number; total: number } {
let thinking = 0
let output = 0
for (const block of message.message.content) {
if (block.type === 'thinking') {
thinking += roughTokenCountEstimation(block.thinking)
} else if (block.type === 'redacted_thinking') {
thinking += roughTokenCountEstimation(block.data)
} else if (block.type === 'text') {
output += roughTokenCountEstimation(block.text)
} else if (block.type === 'tool_use') {
output += roughTokenCountEstimation(jsonStringify(block.input))
}
}
return { thinking, output, total: thinking + output }
}
/**
* Token usage history entry for tracking patterns over time.
*/
export interface TokenUsageEntry {
timestamp: number
inputTokens: number
outputTokens: number
cacheReadTokens: number
cacheCreationTokens: number
model: string
}
/**
* Token analytics summary from historical data.
*/
export interface TokenAnalytics {
totalRequests: number
totalInputTokens: number
totalOutputTokens: number
totalCacheRead: number
totalCacheCreation: number
averageInputPerRequest: number
averageOutputPerRequest: number
cacheHitRate: number
mostUsedModel: string
requestsLastHour: number
requestsLastDay: number
}
/**
* Historical Token Analytics Tracker
*
* Tracks token usage patterns over time for analytics,
* cost optimization, and capacity planning.
*/
export class TokenUsageTracker {
private history: TokenUsageEntry[] = []
private readonly maxEntries: number
constructor(maxEntries = 1000) {
this.maxEntries = maxEntries
}
/**
* Record a token usage event from API response.
*/
record(usage: {
input_tokens: number
output_tokens: number
cache_read_input_tokens?: number
cache_creation_input_tokens?: number
model: string
}): void {
const entry: TokenUsageEntry = {
timestamp: Date.now(),
inputTokens: usage.input_tokens,
outputTokens: usage.output_tokens,
cacheReadTokens: usage.cache_read_input_tokens ?? 0,
cacheCreationTokens: usage.cache_creation_input_tokens ?? 0,
model: usage.model,
}
this.history.push(entry)
// Trim old entries
if (this.history.length > this.maxEntries) {
this.history = this.history.slice(-this.maxEntries)
}
}
/**
* Get analytics summary for all recorded usage.
*/
getAnalytics(): TokenAnalytics {
if (this.history.length === 0) {
return {
totalRequests: 0,
totalInputTokens: 0,
totalOutputTokens: 0,
totalCacheRead: 0,
totalCacheCreation: 0,
averageInputPerRequest: 0,
averageOutputPerRequest: 0,
cacheHitRate: 0,
mostUsedModel: 'unknown',
requestsLastHour: 0,
requestsLastDay: 0,
}
}
const now = Date.now()
const hourAgo = now - 60 * 60 * 1000
const dayAgo = now - 24 * 60 * 60 * 1000
let totalInput = 0
let totalOutput = 0
let totalCacheRead = 0
let totalCacheCreation = 0
let modelCounts = new Map<string, number>()
let requestsLastHour = 0
let requestsLastDay = 0
for (const entry of this.history) {
totalInput += entry.inputTokens
totalOutput += entry.outputTokens
totalCacheRead += entry.cacheReadTokens
totalCacheCreation += entry.cacheCreationTokens
modelCounts.set(entry.model, (modelCounts.get(entry.model) ?? 0) + 1)
if (entry.timestamp >= hourAgo) requestsLastHour++
if (entry.timestamp >= dayAgo) requestsLastDay++
}
// Find most used model
let mostUsedModel = 'unknown'
let maxCount = 0
for (const [model, count] of modelCounts) {
if (count > maxCount) {
maxCount = count
mostUsedModel = model
}
}
const totalRequests = this.history.length
const totalCache = totalCacheRead + totalCacheCreation
const totalTokens = totalInput + totalOutput + totalCache
const cacheHitRate = totalTokens > 0 ? (totalCacheRead / totalTokens) * 100 : 0
return {
totalRequests,
totalInputTokens: totalInput,
totalOutputTokens: totalOutput,
totalCacheRead,
totalCacheCreation,
averageInputPerRequest: Math.round(totalInput / totalRequests),
averageOutputPerRequest: Math.round(totalOutput / totalRequests),
cacheHitRate: Math.round(cacheHitRate),
mostUsedModel,
requestsLastHour,
requestsLastDay,
}
}
/**
* Get recent entries within time window.
*/
getRecent(windowMs: number): TokenUsageEntry[] {
const cutoff = Date.now() - windowMs
return this.history.filter(e => e.timestamp >= cutoff)
}
/**
* Clear history.
*/
clear(): void {
this.history = []
}
/**
* Get history size.
*/
get size(): number {
return this.history.length
}
}
/**
 * Get the current context window size in tokens.
 *