Compare commits

...

14 Commits

Author SHA1 Message Date
gnanam1990
91dea452be fix(shell): drop now-unused realpath import 2026-04-24 07:43:43 +05:30
gnanam1990
0e620ae9ea fix(shell): recover when CWD path was replaced by a non-directory
Closes #844.

When the session's cached working directory is renamed on disk and
a file is subsequently created at the old path (e.g. `mv orig renamed
&& touch orig`), every Bash tool invocation failed with
`ENOTDIR: not a directory, posix_spawn '/usr/bin/zsh'` (exit 126),
and `!`-prefixed commands silently failed. No recovery was possible
without restarting the session.

Root cause: the pre-spawn guard in `src/utils/Shell.ts:exec()` used
`realpath(cwd)` to detect a missing CWD. `realpath()` succeeds on
any existing path — file or directory — so a path that was replaced
with a regular file slipped past the check. spawn() was then called
with `cwd` pointing at a non-directory and failed with ENOTDIR.

Fix: replace `realpath()` with `stat().isDirectory()` for both the
primary CWD check and the `getOriginalCwd()` fallback check. When
the cached CWD is no longer a directory, fall back to the original
CWD (as before) and update state so subsequent tools recover
transparently.

Verification:
  - Repro: `mkdir -p /tmp/x/orig && mv /tmp/x/orig /tmp/x/renamed
    && touch /tmp/x/orig`, then exec with stale cwd=/tmp/x/orig
  - Before: exit 126, stderr "ENOTDIR: not a directory, posix_spawn"
  - After:  exit 0, cwd transparently recovered to originalCwd
  - `bun test` — no new regressions (pre-existing model/provider
    test failures are unrelated and present on main)
2026-04-24 07:38:46 +05:30
0xfandom
e346b8d5ec fix(startup): url authoritative over model name in banner provider detect (#864)
The banner provider branch tested model-name substrings (`/deepseek/`, `/kimi/`,
`/mistral/`, `/llama/`) before aggregator base-URL substrings (`/openrouter/`,
`/together/`, `/groq/`, `/azure/`). When running OpenRouter/Together/Groq with
vendor-prefixed model IDs (e.g. `deepseek/deepseek-chat`, `moonshotai/kimi-k2`,
`deepseek-r1-distill-llama-70b`), the banner mislabelled the provider.

Reorder: explicit env flags (NVIDIA_NIM, MINIMAX_API_KEY) and codex transport
win first; base-URL host checks run before rawModel fallback; rawModel only
fires when the base URL is generic/custom. Add unit tests covering the
aggregator × vendor-prefixed-model matrix plus direct-vendor regressions.

Closes #855
2026-04-24 01:52:27 +08:00
hika, maeng
b750e9e97d fix: make OpenAI fallback context window configurable + support external model lookup (#861)
* fix: make OpenAI fallback context window configurable and support external lookup table

Unknown OpenAI-compatible models fell back to a hardcoded 128k constant,
causing auto-compact to fire prematurely on models with larger windows
(issue #635 follow-up). Three escape hatches are added without touching the
built-in table:

- CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW (number): overrides the 128k
  default for all unknown models.
- CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS (JSON object): per-model overrides that
  take precedence over the built-in OPENAI_CONTEXT_WINDOWS table; supports
  the same provider-qualified and prefix-matching lookup as the built-in path.
- CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS (JSON object): same pattern for output
  token limits.

This lets operators deploy new or private models without patching
openaiContextWindows.ts on every model release.
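A rough sketch of the resolution order these variables imply (exact-match lookup only; the shipped path also supports provider-qualified and prefix matching, and the built-in table here is a placeholder):

```typescript
// Hypothetical stand-in for the built-in OPENAI_CONTEXT_WINDOWS table.
const BUILT_IN: Record<string, number> = { 'gpt-4o': 128000 }

const DEFAULT_FALLBACK = 128000

// Resolution order: per-model env JSON overrides win, then the
// built-in table, then the (optionally overridden) fallback constant.
function resolveContextWindow(
  model: string,
  env: Record<string, string | undefined>,
): number {
  let overrides: Record<string, number> = {}
  try {
    overrides = JSON.parse(env.CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS ?? '{}')
  } catch {
    // Malformed JSON: ignore the override rather than crash startup.
  }
  if (model in overrides) return overrides[model]
  if (model in BUILT_IN) return BUILT_IN[model]
  const fallback = Number(env.CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW)
  return Number.isFinite(fallback) && fallback > 0 ? fallback : DEFAULT_FALLBACK
}
```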

* docs: add new OpenAI context window env vars to .env.example

Document CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW,
CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS, and
CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS with usage examples.

Addresses reviewer feedback on PR #861.

---------

Co-authored-by: opencode <dev@example.com>
2026-04-24 00:34:08 +08:00
0xfandom
28de94df5d feat: add OPENCLAUDE_DISABLE_TOOL_REMINDERS env var to suppress hidden tool-output reminders (#837)
Gates three injection sites behind OPENCLAUDE_DISABLE_TOOL_REMINDERS:
- FileReadTool cyber-risk mitigation reminder (appended to every Read
  result when the model is not in MITIGATION_EXEMPT_MODELS)
- todo_reminder attachment for TodoWrite usage
- task_reminder attachment for TaskCreate/TaskUpdate usage

All three reminders are model-only side-channel instructions the user
cannot see today. Users who want full transparency over what the model
receives can now opt out without patching dist/cli.mjs on every upgrade.

Default behavior is unchanged when the flag is unset.

Closes #809
2026-04-23 01:37:02 +08:00
0xfandom
23e8cfbd5b fix(test): add missing teammate exports to hookChains integration mock (#840)
mock.module('./teammate.js', ...) only declared getAgentName/getTeamName/
getTeammateColor. Bun applies module mocks process-globally and
mock.restore() does not undo them, so whenever another test file ran
after hookChains.integration.test.ts and reached the real teammate
module it received undefined for isTeammate/isPlanModeRequired/
getAgentId/getParentSessionId.

This surfaced in CI as intermittent failures in
src/commands/provider/provider.test.tsx (TextEntryDialog / wizard
remount / ProviderWizard hides Codex OAuth), because getDefaultAppState
in AppStateStore.ts calls teammateUtils.isTeammate().

Match the mock surface to the real teammate.ts exports so downstream
consumers keep working even after the integration test pollutes the
module cache. Keeps the same behavioral overrides this test needed.

Closes #839
2026-04-23 01:36:42 +08:00
Kevin Codex
531e3f1059 feat(tools): resilient web search and fetch across all providers (#836)
- Add exponential backoff retry to DuckDuckGo adapter (3 attempts with
  jitter) to handle transient rate-limiting and connection errors.
- Add native fetch() fallback in WebFetch when axios hangs with custom
  DNS lookup in bundled contexts.
- Prevent broken native-path fallback for web search on OpenAI shim
  providers (minimax, moonshot, nvidia-nim, etc.) that do not support
  Anthropic's web_search_20250305 tool.
- Cherry-pick existing fixes:
  - a48bd56: cover codex/minimax/nvidia-nim in getSmallFastModel()
  - 31f0b68: 45s budget + raw-markdown fallback for secondary model
  - 446c1e8: sparse Codex /responses payload parsing
  - ae3f0b2: echo reasoning_content on assistant tool-call messages
- Fix domainCheck.test.ts mock modules to include isFirstPartyAnthropicBaseUrl
  and isGithubNativeAnthropicMode exports.
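The retry pattern named in the first bullet generally looks like this (a sketch with full jitter; the adapter's actual attempt counts and delays may differ):

```typescript
// Retry an async operation with exponential backoff plus random jitter.
// With the defaults: up to 3 attempts, sleeping a random duration up to
// base, 2*base, ... between them.
async function retryWithBackoff<T>(
  op: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastErr: unknown
  for (let i = 0; i < attempts; i++) {
    try {
      return await op()
    } catch (err) {
      lastErr = err
      if (i === attempts - 1) break
      // Full jitter: a random delay up to the exponential cap
      // de-synchronizes concurrent retries against rate limiters.
      const cap = baseDelayMs * 2 ** i
      await new Promise(r => setTimeout(r, Math.random() * cap))
    }
  }
  throw lastErr
}
```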

Co-authored-by: OpenClaude <openclaude@gitlawb.com>
2026-04-23 01:14:00 +08:00
KRATOS
3c4d8435c4 fix: surface actionable error when DuckDuckGo web search is rate-limited (#834)
Non-Anthropic / non-codex providers (minimax, kimi, generic OpenAI-compatible)
fell through to the DDG adapter when no paid search key was configured. DDG's
scraper is blocked on most IPs, so web_search surfaced an opaque "anomaly in
the request" error. Catch that response in the DDG provider and rethrow with
the exact env vars that would unblock the tool, or the option to switch to a
native-search provider.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 00:58:20 +08:00
Kevin Codex
67de6bd2cf fix(openai-shim): echo reasoning_content on assistant tool-call messages for Moonshot (#828)
Kimi / Moonshot's chat completions endpoint requires that every assistant
message carrying tool_calls also carry reasoning_content when the
"thinking" feature is active. When an agent sends prior-turn assistant
history back (standard multi-turn / subagent / Explore patterns), the
shim previously stripped the thinking block:

  case 'thinking':
  case 'redacted_thinking':
    // Strip thinking blocks for OpenAI-compatible providers.
    break

That's correct for providers that would misinterpret serialized
<thinking> tags, but Moonshot validates the schema strictly and rejects
with:

  API Error: 400 {"error":{"message":"thinking is enabled but
  reasoning_content is missing in assistant tool call message at
  index N","type":"invalid_request_error"}}

Reproducer: launch with Kimi profile, run any tool-using command
(Explore, Bash, etc.); every request after the first fails with a 400.

Fix: in convertMessages(), when the per-request flag
preserveReasoningContent is set (only for Moonshot baseUrls today),
attach the original thinking block's text as reasoning_content on the
outgoing OpenAI-shaped assistant message. Other providers continue to
strip (unknown-field rejection risk).

OpenAIMessage type grows a reasoning_content?: string field.
convertMessages() accepts an options object and threads the flag
through; the only call site (_doOpenAIRequest) gates via
isMoonshotBaseUrl(request.baseUrl).
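In outline, the conversion could look like this (simplified types; the real `convertMessages()` handles many more block kinds and call sites):

```typescript
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'thinking'; thinking: string }
  | { type: 'tool_use'; id: string; name: string; input: unknown }

interface OpenAIMessage {
  role: 'assistant'
  content: string
  tool_calls?: unknown[]
  reasoning_content?: string // required by Moonshot when thinking is on
}

// Strip thinking blocks by default, but echo their text as
// reasoning_content when the caller opts in (gated on Moonshot base
// URLs in the real code).
function convertAssistant(
  blocks: ContentBlock[],
  opts: { preserveReasoningContent?: boolean } = {},
): OpenAIMessage {
  const msg: OpenAIMessage = { role: 'assistant', content: '' }
  for (const block of blocks) {
    if (block.type === 'text') msg.content += block.text
    else if (block.type === 'tool_use') (msg.tool_calls ??= []).push({ id: block.id })
    else if (block.type === 'thinking' && opts.preserveReasoningContent)
      msg.reasoning_content = block.thinking
    // thinking blocks are otherwise stripped, as before
  }
  return msg
}
```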

Tests (openaiShim.test.ts):
  - Moonshot: echoes reasoning_content on assistant tool-call messages
    (regression for the reported 400)
  - non-Moonshot providers do NOT receive reasoning_content (guards
    against leaking the field to strict-parse endpoints)

Full suite: 1195/1195 pass under --max-concurrency=1. PR scan clean.

Co-authored-by: OpenClaude <openclaude@gitlawb.com>
2026-04-22 22:47:57 +08:00
0xfandom
4d559c9135 docs(env): document OPENCLAUDE_DISABLE_STRICT_TOOLS in .env.example (#826)
Code support was merged in #770 but the .env.example entry was
missed, leaving users without a discoverable way to find the flag.

Closes #737
2026-04-22 22:16:47 +08:00
JATMN
b7b83eff13 Fix bracketed paste blocking provider form submit (#818) 2026-04-22 19:48:33 +08:00
Urvish L.
44a2c30d5f feat: implement Hook Chains runtime integration for self-healing agent mesh MVP (#711)
* feat: implement Hook Chains runtime integration for self-healing agent mesh MVP

- Add Hook Chains config loader, evaluator, and dispatcher in src/utils/hookChains.ts
- Wire PostToolUseFailure hook dispatch in executePostToolUseFailureHooks()
- Wire TaskCompleted hook dispatch in executeTaskCompletedHooks()
- Integrate fallback-agent launcher with permission preservation (canUseTool threading)
- Add safety hardening for config-read errors (try-catch protection)
- Update docs with MVP runtime trigger explanation
- Add 10 unit tests and 4 integration tests covering config, rules, guards, and actions

This completes the self-healing agent mesh MVP by enabling declarative rule-based
responses to tool failures and task completions, with fallback agent spawning,
team notification, and capacity warming actions.

* Update docs/hook-chains.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/utils/hookChains.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: address PR #711 review blockers for Hook Chains

- Gate hook-chain dispatch behind feature('HOOK_CHAINS') and default env gate to off
- Remove committed local artifact (agent.log) and ignore it in .gitignore
- Revert hook dispatcher signature threading changes for canUseTool
- Use ToolUseContext metadata hookChainsCanUseTool for fallback launch permissions
- Make spawn_fallback_agent fail explicitly when launcher context is unavailable
- Add config cache max age and guard map size limits to bound runtime memory
- Update docs and tests for default-off gating and explicit fallback failure

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-22 19:40:23 +08:00
ArkhAngelLifeJiggy
5b9cd21e37 feat: add streaming optimizer and structured request logging (#703)
* Integrate request logging and streaming optimizer

- Add logApiCallStart/End for API request tracking with correlation IDs
- Add streaming state tracking with processStreamChunk
- Flush buffer and log stream stats at stream end
- Resolve merge conflict with main branch

* feat: add streaming optimizer and structured request logging

* fix: address PR review feedback

- Remove buffering from streamingOptimizer - now purely observational
- Use logForDebugging instead of console.log for structured logging
- Remove dead code (streamResponse, bufferedStreamResponse, etc.)
- Use existing logging infrastructure instead of raw console.log
- Keep only used functions: createStreamState, processStreamChunk, getStreamStats

* test: add unit tests for requestLogging and streamingOptimizer

- streamingOptimizer.test.ts: 6 tests for createStreamState, processStreamChunk, getStreamStats
- requestLogging.test.ts: 6 tests for createCorrelationId, logApiCallStart, logApiCallEnd

* fix: correct durationMs test to be >= 0 instead of exactly 0

* fix: address PR #703 blockers and non-blockers

1. BLOCKER FIX: Skip clone() for streaming responses
   - Only call response.clone() + .json() for non-streaming requests
   - For streaming, usage comes via stream chunks anyway

2. NON-BLOCKER: Document dead code in flushStreamBuffer
   - Added comment explaining it's a no-op kept for API compat

3. NON-BLOCKER: vi.mock in tests - left as-is (test framework issue)

* fix: address all remaining non-blockers for PR #703

1. Remove dead code: flushStreamBuffer call and unused import
2. Fix test for Bun: remove vi.mock, use simple no-throw tests
2026-04-22 15:36:07 +08:00
ArkhAngelLifeJiggy
e92e5274b2 feat: add model-specific tokenizers and compression ratio detection (#799)
- ModelTokenizerConfig for different model families
- getTokenizerConfig() / getBytesPerTokenForModel()
- Content type detection (json, code, prose, list, technical)
- COMPRESSION_RATIOS - empirical ratios per content type
- estimateWithBounds() - confidence intervals

Features: 1.1, 1.14, 1.15
Tests: 13 passing
2026-04-22 13:24:12 +08:00
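In outline, byte-ratio token estimation with confidence bounds looks like this (all ratios and the margin are illustrative placeholders, not the commit's empirical values):

```typescript
// Hypothetical bytes-per-token ratios by detected content type.
const BYTES_PER_TOKEN: Record<string, number> = {
  prose: 4.0,
  code: 3.2,
  json: 2.8,
}

interface TokenEstimate { estimate: number; low: number; high: number }

// Estimate token count from byte length, widening the interval by a
// fixed relative margin to express tokenizer uncertainty.
function estimateWithBounds(
  text: string,
  contentType: string,
  margin = 0.15,
): TokenEstimate {
  const bytes = Buffer.byteLength(text, 'utf8')
  const ratio = BYTES_PER_TOKEN[contentType] ?? 4.0
  const estimate = Math.ceil(bytes / ratio)
  return {
    estimate,
    low: Math.floor(estimate * (1 - margin)),
    high: Math.ceil(estimate * (1 + margin)),
  }
}
```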
37 changed files with 4723 additions and 143 deletions


@@ -149,6 +149,23 @@ ANTHROPIC_API_KEY=sk-ant-your-key-here
# Use a custom OpenAI-compatible endpoint (optional — defaults to api.openai.com)
# OPENAI_BASE_URL=https://api.openai.com/v1
# Fallback context window size (tokens) when the model is not found in the
# built-in table (default: 128000). Increase this for models with larger
# context windows (e.g. 200000 for Claude-sized contexts).
# CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW=128000
# Per-model context window overrides as a JSON object.
# Takes precedence over the built-in table, so you can register new or
# custom models without patching source.
# Example: CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS={"my-corp/llm-v3":262144,"gpt-4o-mini":128000}
# CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS=
# Per-model maximum output token overrides as a JSON object.
# Use this alongside CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS when your model
# supports a different output limit than what the built-in table specifies.
# Example: CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS={"my-corp/llm-v3":8192}
# CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS=
# -----------------------------------------------------------------------------
# Option 3: Google Gemini
@@ -267,6 +284,16 @@ ANTHROPIC_API_KEY=sk-ant-your-key-here
# Disable "Co-authored-by" line in git commits made by OpenClaude
# OPENCLAUDE_DISABLE_CO_AUTHORED_BY=1
# Disable strict tool schema normalization for non-Gemini providers
# Useful when MCP tools with complex optional params (e.g. list[dict])
# trigger "Extra required key ... supplied" errors from OpenAI-compatible endpoints
# OPENCLAUDE_DISABLE_STRICT_TOOLS=1
# Disable hidden <system-reminder> messages injected into tool output
# Suppresses the file-read cyber-risk reminder and the todo/task tool nudges
# Useful for users who want full transparency over what the model sees
# OPENCLAUDE_DISABLE_TOOL_REMINDERS=1
# Custom timeout for API requests in milliseconds (default: varies)
# API_TIMEOUT_MS=60000

.gitignore (vendored, 1 line changed)

@@ -11,3 +11,4 @@ CLAUDE.md
package-lock.json
/.claude
coverage/
agent.log

docs/hook-chains.md (new file, 333 lines)

@@ -0,0 +1,333 @@
# Hook Chains (Self-Healing Agent Mesh MVP)
Hook Chains provide an event-driven recovery layer for important workflow failures.
When a matching hook event occurs, OpenClaude evaluates declarative rules and can dispatch remediation actions such as:
- `spawn_fallback_agent`
- `notify_team`
- `warm_remote_capacity`
## Disabled-By-Default Rollout
> **Rollout recommendation:** keep Hook Chains disabled until you validate rules in your environment.
>
> - Set top-level config to `"enabled": false` initially.
> - Enable per environment when ready.
> - Dispatch is gated by `feature('HOOK_CHAINS')`.
> - Env gate defaults to off unless `CLAUDE_CODE_ENABLE_HOOK_CHAINS=1` is set.

This keeps existing workflows unchanged while you tune guard windows and action behavior.
## Feature Overview
Hook Chains are loaded from a deterministic config file and evaluated on dispatched hook events.
MVP runtime trigger wiring:
- `PostToolUseFailure` hooks dispatch Hook Chains with outcome `failed`.
- `TaskCompleted` hooks dispatch Hook Chains with outcome:
- `success` when completion hooks did not block.
- `failed` when completion hooks returned blocking errors or prevented continuation.
Default config path:
- `.openclaude/hook-chains.json`
Override path:
- `CLAUDE_CODE_HOOK_CHAINS_CONFIG_PATH=/abs/or/relative/path/to/hook-chains.json`
Global gate:
- `feature('HOOK_CHAINS')` must be enabled in the build
- `CLAUDE_CODE_ENABLE_HOOK_CHAINS=0|1` (defaults to disabled when unset)
## Safety Guarantees
The runtime is intentionally conservative:
- **Depth guard:** chain dispatch is blocked when `chainDepth >= maxChainDepth`.
- **Rule cooldown:** each rule can only re-fire after cooldown expires.
- **Dedup window:** identical event/action combinations are suppressed for a window.
- **Abort-safe behavior:** if the current signal is aborted, actions skip safely.
- **Policy-aware remote warm:** `warm_remote_capacity` skips when remote sessions are policy denied.
- **Bridge inactive no-op:** `warm_remote_capacity` safely skips when no active bridge handle exists.
- **Missing team context safety:** `notify_team` skips with structured reason if no team context/team file is available.
- **Fallback launcher safety:** `spawn_fallback_agent` fails with a structured reason when launch permissions/context are unavailable.
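The cooldown and dedup guards above can be pictured as a small timestamp map (a sketch, not the shipped dispatcher):

```typescript
// Tracks last-fire timestamps to enforce per-rule cooldowns and
// per-event/action dedup windows.
class GuardState {
  private lastRuleFire = new Map<string, number>()
  private lastActionKey = new Map<string, number>()

  // A rule may fire only if its cooldown has expired since its last fire.
  ruleAllowed(ruleId: string, cooldownMs: number, now: number): boolean {
    const last = this.lastRuleFire.get(ruleId)
    if (last !== undefined && now - last < cooldownMs) return false
    this.lastRuleFire.set(ruleId, now)
    return true
  }

  // Identical event/action pairs are suppressed within the dedup window.
  actionAllowed(eventName: string, actionId: string, dedupWindowMs: number, now: number): boolean {
    const key = `${eventName}:${actionId}`
    const last = this.lastActionKey.get(key)
    if (last !== undefined && now - last < dedupWindowMs) return false
    this.lastActionKey.set(key, now)
    return true
  }
}
```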
## Configuration Schema Reference
Top-level object:
```json
{
  "version": 1,
  "enabled": true,
  "maxChainDepth": 2,
  "defaultCooldownMs": 30000,
  "defaultDedupWindowMs": 30000,
  "rules": []
}
```
### Top-Level Fields
| Field | Type | Required | Notes |
|---|---|---:|---|
| `version` | `1` | No | Defaults to `1`. |
| `enabled` | `boolean` | No | Global feature switch for this config file. |
| `maxChainDepth` | `integer` | No | Global depth guard (default `2`, max `10`). |
| `defaultCooldownMs` | `integer` | No | Default rule cooldown in ms (default `30000`). |
| `defaultDedupWindowMs` | `integer` | No | Default action dedup window in ms (default `30000`). |
| `rules` | `HookChainRule[]` | No | Defaults to `[]`. May be omitted or empty; when no rules are present, dispatch is a no-op and returns `enabled: false`. |
> **Note:** An empty ruleset is valid and can be used to keep Hook Chains configured but effectively disabled until rules are added.
### Rule Object (`HookChainRule`)
```json
{
  "id": "task-failure-recovery",
  "enabled": true,
  "trigger": {
    "event": "TaskCompleted",
    "outcome": "failed"
  },
  "condition": {
    "toolNames": ["Edit"],
    "taskStatuses": ["failed"],
    "errorIncludes": ["timeout", "permission denied"],
    "eventFieldEquals": {
      "meta.source": "scheduler"
    }
  },
  "cooldownMs": 60000,
  "dedupWindowMs": 30000,
  "maxDepth": 2,
  "actions": []
}
```
| Field | Type | Required | Notes |
|---|---|---:|---|
| `id` | `string` | Yes | Stable identifier used in telemetry/guards. |
| `enabled` | `boolean` | No | Per-rule switch. |
| `trigger.event` | `HookEvent` | Yes | Event name to match. |
| `trigger.outcome` | `"success"\|"failed"\|"timeout"\|"unknown"` | No | Single outcome matcher. |
| `trigger.outcomes` | `Outcome[]` | No | Multi-outcome matcher. Use either `outcome` or `outcomes`. |
| `condition` | `object` | No | Optional extra matching constraints. |
| `cooldownMs` | `integer` | No | Overrides global cooldown for this rule. |
| `dedupWindowMs` | `integer` | No | Overrides global dedup for this rule. |
| `maxDepth` | `integer` | No | Per-rule depth cap. |
| `actions` | `HookChainAction[]` | Yes | One or more actions to execute in order. |
### Condition Fields
| Field | Type | Notes |
|---|---|---|
| `toolNames` | `string[]` | Matches `tool_name` / `toolName` in event payload. |
| `taskStatuses` | `string[]` | Matches `task_status` / `taskStatus` / `status`. |
| `errorIncludes` | `string[]` | Case-insensitive substring match against `error` / `reason` / `message`. |
| `eventFieldEquals` | `Record<string, string\|number\|boolean>` | Dot-path equality against payload (example: `"meta.source": "scheduler"`). |
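The `eventFieldEquals` dot-path comparison can be understood as (a sketch):

```typescript
// Resolve a dot-separated path like "meta.source" against a payload.
function getByDotPath(obj: unknown, path: string): unknown {
  let cur: any = obj
  for (const key of path.split('.')) {
    if (cur === null || typeof cur !== 'object') return undefined
    cur = cur[key]
  }
  return cur
}

// True when every entry of eventFieldEquals matches the payload exactly.
function eventFieldEqualsMatches(
  payload: unknown,
  spec: Record<string, string | number | boolean>,
): boolean {
  return Object.entries(spec).every(([path, want]) => getByDotPath(payload, path) === want)
}
```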
### Actions
#### `spawn_fallback_agent`
```json
{
  "type": "spawn_fallback_agent",
  "id": "fallback-1",
  "enabled": true,
  "dedupWindowMs": 30000,
  "description": "Fallback recovery for failed task",
  "promptTemplate": "Recover task ${TASK_SUBJECT}. Event=${EVENT_NAME}, outcome=${OUTCOME}, error=${ERROR}. Payload=${PAYLOAD_JSON}",
  "agentType": "general-purpose",
  "model": "sonnet"
}
```
#### `notify_team`
```json
{
  "type": "notify_team",
  "id": "notify-ops",
  "enabled": true,
  "dedupWindowMs": 30000,
  "teamName": "mesh-team",
  "recipients": ["*"],
  "summary": "Hook chain ${RULE_ID} fired",
  "messageTemplate": "Event=${EVENT_NAME} outcome=${OUTCOME}\nTask=${TASK_ID}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}"
}
```
#### `warm_remote_capacity`
```json
{
  "type": "warm_remote_capacity",
  "id": "warm-bridge",
  "enabled": true,
  "dedupWindowMs": 60000,
  "createDefaultEnvironmentIfMissing": false
}
```
## Complete Example Configs
### 1) Retry via Fallback Agent
```json
{
  "version": 1,
  "enabled": true,
  "maxChainDepth": 2,
  "defaultCooldownMs": 30000,
  "defaultDedupWindowMs": 30000,
  "rules": [
    {
      "id": "retry-task-via-fallback",
      "trigger": {
        "event": "TaskCompleted",
        "outcome": "failed"
      },
      "cooldownMs": 60000,
      "actions": [
        {
          "type": "spawn_fallback_agent",
          "id": "spawn-retry-agent",
          "description": "Retry failed task with fallback agent",
          "promptTemplate": "A task failed. Recover it safely.\nTask=${TASK_SUBJECT}\nDescription=${TASK_DESCRIPTION}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}",
          "agentType": "general-purpose",
          "model": "sonnet"
        }
      ]
    }
  ]
}
```
### 2) Notify Only
```json
{
  "version": 1,
  "enabled": true,
  "maxChainDepth": 2,
  "defaultCooldownMs": 30000,
  "defaultDedupWindowMs": 30000,
  "rules": [
    {
      "id": "notify-on-tool-failure",
      "trigger": {
        "event": "PostToolUseFailure",
        "outcome": "failed"
      },
      "condition": {
        "toolNames": ["Edit", "Write", "Bash"]
      },
      "actions": [
        {
          "type": "notify_team",
          "id": "notify-team-failure",
          "recipients": ["*"],
          "summary": "Tool failure detected",
          "messageTemplate": "Tool failure detected.\nEvent=${EVENT_NAME} outcome=${OUTCOME}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}"
        }
      ]
    }
  ]
}
```
### 3) Combined Fallback + Notify + Bridge Warm
```json
{
  "version": 1,
  "enabled": true,
  "maxChainDepth": 2,
  "defaultCooldownMs": 45000,
  "defaultDedupWindowMs": 30000,
  "rules": [
    {
      "id": "full-recovery-chain",
      "trigger": {
        "event": "TaskCompleted",
        "outcomes": ["failed", "timeout"]
      },
      "condition": {
        "errorIncludes": ["timeout", "capacity", "connection"]
      },
      "cooldownMs": 90000,
      "actions": [
        {
          "type": "spawn_fallback_agent",
          "id": "fallback-agent",
          "description": "Recover failed task execution",
          "promptTemplate": "Recover failed task and produce a concise fix summary.\nTask=${TASK_SUBJECT}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}"
        },
        {
          "type": "notify_team",
          "id": "notify-team",
          "recipients": ["*"],
          "summary": "Recovery chain triggered",
          "messageTemplate": "Recovery chain ${RULE_ID} fired.\nOutcome=${OUTCOME}\nTask=${TASK_SUBJECT}\nError=${ERROR}"
        },
        {
          "type": "warm_remote_capacity",
          "id": "warm-capacity",
          "createDefaultEnvironmentIfMissing": false
        }
      ]
    }
  ]
}
```
## Template Variables
The following placeholders are supported by `promptTemplate`, `summary`, and `messageTemplate`:
- `${EVENT_NAME}`
- `${OUTCOME}`
- `${RULE_ID}`
- `${TASK_SUBJECT}`
- `${TASK_DESCRIPTION}`
- `${TASK_ID}`
- `${ERROR}`
- `${PAYLOAD_JSON}`
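Substitution presumably reduces to a regex replace over `${NAME}` tokens (a sketch; the real renderer may treat missing keys differently):

```typescript
// Replace ${NAME} placeholders with values from a variable map.
// Unknown placeholders are left intact so misconfigurations stay visible.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\$\{([A-Z_]+)\}/g, (whole, name: string) =>
    name in vars ? vars[name] : whole,
  )
}
```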
## Troubleshooting
### Rule never triggers
- Verify `trigger.event` and `trigger.outcome`/`trigger.outcomes` exactly match dispatched event data.
- Check `condition` filters (especially `toolNames` and `eventFieldEquals` dot-path keys).
- Confirm the config file is valid JSON and schema-valid.
### Actions show as skipped
Common skip reasons:
- `action disabled`
- `rule cooldown active ...`
- `dedup window active ...`
- `max chain depth reached ...`
- `No team context is available ...`
- `Team file not found ...`
- `Remote sessions are blocked by policy`
- `Bridge is not active; warm_remote_capacity is a safe no-op`
- `No fallback agent launcher is registered in runtime context`
### Config changes not reflected
- Loader uses memoization by file mtime/size.
- Ensure your editor writes the file fully and updates mtime.
- If needed, force reload from the caller side with `forceReloadConfig: true`.
### Existing workflows changed unexpectedly
- Set `"enabled": false` at top-level.
- Or globally disable with `CLAUDE_CODE_ENABLE_HOOK_CHAINS=0`.
- Re-enable gradually after validating one rule at a time.


@@ -249,6 +249,11 @@ export type ToolUseContext = {
  /** When true, canUseTool must always be called even when hooks auto-approve.
   * Used by speculation for overlay file path rewriting. */
  requireCanUseTool?: boolean
  /**
   * Optional callback used by hook-chain fallback actions that launch
   * AgentTool from hook runtime paths.
   */
  hookChainsCanUseTool?: CanUseToolFn
  messages: Message[]
  fileReadingLimits?: {
    maxTokens?: number


@@ -169,6 +169,14 @@ describe('Web search result count improvements', () => {
    expect(content).toMatch(/max_uses:\s*15/)
  })
  test('codex web search path guarantees a non-empty result body', async () => {
    const content = await file(
      'tools/WebSearchTool/WebSearchTool.ts',
    ).text()
    expect(content).toContain("results.push('No results found.')")
  })
})
// ---------------------------------------------------------------------------


@@ -0,0 +1,158 @@
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
import { detectProvider } from './StartupScreen.js'

const ENV_KEYS = [
  'CLAUDE_CODE_USE_OPENAI',
  'CLAUDE_CODE_USE_GEMINI',
  'CLAUDE_CODE_USE_GITHUB',
  'CLAUDE_CODE_USE_BEDROCK',
  'CLAUDE_CODE_USE_VERTEX',
  'CLAUDE_CODE_USE_MISTRAL',
  'OPENAI_BASE_URL',
  'OPENAI_API_KEY',
  'OPENAI_MODEL',
  'GEMINI_MODEL',
  'MISTRAL_MODEL',
  'ANTHROPIC_MODEL',
  'NVIDIA_NIM',
  'MINIMAX_API_KEY',
]

const originalEnv: Record<string, string | undefined> = {}

beforeEach(() => {
  for (const key of ENV_KEYS) {
    originalEnv[key] = process.env[key]
    delete process.env[key]
  }
})

afterEach(() => {
  for (const key of ENV_KEYS) {
    if (originalEnv[key] === undefined) {
      delete process.env[key]
    } else {
      process.env[key] = originalEnv[key]
    }
  }
})

function setupOpenAIMode(baseUrl: string, model: string): void {
  process.env.CLAUDE_CODE_USE_OPENAI = '1'
  process.env.OPENAI_BASE_URL = baseUrl
  process.env.OPENAI_MODEL = model
  process.env.OPENAI_API_KEY = 'test-key'
}

// --- Issue #855: aggregator URL must win over vendor-prefixed model name ---
describe('detectProvider — aggregator URL authoritative over model-name substring (#855)', () => {
  test('OpenRouter + deepseek/deepseek-chat labels as OpenRouter', () => {
    setupOpenAIMode('https://openrouter.ai/api/v1', 'deepseek/deepseek-chat')
    expect(detectProvider().name).toBe('OpenRouter')
  })
  test('OpenRouter + moonshotai/kimi-k2 labels as OpenRouter', () => {
    setupOpenAIMode('https://openrouter.ai/api/v1', 'moonshotai/kimi-k2')
    expect(detectProvider().name).toBe('OpenRouter')
  })
  test('OpenRouter + mistralai/mistral-large labels as OpenRouter', () => {
    setupOpenAIMode('https://openrouter.ai/api/v1', 'mistralai/mistral-large')
    expect(detectProvider().name).toBe('OpenRouter')
  })
  test('OpenRouter + meta-llama/llama-3.3 labels as OpenRouter', () => {
    setupOpenAIMode('https://openrouter.ai/api/v1', 'meta-llama/llama-3.3-70b-instruct')
    expect(detectProvider().name).toBe('OpenRouter')
  })
  test('Together + deepseek-ai/DeepSeek-V3 labels as Together AI', () => {
    setupOpenAIMode('https://api.together.xyz/v1', 'deepseek-ai/DeepSeek-V3')
    expect(detectProvider().name).toBe('Together AI')
  })
  test('Together + meta-llama/Llama-3.3 labels as Together AI', () => {
    setupOpenAIMode('https://api.together.xyz/v1', 'meta-llama/Llama-3.3-70B-Instruct-Turbo')
    expect(detectProvider().name).toBe('Together AI')
  })
  test('Groq + deepseek-r1-distill-llama-70b labels as Groq', () => {
    setupOpenAIMode('https://api.groq.com/openai/v1', 'deepseek-r1-distill-llama-70b')
    expect(detectProvider().name).toBe('Groq')
  })
  test('Groq + llama-3.3-70b-versatile labels as Groq', () => {
    setupOpenAIMode('https://api.groq.com/openai/v1', 'llama-3.3-70b-versatile')
    expect(detectProvider().name).toBe('Groq')
  })
  test('Azure + any deepseek deployment labels as Azure OpenAI', () => {
    setupOpenAIMode('https://my-resource.openai.azure.com/', 'deepseek-chat')
    expect(detectProvider().name).toBe('Azure OpenAI')
  })
})

// --- Direct vendor endpoints still label correctly (regression) ---
describe('detectProvider — direct vendor endpoints', () => {
  test('api.deepseek.com labels as DeepSeek', () => {
    setupOpenAIMode('https://api.deepseek.com/v1', 'deepseek-chat')
    expect(detectProvider().name).toBe('DeepSeek')
  })
  test('api.moonshot.cn labels as Moonshot (Kimi)', () => {
    setupOpenAIMode('https://api.moonshot.cn/v1', 'moonshot-v1-8k')
    expect(detectProvider().name).toBe('Moonshot (Kimi)')
  })
  test('api.mistral.ai labels as Mistral', () => {
    setupOpenAIMode('https://api.mistral.ai/v1', 'mistral-large-latest')
    expect(detectProvider().name).toBe('Mistral')
  })
  test('default OpenAI URL + gpt-4o labels as OpenAI', () => {
    setupOpenAIMode('https://api.openai.com/v1', 'gpt-4o')
    expect(detectProvider().name).toBe('OpenAI')
  })
})

// --- rawModel fallback for generic/custom endpoints ---
describe('detectProvider — rawModel fallback when URL is generic', () => {
  test('custom proxy + deepseek-chat falls back to DeepSeek', () => {
    setupOpenAIMode('https://my-proxy.internal/v1', 'deepseek-chat')
    expect(detectProvider().name).toBe('DeepSeek')
  })
  test('custom proxy + kimi-k2 falls back to Moonshot (Kimi)', () => {
    setupOpenAIMode('https://my-proxy.internal/v1', 'kimi-k2-instruct')
    expect(detectProvider().name).toBe('Moonshot (Kimi)')
  })
  test('custom proxy + llama-3.3 falls back to Meta Llama', () => {
    setupOpenAIMode('https://my-proxy.internal/v1', 'llama-3.3-70b')
    expect(detectProvider().name).toBe('Meta Llama')
  })
  test('custom proxy + mistral-large falls back to Mistral', () => {
    setupOpenAIMode('https://my-proxy.internal/v1', 'mistral-large-latest')
    expect(detectProvider().name).toBe('Mistral')
  })
})

// --- Explicit env flags win over URL heuristics ---
describe('detectProvider — explicit dedicated-provider env flags', () => {
  test('NVIDIA_NIM=1 overrides aggregator URL', () => {
    setupOpenAIMode('https://openrouter.ai/api/v1', 'some-nim-model')
    process.env.NVIDIA_NIM = '1'
    expect(detectProvider().name).toBe('NVIDIA NIM')
  })
  test('MINIMAX_API_KEY overrides aggregator URL', () => {
    setupOpenAIMode('https://openrouter.ai/api/v1', 'any-model')
    process.env.MINIMAX_API_KEY = 'test-key'
    expect(detectProvider().name).toBe('MiniMax')
  })
})


@@ -83,7 +83,7 @@ const LOGO_CLAUDE = [
// ─── Provider detection ───────────────────────────────────────────────────────
function detectProvider(): { name: string; model: string; baseUrl: string; isLocal: boolean } {
export function detectProvider(): { name: string; model: string; baseUrl: string; isLocal: boolean } {
const useGemini = process.env.CLAUDE_CODE_USE_GEMINI === '1' || process.env.CLAUDE_CODE_USE_GEMINI === 'true'
const useGithub = process.env.CLAUDE_CODE_USE_GITHUB === '1' || process.env.CLAUDE_CODE_USE_GITHUB === 'true'
const useOpenAI = process.env.CLAUDE_CODE_USE_OPENAI === '1' || process.env.CLAUDE_CODE_USE_OPENAI === 'true'
@@ -117,30 +117,34 @@ function detectProvider(): { name: string; model: string; baseUrl: string; isLoc
const baseUrl = resolvedRequest.baseUrl
const isLocal = isLocalProviderUrl(baseUrl)
let name = 'OpenAI'
if (/nvidia/i.test(baseUrl) || /nvidia/i.test(rawModel) || process.env.NVIDIA_NIM)
name = 'NVIDIA NIM'
else if (/minimax/i.test(baseUrl) || /minimax/i.test(rawModel) || process.env.MINIMAX_API_KEY)
name = 'MiniMax'
else if (resolvedRequest.transport === 'codex_responses' || baseUrl.includes('chatgpt.com/backend-api/codex'))
// Explicit dedicated-provider env flags win.
if (process.env.NVIDIA_NIM) name = 'NVIDIA NIM'
else if (process.env.MINIMAX_API_KEY) name = 'MiniMax'
else if (
resolvedRequest.transport === 'codex_responses' ||
baseUrl.includes('chatgpt.com/backend-api/codex')
)
name = 'Codex'
else if (/moonshot/i.test(baseUrl) || /kimi/i.test(rawModel))
name = 'Moonshot (Kimi)'
else if (/deepseek/i.test(baseUrl) || /deepseek/i.test(rawModel))
name = 'DeepSeek'
else if (/openrouter/i.test(baseUrl))
name = 'OpenRouter'
else if (/together/i.test(baseUrl))
name = 'Together AI'
else if (/groq/i.test(baseUrl))
name = 'Groq'
else if (/mistral/i.test(baseUrl) || /mistral/i.test(rawModel))
name = 'Mistral'
else if (/azure/i.test(baseUrl))
name = 'Azure OpenAI'
else if (/llama/i.test(rawModel))
name = 'Meta Llama'
else if (isLocal)
name = getLocalOpenAICompatibleProviderLabel(baseUrl)
// Base URL is authoritative — must precede rawModel checks so aggregators
// (OpenRouter/Together/Groq) aren't mislabelled as DeepSeek/Kimi/etc.
// when routed to models whose IDs contain a vendor prefix. See issue #855.
else if (/openrouter/i.test(baseUrl)) name = 'OpenRouter'
else if (/together/i.test(baseUrl)) name = 'Together AI'
else if (/groq/i.test(baseUrl)) name = 'Groq'
else if (/azure/i.test(baseUrl)) name = 'Azure OpenAI'
else if (/nvidia/i.test(baseUrl)) name = 'NVIDIA NIM'
else if (/minimax/i.test(baseUrl)) name = 'MiniMax'
else if (/moonshot/i.test(baseUrl)) name = 'Moonshot (Kimi)'
else if (/deepseek/i.test(baseUrl)) name = 'DeepSeek'
else if (/mistral/i.test(baseUrl)) name = 'Mistral'
// rawModel fallback — fires only when base URL is generic/custom.
else if (/nvidia/i.test(rawModel)) name = 'NVIDIA NIM'
else if (/minimax/i.test(rawModel)) name = 'MiniMax'
else if (/kimi/i.test(rawModel)) name = 'Moonshot (Kimi)'
else if (/deepseek/i.test(rawModel)) name = 'DeepSeek'
else if (/mistral/i.test(rawModel)) name = 'Mistral'
else if (/llama/i.test(rawModel)) name = 'Meta Llama'
else if (isLocal) name = getLocalOpenAICompatibleProviderLabel(baseUrl)
// Resolve model alias to actual model name + reasoning effort
let displayModel = resolvedRequest.resolvedModel

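The reordered branches above encode a three-tier precedence: explicit env flags, then base-URL substrings, then rawModel substrings. A standalone sketch of that ordering follows; the function name and the trimmed-down rule lists here are illustrative only, not the real `detectProvider` implementation.

```typescript
// Minimal sketch of the precedence the reordered branches encode.
type ProviderRule = { pattern: RegExp; name: string }

// Base-URL rules (authoritative tier) — trimmed to a few entries for the sketch.
const URL_RULES: ProviderRule[] = [
  { pattern: /openrouter/i, name: 'OpenRouter' },
  { pattern: /together/i, name: 'Together AI' },
  { pattern: /groq/i, name: 'Groq' },
  { pattern: /deepseek/i, name: 'DeepSeek' },
]

// rawModel rules (fallback tier) — only reached for generic/custom base URLs.
const MODEL_RULES: ProviderRule[] = [
  { pattern: /deepseek/i, name: 'DeepSeek' },
  { pattern: /kimi/i, name: 'Moonshot (Kimi)' },
  { pattern: /llama/i, name: 'Meta Llama' },
]

function sketchDetectName(baseUrl: string, rawModel: string): string {
  // URL rules run first, so an aggregator base URL wins even when the model
  // ID carries a vendor prefix such as "deepseek/deepseek-chat".
  for (const rule of URL_RULES) if (rule.pattern.test(baseUrl)) return rule.name
  for (const rule of MODEL_RULES) if (rule.pattern.test(rawModel)) return rule.name
  return 'OpenAI'
}
```

With this ordering, `openrouter.ai` plus a `deepseek/`-prefixed model ID labels as OpenRouter, while a custom proxy URL falls through to the model-name tier, matching the test expectations above.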
View File

@@ -1,5 +1,8 @@
import { expect, test } from 'bun:test'
import { supportsClipboardImageFallback } from './usePasteHandler.ts'
import {
shouldHandleInputAsPaste,
supportsClipboardImageFallback,
} from './usePasteHandler.ts'
test('supports clipboard image fallback on Windows', () => {
expect(supportsClipboardImageFallback('windows')).toBe(true)
@@ -20,3 +23,42 @@ test('does not support clipboard image fallback on WSL', () => {
test('does not support clipboard image fallback on unknown platforms', () => {
expect(supportsClipboardImageFallback('unknown')).toBe(false)
})
test('does not treat a bracketed paste as pending when no paste handlers are provided', () => {
expect(
shouldHandleInputAsPaste({
hasTextPasteHandler: false,
hasImagePasteHandler: false,
inputLength: 'kimi-k2.5'.length,
pastePending: false,
hasImageFilePath: false,
isFromPaste: true,
}),
).toBe(false)
})
test('treats bracketed text paste as pending when a text paste handler exists', () => {
expect(
shouldHandleInputAsPaste({
hasTextPasteHandler: true,
hasImagePasteHandler: false,
inputLength: 'kimi-k2.5'.length,
pastePending: false,
hasImageFilePath: false,
isFromPaste: true,
}),
).toBe(true)
})
test('treats image path paste as pending when only an image handler exists', () => {
expect(
shouldHandleInputAsPaste({
hasTextPasteHandler: false,
hasImagePasteHandler: true,
inputLength: 'C:\\Users\\jat\\image.png'.length,
pastePending: false,
hasImageFilePath: true,
isFromPaste: false,
}),
).toBe(true)
})

View File

@@ -35,6 +35,24 @@ type PasteHandlerProps = {
) => void
}
export function shouldHandleInputAsPaste(options: {
hasTextPasteHandler: boolean
hasImagePasteHandler: boolean
inputLength: number
pastePending: boolean
hasImageFilePath: boolean
isFromPaste: boolean
}): boolean {
return (
(options.hasTextPasteHandler &&
(options.inputLength > PASTE_THRESHOLD ||
options.pastePending ||
options.hasImageFilePath ||
options.isFromPaste)) ||
(options.hasImagePasteHandler && options.hasImageFilePath)
)
}
export function usePasteHandler({
onPaste,
onInput,
@@ -236,11 +254,6 @@ export function usePasteHandler({
// The keypress parser sets isPasted=true for content within bracketed paste.
const isFromPaste = event.keypress.isPasted
// If this is pasted content, set isPasting state for UI feedback
if (isFromPaste) {
setIsPasting(true)
}
// Handle large pastes (>PASTE_THRESHOLD chars)
// Usually we get one or two input characters at a time. If we
// get more than the threshold, the user has probably pasted.
@@ -268,6 +281,7 @@ export function usePasteHandler({
canFallbackToClipboardImage &&
onImagePaste
) {
setIsPasting(true)
checkClipboardForImage()
// Reset isPasting since there's no text content to process
setIsPasting(false)
@@ -275,14 +289,17 @@ export function usePasteHandler({
}
// Check if we should handle as paste (from bracketed paste, large input, or continuation)
const shouldHandleAsPaste =
onPaste &&
(input.length > PASTE_THRESHOLD ||
pastePendingRef.current ||
hasImageFilePath ||
isFromPaste)
const shouldHandleAsPaste = shouldHandleInputAsPaste({
hasTextPasteHandler: Boolean(onPaste),
hasImagePasteHandler: Boolean(onImagePaste),
inputLength: input.length,
pastePending: pastePendingRef.current,
hasImageFilePath,
isFromPaste,
})
if (shouldHandleAsPaste) {
setIsPasting(true)
pastePendingRef.current = true
setPasteState(({ chunks, timeoutId }) => {
return {

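The behavioral point of extracting `shouldHandleInputAsPaste` is the second clause: the old inline expression gated every branch on `onPaste`, so an image-path paste with only an image handler fell through. A minimal before/after sketch, where the `PASTE_THRESHOLD` value is an assumption for illustration (the real constant lives in `usePasteHandler.ts`):

```typescript
// Restated predicates for comparison; PASTE_THRESHOLD's value is assumed here.
const PASTE_THRESHOLD = 800

type PasteSignals = {
  hasTextPasteHandler: boolean
  hasImagePasteHandler: boolean
  inputLength: number
  pastePending: boolean
  hasImageFilePath: boolean
  isFromPaste: boolean
}

// Old inline expression: every branch required a text paste handler.
const oldPredicate = (o: PasteSignals): boolean =>
  o.hasTextPasteHandler &&
  (o.inputLength > PASTE_THRESHOLD || o.pastePending || o.hasImageFilePath || o.isFromPaste)

// Extracted predicate: adds the image-handler-only branch.
const newPredicate = (o: PasteSignals): boolean =>
  oldPredicate(o) || (o.hasImagePasteHandler && o.hasImageFilePath)

// An image-path paste with only an image handler attached.
const imageOnlyPaste: PasteSignals = {
  hasTextPasteHandler: false,
  hasImagePasteHandler: true,
  inputLength: 24,
  pastePending: false,
  hasImageFilePath: true,
  isFromPaste: false,
}
```

The old predicate rejects `imageOnlyPaste`; the new one accepts it, which is the case the third test above pins down.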
View File

@@ -8,6 +8,7 @@ import {
convertCodexResponseToAnthropicMessage,
convertToolsToResponsesTools,
} from './codexShim.js'
import { __test as webSearchToolTest } from '../../tools/WebSearchTool/WebSearchTool.js'
const tempDirs: string[] = []
const originalEnv = {
@@ -609,6 +610,164 @@ describe('Codex request translation', () => {
])
})
test('recovers Codex web search text and sources from sparse completed response', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
sources: [
{
title: 'OpenClaude repo',
url: 'https://github.com/example/openclaude',
},
],
},
{
type: 'message',
role: 'assistant',
content: [
{
type: 'text',
text: 'OpenClaude is available on GitHub.',
sources: [
{
title: 'Docs',
url: 'https://docs.example.com/openclaude',
},
],
},
],
},
],
},
'OpenClaude GitHub 2026',
0.42,
)
expect(output.results).toEqual([
'OpenClaude is available on GitHub.',
{
tool_use_id: 'codex-web-search',
content: [
{
title: 'OpenClaude repo',
url: 'https://github.com/example/openclaude',
},
{
title: 'Docs',
url: 'https://docs.example.com/openclaude',
},
],
},
])
})
test('falls back to a non-empty Codex web search result message', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{ output: [] },
'OpenClaude GitHub 2026',
0.11,
)
expect(output.results).toEqual(['No results found.'])
})
test('surfaces Codex web search failure reason with a message', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
status: 'failed',
error: { message: 'upstream search provider rate-limited' },
},
],
},
'OpenClaude GitHub 2026',
0.05,
)
expect(output.results).toEqual([
'Web search failed: upstream search provider rate-limited',
])
})
test('surfaces Codex web search failure reason nested under action.error', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
status: 'failed',
action: { error: { message: 'query blocked' } },
},
],
},
'OpenClaude GitHub 2026',
0.05,
)
expect(output.results).toEqual(['Web search failed: query blocked'])
})
test('handles Codex web search failure with no reason attached', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
status: 'failed',
},
],
},
'OpenClaude GitHub 2026',
0.05,
)
expect(output.results).toEqual(['Web search failed.'])
})
test('a failure item does not suppress sources from a later message item', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
status: 'failed',
error: { message: 'partial outage' },
},
{
type: 'message',
role: 'assistant',
content: [
{
type: 'output_text',
text: 'Partial results below.',
sources: [
{ title: 'Docs', url: 'https://docs.example.com/openclaude' },
],
},
],
},
],
},
'OpenClaude GitHub 2026',
0.05,
)
expect(output.results).toEqual([
'Web search failed: partial outage',
'Partial results below.',
{
tool_use_id: 'codex-web-search',
content: [
{ title: 'Docs', url: 'https://docs.example.com/openclaude' },
],
},
])
})
test('translates Codex SSE text stream into Anthropic events', async () => {
const responseText = [
'event: response.output_item.added',

View File

@@ -3343,6 +3343,139 @@ test('Moonshot: uses max_tokens (not max_completion_tokens) and strips store', a
expect(requestBody?.store).toBeUndefined()
})
test('Moonshot: echoes reasoning_content on assistant tool-call messages', async () => {
// Regression for: "API Error: 400 {"error":{"message":"thinking is enabled
// but reasoning_content is missing in assistant tool call message at index
// N"}}" when the agent sends a prior-turn assistant response back to Kimi.
// The thinking block captured from the inbound response must round-trip
// as reasoning_content on the outgoing echoed assistant message.
process.env.OPENAI_BASE_URL = 'https://api.moonshot.ai/v1'
process.env.OPENAI_API_KEY = 'sk-moonshot-test'
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'kimi-k2.6',
choices: [
{ message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
],
usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
}),
{ headers: { 'Content-Type': 'application/json' } },
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'kimi-k2.6',
system: 'you are kimi',
messages: [
{ role: 'user', content: 'check the logs' },
{
role: 'assistant',
content: [
{
type: 'thinking',
thinking: 'Need to inspect logs via Bash; running a cat.',
},
{ type: 'text', text: "I'll inspect the logs." },
{
type: 'tool_use',
id: 'call_bash_1',
name: 'Bash',
input: { command: 'cat /tmp/app.log' },
},
],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'call_bash_1',
content: 'log line 1\nlog line 2',
},
],
},
],
max_tokens: 256,
stream: false,
})
const messages = requestBody?.messages as Array<Record<string, unknown>>
const assistantWithToolCall = messages.find(
m => m.role === 'assistant' && Array.isArray(m.tool_calls),
)
expect(assistantWithToolCall).toBeDefined()
expect(assistantWithToolCall?.reasoning_content).toBe(
'Need to inspect logs via Bash; running a cat.',
)
})
test('non-Moonshot providers do NOT receive reasoning_content on assistant messages', async () => {
// Guard: only Moonshot opts in. DeepSeek/OpenRouter/etc. receive the
// outgoing assistant message without reasoning_content to avoid
// unknown-field rejections from strict servers.
process.env.OPENAI_BASE_URL = 'https://api.deepseek.com/v1'
process.env.OPENAI_API_KEY = 'sk-deepseek'
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'deepseek-chat',
choices: [
{ message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
],
usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
}),
{ headers: { 'Content-Type': 'application/json' } },
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'deepseek-chat',
system: 'test',
messages: [
{ role: 'user', content: 'hi' },
{
role: 'assistant',
content: [
{ type: 'thinking', thinking: 'thought' },
{ type: 'text', text: 'hello' },
{
type: 'tool_use',
id: 'call_1',
name: 'Bash',
input: { command: 'ls' },
},
],
},
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'call_1', content: 'files' },
],
},
],
max_tokens: 32,
stream: false,
})
const messages = requestBody?.messages as Array<Record<string, unknown>>
const assistantWithToolCall = messages.find(
m => m.role === 'assistant' && Array.isArray(m.tool_calls),
)
expect(assistantWithToolCall).toBeDefined()
expect(assistantWithToolCall?.reasoning_content).toBeUndefined()
})
test('Moonshot: cn host is also detected', async () => {
process.env.OPENAI_BASE_URL = 'https://api.moonshot.cn/v1'
process.env.OPENAI_API_KEY = 'sk-moonshot-test'

View File

@@ -67,6 +67,8 @@ import {
normalizeToolArguments,
hasToolFieldMapping,
} from './toolArgumentNormalization.js'
import { logApiCallStart, logApiCallEnd } from '../../utils/requestLogging.js'
import { createStreamState, processStreamChunk, getStreamStats } from '../../utils/streamingOptimizer.js'
type SecretValueSource = Partial<{
OPENAI_API_KEY: string
@@ -216,6 +218,14 @@ interface OpenAIMessage {
}>
tool_call_id?: string
name?: string
/**
* Per-assistant-message chain-of-thought, attached when echoing an
* assistant message back to providers that require it (notably Moonshot:
* "thinking is enabled but reasoning_content is missing in assistant
* tool call message at index N" 400). Derived from the Anthropic thinking
* block captured when the original response was translated.
*/
reasoning_content?: string
}
interface OpenAITool {
@@ -383,7 +393,9 @@ function convertMessages(
content?: unknown
}>,
system: unknown,
options?: { preserveReasoningContent?: boolean },
): OpenAIMessage[] {
const preserveReasoningContent = options?.preserveReasoningContent === true
const result: OpenAIMessage[] = []
const knownToolCallIds = new Set<string>()
@@ -486,6 +498,21 @@ function convertMessages(
})(),
}
// Providers that validate reasoning continuity (Moonshot: "thinking
// is enabled but reasoning_content is missing in assistant tool call
// message at index N" 400) need the original chain-of-thought echoed
// back on each assistant message that carries a tool_call. We kept
// the thinking block on the Anthropic side; re-attach it here as the
// `reasoning_content` field on the outgoing OpenAI-shaped message.
// Gated per-provider because other endpoints either ignore the field
// (harmless) or strict-reject unknown fields (harmful).
if (preserveReasoningContent) {
const thinkingText = (thinkingBlock as { thinking?: string } | undefined)?.thinking
if (typeof thinkingText === 'string' && thinkingText.trim().length > 0) {
assistantMsg.reasoning_content = thinkingText
}
}
if (toolUses.length > 0) {
const mappedToolCalls = toolUses
.map(
@@ -857,6 +884,7 @@ async function* openaiStreamToAnthropic(
let lastStopReason: 'tool_use' | 'max_tokens' | 'end_turn' | null = null
let hasEmittedFinalUsage = false
let hasProcessedFinishReason = false
const streamState = createStreamState()
// Emit message_start
yield {
@@ -1020,6 +1048,7 @@ async function* openaiStreamToAnthropic(
delta: { type: 'text_delta', text: visible },
}
}
processStreamChunk(streamState, delta.content)
}
// Tool calls
@@ -1039,6 +1068,7 @@ async function* openaiStreamToAnthropic(
const toolBlockIndex = contentBlockIndex
const initialArguments = tc.function.arguments ?? ''
const normalizeAtStop = hasToolFieldMapping(tc.function.name)
processStreamChunk(streamState, tc.function.arguments ?? '')
activeToolCalls.set(tc.index, {
id: tc.id,
name: tc.function.name,
@@ -1236,6 +1266,20 @@ async function* openaiStreamToAnthropic(
reader.releaseLock()
}
const stats = getStreamStats(streamState)
if (stats.totalChunks > 0) {
logForDebugging(
JSON.stringify({
type: 'stream_stats',
model,
total_chunks: stats.totalChunks,
first_token_ms: stats.firstTokenMs,
duration_ms: stats.durationMs,
}),
{ level: 'debug' },
)
}
yield { type: 'message_stop' }
}
@@ -1441,7 +1485,12 @@ class OpenAIShimMessages {
}>,
request.resolvedModel,
)
const openaiMessages = convertMessages(compressedMessages, params.system)
const openaiMessages = convertMessages(compressedMessages, params.system, {
// Moonshot requires every assistant tool-call message to carry
// reasoning_content when its thinking feature is active. Echo it back
// from the thinking block we captured on the inbound response.
preserveReasoningContent: isMoonshotBaseUrl(request.baseUrl),
})
const body: Record<string, unknown> = {
model: request.resolvedModel,
@@ -1715,6 +1764,12 @@ class OpenAIShimMessages {
}
let response: Response | undefined
const provider = request.baseUrl.includes('nvidia') ? 'nvidia-nim'
: request.baseUrl.includes('minimax') ? 'minimax'
: request.baseUrl.includes('localhost:11434') || request.baseUrl.includes('localhost:11435') ? 'ollama'
: request.baseUrl.includes('anthropic') ? 'anthropic'
: 'openai'
const { correlationId, startTime } = logApiCallStart(provider, request.resolvedModel)
for (let attempt = 0; attempt < maxAttempts; attempt++) {
try {
response = await fetchWithProxyRetry(
@@ -1752,6 +1807,20 @@ class OpenAIShimMessages {
}
if (response.ok) {
let tokensIn = 0
let tokensOut = 0
// Skip clone() for streaming responses - it blocks until full body is received,
// defeating the purpose of streaming. Usage data is already sent via
// stream_options: { include_usage: true } and can be extracted from the stream.
if (!params.stream) {
try {
const clone = response.clone()
const data = await clone.json()
tokensIn = data.usage?.prompt_tokens ?? 0
tokensOut = data.usage?.completion_tokens ?? 0
} catch { /* ignore */ }
}
logApiCallEnd(correlationId, startTime, request.resolvedModel, 'success', tokensIn, tokensOut, false)
return response
}

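The reasoning_content round-trip the tests above exercise can be sketched in isolation: a Moonshot-bound assistant tool-call message re-attaches the captured thinking text, and every other provider leaves the field off. The shapes and the inline host check below are simplified stand-ins (the real code uses `isMoonshotBaseUrl` and the full Anthropic block types), not the actual implementation.

```typescript
// Sketch of the per-provider gating for echoed chain-of-thought.
function echoAssistant(
  blocks: Array<{ type: string; thinking?: string }>,
  baseUrl: string,
): { role: 'assistant'; reasoning_content?: string } {
  // Stand-in for isMoonshotBaseUrl: matches both .ai and .cn hosts.
  const isMoonshot = /moonshot\.(ai|cn)/.test(baseUrl)
  const thinkingText = blocks.find(b => b.type === 'thinking')?.thinking
  const msg: { role: 'assistant'; reasoning_content?: string } = { role: 'assistant' }
  if (isMoonshot && typeof thinkingText === 'string' && thinkingText.trim().length > 0) {
    // Echo the captured thinking block back as reasoning_content so Moonshot's
    // continuity validation accepts the assistant tool-call message.
    msg.reasoning_content = thinkingText
  }
  return msg
}
```

Strict servers that reject unknown fields never see `reasoning_content`, which is why the gating is per-provider rather than unconditional.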
View File

@@ -223,6 +223,49 @@ export function bytesPerTokenForFileType(fileExtension: string): number {
}
}
/**
* Tokenizer ratio by model family.
* Different models have different encodings.
*/
export interface ModelTokenizerConfig {
modelFamily: string
bytesPerToken: number
supportsJson: boolean
supportsCode: boolean
}
export const MODEL_TOKENIZER_CONFIGS: ModelTokenizerConfig[] = [
{ modelFamily: 'claude', bytesPerToken: 3.5, supportsJson: true, supportsCode: true },
{ modelFamily: 'gpt-4', bytesPerToken: 4, supportsJson: true, supportsCode: true },
{ modelFamily: 'gpt-3.5', bytesPerToken: 4, supportsJson: true, supportsCode: true },
{ modelFamily: 'gemini', bytesPerToken: 3.5, supportsJson: true, supportsCode: true },
{ modelFamily: 'llama', bytesPerToken: 3.8, supportsJson: true, supportsCode: true },
{ modelFamily: 'deepseek', bytesPerToken: 3.5, supportsJson: true, supportsCode: true },
{ modelFamily: 'minimax', bytesPerToken: 3.2, supportsJson: true, supportsCode: true },
]
/**
* Get tokenizer config for a model.
*/
export function getTokenizerConfig(model: string): ModelTokenizerConfig {
const lower = model.toLowerCase()
for (const config of MODEL_TOKENIZER_CONFIGS) {
if (lower.includes(config.modelFamily)) {
return config
}
}
return { modelFamily: 'unknown', bytesPerToken: 4, supportsJson: true, supportsCode: true }
}
/**
* Get bytes-per-token ratio for a model.
*/
export function getBytesPerTokenForModel(model: string): number {
return getTokenizerConfig(model).bytesPerToken
}
/**
* Like {@link roughTokenCountEstimation} but uses a more accurate
* bytes-per-token ratio when the file type is known.
@@ -241,6 +284,106 @@ export function roughTokenCountEstimationForFileType(
)
}
/**
* Content type classification for compression ratio.
*/
export type ContentType =
| 'json' | 'code' | 'prose' | 'technical'
| 'list' | 'table' | 'mixed'
/**
* Compression ratio by content type.
* Measured empirically - denser content = lower ratio.
*/
export const COMPRESSION_RATIOS: Record<ContentType, { min: number; max: number; typical: number }> = {
json: { min: 1.5, max: 2.5, typical: 2 },
code: { min: 3, max: 4.5, typical: 3.5 },
prose: { min: 3.5, max: 4.5, typical: 4 },
technical: { min: 2.5, max: 3.5, typical: 3 },
list: { min: 2, max: 3, typical: 2.5 },
table: { min: 1.8, max: 2.8, typical: 2.2 },
mixed: { min: 3, max: 4, typical: 3.5 },
}
/**
* Detect content type from content.
*/
export function detectContentType(content: string): ContentType {
const trimmed = content.trim()
// JSON
if ((trimmed.startsWith('{') && trimmed.endsWith('}')) ||
(trimmed.startsWith('[') && trimmed.endsWith(']'))) {
try {
JSON.parse(trimmed)
return 'json'
} catch { /* not valid json */ }
}
// Table (tabs or consistent delimiters)
const lines = trimmed.split('\n')
if (lines.length > 2) {
const hasTabs = lines[0].includes('\t')
const hasCommas = lines[0].includes(',')
if (hasTabs || hasCommas) {
const consistent = lines.slice(1).every(l => l.includes('\t') || l.includes(','))
if (consistent) return 'table'
}
}
// List
if (/^[\d\-\*\•]/.test(trimmed) || /^[\d\-\*\•]/.test(lines[0])) {
return 'list'
}
// Code (high density of special chars)
const codeChars = (content.match(/[{}()\[\];=]/g) || []).length
const codeRatio = codeChars / content.length
if (codeRatio > 0.05) return 'code'
// Technical (has numbers and units)
if (/\d+\s*(px|em|rem|%|ms|s|kb|mb|gb)/i.test(content)) {
return 'technical'
}
// Prose (default - natural language)
return 'prose'
}
/**
* Get compression ratio for content.
*/
export function getCompressionRatio(content: string, type?: ContentType): { ratio: number; min: number; max: number } {
const detectedType = type ?? detectContentType(content)
const { min, max, typical } = COMPRESSION_RATIOS[detectedType]
// Adjust based on actual content length
// Shorter content = higher variance
const lengthBonus = content.length < 100 ? 0.5 : 0
return {
ratio: typical,
min: min + lengthBonus,
max: max + lengthBonus,
}
}
/**
* Estimate tokens with confidence bounds.
*/
export function estimateWithBounds(
content: string,
type?: ContentType,
): { estimate: number; min: number; max: number } {
const { ratio, min: minRatio, max: maxRatio } = getCompressionRatio(content, type)
const estimate = roughTokenCountEstimation(content, ratio)
const min = roughTokenCountEstimation(content, maxRatio)
const max = roughTokenCountEstimation(content, minRatio)
return { estimate, min, max }
}
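The bounds arithmetic above inverts the ratio range: the *max* ratio yields the *min* token count and vice versa. A worked sketch, assuming `roughTokenCountEstimation` divides the UTF-8 byte length by the supplied ratio (its internals are not shown in this diff, so that formula is an assumption):

```typescript
// Assumed stand-in for roughTokenCountEstimation: bytes divided by ratio.
function roughEstimate(content: string, ratio: number): number {
  return Math.ceil(new TextEncoder().encode(content).length / ratio)
}

// Bounds for prose, using the COMPRESSION_RATIOS entries from the diff:
// min 3.5, max 4.5, typical 4.
function proseBounds(content: string): { estimate: number; min: number; max: number } {
  return {
    estimate: roughEstimate(content, 4),
    min: roughEstimate(content, 4.5), // higher ratio means fewer tokens
    max: roughEstimate(content, 3.5), // lower ratio means more tokens
  }
}
```

The swap is why `estimateWithBounds` passes `maxRatio` when computing `min` and `minRatio` when computing `max`, which the `min <= estimate <= max` test asserts.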
/**
* Estimates token count for a Message object by extracting and analyzing its text content.
* This provides a more reliable estimate than getTokenUsage for messages that may have been compacted.

View File

@@ -0,0 +1,100 @@
import { describe, expect, it } from 'bun:test'
import {
getTokenizerConfig,
getBytesPerTokenForModel,
detectContentType,
getCompressionRatio,
estimateWithBounds,
} from './tokenEstimation.js'
describe('Model Tokenizers', () => {
describe('getTokenizerConfig', () => {
it('returns config for claude models', () => {
const config = getTokenizerConfig('claude-sonnet-4-5-20250514')
expect(config.modelFamily).toBe('claude')
expect(config.bytesPerToken).toBe(3.5)
})
it('returns config for gpt models', () => {
const config = getTokenizerConfig('gpt-4')
expect(config.modelFamily).toBe('gpt-4')
expect(config.bytesPerToken).toBe(4)
})
it('returns default for unknown models', () => {
const config = getTokenizerConfig('unknown-model')
expect(config.modelFamily).toBe('unknown')
expect(config.bytesPerToken).toBe(4)
})
})
describe('getBytesPerTokenForModel', () => {
it('returns bytes per token for model', () => {
expect(getBytesPerTokenForModel('claude-opus-3-5-20250214')).toBe(3.5)
expect(getBytesPerTokenForModel('gpt-4o')).toBe(4)
expect(getBytesPerTokenForModel('deepseek-chat')).toBe(3.5)
expect(getBytesPerTokenForModel('minimax-M2.7')).toBe(3.2)
})
})
})
describe('Content Type Detection', () => {
describe('detectContentType', () => {
it('detects JSON', () => {
expect(detectContentType('{"key": "value"}')).toBe('json')
expect(detectContentType('[1, 2, 3]')).toBe('json')
})
it('detects code', () => {
expect(detectContentType('function test() { return 1 + 2; }')).toBe('code')
expect(detectContentType('const x = () => {}')).toBe('code')
})
it('detects prose', () => {
expect(detectContentType('This is a natural language response.')).toBe('prose')
expect(detectContentType('Hello world how are you?')).toBe('prose')
})
it('detects code-like technical', () => {
// Has both code chars and technical units; the code-char check runs first and wins
expect(detectContentType('margin: 10px; padding: 5px;')).toBe('code')
})
it('detects list', () => {
expect(detectContentType('- item 1\n- item 2')).toBe('list')
expect(detectContentType('1. first\n2. second')).toBe('list')
})
it('detects prose by default', () => {
// Single column with newlines = prose
expect(detectContentType('a b c\n1 2 3')).toBe('prose')
})
})
})
describe('Compression Ratio', () => {
describe('getCompressionRatio', () => {
it('returns appropriate ratios', () => {
expect(getCompressionRatio('{"a":1}').ratio).toBe(2)
expect(getCompressionRatio('code here {} []').ratio).toBe(3.5)
expect(getCompressionRatio('Hello world').ratio).toBe(4)
})
})
describe('estimateWithBounds', () => {
it('returns estimate with bounds', () => {
const result = estimateWithBounds('Hello world')
expect(result.min).toBeLessThanOrEqual(result.estimate)
expect(result.max).toBeGreaterThanOrEqual(result.estimate)
expect(result.min).toBeLessThan(result.max)
})
it('handles JSON with tighter bounds', () => {
const result = estimateWithBounds('{"key": "value"}')
// JSON has smaller ratio range
expect(result.max).toBeLessThan(10)
})
})
})

View File

@@ -1241,6 +1241,7 @@ async function checkPermissionsAndCallTool(
{
...toolUseContext,
toolUseId: toolUseID,
hookChainsCanUseTool: canUseTool,
userModified: permissionDecision.userModified ?? false,
},
canUseTool,
@@ -1729,19 +1730,29 @@ async function checkPermissionsAndCallTool(
const hookMessages: MessageUpdateLazy<
AttachmentMessage | ProgressMessage<HookProgress>
>[] = []
for await (const hookResult of runPostToolUseFailureHooks(
toolUseContext,
tool,
toolUseID,
messageId,
processedInput,
content,
isInterrupt,
requestId,
mcpServerType,
mcpServerBaseUrl,
)) {
hookMessages.push(hookResult)
const hookChainsContext = toolUseContext as ToolUseContext & {
hookChainsCanUseTool?: CanUseToolFn
}
hookChainsContext.hookChainsCanUseTool = canUseTool
try {
for await (const hookResult of runPostToolUseFailureHooks(
toolUseContext,
tool,
toolUseID,
messageId,
processedInput,
content,
isInterrupt,
requestId,
mcpServerType,
mcpServerBaseUrl,
)) {
hookMessages.push(hookResult)
}
} finally {
if (hookChainsContext.hookChainsCanUseTool === canUseTool) {
delete hookChainsContext.hookChainsCanUseTool
}
}
return [

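The change above threads `canUseTool` through the shared context for the duration of the hook loop, then removes it in a `finally` block only if its own value is still installed. That stash-and-clean-up pattern can be sketched generically; the context and capability names below are illustrative, not the real `ToolUseContext` types.

```typescript
// Generic sketch of stashing a capability on a shared context for the
// duration of an async body, with guarded cleanup.
type Ctx = { hookChainsCanUseTool?: (tool: string) => boolean }

async function withStashedCapability<T>(
  ctx: Ctx,
  capability: (tool: string) => boolean,
  body: () => Promise<T>,
): Promise<T> {
  ctx.hookChainsCanUseTool = capability
  try {
    return await body()
  } finally {
    // Only remove our own value; a nested caller may have installed another
    // capability in the meantime, and deleting that would be wrong.
    if (ctx.hookChainsCanUseTool === capability) {
      delete ctx.hookChainsCanUseTool
    }
  }
}
```

The identity check before `delete` mirrors the `if (hookChainsContext.hookChainsCanUseTool === canUseTool)` guard in the diff: cleanup is conditional so that re-entrant or nested use cannot clobber a newer stash.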
View File

@@ -284,6 +284,7 @@ export async function* runPostToolUseFailureHooks<Input extends AnyObject>(
isInterrupt,
permissionMode,
toolUseContext.abortController.signal,
undefined,
)) {
try {
// Check if we were aborted during hook execution

View File

@@ -733,6 +733,9 @@ export const CYBER_RISK_MITIGATION_REMINDER =
const MITIGATION_EXEMPT_MODELS = new Set(['claude-opus-4-6'])
function shouldIncludeFileReadMitigation(): boolean {
if (isEnvTruthy(process.env.OPENCLAUDE_DISABLE_TOOL_REMINDERS)) {
return false
}
const shortName = getCanonicalName(getMainLoopModel())
return !MITIGATION_EXEMPT_MODELS.has(shortName)
}

View File

@@ -0,0 +1,87 @@
import { afterEach, beforeEach, expect, mock, test } from 'bun:test'
// Mock the Anthropic API module before importing the module under test, so
// queryHaiku resolves to whatever the individual test wants (slow, failing,
// or successful). We preserve every other export from claude.js so unrelated
// transitive imports still work.
const haikuMock = mock()
beforeEach(async () => {
haikuMock.mockReset()
const actual = await import('../../services/api/claude.js')
mock.module('../../services/api/claude.js', () => ({
...actual,
queryHaiku: haikuMock,
}))
})
afterEach(() => {
mock.restore()
})
async function runApply(markdown = 'Hello world.', signal?: AbortSignal): Promise<string> {
const nonce = `${Date.now()}-${Math.random()}`
const { applyPromptToMarkdown } =
await import(`./utils.js?ts=${nonce}`)
const ctrl = new AbortController()
return applyPromptToMarkdown(
'summarize',
markdown,
signal ?? ctrl.signal,
false,
false,
)
}
test('returns raw truncated markdown when queryHaiku throws', async () => {
haikuMock.mockImplementation(async () => {
throw new Error('MiniMax rejected the model name')
})
const output = await runApply('Gitlawb homepage content.')
expect(output).toContain('[Secondary-model summarization unavailable')
expect(output).toContain('Gitlawb homepage content.')
})
test('returns raw truncated markdown when queryHaiku simulates a timeout', async () => {
// Simulating raceWithTimeout's rejection path directly — we can't actually
// wait 45s in a test. The error shape matches what raceWithTimeout produces.
haikuMock.mockImplementation(async () => {
const err = new Error('Secondary-model summarization timed out after 45000ms')
;(err as NodeJS.ErrnoException).code = 'SECONDARY_MODEL_TIMEOUT'
throw err
})
const output = await runApply('Slow provider content.')
expect(output).toContain('[Secondary-model summarization unavailable')
expect(output).toContain('Slow provider content.')
})
test('returns the model response when queryHaiku succeeds', async () => {
haikuMock.mockImplementation(async () => ({
message: {
content: [{ type: 'text', text: 'This page is about GitLawb, an AI legal platform.' }],
},
}))
const output = await runApply('some page content')
expect(output).toBe('This page is about GitLawb, an AI legal platform.')
})
test('returns fallback when queryHaiku resolves with empty content', async () => {
haikuMock.mockImplementation(async () => ({ message: { content: [] } }))
const output = await runApply('some page content')
expect(output).toContain('[Secondary-model summarization unavailable')
expect(output).toContain('some page content')
})
test('propagates AbortError from the caller signal', async () => {
const ctrl = new AbortController()
haikuMock.mockImplementation(async () => {
ctrl.abort()
return new Promise(() => {})
})
await expect(runApply('content', ctrl.signal)).rejects.toThrow()
})

View File

@@ -20,8 +20,11 @@ afterEach(() => {
describe('checkDomainBlocklist', () => {
test('returns allowed without API call in OpenAI mode', async () => {
process.env.CLAUDE_CODE_USE_OPENAI = '1'
const actual = await import('../../utils/model/providers.js')
mock.module('../../utils/model/providers.js', () => ({
...actual,
getAPIProvider: () => 'openai',
isFirstPartyAnthropicBaseUrl: () => false,
}))
const getSpy = mock(() =>
Promise.resolve({ status: 200, data: { can_fetch: true } }),
@@ -37,8 +40,11 @@ describe('checkDomainBlocklist', () => {
test('returns allowed without API call in Gemini mode', async () => {
process.env.CLAUDE_CODE_USE_GEMINI = '1'
const actual = await import('../../utils/model/providers.js')
mock.module('../../utils/model/providers.js', () => ({
...actual,
getAPIProvider: () => 'gemini',
isFirstPartyAnthropicBaseUrl: () => false,
}))
const getSpy = mock(() =>
Promise.resolve({ status: 200, data: { can_fetch: true } }),
@@ -57,8 +63,11 @@ describe('checkDomainBlocklist', () => {
delete process.env.CLAUDE_CODE_USE_GEMINI
delete process.env.CLAUDE_CODE_USE_GITHUB
const actual = await import('../../utils/model/providers.js')
mock.module('../../utils/model/providers.js', () => ({
...actual,
getAPIProvider: () => 'firstParty',
isFirstPartyAnthropicBaseUrl: () => true,
}))
const getSpy = mock(() =>
Promise.resolve({ status: 200, data: { can_fetch: true } }),

View File

@@ -275,20 +275,76 @@ export async function getWithPermittedRedirects(
if (depth > MAX_REDIRECTS) {
throw new Error(`Too many redirects (exceeded ${MAX_REDIRECTS})`)
}
const axiosConfig = {
signal,
timeout: FETCH_TIMEOUT_MS,
maxRedirects: 0,
responseType: 'arraybuffer' as const,
maxContentLength: MAX_HTTP_CONTENT_LENGTH,
lookup: ssrfGuardedLookup,
headers: {
Accept: 'text/markdown, text/html, */*',
'User-Agent': getWebFetchUserAgent(),
},
}
try {
return await axios.get(url, axiosConfig)
} catch (error) {
// Try native fetch as a fallback for timeout / network errors
// (Bun/Node bundled contexts occasionally hang with axios + custom lookup.)
const isTimeoutLike =
axios.isAxiosError(error) &&
(!error.response &&
(error.code === 'ECONNABORTED' ||
error.code === 'ETIMEDOUT' ||
error.message?.toLowerCase().includes('timeout')))
if (isTimeoutLike && !signal.aborted) {
try {
const fetchResponse = await fetch(url, {
signal,
redirect: 'manual',
headers: axiosConfig.headers,
})
// Handle redirects manually
if ([301, 302, 307, 308].includes(fetchResponse.status)) {
const redirectLocation = fetchResponse.headers.get('location')
if (!redirectLocation) {
throw new Error('Redirect missing Location header')
}
const redirectUrl = new URL(redirectLocation, url).toString()
if (redirectChecker(url, redirectUrl)) {
return getWithPermittedRedirects(
redirectUrl,
signal,
redirectChecker,
depth + 1,
)
} else {
return {
type: 'redirect' as const,
originalUrl: url,
redirectUrl,
statusCode: fetchResponse.status,
}
}
}
const arrayBuffer = await fetchResponse.arrayBuffer()
// Build an AxiosResponse-like shape so downstream code stays happy
return {
data: new Uint8Array(arrayBuffer),
status: fetchResponse.status,
statusText: fetchResponse.statusText,
headers: Object.fromEntries(fetchResponse.headers.entries()),
config: axiosConfig,
request: undefined,
} as unknown as AxiosResponse<ArrayBuffer>
} catch {
// Fall through to original error handling
}
}
if (
axios.isAxiosError(error) &&
error.response &&
@@ -489,6 +545,58 @@ export async function getURLMarkdownContent(
return entry
}
// Budget for the secondary-model summarization after fetch. If the small-
// fast model is slow (e.g. a 200k-context third-party running a reasoning
// pass over ~100KB of markdown), we'd rather fall back to raw truncated
// markdown than hang the tool. Also keeps the worst-case WebFetch bounded
// to FETCH_TIMEOUT_MS + SECONDARY_MODEL_TIMEOUT_MS regardless of provider.
const SECONDARY_MODEL_TIMEOUT_MS = 45_000
function raceWithTimeout<T>(
promise: Promise<T>,
timeoutMs: number,
signal: AbortSignal,
): Promise<T> {
return new Promise<T>((resolve, reject) => {
const timer = setTimeout(() => {
const err = new Error(`Secondary-model summarization timed out after ${timeoutMs}ms`)
;(err as NodeJS.ErrnoException).code = 'SECONDARY_MODEL_TIMEOUT'
reject(err)
}, timeoutMs)
const onAbort = () => {
clearTimeout(timer)
reject(new AbortError())
}
if (signal.aborted) {
clearTimeout(timer)
reject(new AbortError())
return
}
signal.addEventListener('abort', onAbort, { once: true })
promise.then(
value => {
clearTimeout(timer)
signal.removeEventListener('abort', onAbort)
resolve(value)
},
err => {
clearTimeout(timer)
signal.removeEventListener('abort', onAbort)
reject(err)
},
)
})
}
function buildFallbackMarkdownSummary(truncatedContent: string): string {
return [
'[Secondary-model summarization unavailable — returning raw fetched content.',
'This typically means the configured small-fast model took too long or errored.]',
'',
truncatedContent,
].join('\n')
}
export async function applyPromptToMarkdown(
prompt: string,
markdownContent: string,
@@ -508,18 +616,35 @@ export async function applyPromptToMarkdown(
prompt,
isPreapprovedDomain,
)
let assistantMessage
try {
assistantMessage = await raceWithTimeout(
queryHaiku({
systemPrompt: asSystemPrompt([]),
userPrompt: modelPrompt,
signal,
options: {
querySource: 'web_fetch_apply',
agents: [],
isNonInteractiveSession,
hasAppendSystemPrompt: false,
mcpTools: [],
},
}),
SECONDARY_MODEL_TIMEOUT_MS,
signal,
)
} catch (err) {
// User interrupts and SIGINTs still propagate. Everything else (timeout,
// provider-side error, unsupported model on third-party endpoint) falls
// back to raw markdown so the user still gets usable content rather than
// a hang. Log so it's visible in debug traces.
if (err instanceof AbortError || (err as Error)?.name === 'AbortError') {
throw err
}
logError(err)
return buildFallbackMarkdownSummary(truncatedContent)
}
// We need to bubble this up, so that the tool call throws, causing us to return
// an is_error tool_use block to the server, and render a red dot in the UI.
@@ -534,5 +659,5 @@ export async function applyPromptToMarkdown(
return contentBlock.text
}
}
return buildFallbackMarkdownSummary(truncatedContent)
}
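The hunk above races the secondary-model call against a fixed budget and degrades to raw content on any non-abort failure. A minimal sketch of that race-then-fallback pattern, using hypothetical helper names rather than the project's actual ones:

```typescript
// Race a promise against a budget; reject with a timeout error if it loses.
function raceWithBudget<T>(promise: Promise<T>, budgetMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${budgetMs}ms`)),
      budgetMs,
    )
    promise.then(
      value => {
        clearTimeout(timer)
        resolve(value)
      },
      err => {
        clearTimeout(timer)
        reject(err)
      },
    )
  })
}

// On timeout or provider error, return the raw content instead of hanging.
async function summarizeOrFallback(
  summarize: () => Promise<string>,
  rawContent: string,
  budgetMs: number,
): Promise<string> {
  try {
    return await raceWithBudget(summarize(), budgetMs)
  } catch {
    return `[summarization unavailable]\n${rawContent}`
  }
}
```

The real implementation additionally wires an AbortSignal into the race so user interrupts still propagate; that plumbing is omitted here for brevity.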

View File

@@ -203,6 +203,61 @@ function buildCodexWebSearchInstructions(): string {
].join(' ')
}
function pushCodexTextResult(
results: (SearchResult | string)[],
value: unknown,
): void {
if (typeof value !== 'string') return
const trimmed = value.trim()
if (trimmed) {
results.push(trimmed)
}
}
function addCodexSource(
sourceMap: Map<string, { title: string; url: string }>,
source: unknown,
): void {
if (typeof source?.url !== 'string' || !source.url) return
sourceMap.set(source.url, {
title:
typeof source.title === 'string' && source.title
? source.title
: source.url,
url: source.url,
})
}
function getCodexSources(item: Record<string, any>): unknown[] {
if (Array.isArray(item.action?.sources)) {
return item.action.sources
}
if (Array.isArray(item.sources)) {
return item.sources
}
if (Array.isArray(item.result?.sources)) {
return item.result.sources
}
return []
}
function extractCodexWebSearchFailure(item: Record<string, any>): string | undefined {
// Codex web_search_call items can carry a status field. When the tool
// call fails (rate limit, upstream error, model-side guardrail), the
// parser should surface a meaningful error rather than the generic
// "No results found." fallback. Shape observed across recent payloads:
// { type: 'web_search_call', status: 'failed', error: { message?: string } }
// { type: 'web_search_call', status: 'failed', action: { error?: { message?: string } } }
if (item?.status !== 'failed') return undefined
const reason =
(typeof item.error?.message === 'string' && item.error.message) ||
(typeof item.action?.error?.message === 'string' &&
item.action.error.message) ||
(typeof item.error === 'string' && item.error) ||
undefined
return reason ? `Web search failed: ${reason}` : 'Web search failed.'
}
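The helper above normalizes three observed failure shapes into one message. A self-contained sketch (re-implemented for illustration; `extractFailure` is a hypothetical stand-in, not the exported name):

```typescript
type CodexItem = Record<string, any>

// Surface a meaningful error for failed web_search_call items; the two
// observed shapes nest the message differently, and a bare string `error`
// also occurs in the wild.
function extractFailure(item: CodexItem): string | undefined {
  if (item?.status !== 'failed') return undefined
  const reason =
    (typeof item.error?.message === 'string' && item.error.message) ||
    (typeof item.action?.error?.message === 'string' &&
      item.action.error.message) ||
    (typeof item.error === 'string' && item.error) ||
    undefined
  return reason ? `Web search failed: ${reason}` : 'Web search failed.'
}

console.log(extractFailure({ status: 'failed', error: { message: 'rate limited' } }))
// → "Web search failed: rate limited"
console.log(extractFailure({ status: 'completed' })) // → undefined
```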
function makeOutputFromCodexWebSearchResponse(
response: Record<string, unknown>,
query: string,
@@ -214,18 +269,12 @@ function makeOutputFromCodexWebSearchResponse(
for (const item of output) {
if (item?.type === 'web_search_call') {
const failure = extractCodexWebSearchFailure(item)
if (failure) {
results.push(failure)
}
for (const source of getCodexSources(item)) {
addCodexSource(sourceMap, source)
}
continue
}
@@ -235,11 +284,12 @@ function makeOutputFromCodexWebSearchResponse(
}
for (const part of item.content) {
if (part?.type === 'output_text' || part?.type === 'text') {
pushCodexTextResult(results, part.text)
}
for (const source of getCodexSources(part)) {
addCodexSource(sourceMap, source)
}
const annotations = Array.isArray(part?.annotations)
@@ -247,23 +297,13 @@ function makeOutputFromCodexWebSearchResponse(
: []
for (const annotation of annotations) {
if (annotation?.type !== 'url_citation') continue
addCodexSource(sourceMap, annotation)
}
}
}
if (results.length === 0) {
pushCodexTextResult(results, response.output_text)
}
if (sourceMap.size > 0) {
@@ -273,6 +313,10 @@ function makeOutputFromCodexWebSearchResponse(
})
}
if (results.length === 0) {
results.push('No results found.')
}
return {
query,
results,
@@ -280,6 +324,10 @@ function makeOutputFromCodexWebSearchResponse(
}
}
export const __test = {
makeOutputFromCodexWebSearchResponse,
}
async function runCodexWebSearch(
input: Input,
signal: AbortSignal,
@@ -457,6 +505,19 @@ function shouldUseAdapterProvider(): boolean {
return getAvailableProviders().length > 0
}
/**
* Returns true when the current provider has a working native or Codex
* web-search fallback after an adapter failure. OpenAI shim providers
* (moonshot, minimax, nvidia-nim, openai, github, etc.) do NOT support
* Anthropic's web_search_20250305 tool, so falling through to the native
* path silently produces "Did 0 searches".
*/
function hasNativeSearchFallback(): boolean {
if (isCodexResponsesWebSearchEnabled()) return true
const provider = getAPIProvider()
return provider === 'firstParty' || provider === 'vertex' || provider === 'foundry'
}
// ---------------------------------------------------------------------------
// Tool export
// ---------------------------------------------------------------------------
@@ -609,6 +670,17 @@ export const WebSearchTool = buildTool({
// Auto mode: only fall through on transient errors (network, timeout, 5xx).
// Config / guardrail errors (SSRF, HTTPS, bad URL, etc.) must surface.
if (!isTransientError(err)) throw err
// No viable fallback for this provider — surface the adapter error
// instead of falling through to a broken native path.
if (!hasNativeSearchFallback()) {
const provider = getAPIProvider()
const errMsg = err instanceof Error ? err.message : String(err)
throw new Error(
`Web search is unavailable for provider "${provider}". ` +
`The search adapter failed (${errMsg}). ` +
`Try switching to a provider with built-in web search (e.g. Anthropic, Codex) or try again later.`,
)
}
console.error(
`[web-search] Adapter failed, falling through to native: ${err}`,
)

View File

@@ -1,6 +1,44 @@
import type { SearchInput, SearchProvider } from './types.js'
import { applyDomainFilters, type ProviderOutput } from './types.js'
// DuckDuckGo's HTML scraper aggressively blocks datacenter / repeat IPs with
// an "anomaly in the request" response. When that happens we surface an
// actionable error instead of the opaque scraper message so users know how
// to configure a working backend.
const DDG_ANOMALY_HINT =
'DuckDuckGo scraping is rate-limited from this network. ' +
'Configure a search backend with one of: ' +
'FIRECRAWL_API_KEY, TAVILY_API_KEY, EXA_API_KEY, YOU_API_KEY, ' +
'JINA_API_KEY, BING_API_KEY, MOJEEK_API_KEY, LINKUP_API_KEY — ' +
'or use an Anthropic / Vertex / Foundry provider for native web search.'
const MAX_RETRIES = 3
const INITIAL_BACKOFF_MS = 1000
function isAnomalyError(message: string): boolean {
return /anomaly in the request|likely making requests too quickly/i.test(
message,
)
}
function isRetryableDDGError(err: unknown): boolean {
if (!(err instanceof Error)) return false
const msg = err.message.toLowerCase()
return (
msg.includes('anomaly') ||
msg.includes('too quickly') ||
msg.includes('rate limit') ||
msg.includes('timeout') ||
msg.includes('econnreset') ||
msg.includes('etimedout') ||
msg.includes('econnaborted')
)
}
function sleep(ms: number): Promise<void> {
return new Promise(r => setTimeout(r, ms))
}
export const duckduckgoProvider: SearchProvider = {
name: 'duckduckgo',
@@ -19,22 +57,44 @@ export const duckduckgoProvider: SearchProvider = {
throw new Error('duck-duck-scrape package not installed. Run: npm install duck-duck-scrape')
}
let lastErr: unknown
for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
if (signal?.aborted) throw new DOMException('Aborted', 'AbortError')
try {
// TODO: duck-duck-scrape doesn't accept AbortSignal — can't cancel in-flight searches
const response = await search(input.query, { safeSearch: SafeSearchType.STRICT })
const hits = applyDomainFilters(
response.results.map(r => ({
title: r.title || r.url,
url: r.url,
description: r.description ?? undefined,
})),
input,
)
return {
hits,
providerName: 'duckduckgo',
durationSeconds: (performance.now() - start) / 1000,
}
} catch (err) {
lastErr = err
const msg = err instanceof Error ? err.message : String(err)
if (isAnomalyError(msg)) {
throw new Error(DDG_ANOMALY_HINT)
}
if (!isRetryableDDGError(err) || attempt === MAX_RETRIES - 1) {
throw err
}
// Exponential backoff with jitter: 1s, 2s, 4s +/- 20%
const baseDelay = INITIAL_BACKOFF_MS * Math.pow(2, attempt)
const jitter = baseDelay * 0.2 * (Math.random() * 2 - 1)
await sleep(baseDelay + jitter)
}
}
throw lastErr
},
}
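The backoff comment above ("1s, 2s, 4s +/- 20%") reduces to a small pure function. A sketch with an injectable random source (hypothetical names) so the schedule can be checked deterministically:

```typescript
const INITIAL_BACKOFF_MS = 1000

// Delay before retry `attempt` (0-based): exponential base with +/-20% jitter.
// `rand` is injectable so the schedule can be tested without randomness.
function backoffDelayMs(
  attempt: number,
  rand: () => number = Math.random,
): number {
  const base = INITIAL_BACKOFF_MS * Math.pow(2, attempt)
  const jitter = base * 0.2 * (rand() * 2 - 1) // in [-0.2 * base, +0.2 * base]
  return base + jitter
}
```

With the midpoint random value (`rand() === 0.5`) the schedule is exactly 1000, 2000, 4000 ms; the extremes stay within 20% of each base delay.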

View File

@@ -1,6 +1,6 @@
import { execFileSync, spawn } from 'child_process'
import { constants as fsConstants, readFileSync, unlinkSync } from 'fs'
import { type FileHandle, mkdir, open, stat } from 'fs/promises'
import memoize from 'lodash-es/memoize.js'
import { isAbsolute, resolve } from 'path'
import { join as posixJoin } from 'path/posix'
@@ -217,22 +217,34 @@ export async function exec(
let cwd = pwd()
// Recover if the current working directory no longer exists on disk,
// or was replaced by a non-directory (e.g., the path was renamed and a file
// was created in its place). realpath() succeeds on any existing path
// regardless of type, so we must also verify it's a directory — otherwise
// spawn would fail later with ENOTDIR / exit 126.
let cwdIsValidDir = false
try {
cwdIsValidDir = (await stat(cwd)).isDirectory()
} catch {
cwdIsValidDir = false
}
if (!cwdIsValidDir) {
const fallback = getOriginalCwd()
logForDebugging(
`Shell CWD "${cwd}" is not a valid directory, recovering to "${fallback}"`,
)
let fallbackIsValidDir = false
try {
fallbackIsValidDir = (await stat(fallback)).isDirectory()
} catch {
fallbackIsValidDir = false
}
if (fallbackIsValidDir) {
setCwdState(fallback)
cwd = fallback
} else {
return createFailedCommand(
`Working directory "${cwd}" is no longer a valid directory. Please restart Claude from an existing directory.`,
)
}
}
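The distinction this fix relies on can be shown in isolation: `realpath()` resolves any existing path, while `stat().isDirectory()` rejects a regular file sitting where a directory used to be. A sketch, with `isUsableCwd` as a hypothetical stand-in for the guard above:

```typescript
import { mkdtemp, realpath, stat, writeFile } from 'node:fs/promises'
import { tmpdir } from 'node:os'
import { join } from 'node:path'

// A CWD is usable only if it exists AND is a directory; stat() checks both.
async function isUsableCwd(path: string): Promise<boolean> {
  try {
    return (await stat(path)).isDirectory()
  } catch {
    return false
  }
}

// Reproduce the #844 shape: a regular file sitting where a directory used to be.
const dir = await mkdtemp(join(tmpdir(), 'cwd-guard-'))
const staleCwd = join(dir, 'was-a-dir')
await writeFile(staleCwd, '')

await realpath(staleCwd) // resolves without error, so the old check missed it
console.log(await isUsableCwd(staleCwd)) // false: stat().isDirectory() catches it
```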

View File

@@ -12,7 +12,12 @@ export const MODEL_CONTEXT_WINDOW_DEFAULT = 200_000
// Fallback context window for unknown 3P models. Must be large enough that
// the effective context (this minus output token reservation) stays positive,
// otherwise auto-compact fires on every message (issue #635).
// Override via CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW env var to avoid
// hardcoding when deploying models not yet in openaiContextWindows.ts.
export const OPENAI_FALLBACK_CONTEXT_WINDOW = (() => {
const v = parseInt(process.env.CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW ?? '', 10)
return !isNaN(v) && v > 0 ? v : 128_000
})()
// Maximum output tokens for compact operations
export const COMPACT_MAX_OUTPUT_TOKENS = 20_000
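The env override above follows a parse-then-validate pattern. A sketch as a pure function (hypothetical name) that makes the accept/reject rules explicit:

```typescript
// Returns the env-supplied context window when it parses to a positive
// number; any missing, non-numeric, zero, or negative value falls back
// to the built-in default.
function resolveFallbackContextWindow(
  raw: string | undefined,
  defaultWindow = 128_000,
): number {
  const parsed = parseInt(raw ?? '', 10)
  return !isNaN(parsed) && parsed > 0 ? parsed : defaultWindow
}
```

Note that `parseInt` tolerates trailing junk (`'200000tok'` parses as 200000) but rejects a leading non-digit, which matches the lenient intent of the override.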

View File

@@ -0,0 +1,357 @@
import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
import { mkdtemp, rm, writeFile } from 'node:fs/promises'
import { tmpdir } from 'node:os'
import { join } from 'node:path'
type HookChainsModule = typeof import('./hookChains.js')
type ImportHarnessOptions = {
allowRemoteSessions?: boolean
teamFile?:
| {
name: string
members: Array<{ name: string }>
}
| null
teamName?: string
senderName?: string
replBridgeHandle?: unknown
}
const tempDirs: string[] = []
const originalHookChainsEnabled = process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
async function createConfigFile(config: unknown): Promise<string> {
const dir = await mkdtemp(join(tmpdir(), 'openclaude-hook-chains-int-'))
tempDirs.push(dir)
const filePath = join(dir, 'hook-chains.json')
await writeFile(filePath, JSON.stringify(config, null, 2), 'utf-8')
return filePath
}
async function importHookChainsHarness(
options: ImportHarnessOptions = {},
): Promise<{
mod: HookChainsModule
writeToMailboxSpy: ReturnType<typeof mock>
agentToolCallSpy: ReturnType<typeof mock>
}> {
mock.restore()
const allowRemoteSessions = options.allowRemoteSessions ?? true
const teamName = options.teamName ?? 'mesh-team'
const senderName = options.senderName ?? 'mesh-lead'
const replBridgeHandle = options.replBridgeHandle ?? null
const writeToMailboxSpy = mock(async () => {})
const agentToolCallSpy = mock(async () => ({
data: {
status: 'async_launched',
agentId: 'agent-fallback-1',
},
}))
mock.module('../services/analytics/index.js', () => ({
logEvent: () => {},
}))
mock.module('./telemetry/events.js', () => ({
logOTelEvent: async () => {},
}))
mock.module('../services/policyLimits/index.js', () => ({
isPolicyAllowed: () => allowRemoteSessions,
}))
mock.module('./swarm/teamHelpers.js', () => ({
readTeamFileAsync: async () => options.teamFile ?? null,
}))
mock.module('./teammateMailbox.js', () => ({
writeToMailbox: writeToMailboxSpy,
}))
mock.module('./teammate.js', () => ({
getAgentName: () => senderName,
getTeamName: () => teamName,
getTeammateColor: () => 'blue',
// Keep parity with the real module's surface so later tests that
// run after this file (mock.module is process-global and mock.restore
// does not undo module mocks in Bun) do not see undefined members.
isTeammate: () => false,
isPlanModeRequired: () => false,
getAgentId: () => undefined,
getParentSessionId: () => undefined,
}))
mock.module('../bridge/replBridgeHandle.js', () => ({
getReplBridgeHandle: () => replBridgeHandle,
}))
// Integration mock target requested in the task: fallback action can route
// through this mocked tool launcher from runtime callback wiring.
mock.module('../tools/AgentTool/AgentTool.js', () => ({
AgentTool: {
call: agentToolCallSpy,
},
}))
const mod = await import(`./hookChains.js?integration=${Date.now()}-${Math.random()}`)
return { mod, writeToMailboxSpy, agentToolCallSpy }
}
beforeEach(() => {
process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS = '1'
})
afterEach(async () => {
mock.restore()
if (originalHookChainsEnabled === undefined) {
delete process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
} else {
process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS = originalHookChainsEnabled
}
await Promise.all(
tempDirs.splice(0).map(dir => rm(dir, { recursive: true, force: true })),
)
})
describe('hookChains integration dispatch', () => {
test('end-to-end rule evaluation + action dispatch on TaskCompleted failure', async () => {
const { mod } = await importHookChainsHarness({
teamName: 'mesh-team',
senderName: 'mesh-lead',
teamFile: {
name: 'mesh-team',
members: [{ name: 'mesh-lead' }, { name: 'worker-a' }, { name: 'worker-b' }],
},
})
const configPath = await createConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'task-failure-recovery',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [
{ type: 'spawn_fallback_agent' },
{ type: 'notify_team' },
],
},
],
})
const spawnSpy = mock(async () => ({ launched: true, agentId: 'agent-e2e-1' }))
const notifySpy = mock(async () => ({ sent: true, recipientCount: 2 }))
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: {
task_id: 'task-001',
task_subject: 'Patch flaky build',
error: 'CI timeout',
},
},
runtime: {
onSpawnFallbackAgent: spawnSpy,
onNotifyTeam: notifySpy,
},
})
expect(result.enabled).toBe(true)
expect(result.matchedRuleIds).toEqual(['task-failure-recovery'])
expect(result.actionResults).toHaveLength(2)
expect(result.actionResults[0]?.status).toBe('executed')
expect(result.actionResults[1]?.status).toBe('executed')
expect(spawnSpy).toHaveBeenCalledTimes(1)
expect(notifySpy).toHaveBeenCalledTimes(1)
})
test('fallback spawn injects failure context into generated prompt', async () => {
const { mod, agentToolCallSpy } = await importHookChainsHarness()
const configPath = await createConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'fallback-context',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [
{
type: 'spawn_fallback_agent',
description: 'Fallback for failed task',
},
],
},
],
})
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: {
task_id: 'task-ctx-1',
task_subject: 'Repair migration guard',
task_description: 'Fix regression in check ordering',
error: 'Task failed after retry budget exhausted',
},
},
runtime: {
onSpawnFallbackAgent: async request => {
const { AgentTool } = await import('../tools/AgentTool/AgentTool.js')
await (AgentTool.call as unknown as (...args: unknown[]) => Promise<unknown>)({
prompt: request.prompt,
description: request.description,
run_in_background: request.runInBackground,
subagent_type: request.agentType,
model: request.model,
})
return { launched: true, agentId: 'agent-fallback-ctx' }
},
},
})
expect(result.actionResults[0]?.status).toBe('executed')
expect(agentToolCallSpy).toHaveBeenCalledTimes(1)
const callInput = agentToolCallSpy.mock.calls[0]?.[0] as {
prompt: string
description: string
run_in_background: boolean
}
expect(callInput.description).toBe('Fallback for failed task')
expect(callInput.run_in_background).toBe(true)
expect(callInput.prompt).toContain('Event: TaskCompleted')
expect(callInput.prompt).toContain('Outcome: failed')
expect(callInput.prompt).toContain('Task subject: Repair migration guard')
expect(callInput.prompt).toContain('Failure details: Task failed after retry budget exhausted')
})
test('notify_team dispatches mailbox writes when team exists and skips when absent', async () => {
const withTeam = await importHookChainsHarness({
teamName: 'mesh-a',
senderName: 'lead-a',
teamFile: {
name: 'mesh-a',
members: [{ name: 'lead-a' }, { name: 'worker-1' }, { name: 'worker-2' }],
},
})
const configPathWithTeam = await createConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'notify-existing-team',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'notify_team' }],
},
],
})
const withTeamResult = await withTeam.mod.dispatchHookChainsForEvent({
configPathOverride: configPathWithTeam,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-team-ok', error: 'boom' },
},
})
expect(withTeamResult.actionResults[0]?.status).toBe('executed')
expect(withTeam.writeToMailboxSpy).toHaveBeenCalledTimes(2)
const recipients = withTeam.writeToMailboxSpy.mock.calls.map(
call => call[0] as string,
)
expect(recipients.sort()).toEqual(['worker-1', 'worker-2'])
const withoutTeam = await importHookChainsHarness({
teamName: 'mesh-missing',
senderName: 'lead-missing',
teamFile: null,
})
const configPathWithoutTeam = await createConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'notify-missing-team',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'notify_team' }],
},
],
})
const withoutTeamResult = await withoutTeam.mod.dispatchHookChainsForEvent({
configPathOverride: configPathWithoutTeam,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-team-missing', error: 'boom' },
},
})
expect(withoutTeamResult.actionResults[0]?.status).toBe('skipped')
expect(withoutTeamResult.actionResults[0]?.reason).toContain('Team file not found')
expect(withoutTeam.writeToMailboxSpy).not.toHaveBeenCalled()
})
test('warm_remote_capacity is a safe no-op when bridge is inactive', async () => {
const { mod } = await importHookChainsHarness({
allowRemoteSessions: true,
replBridgeHandle: null,
})
const configPath = await createConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'bridge-warmup-noop',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'warm_remote_capacity' }],
},
],
})
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-warm-1' },
},
})
expect(result.actionResults).toHaveLength(1)
expect(result.actionResults[0]?.status).toBe('skipped')
expect(result.actionResults[0]?.reason).toContain('Bridge is not active')
})
})

View File

@@ -0,0 +1,476 @@
import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
import { mkdtemp, rm, writeFile } from 'node:fs/promises'
import { tmpdir } from 'node:os'
import { join } from 'node:path'
type HookChainsModule = typeof import('./hookChains.js')
const tempDirs: string[] = []
const originalHookChainsEnabled = process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
async function makeConfigFile(config: unknown): Promise<string> {
const dir = await mkdtemp(join(tmpdir(), 'openclaude-hook-chains-'))
tempDirs.push(dir)
const filePath = join(dir, 'hook-chains.json')
await writeFile(filePath, JSON.stringify(config, null, 2), 'utf-8')
return filePath
}
async function importHookChainsModule(options?: {
allowRemoteSessions?: boolean
}): Promise<HookChainsModule> {
mock.restore()
const allowRemoteSessions = options?.allowRemoteSessions ?? true
mock.module('../services/analytics/index.js', () => ({
logEvent: () => {},
}))
mock.module('./telemetry/events.js', () => ({
logOTelEvent: async () => {},
}))
mock.module('../services/policyLimits/index.js', () => ({
isPolicyAllowed: () => allowRemoteSessions,
}))
return import(`./hookChains.js?test=${Date.now()}-${Math.random()}`)
}
beforeEach(() => {
process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS = '1'
})
afterEach(async () => {
mock.restore()
if (originalHookChainsEnabled === undefined) {
delete process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
} else {
process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS = originalHookChainsEnabled
}
await Promise.all(
tempDirs.splice(0).map(dir => rm(dir, { recursive: true, force: true })),
)
})
describe('hookChains schema validation', () => {
test('returns disabled config when env gate is unset', async () => {
delete process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
rules: [
{
id: 'env-gated-rule',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'spawn_fallback_agent' }],
},
],
})
const loaded = mod.loadHookChainsConfig({ pathOverride: configPath })
expect(loaded.exists).toBe(false)
expect(loaded.config.enabled).toBe(false)
expect(loaded.config.rules).toHaveLength(0)
})
test('loads valid config and memoizes by mtime/size', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 5000,
defaultDedupWindowMs: 5000,
rules: [
{
id: 'task-failure-fallback',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [
{
type: 'spawn_fallback_agent',
description: 'Fallback recovery agent',
},
],
},
],
})
const first = mod.loadHookChainsConfig({ pathOverride: configPath })
expect(first.exists).toBe(true)
expect(first.error).toBeUndefined()
expect(first.fromCache).toBe(false)
expect(first.config.enabled).toBe(true)
expect(first.config.rules).toHaveLength(1)
expect(first.config.rules[0]?.id).toBe('task-failure-fallback')
const second = mod.loadHookChainsConfig({ pathOverride: configPath })
expect(second.exists).toBe(true)
expect(second.error).toBeUndefined()
expect(second.fromCache).toBe(true)
expect(second.config.rules).toHaveLength(1)
})
test('accepts wrapped { hookChains: ... } config shape', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
hookChains: {
version: 1,
enabled: true,
rules: [
{
id: 'wrapped-shape',
trigger: { event: 'PostToolUseFailure', outcomes: ['failed'] },
actions: [{ type: 'notify_team' }],
},
],
},
})
const loaded = mod.loadHookChainsConfig({ pathOverride: configPath })
expect(loaded.error).toBeUndefined()
expect(loaded.config.enabled).toBe(true)
expect(loaded.config.rules[0]?.id).toBe('wrapped-shape')
})
test('returns disabled config for invalid schema', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
rules: [
{
id: 'invalid-rule',
trigger: {
event: 'TaskCompleted',
outcome: 'failed',
outcomes: ['failed'],
},
actions: [{ type: 'spawn_fallback_agent' }],
},
],
})
const loaded = mod.loadHookChainsConfig({ pathOverride: configPath })
expect(loaded.exists).toBe(true)
expect(loaded.error).toBeDefined()
expect(loaded.config.enabled).toBe(false)
expect(loaded.config.rules).toHaveLength(0)
})
})
describe('evaluateHookChainRules', () => {
test('matches by event + outcome + condition', async () => {
const mod = await importHookChainsModule()
const rules = [
{
id: 'post-tool-failure-rule',
trigger: { event: 'PostToolUseFailure', outcome: 'failed' },
condition: {
toolNames: ['Edit'],
errorIncludes: ['permission'],
eventFieldEquals: { 'meta.source': 'scheduler' },
},
actions: [{ type: 'spawn_fallback_agent' }],
},
]
const matches = mod.evaluateHookChainRules(rules as never, {
eventName: 'PostToolUseFailure',
outcome: 'failed',
payload: {
tool_name: 'Edit',
error: 'Permission denied by policy',
meta: { source: 'scheduler' },
},
})
expect(matches).toHaveLength(1)
expect(matches[0]?.rule.id).toBe('post-tool-failure-rule')
})
test('does not match when event/condition fail', async () => {
const mod = await importHookChainsModule()
const rules = [
{
id: 'rule-no-match',
trigger: { event: 'PostToolUseFailure', outcomes: ['failed'] },
condition: { toolNames: ['Write'] },
actions: [{ type: 'spawn_fallback_agent' }],
},
]
const wrongEvent = mod.evaluateHookChainRules(rules as never, {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { tool_name: 'Write' },
})
expect(wrongEvent).toHaveLength(0)
const wrongCondition = mod.evaluateHookChainRules(rules as never, {
eventName: 'PostToolUseFailure',
outcome: 'failed',
payload: { tool_name: 'Edit' },
})
expect(wrongCondition).toHaveLength(0)
})
})
describe('dispatchHookChainsForEvent guard logic', () => {
test('dedup skips duplicate event/action within dedup window', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 4,
defaultCooldownMs: 0,
defaultDedupWindowMs: 60_000,
rules: [
{
id: 'dedup-rule',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
cooldownMs: 0,
dedupWindowMs: 60_000,
actions: [{ id: 'spawn-1', type: 'spawn_fallback_agent' }],
},
],
})
const spawn = mock(async () => ({ launched: true, agentId: 'agent-1' }))
const first = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-123', error: 'boom' },
},
runtime: { onSpawnFallbackAgent: spawn },
})
const second = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-123', error: 'boom' },
},
runtime: { onSpawnFallbackAgent: spawn },
})
expect(first.actionResults[0]?.status).toBe('executed')
expect(second.actionResults[0]?.status).toBe('skipped')
expect(second.actionResults[0]?.reason).toContain('dedup')
expect(spawn).toHaveBeenCalledTimes(1)
})
test('cooldown skips second dispatch when rule cooldown is active', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 4,
defaultCooldownMs: 60_000,
defaultDedupWindowMs: 0,
rules: [
{
id: 'cooldown-rule',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
cooldownMs: 60_000,
dedupWindowMs: 0,
actions: [{ type: 'spawn_fallback_agent' }],
},
],
})
const spawn = mock(async () => ({ launched: true, agentId: 'agent-2' }))
const first = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-456' },
},
runtime: { onSpawnFallbackAgent: spawn },
})
const second = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-789' },
},
runtime: { onSpawnFallbackAgent: spawn },
})
expect(first.actionResults[0]?.status).toBe('executed')
expect(second.actionResults[0]?.status).toBe('skipped')
expect(second.actionResults[0]?.reason).toContain('cooldown')
expect(spawn).toHaveBeenCalledTimes(1)
})
test('depth limit blocks dispatch when chain depth reaches max', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 1,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'depth-rule',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'spawn_fallback_agent' }],
},
],
})
const spawn = mock(async () => ({ launched: true, agentId: 'agent-3' }))
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-depth' },
},
runtime: {
chainDepth: 1,
onSpawnFallbackAgent: spawn,
},
})
expect(result.enabled).toBe(true)
expect(result.matchedRuleIds).toHaveLength(0)
expect(result.actionResults).toHaveLength(0)
expect(spawn).not.toHaveBeenCalled()
})
})
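The dedup and cooldown guards these tests exercise can be sketched as a standalone helper (a hypothetical simplification for illustration, not the project's actual hookChains.ts implementation): dedup keys on the exact event payload, cooldown keys on the rule id.

```typescript
// Minimal sketch of the guard logic the tests above exercise. Names
// (GuardState, shouldRun) are illustrative, not taken from hookChains.ts.
interface GuardState {
  lastRunByRule: Map<string, number> // ruleId -> last execution timestamp
  seenPayloads: Map<string, number> // payload fingerprint -> first-seen timestamp
}

function shouldRun(
  state: GuardState,
  ruleId: string,
  payloadKey: string,
  now: number,
  cooldownMs: number,
  dedupWindowMs: number,
): { run: boolean; reason?: string } {
  const seenAt = state.seenPayloads.get(payloadKey)
  if (seenAt !== undefined && now - seenAt < dedupWindowMs) {
    return { run: false, reason: 'dedup' } // same payload inside the window
  }
  const lastRun = state.lastRunByRule.get(ruleId)
  if (lastRun !== undefined && now - lastRun < cooldownMs) {
    return { run: false, reason: 'cooldown' } // rule fired too recently
  }
  state.seenPayloads.set(payloadKey, now)
  state.lastRunByRule.set(ruleId, now)
  return { run: true }
}
```

With dedupWindowMs=60_000 and cooldownMs=0 this reproduces the first test's executed/skipped pair; with the values swapped it reproduces the cooldown test, where the second dispatch carries a different payload but hits the same rule.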
describe('action dispatch skip scenarios', () => {
test('fails spawn_fallback_agent when launcher callback is missing', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'missing-launcher',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'spawn_fallback_agent' }],
},
],
})
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-missing-launcher' },
},
runtime: {},
})
expect(result.actionResults[0]?.status).toBe('failed')
expect(result.actionResults[0]?.reason).toContain('launcher')
})
test('skips disabled action and does not execute callback', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'disabled-action-rule',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [
{
type: 'spawn_fallback_agent',
enabled: false,
},
],
},
],
})
const spawn = mock(async () => ({ launched: true, agentId: 'agent-4' }))
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-disabled' },
},
runtime: { onSpawnFallbackAgent: spawn },
})
expect(result.actionResults[0]?.status).toBe('skipped')
expect(result.actionResults[0]?.reason).toContain('disabled')
expect(spawn).not.toHaveBeenCalled()
})
test('skips warm_remote_capacity when policy denies remote sessions', async () => {
const mod = await importHookChainsModule({ allowRemoteSessions: false })
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'policy-denied-remote-warm',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'warm_remote_capacity' }],
},
],
})
const warm = mock(async () => ({
warmed: true,
environmentId: 'env-123',
}))
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-policy-denied' },
},
runtime: { onWarmRemoteCapacity: warm },
})
expect(result.actionResults[0]?.status).toBe('skipped')
expect(result.actionResults[0]?.reason).toContain('policy')
expect(warm).not.toHaveBeenCalled()
})
})

src/utils/hookChains.ts (new file, 1,518 lines; diff suppressed because it is too large)

@@ -10,6 +10,7 @@ import { wrapSpawn } from './ShellCommand.js'
import { TaskOutput } from './task/TaskOutput.js'
import { getCwd } from './cwd.js'
import { randomUUID } from 'crypto'
import { feature } from 'bun:bundle'
import { formatShellPrefixCommand } from './bash/shellPrefix.js'
import {
getHookEnvFilePath,
@@ -134,6 +135,7 @@ import { registerPendingAsyncHook } from './hooks/AsyncHookRegistry.js'
import { enqueuePendingNotification } from './messageQueueManager.js'
import {
extractTextContent,
createAssistantMessage,
getLastAssistantMessage,
wrapInSystemReminder,
} from './messages.js'
@@ -145,6 +147,7 @@ import {
import { createAttachmentMessage } from './attachments.js'
import { all } from './generators.js'
import { findToolByName, type Tools, type ToolUseContext } from '../Tool.js'
import type { CanUseToolFn } from '../hooks/useCanUseTool.js'
import { execPromptHook } from './hooks/execPromptHook.js'
import type { Message, AssistantMessage } from '../types/message.js'
import { execAgentHook } from './hooks/execAgentHook.js'
@@ -162,9 +165,147 @@ import type { AppState } from '../state/AppState.js'
import { jsonStringify, jsonParse } from './slowOperations.js'
import { isEnvTruthy } from './envUtils.js'
import { errorMessage, getErrnoCode } from './errors.js'
import { getAgentName, getTeamName, getTeammateColor } from './teammate.js'
import type {
HookChainOutcome,
HookChainRuntimeContext,
SpawnFallbackAgentRequest,
SpawnFallbackAgentResponse,
} from './hookChains.js'
const TOOL_HOOK_EXECUTION_TIMEOUT_MS = 10 * 60 * 1000
function normalizeFallbackAgentModel(
model: string | undefined,
): 'sonnet' | 'opus' | 'haiku' | undefined {
if (model === 'sonnet' || model === 'opus' || model === 'haiku') {
return model
}
return undefined
}
async function launchFallbackAgentFromHookChains(
request: SpawnFallbackAgentRequest,
toolUseContext: ToolUseContext,
canUseTool: CanUseToolFn,
): Promise<SpawnFallbackAgentResponse> {
try {
const { AgentTool } = await import('../tools/AgentTool/AgentTool.js')
const normalizedModel = normalizeFallbackAgentModel(request.model)
const result = await AgentTool.call(
{
prompt: request.prompt,
description: request.description,
run_in_background: true,
...(request.agentType ? { subagent_type: request.agentType } : {}),
...(normalizedModel ? { model: normalizedModel } : {}),
},
toolUseContext,
canUseTool,
createAssistantMessage({ content: [] }),
)
const data = result.data as
| {
status?: string
agentId?: string
agent_id?: string
}
| undefined
const status = data?.status
if (
status === 'async_launched' ||
status === 'completed' ||
status === 'remote_launched' ||
status === 'teammate_spawned'
) {
return {
launched: true,
agentId: data?.agentId ?? data?.agent_id,
}
}
return {
launched: true,
reason:
status !== undefined
? `Fallback launched with status ${status}`
: undefined,
}
} catch (error) {
return {
launched: false,
reason: `Fallback launch failed: ${errorMessage(error)}`,
}
}
}
async function dispatchHookChainFromHookRuntime(args: {
eventName: 'PostToolUseFailure' | 'TaskCompleted'
outcome: HookChainOutcome
payload: Record<string, unknown>
signal?: AbortSignal
toolUseContext?: ToolUseContext
}): Promise<void> {
try {
if (!feature('HOOK_CHAINS')) {
return
}
const { dispatchHookChainsForEvent } = await import('./hookChains.js')
const runtime: HookChainRuntimeContext = {
signal: args.signal,
senderName: getAgentName() ?? undefined,
senderColor: getTeammateColor() ?? undefined,
teamName: getTeamName() ?? undefined,
}
const chainDepth = args.toolUseContext?.queryTracking?.depth
if (typeof chainDepth === 'number' && Number.isFinite(chainDepth)) {
runtime.chainDepth = chainDepth
}
const hookChainsCanUseTool = (
args.toolUseContext as
| (ToolUseContext & { hookChainsCanUseTool?: CanUseToolFn })
| undefined
)?.hookChainsCanUseTool
if (args.toolUseContext) {
runtime.onSpawnFallbackAgent = request => {
if (!hookChainsCanUseTool) {
return Promise.resolve({
launched: false,
reason:
'Fallback action requires canUseTool in this hook runtime context',
})
}
return launchFallbackAgentFromHookChains(
request,
args.toolUseContext!,
hookChainsCanUseTool,
)
}
}
await dispatchHookChainsForEvent({
event: {
eventName: args.eventName,
outcome: args.outcome,
payload: args.payload,
},
runtime,
})
} catch (error) {
logForDebugging(
`[hook-chains] Dispatch failed for ${args.eventName}: ${errorMessage(error)}`,
)
}
}
/**
* SessionEnd hooks run during shutdown/clear and need a much tighter bound
* than TOOL_HOOK_EXECUTION_TIMEOUT_MS. This value is used by callers as both
@@ -3502,9 +3643,11 @@ export async function* executePostToolUseFailureHooks<ToolInput>(
): AsyncGenerator<AggregatedHookResult> {
const appState = toolUseContext.getAppState()
const sessionId = toolUseContext.agentId ?? getSessionId()
if (!hasHookForEvent('PostToolUseFailure', appState, sessionId)) {
return
}
const hasPostToolFailureHooks = hasHookForEvent(
'PostToolUseFailure',
appState,
sessionId,
)
const hookInput: PostToolUseFailureHookInput = {
...createBaseHookInput(permissionMode, undefined, toolUseContext),
@@ -3516,12 +3659,33 @@ export async function* executePostToolUseFailureHooks<ToolInput>(
is_interrupt: isInterrupt,
}
yield* executeHooks({
hookInput,
toolUseID,
matchQuery: toolName,
let blockingHookCount = 0
if (hasPostToolFailureHooks) {
for await (const result of executeHooks({
hookInput,
toolUseID,
matchQuery: toolName,
signal,
timeoutMs,
toolUseContext,
})) {
if (result.blockingError) {
blockingHookCount++
}
yield result
}
}
await dispatchHookChainFromHookRuntime({
eventName: 'PostToolUseFailure',
outcome: 'failed',
payload: {
...hookInput,
hook_blocking_error_count: blockingHookCount,
hook_execution_skipped: !hasPostToolFailureHooks,
},
signal,
toolUseContext,
})
}
@@ -3807,12 +3971,36 @@ export async function* executeTaskCompletedHooks(
team_name: teamName,
}
yield* executeHooks({
let blockingHookCount = 0
let preventedContinuation = false
for await (const result of executeHooks({
hookInput,
toolUseID: randomUUID(),
signal,
timeoutMs,
toolUseContext,
})) {
if (result.blockingError) {
blockingHookCount++
}
if (result.preventContinuation) {
preventedContinuation = true
}
yield result
}
await dispatchHookChainFromHookRuntime({
eventName: 'TaskCompleted',
outcome:
blockingHookCount > 0 || preventedContinuation ? 'failed' : 'success',
payload: {
...hookInput,
hook_blocking_error_count: blockingHookCount,
hook_prevented_continuation: preventedContinuation,
},
signal,
toolUseContext,
})
}


@@ -75,6 +75,7 @@ import type {
import { isAdvisorBlock } from './advisor.js'
import { isAgentSwarmsEnabled } from './agentSwarmsEnabled.js'
import { count } from './array.js'
import { isEnvTruthy } from './envUtils.js'
import {
type Attachment,
type HookAttachment,
@@ -3666,6 +3667,9 @@ Read the team config to discover your teammates' names. Check the task list peri
])
}
case 'todo_reminder': {
if (isEnvTruthy(process.env.OPENCLAUDE_DISABLE_TOOL_REMINDERS)) {
return []
}
const todoItems = attachment.content
.map((todo, index) => `${index + 1}. [${todo.status}] ${todo.content}`)
.join('\n')
@@ -3686,6 +3690,9 @@ Read the team config to discover your teammates' names. Check the task list peri
if (!isTodoV2Enabled()) {
return []
}
if (isEnvTruthy(process.env.OPENCLAUDE_DISABLE_TOOL_REMINDERS)) {
return []
}
const taskItems = attachment.content
.map(task => `#${task.id}. [${task.status}] ${task.subject}`)
.join('\n')


@@ -1,7 +1,13 @@
import { afterEach, beforeEach, expect, test } from 'bun:test'
import { afterEach, beforeEach, expect, mock, test } from 'bun:test'
import { saveGlobalConfig } from '../config.js'
import { getUserSpecifiedModelSetting } from './model.js'
import {
getDefaultHaikuModel,
getDefaultOpusModel,
getDefaultSonnetModel,
getSmallFastModel,
getUserSpecifiedModelSetting,
} from './model.js'
const SAVED_ENV = {
CLAUDE_CODE_USE_OPENAI: process.env.CLAUDE_CODE_USE_OPENAI,
@@ -28,6 +34,11 @@ function restoreEnv(key: keyof typeof SAVED_ENV): void {
}
beforeEach(() => {
// Other test files (notably modelOptions.github.test.ts) install a
// persistent mock.module for './providers.js' that overrides getAPIProvider
// globally. Without mock.restore() here, those overrides bleed into this
// suite and the provider-kind branches we're testing become unreachable.
mock.restore()
delete process.env.CLAUDE_CODE_USE_OPENAI
delete process.env.CLAUDE_CODE_USE_GEMINI
delete process.env.CLAUDE_CODE_USE_GITHUB
@@ -113,3 +124,76 @@ test('github provider still reads OPENAI_MODEL (regression guard)', () => {
expect(model).toBe('github:copilot')
})
// ---------------------------------------------------------------------------
// Default model helpers — must not fall through to claude-haiku-4-5 etc. for
// OpenAI-shim providers whose endpoints don't speak Anthropic model names.
// Hitting that fallthrough caused WebFetch to hang for 60s on MiniMax/Codex
// because queryHaiku() shipped an unknown model id to the shim endpoint.
// ---------------------------------------------------------------------------
test('getSmallFastModel returns OPENAI_MODEL for MiniMax (regression: WebFetch hang)', () => {
process.env.MINIMAX_API_KEY = 'minimax-test'
process.env.OPENAI_MODEL = 'MiniMax-M2.5-highspeed'
expect(getSmallFastModel()).toBe('MiniMax-M2.5-highspeed')
})
test('getSmallFastModel returns OPENAI_MODEL for Codex (regression)', () => {
process.env.CLAUDE_CODE_USE_OPENAI = '1'
process.env.OPENAI_BASE_URL = 'https://chatgpt.com/backend-api/codex'
process.env.OPENAI_MODEL = 'codexspark'
process.env.CODEX_API_KEY = 'codex-test'
process.env.CHATGPT_ACCOUNT_ID = 'acct_test'
expect(getSmallFastModel()).toBe('codexspark')
})
test('getSmallFastModel returns OPENAI_MODEL for NVIDIA NIM (regression)', () => {
process.env.NVIDIA_NIM = '1'
process.env.CLAUDE_CODE_USE_OPENAI = '1'
process.env.OPENAI_MODEL = 'nvidia/llama-3.1-nemotron-70b-instruct'
expect(getSmallFastModel()).toBe('nvidia/llama-3.1-nemotron-70b-instruct')
})
test('getDefaultOpusModel returns OPENAI_MODEL for MiniMax', () => {
process.env.MINIMAX_API_KEY = 'minimax-test'
process.env.OPENAI_MODEL = 'MiniMax-M2.7'
expect(getDefaultOpusModel()).toBe('MiniMax-M2.7')
})
test('getDefaultSonnetModel returns OPENAI_MODEL for NVIDIA NIM', () => {
process.env.NVIDIA_NIM = '1'
process.env.CLAUDE_CODE_USE_OPENAI = '1'
process.env.OPENAI_MODEL = 'nvidia/llama-3.1-nemotron-70b-instruct'
expect(getDefaultSonnetModel()).toBe('nvidia/llama-3.1-nemotron-70b-instruct')
})
test('getDefaultHaikuModel returns OPENAI_MODEL for MiniMax', () => {
process.env.MINIMAX_API_KEY = 'minimax-test'
process.env.OPENAI_MODEL = 'MiniMax-M2.5-highspeed'
expect(getDefaultHaikuModel()).toBe('MiniMax-M2.5-highspeed')
})
test('default helpers do not leak claude-* names to shim providers', () => {
// Umbrella guard: for each OpenAI-shim provider, none of the default-model
// helpers may return an Anthropic-branded model name. That was the source
// of the WebFetch 60s hang — MiniMax received "claude-haiku-4-5" and sat
// on the connection.
process.env.MINIMAX_API_KEY = 'minimax-test'
process.env.OPENAI_MODEL = 'MiniMax-M2.7'
for (const fn of [
getSmallFastModel,
getDefaultOpusModel,
getDefaultSonnetModel,
getDefaultHaikuModel,
]) {
const model = fn()
expect(model.toLowerCase()).not.toContain('claude')
}
})


@@ -52,10 +52,25 @@ export function getSmallFastModel(): ModelName {
if (getAPIProvider() === 'openai') {
return process.env.OPENAI_MODEL || 'gpt-4o-mini'
}
// Codex provider — OPENAI_MODEL is always set for Codex profiles; only fall
// back to a codex-spark alias when an override env strips it.
if (getAPIProvider() === 'codex') {
return process.env.OPENAI_MODEL || 'codexspark'
}
// For GitHub Copilot provider
if (getAPIProvider() === 'github') {
return process.env.OPENAI_MODEL || 'github:copilot'
}
// NVIDIA NIM — OPENAI_MODEL carries the user's active NIM model; use a
// small Meta Llama variant as the conservative fallback.
if (getAPIProvider() === 'nvidia-nim') {
return process.env.OPENAI_MODEL || 'meta/llama-3.1-8b-instruct'
}
// MiniMax — OPENAI_MODEL carries the active MiniMax model; fall back to
// the fastest tier (M2.5-highspeed) when missing.
if (getAPIProvider() === 'minimax') {
return process.env.OPENAI_MODEL || 'MiniMax-M2.5-highspeed'
}
return getDefaultHaikuModel()
}
@@ -171,6 +186,14 @@ export function getDefaultOpusModel(): ModelName {
if (getAPIProvider() === 'github') {
return process.env.OPENAI_MODEL || 'github:copilot'
}
// NVIDIA NIM
if (getAPIProvider() === 'nvidia-nim') {
return process.env.OPENAI_MODEL || 'nvidia/llama-3.1-nemotron-70b-instruct'
}
// MiniMax — flagship tier for "opus"-equivalent.
if (getAPIProvider() === 'minimax') {
return process.env.OPENAI_MODEL || 'MiniMax-M2.7'
}
// 3P providers (Bedrock, Vertex, Foundry) — kept as a separate branch
// even when values match, since 3P availability lags firstParty and
// these will diverge again at the next model launch.
@@ -205,6 +228,14 @@ export function getDefaultSonnetModel(): ModelName {
if (getAPIProvider() === 'github') {
return process.env.OPENAI_MODEL || 'github:copilot'
}
// NVIDIA NIM
if (getAPIProvider() === 'nvidia-nim') {
return process.env.OPENAI_MODEL || 'nvidia/llama-3.1-nemotron-70b-instruct'
}
// MiniMax — mid tier for "sonnet"-equivalent.
if (getAPIProvider() === 'minimax') {
return process.env.OPENAI_MODEL || 'MiniMax-M2.5'
}
// Default to Sonnet 4.5 for 3P since they may not have 4.6 yet
if (getAPIProvider() !== 'firstParty') {
return getModelStrings().sonnet45
@@ -237,6 +268,14 @@ export function getDefaultHaikuModel(): ModelName {
if (getAPIProvider() === 'gemini') {
return process.env.GEMINI_MODEL || 'gemini-2.0-flash-lite'
}
// NVIDIA NIM
if (getAPIProvider() === 'nvidia-nim') {
return process.env.OPENAI_MODEL || 'meta/llama-3.1-8b-instruct'
}
// MiniMax — fastest tier for "haiku"-equivalent.
if (getAPIProvider() === 'minimax') {
return process.env.OPENAI_MODEL || 'MiniMax-M2.5-highspeed'
}
// Haiku 4.5 is available on all platforms (first-party, Foundry, Bedrock, Vertex)
return getModelStrings().haiku45


@@ -413,16 +413,51 @@ const OPENAI_MAX_OUTPUT_TOKENS: Record<string, number> = {
'moonshot-v1-128k': 32_768,
}
function lookupByModel<T>(table: Record<string, T>, model: string): T | undefined {
// External context-window overrides loaded once at startup.
// Set CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS to a JSON object mapping model name
// → context-window token count to add or override entries without editing
// this file. Example:
// CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS='{"my-corp/llm-v2":200000}'
const OPENAI_EXTERNAL_CONTEXT_WINDOWS: Record<string, number> = (() => {
try {
const raw = process.env.CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS
if (raw) {
const parsed = JSON.parse(raw)
if (typeof parsed === 'object' && parsed !== null) return parsed as Record<string, number>
}
} catch { /* ignore malformed JSON */ }
return {}
})()
// External max-output-token overrides.
// Set CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS to a JSON object mapping model name
// → max output token count.
const OPENAI_EXTERNAL_MAX_OUTPUT_TOKENS: Record<string, number> = (() => {
try {
const raw = process.env.CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS
if (raw) {
const parsed = JSON.parse(raw)
if (typeof parsed === 'object' && parsed !== null) return parsed as Record<string, number>
}
} catch { /* ignore malformed JSON */ }
return {}
})()
function lookupByModel<T>(table: Record<string, T>, externalTable: Record<string, T>, model: string): T | undefined {
// Try provider-qualified key first: "{OPENAI_MODEL}:{model}" so that
// e.g. "github:copilot:claude-haiku-4.5" can have different limits than
// a bare "claude-haiku-4.5" served by another provider.
const providerModel = process.env.OPENAI_MODEL?.trim()
if (providerModel && providerModel !== model) {
const qualified = `${providerModel}:${model}`
// External table takes precedence over the built-in table.
const externalQualified = lookupByKey(externalTable, qualified)
if (externalQualified !== undefined) return externalQualified
const qualifiedResult = lookupByKey(table, qualified)
if (qualifiedResult !== undefined) return qualifiedResult
}
const externalResult = lookupByKey(externalTable, model)
if (externalResult !== undefined) return externalResult
return lookupByKey(table, model)
}
@@ -446,7 +481,7 @@ function lookupByKey<T>(table: Record<string, T>, model: string): T | undefined
* "gpt-4o-2024-11-20" resolve to the base "gpt-4o" entry.
*/
export function getOpenAIContextWindow(model: string): number | undefined {
return lookupByModel(OPENAI_CONTEXT_WINDOWS, model)
return lookupByModel(OPENAI_CONTEXT_WINDOWS, OPENAI_EXTERNAL_CONTEXT_WINDOWS, model)
}
/**
@@ -454,5 +489,5 @@ export function getOpenAIContextWindow(model: string): number | undefined {
* Returns undefined if the model is not in the table.
*/
export function getOpenAIMaxOutputTokens(model: string): number | undefined {
return lookupByModel(OPENAI_MAX_OUTPUT_TOKENS, model)
return lookupByModel(OPENAI_MAX_OUTPUT_TOKENS, OPENAI_EXTERNAL_MAX_OUTPUT_TOKENS, model)
}
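The precedence the new `lookupByModel` implements can be isolated into a small sketch (exact-match only; the real helper goes through `lookupByKey`, which also resolves date-suffixed ids like "gpt-4o-2024-11-20" to their base entry):

```typescript
// Precedence sketch for the override tables above:
// external-qualified > built-in-qualified > external-bare > built-in-bare.
// Simplified to exact-match lookups for illustration.
function resolveLimit(
  builtIn: Record<string, number>,
  external: Record<string, number>,
  model: string,
  providerModel?: string, // stands in for process.env.OPENAI_MODEL
): number | undefined {
  if (providerModel && providerModel !== model) {
    const qualified = `${providerModel}:${model}`
    if (qualified in external) return external[qualified]
    if (qualified in builtIn) return builtIn[qualified]
  }
  if (model in external) return external[model]
  return builtIn[model]
}
```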


@@ -19,7 +19,12 @@ export function getAPIProvider(): APIProvider {
if (isEnvTruthy(process.env.NVIDIA_NIM)) {
return 'nvidia-nim'
}
if (isEnvTruthy(process.env.MINIMAX_API_KEY)) {
// MiniMax is signalled by a real API key, not a '1'/'true' flag. Using
// isEnvTruthy() here silently treated every MiniMax user as 'firstParty'
// (or 'openai' once they set CLAUDE_CODE_USE_OPENAI via the profile),
// making every provider-kind-specific branch for 'minimax' elsewhere in
// the codebase unreachable. Presence check is the correct signal.
if (typeof process.env.MINIMAX_API_KEY === 'string' && process.env.MINIMAX_API_KEY.trim() !== '') {
return 'minimax'
}
return isEnvTruthy(process.env.CLAUDE_CODE_USE_GEMINI)
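The distinction the comment draws can be shown in isolation. Both helpers below are illustrative: `isFlagTruthy` mirrors what the comment says `isEnvTruthy` does (an assumption about its exact accepted values), while `isPresent` captures the fixed behavior.

```typescript
// Flag-style check: only values like "1"/"true" count.
function isFlagTruthy(value: string | undefined): boolean {
  if (value === undefined) return false
  const v = value.trim().toLowerCase()
  return v === '1' || v === 'true'
}

// Presence check: any non-empty string counts, which is the right
// signal for an API key.
function isPresent(value: string | undefined): boolean {
  return typeof value === 'string' && value.trim() !== ''
}
```

A real key such as "sk-minimax-abc123" is present but not flag-truthy, which is why the old check routed every MiniMax user past the 'minimax' branch.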


@@ -0,0 +1,86 @@
import { describe, expect, it } from 'bun:test'
import {
createCorrelationId,
logApiCallStart,
logApiCallEnd,
} from './requestLogging.js'
describe('requestLogging', () => {
describe('createCorrelationId', () => {
it('returns a non-empty string', () => {
const id = createCorrelationId()
expect(id).toBeTruthy()
expect(typeof id).toBe('string')
})
it('returns unique IDs', () => {
const id1 = createCorrelationId()
const id2 = createCorrelationId()
expect(id1).not.toBe(id2)
})
})
describe('logApiCallStart', () => {
it('returns correlation ID and start time', () => {
const result = logApiCallStart('openai', 'gpt-4o')
expect(result.correlationId).toBeTruthy()
expect(result.startTime).toBeGreaterThan(0)
})
it('logs without throwing', () => {
expect(() => logApiCallStart('ollama', 'llama3')).not.toThrow()
})
})
describe('logApiCallEnd', () => {
it('logs success without throwing', () => {
const { correlationId, startTime } = logApiCallStart('openai', 'gpt-4o')
expect(() =>
logApiCallEnd(
correlationId,
startTime,
'gpt-4o',
'success',
100,
50,
false,
),
).not.toThrow()
})
it('logs error without throwing', () => {
const { correlationId, startTime } = logApiCallStart('openai', 'gpt-4o')
expect(() =>
logApiCallEnd(
correlationId,
startTime,
'gpt-4o',
'error',
0,
0,
false,
undefined,
undefined,
'Network error',
),
).not.toThrow()
})
it('logs with all parameters without throwing', () => {
const { correlationId, startTime } = logApiCallStart('openai', 'gpt-4o')
expect(() =>
logApiCallEnd(
correlationId,
startTime,
'gpt-4o',
'success',
100,
50,
true,
120,
42,
'error message',
),
).not.toThrow()
})
})
})


@@ -0,0 +1,89 @@
/**
* Structured Request Logging
*
* Uses existing logForDebugging for structured logging.
*/
import { randomUUID } from 'crypto'
import { logForDebugging } from './debug.js'
export interface RequestLog {
correlationId: string
timestamp: number
provider: string
model: string
duration: number
status: 'success' | 'error'
tokensIn: number
tokensOut: number
error?: string
streaming: boolean
}
export function createCorrelationId(): string {
return randomUUID()
}
export function logApiCallStart(
provider: string,
model: string,
): { correlationId: string; startTime: number } {
const correlationId = createCorrelationId()
const startTime = Date.now()
logForDebugging(
JSON.stringify({
type: 'api_call_start',
correlationId,
provider,
model,
timestamp: startTime,
}),
{ level: 'debug' },
)
return { correlationId, startTime }
}
export function logApiCallEnd(
correlationId: string,
startTime: number,
model: string,
status: 'success' | 'error',
tokensIn: number,
tokensOut: number,
streaming: boolean,
firstTokenMs?: number,
totalChunks?: number,
error?: string,
): void {
const duration = Date.now() - startTime
const logData: Record<string, unknown> = {
type: status === 'error' ? 'api_call_error' : 'api_call_end',
correlationId,
model,
duration_ms: duration,
status,
tokens_in: tokensIn,
tokens_out: tokensOut,
streaming,
}
if (firstTokenMs !== undefined) {
logData.first_token_ms = firstTokenMs
}
if (totalChunks !== undefined) {
logData.total_chunks = totalChunks
}
if (error) {
logData.error = error
}
logForDebugging(
JSON.stringify(logData),
{ level: status === 'error' ? 'error' : 'debug' },
)
}


@@ -0,0 +1,61 @@
import { describe, expect, it, beforeEach } from 'bun:test'
import {
createStreamState,
processStreamChunk,
flushStreamBuffer,
getStreamStats,
} from './streamingOptimizer.js'
describe('streamingOptimizer', () => {
let state: ReturnType<typeof createStreamState>
beforeEach(() => {
state = createStreamState()
})
describe('createStreamState', () => {
it('creates initial state with zero counts', () => {
expect(state.chunkCount).toBe(0)
expect(state.firstTokenTime).toBeNull()
expect(state.startTime).toBeGreaterThan(0)
})
})
describe('processStreamChunk', () => {
it('tracks first token time on first chunk', () => {
processStreamChunk(state, 'hello')
expect(state.firstTokenTime).not.toBeNull()
expect(state.chunkCount).toBe(1)
})
it('increments chunk count', () => {
processStreamChunk(state, 'chunk1')
processStreamChunk(state, 'chunk2')
expect(state.chunkCount).toBe(2)
})
})
describe('getStreamStats', () => {
it('returns zero values for empty stream', () => {
const stats = getStreamStats(state)
expect(stats.totalChunks).toBe(0)
expect(stats.firstTokenMs).toBeNull()
expect(stats.durationMs).toBeGreaterThanOrEqual(0)
})
it('returns correct stats after processing chunks', () => {
processStreamChunk(state, 'test')
const stats = getStreamStats(state)
expect(stats.totalChunks).toBe(1)
expect(stats.firstTokenMs).toBeGreaterThanOrEqual(0)
expect(stats.durationMs).toBeGreaterThanOrEqual(0)
})
})
describe('flushStreamBuffer', () => {
it('returns empty string (no-op)', () => {
const result = flushStreamBuffer(state)
expect(result).toBe('')
})
})
})


@@ -0,0 +1,51 @@
/**
* Streaming Stats Tracker
*
* Observational stats tracking for streaming responses.
* No buffering - purely tracks metrics for monitoring.
*/
export interface StreamStats {
totalChunks: number
firstTokenMs: number | null
durationMs: number
}
export interface StreamState {
chunkCount: number
firstTokenTime: number | null
startTime: number
}
export function createStreamState(): StreamState {
return {
chunkCount: 0,
firstTokenTime: null,
startTime: Date.now(),
}
}
export function processStreamChunk(state: StreamState, _chunk: string): void {
if (state.firstTokenTime === null) {
state.firstTokenTime = Date.now()
}
state.chunkCount++
}
export function flushStreamBuffer(_state: StreamState): string {
return '' // No-op - kept for API compatibility
}
export function getStreamStats(state: StreamState): StreamStats {
const now = Date.now()
const firstTokenMs = state.firstTokenTime
? now - state.firstTokenTime
: null
const durationMs = now - state.startTime
return {
totalChunks: state.chunkCount,
firstTokenMs,
durationMs,
}
}
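A consumer threads one StreamState through its chunk loop and reads the stats once the stream ends. A self-contained sketch (tracker pieces inlined from the module above; `trackChunks` and its chunk array are illustrative, not part of the module):

```typescript
// Inlined from streamingOptimizer.ts above.
interface StreamState {
  chunkCount: number
  firstTokenTime: number | null
  startTime: number
}
function createStreamState(): StreamState {
  return { chunkCount: 0, firstTokenTime: null, startTime: Date.now() }
}
function processStreamChunk(state: StreamState, _chunk: string): void {
  if (state.firstTokenTime === null) {
    state.firstTokenTime = Date.now() // time-to-first-token marker
  }
  state.chunkCount++
}

// Illustrative consumer: feed each streamed chunk through the tracker.
function trackChunks(chunks: string[]): StreamState {
  const state = createStreamState()
  for (const chunk of chunks) {
    processStreamChunk(state, chunk)
  }
  return state
}
```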