fix: surface actionable error when DuckDuckGo web search is rate-limited

Non-Anthropic / non-codex providers (minimax, kimi, generic OpenAI-compatible) fell through to the DDG adapter when no paid search key was configured. DDG's scraper is blocked on most IPs, so web_search surfaced an opaque "anomaly in the request" error. Catch that response in the DDG provider and rethrow an error that names the exact env vars that would unblock the tool and mentions the option of switching to a native-search provider.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(openai-shim): echo reasoning_content on assistant tool-call messages for Moonshot (#828)
2026-04-22 22:22:59 +05:30 · 2026-04-22 22:47:57 +08:00 · 2026-04-22 22:16:47 +08:00 · 2026-04-22 19:48:33 +08:00 · 2026-04-22 19:40:23 +08:00 · 2026-04-22 15:36:07 +08:00
112 changed files with 11371 additions and 871 deletions
--- a/.env.example
+++ b/.env.example
@@ -267,6 +267,11 @@ ANTHROPIC_API_KEY=sk-ant-your-key-here
 # Disable "Co-authored-by" line in git commits made by OpenClaude
 # OPENCLAUDE_DISABLE_CO_AUTHORED_BY=1
 # Disable strict tool schema normalization for non-Gemini providers
 # Useful when MCP tools with complex optional params (e.g. list[dict])
 # trigger "Extra required key ... supplied" errors from OpenAI-compatible endpoints
 # OPENCLAUDE_DISABLE_STRICT_TOOLS=1
 # Custom timeout for API requests in milliseconds (default: varies)
 # API_TIMEOUT_MS=60000
--- a/.gitignore
+++ b/.gitignore
@@ -7,6 +7,8 @@ dist/
 .openclaude-profile.json
 reports/
 GEMINI.md
 CLAUDE.md
 package-lock.json
 /.claude
 coverage/
 agent.log
--- a/.release-please-manifest.json
+++ b/.release-please-manifest.json
@@ -1,3 +1,3 @@
 {
-  ".": "0.5.1"
+  ".": "0.6.0"
 }
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,40 @@
 # Changelog
 ## [0.6.0](https://github.com/Gitlawb/openclaude/compare/v0.5.2...v0.6.0) (2026-04-22)
 ### Features
 * add model caching and benchmarking utilities ([#671](https://github.com/Gitlawb/openclaude/issues/671)) ([2b15e16](https://github.com/Gitlawb/openclaude/commit/2b15e16421f793f954a92c53933a07094544b29d))
 * add thinking token extraction ([#798](https://github.com/Gitlawb/openclaude/issues/798)) ([268c039](https://github.com/Gitlawb/openclaude/commit/268c0398e4bf1ab898069c61500a2b3c226a0322))
 * **api:** compress old tool_result content for small-context providers ([#801](https://github.com/Gitlawb/openclaude/issues/801)) ([a6a3de5](https://github.com/Gitlawb/openclaude/commit/a6a3de5ac155fe9d00befbfcab98d439314effd8))
 * **api:** improve local provider reliability with readiness and self-healing ([#738](https://github.com/Gitlawb/openclaude/issues/738)) ([4cb963e](https://github.com/Gitlawb/openclaude/commit/4cb963e660dbd6ee438c04042700db05a9d32c59))
 * **api:** smart model routing primitive (cheap-for-simple, strong-for-hard) ([#785](https://github.com/Gitlawb/openclaude/issues/785)) ([e908864](https://github.com/Gitlawb/openclaude/commit/e908864da7e7c987a98053ac5d18d702e192db2b))
 * enable 15 additional feature flags in open build ([#667](https://github.com/Gitlawb/openclaude/issues/667)) ([6a62e3f](https://github.com/Gitlawb/openclaude/commit/6a62e3ff76ba9ba446b8e20cf2bb139ee76a9387))
 * native Anthropic API mode for Claude models on GitHub Copilot ([#579](https://github.com/Gitlawb/openclaude/issues/579)) ([fdef4a1](https://github.com/Gitlawb/openclaude/commit/fdef4a1b4ce218ded4937ca83b30acce7c726472))
 * **provider:** expose Atomic Chat in /provider picker with autodetect ([#810](https://github.com/Gitlawb/openclaude/issues/810)) ([ee19159](https://github.com/Gitlawb/openclaude/commit/ee19159c17b3de3b4a8b4a4541a6569f4261d54e))
 * **provider:** zero-config autodetection primitive ([#784](https://github.com/Gitlawb/openclaude/issues/784)) ([a5bfcbb](https://github.com/Gitlawb/openclaude/commit/a5bfcbbadf8e9a1fd42f3e103d295524b8da64b0))
 ### Bug Fixes
 * **api:** ensure strict role sequence and filter empty assistant messages after interruption ([#745](https://github.com/Gitlawb/openclaude/issues/745) regression) ([#794](https://github.com/Gitlawb/openclaude/issues/794)) ([06e7684](https://github.com/Gitlawb/openclaude/commit/06e7684eb56df8e694ac784575e163641931c44c))
 * Collapse all-text arrays to string for DeepSeek compatibility ([#806](https://github.com/Gitlawb/openclaude/issues/806)) ([761924d](https://github.com/Gitlawb/openclaude/commit/761924daa7e225fe8acf41651408c7cae639a511))
 * **model:** codex/nvidia-nim/minimax now read OPENAI_MODEL env ([#815](https://github.com/Gitlawb/openclaude/issues/815)) ([4581208](https://github.com/Gitlawb/openclaude/commit/458120889f6ce54cc9f0b287461d5e38eae48a20))
 * **provider:** saved profile ignored when stale CLAUDE_CODE_USE_* in shell ([#807](https://github.com/Gitlawb/openclaude/issues/807)) ([13de4e8](https://github.com/Gitlawb/openclaude/commit/13de4e85df7f5fadc8cd15a76076374dc112360b))
 * rename .claude.json to .openclaude.json with legacy fallback ([#582](https://github.com/Gitlawb/openclaude/issues/582)) ([4d4fb28](https://github.com/Gitlawb/openclaude/commit/4d4fb2880e4d0e3a62d8715e1ec13d932e736279))
 * replace discontinued gemini-2.5-pro-preview-03-25 with stable gemini-2.5-pro ([#802](https://github.com/Gitlawb/openclaude/issues/802)) ([64582c1](https://github.com/Gitlawb/openclaude/commit/64582c119d5d0278195271379da4a68d59a89c1f)), closes [#398](https://github.com/Gitlawb/openclaude/issues/398)
 * **security:** harden project settings trust boundary + MCP sanitization ([#789](https://github.com/Gitlawb/openclaude/issues/789)) ([ae3b723](https://github.com/Gitlawb/openclaude/commit/ae3b723f3b297b49925cada4728f3174aee8bf12))
 * **test:** autoCompact floor assertion is flag-sensitive ([#816](https://github.com/Gitlawb/openclaude/issues/816)) ([c13842e](https://github.com/Gitlawb/openclaude/commit/c13842e91c7227246520955de6ae0636b30def9a))
 * **ui:** prevent provider manager lag by deferring sync I/O ([#803](https://github.com/Gitlawb/openclaude/issues/803)) ([85eab27](https://github.com/Gitlawb/openclaude/commit/85eab2751e7d351bb0ed6a3fe0e15461d241c9cb))
 ## [0.5.2](https://github.com/Gitlawb/openclaude/compare/v0.5.1...v0.5.2) (2026-04-20)
 ### Bug Fixes
 * **api:** replace phrase-based reasoning sanitizer with tag-based filter ([#779](https://github.com/Gitlawb/openclaude/issues/779)) ([336ddcc](https://github.com/Gitlawb/openclaude/commit/336ddcc50d59d79ebff50993f2673652aecb0d7d))
 ## [0.5.1](https://github.com/Gitlawb/openclaude/compare/v0.5.0...v0.5.1) (2026-04-20)
--- a/README.md
+++ b/README.md
@@ -125,7 +125,7 @@ Advanced and source-build guides:
 | Codex OAuth | `/provider` | Opens ChatGPT sign-in in your browser and stores Codex credentials securely |
 | Codex | `/provider` | Uses existing Codex CLI auth, OpenClaude secure storage, or env credentials |
 | Ollama | `/provider`, env vars, or `ollama launch` | Local inference with no API key |
-| Atomic Chat | advanced setup | Local Apple Silicon backend |
+| Atomic Chat | `/provider`, env vars, or `bun run dev:atomic-chat` | Local Model Provider; auto-detects loaded models |
 | Bedrock / Vertex / Foundry | env vars | Additional provider integrations for supported environments |
 ## What Works
--- a/docs/hook-chains.md
+++ b/docs/hook-chains.md
@@ -0,0 +1,333 @@
 # Hook Chains (Self-Healing Agent Mesh MVP)
 Hook Chains provide an event-driven recovery layer for important workflow failures.
 When a matching hook event occurs, OpenClaude evaluates declarative rules and can dispatch remediation actions such as:
 - `spawn_fallback_agent`
 - `notify_team`
 - `warm_remote_capacity`
 ## Disabled-By-Default Rollout
 > **Rollout recommendation:** keep Hook Chains disabled until you validate rules in your environment.
 >
 > - Set top-level config to `"enabled": false` initially.
 > - Enable per environment when ready.
 > - Dispatch is gated by `feature('HOOK_CHAINS')`.
 > - Env gate defaults to off unless `CLAUDE_CODE_ENABLE_HOOK_CHAINS=1` is set.
 This keeps existing workflows unchanged while you tune guard windows and action behavior.
 ## Feature Overview
 Hook Chains are loaded from a deterministic config file and evaluated on dispatched hook events.
 MVP runtime trigger wiring:
 - `PostToolUseFailure` hooks dispatch Hook Chains with outcome `failed`.
 - `TaskCompleted` hooks dispatch Hook Chains with outcome:
  - `success` when completion hooks did not block.
  - `failed` when completion hooks returned blocking errors or prevented continuation.
 Default config path:
 - `.openclaude/hook-chains.json`
 Override path:
 - `CLAUDE_CODE_HOOK_CHAINS_CONFIG_PATH=/abs/or/relative/path/to/hook-chains.json`
 Global gate:
 - `feature('HOOK_CHAINS')` must be enabled in the build
 - `CLAUDE_CODE_ENABLE_HOOK_CHAINS=0|1` (defaults to disabled when unset)
 ## Safety Guarantees
 The runtime is intentionally conservative:
 - **Depth guard:** chain dispatch is blocked when `chainDepth >= maxChainDepth`.
 - **Rule cooldown:** each rule can only re-fire after cooldown expires.
 - **Dedup window:** identical event/action combinations are suppressed for a window.
 - **Abort-safe behavior:** if the current signal is aborted, actions skip safely.
 - **Policy-aware remote warm:** `warm_remote_capacity` skips when remote sessions are policy denied.
 - **Bridge inactive no-op:** `warm_remote_capacity` safely skips when no active bridge handle exists.
 - **Missing team context safety:** `notify_team` skips with structured reason if no team context/team file is available.
 - **Fallback launcher safety:** `spawn_fallback_agent` fails with a structured reason when launch permissions/context are unavailable.
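The first three guards above can be pictured with a small sketch. This is not the runtime's implementation: the `ChainGuards` class, its method signature, and the order in which the guards run are assumptions made for illustration. Only the guard semantics and the skip-reason prefixes come from this document.

```typescript
// Hypothetical sketch of the depth, cooldown, and dedup guards.
// Class, field, and parameter names are illustrative assumptions.
interface GuardConfig {
  maxChainDepth: number
  cooldownMs: number
  dedupWindowMs: number
}

type GuardResult = { allowed: true } | { allowed: false; reason: string }

class ChainGuards {
  private readonly config: GuardConfig
  private readonly lastRuleFire = new Map<string, number>()
  private readonly lastActionFire = new Map<string, number>()

  constructor(config: GuardConfig) {
    this.config = config
  }

  // `now` is passed in (milliseconds) so the guards stay deterministic.
  check(ruleId: string, dedupKey: string, chainDepth: number, now: number): GuardResult {
    // Depth guard: blocked when chainDepth >= maxChainDepth.
    if (chainDepth >= this.config.maxChainDepth) {
      return { allowed: false, reason: 'max chain depth reached' }
    }
    // Rule cooldown: a rule may only re-fire after its cooldown expires.
    const ruleAt = this.lastRuleFire.get(ruleId)
    if (ruleAt !== undefined && now - ruleAt < this.config.cooldownMs) {
      return { allowed: false, reason: 'rule cooldown active' }
    }
    // Dedup window: identical event/action combinations are suppressed.
    const actionAt = this.lastActionFire.get(dedupKey)
    if (actionAt !== undefined && now - actionAt < this.config.dedupWindowMs) {
      return { allowed: false, reason: 'dedup window active' }
    }
    // Timestamps are recorded only when the dispatch is allowed.
    this.lastRuleFire.set(ruleId, now)
    this.lastActionFire.set(dedupKey, now)
    return { allowed: true }
  }
}
```

Passing the clock in as an argument is a design choice of the sketch; it makes guard windows easy to exercise in tests without waiting on real time.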
 ## Configuration Schema Reference
 Top-level object:
 ```json
 {
  "version": 1,
  "enabled": true,
  "maxChainDepth": 2,
  "defaultCooldownMs": 30000,
  "defaultDedupWindowMs": 30000,
  "rules": []
 }
 ```
 ### Top-Level Fields
 | Field | Type | Required | Notes |
 |---|---|---:|---|
 | `version` | `1` | No | Defaults to `1`. |
 | `enabled` | `boolean` | No | Global feature switch for this config file. |
 | `maxChainDepth` | `integer` | No | Global depth guard (default `2`, max `10`). |
 | `defaultCooldownMs` | `integer` | No | Default rule cooldown in ms (default `30000`). |
 | `defaultDedupWindowMs` | `integer` | No | Default action dedup window in ms (default `30000`). |
 | `rules` | `HookChainRule[]` | No | Defaults to `[]`. May be omitted or empty; when no rules are present, dispatch is a no-op and returns `enabled: false`. |
 > **Note:** An empty ruleset is valid and can be used to keep Hook Chains configured but effectively disabled until rules are added.
 ### Rule Object (`HookChainRule`)
 ```json
 {
  "id": "task-failure-recovery",
  "enabled": true,
  "trigger": {
    "event": "TaskCompleted",
    "outcome": "failed"
  },
  "condition": {
    "toolNames": ["Edit"],
    "taskStatuses": ["failed"],
    "errorIncludes": ["timeout", "permission denied"],
    "eventFieldEquals": {
      "meta.source": "scheduler"
    }
  },
  "cooldownMs": 60000,
  "dedupWindowMs": 30000,
  "maxDepth": 2,
  "actions": []
 }
 ```
 | Field | Type | Required | Notes |
 |---|---|---:|---|
 | `id` | `string` | Yes | Stable identifier used in telemetry/guards. |
 | `enabled` | `boolean` | No | Per-rule switch. |
 | `trigger.event` | `HookEvent` | Yes | Event name to match. |
 | `trigger.outcome` | `"success"\|"failed"\|"timeout"\|"unknown"` | No | Single outcome matcher. |
 | `trigger.outcomes` | `Outcome[]` | No | Multi-outcome matcher. Use either `outcome` or `outcomes`. |
 | `condition` | `object` | No | Optional extra matching constraints. |
 | `cooldownMs` | `integer` | No | Overrides global cooldown for this rule. |
 | `dedupWindowMs` | `integer` | No | Overrides global dedup for this rule. |
 | `maxDepth` | `integer` | No | Per-rule depth cap. |
 | `actions` | `HookChainAction[]` | Yes | One or more actions to execute in order. |
 ### Condition Fields
 | Field | Type | Notes |
 |---|---|---|
 | `toolNames` | `string[]` | Matches `tool_name` / `toolName` in event payload. |
 | `taskStatuses` | `string[]` | Matches `task_status` / `taskStatus` / `status`. |
 | `errorIncludes` | `string[]` | Case-insensitive substring match against `error` / `reason` / `message`. |
 | `eventFieldEquals` | `Record<string, string\|number\|boolean>` | Dot-path equality against payload (example: `"meta.source": "scheduler"`). |
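As a rough illustration of the `errorIncludes` and `eventFieldEquals` rows, the sketch below shows one way such matchers could work. The helper names (`getByPath`, `errorIncludes`, `fieldsEqual`) are hypothetical, and treating `errorIncludes` as "any needle matches any field" is an assumption; the table only states that the match is a case-insensitive substring match.

```typescript
// Hypothetical matchers; names and any-of semantics are assumptions.
type Payload = Record<string, unknown>

// Resolve a dot-path such as "meta.source" against a nested payload.
function getByPath(payload: Payload, path: string): unknown {
  let current: unknown = payload
  for (const key of path.split('.')) {
    if (current === null || typeof current !== 'object') return undefined
    current = (current as Record<string, unknown>)[key]
  }
  return current
}

// Case-insensitive substring match against error / reason / message.
function errorIncludes(payload: Payload, needles: string[]): boolean {
  const haystacks = [payload.error, payload.reason, payload.message]
    .filter((value): value is string => typeof value === 'string')
    .map(value => value.toLowerCase())
  return needles.some(needle =>
    haystacks.some(haystack => haystack.includes(needle.toLowerCase())),
  )
}

// Strict equality of every dot-path entry against the payload.
function fieldsEqual(
  payload: Payload,
  expected: Record<string, string | number | boolean>,
): boolean {
  return Object.entries(expected).every(
    ([path, value]) => getByPath(payload, path) === value,
  )
}
```

With these helpers, a payload of `{ meta: { source: "scheduler" } }` satisfies `"meta.source": "scheduler"`, while a payload without a `meta` object does not.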
 ### Actions
 #### `spawn_fallback_agent`
 ```json
 {
  "type": "spawn_fallback_agent",
  "id": "fallback-1",
  "enabled": true,
  "dedupWindowMs": 30000,
  "description": "Fallback recovery for failed task",
  "promptTemplate": "Recover task ${TASK_SUBJECT}. Event=${EVENT_NAME}, outcome=${OUTCOME}, error=${ERROR}. Payload=${PAYLOAD_JSON}",
  "agentType": "general-purpose",
  "model": "sonnet"
 }
 ```
 #### `notify_team`
 ```json
 {
  "type": "notify_team",
  "id": "notify-ops",
  "enabled": true,
  "dedupWindowMs": 30000,
  "teamName": "mesh-team",
  "recipients": ["*"],
  "summary": "Hook chain ${RULE_ID} fired",
  "messageTemplate": "Event=${EVENT_NAME} outcome=${OUTCOME}\nTask=${TASK_ID}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}"
 }
 ```
 #### `warm_remote_capacity`
 ```json
 {
  "type": "warm_remote_capacity",
  "id": "warm-bridge",
  "enabled": true,
  "dedupWindowMs": 60000,
  "createDefaultEnvironmentIfMissing": false
 }
 ```
 ## Complete Example Configs
 ### 1) Retry via Fallback Agent
 ```json
 {
  "version": 1,
  "enabled": true,
  "maxChainDepth": 2,
  "defaultCooldownMs": 30000,
  "defaultDedupWindowMs": 30000,
  "rules": [
    {
      "id": "retry-task-via-fallback",
      "trigger": {
        "event": "TaskCompleted",
        "outcome": "failed"
      },
      "cooldownMs": 60000,
      "actions": [
        {
          "type": "spawn_fallback_agent",
          "id": "spawn-retry-agent",
          "description": "Retry failed task with fallback agent",
          "promptTemplate": "A task failed. Recover it safely.\nTask=${TASK_SUBJECT}\nDescription=${TASK_DESCRIPTION}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}",
          "agentType": "general-purpose",
          "model": "sonnet"
        }
      ]
    }
  ]
 }
 ```
 ### 2) Notify Only
 ```json
 {
  "version": 1,
  "enabled": true,
  "maxChainDepth": 2,
  "defaultCooldownMs": 30000,
  "defaultDedupWindowMs": 30000,
  "rules": [
    {
      "id": "notify-on-tool-failure",
      "trigger": {
        "event": "PostToolUseFailure",
        "outcome": "failed"
      },
      "condition": {
        "toolNames": ["Edit", "Write", "Bash"]
      },
      "actions": [
        {
          "type": "notify_team",
          "id": "notify-team-failure",
          "recipients": ["*"],
          "summary": "Tool failure detected",
          "messageTemplate": "Tool failure detected.\nEvent=${EVENT_NAME} outcome=${OUTCOME}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}"
        }
      ]
    }
  ]
 }
 ```
 ### 3) Combined Fallback + Notify + Bridge Warm
 ```json
 {
  "version": 1,
  "enabled": true,
  "maxChainDepth": 2,
  "defaultCooldownMs": 45000,
  "defaultDedupWindowMs": 30000,
  "rules": [
    {
      "id": "full-recovery-chain",
      "trigger": {
        "event": "TaskCompleted",
        "outcomes": ["failed", "timeout"]
      },
      "condition": {
        "errorIncludes": ["timeout", "capacity", "connection"]
      },
      "cooldownMs": 90000,
      "actions": [
        {
          "type": "spawn_fallback_agent",
          "id": "fallback-agent",
          "description": "Recover failed task execution",
          "promptTemplate": "Recover failed task and produce a concise fix summary.\nTask=${TASK_SUBJECT}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}"
        },
        {
          "type": "notify_team",
          "id": "notify-team",
          "recipients": ["*"],
          "summary": "Recovery chain triggered",
          "messageTemplate": "Recovery chain ${RULE_ID} fired.\nOutcome=${OUTCOME}\nTask=${TASK_SUBJECT}\nError=${ERROR}"
        },
        {
          "type": "warm_remote_capacity",
          "id": "warm-capacity",
          "createDefaultEnvironmentIfMissing": false
        }
      ]
    }
  ]
 }
 ```
 ## Template Variables
 The following placeholders are supported by `promptTemplate`, `summary`, and `messageTemplate`:
 - `${EVENT_NAME}`
 - `${OUTCOME}`
 - `${RULE_ID}`
 - `${TASK_SUBJECT}`
 - `${TASK_DESCRIPTION}`
 - `${TASK_ID}`
 - `${ERROR}`
 - `${PAYLOAD_JSON}`
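A minimal sketch of how `${NAME}` placeholders could be expanded is shown below. The `renderTemplate` function is hypothetical, and leaving unknown placeholders untouched is an assumption; this document does not say how the runtime treats placeholders it does not recognize.

```typescript
// Hypothetical renderer for ${NAME} placeholders.
// Assumption: placeholders without a supplied value are left as-is.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\$\{([A-Z_]+)\}/g, (match: string, name: string) => {
    const value = vars[name]
    return value !== undefined ? value : match
  })
}

const rendered = renderTemplate('Event=${EVENT_NAME} outcome=${OUTCOME}', {
  EVENT_NAME: 'TaskCompleted',
  OUTCOME: 'failed',
})
console.log(rendered) // prints "Event=TaskCompleted outcome=failed"
```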
 ## Troubleshooting
 ### Rule never triggers
 - Verify `trigger.event` and `trigger.outcome`/`trigger.outcomes` exactly match dispatched event data.
 - Check `condition` filters (especially `toolNames` and `eventFieldEquals` dot-path keys).
 - Confirm the config file is valid JSON and schema-valid.
 ### Actions show as skipped
 Common skip reasons:
 - `action disabled`
 - `rule cooldown active ...`
 - `dedup window active ...`
 - `max chain depth reached ...`
 - `No team context is available ...`
 - `Team file not found ...`
 - `Remote sessions are blocked by policy`
 - `Bridge is not active; warm_remote_capacity is a safe no-op`
 - `No fallback agent launcher is registered in runtime context`
 ### Config changes not reflected
 - The loader memoizes the parsed config keyed on file mtime and size, so an edit that changes neither is not picked up.
 - Ensure your editor writes the file fully and updates mtime.
 - If needed, force reload from the caller side with `forceReloadConfig: true`.
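To see why an unchanged mtime and size can hide an edit, consider this sketch of a memoizing loader. It is not the actual loader: `createConfigLoader` and the `FileAccess` interface are hypothetical, and only the `forceReloadConfig` name is taken from this document. File access is injected so the sketch stays self-contained; in practice `stat` and `read` could be backed by `statSync` and `readFileSync` from `node:fs`.

```typescript
// Hypothetical memoizing loader keyed on mtime and size.
interface FileStat {
  mtimeMs: number
  size: number
}

interface FileAccess {
  stat(path: string): FileStat
  read(path: string): string
}

function createConfigLoader(files: FileAccess) {
  let cache: { path: string; stat: FileStat; value: unknown } | undefined

  return function load(path: string, forceReloadConfig = false): unknown {
    const stat = files.stat(path)
    // Reuse the parsed value while path, mtime, and size are all unchanged.
    if (
      !forceReloadConfig &&
      cache !== undefined &&
      cache.path === path &&
      cache.stat.mtimeMs === stat.mtimeMs &&
      cache.stat.size === stat.size
    ) {
      return cache.value
    }
    const value: unknown = JSON.parse(files.read(path))
    cache = { path, stat, value }
    return value
  }
}
```

An editor that rewrites the file with the same byte count inside the same mtime tick would be served from the cache here, which is the case the force-reload option exists for.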
 ### Existing workflows changed unexpectedly
 - Set `"enabled": false` at top-level.
 - Or globally disable with `CLAUDE_CODE_ENABLE_HOOK_CHAINS=0`.
 - Re-enable gradually after validating one rule at a time.
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
  "name": "@gitlawb/openclaude",
-  "version": "0.5.1",
+  "version": "0.6.0",
  "description": "Claude Code opened to any LLM — OpenAI, Gemini, DeepSeek, Ollama, and 200+ models",
  "type": "module",
  "bin": {
--- a/scripts/build.ts
+++ b/scripts/build.ts
@@ -19,30 +19,46 @@ const version = pkg.version
 // Most Anthropic-internal features stay off; open-build features can be
 // selectively enabled here when their full source exists in the mirror.
 const featureFlags: Record<string, boolean> = {
-  VOICE_MODE: false,
+  // ── Disabled: require Anthropic infrastructure or missing source ─────
-  PROACTIVE: false,
+  VOICE_MODE: false,              // Push-to-talk STT via claude.ai OAuth endpoint
-  KAIROS: false,
+  PROACTIVE: false,               // Autonomous agent mode (missing proactive/ module)
-  BRIDGE_MODE: false,
+  KAIROS: false,                  // Persistent assistant/session mode (cloud backend)
-  DAEMON: false,
+  BRIDGE_MODE: false,             // Remote desktop bridge via CCR infrastructure
-  AGENT_TRIGGERS: false,
+  DAEMON: false,                  // Background daemon process (stubbed in open build)
-  MONITOR_TOOL: true,
+  AGENT_TRIGGERS: false,          // Scheduled remote agent triggers
-  ABLATION_BASELINE: false,
+  ABLATION_BASELINE: false,       // A/B testing harness for eval experiments
-  DUMP_SYSTEM_PROMPT: false,
+  CONTEXT_COLLAPSE: false,        // Context collapsing optimization (stubbed)
-  CACHED_MICROCOMPACT: false,
+  COMMIT_ATTRIBUTION: false,      // Co-Authored-By metadata in git commits
-  COORDINATOR_MODE: true,
+  UDS_INBOX: false,               // Unix Domain Socket inter-session messaging
-  BUILTIN_EXPLORE_PLAN_AGENTS: true,
+  BG_SESSIONS: false,             // Background sessions via tmux (stubbed)
-  CONTEXT_COLLAPSE: false,
+  WEB_BROWSER_TOOL: false,        // Built-in browser automation (source not mirrored)
-  COMMIT_ATTRIBUTION: false,
+  CHICAGO_MCP: false,             // Computer-use MCP (native Swift modules stubbed)
-  TEAMMEM: true,
+  COWORKER_TYPE_TELEMETRY: false, // Telemetry for agent/coworker type classification
-  UDS_INBOX: false,
+
-  BG_SESSIONS: false,
+  // ── Enabled: upstream defaults ──────────────────────────────────────
-  AWAY_SUMMARY: false,
+  COORDINATOR_MODE: true,             // Multi-agent coordinator with worker delegation
-  TRANSCRIPT_CLASSIFIER: false,
+  BUILTIN_EXPLORE_PLAN_AGENTS: true,  // Built-in Explore/Plan specialized subagents
-  WEB_BROWSER_TOOL: false,
+  BUDDY: true,                        // Buddy mode for paired programming
-  MESSAGE_ACTIONS: true,
+  MONITOR_TOOL: true,                 // MCP server monitoring/streaming tool
-  BUDDY: true,
+  TEAMMEM: true,                      // Team memory management
-  CHICAGO_MCP: false,
+  MESSAGE_ACTIONS: true,              // Message action buttons in the UI
-  COWORKER_TYPE_TELEMETRY: false,
+
  // ── Enabled: new activations ────────────────────────────────────────
  DUMP_SYSTEM_PROMPT: true,           // --dump-system-prompt CLI flag for debugging
  CACHED_MICROCOMPACT: true,          // Cache-aware tool result truncation optimization
  AWAY_SUMMARY: true,                 // "While you were away" recap after 5min blur
  TRANSCRIPT_CLASSIFIER: true,        // Auto-approval classifier for safe tool uses
  ULTRATHINK: true,                   // Deep thinking mode — type "ultrathink" to boost reasoning
  TOKEN_BUDGET: true,                 // Token budget tracking with usage warnings
  HISTORY_PICKER: true,               // Enhanced interactive prompt history picker
  QUICK_SEARCH: true,                 // Ctrl+G quick search across prompts
  SHOT_STATS: true,                   // Shot distribution stats in session summary
  EXTRACT_MEMORIES: true,             // Auto-extract durable memories from conversations
  FORK_SUBAGENT: true,                // Implicit context-forking when omitting subagent_type
  VERIFICATION_AGENT: true,           // Built-in read-only agent for test/verification
  MCP_SKILLS: true,                   // Discover skills dynamically from MCP server resources
  PROMPT_CACHE_BREAK_DETECTION: true, // Detect & log unexpected prompt cache invalidations
  HOOK_PROMPTS: true,                 // Allow tools to request interactive user prompts
 }
 // ── Pre-process: replace feature() calls with boolean literals ──────
--- a/scripts/no-telemetry-growthbook-stub.test.ts
+++ b/scripts/no-telemetry-growthbook-stub.test.ts
@@ -50,6 +50,23 @@ describe('growthbook stub — local feature flag overrides', () => {
    expect(stub.getAllGrowthBookFeatures()).toEqual({})
  })
  // ── Open-build defaults (_openBuildDefaults) ────────────────────
  test('returns open-build default when flags file is absent', () => {
    // tengu_passport_quail is in _openBuildDefaults as true; without a
    // flags file the stub should return the open-build override, not
    // the call-site defaultValue.
    expect(stub.getFeatureValue_CACHED_MAY_BE_STALE('tengu_passport_quail', false)).toBe(true)
    expect(stub.getFeatureValue_CACHED_MAY_BE_STALE('tengu_coral_fern', false)).toBe(true)
  })
  test('flags file overrides open-build defaults', () => {
    // User-provided feature-flags.json takes priority over _openBuildDefaults.
    writeFileSync(flagsFile, JSON.stringify({ tengu_passport_quail: false }))
    expect(stub.getFeatureValue_CACHED_MAY_BE_STALE('tengu_passport_quail', true)).toBe(false)
  })
  // ── Valid JSON object ────────────────────────────────────────────
  test('loads and returns values from a valid JSON file', () => {
--- a/scripts/no-telemetry-plugin.ts
+++ b/scripts/no-telemetry-plugin.ts
@@ -40,6 +40,151 @@ import _os from 'node:os';
 let _flags = undefined;
 // ── Open-build GrowthBook overrides ───────────────────────────────────
 // Override upstream defaultValue for runtime gates tied to build-time
 // features. Only keys that DIFFER from upstream belong here — the
 // catalog below is pure documentation and does NOT affect resolution.
 //
 // Priority: ~/.claude/feature-flags.json > _openBuildDefaults > defaultValue
 //
 // To override at runtime, create ~/.claude/feature-flags.json:
 //   { "tengu_some_flag": true }
 const _openBuildDefaults = {
  'tengu_sedge_lantern': true,  // AWAY_SUMMARY — "while you were away" recap (upstream: false)
  'tengu_hive_evidence': true,  // VERIFICATION_AGENT — read-only test/verification agent (upstream: false)
  'tengu_passport_quail': true, // EXTRACT_MEMORIES — enable memory extraction (upstream: false)
  'tengu_coral_fern': true,     // EXTRACT_MEMORIES — enable memory search in past context (upstream: false)
 };
 /* ── Known runtime feature keys (reference only) ───────────────────────
 * This catalog does NOT participate in flag resolution. It documents
 * the known GrowthBook keys and their upstream default values, scraped
 * from src/ call sites. It is NOT exhaustive — new keys may be added
 * upstream between catalog updates.
 *
 * Some keys have different defaults at different call sites — this is
 * intentional upstream (the server unifies the value at runtime).
 *
 * To activate any of these, add them to ~/.claude/feature-flags.json
 * or to _openBuildDefaults above.
 *
 * ── Reasoning & thinking ──────────────────────────────────────────────
 *   tengu_turtle_carbon            = true       ULTRATHINK deep thinking runtime gate
 *   tengu_thinkback                = gate       /thinkback replay command
 *
 * ── Agents & orchestration ────────────────────────────────────────────
 *   tengu_amber_flint              = true       Agent swarms coordination
 *   tengu_amber_stoat              = true       Built-in agent availability (Explore, Plan, etc.)
 *   tengu_agent_list_attach        = true       Attach file context to agent list
 *   tengu_auto_background_agents   = false      Auto-spawn background agents
 *   tengu_slim_subagent_claudemd   = true       Lighter ClaudeMD for subagents
 *   tengu_hive_evidence            = false      Verification agent / evidence tracking (4 call sites)
 *   tengu_ultraplan_model          = model cfg  ULTRAPLAN model selection (dynamic config)
 *
 * ── Memory & context ──────────────────────────────────────────────────
 *   tengu_passport_quail           = false      EXTRACT_MEMORIES main gate (isExtractModeActive)
 *   tengu_coral_fern               = false      EXTRACT_MEMORIES search in past context
 *   tengu_slate_thimble            = false      Memory dir paths (non-interactive sessions)
 *   tengu_herring_clock            = true/false Team memory paths (varies by call site)
 *   tengu_bramble_lintel           = null       Extract memories throttle (null → every turn)
 *   tengu_sedge_lantern            = false      AWAY_SUMMARY "while you were away" recap
 *   tengu_session_memory           = false      Session memory service
 *   tengu_sm_config                = {}         Session memory config (dynamic)
 *   tengu_sm_compact_config        = {}         Session memory compaction config (dynamic)
 *   tengu_cobalt_raccoon           = false      Reactive compaction (suppress auto-compact)
 *   tengu_pebble_leaf_prune        = false      Session storage pruning
 *
 * ── Kairos & cron ─────────────────────────────────────────────────────
 *   tengu_kairos_brief             = false      Brief layout mode (KAIROS)
 *   tengu_kairos_brief_config      = {}         Brief config (dynamic)
 *   tengu_kairos_cron              = true       Cron scheduler enable
 *   tengu_kairos_cron_durable      = true       Durable (disk-persistent) cron tasks
 *   tengu_kairos_cron_config       = {}         Cron jitter config (dynamic)
 *
 * ── Bridge & remote (require Anthropic infra) ─────────────────────────
 *   tengu_ccr_bridge               = false      CCR bridge connection
 *   tengu_ccr_bridge_multi_session = gate       Multi-session spawn mode
 *   tengu_ccr_mirror               = false      CCR session mirroring
 *   tengu_ccr_bundle_seed_enabled  = gate       Git bundle seeding for CCR
 *   tengu_ccr_bundle_max_bytes     = null       Bundle size limit (null → default)
 *   tengu_bridge_repl_v2           = false      Environment-less REPL bridge v2
 *   tengu_bridge_repl_v2_cse_shim_enabled = true CSE→Session tag retag shim
 *   tengu_bridge_min_version       = {min:'0'}  Min CLI version for bridge (dynamic)
 *   tengu_bridge_initial_history_cap = 200      Initial history cap for bridge
 *   tengu_bridge_system_init       = false      Bridge system initialization
 *   tengu_cobalt_harbor            = false      Auto-connect CCR at startup
 *   tengu_cobalt_lantern           = false      Remote setup preconditions
 *   tengu_remote_backend           = false      Remote TUI backend
 *   tengu_surreal_dali             = false      Remote agent tasks / triggers
 *
 * ── Prompt & API ──────────────────────────────────────────────────────
 *   tengu_attribution_header       = true       Attribution header in API requests
 *   tengu_basalt_3kr               = true       MCP instructions delta
 *   tengu_slate_prism              = true/false Message formatting (varies by call site)
 *   tengu_amber_prism              = false      Message content formatting
 *   tengu_amber_json_tools         = false      JSON format for tool schemas
 *   tengu_fgts                     = false      API feature gates
 *   tengu_otk_slot_v1              = false      One-time key slots for API auth
 *   tengu_cicada_nap_ms            = 0          Background GrowthBook refresh throttle (ms)
 *   tengu_miraculo_the_bard        = false      Service initialization gate
 *   tengu_immediate_model_command  = false      Immediate /model command execution
 *   tengu_chomp_inflection         = false      Prompt suggestions after responses
 *   tengu_tool_pear                = gate       API betas for tool use
 *   tengu-off-switch               = {act:false} Service kill switch (dynamic; uses dash)
 *
 * ── Permissions & security ────────────────────────────────────────────
 *   tengu_birch_trellis            = true       Bash auto-mode permissions config
 *   tengu_auto_mode_config         = {}         Auto-mode configuration (dynamic, many call sites)
 *   tengu_iron_gate_closed         = true       Permission iron gate (with refresh)
 *   tengu_destructive_command_warning = false    Warning for destructive bash commands
 *   tengu_disable_bypass_permissions_mode = security Security killswitch (always false in open build)
 *
 * ── UI & UX ───────────────────────────────────────────────────────────
 *   tengu_willow_mode              = 'off'      REPL rendering mode
 *   tengu_terminal_panel           = false      Terminal panel keybinding
 *   tengu_terminal_sidebar         = false      Terminal sidebar in REPL/config
 *   tengu_marble_sandcastle        = false      Fast mode gate
 *   tengu_jade_anvil_4             = false      Rate limit options UI ordering
 *   tengu_collage_kaleidoscope     = true       Native clipboard image paste (macOS)
 *   tengu_lapis_finch              = false      Plugin/hint recommendation
 *   tengu_lodestone_enabled        = false      Deep links claude-cli:// protocol
 *   tengu_copper_panda             = false      Skill improvement suggestions
 *   tengu_desktop_upsell           = {}         Desktop app upsell config (dynamic)
 *   tengu-top-of-feed-tip          = {}         Emergency top-of-feed tip (dynamic; uses dash)
 *
 * ── File operations ───────────────────────────────────────────────────
 *   tengu_quartz_lantern           = false      File read/write dedup optimization
 *   tengu_moth_copse               = false      Attachments handling (variant A)
 *   tengu_marble_fox               = false      Attachments handling (variant B)
 *   tengu_scratch                  = gate       Scratchpad filesystem access / coordinator
 *
 * ── MCP & plugins ─────────────────────────────────────────────────────
 *   tengu_harbor                   = false      MCP channel allowlist verification
 *   tengu_harbor_permissions       = false      MCP channel permissions enforcement
 *   tengu_copper_bridge            = false      Chrome MCP bridge
 *   tengu_chrome_auto_enable       = false      Auto-enable Chrome MCP on startup
 *   tengu_glacier_2xr              = false      Enhanced tool search / ToolSearchTool
 *   tengu_malort_pedway            = {}         Computer-use (Chicago) config (dynamic)
 *
 * ── VSCode / IDE ──────────────────────────────────────────────────────
 *   tengu_quiet_fern               = false      VSCode browser support
 *   tengu_vscode_cc_auth           = false      VSCode in-band OAuth via claude_authenticate
 *   tengu_vscode_review_upsell     = gate       VSCode review upsell
 *   tengu_vscode_onboarding        = gate       VSCode onboarding experience
 *
 * ── Voice ─────────────────────────────────────────────────────────────
 *   tengu_amber_quartz_disabled    = false      VOICE_MODE kill-switch (false = voice allowed)
 *
 * ── Auto-updater (stubbed in open build) ──────────────────────────────
 *   tengu_version_config           = {min:'0'}  Min version enforcement (dynamic)
 *   tengu_max_version_config       = {}         Max version / deprecation config (dynamic)
 *
 * ── Telemetry & tracing ───────────────────────────────────────────────
 *   tengu_trace_lantern            = false      Beta session tracing
 *   tengu_chair_sermon             = gate       Analytics / message formatting gate
 *   tengu_strap_foyer              = false      Settings sync to cloud
 */
 function _loadFlags() {
  if (_flags !== undefined) return;
  try {
@@ -55,6 +200,7 @@ function _loadFlags() {
 function _getFlagValue(key, defaultValue) {
  _loadFlags();
  if (_flags != null && Object.hasOwn(_flags, key)) return _flags[key];
+  if (Object.hasOwn(_openBuildDefaults, key)) return _openBuildDefaults[key];
  return defaultValue;
 }
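The lookup order in `_getFlagValue` (loaded flags first, then open-build defaults, then the caller's default) can be sketched in isolation. This is a minimal illustration, not the project's code: `resolveFlag`, `hasOwn`, and the sample maps are hypothetical names, and the two sample defaults are borrowed from the comment table above.

```typescript
// Hypothetical sketch of a three-layer flag lookup. Only the ordering
// (loaded -> build defaults -> caller fallback) mirrors the diff.
type FlagMap = Record<string, unknown>

// Own-property check, so a stored `false` or `undefined` still counts as set.
function hasOwn(map: FlagMap, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(map, key)
}

function resolveFlag(
  loaded: FlagMap | null | undefined,
  buildDefaults: FlagMap,
  key: string,
  fallback: unknown,
): unknown {
  if (loaded != null && hasOwn(loaded, key)) return loaded[key]
  if (hasOwn(buildDefaults, key)) return buildDefaults[key]
  return fallback
}

// Sample data (assumed values for illustration).
const buildDefaults: FlagMap = { tengu_willow_mode: 'off', tengu_harbor: false }
const loaded: FlagMap = { tengu_harbor: true }

const harbor = resolveFlag(loaded, buildDefaults, 'tengu_harbor', false)
const willow = resolveFlag(loaded, buildDefaults, 'tengu_willow_mode', 'on')
const unknownFlag = resolveFlag(loaded, buildDefaults, 'tengu_not_listed', 42)
```

Here `harbor` comes from the loaded map, `willow` from the build defaults, and `unknownFlag` from the caller's fallback.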
--- a/scripts/system-check.test.ts
+++ b/scripts/system-check.test.ts
@@ -20,6 +20,23 @@ describe('formatReachabilityFailureDetail', () => {
    )
  })
  test('redacts credentials and sensitive query parameters in endpoint details', () => {
    const detail = formatReachabilityFailureDetail(
      'http://user:pass@localhost:11434/v1/models?token=abc123&mode=test',
      502,
      'bad gateway',
      {
        transport: 'chat_completions',
        requestedModel: 'llama3.1:8b',
        resolvedModel: 'llama3.1:8b',
      },
    )
    expect(detail).toBe(
      'Unexpected status 502 from http://redacted:redacted@localhost:11434/v1/models?token=redacted&mode=test. Body: bad gateway',
    )
  })
  test('adds alias/entitlement hint for codex model support 400s', () => {
    const detail = formatReachabilityFailureDetail(
      'https://chatgpt.com/backend-api/codex/responses',
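The redaction test above pins the expected output for a URL with embedded credentials and a `token` query parameter. One way to produce that output with the WHATWG `URL` API is sketched below. This is an assumption-laden illustration, not the project's `redactUrlForDisplay`: the function name `redactUrlSketch`, the `SENSITIVE_PARAMS` list, and the placeholder returned for unparseable input are all hypothetical.

```typescript
// Hypothetical helper; the real redactUrlForDisplay is not shown in this diff.
// Assumed list of query parameter names treated as sensitive.
const SENSITIVE_PARAMS = new Set(['token', 'api_key', 'apikey', 'key', 'secret', 'password'])

function redactUrlSketch(raw: string): string {
  let parsed: URL
  try {
    parsed = new URL(raw)
  } catch {
    // Avoid echoing input that could not be parsed (it may still hold secrets).
    return '<unparseable url>'
  }
  if (parsed.username) parsed.username = 'redacted'
  if (parsed.password) parsed.password = 'redacted'
  // Copy the keys first so the params are not mutated while iterating.
  for (const name of Array.from(parsed.searchParams.keys())) {
    if (SENSITIVE_PARAMS.has(name.toLowerCase())) {
      parsed.searchParams.set(name, 'redacted')
    }
  }
  return parsed.toString()
}

const redacted = redactUrlSketch(
  'http://user:pass@localhost:11434/v1/models?token=abc123&mode=test',
)
```

Non-sensitive parameters such as `mode=test` pass through unchanged, which matches the expectation in the test.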
--- a/scripts/system-check.ts
+++ b/scripts/system-check.ts
@@ -7,6 +7,11 @@ import {
  resolveProviderRequest,
  isLocalProviderUrl as isProviderLocalUrl,
 } from '../src/services/api/providerConfig.js'
+import {
+  getLocalOpenAICompatibleProviderLabel,
+  probeOllamaGenerationReadiness,
+} from '../src/utils/providerDiscovery.js'
+import { redactUrlForDisplay } from '../src/utils/urlRedaction.js'
 type CheckResult = {
  ok: boolean
@@ -69,7 +74,7 @@ export function formatReachabilityFailureDetail(
  },
 ): string {
  const compactBody = responseBody.trim().replace(/\s+/g, ' ').slice(0, 240)
-  const base = `Unexpected status ${status} from ${endpoint}.`
+  const base = `Unexpected status ${status} from ${redactUrlForDisplay(endpoint)}.`
  const bodySuffix = compactBody ? ` Body: ${compactBody}` : ''
  if (request.transport !== 'codex_responses' || status !== 400) {
@@ -255,7 +260,7 @@ function checkOpenAIEnv(): CheckResult[] {
    results.push(pass('OPENAI_MODEL', process.env.OPENAI_MODEL))
  }
-  results.push(pass('OPENAI_BASE_URL', request.baseUrl))
+  results.push(pass('OPENAI_BASE_URL', redactUrlForDisplay(request.baseUrl)))
  if (request.transport === 'codex_responses') {
    const credentials = resolveCodexApiCredentials(process.env)
@@ -308,7 +313,7 @@ async function checkBaseUrlReachability(): Promise<CheckResult> {
    return pass('Provider reachability', 'Skipped (OpenAI-compatible mode disabled).')
  }
-  if (useGithub) {
+  if (useGithub && !useOpenAI) {
    return pass(
      'Provider reachability',
      'Skipped for GitHub Models (inference endpoint differs from OpenAI /models probe).',
@@ -326,6 +331,7 @@ async function checkBaseUrlReachability(): Promise<CheckResult> {
  const endpoint = request.transport === 'codex_responses'
    ? `${request.baseUrl}/responses`
    : `${request.baseUrl}/models`
+  const redactedEndpoint = redactUrlForDisplay(endpoint)
  const controller = new AbortController()
  const timeout = setTimeout(() => controller.abort(), 4000)
@@ -375,7 +381,10 @@ async function checkBaseUrlReachability(): Promise<CheckResult> {
    })
    if (response.status === 200 || response.status === 401 || response.status === 403) {
-      return pass('Provider reachability', `Reached ${endpoint} (status ${response.status}).`)
+      return pass(
+        'Provider reachability',
+        `Reached ${redactedEndpoint} (status ${response.status}).`,
+      )
    }
    const responseBody = await response.text().catch(() => '')
@@ -391,12 +400,100 @@ async function checkBaseUrlReachability(): Promise<CheckResult> {
    )
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error)
-    return fail('Provider reachability', `Failed to reach ${endpoint}: ${message}`)
+    return fail(
+      'Provider reachability',
+      `Failed to reach ${redactedEndpoint}: ${message}`,
+    )
  } finally {
    clearTimeout(timeout)
  }
 }
 async function checkProviderGenerationReadiness(): Promise<CheckResult> {
  const useGemini = isTruthy(process.env.CLAUDE_CODE_USE_GEMINI)
  const useOpenAI = isTruthy(process.env.CLAUDE_CODE_USE_OPENAI)
  const useGithub = isTruthy(process.env.CLAUDE_CODE_USE_GITHUB)
  const useMistral = isTruthy(process.env.CLAUDE_CODE_USE_MISTRAL)
  if (!useGemini && !useOpenAI && !useGithub && !useMistral) {
    return pass('Provider generation readiness', 'Skipped (OpenAI-compatible mode disabled).')
  }
  if (useGithub && !useOpenAI) {
    return pass(
      'Provider generation readiness',
      'Skipped for GitHub Models (runtime generation uses a different endpoint flow).',
    )
  }
  if (useGemini || useMistral) {
    return pass(
      'Provider generation readiness',
      'Skipped for managed provider mode.',
    )
  }
  if (!useOpenAI) {
    return pass('Provider generation readiness', 'Skipped (OpenAI-compatible mode disabled).')
  }
  const request = resolveProviderRequest({
    model: process.env.OPENAI_MODEL,
    baseUrl: process.env.OPENAI_BASE_URL,
  })
  if (request.transport === 'codex_responses') {
    return pass(
      'Provider generation readiness',
      'Skipped for Codex responses (reachability probe already performs a lightweight generation request).',
    )
  }
  if (!isLocalBaseUrl(request.baseUrl)) {
    return pass('Provider generation readiness', 'Skipped for non-local provider URL.')
  }
  const localProviderLabel = getLocalOpenAICompatibleProviderLabel(request.baseUrl)
  if (localProviderLabel !== 'Ollama') {
    return pass(
      'Provider generation readiness',
      `Skipped for ${localProviderLabel} (no provider-specific generation probe).`,
    )
  }
  const readiness = await probeOllamaGenerationReadiness({
    baseUrl: request.baseUrl,
    model: request.requestedModel,
  })
  if (readiness.state === 'ready') {
    return pass(
      'Provider generation readiness',
      `Generated a test response with ${readiness.probeModel ?? request.requestedModel}.`,
    )
  }
  if (readiness.state === 'unreachable') {
    return fail(
      'Provider generation readiness',
      `Could not reach Ollama at ${redactUrlForDisplay(request.baseUrl)}.`,
    )
  }
  if (readiness.state === 'no_models') {
    return fail(
      'Provider generation readiness',
      'Ollama is reachable, but no installed models were found. Pull a model first (for example: ollama pull qwen2.5-coder:7b).',
    )
  }
  const detailSuffix = readiness.detail ? ` Detail: ${readiness.detail}.` : ''
  return fail(
    'Provider generation readiness',
    `Ollama is reachable, but generation failed for ${readiness.probeModel ?? request.requestedModel}.${detailSuffix}`,
  )
 }
 function isAtomicChatUrl(baseUrl: string): boolean {
  try {
    const parsed = new URL(baseUrl)
@@ -567,6 +664,7 @@ async function main(): Promise<void> {
  results.push(checkBuildArtifacts())
  results.push(...checkOpenAIEnv())
  results.push(await checkBaseUrlReachability())
+  results.push(await checkProviderGenerationReadiness())
  results.push(checkOllamaProcessorMode())
  if (!options.json) {
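The env-flag guards at the top of `checkProviderGenerationReadiness` can be read as a pure decision function, which makes the skip order easy to test. The sketch below is hypothetical: `generationProbeSkipReason` and `isTruthySketch` are illustrative names, and the truthiness rule is an assumption because the project's `isTruthy` is not shown here. It omits the final `!useOpenAI` guard from the diff, since the three earlier checks appear to cover every case where OpenAI mode is off.

```typescript
// Hypothetical sketch; mirrors only the order of the skip guards in the diff.
type Env = Record<string, string | undefined>

// Assumed truthiness rule (the project's isTruthy may differ).
function isTruthySketch(value: string | undefined): boolean {
  return value === '1' || value?.toLowerCase() === 'true'
}

// Returns a skip message, or null when the generation probe should run.
function generationProbeSkipReason(env: Env): string | null {
  const useGemini = isTruthySketch(env.CLAUDE_CODE_USE_GEMINI)
  const useOpenAI = isTruthySketch(env.CLAUDE_CODE_USE_OPENAI)
  const useGithub = isTruthySketch(env.CLAUDE_CODE_USE_GITHUB)
  const useMistral = isTruthySketch(env.CLAUDE_CODE_USE_MISTRAL)

  if (!useGemini && !useOpenAI && !useGithub && !useMistral) {
    return 'Skipped (OpenAI-compatible mode disabled).'
  }
  if (useGithub && !useOpenAI) {
    return 'Skipped for GitHub Models (runtime generation uses a different endpoint flow).'
  }
  if (useGemini || useMistral) {
    return 'Skipped for managed provider mode.'
  }
  return null
}

const noProvider = generationProbeSkipReason({})
const githubOnly = generationProbeSkipReason({ CLAUDE_CODE_USE_GITHUB: '1' })
const openaiAndGithub = generationProbeSkipReason({
  CLAUDE_CODE_USE_OPENAI: '1',
  CLAUDE_CODE_USE_GITHUB: '1',
})
```

The third call returns `null`, which is the case the `useGithub && !useOpenAI` change in this diff is meant to let through.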
--- a/src/Tool.ts
+++ b/src/Tool.ts
@@ -249,6 +249,11 @@ export type ToolUseContext = {
  /** When true, canUseTool must always be called even when hooks auto-approve.
   *  Used by speculation for overlay file path rewriting. */
  requireCanUseTool?: boolean
  /**
   * Optional callback used by hook-chain fallback actions that launch
   * AgentTool from hook runtime paths.
   */
  hookChainsCanUseTool?: CanUseToolFn
  messages: Message[]
  fileReadingLimits?: {
    maxTokens?: number
--- a/src/tests/security-hardening.test.ts
+++ b/src/tests/security-hardening.test.ts
@@ -0,0 +1,191 @@
 /**
 * Security hardening regression tests.
 *
 * Covers:
 * 1. MCP tool result Unicode sanitization
 * 2. Sandbox settings source filtering (exclude projectSettings)
 * 3. Plugin git clone/pull hooks disabled
 * 4. ANTHROPIC_FOUNDRY_API_KEY removed from SAFE_ENV_VARS
 * 5. WebFetch SSRF protection via ssrfGuardedLookup
 */
 import { describe, test, expect } from 'bun:test'
 import { resolve } from 'path'
 const SRC = resolve(import.meta.dir, '..')
 const file = (relative: string) => Bun.file(resolve(SRC, relative))
 // ---------------------------------------------------------------------------
 // Fix 1: MCP tool result Unicode sanitization
 // ---------------------------------------------------------------------------
 describe('MCP tool result sanitization', () => {
  test('transformResultContent sanitizes text content', async () => {
    const content = await file('services/mcp/client.ts').text()
    // Tool definitions are already sanitized (line ~1798)
    expect(content).toContain('recursivelySanitizeUnicode(result.tools)')
    // Tool results must also be sanitized
    expect(content).toMatch(
      /case 'text':[\s\S]*?recursivelySanitizeUnicode\(resultContent\.text\)/,
    )
  })
  test('resource text content is also sanitized', async () => {
    const content = await file('services/mcp/client.ts').text()
    expect(content).toMatch(
      /recursivelySanitizeUnicode\(\s*`\$\{prefix\}\$\{resource\.text\}`/,
    )
  })
 })
 // ---------------------------------------------------------------------------
 // Fix 2: Sandbox settings source filtering
 // ---------------------------------------------------------------------------
 describe('Sandbox settings trust boundary', () => {
  test('getSandboxEnabledSetting does not use getSettings_DEPRECATED', async () => {
    const content = await file('utils/sandbox/sandbox-adapter.ts').text()
    // Extract the getSandboxEnabledSetting function body
    const fnMatch = content.match(
      /function getSandboxEnabledSetting\(\)[^{]*\{([\s\S]*?)\n\}/,
    )
    expect(fnMatch).not.toBeNull()
    const fnBody = fnMatch![1]
    // Must NOT use getSettings_DEPRECATED (reads all sources including project)
    expect(fnBody).not.toContain('getSettings_DEPRECATED')
    // Must use getSettingsForSource for individual trusted sources
    expect(fnBody).toContain("getSettingsForSource('userSettings')")
    expect(fnBody).toContain("getSettingsForSource('policySettings')")
    // Must NOT read from projectSettings
    expect(fnBody).not.toContain("'projectSettings'")
  })
 })
 // ---------------------------------------------------------------------------
 // Fix 3: Plugin git hooks disabled
 // ---------------------------------------------------------------------------
 describe('Plugin git operations disable hooks', () => {
  test('gitClone includes core.hooksPath=/dev/null', async () => {
    const content = await file('utils/plugins/marketplaceManager.ts').text()
    // The clone args must disable hooks
    const cloneSection = content.slice(
      content.indexOf('export async function gitClone('),
      content.indexOf('export async function gitClone(') + 2000,
    )
    expect(cloneSection).toContain("'core.hooksPath=/dev/null'")
  })
  test('gitPull includes core.hooksPath=/dev/null', async () => {
    const content = await file('utils/plugins/marketplaceManager.ts').text()
    const pullSection = content.slice(
      content.indexOf('export async function gitPull('),
      content.indexOf('export async function gitPull(') + 2000,
    )
    expect(pullSection).toContain("'core.hooksPath=/dev/null'")
  })
  test('gitSubmoduleUpdate includes core.hooksPath=/dev/null', async () => {
    const content = await file('utils/plugins/marketplaceManager.ts').text()
    const subSection = content.slice(
      content.indexOf('async function gitSubmoduleUpdate('),
      content.indexOf('async function gitSubmoduleUpdate(') + 1000,
    )
    expect(subSection).toContain("'core.hooksPath=/dev/null'")
  })
 })
 // ---------------------------------------------------------------------------
 // Fix 4: ANTHROPIC_FOUNDRY_API_KEY not in SAFE_ENV_VARS
 // ---------------------------------------------------------------------------
 describe('SAFE_ENV_VARS excludes credentials', () => {
  test('ANTHROPIC_FOUNDRY_API_KEY is not in SAFE_ENV_VARS', async () => {
    const content = await file('utils/managedEnvConstants.ts').text()
    // Extract the SAFE_ENV_VARS set definition
    const safeStart = content.indexOf('export const SAFE_ENV_VARS')
    const safeEnd = content.indexOf('])', safeStart)
    const safeSection = content.slice(safeStart, safeEnd)
    expect(safeSection).not.toContain('ANTHROPIC_FOUNDRY_API_KEY')
  })
 })
 // ---------------------------------------------------------------------------
 // Fix 5: WebFetch SSRF protection
 // ---------------------------------------------------------------------------
 describe('WebFetch SSRF guard', () => {
  test('getWithPermittedRedirects uses ssrfGuardedLookup', async () => {
    const content = await file('tools/WebFetchTool/utils.ts').text()
    expect(content).toContain(
      "import { ssrfGuardedLookup } from '../../utils/hooks/ssrfGuard.js'",
    )
    // The axios.get call in getWithPermittedRedirects must include lookup
    const fnSection = content.slice(
      content.indexOf('export async function getWithPermittedRedirects('),
      content.indexOf('export async function getWithPermittedRedirects(') +
        1000,
    )
    expect(fnSection).toContain('lookup: ssrfGuardedLookup')
  })
 })
 // ---------------------------------------------------------------------------
 // Fix 6: Swarm permission file polling removed (security hardening)
 // ---------------------------------------------------------------------------
 describe('Swarm permission file polling removed', () => {
  test('useSwarmPermissionPoller hook no longer exists', async () => {
    const content = await file(
      'hooks/useSwarmPermissionPoller.ts',
    ).text()
    // The file-based polling hook must not exist — it read from an
    // unauthenticated resolved/ directory where any local process could
    // forge approval files.
    expect(content).not.toContain('function useSwarmPermissionPoller(')
    // The file-based processResponse must not exist
    expect(content).not.toContain('function processResponse(')
  })
  test('poller does not import from permissionSync', async () => {
    const content = await file(
      'hooks/useSwarmPermissionPoller.ts',
    ).text()
    // Must not import anything from permissionSync — all file-based
    // functions have been removed from this module's dependencies
    expect(content).not.toContain('permissionSync')
  })
  test('file-based permission functions are marked deprecated', async () => {
    const content = await file(
      'utils/swarm/permissionSync.ts',
    ).text()
    // All file-based functions must have @deprecated JSDoc
    const deprecatedFns = [
      'writePermissionRequest',
      'readPendingPermissions',
      'readResolvedPermission',
      'resolvePermission',
      'pollForResponse',
      'removeWorkerResponse',
    ]
    for (const fn of deprecatedFns) {
      // Find the function and check that @deprecated appears before it
      const fnIndex = content.indexOf(`export async function ${fn}(`)
      if (fnIndex === -1) continue // submitPermissionRequest is a const, not async function
      const preceding = content.slice(Math.max(0, fnIndex - 500), fnIndex)
      expect(preceding).toContain('@deprecated')
    }
  })
  test('mailbox-based functions are NOT deprecated', async () => {
    const content = await file(
      'utils/swarm/permissionSync.ts',
    ).text()
    // These are the active path — must not be deprecated
    const activeFns = [
      'sendPermissionRequestViaMailbox',
      'sendPermissionResponseViaMailbox',
    ]
    for (const fn of activeFns) {
      const fnIndex = content.indexOf(`export async function ${fn}(`)
      expect(fnIndex).not.toBe(-1)
      const preceding = content.slice(Math.max(0, fnIndex - 300), fnIndex)
      expect(preceding).not.toContain('@deprecated')
    }
  })
 })
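Several tests above slice a fixed-size window after a function signature located with `indexOf`. If the signature is ever renamed, `indexOf` returns -1 and the slice yields an empty or one-character string, so the assertion fails without saying why. A small helper that fails loudly could make that case easier to diagnose. This is only a sketch; `sourceWindow` is a hypothetical name and is not part of the test file.

```typescript
// Hypothetical helper: return `length` characters of `content` starting at
// `marker`, and throw a descriptive error when the marker is missing.
function sourceWindow(content: string, marker: string, length: number): string {
  const start = content.indexOf(marker)
  if (start === -1) {
    throw new Error(`marker not found in source: ${marker}`)
  }
  return content.slice(start, start + length)
}

// Sample source text for illustration only.
const sampleSource = [
  'export async function gitClone(url: string) {',
  "  const args = ['-c', 'core.hooksPath=/dev/null', 'clone', url]",
  '}',
].join('\n')

const cloneWindow = sourceWindow(sampleSource, 'export async function gitClone(', 2000)

let missingMarkerMessage = ''
try {
  sourceWindow(sampleSource, 'export async function gitPull(', 2000)
} catch (error) {
  missingMarkerMessage = error instanceof Error ? error.message : String(error)
}
```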
--- a/src/commands/benchmark.ts
+++ b/src/commands/benchmark.ts
@@ -0,0 +1,56 @@
 import type { ToolUseContext } from '../Tool.js'
 import type { Command } from '../types/command.js'
 import {
  benchmarkModel,
  benchmarkMultipleModels,
  formatBenchmarkResults,
  isBenchmarkSupported,
 } from '../utils/model/benchmark.js'
 import { getOllamaModelOptions } from '../utils/model/ollamaModels.js'
 async function runBenchmark(
  model?: string,
  context?: ToolUseContext,
 ): Promise<void> {
  if (!isBenchmarkSupported()) {
    context?.stdout?.write(
      'Benchmark not supported for this provider.\n' +
        'Supported: OpenAI-compatible endpoints (Ollama, NVIDIA NIM, MiniMax)\n',
    )
    return
  }
  let modelsToBenchmark: string[]
  if (model) {
    modelsToBenchmark = [model]
  } else {
    const ollamaModels = getOllamaModelOptions()
    modelsToBenchmark = ollamaModels.slice(0, 3).map((m) => m.value)
  }
  context?.stdout?.write(`Benchmarking ${modelsToBenchmark.length} model(s)...\n`)
  const results = await benchmarkMultipleModels(
    modelsToBenchmark,
    (completed, total, result) => {
      context?.stdout?.write(
        `[${completed}/${total}] ${result.model}: ` +
          `${result.success ? result.tokensPerSecond.toFixed(1) + ' tps' : 'FAILED'}\n`,
      )
    },
  )
  context?.stdout?.write('\n' + formatBenchmarkResults(results) + '\n')
 }
 export const benchmark: Command = {
  name: 'benchmark',
  async onExecute(context: ToolUseContext): Promise<void> {
    const args = context.args ?? {}
    const model = args.model as string | undefined
    await runBenchmark(model, context)
  },
 }
--- a/src/commands/provider/provider.tsx
+++ b/src/commands/provider/provider.tsx
@@ -66,10 +66,44 @@ import {
 import {
  getOllamaChatBaseUrl,
  getLocalOpenAICompatibleProviderLabel,
-  hasLocalOllama,
-  listOllamaModels,
+  probeOllamaGenerationReadiness,
+  type OllamaGenerationReadiness,
 } from '../../utils/providerDiscovery.js'
 function describeOllamaReadinessIssue(
  readiness: OllamaGenerationReadiness,
  options?: {
    baseUrl?: string
    allowManualFallback?: boolean
  },
 ): string {
  const endpoint = options?.baseUrl ?? 'http://localhost:11434'
  if (readiness.state === 'unreachable') {
    return `Could not reach Ollama at ${endpoint}. Start Ollama first, then run /provider again.`
  }
  if (readiness.state === 'no_models') {
    const manualSuffix = options?.allowManualFallback
      ? ', or enter details manually'
      : ''
    return `Ollama is running, but no installed models were found. Pull a chat model such as qwen2.5-coder:7b or llama3.1:8b first${manualSuffix}.`
  }
  if (readiness.state === 'generation_failed') {
    const modelHint = readiness.probeModel ?? 'the selected model'
    const detailSuffix = readiness.detail
      ? ` Details: ${readiness.detail}.`
      : ''
    const manualSuffix = options?.allowManualFallback
      ? ' You can also enter details manually.'
      : ''
    return `Ollama is reachable and models are installed, but a generation probe failed for ${modelHint}.${detailSuffix} Run "ollama run ${modelHint}" once and retry.${manualSuffix}`
  }
  return ''
 }
 type ProviderChoice = 'auto' | ProviderProfile | 'codex-oauth' | 'clear'
 type Step =
@@ -715,6 +749,7 @@ function AutoRecommendationStep({
    | {
        state: 'openai'
        defaultModel: string
+        reason: string
      }
    | {
        state: 'error'
@@ -728,19 +763,27 @@ function AutoRecommendationStep({
    void (async () => {
      const defaultModel = getGoalDefaultOpenAIModel(goal)
      try {
-        const ollamaAvailable = await hasLocalOllama()
+        const readiness = await probeOllamaGenerationReadiness()
-        if (!ollamaAvailable) {
+        if (readiness.state !== 'ready') {
          if (!cancelled) {
-            setStatus({ state: 'openai', defaultModel })
+            setStatus({
+              state: 'openai',
+              defaultModel,
+              reason: describeOllamaReadinessIssue(readiness),
+            })
          }
          return
        }
-        const models = await listOllamaModels()
-        const recommended = recommendOllamaModel(models, goal)
+        const recommended = recommendOllamaModel(readiness.models, goal)
        if (!recommended) {
          if (!cancelled) {
-            setStatus({ state: 'openai', defaultModel })
+            setStatus({
+              state: 'openai',
+              defaultModel,
+              reason:
+                'Ollama responded to a generation probe, but no recommended chat model matched this goal.',
+            })
          }
          return
        }
@@ -796,10 +839,10 @@ function AutoRecommendationStep({
      <Dialog title="Auto setup fallback" onCancel={onCancel}>
        <Box flexDirection="column" gap={1}>
          <Text>
-            No viable local Ollama chat model was detected. Auto setup can
-            continue into OpenAI-compatible setup with a default model of{' '}
+            Auto setup can continue into OpenAI-compatible setup with a default model of{' '}
            {status.defaultModel}.
          </Text>
+          <Text dimColor>{status.reason}</Text>
          <Select
            options={[
              { label: 'Continue to OpenAI-compatible setup', value: 'continue' },
@@ -883,32 +926,19 @@ function OllamaModelStep({
    let cancelled = false
    void (async () => {
-      const available = await hasLocalOllama()
+      const readiness = await probeOllamaGenerationReadiness()
-      if (!available) {
+      if (readiness.state !== 'ready') {
        if (!cancelled) {
          setStatus({
            state: 'unavailable',
-            message:
-              'Could not reach Ollama at http://localhost:11434. Start Ollama first, then run /provider again.',
+            message: describeOllamaReadinessIssue(readiness),
          })
        }
        return
      }
-      const models = await listOllamaModels()
-      if (models.length === 0) {
-        if (!cancelled) {
-          setStatus({
-            state: 'unavailable',
-            message:
-              'Ollama is running, but no installed models were found. Pull a chat model such as qwen2.5-coder:7b or llama3.1:8b first.',
-          })
-        }
-        return
-      }
-      const ranked = rankOllamaModels(models, 'balanced')
-      const recommended = recommendOllamaModel(models, 'balanced')
+      const ranked = rankOllamaModels(readiness.models, 'balanced')
+      const recommended = recommendOllamaModel(readiness.models, 'balanced')
      if (!cancelled) {
        setStatus({
          state: 'ready',
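The readiness probe in this file replaces two separate calls (`hasLocalOllama`, `listOllamaModels`) with a single result whose `state` field drives the message shown to the user. A discriminated union with an exhaustive `switch` is one way to keep such a mapping complete as states are added. The sketch below is illustrative only: the `ReadinessSketch` shape is simplified (the mocks in this PR suggest the real type carries `models` on every state), and the messages are shortened.

```typescript
// Hypothetical, simplified shape; not the project's OllamaGenerationReadiness.
type ReadinessSketch =
  | { state: 'ready'; models: string[] }
  | { state: 'unreachable' }
  | { state: 'no_models' }
  | { state: 'generation_failed'; probeModel?: string; detail?: string }

function describeReadinessSketch(
  readiness: ReadinessSketch,
  endpoint = 'http://localhost:11434',
): string {
  switch (readiness.state) {
    case 'ready':
      return ''
    case 'unreachable':
      return `Could not reach Ollama at ${endpoint}.`
    case 'no_models':
      return 'Ollama is running, but no installed models were found.'
    case 'generation_failed': {
      const model = readiness.probeModel ?? 'the selected model'
      const detail = readiness.detail ? ` Details: ${readiness.detail}.` : ''
      return `A generation probe failed for ${model}.${detail}`
    }
    default: {
      // Compile-time exhaustiveness check: adding a state without a case
      // makes this assignment a type error.
      const unhandled: never = readiness
      return unhandled
    }
  }
}

const failedMessage = describeReadinessSketch({
  state: 'generation_failed',
  probeModel: 'llama3.1:8b',
  detail: 'model failed to load',
})
```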
--- a/src/components/ConsoleOAuthFlow.test.tsx
+++ b/src/components/ConsoleOAuthFlow.test.tsx
@@ -112,8 +112,10 @@ test('third-party provider branch opens the first-run provider manager', async (
  )
  expect(output).toContain('Set up provider')
+  // Use alphabetically-early sentinels so they remain visible in the
+  // 13-row test frame after the provider list was sorted A→Z.
  expect(output).toContain('Anthropic')
-  expect(output).toContain('OpenAI')
-  expect(output).toContain('Ollama')
-  expect(output).toContain('LM Studio')
+  expect(output).toContain('Azure OpenAI')
+  expect(output).toContain('DeepSeek')
+  expect(output).toContain('Google Gemini')
 })
--- a/src/components/ProviderManager.test.tsx
+++ b/src/components/ProviderManager.test.tsx
@@ -97,6 +97,47 @@ async function waitForCondition(
  throw new Error('Timed out waiting for ProviderManager test condition')
 }
 // Provider list is sorted alphabetically by label in the preset picker, so
 // reaching a given provider takes more keypresses than it used to. Keep the
 // target-by-label indirection here so these tests survive future list edits
 // without further churn.
 //
 // Order matches ProviderManager.renderPresetSelection() when
 // canUseCodexOAuth === true (default in mocked tests).
 const PRESET_ORDER = [
  'Alibaba Coding Plan',
  'Alibaba Coding Plan (China)',
  'Anthropic',
  'Atomic Chat',
  'Azure OpenAI',
  'Codex OAuth',
  'DeepSeek',
  'Google Gemini',
  'Groq',
  'LM Studio',
  'MiniMax',
  'Mistral',
  'Moonshot AI',
  'NVIDIA NIM',
  'Ollama',
  'OpenAI',
  'OpenRouter',
  'Together AI',
  'Custom',
 ] as const
 async function navigateToPreset(
  stdin: { write: (data: string) => void },
  label: (typeof PRESET_ORDER)[number],
 ): Promise<void> {
  const index = PRESET_ORDER.indexOf(label)
  if (index < 0) throw new Error(`Unknown preset label: ${label}`)
  for (let i = 0; i < index; i++) {
    stdin.write('j')
    await Bun.sleep(25)
  }
 }
 function createDeferred<T>(): {
  promise: Promise<T>
  resolve: (value: T) => void
@@ -149,17 +190,21 @@ function mockProviderManagerDependencies(
    applySavedProfileToCurrentSession?: (...args: unknown[]) => Promise<string | null>
    clearCodexCredentials?: () => { success: boolean; warning?: string }
    getProviderProfiles?: () => unknown[]
-    hasLocalOllama?: () => Promise<boolean>
-    listOllamaModels?: () => Promise<
-      Array<{
-        name: string
-        sizeBytes?: number | null
-        family?: string | null
-        families?: string[]
-        parameterSize?: string | null
-        quantizationLevel?: string | null
-      }>
-    >
+    probeOllamaGenerationReadiness?: () => Promise<{
+      state: 'ready' | 'unreachable' | 'no_models' | 'generation_failed'
+      models: Array<
+        {
+          name: string
+          sizeBytes?: number | null
+          family?: string | null
+          families?: string[]
+          parameterSize?: string | null
+          quantizationLevel?: string | null
+        }
+      >
+      probeModel?: string
+      detail?: string
+    }>
    codexSyncRead?: () => unknown
    codexAsyncRead?: () => Promise<unknown>
    updateProviderProfile?: (...args: unknown[]) => unknown
@@ -189,8 +234,12 @@ function mockProviderManagerDependencies(
  })
  mock.module('../utils/providerDiscovery.js', () => ({
-    hasLocalOllama: options?.hasLocalOllama ?? (async () => false),
-    listOllamaModels: options?.listOllamaModels ?? (async () => []),
+    probeOllamaGenerationReadiness:
+      options?.probeOllamaGenerationReadiness ??
+      (async () => ({
+        state: 'unreachable' as const,
+        models: [],
+      })),
  }))
  mock.module('../utils/githubModelsCredentials.js', () => ({
@@ -455,19 +504,22 @@ test('ProviderManager first-run Ollama preset auto-detects installed models', as
    async () => undefined,
    {
      addProviderProfile,
-      hasLocalOllama: async () => true,
-      listOllamaModels: async () => [
-        {
-          name: 'gemma4:31b-cloud',
-          family: 'gemma',
-          parameterSize: '31b',
-        },
-        {
-          name: 'kimi-k2.5:cloud',
-          family: 'kimi',
-          parameterSize: '2.5b',
-        },
-      ],
+      probeOllamaGenerationReadiness: async () => ({
+        state: 'ready',
+        models: [
+          {
+            name: 'gemma4:31b-cloud',
+            family: 'gemma',
+            parameterSize: '31b',
+          },
+          {
+            name: 'kimi-k2.5:cloud',
+            family: 'kimi',
+            parameterSize: '2.5b',
+          },
+        ],
+        probeModel: 'gemma4:31b-cloud',
+      }),
    },
  )
@@ -480,11 +532,10 @@ test('ProviderManager first-run Ollama preset auto-detects installed models', as
  await waitForFrameOutput(
    mounted.getOutput,
-    frame => frame.includes('Set up provider') && frame.includes('Ollama'),
+    frame => frame.includes('Set up provider'),
  )
-  mounted.stdin.write('j')
+  await navigateToPreset(mounted.stdin, 'Ollama')
  await Bun.sleep(50)
  mounted.stdin.write('\r')
  const modelFrame = await waitForFrameOutput(
@@ -579,12 +630,7 @@ test('ProviderManager first-run Codex OAuth switches the current session after l
    frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
  )
-  mounted.stdin.write('j')
-  await Bun.sleep(25)
-  mounted.stdin.write('j')
-  await Bun.sleep(25)
-  mounted.stdin.write('j')
-  await Bun.sleep(25)
+  await navigateToPreset(mounted.stdin, 'Codex OAuth')
  mounted.stdin.write('\r')
  await waitForCondition(() => onDone.mock.calls.length > 0)
@@ -676,12 +722,7 @@ test('ProviderManager first-run Codex OAuth reports next-startup fallback when s
    frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
  )
-  mounted.stdin.write('j')
-  await Bun.sleep(25)
-  mounted.stdin.write('j')
-  await Bun.sleep(25)
-  mounted.stdin.write('j')
-  await Bun.sleep(25)
+  await navigateToPreset(mounted.stdin, 'Codex OAuth')
  mounted.stdin.write('\r')
  await waitForCondition(() => onDone.mock.calls.length > 0)
@@ -775,12 +816,7 @@ test('ProviderManager does not hijack a manual Codex profile when OAuth credenti
    frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
  )
-  mounted.stdin.write('j')
-  await Bun.sleep(25)
-  mounted.stdin.write('j')
-  await Bun.sleep(25)
-  mounted.stdin.write('j')
-  await Bun.sleep(25)
+  await navigateToPreset(mounted.stdin, 'Codex OAuth')
  mounted.stdin.write('\r')
  await waitForCondition(() => onDone.mock.calls.length > 0)
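The `navigateToPreset` helper above turns a preset label into a number of `j` keypresses by looking up its position in `PRESET_ORDER`. The core of that idea can be shown without the stdin and sleep plumbing. This is a sketch with hypothetical names (`keysToReach`, `SAMPLE_ORDER`), and the list is only the first six labels copied from `PRESET_ORDER`.

```typescript
// Hypothetical sketch: compute the keypresses needed to reach a label in a
// top-to-bottom picker where 'j' moves the cursor down one row.
const SAMPLE_ORDER = [
  'Alibaba Coding Plan',
  'Alibaba Coding Plan (China)',
  'Anthropic',
  'Atomic Chat',
  'Azure OpenAI',
  'Codex OAuth',
] as const

function keysToReach<T extends readonly string[]>(order: T, label: T[number]): string {
  const index = order.indexOf(label)
  if (index < 0) throw new Error(`Unknown preset label: ${label}`)
  // The cursor starts on row 0, so row N needs N presses.
  return 'j'.repeat(index)
}

const codexKeys = keysToReach(SAMPLE_ORDER, 'Codex OAuth')
const firstKeys = keysToReach(SAMPLE_ORDER, 'Alibaba Coding Plan')
```

Typing the label parameter as `T[number]` means a misspelled label is rejected at compile time, while the runtime check covers callers that bypass the types.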
--- a/src/components/ProviderManager.tsx
+++ b/src/components/ProviderManager.tsx
@@ -37,13 +37,16 @@ import {
  readGithubModelsTokenAsync,
 } from '../utils/githubModelsCredentials.js'
 import {
-  hasLocalOllama,
-  listOllamaModels,
+  probeAtomicChatReadiness,
+  probeOllamaGenerationReadiness,
+  type AtomicChatReadiness,
+  type OllamaGenerationReadiness,
 } from '../utils/providerDiscovery.js'
 import {
  rankOllamaModels,
  recommendOllamaModel,
 } from '../utils/providerRecommendation.js'
+import { redactUrlForDisplay } from '../utils/urlRedaction.js'
 import { updateSettingsForSource } from '../utils/settings/settings.js'
 import {
  type OptionWithDescription,
@@ -52,7 +55,6 @@ import {
 import { Pane } from './design-system/Pane.js'
 import TextInput from './TextInput.js'
 import { useCodexOAuthFlow } from './useCodexOAuthFlow.js'
-import { useSetAppState } from '../state/AppState.js'
 export type ProviderManagerResult = {
  action: 'saved' | 'cancelled'
@@ -69,6 +71,7 @@ type Screen =
  | 'menu'
  | 'select-preset'
  | 'select-ollama-model'
+  | 'select-atomic-chat-model'
  | 'codex-oauth'
  | 'form'
  | 'select-active'
@@ -89,6 +92,16 @@ type OllamaSelectionState =
    }
  | { state: 'unavailable'; message: string }
 type AtomicChatSelectionState =
  | { state: 'idle' }
  | { state: 'loading' }
  | {
      state: 'ready'
      options: OptionWithDescription<string>[]
      defaultValue?: string
    }
  | { state: 'unavailable'; message: string }
 const FORM_STEPS: Array<{
  key: DraftField
  label: string
@@ -222,6 +235,44 @@ function getGithubProviderSummary(
  return `github-models · ${GITHUB_PROVIDER_DEFAULT_BASE_URL} · ${getGithubProviderModel(processEnv)} · ${credentialSummary}${activeSuffix}`
 }
 function describeAtomicChatSelectionIssue(
  readiness: AtomicChatReadiness,
  baseUrl: string,
 ): string {
  if (readiness.state === 'unreachable') {
    return `Could not reach Atomic Chat at ${redactUrlForDisplay(baseUrl)}. Start the Atomic Chat app first, or enter the endpoint manually.`
  }
  if (readiness.state === 'no_models') {
    return 'Atomic Chat is running, but no models are loaded. Download and load a model inside the Atomic Chat app first, or enter details manually.'
  }
  return ''
 }
 function describeOllamaSelectionIssue(
  readiness: OllamaGenerationReadiness,
  baseUrl: string,
 ): string {
  if (readiness.state === 'unreachable') {
    return `Could not reach Ollama at ${redactUrlForDisplay(baseUrl)}. Start Ollama first, or enter the endpoint manually.`
  }
  if (readiness.state === 'no_models') {
    return 'Ollama is running, but no installed models were found. Pull a chat model such as qwen2.5-coder:7b or llama3.1:8b first, or enter details manually.'
  }
  if (readiness.state === 'generation_failed') {
    const modelHint = readiness.probeModel ?? 'the selected model'
    const detailSuffix = readiness.detail
      ? ` Details: ${readiness.detail}.`
      : ''
    return `Ollama is reachable and models are installed, but a generation probe failed for ${modelHint}.${detailSuffix} Run "ollama run ${modelHint}" once and retry, or enter details manually.`
  }
  return ''
 }
 function findCodexOAuthProfile(
  profiles: ProviderProfile[],
  profileId?: string,
@@ -333,10 +384,12 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
  const initialIsGithubActive = isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB)
  const initialHasGithubCredential = initialGithubCredentialSource !== 'none'
-  const [profiles, setProfiles] = React.useState(() => getProviderProfiles())
-  const [activeProfileId, setActiveProfileId] = React.useState(
-    () => getActiveProviderProfile()?.id,
-  )
+  // Deferred initialization: useState initializers run synchronously during
+  // render, so getProviderProfiles() and getActiveProviderProfile() would block
+  // the UI on first mount (sync file I/O). Use empty initial values and load
+  // asynchronously in useEffect with queueMicrotask to keep UI responsive.
+  const [profiles, setProfiles] = React.useState<ProviderProfile[]>([])
+  const [activeProfileId, setActiveProfileId] = React.useState<string | undefined>()
  const [githubProviderAvailable, setGithubProviderAvailable] = React.useState(
    () => isGithubProviderAvailable(initialGithubCredentialSource),
  )
@@ -370,11 +423,88 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
  const [ollamaSelection, setOllamaSelection] = React.useState<OllamaSelectionState>({
    state: 'idle',
  })
  const [atomicChatSelection, setAtomicChatSelection] =
    React.useState<AtomicChatSelectionState>({ state: 'idle' })
  // Deferred initialization: useState initializers run synchronously during
  // render, so getProviderProfiles() and getActiveProviderProfile() would block
  // the UI (sync file I/O). Defer to queueMicrotask after first render.
  // In test environment, skip defer to avoid timing issues with mocks.
  const [isInitializing, setIsInitializing] = React.useState(
    process.env.NODE_ENV !== 'test',
  )
  const [isActivating, setIsActivating] = React.useState(false)
  const isRefreshingRef = React.useRef(false)
  React.useEffect(() => {
    // Skip deferred initialization in test environment (mocks are synchronous)
    if (process.env.NODE_ENV === 'test') {
      setProfiles(getProviderProfiles())
      setActiveProfileId(getActiveProviderProfile()?.id)
      setIsInitializing(false)
      return
    }
    queueMicrotask(() => {
      const profilesData = getProviderProfiles()
      const activeId = getActiveProviderProfile()?.id
      setProfiles(profilesData)
      setActiveProfileId(activeId)
      setIsInitializing(false)
    })
  }, [])
  const currentStep = FORM_STEPS[formStepIndex] ?? FORM_STEPS[0]
  const currentStepKey = currentStep.key
  const currentValue = draft[currentStepKey]
  // Memoize menu options to prevent unnecessary re-renders when navigating
  // the select menu. Without this, each arrow key press creates a new options
  // array reference, causing Select to re-render and feel sluggish.
  const hasProfiles = profiles.length > 0
  const hasSelectableProviders = hasProfiles || githubProviderAvailable
  const menuOptions = React.useMemo(
    () => [
      {
        value: 'add',
        label: 'Add provider',
        description: 'Create a new provider profile',
      },
      {
        value: 'activate',
        label: 'Set active provider',
        description: 'Switch the active provider profile',
        disabled: !hasSelectableProviders,
      },
      {
        value: 'edit',
        label: 'Edit provider',
        description: 'Update URL, model, or key',
        disabled: !hasProfiles,
      },
      {
        value: 'delete',
        label: 'Delete provider',
        description: 'Remove a provider profile',
        disabled: !hasSelectableProviders,
      },
      ...(hasStoredCodexOAuthCredentials
        ? [
            {
              value: 'logout-codex-oauth',
              label: 'Log out Codex OAuth',
              description: 'Clear securely stored Codex OAuth credentials',
            },
          ]
        : []),
      {
        value: 'done',
        label: 'Done',
        description: 'Return to chat',
      },
    ],
    [hasSelectableProviders, hasProfiles, hasStoredCodexOAuthCredentials],
  )
  const refreshGithubProviderState = React.useCallback((): void => {
    const envCredentialSource = getGithubCredentialSourceFromEnv()
    const githubActive = isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB)
@@ -450,32 +580,21 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
    setOllamaSelection({ state: 'loading' })
    void (async () => {
-      const available = await hasLocalOllama(draft.baseUrl)
-      if (!available) {
+      const readiness = await probeOllamaGenerationReadiness({
+        baseUrl: draft.baseUrl,
+      })
+      if (readiness.state !== 'ready') {
        if (!cancelled) {
          setOllamaSelection({
            state: 'unavailable',
-            message:
-              'Could not reach Ollama. Start Ollama first, or enter the endpoint manually.',
+            message: describeOllamaSelectionIssue(readiness, draft.baseUrl),
          })
        }
        return
      }
-      const models = await listOllamaModels(draft.baseUrl)
-      if (models.length === 0) {
-        if (!cancelled) {
-          setOllamaSelection({
-            state: 'unavailable',
-            message:
-              'Ollama is running, but no installed models were found. Pull a chat model such as qwen2.5-coder:7b or llama3.1:8b first, or enter details manually.',
-          })
-        }
-        return
-      }
-      const ranked = rankOllamaModels(models, 'balanced')
-      const recommended = recommendOllamaModel(models, 'balanced')
+      const ranked = rankOllamaModels(readiness.models, 'balanced')
+      const recommended = recommendOllamaModel(readiness.models, 'balanced')
      if (!cancelled) {
        setOllamaSelection({
          state: 'ready',
@@ -494,12 +613,61 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
    }
  }, [draft.baseUrl, screen])
  React.useEffect(() => {
    if (screen !== 'select-atomic-chat-model') {
      return
    }
    let cancelled = false
    setAtomicChatSelection({ state: 'loading' })
    void (async () => {
      const readiness = await probeAtomicChatReadiness({
        baseUrl: draft.baseUrl,
      })
      if (readiness.state !== 'ready') {
        if (!cancelled) {
          setAtomicChatSelection({
            state: 'unavailable',
            message: describeAtomicChatSelectionIssue(readiness, draft.baseUrl),
          })
        }
        return
      }
      if (!cancelled) {
        setAtomicChatSelection({
          state: 'ready',
          defaultValue: readiness.models[0],
          options: readiness.models.map(model => ({
            label: model,
            value: model,
          })),
        })
      }
    })()
    return () => {
      cancelled = true
    }
  }, [draft.baseUrl, screen])
  function refreshProfiles(): void {
-    const nextProfiles = getProviderProfiles()
-    setProfiles(nextProfiles)
-    setActiveProfileId(getActiveProviderProfile()?.id)
-    refreshGithubProviderState()
-    refreshCodexOAuthCredentialState()
+    // Defer sync I/O to next microtask to prevent UI freeze.
+    // getProviderProfiles() and getActiveProviderProfile() read config files
+    // synchronously, which can block the main thread on Windows (antivirus, disk cache).
+    // queueMicrotask ensures the current render completes first.
+    if (isRefreshingRef.current) return
+    isRefreshingRef.current = true
+    queueMicrotask(() => {
+      const nextProfiles = getProviderProfiles()
+      setProfiles(nextProfiles)
+      setActiveProfileId(getActiveProviderProfile()?.id)
+      refreshGithubProviderState()
+      refreshCodexOAuthCredentialState()
+      isRefreshingRef.current = false
+    })
  }
  function clearStartupProviderOverrideFromUserSettings(): string | null {
@@ -572,12 +740,24 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
  async function activateSelectedProvider(profileId: string): Promise<void> {
    let providerLabel = 'provider'
    // Set loading state before sync I/O to keep UI responsive
    setIsActivating(true)
    setStatusMessage('Activating provider...')
    try {
      // Defer sync I/O to next microtask - UI renders loading state first.
      // setActiveProviderProfile(), activateGithubProvider(), and
      // clearStartupProviderOverrideFromUserSettings() all perform sync file writes
      // (saveGlobalConfig, saveProfileFile, updateSettingsForSource) which can
      // block the main thread on Windows (antivirus, disk cache, NTFS metadata).
      await new Promise<void>(resolve => queueMicrotask(resolve))
      if (profileId === GITHUB_PROVIDER_ID) {
        providerLabel = GITHUB_PROVIDER_LABEL
        const githubError = activateGithubProvider()
        if (githubError) {
          setErrorMessage(`Could not activate GitHub provider: ${githubError}`)
          setIsActivating(false)
          returnToMenu()
          return
        }
@@ -593,6 +773,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
          mainLoopModel: GITHUB_PROVIDER_DEFAULT_MODEL,
        }))
        setStatusMessage(`Active provider: ${GITHUB_PROVIDER_LABEL}`)
        setIsActivating(false)
        returnToMenu()
        return
      }
@@ -600,6 +781,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
      const active = setActiveProviderProfile(profileId)
      if (!active) {
        setErrorMessage('Could not change active provider.')
        setIsActivating(false)
        returnToMenu()
        return
      }
@@ -647,10 +829,12 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
            ? `Active provider: ${active.name}. Warning: could not clear startup provider override (${settingsOverrideError}).`
            : `Active provider: ${active.name}`,
      )
      setIsActivating(false)
      returnToMenu()
    } catch (error) {
      refreshProfiles()
      setStatusMessage(undefined)
      setIsActivating(false)
      const detail = error instanceof Error ? error.message : String(error)
      setErrorMessage(`Could not finish activating ${providerLabel}: ${detail}`)
      returnToMenu()
@@ -774,6 +958,12 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
      return
    }
    if (preset === 'atomic-chat') {
      setAtomicChatSelection({ state: 'loading' })
      setScreen('select-atomic-chat-model')
      return
    }
    setScreen('form')
  }
@@ -849,6 +1039,86 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
    returnToMenu()
  }
  function renderAtomicChatSelection(): React.ReactNode {
    if (
      atomicChatSelection.state === 'loading' ||
      atomicChatSelection.state === 'idle'
    ) {
      return (
        <Box flexDirection="column" gap={1}>
          <Text color="remember" bold>
            Checking Atomic Chat
          </Text>
          <Text dimColor>Looking for loaded Atomic Chat models...</Text>
        </Box>
      )
    }
    if (atomicChatSelection.state === 'unavailable') {
      return (
        <Box flexDirection="column" gap={1}>
          <Text color="remember" bold>
            Atomic Chat setup
          </Text>
          <Text dimColor>{atomicChatSelection.message}</Text>
          <Select
            options={[
              {
                value: 'manual',
                label: 'Enter manually',
                description: 'Fill in the base URL and model yourself',
              },
              {
                value: 'back',
                label: 'Back',
                description: 'Choose another provider preset',
              },
            ]}
            onChange={(value: string) => {
              if (value === 'manual') {
                setFormStepIndex(0)
                setCursorOffset(draft.name.length)
                setScreen('form')
                return
              }
              setScreen('select-preset')
            }}
            onCancel={() => setScreen('select-preset')}
            visibleOptionCount={2}
          />
        </Box>
      )
    }
    return (
      <Box flexDirection="column" gap={1}>
        <Text color="remember" bold>
          Choose an Atomic Chat model
        </Text>
        <Text dimColor>
          Pick one of the models loaded in Atomic Chat to save into a local
          provider profile.
        </Text>
        <Select
          options={atomicChatSelection.options}
          defaultValue={atomicChatSelection.defaultValue}
          defaultFocusValue={atomicChatSelection.defaultValue}
          inlineDescriptions
          visibleOptionCount={Math.min(8, atomicChatSelection.options.length)}
          onChange={(value: string) => {
            const nextDraft = {
              ...draft,
              model: value,
            }
            setDraft(nextDraft)
            persistDraft(nextDraft)
          }}
          onCancel={() => setScreen('select-preset')}
        />
      </Box>
    )
  }
  function renderOllamaSelection(): React.ReactNode {
    if (ollamaSelection.state === 'loading' || ollamaSelection.state === 'idle') {
      return (
@@ -979,21 +1249,35 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
  function renderPresetSelection(): React.ReactNode {
    const canUseCodexOAuth = !isBareMode()
    // Providers sorted alphabetically by label. `Custom` is pinned to the end
    // because it's the catch-all / escape hatch — users scanning the list
    // should always find known providers first. `Skip for now` (first-run
    // only) comes last, after Custom.
    const options = [
      {
        value: 'dashscope-intl',
        label: 'Alibaba Coding Plan',
        description: 'Alibaba DashScope International endpoint',
      },
      {
        value: 'dashscope-cn',
        label: 'Alibaba Coding Plan (China)',
        description: 'Alibaba DashScope China endpoint',
      },
      {
        value: 'anthropic',
        label: 'Anthropic',
        description: 'Native Claude API (x-api-key auth)',
      },
      {
-        value: 'ollama',
-        label: 'Ollama',
-        description: 'Local or remote Ollama endpoint',
+        value: 'atomic-chat',
+        label: 'Atomic Chat',
+        description: 'Local Model Provider',
      },
      {
-        value: 'openai',
-        label: 'OpenAI',
-        description: 'OpenAI API with API key',
+        value: 'azure-openai',
+        label: 'Azure OpenAI',
+        description: 'Azure OpenAI endpoint (model=deployment name)',
      },
      ...(canUseCodexOAuth
        ? [
@@ -1005,11 +1289,6 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
            },
          ]
        : []),
      {
        value: 'moonshotai',
        label: 'Moonshot AI',
        description: 'Kimi OpenAI-compatible endpoint',
      },
      {
        value: 'deepseek',
        label: 'DeepSeek',
@@ -1020,50 +1299,30 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
        label: 'Google Gemini',
        description: 'Gemini OpenAI-compatible endpoint',
      },
      {
        value: 'together',
        label: 'Together AI',
        description: 'Together chat/completions endpoint',
      },
      {
        value: 'groq',
        label: 'Groq',
        description: 'Groq OpenAI-compatible endpoint',
      },
      {
        value: 'mistral',
        label: 'Mistral',
        description: 'Mistral OpenAI-compatible endpoint',
      },
      {
        value: 'azure-openai',
        label: 'Azure OpenAI',
        description: 'Azure OpenAI endpoint (model=deployment name)',
      },
      {
        value: 'openrouter',
        label: 'OpenRouter',
        description: 'OpenRouter OpenAI-compatible endpoint',
      },
      {
        value: 'lmstudio',
        label: 'LM Studio',
        description: 'Local LM Studio endpoint',
      },
      {
-        value: 'dashscope-cn',
-        label: 'Alibaba Coding Plan (China)',
-        description: 'Alibaba DashScope China endpoint',
+        value: 'minimax',
+        label: 'MiniMax',
+        description: 'MiniMax API endpoint',
      },
      {
-        value: 'dashscope-intl',
-        label: 'Alibaba Coding Plan',
-        description: 'Alibaba DashScope International endpoint',
+        value: 'mistral',
+        label: 'Mistral',
+        description: 'Mistral OpenAI-compatible endpoint',
      },
      {
-        value: 'custom',
-        label: 'Custom',
-        description: 'Any OpenAI-compatible provider',
+        value: 'moonshotai',
+        label: 'Moonshot AI',
+        description: 'Kimi OpenAI-compatible endpoint',
      },
      {
        value: 'nvidia-nim',
@@ -1071,9 +1330,29 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
        description: 'NVIDIA NIM endpoint',
      },
      {
-        value: 'minimax',
-        label: 'MiniMax',
-        description: 'MiniMax API endpoint',
+        value: 'ollama',
+        label: 'Ollama',
+        description: 'Local or remote Ollama endpoint',
      },
      {
        value: 'openai',
        label: 'OpenAI',
        description: 'OpenAI API with API key',
      },
      {
        value: 'openrouter',
        label: 'OpenRouter',
        description: 'OpenRouter OpenAI-compatible endpoint',
      },
      {
        value: 'together',
        label: 'Together AI',
        description: 'Together chat/completions endpoint',
      },
      {
        value: 'custom',
        label: 'Custom',
        description: 'Any OpenAI-compatible provider',
      },
      ...(mode === 'first-run'
        ? [
@@ -1165,49 +1444,10 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
  }
  function renderMenu(): React.ReactNode {
+    // Use memoized menuOptions from component scope
-    const hasProfiles = profiles.length > 0
-    const hasSelectableProviders = hasProfiles || githubProviderAvailable
-    const options = [
-      {
-        value: 'add',
-        label: 'Add provider',
-        description: 'Create a new provider profile',
-      },
-      {
-        value: 'activate',
-        label: 'Set active provider',
-        description: 'Switch the active provider profile',
-        disabled: !hasSelectableProviders,
-      },
-      {
-        value: 'edit',
-        label: 'Edit provider',
-        description: 'Update URL, model, or key',
-        disabled: !hasProfiles,
-      },
-      {
-        value: 'delete',
-        label: 'Delete provider',
-        description: 'Remove a provider profile',
-        disabled: !hasSelectableProviders,
-      },
-      ...(hasStoredCodexOAuthCredentials
-        ? [
-            {
-              value: 'logout-codex-oauth',
-              label: 'Log out Codex OAuth',
-              description: 'Clear securely stored Codex OAuth credentials',
-            },
-          ]
-        : []),
-      {
-        value: 'done',
-        label: 'Done',
-        description: 'Return to chat',
-      },
-    ]
    return (
      <Box flexDirection="column" gap={1}>
        <Text color="remember" bold>
@@ -1244,7 +1484,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
          )}
        </Box>
        <Select
-          options={options}
+          options={menuOptions}
          onChange={(value: string) => {
            setErrorMessage(undefined)
            switch (value) {
@@ -1257,7 +1497,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
                }
                break
              case 'edit':
-                if (profiles.length > 0) {
+                if (hasProfiles) {
                  setScreen('select-edit')
                }
                break
@@ -1314,7 +1554,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
          }}
          onCancel={() => closeWithCancelled('Provider manager closed')}
          defaultFocusValue={menuFocusValue}
-          visibleOptionCount={options.length}
+          visibleOptionCount={menuOptions.length}
        />
      </Box>
    )
@@ -1393,6 +1633,9 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
    case 'select-ollama-model':
      content = renderOllamaSelection()
      break
    case 'select-atomic-chat-model':
      content = renderAtomicChatSelection()
      break
    case 'codex-oauth':
      content = (
        <CodexOAuthSetup
@@ -1550,5 +1793,21 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
      break
  }
-  return <Pane color="permission">{content}</Pane>
+  return (
+    <Pane color="permission">
+      {isInitializing ? (
+        <Box flexDirection="column" gap={1}>
+          <Text color="remember" bold>Loading providers...</Text>
+          <Text dimColor>Reading provider profiles from disk.</Text>
+        </Box>
+      ) : isActivating ? (
+        <Box flexDirection="column" gap={1}>
+          <Text color="remember" bold>Activating provider...</Text>
+          <Text dimColor>Please wait while the provider is being configured.</Text>
+        </Box>
+      ) : (
+        content
+      )}
+    </Pane>
+  )
 }
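Reviewer note on the `refreshProfiles` change in this file: the guard-plus-deferral pattern can be sketched in isolation as below. This is a minimal sketch, not the component's code. `createGuardedRefresh` and its injected `schedule` parameter are hypothetical names introduced here so the coalescing behaviour can be observed without React, and the `try/finally` reset is an addition of this sketch that the diff does not contain.

```typescript
// Hypothetical helper, not part of the diff. The scheduler is injected so the
// behaviour can be exercised with a manual queue instead of queueMicrotask.
type Schedule = (task: () => void) => void

function createGuardedRefresh(
  schedule: Schedule,
  work: () => void,
): () => void {
  // Plays the role of isRefreshingRef.current in the component.
  let isRefreshing = false

  return function refresh(): void {
    // A refresh is already queued, so this call is dropped.
    if (isRefreshing) return
    isRefreshing = true
    schedule(() => {
      try {
        work()
      } finally {
        // Assumption of this sketch: reset the flag even if work() throws,
        // so a failed refresh does not block later ones.
        isRefreshing = false
      }
    })
  }
}

// Usage: a manual queue stands in for queueMicrotask.
const queue: Array<() => void> = []
let runs = 0
const refresh = createGuardedRefresh(
  task => {
    queue.push(task)
  },
  () => {
    runs += 1
  },
)

refresh()
refresh() // coalesced with the first call: only one task is queued
```

In the component the scheduler would be `queueMicrotask`; passing it in as a parameter is only a testing convenience of this sketch.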
--- a/src/components/Settings/Config.tsx
+++ b/src/components/Settings/Config.tsx
@@ -281,6 +281,24 @@ export function Config({
        enabled: autoCompactEnabled
      });
    }
  }, {
    id: 'toolHistoryCompressionEnabled',
    label: 'Tool history compression',
    value: globalConfig.toolHistoryCompressionEnabled,
    type: 'boolean' as const,
    onChange(toolHistoryCompressionEnabled: boolean) {
      saveGlobalConfig(current => ({
        ...current,
        toolHistoryCompressionEnabled
      }));
      setGlobalConfig({
        ...getGlobalConfig(),
        toolHistoryCompressionEnabled
      });
      logEvent('tengu_tool_history_compression_setting_changed', {
        enabled: toolHistoryCompressionEnabled
      });
    }
  }, {
    id: 'spinnerTipsEnabled',
    label: 'Show tips',
@@ -1158,6 +1176,9 @@ export function Config({
    if (globalConfig.autoCompactEnabled !== initialConfig.current.autoCompactEnabled) {
      formattedChanges.push(`${globalConfig.autoCompactEnabled ? 'Enabled' : 'Disabled'} auto-compact`);
    }
    if (globalConfig.toolHistoryCompressionEnabled !== initialConfig.current.toolHistoryCompressionEnabled) {
      formattedChanges.push(`${globalConfig.toolHistoryCompressionEnabled ? 'Enabled' : 'Disabled'} tool history compression`);
    }
    if (globalConfig.respectGitignore !== initialConfig.current.respectGitignore) {
      formattedChanges.push(`${globalConfig.respectGitignore ? 'Enabled' : 'Disabled'} respect .gitignore in file picker`);
    }
--- a/src/components/StartupScreen.ts
+++ b/src/components/StartupScreen.ts
@@ -123,6 +123,8 @@ function detectProvider(): { name: string; model: string; baseUrl: string; isLoc
      name = 'MiniMax'
    else if (resolvedRequest.transport === 'codex_responses' || baseUrl.includes('chatgpt.com/backend-api/codex'))
      name = 'Codex'
    else if (/moonshot/i.test(baseUrl) || /kimi/i.test(rawModel))
      name = 'Moonshot (Kimi)'
    else if (/deepseek/i.test(baseUrl) || /deepseek/i.test(rawModel))
      name = 'DeepSeek'
    else if (/openrouter/i.test(baseUrl))
--- a/src/components/memory/memoryFileSelectorPaths.test.ts
+++ b/src/components/memory/memoryFileSelectorPaths.test.ts
@@ -53,17 +53,20 @@ describe('getProjectMemoryPathForSelector', () => {
  })
  test('defaults to a new AGENTS.md in the current cwd when no project file is loaded', () => {
-    expect(getProjectMemoryPathForSelector([], '/repo/packages/app')).toBe(
-      '/repo/packages/app/AGENTS.md',
+    const cwd = join('/repo', 'packages', 'app')
+    expect(getProjectMemoryPathForSelector([], cwd)).toBe(
+      join(cwd, 'AGENTS.md'),
    )
  })
  test('ignores loaded project instruction files outside the current cwd ancestry', () => {
+    const outsideRepoPath = join('/other-worktree', 'AGENTS.md')
+    const cwd = join('/repo', 'packages', 'app')
    expect(
      getProjectMemoryPathForSelector(
-        [projectFile('/other-worktree/AGENTS.md')],
-        '/repo/packages/app',
+        [projectFile(outsideRepoPath)],
+        cwd,
      ),
-    ).toBe('/repo/packages/app/AGENTS.md')
+    ).toBe(join(cwd, 'AGENTS.md'))
  })
 })
--- a/src/constants/prompts.ts
+++ b/src/constants/prompts.ts
@@ -823,6 +823,11 @@ function getFunctionResultClearingSection(model: string): string | null {
    return null
  }
  const config = getCachedMCConfigForFRC()
  if (!config) {
    // External/stub builds return null from getCachedMCConfig — abort the
    // section rather than trying to read .supportedModels off null.
    return null
  }
  const isModelSupported = config.supportedModels?.some(pattern =>
    model.includes(pattern),
  )
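Reviewer note on the `prompts.ts` guard: optional chaining on `config.supportedModels?.some(...)` only covers a missing `supportedModels` array; it does not help when `config` itself is `null`, which is why the explicit early return is needed. A minimal sketch, assuming a hypothetical `MCConfig` shape (only `supportedModels` appears in the diff) and a hypothetical standalone `isModelSupported` helper; the `?? false` fallback is also an assumption of this sketch.

```typescript
// Hypothetical config shape: only `supportedModels` is visible in the diff.
type MCConfig = { supportedModels?: string[] }

// Returns null when no config is available (mirroring the early return in the
// diff), otherwise whether any pattern is a substring of the model name.
function isModelSupported(
  config: MCConfig | null,
  model: string,
): boolean | null {
  if (!config) {
    // Without this check, reading config.supportedModels would throw a
    // TypeError, because `?.` here guards the array, not the config object.
    return null
  }
  return config.supportedModels?.some(pattern => model.includes(pattern)) ?? false
}

const withoutConfig = isModelSupported(null, 'claude-sonnet-4')
const withMatch = isModelSupported({ supportedModels: ['sonnet'] }, 'claude-sonnet-4')
const withoutList = isModelSupported({}, 'claude-sonnet-4')
```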
--- a/src/hooks/useOfficialMarketplaceNotification.tsx
+++ b/src/hooks/useOfficialMarketplaceNotification.tsx
@@ -19,7 +19,7 @@ async function _temp() {
    logForDebugging("Showing marketplace config save failure notification");
    notifs.push({
      key: "marketplace-config-save-failed",
-      jsx: <Text color="error">Failed to save marketplace retry info · Check ~/.claude.json permissions</Text>,
+      jsx: <Text color="error">Failed to save marketplace retry info · Check ~/.openclaude.json permissions</Text>,
      priority: "immediate",
      timeoutMs: 10000
    });
--- a/src/hooks/usePasteHandler.test.ts
+++ b/src/hooks/usePasteHandler.test.ts
@@ -1,5 +1,8 @@
 import { expect, test } from 'bun:test'
-import { supportsClipboardImageFallback } from './usePasteHandler.ts'
+import {
+  shouldHandleInputAsPaste,
+  supportsClipboardImageFallback,
+} from './usePasteHandler.ts'
 test('supports clipboard image fallback on Windows', () => {
  expect(supportsClipboardImageFallback('windows')).toBe(true)
@@ -20,3 +23,42 @@ test('does not support clipboard image fallback on WSL', () => {
 test('does not support clipboard image fallback on unknown platforms', () => {
  expect(supportsClipboardImageFallback('unknown')).toBe(false)
 })
 test('does not treat a bracketed paste as pending when no paste handlers are provided', () => {
  expect(
    shouldHandleInputAsPaste({
      hasTextPasteHandler: false,
      hasImagePasteHandler: false,
      inputLength: 'kimi-k2.5'.length,
      pastePending: false,
      hasImageFilePath: false,
      isFromPaste: true,
    }),
  ).toBe(false)
 })
 test('treats bracketed text paste as pending when a text paste handler exists', () => {
  expect(
    shouldHandleInputAsPaste({
      hasTextPasteHandler: true,
      hasImagePasteHandler: false,
      inputLength: 'kimi-k2.5'.length,
      pastePending: false,
      hasImageFilePath: false,
      isFromPaste: true,
    }),
  ).toBe(true)
 })
 test('treats image path paste as pending when only an image handler exists', () => {
  expect(
    shouldHandleInputAsPaste({
      hasTextPasteHandler: false,
      hasImagePasteHandler: true,
      inputLength: 'C:\\Users\\jat\\image.png'.length,
      pastePending: false,
      hasImageFilePath: true,
      isFromPaste: false,
    }),
  ).toBe(true)
 })
--- a/src/hooks/usePasteHandler.ts
+++ b/src/hooks/usePasteHandler.ts
@@ -35,6 +35,24 @@ type PasteHandlerProps = {
  ) => void
 }
 export function shouldHandleInputAsPaste(options: {
  hasTextPasteHandler: boolean
  hasImagePasteHandler: boolean
  inputLength: number
  pastePending: boolean
  hasImageFilePath: boolean
  isFromPaste: boolean
 }): boolean {
  return (
    (options.hasTextPasteHandler &&
      (options.inputLength > PASTE_THRESHOLD ||
        options.pastePending ||
        options.hasImageFilePath ||
        options.isFromPaste)) ||
    (options.hasImagePasteHandler && options.hasImageFilePath)
  )
 }
 export function usePasteHandler({
  onPaste,
  onInput,
@@ -236,11 +254,6 @@ export function usePasteHandler({
    // The keypress parser sets isPasted=true for content within bracketed paste.
    const isFromPaste = event.keypress.isPasted
    // If this is pasted content, set isPasting state for UI feedback
    if (isFromPaste) {
      setIsPasting(true)
    }
    // Handle large pastes (>PASTE_THRESHOLD chars)
    // Usually we get one or two input characters at a time. If we
    // get more than the threshold, the user has probably pasted.
@@ -268,6 +281,7 @@ export function usePasteHandler({
      canFallbackToClipboardImage &&
      onImagePaste
    ) {
      setIsPasting(true)
      checkClipboardForImage()
      // Reset isPasting since there's no text content to process
      setIsPasting(false)
@@ -275,14 +289,17 @@ export function usePasteHandler({
    }
    // Check if we should handle as paste (from bracketed paste, large input, or continuation)
-    const shouldHandleAsPaste =
-      onPaste &&
-      (input.length > PASTE_THRESHOLD ||
-        pastePendingRef.current ||
-        hasImageFilePath ||
-        isFromPaste)
+    const shouldHandleAsPaste = shouldHandleInputAsPaste({
+      hasTextPasteHandler: Boolean(onPaste),
+      hasImagePasteHandler: Boolean(onImagePaste),
+      inputLength: input.length,
+      pastePending: pastePendingRef.current,
+      hasImageFilePath,
+      isFromPaste,
+    })
    if (shouldHandleAsPaste) {
      setIsPasting(true)
      pastePendingRef.current = true
      setPasteState(({ chunks, timeoutId }) => {
        return {
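Reviewer note on `shouldHandleInputAsPaste`: the extracted predicate has two independent branches, one gated on a text paste handler and one gated on an image paste handler plus an image file path. The sketch below restates the predicate from the diff as a standalone snippet to make the branch behaviour easy to check. `PASTE_THRESHOLD = 800` is a placeholder, since the real constant is not shown in this diff, and `PasteDecisionInput` and `singleKeypress` are names introduced only for this illustration.

```typescript
// Placeholder value: the real PASTE_THRESHOLD lives elsewhere in
// usePasteHandler.ts and is not shown in this diff.
const PASTE_THRESHOLD = 800

type PasteDecisionInput = {
  hasTextPasteHandler: boolean
  hasImagePasteHandler: boolean
  inputLength: number
  pastePending: boolean
  hasImageFilePath: boolean
  isFromPaste: boolean
}

// Same boolean structure as the function added in the diff.
function shouldHandleInputAsPaste(options: PasteDecisionInput): boolean {
  return (
    (options.hasTextPasteHandler &&
      (options.inputLength > PASTE_THRESHOLD ||
        options.pastePending ||
        options.hasImageFilePath ||
        options.isFromPaste)) ||
    (options.hasImagePasteHandler && options.hasImageFilePath)
  )
}

// Baseline: one typed character, no handlers, nothing pending.
const singleKeypress: PasteDecisionInput = {
  hasTextPasteHandler: false,
  hasImagePasteHandler: false,
  inputLength: 1,
  pastePending: false,
  hasImageFilePath: false,
  isFromPaste: false,
}

// Bracketed paste with no handlers registered: neither branch can fire.
const bracketedWithoutHandlers = shouldHandleInputAsPaste({
  ...singleKeypress,
  isFromPaste: true,
})

// Image path with only an image handler: the second branch fires.
const imagePathWithImageHandlerOnly = shouldHandleInputAsPaste({
  ...singleKeypress,
  hasImagePasteHandler: true,
  hasImageFilePath: true,
})

// Long plain text with only an image handler: length alone is not enough.
const longInputWithImageHandlerOnly = shouldHandleInputAsPaste({
  ...singleKeypress,
  hasImagePasteHandler: true,
  inputLength: PASTE_THRESHOLD + 1,
})

// Long plain text with a text handler: the first branch fires.
const longInputWithTextHandler = shouldHandleInputAsPaste({
  ...singleKeypress,
  hasTextPasteHandler: true,
  inputLength: PASTE_THRESHOLD + 1,
})
```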
--- a/src/hooks/useSwarmPermissionPoller.ts
+++ b/src/hooks/useSwarmPermissionPoller.ts
@@ -1,34 +1,23 @@
 /**
- * Swarm Permission Poller Hook
+ * Swarm Permission Callback Registry
  *
- * This hook polls for permission responses from the team leader when running
- * as a worker agent in a swarm. When a response is received, it calls the
- * appropriate callback (onAllow/onReject) to continue execution.
+ * Manages callback registrations for permission requests and responses
+ * in agent swarms. Responses are delivered exclusively via the mailbox
+ * system (useInboxPoller → processMailboxPermissionResponse).
  *
- * This hook should be used in conjunction with the worker-side integration
- * in useCanUseTool.ts, which creates pending requests that this hook monitors.
+ * The legacy file-based polling (resolved/ directory) has been removed
+ * because it created an unauthenticated attack surface — any local process
+ * could forge approval files. The mailbox path is the sole active channel.
 */
 import { useCallback, useEffect, useRef } from 'react'
 import { useInterval } from 'usehooks-ts'
 import { logForDebugging } from '../utils/debug.js'
 import { errorMessage } from '../utils/errors.js'
 import {
  type PermissionUpdate,
  permissionUpdateSchema,
 } from '../utils/permissions/PermissionUpdateSchema.js'
 import {
  isSwarmWorker,
  type PermissionResponse,
  pollForResponse,
  removeWorkerResponse,
 } from '../utils/swarm/permissionSync.js'
 import { getAgentName, getTeamName } from '../utils/teammate.js'
 const POLL_INTERVAL_MS = 500
 /**
- * Validate permissionUpdates from external sources (mailbox IPC, disk polling).
+ * Validate permissionUpdates from external sources (mailbox IPC).
 * Malformed entries from buggy/old teammate processes are filtered out rather
 * than propagated unchecked into callback.onAllow().
 */
@@ -225,106 +214,9 @@ export function processSandboxPermissionResponse(params: {
  return true
 }
-/**
- * Process a permission response by invoking the registered callback
- */
-function processResponse(response: PermissionResponse): boolean {
-  const callback = pendingCallbacks.get(response.requestId)
-
-  if (!callback) {
-    logForDebugging(
-      `[SwarmPermissionPoller] No callback registered for request ${response.requestId}`,
-    )
-    return false
-  }
-  logForDebugging(
-    `[SwarmPermissionPoller] Processing response for request ${response.requestId}: ${response.decision}`,
-  )
-  // Remove from registry before invoking callback
-  pendingCallbacks.delete(response.requestId)
-  if (response.decision === 'approved') {
-    const permissionUpdates = parsePermissionUpdates(response.permissionUpdates)
-    const updatedInput = response.updatedInput
-    callback.onAllow(updatedInput, permissionUpdates)
-  } else {
-    callback.onReject(response.feedback)
-  }
-  return true
-}
-/**
- * Hook that polls for permission responses when running as a swarm worker.
- *
- * This hook:
- * 1. Only activates when isSwarmWorker() returns true
- * 2. Polls every 500ms for responses
- * 3. When a response is found, invokes the registered callback
- * 4. Cleans up the response file after processing
- */
-export function useSwarmPermissionPoller(): void {
-  const isProcessingRef = useRef(false)
-  const poll = useCallback(async () => {
-    // Don't poll if not a swarm worker
-    if (!isSwarmWorker()) {
-      return
-    }
-    // Prevent concurrent polling
-    if (isProcessingRef.current) {
-      return
-    }
-    // Don't poll if no callbacks are registered
-    if (pendingCallbacks.size === 0) {
-      return
-    }
-    isProcessingRef.current = true
-    try {
-      const agentName = getAgentName()
-      const teamName = getTeamName()
-      if (!agentName || !teamName) {
-        return
-      }
-      // Check each pending request for a response
-      for (const [requestId, _callback] of pendingCallbacks) {
-        const response = await pollForResponse(requestId, agentName, teamName)
-        if (response) {
-          // Process the response
-          const processed = processResponse(response)
-          if (processed) {
-            // Clean up the response from the worker's inbox
-            await removeWorkerResponse(requestId, agentName, teamName)
-          }
-        }
-      }
-    } catch (error) {
-      logForDebugging(
-        `[SwarmPermissionPoller] Error during poll: ${errorMessage(error)}`,
-      )
-    } finally {
-      isProcessingRef.current = false
-    }
-  }, [])
-  // Only poll if we're a swarm worker
-  const shouldPoll = isSwarmWorker()
-  useInterval(() => void poll(), shouldPoll ? POLL_INTERVAL_MS : null)
-  // Initial poll on mount
-  useEffect(() => {
-    if (isSwarmWorker()) {
-      void poll()
-    }
-  }, [poll])
-}
+// Legacy file-based polling (useSwarmPermissionPoller, processResponse)
+// has been removed. Permission responses are now delivered exclusively
+// via the mailbox system:
+//   Leader: sendPermissionResponseViaMailbox() → writeToMailbox()
+//   Worker: useInboxPoller → processMailboxPermissionResponse()
+// See: fix(security) — remove unauthenticated file-based permission channel
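Reviewer note: the registry pattern that the removed `processResponse` relied on (look up the callback, delete it, then dispatch on the decision) can be sketched in isolation as below. This is a simplified illustration, not the code in this repository: every name ending in `Sketch` is hypothetical, the `'rejected'` decision value is an assumption (the diff only shows the `'approved'` check), and the real callbacks also receive `updatedInput` and parsed permission updates.

```typescript
// Hypothetical, simplified types; the real PermissionResponse carries more fields.
type ResponseSketch = {
  requestId: string
  decision: 'approved' | 'rejected' // 'rejected' is an assumed value
  feedback?: string
}

type CallbacksSketch = {
  onAllow: () => void
  onReject: (feedback?: string) => void
}

// Pending requests keyed by request id.
const pendingSketch = new Map<string, CallbacksSketch>()

function registerCallbackSketch(requestId: string, cb: CallbacksSketch): void {
  pendingSketch.set(requestId, cb)
}

function deliverResponseSketch(response: ResponseSketch): boolean {
  const cb = pendingSketch.get(response.requestId)
  if (!cb) return false // unknown or already-handled request

  // Remove before invoking so a duplicate delivery cannot fire the callback twice.
  pendingSketch.delete(response.requestId)

  if (response.decision === 'approved') {
    cb.onAllow()
  } else {
    cb.onReject(response.feedback)
  }
  return true
}
```

Whichever transport delivers the response (here, the mailbox path), the registry itself stays transport-agnostic, which is presumably why it can outlive the polling hook.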
--- a/src/ink/termio/osc.test.ts
+++ b/src/ink/termio/osc.test.ts
@@ -11,14 +11,16 @@ const execFileNoThrowMock = mock(
  async () => ({ code: 0, stdout: '', stderr: '' }),
 )
-mock.module('../../utils/execFileNoThrow.js', () => ({
-  execFileNoThrow: execFileNoThrowMock,
-  execFileNoThrowWithCwd: execFileNoThrowMock,
-}))
-mock.module('../../utils/tempfile.js', () => ({
-  generateTempFilePath: generateTempFilePathMock,
-}))
+function installOscMocks(): void {
+  mock.module('../../utils/execFileNoThrow.js', () => ({
+    execFileNoThrow: execFileNoThrowMock,
+    execFileNoThrowWithCwd: execFileNoThrowMock,
+  }))
+  mock.module('../../utils/tempfile.js', () => ({
+    generateTempFilePath: generateTempFilePathMock,
+  }))
+}
 async function importFreshOscModule() {
  return import(`./osc.ts?ts=${Date.now()}-${Math.random()}`)
@@ -45,6 +47,7 @@ async function waitForExecCall(
 describe('Windows clipboard fallback', () => {
  beforeEach(() => {
+    installOscMocks()
    execFileNoThrowMock.mockClear()
    generateTempFilePathMock.mockClear()
    process.env = { ...originalEnv }
@@ -62,14 +65,12 @@ describe('Windows clipboard fallback', () => {
    const { setClipboard } = await importFreshOscModule()
    await setClipboard('Привет мир')
-    await flushClipboardCopy()
+    const windowsCall = await waitForExecCall('powershell')
    expect(execFileNoThrowMock.mock.calls.some(([cmd]) => cmd === 'clip')).toBe(
      false,
    )
-    expect(
-      execFileNoThrowMock.mock.calls.some(([cmd]) => cmd === 'powershell'),
-    ).toBe(true)
+    expect(windowsCall).toBeDefined()
  })
  test('passes Windows clipboard text through a UTF-8 temp file instead of stdin', async () => {
@@ -97,6 +98,7 @@ describe('Windows clipboard fallback', () => {
 describe('clipboard path behavior remains stable', () => {
  beforeEach(() => {
+    installOscMocks()
    execFileNoThrowMock.mockClear()
    process.env = { ...originalEnv }
    delete process.env['SSH_CONNECTION']
--- a/src/migrations/resetAutoModeOptInForDefaultOffer.ts
+++ b/src/migrations/resetAutoModeOptInForDefaultOffer.ts
@@ -12,7 +12,7 @@ import {
 * One-shot migration: clear skipAutoPermissionPrompt for users who accepted
 * the old 2-option AutoModeOptInDialog but don't have auto as their default.
 * Re-surfaces the dialog so they see the new "make it my default mode" option.
- * Guard lives in GlobalConfig (~/.claude.json), not settings.json, so it
+ * Guard lives in GlobalConfig (~/.openclaude.json), not settings.json, so it
 * survives settings resets and doesn't re-arm itself.
 *
 * Only runs when tengu_auto_mode_config.enabled === 'enabled'. For 'opt-in'
--- a/src/screens/REPL.tsx
+++ b/src/screens/REPL.tsx
@@ -3873,7 +3873,7 @@ export function REPL({
  // empty to non-empty, not on every length change -- otherwise a render loop
  // (concurrent onQuery thrashing, etc.) spams saveGlobalConfig, which hits
  // ELOCKED under concurrent sessions and falls back to unlocked writes.
-  // That write storm is the primary trigger for ~/.claude.json corruption
+  // That write storm is the primary trigger for ~/.openclaude.json corruption
  // (GH #3117).
  const hasCountedQueueUseRef = useRef(false);
  useEffect(() => {
--- a/src/services/analytics/growthbook.ts
+++ b/src/services/analytics/growthbook.ts
@@ -334,7 +334,7 @@ async function processRemoteEvalPayload(
  // Empty object is truthy — without the length check, `{features: {}}`
  // (transient server bug, truncated response) would pass, clear the maps
  // below, return true, and syncRemoteEvalToDisk would wholesale-write `{}`
-  // to disk: total flag blackout for every process sharing ~/.claude.json.
+  // to disk: total flag blackout for every process sharing ~/.openclaude.json.
  if (!payload?.features || Object.keys(payload.features).length === 0) {
    return false
  }
--- a/src/services/api/claude.ts
+++ b/src/services/api/claude.ts
@@ -23,6 +23,7 @@ import { randomUUID } from 'crypto'
 import {
  getAPIProvider,
  isFirstPartyAnthropicBaseUrl,
+  isGithubNativeAnthropicMode,
 } from 'src/utils/model/providers.js'
 import {
  getAttributionHeader,
@@ -334,8 +335,13 @@ export function getPromptCachingEnabled(model: string): boolean {
  // Prompt caching is an Anthropic-specific feature. Third-party providers
  // do not understand cache_control blocks and strict backends (e.g. Azure
  // Foundry) reject or flag requests that contain them.
+  //
+  // Exception: when the GitHub provider is configured in native Anthropic API
+  // mode (CLAUDE_CODE_GITHUB_ANTHROPIC_API=1), requests are sent in Anthropic
+  // format, so cache_control blocks are supported.
  const provider = getAPIProvider()
-  if (provider !== 'firstParty' && provider !== 'bedrock' && provider !== 'vertex') {
+  const isNativeGithub = isGithubNativeAnthropicMode(model)
+  if (provider !== 'firstParty' && provider !== 'bedrock' && provider !== 'vertex' && !isNativeGithub) {
    return false
  }
@@ -1211,7 +1217,7 @@ async function* queryModel(
    cachedMCEnabled = featureEnabled && modelSupported
    const config = getCachedMCConfig()
    logForDebugging(
-      `Cached MC gate: enabled=${featureEnabled} modelSupported=${modelSupported} model=${options.model} supportedModels=${jsonStringify(config.supportedModels)}`,
+      `Cached MC gate: enabled=${featureEnabled} modelSupported=${modelSupported} model=${options.model} supportedModels=${jsonStringify(config?.supportedModels)}`,
    )
  }
--- a/src/services/api/client.ts
+++ b/src/services/api/client.ts
@@ -14,6 +14,7 @@ import { getSmallFastModel } from 'src/utils/model/model.js'
 import {
  getAPIProvider,
  isFirstPartyAnthropicBaseUrl,
+  isGithubNativeAnthropicMode,
 } from 'src/utils/model/providers.js'
 import { getProxyFetchOptions } from 'src/utils/proxy.js'
 import {
@@ -174,6 +175,25 @@ export async function getAnthropicClient({
      providerOverride,
    }) as unknown as Anthropic
  }
+  // GitHub provider in native Anthropic API mode: send requests in Anthropic
+  // format so cache_control blocks are honoured and prompt caching works.
+  // Requires the GitHub endpoint (OPENAI_BASE_URL) to support Anthropic's
+  // messages API — set CLAUDE_CODE_GITHUB_ANTHROPIC_API=1 to opt in.
+  if (isGithubNativeAnthropicMode(model)) {
+    const githubBaseUrl =
+      process.env.OPENAI_BASE_URL?.replace(/\/$/, '') ??
+      'https://api.githubcopilot.com'
+    const githubToken =
+      process.env.GITHUB_TOKEN ?? process.env.GH_TOKEN ?? ''
+    const nativeArgs: ConstructorParameters<typeof Anthropic>[0] = {
+      ...ARGS,
+      baseURL: githubBaseUrl,
+      authToken: githubToken,
+      // No apiKey — we authenticate via Bearer token (authToken)
+      apiKey: null,
+    }
+    return new Anthropic(nativeArgs)
+  }
  if (
    isEnvTruthy(process.env.CLAUDE_CODE_USE_OPENAI) ||
    isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB) ||
--- a/src/services/api/codexShim.test.ts
+++ b/src/services/api/codexShim.test.ts
@@ -547,7 +547,7 @@ describe('Codex request translation', () => {
    ])
  })
-  test('strips leaked reasoning preamble from completed Codex text responses', () => {
+  test('strips <think> tag block from completed Codex text responses', () => {
    const message = convertCodexResponseToAnthropicMessage(
      {
        id: 'resp_1',
@@ -560,7 +560,7 @@ describe('Codex request translation', () => {
              {
                type: 'output_text',
                text:
-                  'The user just said "hey" - a simple greeting. I should respond briefly and friendly.\n\nHey! How can I help you today?',
+                  '<think>user wants a greeting, respond briefly</think>Hey! How can I help you today?',
              },
            ],
          },
@@ -578,6 +578,37 @@ describe('Codex request translation', () => {
    ])
  })
  test('strips unterminated <think> tag at block boundary in Codex completed response', () => {
    const message = convertCodexResponseToAnthropicMessage(
      {
        id: 'resp_1',
        model: 'gpt-5.4',
        output: [
          {
            type: 'message',
            role: 'assistant',
            content: [
              {
                type: 'output_text',
                text:
                  'Here is the answer.\n<think>wait, let me reconsider the user request',
              },
            ],
          },
        ],
        usage: { input_tokens: 12, output_tokens: 4 },
      },
      'gpt-5.4',
    )
    expect(message.content).toEqual([
      {
        type: 'text',
        text: 'Here is the answer.',
      },
    ])
  })
  test('translates Codex SSE text stream into Anthropic events', async () => {
    const responseText = [
      'event: response.output_item.added',
@@ -609,7 +640,7 @@ describe('Codex request translation', () => {
    ])
  })
-  test('strips leaked reasoning preamble from Codex SSE text stream', async () => {
+  test('strips <think> tag block from Codex SSE text stream', async () => {
    const responseText = [
      'event: response.output_item.added',
      'data: {"type":"response.output_item.added","item":{"id":"msg_1","type":"message","status":"in_progress","content":[],"role":"assistant"},"output_index":0,"sequence_number":0}',
@@ -618,13 +649,13 @@ describe('Codex request translation', () => {
      'data: {"type":"response.content_part.added","content_index":0,"item_id":"msg_1","output_index":0,"part":{"type":"output_text","text":""},"sequence_number":1}',
      '',
      'event: response.output_text.delta',
-      'data: {"type":"response.output_text.delta","content_index":0,"delta":"The user just said \\"hey\\" - a simple greeting. I should respond briefly and friendly.\\n\\nHey! How can I help you today?","item_id":"msg_1","output_index":0,"sequence_number":2}',
+      'data: {"type":"response.output_text.delta","content_index":0,"delta":"<think>user wants a greeting, respond briefly</think>Hey! How can I help you today?","item_id":"msg_1","output_index":0,"sequence_number":2}',
      '',
      'event: response.output_item.done',
-      'data: {"type":"response.output_item.done","item":{"id":"msg_1","type":"message","status":"completed","content":[{"type":"output_text","text":"The user just said \\"hey\\" - a simple greeting. I should respond briefly and friendly.\\n\\nHey! How can I help you today?"}],"role":"assistant"},"output_index":0,"sequence_number":3}',
+      'data: {"type":"response.output_item.done","item":{"id":"msg_1","type":"message","status":"completed","content":[{"type":"output_text","text":"<think>user wants a greeting, respond briefly</think>Hey! How can I help you today?"}],"role":"assistant"},"output_index":0,"sequence_number":3}',
      '',
      'event: response.completed',
-      'data: {"type":"response.completed","response":{"id":"resp_1","status":"completed","model":"gpt-5.4","output":[{"type":"message","role":"assistant","content":[{"type":"output_text","text":"The user just said \\"hey\\" - a simple greeting. I should respond briefly and friendly.\\n\\nHey! How can I help you today?"}]}],"usage":{"input_tokens":2,"output_tokens":1}},"sequence_number":4}',
+      'data: {"type":"response.completed","response":{"id":"resp_1","status":"completed","model":"gpt-5.4","output":[{"type":"message","role":"assistant","content":[{"type":"output_text","text":"<think>user wants a greeting, respond briefly</think>Hey! How can I help you today?"}]}],"usage":{"input_tokens":2,"output_tokens":1}},"sequence_number":4}',
      '',
    ].join('\n')
@@ -646,6 +677,50 @@ describe('Codex request translation', () => {
      }
    }
-    expect(textDeltas).toEqual(['Hey! How can I help you today?'])
+    expect(textDeltas.join('')).toBe('Hey! How can I help you today?')
  })
  test('preserves prose without tags (no phrase-based false positive)', async () => {
    // Regression test: older phrase-based sanitizer would incorrectly strip text
    // starting with "I should" or "The user". The tag-based approach leaves it alone.
    const responseText = [
      'event: response.output_item.added',
      'data: {"type":"response.output_item.added","item":{"id":"msg_1","type":"message","status":"in_progress","content":[],"role":"assistant"},"output_index":0,"sequence_number":0}',
      '',
      'event: response.content_part.added',
      'data: {"type":"response.content_part.added","content_index":0,"item_id":"msg_1","output_index":0,"part":{"type":"output_text","text":""},"sequence_number":1}',
      '',
      'event: response.output_text.delta',
      'data: {"type":"response.output_text.delta","content_index":0,"delta":"I should note that the user role requires a briefly concise friendly response format.","item_id":"msg_1","output_index":0,"sequence_number":2}',
      '',
      'event: response.output_item.done',
      'data: {"type":"response.output_item.done","item":{"id":"msg_1","type":"message","status":"completed","content":[{"type":"output_text","text":"I should note that the user role requires a briefly concise friendly response format."}],"role":"assistant"},"output_index":0,"sequence_number":3}',
      '',
      'event: response.completed',
      'data: {"type":"response.completed","response":{"id":"resp_1","status":"completed","model":"gpt-5.4","output":[{"type":"message","role":"assistant","content":[{"type":"output_text","text":"I should note that the user role requires a briefly concise friendly response format."}]}],"usage":{"input_tokens":2,"output_tokens":1}},"sequence_number":4}',
      '',
    ].join('\n')
    const stream = new ReadableStream({
      start(controller) {
        controller.enqueue(new TextEncoder().encode(responseText))
        controller.close()
      },
    })
    const textDeltas: string[] = []
    for await (const event of codexStreamToAnthropic(
      new Response(stream),
      'gpt-5.4',
    )) {
      const delta = (event as { delta?: { type?: string; text?: string } }).delta
      if (delta?.type === 'text_delta' && typeof delta.text === 'string') {
        textDeltas.push(delta.text)
      }
    }
    expect(textDeltas.join('')).toBe(
      'I should note that the user role requires a briefly concise friendly response format.',
    )
  })
 })
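Reviewer note: the tests above pin down the tag-based behaviour (drop `<think>…</think>` blocks, drop an unterminated `<think>` tail, leave untagged prose alone). A minimal sketch of a sanitizer with that behaviour follows. It is not the contents of `thinkTagSanitizer.ts`, which this diff does not show; the `Sketch` names are hypothetical, and trimming trailing whitespace in the non-streaming helper is an assumption inferred from the expected `'Here is the answer.'` output.

```typescript
const OPEN_TAG = '<think>'
const CLOSE_TAG = '</think>'

// Non-streaming variant: remove closed blocks, then any unterminated tail.
function stripThinkTagsSketch(text: string): string {
  return text
    .replace(/<think>[\s\S]*?<\/think>/g, '')
    .replace(/<think>[\s\S]*$/, '')
    .trimEnd() // assumption: trailing whitespace left before a stripped tag is dropped
}

// Length of the longest suffix of `buf` that is a proper prefix of `tag`,
// i.e. how many trailing characters might be the start of a split tag.
function partialTagSuffix(buf: string, tag: string): number {
  const max = Math.min(buf.length, tag.length - 1)
  for (let n = max; n > 0; n--) {
    if (buf.endsWith(tag.slice(0, n))) return n
  }
  return 0
}

// Streaming variant: feed() returns only text that is safe to show so far.
function createThinkFilterSketch() {
  let buffer = ''
  let inside = false
  return {
    feed(chunk: string): string {
      buffer += chunk
      let visible = ''
      for (;;) {
        const tag = inside ? CLOSE_TAG : OPEN_TAG
        const at = buffer.indexOf(tag)
        if (at === -1) {
          // Hold back a possible partial tag; emit the rest only when outside a block.
          const keep = partialTagSuffix(buffer, tag)
          if (!inside) visible += buffer.slice(0, buffer.length - keep)
          buffer = buffer.slice(buffer.length - keep)
          return visible
        }
        if (!inside) visible += buffer.slice(0, at)
        buffer = buffer.slice(at + tag.length)
        inside = !inside
      }
    },
    flush(): string {
      // An unterminated block is discarded; a held-back partial "<thi" was just text.
      const tail = inside ? '' : buffer
      buffer = ''
      inside = false
      return tail
    },
  }
}
```

Because `feed()` may hold back a few characters, callers should compare the joined deltas rather than individual chunks, which matches the switch to `textDeltas.join('')` in the assertions above.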
--- a/src/services/api/codexShim.ts
+++ b/src/services/api/codexShim.ts
@@ -1,4 +1,5 @@
 import { APIError } from '@anthropic-ai/sdk'
+import { compressToolHistory } from './compressToolHistory.js'
 import { fetchWithProxyRetry } from './fetchWithProxyRetry.js'
 import type {
  ResolvedCodexCredentials,
@@ -6,10 +7,9 @@ import type {
 } from './providerConfig.js'
 import { sanitizeSchemaForOpenAICompat } from './openaiSchemaSanitizer.js'
 import {
-  looksLikeLeakedReasoningPrefix,
-  shouldBufferPotentialReasoningPrefix,
-  stripLeakedReasoningPreamble,
-} from './reasoningLeakSanitizer.js'
+  createThinkTagFilter,
+  stripThinkTags,
+} from './thinkTagSanitizer.js'
 export interface AnthropicUsage {
  input_tokens: number
@@ -485,13 +485,15 @@ export async function performCodexRequest(options: {
  defaultHeaders: Record<string, string>
  signal?: AbortSignal
 }): Promise<Response> {
-  const input = convertAnthropicMessagesToResponsesInput(
+  const compressedMessages = compressToolHistory(
    options.params.messages as Array<{
      role?: string
      message?: { role?: string; content?: unknown }
      content?: unknown
    }>,
+    options.request.resolvedModel,
  )
+  const input = convertAnthropicMessagesToResponsesInput(compressedMessages)
  const body: Record<string, unknown> = {
    model: options.request.resolvedModel,
    input: input.length > 0
@@ -734,25 +736,22 @@ export async function* codexStreamToAnthropic(
    { index: number; toolUseId: string }
  >()
  let activeTextBlockIndex: number | null = null
-  let activeTextBuffer = ''
-  let textBufferMode: 'none' | 'pending' | 'strip' = 'none'
+  const thinkFilter = createThinkTagFilter()
  let nextContentBlockIndex = 0
  let sawToolUse = false
  let finalResponse: Record<string, any> | undefined
  const closeActiveTextBlock = async function* () {
    if (activeTextBlockIndex === null) return
-    if (textBufferMode !== 'none') {
-      const sanitized = stripLeakedReasoningPreamble(activeTextBuffer)
-      if (sanitized) {
-        yield {
-          type: 'content_block_delta',
-          index: activeTextBlockIndex,
-          delta: {
-            type: 'text_delta',
-            text: sanitized,
-          },
-        }
-      }
-    }
+    const tail = thinkFilter.flush()
+    if (tail) {
+      yield {
+        type: 'content_block_delta',
+        index: activeTextBlockIndex,
+        delta: {
+          type: 'text_delta',
+          text: tail,
+        },
+      }
+    }
    yield {
@@ -760,8 +759,6 @@ export async function* codexStreamToAnthropic(
      index: activeTextBlockIndex,
    }
    activeTextBlockIndex = null
-    activeTextBuffer = ''
-    textBufferMode = 'none'
  }
  const startTextBlockIfNeeded = async function* () {
@@ -837,43 +834,17 @@ export async function* codexStreamToAnthropic(
    if (event.event === 'response.output_text.delta') {
      yield* startTextBlockIfNeeded()
-      activeTextBuffer += payload.delta ?? ''
      if (activeTextBlockIndex !== null) {
-        if (
-          textBufferMode === 'strip' ||
-          looksLikeLeakedReasoningPrefix(activeTextBuffer)
-        ) {
-          textBufferMode = 'strip'
-          continue
-        }
-        if (textBufferMode === 'pending') {
-          if (shouldBufferPotentialReasoningPrefix(activeTextBuffer)) {
-            continue
-          }
+        const visible = thinkFilter.feed(payload.delta ?? '')
+        if (visible) {
          yield {
            type: 'content_block_delta',
            index: activeTextBlockIndex,
            delta: {
              type: 'text_delta',
-              text: activeTextBuffer,
+              text: visible,
            },
          }
-          textBufferMode = 'none'
-          continue
        }
-        if (shouldBufferPotentialReasoningPrefix(activeTextBuffer)) {
-          textBufferMode = 'pending'
-          continue
-        }
-        yield {
-          type: 'content_block_delta',
-          index: activeTextBlockIndex,
-          delta: {
-            type: 'text_delta',
-            text: payload.delta ?? '',
-          },
-        }
      }
      continue
@@ -969,7 +940,7 @@ export function convertCodexResponseToAnthropicMessage(
        if (part?.type === 'output_text') {
          content.push({
            type: 'text',
-            text: stripLeakedReasoningPreamble(part.text ?? ''),
+            text: stripThinkTags(part.text ?? ''),
          })
        }
      }
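Reviewer note: the `compressToolHistory` tests added in this change fix the tier sizes per effective context window and the way tool results are bucketed by recency. The sketch below reproduces only those tested values. It is not the shipped `getTiers`; the `Sketch` names are hypothetical, and treating each lower bound as inclusive is an assumption (the tests only probe mid-range windows, apart from the "≥ 500k" label).

```typescript
type TiersSketch = { recent: number; mid: number }

// Tier sizes taken from the test expectations; the boundary handling is assumed.
function getTiersSketch(effectiveWindow: number): TiersSketch {
  if (effectiveWindow < 16_000) return { recent: 2, mid: 3 }
  if (effectiveWindow < 32_000) return { recent: 3, mid: 5 }
  if (effectiveWindow < 64_000) return { recent: 4, mid: 8 }
  if (effectiveWindow < 128_000) return { recent: 5, mid: 10 }
  if (effectiveWindow < 256_000) return { recent: 8, mid: 15 }
  if (effectiveWindow < 500_000) return { recent: 12, mid: 25 }
  return { recent: 25, mid: 50 }
}

type TierNameSketch = 'recent' | 'mid' | 'old'

// Classify the tool_result at `index` (0 = oldest) out of `total` results.
// The newest `recent` results stay untouched, the next `mid` are truncated,
// and everything older is replaced by a stub.
function tierForSketch(
  index: number,
  total: number,
  tiers: TiersSketch,
): TierNameSketch {
  const fromEnd = total - 1 - index
  if (fromEnd < tiers.recent) return 'recent'
  if (fromEnd < tiers.recent + tiers.mid) return 'mid'
  return 'old'
}
```

With a 100k window (`recent=5`, `mid=10`), 20 tool results split into 5 old, 10 mid and 5 recent, and 6 results split into 1 mid and 5 recent, which is the partition the tests assert.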
--- a/src/services/api/compressToolHistory.test.ts
+++ b/src/services/api/compressToolHistory.test.ts
@@ -0,0 +1,572 @@
 import { afterEach, beforeEach, expect, mock, test } from 'bun:test'
 import { compressToolHistory, getTiers } from './compressToolHistory.js'
 // Mock the two dependencies so tests are deterministic and don't read disk config.
 const mockState = {
  enabled: true,
  effectiveWindow: 100_000,
 }
 mock.module('../../utils/config.js', () => ({
  getGlobalConfig: () => ({
    toolHistoryCompressionEnabled: mockState.enabled,
  }),
 }))
 mock.module('../compact/autoCompact.js', () => ({
  getEffectiveContextWindowSize: () => mockState.effectiveWindow,
 }))
 beforeEach(() => {
  mockState.enabled = true
  mockState.effectiveWindow = 100_000
 })
 afterEach(() => {
  mockState.enabled = true
  mockState.effectiveWindow = 100_000
 })
 type Block = Record<string, unknown>
 type Msg = { role: string; content: Block[] | string }
 function bigText(n: number): string {
  return 'x'.repeat(n)
 }
 function buildToolExchange(id: number, resultLength: number): Msg[] {
  return [
    {
      role: 'assistant',
      content: [
        {
          type: 'tool_use',
          id: `toolu_${id}`,
          name: 'Read',
          input: { file_path: `/path/to/file${id}.ts` },
        },
      ],
    },
    {
      role: 'user',
      content: [
        {
          type: 'tool_result',
          tool_use_id: `toolu_${id}`,
          content: bigText(resultLength),
        },
      ],
    },
  ]
 }
 function buildConversation(numToolExchanges: number, resultLength = 5_000): Msg[] {
  const out: Msg[] = [{ role: 'user', content: 'Initial request' }]
  for (let i = 0; i < numToolExchanges; i++) {
    out.push(...buildToolExchange(i, resultLength))
  }
  return out
 }
 function getResultMessages(messages: Msg[]): Msg[] {
  return messages.filter(
    m => Array.isArray(m.content) && m.content.some((b: any) => b.type === 'tool_result'),
  )
 }
 function getResultBlock(msg: Msg): Block {
  return (msg.content as Block[]).find((b: any) => b.type === 'tool_result') as Block
 }
 function getResultText(msg: Msg): string {
  const block = getResultBlock(msg)
  const c = block.content
  if (typeof c === 'string') return c
  if (Array.isArray(c)) {
    return c
      .filter((b: any) => b.type === 'text')
      .map((b: any) => b.text)
      .join('\n')
  }
  return ''
 }
 // ---------- getTiers ----------
 test('getTiers: < 16k window → recent=2, mid=3', () => {
  expect(getTiers(8_000)).toEqual({ recent: 2, mid: 3 })
 })
 test('getTiers: 16k–32k → recent=3, mid=5', () => {
  expect(getTiers(20_000)).toEqual({ recent: 3, mid: 5 })
 })
 test('getTiers: 32k–64k → recent=4, mid=8', () => {
  expect(getTiers(48_000)).toEqual({ recent: 4, mid: 8 })
 })
 test('getTiers: 64k–128k (Copilot gpt-4o) → recent=5, mid=10', () => {
  expect(getTiers(100_000)).toEqual({ recent: 5, mid: 10 })
 })
 test('getTiers: 128k–256k (Copilot Claude) → recent=8, mid=15', () => {
  expect(getTiers(200_000)).toEqual({ recent: 8, mid: 15 })
 })
 test('getTiers: 256k–500k → recent=12, mid=25', () => {
  expect(getTiers(400_000)).toEqual({ recent: 12, mid: 25 })
 })
 test('getTiers: ≥ 500k (gpt-4.1 1M) → recent=25, mid=50', () => {
  expect(getTiers(1_000_000)).toEqual({ recent: 25, mid: 50 })
 })
 // ---------- master switch ----------
 test('pass-through when toolHistoryCompressionEnabled is false', () => {
  mockState.enabled = false
  const messages = buildConversation(20)
  const result = compressToolHistory(messages, 'gpt-4o')
  expect(result).toBe(messages) // same reference (no transformation)
 })
 test('pass-through when total tool_results <= recent tier', () => {
  // 100k effective → recent=5; only 4 exchanges → no compression
  const messages = buildConversation(4)
  const result = compressToolHistory(messages, 'gpt-4o')
  expect(result).toBe(messages)
 })
 // ---------- per-tier behavior ----------
 test('recent tier: tool_result content untouched', () => {
  // 100k effective → recent=5, mid=10. With 6 exchanges, only the oldest is touched.
  const messages = buildConversation(6, 5_000)
  const result = compressToolHistory(messages, 'gpt-4o')
  const resultMsgs = getResultMessages(result)
  // Last 5 should be untouched (full 5000 chars)
  for (let i = resultMsgs.length - 5; i < resultMsgs.length; i++) {
    expect(getResultText(resultMsgs[i]).length).toBe(5_000)
  }
 })
 test('mid tier: long content truncated to MID_MAX_CHARS with marker', () => {
  // 100k → recent=5, mid=10. 10 exchanges: 5 recent + 5 mid (none old).
  const messages = buildConversation(10, 5_000)
  const result = compressToolHistory(messages, 'gpt-4o')
  const resultMsgs = getResultMessages(result)
  // First 5 are mid tier — should be truncated to ~2000 chars + marker
  for (let i = 0; i < 5; i++) {
    const text = getResultText(resultMsgs[i])
    expect(text).toContain('[…truncated')
    expect(text).toContain('chars from tool history]')
    // Should be roughly 2000 chars + marker (under 2200)
    expect(text.length).toBeLessThan(2_200)
    expect(text.length).toBeGreaterThan(2_000)
  }
 })
 test('mid tier: short content (< MID_MAX_CHARS) untouched', () => {
  const messages = buildConversation(10, 500) // 500 < MID_MAX_CHARS
  const result = compressToolHistory(messages, 'gpt-4o')
  const resultMsgs = getResultMessages(result)
  for (let i = 0; i < 5; i++) {
    expect(getResultText(resultMsgs[i])).toBe(bigText(500))
  }
 })
 test('old tier: content replaced with stub [name args={...} → N chars omitted]', () => {
  // 100k → recent=5, mid=10, old=rest. 20 exchanges → 5 old + 10 mid + 5 recent.
  const messages = buildConversation(20, 5_000)
  const result = compressToolHistory(messages, 'gpt-4o')
  const resultMsgs = getResultMessages(result)
  // First 5 are old tier — should be stubs
  for (let i = 0; i < 5; i++) {
    const text = getResultText(resultMsgs[i])
    expect(text).toMatch(/^\[Read args=\{.*\} → 5000 chars omitted\]$/)
  }
 })
 test('old tier: stub args truncated to 200 chars', () => {
  const longArg = bigText(500)
  const messages: Msg[] = [
    { role: 'user', content: 'start' },
    {
      role: 'assistant',
      content: [
        {
          type: 'tool_use',
          id: 'toolu_x',
          name: 'Bash',
          input: { command: longArg },
        },
      ],
    },
    {
      role: 'user',
      content: [
        { type: 'tool_result', tool_use_id: 'toolu_x', content: 'output' },
      ],
    },
    // Pad with enough recent exchanges to push the above into old tier
    ...buildConversation(20, 100).slice(1),
  ]
  const result = compressToolHistory(messages, 'gpt-4o')
  const resultMsgs = getResultMessages(result)
  const text = getResultText(resultMsgs[0])
  // Stub format: [Bash args=<json≤200chars> → N chars omitted]
  // The args portion (between args= and →) must be ≤ 200 chars.
  const argsMatch = text.match(/args=(.*?) →/)
  expect(argsMatch).not.toBeNull()
  expect(argsMatch![1].length).toBeLessThanOrEqual(200)
 })
 test('old tier: orphan tool_result (no matching tool_use) falls back to "tool"', () => {
  const messages: Msg[] = [
    { role: 'user', content: 'start' },
    // Orphan: tool_result without matching tool_use in history
    {
      role: 'user',
      content: [
        { type: 'tool_result', tool_use_id: 'orphan_id', content: 'data' },
      ],
    },
    ...buildConversation(20, 100).slice(1),
  ]
  const result = compressToolHistory(messages, 'gpt-4o')
  const resultMsgs = getResultMessages(result)
  const text = getResultText(resultMsgs[0])
  expect(text).toMatch(/^\[tool args=\{\} → 4 chars omitted\]$/)
 })
 // ---------- structural preservation ----------
 test('tool_use blocks always preserved', () => {
  const messages = buildConversation(20, 5_000)
  const result = compressToolHistory(messages, 'gpt-4o')
  const useCount = (msgs: Msg[]) =>
    msgs.reduce((sum, m) => {
      if (!Array.isArray(m.content)) return sum
      return sum + m.content.filter((b: any) => b.type === 'tool_use').length
    }, 0)
  expect(useCount(result as Msg[])).toBe(useCount(messages))
 })
 test('text blocks always preserved', () => {
  const messages: Msg[] = [
    { role: 'user', content: 'first' },
    {
      role: 'assistant',
      content: [
        { type: 'text', text: 'reasoning before tool' },
        { type: 'tool_use', id: 'toolu_1', name: 'Read', input: {} },
      ],
    },
    {
      role: 'user',
      content: [{ type: 'tool_result', tool_use_id: 'toolu_1', content: bigText(5000) }],
    },
    ...buildConversation(20, 5_000).slice(1),
  ]
  const result = compressToolHistory(messages, 'gpt-4o')
  const assistantMsg = (result as Msg[])[1]
  const textBlock = (assistantMsg.content as Block[]).find((b: any) => b.type === 'text')
  expect(textBlock).toEqual({ type: 'text', text: 'reasoning before tool' })
 })
 test('thinking blocks always preserved', () => {
  const messages: Msg[] = [
    { role: 'user', content: 'first' },
    {
      role: 'assistant',
      content: [
        { type: 'thinking', thinking: 'internal reasoning', signature: 'sig' },
        { type: 'tool_use', id: 'toolu_1', name: 'Read', input: {} },
      ],
    },
    {
      role: 'user',
      content: [{ type: 'tool_result', tool_use_id: 'toolu_1', content: bigText(5000) }],
    },
    ...buildConversation(20, 5_000).slice(1),
  ]
  const result = compressToolHistory(messages, 'gpt-4o')
  const assistantMsg = (result as Msg[])[1]
  const thinking = (assistantMsg.content as Block[]).find((b: any) => b.type === 'thinking')
  expect(thinking).toEqual({
    type: 'thinking',
    thinking: 'internal reasoning',
    signature: 'sig',
  })
 })
 test('non-array content (string) handled gracefully', () => {
  const messages: Msg[] = [
    { role: 'user', content: 'plain string content' },
    ...buildConversation(20, 100).slice(1),
  ]
  const result = compressToolHistory(messages, 'gpt-4o')
  expect((result as Msg[])[0].content).toBe('plain string content')
 })
 test('empty content array handled gracefully', () => {
  const messages: Msg[] = [
    { role: 'user', content: [] },
    ...buildConversation(20, 100).slice(1),
  ]
  expect(() => compressToolHistory(messages, 'gpt-4o')).not.toThrow()
 })
 // ---------- message shape compatibility ----------
 test('wrapped shape ({ message: { role, content } }) handled', () => {
  type WrappedMsg = { message: { role: string; content: Block[] | string } }
  const wrap = (m: Msg): WrappedMsg => ({ message: { role: m.role, content: m.content } })
  const messages = buildConversation(20, 5_000).map(wrap)
  const result = compressToolHistory(messages as any, 'gpt-4o')
  // First wrapped tool-result message should have stub content (old tier)
  const firstResultMsg = (result as WrappedMsg[]).find(
    m =>
      Array.isArray(m.message.content) &&
      m.message.content.some((b: any) => b.type === 'tool_result'),
  )
  const block = (firstResultMsg!.message.content as Block[]).find(
    (b: any) => b.type === 'tool_result',
  ) as Block
  const text = ((block.content as Block[])[0] as any).text
  expect(text).toMatch(/^\[Read args=.*→ 5000 chars omitted\]$/)
 })
 test('flat shape ({ role, content }) handled', () => {
  const messages = buildConversation(20, 5_000)
  const result = compressToolHistory(messages, 'gpt-4o')
  const resultMsgs = getResultMessages(result)
  expect(getResultText(resultMsgs[0])).toMatch(/^\[Read args=.*→ 5000 chars omitted\]$/)
 })
 // ---------- tier boundary correctness ----------
 test('tier boundaries: 6 exchanges → 1 mid + 5 recent (recent=5)', () => {
  const messages = buildConversation(6, 5_000)
  const result = compressToolHistory(messages, 'gpt-4o')
  const resultMsgs = getResultMessages(result)
  // Oldest: mid (truncated)
  expect(getResultText(resultMsgs[0])).toContain('[…truncated')
  // Last 5: untouched
  for (let i = 1; i < 6; i++) {
    expect(getResultText(resultMsgs[i]).length).toBe(5_000)
  }
 })
 test('tier boundaries: 16 exchanges → 1 old + 10 mid + 5 recent', () => {
  const messages = buildConversation(16, 5_000)
  const result = compressToolHistory(messages, 'gpt-4o')
  const resultMsgs = getResultMessages(result)
  // Oldest 1: stub (old tier)
  expect(getResultText(resultMsgs[0])).toMatch(/^\[Read .*chars omitted\]$/)
  // Next 10: mid (truncated)
  for (let i = 1; i < 11; i++) {
    expect(getResultText(resultMsgs[i])).toContain('[…truncated')
  }
  // Last 5: untouched
  for (let i = 11; i < 16; i++) {
    expect(getResultText(resultMsgs[i]).length).toBe(5_000)
  }
 })
 test('large window (1M) with 30 exchanges: last 25 untouched (recent=25, oldest 5 mid)', () => {
  // ≥500k → recent=25, mid=50. 30 exchanges → 5 mid + 25 recent. None old.
  mockState.effectiveWindow = 1_000_000
  const messages = buildConversation(30, 5_000)
  const result = compressToolHistory(messages, 'gpt-4.1')
  const resultMsgs = getResultMessages(result)
  // Last 25: untouched
  for (let i = 5; i < 30; i++) {
    expect(getResultText(resultMsgs[i]).length).toBe(5_000)
  }
 })
 // ---------- attribute preservation ----------
 test('is_error flag preserved in mid tier', () => {
  const messages: Msg[] = [
    { role: 'user', content: 'start' },
    {
      role: 'assistant',
      content: [{ type: 'tool_use', id: 'toolu_err', name: 'Bash', input: {} }],
    },
    {
      role: 'user',
      content: [
        {
          type: 'tool_result',
          tool_use_id: 'toolu_err',
          is_error: true,
          content: bigText(5_000),
        },
      ],
    },
    // Pad with enough recent exchanges to push the above into MID tier
    ...buildConversation(10, 100).slice(1),
  ]
  const result = compressToolHistory(messages, 'gpt-4o')
  const resultMsgs = getResultMessages(result)
  const block = getResultBlock(resultMsgs[0]) as { is_error?: boolean; content: unknown }
  expect(block.is_error).toBe(true)
  expect(getResultText(resultMsgs[0])).toContain('[…truncated')
 })
 test('is_error flag preserved in old tier (stub)', () => {
  const messages: Msg[] = [
    { role: 'user', content: 'start' },
    {
      role: 'assistant',
      content: [{ type: 'tool_use', id: 'toolu_err', name: 'Bash', input: {} }],
    },
    {
      role: 'user',
      content: [
        {
          type: 'tool_result',
          tool_use_id: 'toolu_err',
          is_error: true,
          content: bigText(5_000),
        },
      ],
    },
    ...buildConversation(20, 100).slice(1),
  ]
  const result = compressToolHistory(messages, 'gpt-4o')
  const resultMsgs = getResultMessages(result)
  const block = getResultBlock(resultMsgs[0]) as { is_error?: boolean; content: unknown }
  expect(block.is_error).toBe(true)
  expect(getResultText(resultMsgs[0])).toMatch(/^\[Bash .*chars omitted\]$/)
 })
 // ---------- COMPACTABLE_TOOLS filter ----------
 test('non-compactable tool (e.g. Task/Agent) is NEVER compressed', () => {
  // Build conversation where the OLDEST exchange uses a non-compactable tool name
  const messages: Msg[] = [
    { role: 'user', content: 'start' },
    {
      role: 'assistant',
      content: [
        { type: 'tool_use', id: 'task_1', name: 'Task', input: { goal: 'plan' } },
      ],
    },
    {
      role: 'user',
      content: [
        { type: 'tool_result', tool_use_id: 'task_1', content: bigText(5_000) },
      ],
    },
    // Pad with 20 compactable exchanges to push Task into old tier
    ...buildConversation(20, 100).slice(1),
  ]
  const result = compressToolHistory(messages, 'gpt-4o')
  const resultMsgs = getResultMessages(result)
  // First tool_result is for Task (non-compactable) → must remain full
  expect(getResultText(resultMsgs[0]).length).toBe(5_000)
  expect(getResultText(resultMsgs[0])).not.toContain('chars omitted')
  expect(getResultText(resultMsgs[0])).not.toContain('[…truncated')
 })
 test('mcp__ prefixed tools ARE compactable (matches microCompact behavior)', () => {
  const messages: Msg[] = [
    { role: 'user', content: 'start' },
    {
      role: 'assistant',
      content: [
        { type: 'tool_use', id: 'mcp_1', name: 'mcp__github__get_issue', input: {} },
      ],
    },
    {
      role: 'user',
      content: [
        { type: 'tool_result', tool_use_id: 'mcp_1', content: bigText(5_000) },
      ],
    },
    ...buildConversation(20, 100).slice(1),
  ]
  const result = compressToolHistory(messages, 'gpt-4o')
  const resultMsgs = getResultMessages(result)
  // MCP tool result is compressed (gets stub since it's in old tier)
  expect(getResultText(resultMsgs[0])).toMatch(/^\[mcp__github__get_issue .*chars omitted\]$/)
 })
 // ---------- skip already-cleared blocks ----------
 test('blocks already cleared by microCompact are NOT re-compressed', () => {
  const messages: Msg[] = [
    { role: 'user', content: 'start' },
    {
      role: 'assistant',
      content: [{ type: 'tool_use', id: 'cleared_1', name: 'Read', input: {} }],
    },
    {
      role: 'user',
      content: [
        {
          type: 'tool_result',
          tool_use_id: 'cleared_1',
          content: '[Old tool result content cleared]', // microCompact's marker
        },
      ],
    },
    ...buildConversation(20, 100).slice(1),
  ]
  const result = compressToolHistory(messages, 'gpt-4o')
  const resultMsgs = getResultMessages(result)
  // Already-cleared marker survives untouched (no double processing)
  expect(getResultText(resultMsgs[0])).toBe('[Old tool result content cleared]')
 })
 test('extra block attributes (e.g. cache_control) preserved across rewrites', () => {
  const cacheControl = { type: 'ephemeral' }
  const messages: Msg[] = [
    { role: 'user', content: 'start' },
    {
      role: 'assistant',
      content: [{ type: 'tool_use', id: 'toolu_cc', name: 'Read', input: {} }],
    },
    {
      role: 'user',
      content: [
        {
          type: 'tool_result',
          tool_use_id: 'toolu_cc',
          cache_control: cacheControl,
          content: bigText(5_000),
        },
      ],
    },
    ...buildConversation(20, 100).slice(1),
  ]
  const result = compressToolHistory(messages, 'gpt-4o')
  const resultMsgs = getResultMessages(result)
  const block = getResultBlock(resultMsgs[0]) as { cache_control?: unknown }
  // The custom attribute survived the stub rewrite via ...block spread
  expect(block.cache_control).toEqual(cacheControl)
 })
--- a/src/services/api/compressToolHistory.ts
+++ b/src/services/api/compressToolHistory.ts
@@ -0,0 +1,255 @@
 /**
 * Compresses old tool_result content for stateless OpenAI-compatible providers
 * (Copilot, Mistral, Ollama). Preserves all conversation structure — tool_use,
 * tool_result pairing, text, thinking, and is_error all survive intact. Only
 * the BULK text of older tool_results is shrunk to delay context saturation.
 *
 * Tier sizes scale with the model's effective context window via
 * getEffectiveContextWindowSize() — same calculation used by auto-compact, so
 * the two systems stay aligned.
 *
 * Complements (does not replace) microCompact.ts:
 * - microCompact: time/cache-based, runs from query.ts, binary clear/keep,
 *   limited to Claude (cache editing) or idle gaps (time-based).
 * - compressToolHistory: size-based, runs at the shim layer, tiered
 *   compression, covers the gap for active sessions on non-Claude providers.
 *
 * Reuses isCompactableTool from microCompact to avoid touching tools the
 * project already classifies as unsafe to compress (e.g. Task, Agent).
 * Skips blocks already cleared by microCompact (TOOL_RESULT_CLEARED_MESSAGE).
 *
 * Anthropic native bypasses both shims, so it is unaffected by this module.
 */
 import { getEffectiveContextWindowSize } from '../compact/autoCompact.js'
 import { isCompactableTool } from '../compact/microCompact.js'
 import { TOOL_RESULT_CLEARED_MESSAGE } from '../../utils/toolResultStorage.js'
 import { getGlobalConfig } from '../../utils/config.js'
 // Mid-tier truncation budget. 2k chars ≈ 500 tokens, enough to preserve the
 // shape of most tool outputs (file headers, command stderr, top grep hits)
 // without ballooning context. Bump too high and the tier loses its purpose.
 const MID_MAX_CHARS = 2_000
 // Stub args budget. JSON.stringify of a typical tool input fits in 200 chars
 // (file paths, short commands, small queries). Long inputs are rare and clamping
 // here keeps the stub size bounded even when callers pass oversized arguments.
 const STUB_ARGS_MAX_CHARS = 200
 type AnyMessage = {
  role?: string
  message?: { role?: string; content?: unknown }
  content?: unknown
 }
 type ToolResultBlock = {
  type: 'tool_result'
  tool_use_id?: string
  is_error?: boolean
  content?: unknown
 }
 type ToolUseBlock = {
  type: 'tool_use'
  id?: string
  name?: string
  input?: unknown
 }
 type Tiers = { recent: number; mid: number }
 // Tier sizes scale with effective window. Targets roughly:
 // - recent tier stays under ~25% of available window (full fidelity kept)
 // - recent + mid tier stays under ~50% of available window (bounded bulk)
 // - everything older collapses to ~15-token stubs
 // Values assume ~5KB avg tool_result, which matches the Copilot default case
 // (parallel_tool_calls=true means multiple Read/Bash outputs per turn). For
 // ≥ 500k models the tiers are so generous that compression is effectively
 // inert for any realistic session — see compressToolHistory.test.ts.
 export function getTiers(effectiveWindow: number): Tiers {
  if (effectiveWindow < 16_000) return { recent: 2, mid: 3 }
  if (effectiveWindow < 32_000) return { recent: 3, mid: 5 }
  if (effectiveWindow < 64_000) return { recent: 4, mid: 8 }
  if (effectiveWindow < 128_000) return { recent: 5, mid: 10 }
  if (effectiveWindow < 256_000) return { recent: 8, mid: 15 }
  if (effectiveWindow < 500_000) return { recent: 12, mid: 25 }
  return { recent: 25, mid: 50 }
 }
 function extractText(content: unknown): string {
  if (typeof content === 'string') return content
  if (Array.isArray(content)) {
    return content
      .filter(
        (b: { type?: string; text?: string }) =>
          b?.type === 'text' && typeof b.text === 'string',
      )
      .map((b: { text?: string }) => b.text ?? '')
      .join('\n')
  }
  return ''
 }
 // Old-tier compression strategy. Replaces content entirely with a one-line
 // metadata marker, which is ~10× more token-efficient than a 500-char
 // truncation and unambiguous (partial truncations can look authoritative to
 // the model). The stub format encodes tool name + args so the model can
 // re-invoke the same tool if it needs the omitted output back.
 function buildStub(
  block: ToolResultBlock,
  toolUsesById: Map<string, ToolUseBlock>,
 ): ToolResultBlock {
  const original = extractText(block.content)
  const toolUse = toolUsesById.get(block.tool_use_id ?? '')
  const name = toolUse?.name ?? 'tool'
  const args = toolUse?.input
    ? JSON.stringify(toolUse.input).slice(0, STUB_ARGS_MAX_CHARS)
    : '{}'
  return {
    ...block,
    content: [
      {
        type: 'text',
        text: `[${name} args=${args} → ${original.length} chars omitted]`,
      },
    ],
  }
 }
 // Mid-tier compression. The trailing marker is load-bearing: without it, the
 // model can't distinguish "tool returned 2000 chars" from "tool returned 20k
 // chars that we cut to 2000". Distinguishing those matters for the model's
 // decision to re-invoke the tool.
 function truncateBlock(
  block: ToolResultBlock,
  maxChars: number,
 ): ToolResultBlock {
  const text = extractText(block.content)
  if (text.length <= maxChars) return block
  const omitted = text.length - maxChars
  return {
    ...block,
    content: [
      {
        type: 'text',
        text: `${text.slice(0, maxChars)}\n[…truncated ${omitted} chars from tool history]`,
      },
    ],
  }
 }
 function getInner(msg: AnyMessage): { role?: string; content?: unknown } {
  return (msg.message ?? msg) as { role?: string; content?: unknown }
 }
 function indexToolUses(messages: AnyMessage[]): Map<string, ToolUseBlock> {
  const map = new Map<string, ToolUseBlock>()
  for (const msg of messages) {
    const content = getInner(msg).content
    if (!Array.isArray(content)) continue
    for (const b of content as Array<{ type?: string; id?: string }>) {
      if (b?.type === 'tool_use' && b.id) {
        map.set(b.id, b as ToolUseBlock)
      }
    }
  }
  return map
 }
 function indexToolResultMessages(messages: AnyMessage[]): number[] {
  const indices: number[] = []
  for (let i = 0; i < messages.length; i++) {
    const inner = getInner(messages[i])
    const role = inner.role ?? messages[i].role
    const content = inner.content
    if (
      role === 'user' &&
      Array.isArray(content) &&
      content.some((b: { type?: string }) => b?.type === 'tool_result')
    ) {
      indices.push(i)
    }
  }
  return indices
 }
 function rewriteMessage<T extends AnyMessage>(
  msg: T,
  newContent: unknown[],
 ): T {
  if (msg.message) {
    return { ...msg, message: { ...msg.message, content: newContent } }
  }
  return { ...msg, content: newContent }
 }
 // microCompact.maybeTimeBasedMicrocompact may have already replaced old
 // tool_result content with TOOL_RESULT_CLEARED_MESSAGE before we see it.
 // Re-compressing produces a stub over a marker (e.g. `[Read args={} → 33
 // chars omitted]`), wasteful and less informative than the canonical marker.
 function isAlreadyCleared(block: ToolResultBlock): boolean {
  const text = extractText(block.content)
  return text === TOOL_RESULT_CLEARED_MESSAGE
 }
 function shouldCompressBlock(
  block: ToolResultBlock,
  toolUsesById: Map<string, ToolUseBlock>,
 ): boolean {
  if (isAlreadyCleared(block)) return false
  const toolUse = toolUsesById.get(block.tool_use_id ?? '')
  // Unknown tool name (orphan tool_result with no matching tool_use) falls
  // through to compression with a generic "tool" stub. Safer default: the
  // original tool_use vanished so there's no downstream use for the output.
  if (!toolUse?.name) return true
  // Respect microCompact's curated safe-to-compress set (Read/Bash/Grep/…/
  // mcp__*) so user-facing flow tools (Task, Agent, custom) stay intact.
  return isCompactableTool(toolUse.name)
 }
 export function compressToolHistory<T extends AnyMessage>(
  messages: T[],
  model: string,
 ): T[] {
  // Master kill-switch. Returns the original reference so callers skip a
  // defensive copy when the feature is disabled.
  if (!getGlobalConfig().toolHistoryCompressionEnabled) return messages
  const tiers = getTiers(getEffectiveContextWindowSize(model))
  const toolResultIndices = indexToolResultMessages(messages)
  const total = toolResultIndices.length
  // If every tool-result fits in the recent tier, no boundary crosses; return
  // the same reference for the same copy-elision reason.
  if (total <= tiers.recent) return messages
  // O(1) lookup: messageIndex → tool-result position (0 = oldest). Replaces
  // the naive Array.indexOf(i) that was O(n²) across the .map below.
  const positionByIndex = new Map<number, number>()
  for (let pos = 0; pos < toolResultIndices.length; pos++) {
    positionByIndex.set(toolResultIndices[pos], pos)
  }
  const toolUsesById = indexToolUses(messages)
  return messages.map((msg, i) => {
    const pos = positionByIndex.get(i)
    if (pos === undefined) return msg
    const fromEnd = total - 1 - pos
    if (fromEnd < tiers.recent) return msg
    const inMidWindow = fromEnd < tiers.recent + tiers.mid
    const content = getInner(msg).content as unknown[]
    const newContent = content.map(block => {
      const b = block as { type?: string }
      if (b?.type !== 'tool_result') return block
      const tr = block as ToolResultBlock
      if (!shouldCompressBlock(tr, toolUsesById)) return block
      return inMidWindow
        ? truncateBlock(tr, MID_MAX_CHARS)
        : buildStub(tr, toolUsesById)
    })
    return rewriteMessage(msg, newContent)
  })
 }
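The tier boundaries used by compressToolHistory can be checked by hand with a small standalone sketch. The tier table below is copied from getTiers() in this file; `classifyTier` and `tierLayout` are hypothetical helpers introduced only for illustration and are not part of the patch. They mirror the `fromEnd` arithmetic in the `.map` callback, under the assumption that every tool-result message counts as one position.

```typescript
// Illustrative sketch only. classifyTier and tierLayout are hypothetical
// helpers, not part of the patch; getTiers is copied from the file above.
type Tiers = { recent: number; mid: number }
type TierName = 'old' | 'mid' | 'recent'

function getTiers(effectiveWindow: number): Tiers {
  if (effectiveWindow < 16_000) return { recent: 2, mid: 3 }
  if (effectiveWindow < 32_000) return { recent: 3, mid: 5 }
  if (effectiveWindow < 64_000) return { recent: 4, mid: 8 }
  if (effectiveWindow < 128_000) return { recent: 5, mid: 10 }
  if (effectiveWindow < 256_000) return { recent: 8, mid: 15 }
  if (effectiveWindow < 500_000) return { recent: 12, mid: 25 }
  return { recent: 25, mid: 50 }
}

// pos is the 0-based position among tool-result messages (0 = oldest).
function classifyTier(pos: number, total: number, tiers: Tiers): TierName {
  const fromEnd = total - 1 - pos
  if (fromEnd < tiers.recent) return 'recent'
  if (fromEnd < tiers.recent + tiers.mid) return 'mid'
  return 'old'
}

// Counts how many tool-result messages land in each tier.
function tierLayout(
  total: number,
  effectiveWindow: number,
): Record<TierName, number> {
  const tiers = getTiers(effectiveWindow)
  const counts: Record<TierName, number> = { old: 0, mid: 0, recent: 0 }
  for (let pos = 0; pos < total; pos++) {
    counts[classifyTier(pos, total, tiers)]++
  }
  return counts
}

console.log(tierLayout(30, 100_000)) // { old: 15, mid: 10, recent: 5 }
```

With a 100k effective window, 30 tool results split into 15 stubs, 10 truncated results, and 5 untouched results, which is the layout the shim-level compression tests assert.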
--- a/src/services/api/openaiErrorClassification.ts
+++ b/src/services/api/openaiErrorClassification.ts
@@ -320,10 +320,7 @@ export function classifyOpenAIHttpFailure(options: {
    }
  }
-  if (
-    (options.status >= 200 && options.status < 300 && isMalformedProviderResponse(body)) ||
-    (options.status >= 400 && isMalformedProviderResponse(body))
-  ) {
+  if (options.status >= 400 && isMalformedProviderResponse(body)) {
    return {
      source: 'http',
      category: 'malformed_provider_response',
--- a/src/services/api/openaiShim.compression.test.ts
+++ b/src/services/api/openaiShim.compression.test.ts
@@ -0,0 +1,317 @@
 import { afterEach, beforeEach, expect, mock, test } from 'bun:test'
 import { createOpenAIShimClient } from './openaiShim.js'
 type FetchType = typeof globalThis.fetch
 const originalFetch = globalThis.fetch
 const originalEnv = {
  OPENAI_BASE_URL: process.env.OPENAI_BASE_URL,
  OPENAI_API_KEY: process.env.OPENAI_API_KEY,
  OPENAI_MODEL: process.env.OPENAI_MODEL,
 }
 // Mock config + autoCompact so the shim sees deterministic state.
 const mockState = {
  enabled: true,
  effectiveWindow: 100_000, // Copilot gpt-4o tier
 }
 mock.module('../../utils/config.js', () => ({
  getGlobalConfig: () => ({
    toolHistoryCompressionEnabled: mockState.enabled,
    autoCompactEnabled: false,
  }),
 }))
 mock.module('../compact/autoCompact.js', () => ({
  getEffectiveContextWindowSize: () => mockState.effectiveWindow,
 }))
 type OpenAIShimClient = {
  beta: {
    messages: {
      create: (
        params: Record<string, unknown>,
        options?: Record<string, unknown>,
      ) => Promise<unknown>
    }
  }
 }
 function bigText(n: number): string {
  return 'A'.repeat(n)
 }
 function buildToolExchange(id: number, resultLength: number) {
  return [
    {
      role: 'assistant',
      content: [
        {
          type: 'tool_use',
          id: `toolu_${id}`,
          name: 'Read',
          input: { file_path: `/path/to/file${id}.ts` },
        },
      ],
    },
    {
      role: 'user',
      content: [
        {
          type: 'tool_result',
          tool_use_id: `toolu_${id}`,
          content: bigText(resultLength),
        },
      ],
    },
  ]
 }
 function buildLongConversation(numExchanges: number, resultLength = 5_000) {
  const out: Array<{ role: string; content: unknown }> = [
    { role: 'user', content: 'start the work' },
  ]
  for (let i = 0; i < numExchanges; i++) {
    out.push(...buildToolExchange(i, resultLength))
  }
  return out
 }
 function makeFakeResponse(): Response {
  return new Response(
    JSON.stringify({
      id: 'chatcmpl-1',
      model: 'gpt-4o',
      choices: [
        {
          message: { role: 'assistant', content: 'done' },
          finish_reason: 'stop',
        },
      ],
      usage: { prompt_tokens: 8, completion_tokens: 2, total_tokens: 10 },
    }),
    { headers: { 'Content-Type': 'application/json' } },
  )
 }
 beforeEach(() => {
  process.env.OPENAI_BASE_URL = 'http://example.test/v1'
  process.env.OPENAI_API_KEY = 'test-key'
  delete process.env.OPENAI_MODEL
  mockState.enabled = true
  mockState.effectiveWindow = 100_000
 })
 afterEach(() => {
  if (originalEnv.OPENAI_BASE_URL === undefined) delete process.env.OPENAI_BASE_URL
  else process.env.OPENAI_BASE_URL = originalEnv.OPENAI_BASE_URL
  if (originalEnv.OPENAI_API_KEY === undefined) delete process.env.OPENAI_API_KEY
  else process.env.OPENAI_API_KEY = originalEnv.OPENAI_API_KEY
  if (originalEnv.OPENAI_MODEL === undefined) delete process.env.OPENAI_MODEL
  else process.env.OPENAI_MODEL = originalEnv.OPENAI_MODEL
  globalThis.fetch = originalFetch
 })
 async function captureRequestBody(
  messages: Array<{ role: string; content: unknown }>,
  model: string,
 ): Promise<Record<string, unknown>> {
  let captured: Record<string, unknown> | undefined
  globalThis.fetch = (async (_input, init) => {
    captured = JSON.parse(String(init?.body))
    return makeFakeResponse()
  }) as FetchType
  const client = createOpenAIShimClient({}) as OpenAIShimClient
  await client.beta.messages.create({
    model,
    system: 'system prompt',
    messages,
  })
  if (!captured) throw new Error('request not captured')
  return captured
 }
 function getToolMessages(body: Record<string, unknown>): Array<{ content: string }> {
  const messages = body.messages as Array<{ role: string; content: string }>
  return messages.filter(m => m.role === 'tool')
 }
 function getAssistantToolCalls(body: Record<string, unknown>): unknown[] {
  const messages = body.messages as Array<{
    role: string
    tool_calls?: unknown[]
  }>
  return messages
    .filter(m => m.role === 'assistant' && Array.isArray(m.tool_calls))
    .flatMap(m => m.tool_calls ?? [])
 }
 // ============================================================================
 // BUG REPRO: without compression, full tool history is resent every turn
 // ============================================================================
 test('BUG REPRO: without compression, all 30 tool results are sent at full size', async () => {
  mockState.enabled = false
  const messages = buildLongConversation(30, 5_000)
  const body = await captureRequestBody(messages, 'gpt-4o')
  const toolMessages = getToolMessages(body)
  const payloadSize = JSON.stringify(body).length
  // All 30 tool results present, none truncated
  expect(toolMessages.length).toBe(30)
  for (const m of toolMessages) {
    expect(m.content.length).toBeGreaterThanOrEqual(5_000)
    expect(m.content).not.toContain('[…truncated')
    expect(m.content).not.toContain('chars omitted')
  }
  // Total payload is large (~150KB raw) — this is the cost being paid every turn
  expect(payloadSize).toBeGreaterThan(150_000)
 })
 // ============================================================================
 // FIX: with compression, recent kept full, mid truncated, old stubbed
 // ============================================================================
 test('FIX: with compression on Copilot gpt-4o (tier 5/10/rest), a 30-turn payload shrinks dramatically', async () => {
  mockState.enabled = true
  mockState.effectiveWindow = 100_000 // 64–128k → recent=5, mid=10
  const messages = buildLongConversation(30, 5_000)
  const body = await captureRequestBody(messages, 'gpt-4o')
  const toolMessages = getToolMessages(body)
  const payloadSize = JSON.stringify(body).length
  // Structure preserved: still 30 tool messages, no orphan tool_calls
  expect(toolMessages.length).toBe(30)
  expect(getAssistantToolCalls(body).length).toBe(30)
  // Tier breakdown (oldest → newest):
  //   indices 0..14  → old tier (stubs)
  //   indices 15..24 → mid tier (truncated)
  //   indices 25..29 → recent (full)
  for (let i = 0; i <= 14; i++) {
    expect(toolMessages[i].content).toMatch(/^\[Read args=.*chars omitted\]$/)
  }
  for (let i = 15; i <= 24; i++) {
    expect(toolMessages[i].content).toContain('[…truncated')
  }
  for (let i = 25; i <= 29; i++) {
    expect(toolMessages[i].content.length).toBe(5_000)
    expect(toolMessages[i].content).not.toContain('[…truncated')
    expect(toolMessages[i].content).not.toContain('chars omitted')
  }
  // Significant reduction: from ~150KB to <60KB (10 mid×2KB + structure overhead)
  expect(payloadSize).toBeLessThan(60_000)
 })
 // ============================================================================
 // FIX: large-context model gets generous tiers — compression effectively inert
 // ============================================================================
 test('FIX: gpt-4.1 (1M context) with 25 exchanges keeps all full (recent tier=25)', async () => {
  mockState.enabled = true
  mockState.effectiveWindow = 1_000_000 // ≥500k → recent=25, mid=50
  const messages = buildLongConversation(25, 5_000)
  const body = await captureRequestBody(messages, 'gpt-4.1')
  const toolMessages = getToolMessages(body)
  expect(toolMessages.length).toBe(25)
  for (const m of toolMessages) {
    expect(m.content.length).toBe(5_000)
    expect(m.content).not.toContain('[…truncated')
    expect(m.content).not.toContain('chars omitted')
  }
 })
 test('FIX: gpt-4.1 (1M context) with 30 exchanges → only first 5 mid-truncated', async () => {
  mockState.enabled = true
  mockState.effectiveWindow = 1_000_000 // recent=25, mid=50
  const messages = buildLongConversation(30, 5_000)
  const body = await captureRequestBody(messages, 'gpt-4.1')
  const toolMessages = getToolMessages(body)
  // 30 total: indices 0..4 mid, indices 5..29 recent
  for (let i = 0; i < 5; i++) {
    expect(toolMessages[i].content).toContain('[…truncated')
  }
  for (let i = 5; i < 30; i++) {
    expect(toolMessages[i].content.length).toBe(5_000)
  }
 })
 // ============================================================================
 // FIX: stub preserves tool name and args — model can re-invoke if needed
 // ============================================================================
 test('FIX: stub format includes original tool name and arguments', async () => {
  mockState.enabled = true
  mockState.effectiveWindow = 100_000
  const messages = buildLongConversation(30, 5_000)
  const body = await captureRequestBody(messages, 'gpt-4o')
  const toolMessages = getToolMessages(body)
  const oldestStub = toolMessages[0].content
  // Format: [<tool_name> args=<json> → <N> chars omitted]
  expect(oldestStub).toMatch(/^\[Read /)
  expect(oldestStub).toMatch(/file_path/)
  expect(oldestStub).toMatch(/→ 5000 chars omitted\]$/)
 })
 // ============================================================================
 // FIX: tool_use blocks (assistant tool_calls) are never modified
 // ============================================================================
 test('FIX: every tool_call retains its full id, name, and arguments', async () => {
  mockState.enabled = true
  mockState.effectiveWindow = 100_000
  const messages = buildLongConversation(30, 5_000)
  const body = await captureRequestBody(messages, 'gpt-4o')
  const toolCalls = getAssistantToolCalls(body) as Array<{
    id: string
    function: { name: string; arguments: string }
  }>
  expect(toolCalls.length).toBe(30)
  for (let i = 0; i < toolCalls.length; i++) {
    expect(toolCalls[i].id).toBe(`toolu_${i}`)
    expect(toolCalls[i].function.name).toBe('Read')
    expect(JSON.parse(toolCalls[i].function.arguments)).toEqual({
      file_path: `/path/to/file${i}.ts`,
    })
  }
 })
 // ============================================================================
 // FIX: small-context provider (Mistral 32k) gets aggressive compression
 // ============================================================================
 test('FIX: 32k window (Mistral tier) → recent=3 keeps last 3 only', async () => {
  mockState.enabled = true
  mockState.effectiveWindow = 24_000 // 16–32k → recent=3, mid=5
  const messages = buildLongConversation(15, 3_000)
  const body = await captureRequestBody(messages, 'mistral-large-latest')
  const toolMessages = getToolMessages(body)
  // 15 total: indices 0..6 old, 7..11 mid, 12..14 recent
  for (let i = 0; i <= 6; i++) {
    expect(toolMessages[i].content).toContain('chars omitted')
  }
  for (let i = 7; i <= 11; i++) {
    expect(toolMessages[i].content).toContain('[…truncated')
  }
  for (let i = 12; i <= 14; i++) {
    expect(toolMessages[i].content.length).toBe(3_000)
  }
 })
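The two marker formats these tests match against can be reproduced in isolation. The sketch below is a simplified restatement, assuming plain string content; `stubText` and `truncateText` are hypothetical names that work on strings rather than tool_result blocks, while the constants and the marker wording are taken from compressToolHistory.ts.

```typescript
// Illustrative sketch only. stubText and truncateText are hypothetical
// string-level helpers; the patch itself rewrites tool_result blocks.
const STUB_ARGS_MAX_CHARS = 200
const MID_MAX_CHARS = 2_000

// Old tier: the content is replaced by a one-line marker that keeps the tool
// name and (clamped) arguments so the call can be repeated if needed.
function stubText(name: string, input: unknown, originalLength: number): string {
  const args = input
    ? JSON.stringify(input).slice(0, STUB_ARGS_MAX_CHARS)
    : '{}'
  return `[${name} args=${args} → ${originalLength} chars omitted]`
}

// Mid tier: keep the head of the output and append a marker stating how many
// characters were cut. Short outputs are returned unchanged.
function truncateText(text: string, maxChars: number = MID_MAX_CHARS): string {
  if (text.length <= maxChars) return text
  const omitted = text.length - maxChars
  return `${text.slice(0, maxChars)}\n[…truncated ${omitted} chars from tool history]`
}

console.log(stubText('Read', { file_path: '/path/to/file0.ts' }, 5_000))
console.log(truncateText('A'.repeat(5_000)).slice(-45))
```

The stub keeps the tool name and arguments so the model can re-invoke the tool, and the trailing truncation marker lets it tell a short result apart from a long one that was cut.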
--- a/src/services/api/openaiShim.diagnostics.test.ts
+++ b/src/services/api/openaiShim.diagnostics.test.ts
@@ -117,3 +117,170 @@ test('redacts credentials in transport diagnostic URL logs', async () => {
  expect(logLine).not.toContain('user:supersecret')
  expect(logLine).not.toContain('supersecret@')
 })
 test('logs self-heal localhost fallback with redacted from/to URLs', async () => {
  const debugSpy = mock(() => {})
  mock.module('../../utils/debug.js', () => ({
    logForDebugging: debugSpy,
  }))
  const nonce = `${Date.now()}-${Math.random()}`
  const { createOpenAIShimClient } = await import(`./openaiShim.ts?ts=${nonce}`)
  process.env.OPENAI_BASE_URL = 'http://user:supersecret@localhost:11434/v1'
  process.env.OPENAI_API_KEY = 'supersecret'
  globalThis.fetch = mock(async (input: string | Request) => {
    const url = typeof input === 'string' ? input : input.url
    if (url.includes('localhost')) {
      throw Object.assign(new TypeError('fetch failed'), {
        code: 'ENOTFOUND',
      })
    }
    return new Response(
      JSON.stringify({
        id: 'chatcmpl-1',
        model: 'qwen2.5-coder:7b',
        choices: [
          {
            message: {
              role: 'assistant',
              content: 'ok',
            },
            finish_reason: 'stop',
          },
        ],
        usage: {
          prompt_tokens: 5,
          completion_tokens: 2,
          total_tokens: 7,
        },
      }),
      {
        status: 200,
        headers: {
          'Content-Type': 'application/json',
        },
      },
    )
  }) as typeof globalThis.fetch
  const client = createOpenAIShimClient({}) as {
    beta: {
      messages: {
        create: (params: Record<string, unknown>) => Promise<unknown>
      }
    }
  }
  await expect(
    client.beta.messages.create({
      model: 'qwen2.5-coder:7b',
      messages: [{ role: 'user', content: 'hello' }],
      max_tokens: 64,
      stream: false,
    }),
  ).resolves.toBeDefined()
  const fallbackLog = debugSpy.mock.calls.find(call =>
    typeof call?.[0] === 'string' &&
    call[0].includes('self-heal retry reason=localhost_resolution_failed'),
  )
  expect(fallbackLog).toBeDefined()
  const logLine = String(fallbackLog?.[0])
  expect(logLine).toContain('from=http://redacted:redacted@localhost:11434/v1/chat/completions')
  expect(logLine).toContain('to=http://redacted:redacted@127.0.0.1:11434/v1/chat/completions')
  expect(logLine).not.toContain('supersecret')
 })
 test('logs self-heal toolless retry for local tool-call incompatibility', async () => {
  const debugSpy = mock(() => {})
  mock.module('../../utils/debug.js', () => ({
    logForDebugging: debugSpy,
  }))
  const nonce = `${Date.now()}-${Math.random()}`
  const { createOpenAIShimClient } = await import(`./openaiShim.ts?ts=${nonce}`)
  process.env.OPENAI_BASE_URL = 'http://localhost:11434/v1'
  process.env.OPENAI_API_KEY = 'ollama'
  let callCount = 0
  globalThis.fetch = mock(async () => {
    callCount += 1
    if (callCount === 1) {
      return new Response('tool_calls are not supported', {
        status: 400,
        headers: {
          'Content-Type': 'text/plain',
        },
      })
    }
    return new Response(
      JSON.stringify({
        id: 'chatcmpl-1',
        model: 'qwen2.5-coder:7b',
        choices: [
          {
            message: {
              role: 'assistant',
              content: 'ok',
            },
            finish_reason: 'stop',
          },
        ],
        usage: {
          prompt_tokens: 7,
          completion_tokens: 3,
          total_tokens: 10,
        },
      }),
      {
        status: 200,
        headers: {
          'Content-Type': 'application/json',
        },
      },
    )
  }) as typeof globalThis.fetch
  const client = createOpenAIShimClient({}) as {
    beta: {
      messages: {
        create: (params: Record<string, unknown>) => Promise<unknown>
      }
    }
  }
  await expect(
    client.beta.messages.create({
      model: 'qwen2.5-coder:7b',
      messages: [{ role: 'user', content: 'hello' }],
      tools: [
        {
          name: 'Read',
          description: 'Read file',
          input_schema: {
            type: 'object',
            properties: {
              filePath: { type: 'string' },
            },
            required: ['filePath'],
          },
        },
      ],
      max_tokens: 64,
      stream: false,
    }),
  ).resolves.toBeDefined()
  const fallbackLog = debugSpy.mock.calls.find(call =>
    typeof call?.[0] === 'string' &&
    call[0].includes('self-heal retry reason=tool_call_incompatible mode=toolless'),
  )
  expect(fallbackLog).toBeDefined()
  expect(fallbackLog?.[1]).toEqual({ level: 'warn' })
 })
--- a/src/services/api/openaiShim.test.ts
+++ b/src/services/api/openaiShim.test.ts
@@ -2513,7 +2513,7 @@ test('non-streaming: real content takes precedence over reasoning_content', asyn
  ])
 })
-test('non-streaming: strips leaked reasoning preamble from assistant content', async () => {
+test('non-streaming: strips <think> tag block from assistant content', async () => {
  globalThis.fetch = (async () => {
    return new Response(
      JSON.stringify({
@@ -2524,7 +2524,7 @@ test('non-streaming: strips leaked reasoning preamble from assistant content', a
            message: {
              role: 'assistant',
              content:
-                'The user just said "hey" - a simple greeting. I should respond briefly and friendly.\n\nHey! How can I help you today?',
+                '<think>user wants a greeting, respond briefly</think>Hey! How can I help you today?',
            },
            finish_reason: 'stop',
          },
@@ -2645,7 +2645,7 @@ test('streaming: thinking block closed before tool call', async () => {
  expect(thinkingStart?.content_block?.type).toBe('thinking')
 })
-test('streaming: strips leaked reasoning preamble from assistant content deltas', async () => {
+test('streaming: strips <think> tag block from assistant content deltas', async () => {
  globalThis.fetch = (async () => {
    const chunks = makeStreamChunks([
      {
@@ -2658,7 +2658,7 @@ test('streaming: strips leaked reasoning preamble from assistant content deltas'
            delta: {
              role: 'assistant',
              content:
-                'The user just said "hey" - a simple greeting. I should respond briefly and friendly.\n\nHey! How can I help you today?',
+                '<think>user wants a greeting, respond briefly</think>Hey! How can I help you today?',
            },
            finish_reason: null,
          },
@@ -2700,10 +2700,10 @@ test('streaming: strips leaked reasoning preamble from assistant content deltas'
    }
  }
-  expect(textDeltas).toEqual(['Hey! How can I help you today?'])
+  expect(textDeltas.join('')).toBe('Hey! How can I help you today?')
 })
-test('streaming: strips leaked reasoning preamble when split across multiple content chunks', async () => {
+test('streaming: strips <think> tag split across multiple content chunks', async () => {
  globalThis.fetch = (async () => {
    const chunks = makeStreamChunks([
      {
@@ -2715,7 +2715,7 @@ test('streaming: strips leaked reasoning preamble when split across multiple con
            index: 0,
            delta: {
              role: 'assistant',
-              content: 'The user said "hey" - this is a simple greeting. ',
+              content: '<think>user wants a greeting,',
            },
            finish_reason: null,
          },
@@ -2729,8 +2729,21 @@ test('streaming: strips leaked reasoning preamble when split across multiple con
          {
            index: 0,
            delta: {
-              content:
-                'I should respond in a friendly, concise way.\n\nHey! How can I help you today?',
+              content: ' respond briefly</th',
+            },
            finish_reason: null,
          },
        ],
      },
      {
        id: 'chatcmpl-1',
        object: 'chat.completion.chunk',
        model: 'gpt-5-mini',
        choices: [
          {
            index: 0,
            delta: {
              content: 'ink>Hey! How can I help you today?',
            },
            finish_reason: null,
          },
@@ -2773,7 +2786,69 @@ test('streaming: strips leaked reasoning preamble when split across multiple con
    }
  }
-  expect(textDeltas).toEqual(['Hey! How can I help you today?'])
+  expect(textDeltas.join('')).toBe('Hey! How can I help you today?')
 })
 test('streaming: preserves prose without tags (no phrase-based false positive)', async () => {
  // Regression: older phrase-based sanitizer would strip "I should..." prose.
  // The tag-based approach leaves legitimate assistant output alone.
  globalThis.fetch = (async () => {
    const chunks = makeStreamChunks([
      {
        id: 'chatcmpl-1',
        object: 'chat.completion.chunk',
        model: 'gpt-5-mini',
        choices: [
          {
            index: 0,
            delta: {
              role: 'assistant',
              content:
                'I should note that the user role requires a briefly concise friendly response format.',
            },
            finish_reason: null,
          },
        ],
      },
      {
        id: 'chatcmpl-1',
        object: 'chat.completion.chunk',
        model: 'gpt-5-mini',
        choices: [
          {
            index: 0,
            delta: {},
            finish_reason: 'stop',
          },
        ],
      },
    ])
    return makeSseResponse(chunks)
  }) as FetchType
  const client = createOpenAIShimClient({}) as OpenAIShimClient
  const result = await client.beta.messages
    .create({
      model: 'gpt-5-mini',
      system: 'test system',
      messages: [{ role: 'user', content: 'hey' }],
      max_tokens: 64,
      stream: true,
    })
    .withResponse()
  const textDeltas: string[] = []
  for await (const event of result.data) {
    const delta = (event as { delta?: { type?: string; text?: string } }).delta
    if (delta?.type === 'text_delta' && typeof delta.text === 'string') {
      textDeltas.push(delta.text)
    }
  }
  expect(textDeltas.join('')).toBe(
    'I should note that the user role requires a briefly concise friendly response format.',
  )
 })
 test('classifies localhost transport failures with actionable category marker', async () => {
@@ -2856,6 +2931,204 @@ test('classifies chat-completions endpoint 404 failures with endpoint_not_found
    }),
  ).rejects.toThrow('openai_category=endpoint_not_found')
 })
 test('self-heals localhost resolution failures by retrying local loopback base URL', async () => {
  process.env.OPENAI_BASE_URL = 'http://localhost:11434/v1'
  const requestUrls: string[] = []
  globalThis.fetch = (async (input, _init) => {
    const url = typeof input === 'string' ? input : input.url
    requestUrls.push(url)
    if (url.includes('localhost')) {
      const error = Object.assign(new TypeError('fetch failed'), {
        code: 'ENOTFOUND',
      })
      throw error
    }
    return new Response(
      JSON.stringify({
        id: 'chatcmpl-1',
        model: 'qwen2.5-coder:7b',
        choices: [
          {
            message: {
              role: 'assistant',
              content: 'hello from loopback',
            },
            finish_reason: 'stop',
          },
        ],
        usage: {
          prompt_tokens: 4,
          completion_tokens: 3,
          total_tokens: 7,
        },
      }),
      {
        status: 200,
        headers: {
          'Content-Type': 'application/json',
        },
      },
    )
  }) as FetchType
  const client = createOpenAIShimClient({}) as OpenAIShimClient
  await expect(
    client.beta.messages.create({
      model: 'qwen2.5-coder:7b',
      messages: [{ role: 'user', content: 'hello' }],
      max_tokens: 64,
      stream: false,
    }),
  ).resolves.toBeDefined()
  expect(requestUrls[0]).toBe('http://localhost:11434/v1/chat/completions')
  expect(requestUrls).toContain('http://127.0.0.1:11434/v1/chat/completions')
 })
 test('self-heals local endpoint_not_found by retrying with /v1 base URL', async () => {
  process.env.OPENAI_BASE_URL = 'http://localhost:11434'
  const requestUrls: string[] = []
  globalThis.fetch = (async (input, _init) => {
    const url = typeof input === 'string' ? input : input.url
    requestUrls.push(url)
    if (url === 'http://localhost:11434/chat/completions') {
      return new Response('Not Found', {
        status: 404,
        headers: {
          'Content-Type': 'text/plain',
        },
      })
    }
    return new Response(
      JSON.stringify({
        id: 'chatcmpl-1',
        model: 'qwen2.5-coder:7b',
        choices: [
          {
            message: {
              role: 'assistant',
              content: 'hello from /v1',
            },
            finish_reason: 'stop',
          },
        ],
        usage: {
          prompt_tokens: 5,
          completion_tokens: 2,
          total_tokens: 7,
        },
      }),
      {
        status: 200,
        headers: {
          'Content-Type': 'application/json',
        },
      },
    )
  }) as FetchType
  const client = createOpenAIShimClient({}) as OpenAIShimClient
  await expect(
    client.beta.messages.create({
      model: 'qwen2.5-coder:7b',
      messages: [{ role: 'user', content: 'hello' }],
      max_tokens: 64,
      stream: false,
    }),
  ).resolves.toBeDefined()
  expect(requestUrls).toEqual([
    'http://localhost:11434/chat/completions',
    'http://localhost:11434/v1/chat/completions',
  ])
 })
 test('self-heals tool-call incompatibility by retrying local Ollama requests without tools', async () => {
  process.env.OPENAI_BASE_URL = 'http://localhost:11434/v1'
  const requestBodies: Array<Record<string, unknown>> = []
  globalThis.fetch = (async (_input, init) => {
    const requestBody = JSON.parse(String(init?.body)) as Record<string, unknown>
    requestBodies.push(requestBody)
    if (requestBodies.length === 1) {
      return new Response('tool_calls are not supported', {
        status: 400,
        headers: {
          'Content-Type': 'text/plain',
        },
      })
    }
    return new Response(
      JSON.stringify({
        id: 'chatcmpl-1',
        model: 'qwen2.5-coder:7b',
        choices: [
          {
            message: {
              role: 'assistant',
              content: 'fallback without tools',
            },
            finish_reason: 'stop',
          },
        ],
        usage: {
          prompt_tokens: 8,
          completion_tokens: 4,
          total_tokens: 12,
        },
      }),
      {
        status: 200,
        headers: {
          'Content-Type': 'application/json',
        },
      },
    )
  }) as FetchType
  const client = createOpenAIShimClient({}) as OpenAIShimClient
  await expect(
    client.beta.messages.create({
      model: 'qwen2.5-coder:7b',
      messages: [{ role: 'user', content: 'hello' }],
      tools: [
        {
          name: 'Read',
          description: 'Read a file',
          input_schema: {
            type: 'object',
            properties: {
              filePath: { type: 'string' },
            },
            required: ['filePath'],
          },
        },
      ],
      max_tokens: 64,
      stream: false,
    }),
  ).resolves.toBeDefined()
  expect(requestBodies).toHaveLength(2)
  expect(Array.isArray(requestBodies[0]?.tools)).toBe(true)
  expect(requestBodies[0]?.tool_choice).toBeUndefined()
  expect(
    requestBodies[1]?.tools === undefined ||
      (Array.isArray(requestBodies[1]?.tools) && requestBodies[1]?.tools.length === 0),
  ).toBe(true)
  expect(requestBodies[1]?.tool_choice).toBeUndefined()
 })
 test('preserves valid tool_result and drops orphan tool_result', async () => {
  let requestBody: Record<string, unknown> | undefined
@@ -2924,7 +3197,7 @@ test('preserves valid tool_result and drops orphan tool_result', async () => {
          {
            role: 'user',
            content: 'What happened?',
-          }
+          },
        ],
      },
    ],
@@ -2933,14 +3206,526 @@ test('preserves valid tool_result and drops orphan tool_result', async () => {
  })
  const messages = requestBody?.messages as Array<Record<string, unknown>>
-  
+
  // Should have: system, user, assistant (tool_use), tool (valid_call_1), user
  // Should NOT have: tool (orphan_call_2)
-  
+
  const toolMessages = messages.filter(m => m.role === 'tool')
  expect(toolMessages.length).toBe(1)
  expect(toolMessages[0].tool_call_id).toBe('valid_call_1')
-  
+
  const orphanMessage = toolMessages.find(m => m.tool_call_id === 'orphan_call_2')
  expect(orphanMessage).toBeUndefined()
  // Actually, the semantic message IS injected here because the user block with orphan 
  // tool result is converted to:
  // 1. Tool result (valid_call_1) -> role 'tool'
  // 2. User content ("What happened?") -> role 'user'
  // This triggers the tool -> assistant injection.
  const assistantMessages = messages.filter(m => m.role === 'assistant')
  expect(assistantMessages.some(m => m.content === '[Tool execution interrupted by user]')).toBe(true)
 })
 test('drops empty assistant message when only thinking block was present and stripped', async () => {
  let requestBody: Record<string, unknown> | undefined
  globalThis.fetch = (async (_input, init) => {
    requestBody = JSON.parse(String(init?.body))
    return new Response(JSON.stringify({
      id: 'chatcmpl-1',
      object: 'chat.completion',
      created: 123456789,
      model: 'mistral-large-latest',
      choices: [{ message: { role: 'assistant', content: 'hi' }, finish_reason: 'stop' }],
      usage: { prompt_tokens: 1, completion_tokens: 1, total_tokens: 2 }
    }), { headers: { 'Content-Type': 'application/json' } })
  }) as FetchType
  const client = createOpenAIShimClient({}) as OpenAIShimClient
  await client.beta.messages.create({
    model: 'mistral-large-latest',
    messages: [
      { role: 'user', content: 'Initial' },
      { role: 'assistant', content: [{ type: 'thinking', thinking: 'I am thinking...', signature: 'sig' }] },
      { role: 'user', content: 'Interrupting query' },
    ],
    max_tokens: 64,
    stream: false,
  })
  const messages = requestBody?.messages as Array<Record<string, unknown>>
  // The assistant msg is dropped because thinking is stripped.
  // The two user messages are coalesced.
  expect(messages.length).toBe(1)
  expect(messages[0].role).toBe('user')
  expect(String(messages[0].content)).toContain('Initial')
  expect(String(messages[0].content)).toContain('Interrupting query')
 })
 test('injects semantic assistant message when tool result is followed by user message', async () => {
  let requestBody: Record<string, unknown> | undefined
  globalThis.fetch = (async (_input, init) => {
    requestBody = JSON.parse(String(init?.body))
    return new Response(JSON.stringify({
      id: 'chatcmpl-2',
      object: 'chat.completion',
      created: 123456789,
      model: 'mistral-large-latest',
      choices: [{ message: { role: 'assistant', content: 'hi' }, finish_reason: 'stop' }],
      usage: { prompt_tokens: 1, completion_tokens: 1, total_tokens: 2 }
    }), { headers: { 'Content-Type': 'application/json' } })
  }) as FetchType
  const client = createOpenAIShimClient({}) as OpenAIShimClient
  await client.beta.messages.create({
    model: 'mistral-large-latest',
    messages: [
      { 
        role: 'assistant', 
        content: [{ type: 'tool_use', id: 'call_1', name: 'search', input: {} }] 
      },
      { 
        role: 'user', 
        content: [
          { type: 'tool_result', tool_use_id: 'call_1', content: 'Result' }
        ] 
      },
      { role: 'user', content: 'Next user query' },
    ],
    max_tokens: 64,
    stream: false,
  })
  const messages = requestBody?.messages as Array<Record<string, unknown>>
  // Roles should be: assistant (tool_calls) -> tool -> assistant (semantic) -> user
  const roles = messages.map(m => m.role)
  expect(roles).toEqual(['assistant', 'tool', 'assistant', 'user'])
  const semanticMsg = messages[2]
  expect(semanticMsg.role).toBe('assistant')
  expect(semanticMsg.content).toBe('[Tool execution interrupted by user]')
 })
 test('Moonshot: uses max_tokens (not max_completion_tokens) and strips store', async () => {
  process.env.OPENAI_BASE_URL = 'https://api.moonshot.ai/v1'
  process.env.OPENAI_API_KEY = 'sk-moonshot-test'
  let requestBody: Record<string, unknown> | undefined
  globalThis.fetch = (async (_input, init) => {
    requestBody = JSON.parse(String(init?.body))
    return new Response(
      JSON.stringify({
        id: 'chatcmpl-1',
        model: 'kimi-k2.6',
        choices: [
          { message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
        ],
        usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
      }),
      { headers: { 'Content-Type': 'application/json' } },
    )
  }) as FetchType
  const client = createOpenAIShimClient({}) as OpenAIShimClient
  await client.beta.messages.create({
    model: 'kimi-k2.6',
    system: 'you are kimi',
    messages: [{ role: 'user', content: 'hi' }],
    max_tokens: 256,
    stream: false,
  })
  expect(requestBody?.max_tokens).toBe(256)
  expect(requestBody?.max_completion_tokens).toBeUndefined()
  expect(requestBody?.store).toBeUndefined()
 })
 test('Moonshot: echoes reasoning_content on assistant tool-call messages', async () => {
  // Regression for: "API Error: 400 {"error":{"message":"thinking is enabled
  // but reasoning_content is missing in assistant tool call message at index
  // N"}}" when the agent sends a prior-turn assistant response back to Kimi.
  // The thinking block captured from the inbound response must round-trip
  // as reasoning_content on the outgoing echoed assistant message.
  process.env.OPENAI_BASE_URL = 'https://api.moonshot.ai/v1'
  process.env.OPENAI_API_KEY = 'sk-moonshot-test'
  let requestBody: Record<string, unknown> | undefined
  globalThis.fetch = (async (_input, init) => {
    requestBody = JSON.parse(String(init?.body))
    return new Response(
      JSON.stringify({
        id: 'chatcmpl-1',
        model: 'kimi-k2.6',
        choices: [
          { message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
        ],
        usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
      }),
      { headers: { 'Content-Type': 'application/json' } },
    )
  }) as FetchType
  const client = createOpenAIShimClient({}) as OpenAIShimClient
  await client.beta.messages.create({
    model: 'kimi-k2.6',
    system: 'you are kimi',
    messages: [
      { role: 'user', content: 'check the logs' },
      {
        role: 'assistant',
        content: [
          {
            type: 'thinking',
            thinking: 'Need to inspect logs via Bash; running a cat.',
          },
          { type: 'text', text: "I'll inspect the logs." },
          {
            type: 'tool_use',
            id: 'call_bash_1',
            name: 'Bash',
            input: { command: 'cat /tmp/app.log' },
          },
        ],
      },
      {
        role: 'user',
        content: [
          {
            type: 'tool_result',
            tool_use_id: 'call_bash_1',
            content: 'log line 1\nlog line 2',
          },
        ],
      },
    ],
    max_tokens: 256,
    stream: false,
  })
  const messages = requestBody?.messages as Array<Record<string, unknown>>
  const assistantWithToolCall = messages.find(
    m => m.role === 'assistant' && Array.isArray(m.tool_calls),
  )
  expect(assistantWithToolCall).toBeDefined()
  expect(assistantWithToolCall?.reasoning_content).toBe(
    'Need to inspect logs via Bash; running a cat.',
  )
 })
 test('non-Moonshot providers do NOT receive reasoning_content on assistant messages', async () => {
  // Guard: only Moonshot opts in. DeepSeek/OpenRouter/etc. receive the
  // outgoing assistant message without reasoning_content to avoid
  // unknown-field rejections from strict servers.
  process.env.OPENAI_BASE_URL = 'https://api.deepseek.com/v1'
  process.env.OPENAI_API_KEY = 'sk-deepseek'
  let requestBody: Record<string, unknown> | undefined
  globalThis.fetch = (async (_input, init) => {
    requestBody = JSON.parse(String(init?.body))
    return new Response(
      JSON.stringify({
        id: 'chatcmpl-1',
        model: 'deepseek-chat',
        choices: [
          { message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
        ],
        usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
      }),
      { headers: { 'Content-Type': 'application/json' } },
    )
  }) as FetchType
  const client = createOpenAIShimClient({}) as OpenAIShimClient
  await client.beta.messages.create({
    model: 'deepseek-chat',
    system: 'test',
    messages: [
      { role: 'user', content: 'hi' },
      {
        role: 'assistant',
        content: [
          { type: 'thinking', thinking: 'thought' },
          { type: 'text', text: 'hello' },
          {
            type: 'tool_use',
            id: 'call_1',
            name: 'Bash',
            input: { command: 'ls' },
          },
        ],
      },
      {
        role: 'user',
        content: [
          { type: 'tool_result', tool_use_id: 'call_1', content: 'files' },
        ],
      },
    ],
    max_tokens: 32,
    stream: false,
  })
  const messages = requestBody?.messages as Array<Record<string, unknown>>
  const assistantWithToolCall = messages.find(
    m => m.role === 'assistant' && Array.isArray(m.tool_calls),
  )
  expect(assistantWithToolCall).toBeDefined()
  expect(assistantWithToolCall?.reasoning_content).toBeUndefined()
 })
 test('Moonshot: cn host is also detected', async () => {
  process.env.OPENAI_BASE_URL = 'https://api.moonshot.cn/v1'
  process.env.OPENAI_API_KEY = 'sk-moonshot-test'
  let requestBody: Record<string, unknown> | undefined
  globalThis.fetch = (async (_input, init) => {
    requestBody = JSON.parse(String(init?.body))
    return new Response(
      JSON.stringify({
        id: 'chatcmpl-1',
        model: 'kimi-k2.6',
        choices: [
          { message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
        ],
        usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
      }),
      { headers: { 'Content-Type': 'application/json' } },
    )
  }) as FetchType
  const client = createOpenAIShimClient({}) as OpenAIShimClient
  await client.beta.messages.create({
    model: 'kimi-k2.6',
    system: 'you are kimi',
    messages: [{ role: 'user', content: 'hi' }],
    max_tokens: 256,
    stream: false,
  })
  expect(requestBody?.store).toBeUndefined()
 })
 test('collapses multiple text blocks in tool_result to string for DeepSeek compatibility (issue #774)', async () => {
  let requestBody: Record<string, unknown> | undefined
  globalThis.fetch = (async (_input, init) => {
    requestBody = JSON.parse(String(init?.body))
    return new Response(
      JSON.stringify({
        id: 'chatcmpl-1',
        model: 'deepseek-reasoner',
        choices: [
          {
            message: {
              role: 'assistant',
              content: 'done',
            },
            finish_reason: 'stop',
          },
        ],
        usage: {
          prompt_tokens: 12,
          completion_tokens: 4,
          total_tokens: 16,
        },
      }),
      {
        headers: {
          'Content-Type': 'application/json',
        },
      },
    )
  }) as FetchType
  const client = createOpenAIShimClient({}) as OpenAIShimClient
  await client.beta.messages.create({
    model: 'deepseek-reasoner',
    system: 'test system',
    messages: [
      { role: 'user', content: 'Run ls' },
      {
        role: 'assistant',
        content: [
          {
            type: 'tool_use',
            id: 'call_1',
            name: 'Bash',
            input: { command: 'ls' },
          },
        ],
      },
      {
        role: 'user',
        content: [
          {
            type: 'tool_result',
            tool_use_id: 'call_1',
            content: [
              { type: 'text', text: 'line one' },
              { type: 'text', text: 'line two' },
            ],
          },
        ],
      },
    ],
    max_tokens: 64,
    stream: false,
  })
  const messages = requestBody?.messages as Array<Record<string, unknown>>
  const toolMessages = messages.filter(m => m.role === 'tool')
  expect(toolMessages.length).toBe(1)
  expect(toolMessages[0].tool_call_id).toBe('call_1')
  expect(typeof toolMessages[0].content).toBe('string')
  expect(toolMessages[0].content).toBe('line one\n\nline two')
 })
 test('collapses multiple text blocks into a single string for DeepSeek compatibility (issue #774)', async () => {
  let requestBody: Record<string, unknown> | undefined
  globalThis.fetch = (async (_input, init) => {
    requestBody = JSON.parse(String(init?.body))
    return new Response(
      JSON.stringify({
        id: 'chatcmpl-1',
        model: 'deepseek-reasoner',
        choices: [
          {
            message: {
              role: 'assistant',
              content: 'done',
            },
            finish_reason: 'stop',
          },
        ],
        usage: {
          prompt_tokens: 12,
          completion_tokens: 4,
          total_tokens: 16,
        },
      }),
      {
        headers: {
          'Content-Type': 'application/json',
        },
      },
    )
  }) as FetchType
  const client = createOpenAIShimClient({}) as OpenAIShimClient
  await client.beta.messages.create({
    model: 'deepseek-reasoner',
    system: 'test system',
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: 'Hello!' },
          { type: 'text', text: 'How are you?' },
        ],
      },
    ],
    max_tokens: 64,
    stream: false,
  })
  const messages = requestBody?.messages as Array<Record<string, unknown>>
  expect(messages.length).toBe(2) // system + user
  expect(messages[1].role).toBe('user')
  expect(typeof messages[1].content).toBe('string')
  expect(messages[1].content).toBe('Hello!\n\nHow are you?')
 })
 test('preserves mixed text and image tool results as multipart content', async () => {
  let requestBody: Record<string, unknown> | undefined
  globalThis.fetch = (async (_input, init) => {
    requestBody = JSON.parse(String(init?.body))
    return new Response(
      JSON.stringify({
        id: 'chatcmpl-1',
        model: 'gpt-4o',
        choices: [
          {
            message: {
              role: 'assistant',
              content: 'done',
            },
            finish_reason: 'stop',
          },
        ],
        usage: {
          prompt_tokens: 12,
          completion_tokens: 4,
          total_tokens: 16,
        },
      }),
      {
        headers: {
          'Content-Type': 'application/json',
        },
      },
    )
  }) as FetchType
  const client = createOpenAIShimClient({}) as OpenAIShimClient
  await client.beta.messages.create({
    model: 'gpt-4o',
    system: 'test system',
    messages: [
      { role: 'user', content: 'Show me' },
      {
        role: 'assistant',
        content: [
          {
            type: 'tool_use',
            id: 'call_1',
            name: 'Bash',
            input: { command: 'cat image.png' },
          },
        ],
      },
      {
        role: 'user',
        content: [
          {
            type: 'tool_result',
            tool_use_id: 'call_1',
            content: [
              { type: 'text', text: 'Here is the image:' },
              {
                type: 'image',
                source: {
                  type: 'base64',
                  media_type: 'image/png',
                  data: 'iVBORw0KGgo=',
                },
              },
            ],
          },
        ],
      },
    ],
    max_tokens: 64,
    stream: false,
  })
  const messages = requestBody?.messages as Array<Record<string, unknown>>
  const toolMessages = messages.filter(m => m.role === 'tool')
  expect(toolMessages.length).toBe(1)
  expect(Array.isArray(toolMessages[0].content)).toBe(true)
  const content = toolMessages[0].content as Array<Record<string, unknown>>
  expect(content.length).toBe(2)
  expect(content[0].type).toBe('text')
  expect(content[1].type).toBe('image_url')
 })
--- a/src/services/api/openaiShim.ts
+++ b/src/services/api/openaiShim.ts
@@ -32,10 +32,9 @@ import { resolveGeminiCredential } from '../../utils/geminiAuth.js'
 import { hydrateGeminiAccessTokenFromSecureStorage } from '../../utils/geminiCredentials.js'
 import { hydrateGithubModelsTokenFromSecureStorage } from '../../utils/githubModelsCredentials.js'
 import {
-  looksLikeLeakedReasoningPrefix,
-  shouldBufferPotentialReasoningPrefix,
-  stripLeakedReasoningPreamble,
-} from './reasoningLeakSanitizer.js'
+  createThinkTagFilter,
+  stripThinkTags,
+} from './thinkTagSanitizer.js'
 import {
  codexStreamToAnthropic,
  collectCodexCompletedResponse,
@@ -47,12 +46,15 @@ import {
  type AnthropicUsage,
  type ShimCreateParams,
 } from './codexShim.js'
 import { compressToolHistory } from './compressToolHistory.js'
 import { fetchWithProxyRetry } from './fetchWithProxyRetry.js'
 import {
  getLocalProviderRetryBaseUrls,
  getGithubEndpointType,
  isLocalProviderUrl,
  resolveRuntimeCodexCredentials,
  resolveProviderRequest,
-  getGithubEndpointType,
+  shouldAttemptLocalToollessRetry,
 } from './providerConfig.js'
 import {
  buildOpenAICompatibilityErrorMessage,
@@ -65,6 +67,8 @@ import {
  normalizeToolArguments,
  hasToolFieldMapping,
 } from './toolArgumentNormalization.js'
 import { logApiCallStart, logApiCallEnd } from '../../utils/requestLogging.js'
 import { createStreamState, processStreamChunk, getStreamStats } from '../../utils/streamingOptimizer.js'
 type SecretValueSource = Partial<{
  OPENAI_API_KEY: string
@@ -80,6 +84,10 @@ const GITHUB_429_MAX_RETRIES = 3
 const GITHUB_429_BASE_DELAY_SEC = 1
 const GITHUB_429_MAX_DELAY_SEC = 32
 const GEMINI_API_HOST = 'generativelanguage.googleapis.com'
 const MOONSHOT_API_HOSTS = new Set([
  'api.moonshot.ai',
  'api.moonshot.cn',
 ])
 const COPILOT_HEADERS: Record<string, string> = {
  'User-Agent': 'GitHubCopilotChat/0.26.7',
@@ -145,6 +153,15 @@ function hasGeminiApiHost(baseUrl: string | undefined): boolean {
  }
 }
 function isMoonshotBaseUrl(baseUrl: string | undefined): boolean {
  if (!baseUrl) return false
  try {
    return MOONSHOT_API_HOSTS.has(new URL(baseUrl).hostname.toLowerCase())
  } catch {
    return false
  }
 }
 function formatRetryAfterHint(response: Response): string {
  const ra = response.headers.get('retry-after')
  return ra ? ` (Retry-After: ${ra})` : ''
@@ -201,6 +218,14 @@ interface OpenAIMessage {
  }>
  tool_call_id?: string
  name?: string
  /**
   * Per-assistant-message chain-of-thought, attached when echoing an
   * assistant message back to providers that require it (notably Moonshot:
   * "thinking is enabled but reasoning_content is missing in assistant
   * tool call message at index N" 400). Derived from the Anthropic thinking
   * block captured when the original response was translated.
   */
  reasoning_content?: string
 }
 interface OpenAITool {
@@ -276,6 +301,15 @@ function convertToolResultContent(
    const text = parts[0].text ?? ''
    return isError ? `Error: ${text}` : text
  }
  // Collapse arrays of only text blocks into a single string for DeepSeek
  // compatibility (issue #774). DeepSeek rejects arrays in role: "tool" messages.
  const allText = parts.every(p => p.type === 'text')
  if (allText) {
    const text = parts.map(p => p.text ?? '').join('\n\n')
    return isError ? `Error: ${text}` : text
  }
  if (isError && parts[0]?.type === 'text') {
    parts[0] = { ...parts[0], text: `Error: ${parts[0].text ?? ''}` }
  } else if (isError) {
@@ -334,6 +368,14 @@ function convertContentBlocks(
  if (parts.length === 0) return ''
  if (parts.length === 1 && parts[0].type === 'text') return parts[0].text ?? ''
  // Collapse arrays of only text blocks into a single string for DeepSeek
  // compatibility (issue #774).
  const allText = parts.every(p => p.type === 'text')
  if (allText) {
    return parts.map(p => p.text ?? '').join('\n\n')
  }
  return parts
 }
@@ -345,19 +387,45 @@ function isGeminiMode(): boolean {
 }
 function convertMessages(
-  messages: Array<{ role: string; message?: { role?: string; content?: unknown }; content?: unknown }>,
+  messages: Array<{
+    role: string
+    message?: { role?: string; content?: unknown }
+    content?: unknown
+  }>,
  system: unknown,
  options?: { preserveReasoningContent?: boolean },
 ): OpenAIMessage[] {
  const preserveReasoningContent = options?.preserveReasoningContent === true
  const result: OpenAIMessage[] = []
  const knownToolCallIds = new Set<string>()
  // Pre-scan for all tool results in the history to identify valid tool calls
  const toolResultIds = new Set<string>()
  for (const msg of messages) {
    const inner = msg.message ?? msg
    const content = (inner as { content?: unknown }).content
    if (Array.isArray(content)) {
      for (const block of content) {
        if (
          (block as { type?: string }).type === 'tool_result' &&
          (block as { tool_use_id?: string }).tool_use_id
        ) {
          toolResultIds.add((block as { tool_use_id: string }).tool_use_id)
        }
      }
    }
  }
  // System message first
  const sysText = convertSystemPrompt(system)
  if (sysText) {
    result.push({ role: 'system', content: sysText })
  }
-  for (const msg of messages) {
+  for (let i = 0; i < messages.length; i++) {
    const msg = messages[i]
    const isLastInHistory = i === messages.length - 1
    // Claude Code wraps messages in { role, message: { role, content } }
    const inner = msg.message ?? msg
    const role = (inner as { role?: string }).role ?? msg.role
@@ -366,8 +434,12 @@ function convertMessages(
    if (role === 'user') {
      // Check for tool_result blocks in user messages
      if (Array.isArray(content)) {
-        const toolResults = content.filter((b: { type?: string }) => b.type === 'tool_result')
-        const otherContent = content.filter((b: { type?: string }) => b.type !== 'tool_result')
+        const toolResults = content.filter(
+          (b: { type?: string }) => b.type === 'tool_result',
+        )
+        const otherContent = content.filter(
+          (b: { type?: string }) => b.type !== 'tool_result',
+        )
        // Emit tool results as tool messages, but ONLY if we have a matching tool_use ID.
        // Mistral/OpenAI strictly require tool messages to follow an assistant message with tool_calls.
@@ -382,7 +454,9 @@ function convertMessages(
              content: convertToolResultContent(tr.content, tr.is_error),
            })
          } else {
-            logForDebugging(`Dropping orphan tool_result for ID: ${id} to prevent API error`)
+            logForDebugging(
+              `Dropping orphan tool_result for ID: ${id} to prevent API error`,
+            )
          }
        }
@@ -402,8 +476,12 @@ function convertMessages(
    } else if (role === 'assistant') {
      // Check for tool_use blocks
      if (Array.isArray(content)) {
-        const toolUses = content.filter((b: { type?: string }) => b.type === 'tool_use')
-        const thinkingBlock = content.find((b: { type?: string }) => b.type === 'thinking')
+        const toolUses = content.filter(
+          (b: { type?: string }) => b.type === 'tool_use',
+        )
+        const thinkingBlock = content.find(
+          (b: { type?: string }) => b.type === 'thinking',
+        )
        const textContent = content.filter(
          (b: { type?: string }) => b.type !== 'tool_use' && b.type !== 'thinking',
        )
@@ -412,70 +490,123 @@ function convertMessages(
          role: 'assistant',
          content: (() => {
            const c = convertContentBlocks(textContent)
-            return typeof c === 'string' ? c : Array.isArray(c) ? c.map((p: { text?: string }) => p.text ?? '').join('') : ''
+            return typeof c === 'string'
+              ? c
+              : Array.isArray(c)
+                ? c.map((p: { text?: string }) => p.text ?? '').join('')
+                : ''
          })(),
        }
        // Providers that validate reasoning continuity (Moonshot: "thinking
        // is enabled but reasoning_content is missing in assistant tool call
        // message at index N" 400) need the original chain-of-thought echoed
        // back on each assistant message that carries a tool_call. We kept
        // the thinking block on the Anthropic side; re-attach it here as the
        // `reasoning_content` field on the outgoing OpenAI-shaped message.
        // Gated per-provider because other endpoints either ignore the field
        // (harmless) or strict-reject unknown fields (harmful).
        if (preserveReasoningContent) {
          const thinkingText = (thinkingBlock as { thinking?: string } | undefined)?.thinking
          if (typeof thinkingText === 'string' && thinkingText.trim().length > 0) {
            assistantMsg.reasoning_content = thinkingText
          }
        }
        if (toolUses.length > 0) {
-          assistantMsg.tool_calls = toolUses.map(
-            (tu: {
-              id?: string
-              name?: string
-              input?: unknown
-              extra_content?: Record<string, unknown>
-              signature?: string
-            }) => {
-              const id = tu.id ?? `call_${crypto.randomUUID().replace(/-/g, '')}`
-              knownToolCallIds.add(id)
-              const toolCall: NonNullable<OpenAIMessage['tool_calls']>[number] = {
-                id,
-                type: 'function' as const,
-                function: {
-                  name: tu.name ?? 'unknown',
-                  arguments:
-                    typeof tu.input === 'string'
-                      ? tu.input
-                      : JSON.stringify(tu.input ?? {}),
-                },
-              }
+          const mappedToolCalls = toolUses
+            .map(
+              (tu: {
+                id?: string
+                name?: string
+                input?: unknown
+                extra_content?: Record<string, unknown>
+                signature?: string
+              }) => {
+                const id = tu.id ?? `call_${crypto.randomUUID().replace(/-/g, '')}`
-              // Preserve existing extra_content if present
-              if (tu.extra_content) {
-                toolCall.extra_content = { ...tu.extra_content }
-              }
+                // Only keep tool calls that have a corresponding result in the history,
+                // or if it's the last message (prefill scenario).
+                // Orphaned tool calls (e.g. from user interruption) cause 400 errors.
+                if (!toolResultIds.has(id) && !isLastInHistory) {
+                  return null
+                }
-              // Handle Gemini thought_signature
-              if (isGeminiMode()) {
-                // If the model provided a signature in the tool_use block itself (e.g. from a previous Turn/Step)
-                // Use thinkingBlock.signature for ALL tool calls in the same assistant turn if available.
-                // The API requires the same signature on every replayed function call part in a parallel set.
-                const signature = tu.signature ?? (thinkingBlock as any)?.signature
+                knownToolCallIds.add(id)
+                const toolCall: NonNullable<
+                  OpenAIMessage['tool_calls']
+                >[number] = {
+                  id,
+                  type: 'function' as const,
+                  function: {
+                    name: tu.name ?? 'unknown',
+                    arguments:
+                      typeof tu.input === 'string'
+                        ? tu.input
+                        : JSON.stringify(tu.input ?? {}),
+                  },
+                }
-                // Merge into existing google-specific metadata if present
-                const existingGoogle = (toolCall.extra_content?.google as Record<string, unknown>) ?? {}
-                toolCall.extra_content = {
-                  ...toolCall.extra_content,
-                  google: {
-                    ...existingGoogle,
-                    thought_signature: signature ?? "skip_thought_signature_validator"
+                // Preserve existing extra_content if present
+                if (tu.extra_content) {
+                  toolCall.extra_content = { ...tu.extra_content }
+                }
+
+                // Handle Gemini thought_signature
+                if (isGeminiMode()) {
+                  // If the model provided a signature in the tool_use block itself (e.g. from a previous Turn/Step)
+                  // Use thinkingBlock.signature for ALL tool calls in the same assistant turn if available.
+                  // The API requires the same signature on every replayed function call part in a parallel set.
+                  const signature =
+                    tu.signature ?? (thinkingBlock as any)?.signature
+                  // Merge into existing google-specific metadata if present
+                  const existingGoogle =
+                    (toolCall.extra_content?.google as Record<
+                      string,
+                      unknown
+                    >) ?? {}
+                  toolCall.extra_content = {
+                    ...toolCall.extra_content,
+                    google: {
+                      ...existingGoogle,
+                      thought_signature:
+                        signature ?? 'skip_thought_signature_validator',
+                    },
                  }
                }
-              }
-              return toolCall
-            },
-          )
+                return toolCall
+              },
+            )
+            .filter((tc): tc is NonNullable<typeof tc> => tc !== null)
+          if (mappedToolCalls.length > 0) {
+            assistantMsg.tool_calls = mappedToolCalls
+          }
        }
-        result.push(assistantMsg)
+        // Only push assistant message if it has content or tool calls.
+        // Stripped thinking-only blocks from user interruptions are empty and cause 400s.
+        if (assistantMsg.content || assistantMsg.tool_calls?.length) {
+          result.push(assistantMsg)
+        }
      } else {
-        result.push({
+        const assistantMsg: OpenAIMessage = {
          role: 'assistant',
          content: (() => {
            const c = convertContentBlocks(content)
-            return typeof c === 'string' ? c : Array.isArray(c) ? c.map((p: { text?: string }) => p.text ?? '').join('') : ''
+            return typeof c === 'string'
+              ? c
+              : Array.isArray(c)
+                ? c.map((p: { text?: string }) => p.text ?? '').join('')
+                : ''
          })(),
-        })
+        }
+        if (assistantMsg.content) {
+          result.push(assistantMsg)
+        }
      }
    }
  }
@@ -489,25 +620,56 @@ function convertMessages(
  for (const msg of result) {
    const prev = coalesced[coalesced.length - 1]
-    if (prev && prev.role === msg.role && msg.role !== 'tool' && msg.role !== 'system') {
-      const prevContent = prev.content
+    // Mistral/Devstral: 'tool' message must be followed by an 'assistant' message.
+    // If a 'tool' result is followed by a 'user' message, we must inject a semantic
+    // assistant response to satisfy the strict role sequence:
+    // ... -> assistant (calls) -> tool (results) -> assistant (semantic) -> user (next)
+    if (prev && prev.role === 'tool' && msg.role === 'user') {
+      coalesced.push({
+        role: 'assistant',
+        content: '[Tool execution interrupted by user]',
+      })
+    }
+    const lastAfterPossibleInjection = coalesced[coalesced.length - 1]
+    if (
+      lastAfterPossibleInjection &&
+      lastAfterPossibleInjection.role === msg.role &&
+      msg.role !== 'tool' &&
+      msg.role !== 'system'
+    ) {
+      const prevContent = lastAfterPossibleInjection.content
      const curContent = msg.content
      if (typeof prevContent === 'string' && typeof curContent === 'string') {
-        prev.content = prevContent + (prevContent && curContent ? '\n' : '') + curContent
+        lastAfterPossibleInjection.content =
+          prevContent + (prevContent && curContent ? '\n' : '') + curContent
      } else {
        const toArray = (
-          c: string | Array<{ type: string; text?: string; image_url?: { url: string } }> | undefined,
-        ): Array<{ type: string; text?: string; image_url?: { url: string } }> => {
+          c:
+            | string
+            | Array<{ type: string; text?: string; image_url?: { url: string } }>
+            | undefined,
+        ): Array<{
+          type: string
+          text?: string
+          image_url?: { url: string }
+        }> => {
          if (!c) return []
          if (typeof c === 'string') return c ? [{ type: 'text', text: c }] : []
          return c
        }
-        prev.content = [...toArray(prevContent), ...toArray(curContent)]
+        lastAfterPossibleInjection.content = [
+          ...toArray(prevContent),
+          ...toArray(curContent),
+        ]
      }
      if (msg.tool_calls?.length) {
-        prev.tool_calls = [...(prev.tool_calls ?? []), ...msg.tool_calls]
+        lastAfterPossibleInjection.tool_calls = [
+          ...(lastAfterPossibleInjection.tool_calls ?? []),
+          ...msg.tool_calls,
+        ]
      }
    } else {
      coalesced.push(msg)
@@ -718,11 +880,11 @@ async function* openaiStreamToAnthropic(
  let hasEmittedContentStart = false
  let hasEmittedThinkingStart = false
  let hasClosedThinking = false
-  let activeTextBuffer = ''
-  let textBufferMode: 'none' | 'pending' | 'strip' = 'none'
+  const thinkFilter = createThinkTagFilter()
  let lastStopReason: 'tool_use' | 'max_tokens' | 'end_turn' | null = null
  let hasEmittedFinalUsage = false
  let hasProcessedFinishReason = false
+  const streamState = createStreamState()
  // Emit message_start
  yield {
@@ -798,14 +960,12 @@ async function* openaiStreamToAnthropic(
  const closeActiveContentBlock = async function* () {
    if (!hasEmittedContentStart) return
-    if (textBufferMode !== 'none') {
-      const sanitized = stripLeakedReasoningPreamble(activeTextBuffer)
-      if (sanitized) {
-        yield {
-          type: 'content_block_delta',
-          index: contentBlockIndex,
-          delta: { type: 'text_delta', text: sanitized },
-        }
+    const tail = thinkFilter.flush()
+    if (tail) {
+      yield {
+        type: 'content_block_delta',
+        index: contentBlockIndex,
+        delta: { type: 'text_delta', text: tail },
      }
    }
@@ -815,8 +975,6 @@ async function* openaiStreamToAnthropic(
    }
    contentBlockIndex++
    hasEmittedContentStart = false
-    activeTextBuffer = ''
-    textBufferMode = 'none'
  }
  try {
@@ -873,7 +1031,6 @@ async function* openaiStreamToAnthropic(
            contentBlockIndex++
            hasClosedThinking = true
          }
-          activeTextBuffer += delta.content
          if (!hasEmittedContentStart) {
            yield {
              type: 'content_block_start',
@@ -883,39 +1040,15 @@ async function* openaiStreamToAnthropic(
            hasEmittedContentStart = true
          }
-          if (
-            textBufferMode === 'strip' ||
-            looksLikeLeakedReasoningPrefix(activeTextBuffer)
-          ) {
-            textBufferMode = 'strip'
-            continue
-          }
-          if (textBufferMode === 'pending') {
-            if (shouldBufferPotentialReasoningPrefix(activeTextBuffer)) {
-              continue
-            }
+          const visible = thinkFilter.feed(delta.content)
+          if (visible) {
            yield {
              type: 'content_block_delta',
              index: contentBlockIndex,
-              delta: {
-                type: 'text_delta',
-                text: activeTextBuffer,
-              },
+              delta: { type: 'text_delta', text: visible },
            }
-            textBufferMode = 'none'
-            continue
-          }
-          if (shouldBufferPotentialReasoningPrefix(activeTextBuffer)) {
-            textBufferMode = 'pending'
-            continue
-          }
-          yield {
-            type: 'content_block_delta',
-            index: contentBlockIndex,
-            delta: { type: 'text_delta', text: delta.content },
          }
+          processStreamChunk(streamState, delta.content)
        }
        // Tool calls
@@ -935,6 +1068,7 @@ async function* openaiStreamToAnthropic(
              const toolBlockIndex = contentBlockIndex
              const initialArguments = tc.function.arguments ?? ''
              const normalizeAtStop = hasToolFieldMapping(tc.function.name)
              processStreamChunk(streamState, tc.function.arguments ?? '')
              activeToolCalls.set(tc.index, {
                id: tc.id,
                name: tc.function.name,
@@ -1132,6 +1266,20 @@ async function* openaiStreamToAnthropic(
    reader.releaseLock()
  }
  const stats = getStreamStats(streamState)
  if (stats.totalChunks > 0) {
    logForDebugging(
      JSON.stringify({
        type: 'stream_stats',
        model,
        total_chunks: stats.totalChunks,
        first_token_ms: stats.firstTokenMs,
        duration_ms: stats.durationMs,
      }),
      { level: 'debug' },
    )
  }
  yield { type: 'message_stop' }
 }
@@ -1329,14 +1477,20 @@ class OpenAIShimMessages {
    params: ShimCreateParams,
    options?: { signal?: AbortSignal; headers?: Record<string, string> },
  ): Promise<Response> {
-    const openaiMessages = convertMessages(
+    const compressedMessages = compressToolHistory(
      params.messages as Array<{
        role: string
        message?: { role?: string; content?: unknown }
        content?: unknown
      }>,
-      params.system,
+      request.resolvedModel,
    )
+    const openaiMessages = convertMessages(compressedMessages, params.system, {
+      // Moonshot requires every assistant tool-call message to carry
+      // reasoning_content when its thinking feature is active. Echo it back
+      // from the thinking block we captured on the inbound response.
+      preserveReasoningContent: isMoonshotBaseUrl(request.baseUrl),
+    })
    const body: Record<string, unknown> = {
      model: request.resolvedModel,
@@ -1372,14 +1526,19 @@ class OpenAIShimMessages {
    const isGithubCopilot = isGithub && githubEndpointType === 'copilot'
    const isGithubModels = isGithub && (githubEndpointType === 'models' || githubEndpointType === 'custom')
-    if ((isGithub || isMistral || isLocal) && body.max_completion_tokens !== undefined) {
+    const isMoonshot = isMoonshotBaseUrl(request.baseUrl)
+    if ((isGithub || isMistral || isLocal || isMoonshot) && body.max_completion_tokens !== undefined) {
      body.max_tokens = body.max_completion_tokens
      delete body.max_completion_tokens
    }
    // mistral and gemini don't recognize body.store — Gemini returns 400
    // "Invalid JSON payload received. Unknown name 'store': Cannot find field."
-    if (isMistral || isGeminiMode()) {
+    // Moonshot (api.moonshot.ai/.cn) has not published support for the
+    // parameter either; strip it preemptively to avoid the same class of
+    // error on strict-parse providers.
+    if (isMistral || isGeminiMode() || isMoonshot) {
      delete body.store
    }
@@ -1459,48 +1618,95 @@ class OpenAIShimMessages {
      headers['X-GitHub-Api-Version'] = '2022-11-28'
    }
-    // Build the chat completions URL
-    // Azure Cognitive Services / Azure OpenAI require a deployment-specific path
-    // and an api-version query parameter.
-    // Standard format: {base}/openai/deployments/{model}/chat/completions?api-version={version}
-    // Non-Azure: {base}/chat/completions
-    let chatCompletionsUrl: string
-    if (isAzure) {
-      const apiVersion = process.env.AZURE_OPENAI_API_VERSION ?? '2024-12-01-preview'
-      const deployment = request.resolvedModel ?? process.env.OPENAI_MODEL ?? 'gpt-4o'
-      // If base URL already contains /deployments/, use it as-is with api-version
-      if (/\/deployments\//i.test(request.baseUrl)) {
-        const base = request.baseUrl.replace(/\/+$/, '')
-        chatCompletionsUrl = `${base}/chat/completions?api-version=${apiVersion}`
-      } else {
-        // Strip trailing /v1 or /openai/v1 if present, then build Azure path
-        const base = request.baseUrl.replace(/\/(openai\/)?v1\/?$/, '').replace(/\/+$/, '')
-        chatCompletionsUrl = `${base}/openai/deployments/${deployment}/chat/completions?api-version=${apiVersion}`
+    const buildChatCompletionsUrl = (baseUrl: string): string => {
+      // Azure Cognitive Services / Azure OpenAI require a deployment-specific
+      // path and an api-version query parameter.
+      if (isAzure) {
+        const apiVersion = process.env.AZURE_OPENAI_API_VERSION ?? '2024-12-01-preview'
+        const deployment = request.resolvedModel ?? process.env.OPENAI_MODEL ?? 'gpt-4o'
+
+        // If base URL already contains /deployments/, use it as-is with api-version.
+        if (/\/deployments\//i.test(baseUrl)) {
+          const normalizedBase = baseUrl.replace(/\/+$/, '')
+          return `${normalizedBase}/chat/completions?api-version=${apiVersion}`
+        }
+
+        // Strip trailing /v1 or /openai/v1 if present, then build Azure path.
+        const normalizedBase = baseUrl
+          .replace(/\/(openai\/)?v1\/?$/, '')
+          .replace(/\/+$/, '')
+        return `${normalizedBase}/openai/deployments/${deployment}/chat/completions?api-version=${apiVersion}`
      }
-    } else {
-      chatCompletionsUrl = `${request.baseUrl}/chat/completions`
+
+      return `${baseUrl}/chat/completions`
    }
-    const fetchInit = {
+    const localRetryBaseUrls = isLocal
      ? getLocalProviderRetryBaseUrls(request.baseUrl)
      : []
    let activeBaseUrl = request.baseUrl
    let chatCompletionsUrl = buildChatCompletionsUrl(activeBaseUrl)
    const attemptedLocalBaseUrls = new Set<string>([activeBaseUrl])
    let didRetryWithoutTools = false
    const promoteNextLocalBaseUrl = (
      reason: 'endpoint_not_found' | 'localhost_resolution_failed',
    ): boolean => {
      for (const candidateBaseUrl of localRetryBaseUrls) {
        if (attemptedLocalBaseUrls.has(candidateBaseUrl)) {
          continue
        }
        const previousUrl = chatCompletionsUrl
        attemptedLocalBaseUrls.add(candidateBaseUrl)
        activeBaseUrl = candidateBaseUrl
        chatCompletionsUrl = buildChatCompletionsUrl(activeBaseUrl)
        logForDebugging(
          `[OpenAIShim] self-heal retry reason=${reason} method=POST from=${redactUrlForDiagnostics(previousUrl)} to=${redactUrlForDiagnostics(chatCompletionsUrl)} model=${request.resolvedModel}`,
          { level: 'warn' },
        )
        return true
      }
      return false
    }
    let serializedBody = JSON.stringify(body)
    const refreshSerializedBody = (): void => {
      serializedBody = JSON.stringify(body)
    }
    const buildFetchInit = () => ({
      method: 'POST' as const,
      headers,
-      body: JSON.stringify(body),
+      body: serializedBody,
      signal: options?.signal,
-    }
+    })
-    const maxAttempts = isGithub ? GITHUB_429_MAX_RETRIES : 1
+    const maxSelfHealAttempts = isLocal
      ? localRetryBaseUrls.length + 1
      : 0
    const maxAttempts = (isGithub ? GITHUB_429_MAX_RETRIES : 1) + maxSelfHealAttempts
    const throwClassifiedTransportError = (
      error: unknown,
      requestUrl: string,
      preclassifiedFailure?: ReturnType<typeof classifyOpenAINetworkFailure>,
    ): never => {
      if (options?.signal?.aborted) {
        throw error
      }
-      const failure = classifyOpenAINetworkFailure(error, {
-        url: requestUrl,
-      })
+      const failure =
+        preclassifiedFailure ??
+        classifyOpenAINetworkFailure(error, {
+          url: requestUrl,
+        })
      const redactedUrl = redactUrlForDiagnostics(requestUrl)
      const safeMessage =
        redactSecretValueForDisplay(
@@ -1531,11 +1737,14 @@ class OpenAIShimMessages {
      responseHeaders: Headers,
      requestUrl: string,
      rateHint = '',
      preclassifiedFailure?: ReturnType<typeof classifyOpenAIHttpFailure>,
    ): never => {
-      const failure = classifyOpenAIHttpFailure({
-        status,
-        body: errorBody,
-      })
+      const failure =
+        preclassifiedFailure ??
+        classifyOpenAIHttpFailure({
+          status,
+          body: errorBody,
+        })
      const redactedUrl = redactUrlForDiagnostics(requestUrl)
      logForDebugging(
@@ -1555,12 +1764,21 @@ class OpenAIShimMessages {
    }
    let response: Response | undefined
    const provider = request.baseUrl.includes('nvidia') ? 'nvidia-nim'
      : request.baseUrl.includes('minimax') ? 'minimax'
      : request.baseUrl.includes('localhost:11434') || request.baseUrl.includes('localhost:11435') ? 'ollama'
      : request.baseUrl.includes('anthropic') ? 'anthropic'
      : 'openai'
    const { correlationId, startTime } = logApiCallStart(provider, request.resolvedModel)
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
      try {
-        response = await fetchWithProxyRetry(chatCompletionsUrl, fetchInit)
+        response = await fetchWithProxyRetry(
          chatCompletionsUrl,
          buildFetchInit(),
        )
      } catch (error) {
        const isAbortError =
-          fetchInit.signal?.aborted === true ||
+          options?.signal?.aborted === true ||
          (typeof DOMException !== 'undefined' &&
            error instanceof DOMException &&
            error.name === 'AbortError') ||
@@ -1573,10 +1791,36 @@ class OpenAIShimMessages {
          throw error
        }
-        throwClassifiedTransportError(error, chatCompletionsUrl)
+        const failure = classifyOpenAINetworkFailure(error, {
          url: chatCompletionsUrl,
        })
        if (
          isLocal &&
          failure.category === 'localhost_resolution_failed' &&
          promoteNextLocalBaseUrl('localhost_resolution_failed')
        ) {
          continue
        }
        throwClassifiedTransportError(error, chatCompletionsUrl, failure)
      }
      if (response.ok) {
        let tokensIn = 0
        let tokensOut = 0
        // Skip clone() for streaming responses - it blocks until full body is received,
        // defeating the purpose of streaming. Usage data is already sent via
        // stream_options: { include_usage: true } and can be extracted from the stream.
        if (!params.stream) {
          try {
            const clone = response.clone()
            const data = await clone.json()
            tokensIn = data.usage?.prompt_tokens ?? 0
            tokensOut = data.usage?.completion_tokens ?? 0
          } catch { /* ignore */ }
        }
        logApiCallEnd(correlationId, startTime, request.resolvedModel, 'success', tokensIn, tokensOut, false)
        return response
      }
@@ -1665,6 +1909,10 @@ class OpenAIShimMessages {
            return responsesResponse
          }
          const responsesErrorBody = await responsesResponse.text().catch(() => 'unknown error')
          const responsesFailure = classifyOpenAIHttpFailure({
            status: responsesResponse.status,
            body: responsesErrorBody,
          })
          let responsesErrorResponse: object | undefined
          try { responsesErrorResponse = JSON.parse(responsesErrorBody) } catch { /* raw text */ }
          throwClassifiedHttpError(
@@ -1673,10 +1921,49 @@ class OpenAIShimMessages {
            responsesErrorResponse,
            responsesResponse.headers,
            responsesUrl,
            '',
            responsesFailure,
          )
        }
      }
      const failure = classifyOpenAIHttpFailure({
        status: response.status,
        body: errorBody,
      })
      if (
        isLocal &&
        failure.category === 'endpoint_not_found' &&
        promoteNextLocalBaseUrl('endpoint_not_found')
      ) {
        continue
      }
      const hasToolsPayload =
        Array.isArray(body.tools) &&
        body.tools.length > 0
      if (
        !didRetryWithoutTools &&
        failure.category === 'tool_call_incompatible' &&
        shouldAttemptLocalToollessRetry({
          baseUrl: activeBaseUrl,
          hasTools: hasToolsPayload,
        })
      ) {
        didRetryWithoutTools = true
        delete body.tools
        delete body.tool_choice
        refreshSerializedBody()
        logForDebugging(
          `[OpenAIShim] self-heal retry reason=tool_call_incompatible mode=toolless method=POST url=${redactUrlForDiagnostics(chatCompletionsUrl)} model=${request.resolvedModel}`,
          { level: 'warn' },
        )
        continue
      }
      let errorResponse: object | undefined
      try { errorResponse = JSON.parse(errorBody) } catch { /* raw text */ }
      throwClassifiedHttpError(
@@ -1686,6 +1973,7 @@ class OpenAIShimMessages {
        response.headers as unknown as Headers,
        chatCompletionsUrl,
        rateHint,
        failure,
      )
    }
@@ -1742,7 +2030,7 @@ class OpenAIShimMessages {
    if (typeof rawContent === 'string' && rawContent) {
      content.push({
        type: 'text',
-        text: stripLeakedReasoningPreamble(rawContent),
+        text: stripThinkTags(rawContent),
      })
    } else if (Array.isArray(rawContent) && rawContent.length > 0) {
      const parts: string[] = []
@@ -1760,7 +2048,7 @@ class OpenAIShimMessages {
      if (joined) {
        content.push({
          type: 'text',
-          text: stripLeakedReasoningPreamble(joined),
+          text: stripThinkTags(joined),
        })
      }
    }
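Reviewer note: the streaming path above now relies on a filter object with `feed(chunk)` (returns the text that is safe to emit) and `flush()` (returns whatever is still buffered when the block closes). The contents of `thinkTagSanitizer.ts` are not shown in this part of the diff, so the following is only a minimal sketch of how such a filter could behave. The `<think>`/`</think>` tag names, the function name `createSketchThinkFilter`, and the choice to drop an unterminated think block on flush are all assumptions, not the repository's implementation.

```typescript
// Hypothetical sketch; NOT the code in thinkTagSanitizer.ts.
interface SketchThinkFilter {
  feed(chunk: string): string
  flush(): string
}

// Assumed tag names.
const OPEN_TAG = '<think>'
const CLOSE_TAG = '</think>'

// Length of the longest suffix of `text` that is a proper prefix of `tag`.
function partialTagSuffix(text: string, tag: string): number {
  const max = Math.min(text.length, tag.length - 1)
  for (let len = max; len > 0; len--) {
    if (text.endsWith(tag.slice(0, len))) return len
  }
  return 0
}

function createSketchThinkFilter(): SketchThinkFilter {
  let buffer = ''
  let insideThink = false

  const drain = (): string => {
    let visible = ''
    // Consume every complete tag currently in the buffer.
    for (;;) {
      const tag = insideThink ? CLOSE_TAG : OPEN_TAG
      const at = buffer.indexOf(tag)
      if (at === -1) break
      if (!insideThink) visible += buffer.slice(0, at)
      buffer = buffer.slice(at + tag.length)
      insideThink = !insideThink
    }
    // Hold back only a possible partial tag at the very end of the buffer.
    const pending = insideThink ? CLOSE_TAG : OPEN_TAG
    const keep = partialTagSuffix(buffer, pending)
    if (!insideThink) visible += buffer.slice(0, buffer.length - keep)
    buffer = buffer.slice(buffer.length - keep)
    return visible
  }

  return {
    feed(chunk: string): string {
      buffer += chunk
      return drain()
    },
    flush(): string {
      // Assumption: an unterminated think block is dropped rather than emitted.
      const tail = insideThink ? '' : buffer
      buffer = ''
      insideThink = false
      return tail
    },
  }
}
```

The point of the hold-back step is that a tag split across two SSE chunks (for example `<thi` followed by `nk>`) is never emitted as visible text, while ordinary text ending in `<` is released by `flush()` once the block closes.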
--- a/src/services/api/providerConfig.local.test.ts
+++ b/src/services/api/providerConfig.local.test.ts
@@ -2,8 +2,10 @@ import { afterEach, expect, test } from 'bun:test'
 import {
  getAdditionalModelOptionsCacheScope,
  getLocalProviderRetryBaseUrls,
  isLocalProviderUrl,
  resolveProviderRequest,
  shouldAttemptLocalToollessRetry,
 } from './providerConfig.js'
 const originalEnv = {
@@ -83,3 +85,42 @@ test('skips local model cache scope for remote openai-compatible providers', ()
  expect(getAdditionalModelOptionsCacheScope()).toBeNull()
 })
 test('derives local retry base URLs with /v1 and loopback fallback candidates', () => {
  expect(getLocalProviderRetryBaseUrls('http://localhost:11434')).toEqual([
    'http://localhost:11434/v1',
    'http://127.0.0.1:11434',
    'http://127.0.0.1:11434/v1',
  ])
 })
 test('does not derive local retry base URLs for remote providers', () => {
  expect(getLocalProviderRetryBaseUrls('https://api.openai.com/v1')).toEqual([])
 })
 test('enables local toolless retry for likely Ollama endpoints with tools', () => {
  expect(
    shouldAttemptLocalToollessRetry({
      baseUrl: 'http://localhost:11434/v1',
      hasTools: true,
    }),
  ).toBe(true)
 })
 test('disables local toolless retry when no tools are present', () => {
  expect(
    shouldAttemptLocalToollessRetry({
      baseUrl: 'http://localhost:11434/v1',
      hasTools: false,
    }),
  ).toBe(false)
 })
 test('disables local toolless retry for non-Ollama local endpoints', () => {
  expect(
    shouldAttemptLocalToollessRetry({
      baseUrl: 'http://localhost:1234/v1',
      hasTools: true,
    }),
  ).toBe(false)
 })
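The candidate order asserted in the tests above can be reproduced with a short standalone sketch. This is not the code in `providerConfig.ts`: `retryCandidates` and `firstReachable` are hypothetical names, and the local-host check is reduced to `localhost` and `127.0.0.1` as an assumed stand-in for `isLocalProviderUrl`, whose body is not shown in this diff.

```typescript
// Hypothetical, simplified stand-in for getLocalProviderRetryBaseUrls.
// Assumption: only "localhost" and "127.0.0.1" count as local here.
function trimSlash(value: string): string {
  return value.replace(/\/+$/, '')
}

function retryCandidates(baseUrl: string): string[] {
  const parsed = new URL(baseUrl)
  const host = parsed.hostname.toLowerCase()
  if (host !== 'localhost' && host !== '127.0.0.1') return []

  // The original URL is pre-seeded so it is never offered as its own retry.
  const seen = new Set<string>([trimSlash(parsed.toString())])
  const out: string[] = []
  const add = (hostname: string, pathname: string): void => {
    const next = new URL(parsed.toString())
    next.hostname = hostname
    next.pathname = pathname
    const normalized = trimSlash(next.toString())
    if (seen.has(normalized)) return
    seen.add(normalized)
    out.push(normalized)
  }

  const path = trimSlash(parsed.pathname)
  const v1 = path.toLowerCase().endsWith('/v1') ? path : `${path}/v1`
  if (v1 !== path) add(parsed.hostname, v1) // same host, /v1 appended
  if (host === 'localhost') {
    add('127.0.0.1', parsed.pathname || '/') // loopback, original path
    add('127.0.0.1', v1) // loopback, /v1 path
  }
  return out
}

// Hypothetical caller: pick the first candidate that a probe accepts.
function firstReachable(
  urls: string[],
  probe: (url: string) => boolean,
): string | undefined {
  return urls.find(probe)
}
```

Pre-seeding the `seen` set with the original URL is what keeps a base URL that already ends in `/v1` from being retried against itself; in that case only the loopback variant remains as a candidate.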
--- a/src/services/api/providerConfig.ts
+++ b/src/services/api/providerConfig.ts
@@ -305,6 +305,101 @@ export function isLocalProviderUrl(baseUrl: string | undefined): boolean {
  }
 }
 function trimTrailingSlash(value: string): string {
  return value.replace(/\/+$/, '')
 }
 function normalizePathWithV1(pathname: string): string {
  const trimmed = trimTrailingSlash(pathname)
  if (!trimmed || trimmed === '/') {
    return '/v1'
  }
  if (trimmed.toLowerCase().endsWith('/v1')) {
    return trimmed
  }
  return `${trimmed}/v1`
 }
 function isLikelyOllamaEndpoint(baseUrl: string): boolean {
  try {
    const parsed = new URL(baseUrl)
    const hostname = parsed.hostname.toLowerCase()
    const pathname = parsed.pathname.toLowerCase()
    if (parsed.port === '11434') {
      return true
    }
    return (
      hostname.includes('ollama') ||
      pathname.includes('ollama')
    )
  } catch {
    return false
  }
 }
 export function getLocalProviderRetryBaseUrls(baseUrl: string): string[] {
  if (!isLocalProviderUrl(baseUrl)) {
    return []
  }
  try {
    const parsed = new URL(baseUrl)
    const original = trimTrailingSlash(parsed.toString())
    const seen = new Set<string>([original])
    const candidates: string[] = []
    const addCandidate = (hostname: string, pathname: string): void => {
      const next = new URL(parsed.toString())
      next.hostname = hostname
      next.pathname = pathname
      next.search = ''
      next.hash = ''
      const normalized = trimTrailingSlash(next.toString())
      if (seen.has(normalized)) {
        return
      }
      seen.add(normalized)
      candidates.push(normalized)
    }
    const v1Pathname = normalizePathWithV1(parsed.pathname)
    if (v1Pathname !== trimTrailingSlash(parsed.pathname)) {
      addCandidate(parsed.hostname, v1Pathname)
    }
    const hostname = parsed.hostname.toLowerCase().replace(/^\[|\]$/g, '')
    if (hostname === 'localhost' || hostname === '::1') {
      addCandidate('127.0.0.1', parsed.pathname || '/')
      addCandidate('127.0.0.1', v1Pathname)
    }
    return candidates
  } catch {
    return []
  }
 }
 export function shouldAttemptLocalToollessRetry(options: {
  baseUrl: string
  hasTools: boolean
 }): boolean {
  if (!options.hasTools) {
    return false
  }
  if (!isLocalProviderUrl(options.baseUrl)) {
    return false
  }
  return isLikelyOllamaEndpoint(options.baseUrl)
 }
 export function isCodexBaseUrl(baseUrl: string | undefined): boolean {
  if (!baseUrl) return false
  try {
@@ -412,6 +507,9 @@ export function resolveProviderRequest(options?: {
    ? normalizedGeminiEnvBaseUrl
    : asNamedEnvUrl(process.env.OPENAI_BASE_URL, 'OPENAI_BASE_URL')
  // In Mistral mode, a literal "undefined" MISTRAL_BASE_URL is treated as
  // misconfiguration and falls back to OPENAI_API_BASE, then
  // DEFAULT_MISTRAL_BASE_URL for a safe default endpoint.
  const fallbackEnvBaseUrl = isMistralMode
    ? (primaryEnvBaseUrl === undefined
      ? asNamedEnvUrl(process.env.OPENAI_API_BASE, 'OPENAI_API_BASE') ?? DEFAULT_MISTRAL_BASE_URL
--- a/src/services/api/reasoningLeakSanitizer.test.ts
+++ b/src/services/api/reasoningLeakSanitizer.test.ts
@@ -1,46 +0,0 @@
 import { describe, expect, test } from 'bun:test'
 import {
  looksLikeLeakedReasoningPrefix,
  shouldBufferPotentialReasoningPrefix,
  stripLeakedReasoningPreamble,
 } from './reasoningLeakSanitizer.ts'
 describe('reasoning leak sanitizer', () => {
  test('strips explicit internal reasoning preambles', () => {
    const text =
      'The user just said "hey" - a simple greeting. I should respond briefly and friendly.\n\nHey! How can I help you today?'
    expect(looksLikeLeakedReasoningPrefix(text)).toBe(true)
    expect(stripLeakedReasoningPreamble(text)).toBe(
      'Hey! How can I help you today?',
    )
  })
  test('does not strip normal user-facing advice that mentions "the user should"', () => {
    const text =
      'The user should reset their password immediately.\n\nHere are the steps...'
    expect(looksLikeLeakedReasoningPrefix(text)).toBe(false)
    expect(shouldBufferPotentialReasoningPrefix(text)).toBe(false)
    expect(stripLeakedReasoningPreamble(text)).toBe(text)
  })
  test('does not strip legitimate first-person advice about responding to an incident', () => {
    const text =
      'I need to respond to this security incident immediately. The system is compromised.\n\nHere are the remediation steps...'
    expect(looksLikeLeakedReasoningPrefix(text)).toBe(false)
    expect(shouldBufferPotentialReasoningPrefix(text)).toBe(false)
    expect(stripLeakedReasoningPreamble(text)).toBe(text)
  })
  test('does not strip legitimate first-person advice about answering a support ticket', () => {
    const text =
      'I need to answer the support ticket before end of day. The customer is waiting.\n\nHere is the response I drafted...'
    expect(looksLikeLeakedReasoningPrefix(text)).toBe(false)
    expect(shouldBufferPotentialReasoningPrefix(text)).toBe(false)
    expect(stripLeakedReasoningPreamble(text)).toBe(text)
  })
 })
--- a/src/services/api/reasoningLeakSanitizer.ts
+++ b/src/services/api/reasoningLeakSanitizer.ts
@@ -1,54 +0,0 @@
 const EXPLICIT_REASONING_START_RE =
  /^\s*(i should\b|i need to\b|let me think\b|the task\b|the request\b)/i
 const EXPLICIT_REASONING_META_RE =
  /\b(user|request|question|prompt|message|task|greeting|small talk|briefly|friendly|concise)\b/i
 const USER_META_START_RE =
  /^\s*the user\s+(just\s+)?(said|asked|is asking|wants|wanted|mentioned|seems|appears)\b/i
 const USER_REASONING_RE =
  /^\s*the user\s+(just\s+)?(said|asked|is asking|wants|wanted|mentioned|seems|appears)\b[\s\S]*\b(i should|i need to|let me think|respond|reply|answer|greeting|small talk|briefly|friendly|concise)\b/i
 export function shouldBufferPotentialReasoningPrefix(text: string): boolean {
  const normalized = text.trim()
  if (!normalized) return false
  if (looksLikeLeakedReasoningPrefix(normalized)) {
    return true
  }
  const hasParagraphBoundary = /\n\s*\n/.test(normalized)
  if (hasParagraphBoundary) {
    return false
  }
  return (
    EXPLICIT_REASONING_START_RE.test(normalized) ||
    USER_META_START_RE.test(normalized)
  )
 }
 export function looksLikeLeakedReasoningPrefix(text: string): boolean {
  const normalized = text.trim()
  if (!normalized) return false
  return (
    (EXPLICIT_REASONING_START_RE.test(normalized) &&
      EXPLICIT_REASONING_META_RE.test(normalized)) ||
    USER_REASONING_RE.test(normalized)
  )
 }
 export function stripLeakedReasoningPreamble(text: string): string {
  const normalized = text.replace(/\r\n/g, '\n')
  const parts = normalized.split(/\n\s*\n/)
  if (parts.length < 2) return text
  const first = parts[0]?.trim() ?? ''
  if (!looksLikeLeakedReasoningPrefix(first)) {
    return text
  }
  const remainder = parts.slice(1).join('\n\n').trim()
  return remainder || text
 }
--- a/src/services/api/smartModelRouting.test.ts
+++ b/src/services/api/smartModelRouting.test.ts
@@ -0,0 +1,191 @@
 import { describe, expect, test } from 'bun:test'
 import {
  routeModel,
  type SmartRoutingConfig,
 } from './smartModelRouting.ts'
 const ENABLED: SmartRoutingConfig = {
  enabled: true,
  simpleModel: 'claude-haiku-4-5',
  strongModel: 'claude-opus-4-7',
 }
 describe('routeModel — disabled / misconfigured', () => {
  test('disabled config routes to strong', () => {
    const decision = routeModel(
      { userText: 'hi' },
      { ...ENABLED, enabled: false },
    )
    expect(decision.model).toBe('claude-opus-4-7')
    expect(decision.complexity).toBe('strong')
    expect(decision.reason).toContain('disabled')
  })
  test('missing simpleModel falls back to strong', () => {
    const decision = routeModel(
      { userText: 'hi' },
      { ...ENABLED, simpleModel: '' },
    )
    expect(decision.model).toBe('claude-opus-4-7')
    expect(decision.complexity).toBe('strong')
  })
  test('simpleModel === strongModel routes to strong (no-op)', () => {
    const decision = routeModel(
      { userText: 'hi' },
      { ...ENABLED, simpleModel: 'claude-opus-4-7' },
    )
    expect(decision.model).toBe('claude-opus-4-7')
    expect(decision.complexity).toBe('strong')
  })
 })
 describe('routeModel — simple path', () => {
  test('short greeting routes to simple', () => {
    const decision = routeModel({ userText: 'thanks!', turnNumber: 5 }, ENABLED)
    expect(decision.model).toBe('claude-haiku-4-5')
    expect(decision.complexity).toBe('simple')
  })
  test('empty input routes to simple', () => {
    const decision = routeModel({ userText: '   ' }, ENABLED)
    expect(decision.model).toBe('claude-haiku-4-5')
    expect(decision.complexity).toBe('simple')
  })
  test('mid-length chatter routes to simple', () => {
    const decision = routeModel(
      { userText: 'yep looks good, go ahead', turnNumber: 10 },
      ENABLED,
    )
    expect(decision.complexity).toBe('simple')
  })
 })
 describe('routeModel — strong path', () => {
  test('first turn always routes to strong, even when short', () => {
    const decision = routeModel(
      { userText: 'fix the bug', turnNumber: 1 },
      ENABLED,
    )
    expect(decision.model).toBe('claude-opus-4-7')
    expect(decision.complexity).toBe('strong')
    expect(decision.reason).toContain('first turn')
  })
  test('code fence routes to strong', () => {
    const decision = routeModel(
      {
        userText: 'change this:\n```\nfoo()\n```',
        turnNumber: 5,
      },
      ENABLED,
    )
    expect(decision.complexity).toBe('strong')
    expect(decision.reason).toContain('code')
  })
  test('inline code span routes to strong', () => {
    const decision = routeModel(
      { userText: 'rename `foo` to `bar`', turnNumber: 5 },
      ENABLED,
    )
    expect(decision.complexity).toBe('strong')
  })
  test('reasoning keyword "plan" routes to strong even when short', () => {
    const decision = routeModel(
      { userText: 'plan the refactor', turnNumber: 5 },
      ENABLED,
    )
    expect(decision.complexity).toBe('strong')
    expect(decision.reason).toContain('keyword')
  })
  test('reasoning keyword "debug" routes to strong', () => {
    const decision = routeModel(
      { userText: 'debug the test', turnNumber: 5 },
      ENABLED,
    )
    expect(decision.complexity).toBe('strong')
  })
  test('"root cause" multi-word keyword routes to strong', () => {
    const decision = routeModel(
      { userText: 'find the root cause', turnNumber: 5 },
      ENABLED,
    )
    expect(decision.complexity).toBe('strong')
  })
  test('multi-paragraph input routes to strong', () => {
    const decision = routeModel(
      {
        userText: 'first thought.\n\nsecond thought.',
        turnNumber: 5,
      },
      ENABLED,
    )
    expect(decision.complexity).toBe('strong')
    expect(decision.reason).toContain('multi-paragraph')
  })
  test('over-long input routes to strong', () => {
    const long = 'ok '.repeat(100) // ~300 chars, 100 words
    const decision = routeModel(
      { userText: long, turnNumber: 5 },
      ENABLED,
    )
    expect(decision.complexity).toBe('strong')
  })
  test('exactly at the boundary stays simple', () => {
    const text = 'a'.repeat(160)
    const decision = routeModel(
      { userText: text, turnNumber: 5 },
      { ...ENABLED, simpleMaxChars: 160, simpleMaxWords: 28 },
    )
    expect(decision.complexity).toBe('simple')
  })
  test('one char over the boundary routes to strong', () => {
    const text = 'a'.repeat(161)
    const decision = routeModel(
      { userText: text, turnNumber: 5 },
      { ...ENABLED, simpleMaxChars: 160, simpleMaxWords: 28 },
    )
    expect(decision.complexity).toBe('strong')
    expect(decision.reason).toContain('160 chars')
  })
 })
 describe('routeModel — config overrides', () => {
  test('custom simpleMaxChars is honored', () => {
    const decision = routeModel(
      { userText: 'abcdefghijklmnop', turnNumber: 5 },
      { ...ENABLED, simpleMaxChars: 10 },
    )
    expect(decision.complexity).toBe('strong')
    expect(decision.reason).toContain('10 chars')
  })
  test('custom simpleMaxWords is honored', () => {
    const decision = routeModel(
      { userText: 'one two three four five', turnNumber: 5 },
      { ...ENABLED, simpleMaxWords: 3 },
    )
    expect(decision.complexity).toBe('strong')
    expect(decision.reason).toContain('3 words')
  })
 })
 describe('routeModel — reason strings', () => {
  test('simple decisions include char + word counts', () => {
    const decision = routeModel(
      { userText: 'sounds good', turnNumber: 5 },
      ENABLED,
    )
    expect(decision.reason).toMatch(/\d+ chars, \d+ words/)
  })
 })
--- a/src/services/api/smartModelRouting.ts
+++ b/src/services/api/smartModelRouting.ts
@@ -0,0 +1,215 @@
 /**
 * Smart model routing — cheap-for-simple, strong-for-hard.
 *
 * For everyday short chatter ("ok", "thanks", "what does this do?") the
 * quality gain of Opus/GPT-5 over Haiku/Mini is negligible, while cost and
 * latency can be roughly an order of magnitude worse. Smart routing lets a
 * user send such "obviously simple" turns to a cheaper model while keeping
 * the strong model for anything non-trivial.
 *
 * This module is a pure primitive: it takes a turn description (the user's
 * text + light context) and returns which model to use, based on config.
 * It never reads env vars or state directly — caller supplies everything.
 *
 * Off by default. Users opt in via settings.smartRouting.enabled. The intent
 * is a small, copy-pastable config block rather than a hidden heuristic, so
 * the tradeoff is visible and the user controls it.
 */
 export type SmartRoutingConfig = {
  enabled: boolean
  /** Model to use for turns classified as "simple". */
  simpleModel: string
  /** Model to use for turns classified as "strong" (or when unsure). */
  strongModel: string
  /** Max characters in user input to qualify as "simple". Default 160. */
  simpleMaxChars?: number
  /** Max whitespace-separated words to qualify as "simple". Default 28. */
  simpleMaxWords?: number
 }
 export type RoutingDecision = {
  model: string
  complexity: 'simple' | 'strong'
  /** Human-readable reason — useful for the UI indicator and debug logs. */
  reason: string
 }
 export type RoutingInput = {
  /** The user's message text for this turn. */
  userText: string
  /**
   * Optional: how many tool-use blocks the assistant has emitted in the
   * recent conversation. High values correlate with "continue this work"
   * follow-ups that can still be cheap, UNLESS the user also typed code
   * or strong-keyword text. Not currently consulted by routeModel.
   */
  recentToolUses?: number
  /**
   * Optional: turn number within the current session (1-indexed). The first
   * turn is often task-setup and benefits from the strong model even if
   * short — a bare "build X" opens the whole task.
   */
  turnNumber?: number
 }
 const DEFAULT_SIMPLE_MAX_CHARS = 160
 const DEFAULT_SIMPLE_MAX_WORDS = 28
 // Keywords that strongly suggest reasoning/planning/design work.
 // Matching is word-boundary / case-insensitive. Must include enough anchors
 // that short prompts like "plan the refactor" route to strong even under
 // the char/word cutoff.
 const STRONG_KEYWORDS = [
  'plan',
  'design',
  'architect',
  'architecture',
  'refactor',
  'debug',
  'investigate',
  'analyze',
  'analyse',
  'implement',
  'optimize',
  'optimise',
  'review',
  'audit',
  'diagnose',
  'root cause',
  'root-cause',
  'why does',
  'why is',
  'how should',
  'why did',
  'propose',
  'trace',
  'reproduce',
 ]
 const STRONG_KEYWORD_RE = new RegExp(
  `\\b(?:${STRONG_KEYWORDS.map(k => k.replace(/[-]/g, '[-\\s]')).join('|')})\\b`,
  'i',
 )
 const CODE_FENCE_RE = /```[\s\S]*?```|`[^`\n]+`/
 function countWords(text: string): number {
  const trimmed = text.trim()
  if (!trimmed) return 0
  return trimmed.split(/\s+/).length
 }
 function hasMultiParagraph(text: string): boolean {
  return /\n\s*\n/.test(text)
 }
 function hasCode(text: string): boolean {
  return CODE_FENCE_RE.test(text)
 }
 function hasStrongKeyword(text: string): boolean {
  return STRONG_KEYWORD_RE.test(text)
 }
 /**
 * Decide whether to route to the simple or strong model based on heuristics.
 * Returns the chosen model + a reason. When routing is disabled or both
 * models match, the strong model is used (safe default).
 */
 export function routeModel(
  input: RoutingInput,
  config: SmartRoutingConfig,
 ): RoutingDecision {
  if (!config.enabled) {
    return {
      model: config.strongModel,
      complexity: 'strong',
      reason: 'smart-routing disabled',
    }
  }
  if (!config.simpleModel || !config.strongModel) {
    return {
      model: config.strongModel,
      complexity: 'strong',
      reason: 'simpleModel or strongModel missing from config',
    }
  }
  if (config.simpleModel === config.strongModel) {
    return {
      model: config.strongModel,
      complexity: 'strong',
      reason: 'simpleModel equals strongModel',
    }
  }
  const text = input.userText ?? ''
  const trimmed = text.trim()
  if (!trimmed) {
    // Empty input (e.g. resuming a tool-use chain) — cheap by default.
    return {
      model: config.simpleModel,
      complexity: 'simple',
      reason: 'empty user text',
    }
  }
  // First turn of a session is task-setup — always use strong.
  if (input.turnNumber === 1) {
    return {
      model: config.strongModel,
      complexity: 'strong',
      reason: 'first turn of session',
    }
  }
  const maxChars = config.simpleMaxChars ?? DEFAULT_SIMPLE_MAX_CHARS
  const maxWords = config.simpleMaxWords ?? DEFAULT_SIMPLE_MAX_WORDS
  if (hasCode(trimmed)) {
    return {
      model: config.strongModel,
      complexity: 'strong',
      reason: 'contains code block or inline code',
    }
  }
  if (hasStrongKeyword(trimmed)) {
    return {
      model: config.strongModel,
      complexity: 'strong',
      reason: 'contains reasoning/planning keyword',
    }
  }
  if (hasMultiParagraph(trimmed)) {
    return {
      model: config.strongModel,
      complexity: 'strong',
      reason: 'multi-paragraph input',
    }
  }
  if (trimmed.length > maxChars) {
    return {
      model: config.strongModel,
      complexity: 'strong',
      reason: `input > ${maxChars} chars`,
    }
  }
  if (countWords(trimmed) > maxWords) {
    return {
      model: config.strongModel,
      complexity: 'strong',
      reason: `input > ${maxWords} words`,
    }
  }
  return {
    model: config.simpleModel,
    complexity: 'simple',
    reason: `short (${trimmed.length} chars, ${countWords(trimmed)} words)`,
  }
 }
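A small standalone sketch of how the keyword pattern behaves, using the same construction as `STRONG_KEYWORD_RE` but a shortened keyword list. `sampleKeywords`, `sampleKeywordRe`, and `matchesStrongKeyword` are illustrative names, not part of the module.

```typescript
// A few entries copied from STRONG_KEYWORDS; the list is shortened for the demo.
const sampleKeywords = ['plan', 'debug', 'root cause', 'root-cause', 'why does']

// Same construction as STRONG_KEYWORD_RE: a hyphen inside a keyword may also
// match a single whitespace character, and \b anchors both ends.
const sampleKeywordRe = new RegExp(
  `\\b(?:${sampleKeywords.map(k => k.replace(/[-]/g, '[-\\s]')).join('|')})\\b`,
  'i',
)

function matchesStrongKeyword(text: string): boolean {
  return sampleKeywordRe.test(text)
}

console.log(matchesStrongKeyword('plan the refactor')) // true
console.log(matchesStrongKeyword('Root-Cause found')) // true (case-insensitive)
console.log(matchesStrongKeyword('airplane mode')) // false ("plan" is mid-word)
console.log(matchesStrongKeyword('still debugging')) // false (trailing \b rejects "debugging")
```

Because of the trailing `\b`, inflected forms such as "debugging" or "planning" do not match on their own; such turns reach the strong model only if another rule (code, multiple paragraphs, or the length limits) applies.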
--- a/src/services/api/thinkTagSanitizer.test.ts
+++ b/src/services/api/thinkTagSanitizer.test.ts
@@ -0,0 +1,183 @@
 import { describe, expect, test } from 'bun:test'
 import {
  createThinkTagFilter,
  stripThinkTags,
 } from './thinkTagSanitizer.ts'
 describe('stripThinkTags — whole-text cleanup', () => {
  test('strips closed think pair', () => {
    expect(stripThinkTags('<think>reasoning</think>Hello')).toBe('Hello')
  })
  test('strips closed thinking pair', () => {
    expect(stripThinkTags('<thinking>x</thinking>Out')).toBe('Out')
  })
  test('strips closed reasoning pair', () => {
    expect(stripThinkTags('<reasoning>x</reasoning>Out')).toBe('Out')
  })
  test('strips REASONING_SCRATCHPAD pair', () => {
    expect(stripThinkTags('<REASONING_SCRATCHPAD>plan</REASONING_SCRATCHPAD>Answer'))
      .toBe('Answer')
  })
  test('is case-insensitive', () => {
    expect(stripThinkTags('<THINKING>x</THINKING>out')).toBe('out')
    expect(stripThinkTags('<Think>x</Think>out')).toBe('out')
  })
  test('handles attributes on open tag', () => {
    expect(stripThinkTags('<think id="plan-1">reason</think>ok')).toBe('ok')
  })
  test('strips unterminated open tag at block boundary', () => {
    expect(stripThinkTags('<think>reasoning that never closes')).toBe('')
  })
  test('strips unterminated open tag after newline', () => {
    // Block-boundary match consumes the leading newline, same as hermes.
    expect(stripThinkTags('Answer: 42\n<think>second-guess myself'))
      .toBe('Answer: 42')
  })
  test('strips orphan close tag', () => {
    expect(stripThinkTags('trailing </think>done')).toBe('trailing done')
  })
  test('strips multiple blocks', () => {
    expect(stripThinkTags('<think>a</think>B<think>c</think>D')).toBe('BD')
  })
  test('handles reasoning mid-response after content', () => {
    expect(stripThinkTags('Answer: 42\n<think>double-check</think>\nDone'))
      .toBe('Answer: 42\n\nDone')
  })
  test('handles nested-looking tags (lazy match + orphan cleanup)', () => {
    expect(stripThinkTags('<think><think>x</think></think>y')).toBe('y')
  })
  test('preserves legitimate non-think tags', () => {
    expect(stripThinkTags('use <div> and <span>')).toBe('use <div> and <span>')
  })
  test('preserves text without any tags', () => {
    expect(stripThinkTags('Hello, world. I should respond briefly.')).toBe(
      'Hello, world. I should respond briefly.',
    )
  })
  test('handles empty input', () => {
    expect(stripThinkTags('')).toBe('')
  })
 })
 describe('createThinkTagFilter — streaming state machine', () => {
  test('passes through plain text', () => {
    const f = createThinkTagFilter()
    expect(f.feed('Hello, ')).toBe('Hello, ')
    expect(f.feed('world!')).toBe('world!')
    expect(f.flush()).toBe('')
  })
  test('strips a complete think block in one chunk', () => {
    const f = createThinkTagFilter()
    expect(f.feed('pre<think>reason</think>post')).toBe('prepost')
    expect(f.flush()).toBe('')
  })
  test('handles open tag split across deltas', () => {
    const f = createThinkTagFilter()
    expect(f.feed('before<th')).toBe('before')
    expect(f.feed('ink>reason</think>after')).toBe('after')
    expect(f.flush()).toBe('')
  })
  test('handles close tag split across deltas', () => {
    const f = createThinkTagFilter()
    expect(f.feed('<think>reason</th')).toBe('')
    expect(f.feed('ink>keep')).toBe('keep')
    expect(f.flush()).toBe('')
  })
  test('handles tag split on bare < boundary', () => {
    const f = createThinkTagFilter()
    expect(f.feed('leading <')).toBe('leading ')
    expect(f.feed('think>inner</think>tail')).toBe('tail')
    expect(f.flush()).toBe('')
  })
  test('preserves partial non-tag < at boundary when next char rules it out', () => {
    const f = createThinkTagFilter()
    // "<d" — 'd' cannot start any of our tag names, so emit immediately
    expect(f.feed('pre<d')).toBe('pre<d')
    expect(f.feed('iv>rest')).toBe('iv>rest')
    expect(f.flush()).toBe('')
  })
  test('case-insensitive streaming', () => {
    const f = createThinkTagFilter()
    expect(f.feed('<THINKING>x</THINKING>out')).toBe('out')
    expect(f.flush()).toBe('')
  })
  test('unterminated open tag — flush drops remainder', () => {
    const f = createThinkTagFilter()
    expect(f.feed('<think>reasoning with no close ')).toBe('')
    expect(f.feed('and more reasoning')).toBe('')
    expect(f.flush()).toBe('')
    expect(f.isInsideBlock()).toBe(false)
  })
  test('multiple blocks in single feed', () => {
    const f = createThinkTagFilter()
    expect(f.feed('<think>a</think>B<think>c</think>D')).toBe('BD')
    expect(f.flush()).toBe('')
  })
  test('flush after clean stream emits nothing extra', () => {
    const f = createThinkTagFilter()
    expect(f.feed('complete message')).toBe('complete message')
    expect(f.flush()).toBe('')
  })
  test('flush of bare < at end emits it (not a tag prefix)', () => {
    const f = createThinkTagFilter()
    // bare '<' held back; flush emits it since it has no tag-name chars
    expect(f.feed('x <')).toBe('x ')
    expect(f.flush()).toBe('<')
  })
  test('flush of partial tag-name prefix at end drops it', () => {
    const f = createThinkTagFilter()
    expect(f.feed('x <thi')).toBe('x ')
    expect(f.flush()).toBe('')
  })
  test('handles attributes on streaming open tag', () => {
    const f = createThinkTagFilter()
    expect(f.feed('<think type="plan">reason</think>ok')).toBe('ok')
    expect(f.flush()).toBe('')
  })
  test('mid-delta transition: content, reasoning, content', () => {
    const f = createThinkTagFilter()
    expect(f.feed('Answer: 42\n<think>')).toBe('Answer: 42\n')
    expect(f.feed('double-check')).toBe('')
    expect(f.feed('</think>\nDone')).toBe('\nDone')
    expect(f.flush()).toBe('')
  })
  test('orphan close tag mid-stream passes through the filter; stripThinkTags removes it', () => {
    // Filter alone treats orphan close as "we're not inside", so it emits as-is.
    // Safety net (stripThinkTags on final text) removes orphans.
    const f = createThinkTagFilter()
    const chunk1 = f.feed('trailing ')
    const chunk2 = f.feed('</think>done')
    const final = chunk1 + chunk2 + f.flush()
    // Orphan close appears in stream output; safety net cleans it
    expect(stripThinkTags(final)).toBe('trailing done')
  })
 })
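To see why the whole-text cleanup runs three passes in a fixed order, the sketch below replays them one at a time and returns each intermediate result. The regular expressions follow the same shape as those in `thinkTagSanitizer.ts`, but the tag list is shortened and `stripPasses` is a hypothetical helper written only for this illustration.

```typescript
// Shortened tag list for the demo; the real module also covers
// "thought" and "reasoning_scratchpad".
const demoTagAlt = 'think|thinking|reasoning'

const demoClosedPair = new RegExp(
  `<\\s*(${demoTagAlt})\\b[^>]*>[\\s\\S]*?<\\s*/\\s*\\1\\s*>`,
  'gi',
)
const demoUnterminatedOpen = new RegExp(
  `(?:^|\\n)[ \\t]*<\\s*(?:${demoTagAlt})\\b[^>]*>[\\s\\S]*$`,
  'i',
)
const demoOrphanTag = new RegExp(`<\\s*/?\\s*(?:${demoTagAlt})\\b[^>]*>\\s*`, 'gi')

// Returns the text after pass 1 (closed pairs), pass 2 (unterminated open
// at a line start), and pass 3 (orphan tags), in that order.
function stripPasses(text: string): [string, string, string] {
  const afterPairs = text.replace(demoClosedPair, '')
  const afterUnterminated = afterPairs.replace(demoUnterminatedOpen, '')
  const afterOrphans = afterUnterminated.replace(demoOrphanTag, '')
  return [afterPairs, afterUnterminated, afterOrphans]
}

// Nested-looking input: the lazy pair match leaves an orphan close tag,
// which only the third pass removes.
console.log(stripPasses('<think><think>x</think></think>y'))
// [ '</think>y', '</think>y', 'y' ]

// An unterminated open tag that is not at a line start: pass 2 skips it,
// pass 3 removes only the tag, so the text after it survives.
console.log(stripPasses('see <think>oops')[2]) // 'see oops'
```

The second example shows the conservative side of the design: a mid-line open tag with no close loses only its tag characters, not the text after it.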
--- a/src/services/api/thinkTagSanitizer.ts
+++ b/src/services/api/thinkTagSanitizer.ts
@@ -0,0 +1,162 @@
 /**
 * Think-tag sanitizer for reasoning content leaks.
 *
 * Some OpenAI-compatible reasoning models (MiniMax M2.7, GLM-4.5/5, DeepSeek, Kimi K2,
 * self-hosted vLLM builds) emit chain-of-thought inline inside the `content` field using
 * XML-like tags instead of the separate `reasoning_content` channel. Example:
 *
 *   <think>the user wants foo, let me check bar</think>Here is the answer: ...
 *
 * This module strips those blocks structurally (tag-based), independent of English
 * phrasings. Three layers:
 *
 *   1. `createThinkTagFilter()` — streaming state machine. Feeds deltas, emits only
 *      the visible (non-reasoning) portion, and buffers partial tags across chunk
 *      boundaries so `</th` + `ink>` still parses correctly.
 *
 *   2. `stripThinkTags()` — whole-text cleanup. Removes closed pairs, unterminated
 *      opens at block boundaries, and orphan open/close tags. Used for non-streaming
 *      responses and as a safety net after stream close.
 *
 *   3. Flush discards buffered partial tags at stream end (over-strip bias here:
 *      prefer losing a partial reasoning fragment over leaking it).
 */
 const TAG_NAMES = [
  'think',
  'thinking',
  'reasoning',
  'thought',
  'reasoning_scratchpad',
 ] as const
 const TAG_ALT = TAG_NAMES.join('|')
 const OPEN_TAG_RE = new RegExp(`<\\s*(?:${TAG_ALT})\\b[^>]*>`, 'i')
 const CLOSE_TAG_RE = new RegExp(`<\\s*/\\s*(?:${TAG_ALT})\\s*>`, 'i')
 const CLOSED_PAIR_RE_G = new RegExp(
  `<\\s*(${TAG_ALT})\\b[^>]*>[\\s\\S]*?<\\s*/\\s*\\1\\s*>`,
  'gi',
 )
 const UNTERMINATED_OPEN_RE = new RegExp(
  `(?:^|\\n)[ \\t]*<\\s*(?:${TAG_ALT})\\b[^>]*>[\\s\\S]*$`,
  'i',
 )
 const ORPHAN_TAG_RE_G = new RegExp(
  `<\\s*/?\\s*(?:${TAG_ALT})\\b[^>]*>\\s*`,
  'gi',
 )
 const MAX_PARTIAL_TAG = 64
 /**
 * Remove reasoning/thinking blocks from a complete text body.
 *
 * Handles:
 *   - Closed pairs: <think>...</think> (lazy match, anywhere in text)
 *   - Unterminated open tags at a block boundary: strips from the tag to end of string
 *   - Orphan open or close tags (no matching partner)
 *
 * False-negative bias: prefers leaving a few tag characters in rare edge cases over
 * stripping legitimate content.
 */
 export function stripThinkTags(text: string): string {
  if (!text) return text
  let out = text
  out = out.replace(CLOSED_PAIR_RE_G, '')
  out = out.replace(UNTERMINATED_OPEN_RE, '')
  out = out.replace(ORPHAN_TAG_RE_G, '')
  return out
 }
 export interface ThinkTagFilter {
  feed(chunk: string): string
  flush(): string
  isInsideBlock(): boolean
 }
 /**
 * Streaming state machine. Feed deltas, emits visible (non-reasoning) text.
 * Handles tags split across chunk boundaries by holding back a short tail buffer
 * whenever the current buffer ends with what looks like a partial tag.
 */
 export function createThinkTagFilter(): ThinkTagFilter {
  let inside = false
  let buffer = ''
  function findPartialTagStart(s: string): number {
    const lastLt = s.lastIndexOf('<')
    if (lastLt === -1) return -1
    if (s.indexOf('>', lastLt) !== -1) return -1
    const tail = s.slice(lastLt)
    if (tail.length > MAX_PARTIAL_TAG) return -1
    const m = /^<\s*\/?\s*([a-zA-Z_]\w*)?\s*$/.exec(tail)
    if (!m) return -1
    const partialName = (m[1] ?? '').toLowerCase()
    if (!partialName) return lastLt
    if (TAG_NAMES.some(name => name.startsWith(partialName))) return lastLt
    return -1
  }
  function feed(chunk: string): string {
    if (!chunk) return ''
    buffer += chunk
    let out = ''
    while (buffer.length > 0) {
      if (!inside) {
        const open = OPEN_TAG_RE.exec(buffer)
        if (open) {
          out += buffer.slice(0, open.index)
          buffer = buffer.slice(open.index + open[0].length)
          inside = true
          continue
        }
        const partialStart = findPartialTagStart(buffer)
        if (partialStart === -1) {
          out += buffer
          buffer = ''
        } else {
          out += buffer.slice(0, partialStart)
          buffer = buffer.slice(partialStart)
        }
        return out
      }
      const close = CLOSE_TAG_RE.exec(buffer)
      if (close) {
        buffer = buffer.slice(close.index + close[0].length)
        inside = false
        continue
      }
      const partialStart = findPartialTagStart(buffer)
      if (partialStart === -1) {
        buffer = ''
      } else {
        buffer = buffer.slice(partialStart)
      }
      return out
    }
    return out
  }
  function flush(): string {
    const held = buffer
    const wasInside = inside
    buffer = ''
    inside = false
    if (wasInside) return ''
    if (!held) return ''
    if (/^<\s*\/?\s*[a-zA-Z_]/.test(held)) return ''
    return held
  }
  return { feed, flush, isInsideBlock: () => inside }
 }
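The hold-back decision is the part of the streaming filter that is easiest to misread, so here is a standalone sketch of it. `holdBackIndex` is a hypothetical name; the body mirrors `findPartialTagStart` as shown above, with the 64-character limit inlined.

```typescript
// Tag names copied from TAG_NAMES in the diff above.
const demoTagNames = [
  'think',
  'thinking',
  'reasoning',
  'thought',
  'reasoning_scratchpad',
]

// Returns the index where a possible partial tag begins (everything before
// it can be emitted), or -1 when the whole buffer can be emitted.
function holdBackIndex(buffer: string): number {
  const lastLt = buffer.lastIndexOf('<')
  if (lastLt === -1) return -1
  if (buffer.indexOf('>', lastLt) !== -1) return -1 // tag already closed
  const tail = buffer.slice(lastLt)
  if (tail.length > 64) return -1 // too long to be a plausible tag
  const m = /^<\s*\/?\s*([a-zA-Z_]\w*)?\s*$/.exec(tail)
  if (!m) return -1
  const partial = (m[1] ?? '').toLowerCase()
  if (!partial) return lastLt // bare "<" or "</": could still become a tag
  return demoTagNames.some(name => name.startsWith(partial)) ? lastLt : -1
}

console.log(holdBackIndex('before<th')) // 6  ("<th" may grow into "<think>")
console.log(holdBackIndex('pre<d')) // -1 (no tag name starts with "d")
console.log(holdBackIndex('x <')) // 2  (bare "<" is held back)
console.log(holdBackIndex('<think type="pl')) // -1 (attribute text breaks the pattern)
```

The last case suggests that an open tag whose attributes straddle a chunk boundary is not held back by this check, so it would apparently be emitted as visible text during streaming; the `stripThinkTags` pass over the final text would still remove it.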
--- a/src/services/autoFix/autoFixRunner.test.ts
+++ b/src/services/autoFix/autoFixRunner.test.ts
@@ -70,7 +70,7 @@ describe('runAutoFixCheck', () => {
  test('handles timeout gracefully', async () => {
    const result = await runAutoFixCheck({
-      lint: 'sleep 10',
+      lint: 'node -e "setTimeout(() => {}, 10000)"',
      timeout: 100,
      cwd: '/tmp',
--- a/src/services/autoFix/autoFixRunner.ts
+++ b/src/services/autoFix/autoFixRunner.ts
@@ -46,14 +46,31 @@ async function runCommand(
    const killTree = () => {
      try {
-        if (!isWindows && proc.pid) {
+        if (isWindows && proc.pid) {
          // shell=true on Windows can leave child commands running unless we
          // terminate the full process tree.
          const killer = spawn('taskkill', ['/pid', String(proc.pid), '/T', '/F'], {
            windowsHide: true,
            stdio: 'ignore',
          })
          killer.unref()
          return
        }
        if (proc.pid) {
          // Kill the entire process group
          process.kill(-proc.pid, 'SIGTERM')
-        } else {
-          proc.kill('SIGTERM')
+          return
        }
        proc.kill('SIGTERM')
      } catch {
-        // Process may have already exited
+        // Process may have already exited; fallback to direct child kill.
        try {
          proc.kill('SIGTERM')
        } catch {
          // Ignore final fallback errors.
        }
      }
    }
--- a/src/services/compact/autoCompact.test.ts
+++ b/src/services/compact/autoCompact.test.ts
@@ -16,12 +16,21 @@ describe('getEffectiveContextWindowSize', () => {
    // 8k minus 20k summary reservation = -12k, causing infinite auto-compact.
    // Now the fallback is 128k and there's a floor, so effective is always
    // at least reservedTokensForSummary + buffer.
    //
    // The exact floor depends on the max-output-tokens slot-reservation cap
    // (tengu_otk_slot_v1 GrowthBook flag). With cap enabled, the model's
    // default output cap drops to CAPPED_DEFAULT_MAX_TOKENS (8k), so the
    // summary reservation is 8k and the floor is 8k + 13k = 21k. With cap
    // disabled it's 20k + 13k = 33k. Assert the worst case so the test is
    // stable regardless of flag state in CI vs local.
    process.env.CLAUDE_CODE_USE_OPENAI = '1'
    try {
      const effective = getEffectiveContextWindowSize('some-unknown-3p-model')
      expect(effective).toBeGreaterThan(0)
-      // Must be at least summary reservation (20k) + buffer (13k) = 33k
-      expect(effective).toBeGreaterThanOrEqual(33_000)
+      // 21k = CAPPED_DEFAULT_MAX_TOKENS (8k) + AUTOCOMPACT_BUFFER_TOKENS (13k).
+      // Covers the anti-regression intent of issue #635 without assuming
+      // the GrowthBook flag state.
+      expect(effective).toBeGreaterThanOrEqual(21_000)
    } finally {
      delete process.env.CLAUDE_CODE_USE_OPENAI
    }
--- a/src/services/compact/microCompact.ts
+++ b/src/services/compact/microCompact.ts
@@ -38,7 +38,7 @@ export const TIME_BASED_MC_CLEARED_MESSAGE = '[Old tool result content cleared]'
 const IMAGE_MAX_TOKEN_SIZE = 2000
 // Only compact these built-in tools (MCP tools are also compactable via prefix match)
-const COMPACTABLE_TOOLS = new Set<string>([
+export const COMPACTABLE_TOOLS = new Set<string>([
  FILE_READ_TOOL_NAME,
  ...SHELL_TOOL_NAMES,
  GREP_TOOL_NAME,
@@ -51,7 +51,7 @@ const COMPACTABLE_TOOLS = new Set<string>([
 const MCP_TOOL_PREFIX = 'mcp__'
-function isCompactableTool(name: string): boolean {
+export function isCompactableTool(name: string): boolean {
  return COMPACTABLE_TOOLS.has(name) || name.startsWith(MCP_TOOL_PREFIX)
 }
--- a/src/services/mcp/client.ts
+++ b/src/services/mcp/client.ts
@@ -2524,7 +2524,7 @@ export async function transformResultContent(
      return [
        {
          type: 'text',
-          text: resultContent.text,
+          text: recursivelySanitizeUnicode(resultContent.text) as string,
        },
      ]
    case 'audio': {
@@ -2569,7 +2569,9 @@ export async function transformResultContent(
        return [
          {
            type: 'text',
-            text: `${prefix}${resource.text}`,
+            text: recursivelySanitizeUnicode(
+              `${prefix}${resource.text}`,
+            ) as string,
          },
        ]
      } else if ('blob' in resource) {
--- a/src/services/tokenEstimation.ts
+++ b/src/services/tokenEstimation.ts
@@ -223,6 +223,49 @@ export function bytesPerTokenForFileType(fileExtension: string): number {
  }
 }
 /**
 * Tokenizer ratio by model family.
 * Different models have different encodings.
 */
 export interface ModelTokenizerConfig {
  modelFamily: string
  bytesPerToken: number
  supportsJson: boolean
  supportsCode: boolean
 }
 export const MODEL_TOKENIZER_CONFIGS: ModelTokenizerConfig[] = [
  { modelFamily: 'claude', bytesPerToken: 3.5, supportsJson: true, supportsCode: true },
  { modelFamily: 'gpt-4', bytesPerToken: 4, supportsJson: true, supportsCode: true },
  { modelFamily: 'gpt-3.5', bytesPerToken: 4, supportsJson: true, supportsCode: true },
  { modelFamily: 'gemini', bytesPerToken: 3.5, supportsJson: true, supportsCode: true },
  { modelFamily: 'llama', bytesPerToken: 3.8, supportsJson: true, supportsCode: true },
  { modelFamily: 'deepseek', bytesPerToken: 3.5, supportsJson: true, supportsCode: true },
  { modelFamily: 'minimax', bytesPerToken: 3.2, supportsJson: true, supportsCode: true },
 ]
 /**
 * Get tokenizer config for a model.
 */
 export function getTokenizerConfig(model: string): ModelTokenizerConfig {
  const lower = model.toLowerCase()
  for (const config of MODEL_TOKENIZER_CONFIGS) {
    if (lower.includes(config.modelFamily)) {
      return config
    }
  }
  return { modelFamily: 'unknown', bytesPerToken: 4, supportsJson: true, supportsCode: true }
 }
 /**
 * Get bytes-per-token ratio for a model.
 */
 export function getBytesPerTokenForModel(model: string): number {
  return getTokenizerConfig(model).bytesPerToken
 }
 /**
 * Like {@link roughTokenCountEstimation} but uses a more accurate
 * bytes-per-token ratio when the file type is known.
@@ -241,6 +284,106 @@ export function roughTokenCountEstimationForFileType(
  )
 }
 /**
 * Content type classification for compression ratio.
 */
 export type ContentType = 
  | 'json' | 'code' | 'prose' | 'technical' 
  | 'list' | 'table' | 'mixed'
 /**
 * Compression ratio by content type.
 * Measured empirically - denser content = lower ratio.
 */
 export const COMPRESSION_RATIOS: Record<ContentType, { min: number; max: number; typical: number }> = {
  json: { min: 1.5, max: 2.5, typical: 2 },
  code: { min: 3, max: 4.5, typical: 3.5 },
  prose: { min: 3.5, max: 4.5, typical: 4 },
  technical: { min: 2.5, max: 3.5, typical: 3 },
  list: { min: 2, max: 3, typical: 2.5 },
  table: { min: 1.8, max: 2.8, typical: 2.2 },
  mixed: { min: 3, max: 4, typical: 3.5 },
 }
 /**
 * Detect content type from content.
 */
 export function detectContentType(content: string): ContentType {
  const trimmed = content.trim()
  // JSON
  if ((trimmed.startsWith('{') && trimmed.endsWith('}')) || 
      (trimmed.startsWith('[') && trimmed.endsWith(']'))) {
    try {
      JSON.parse(trimmed)
      return 'json'
    } catch { /* not valid json */ }
  }
  // Table (tabs or consistent delimiters)
  const lines = trimmed.split('\n')
  if (lines.length > 2) {
    const hasTabs = lines[0].includes('\t')
    const hasCommas = lines[0].includes(',')
    if (hasTabs || hasCommas) {
      const consistent = lines.slice(1).every(l => l.includes('\t') || l.includes(','))
      if (consistent) return 'table'
    }
  }
  // List
  if (/^[\d\-\*\•]/.test(trimmed) || /^[\d\-\*\•]/.test(lines[0])) {
    return 'list'
  }
  // Code (high density of special chars)
  const codeChars = (content.match(/[{}()\[\];=]/g) || []).length
  const codeRatio = codeChars / content.length
  if (codeRatio > 0.05) return 'code'
  // Technical (has numbers and units)
  if (/\d+\s*(px|em|rem|%|ms|s|kb|mb|gb)/i.test(content)) {
    return 'technical'
  }
  // Prose (default - natural language)
  return 'prose'
 }
 /**
 * Get compression ratio for content.
 */
 export function getCompressionRatio(content: string, type?: ContentType): { ratio: number; min: number; max: number } {
  const detectedType = type ?? detectContentType(content)
  const { min, max, typical } = COMPRESSION_RATIOS[detectedType]
  // Adjust based on actual content length
  // Shorter content = higher variance
  const lengthBonus = content.length < 100 ? 0.5 : 0
  return {
    ratio: typical,
    min: min + lengthBonus,
    max: max + lengthBonus,
  }
 }
 /**
 * Estimate tokens with confidence bounds.
 */
 export function estimateWithBounds(
  content: string,
  type?: ContentType,
 ): { estimate: number; min: number; max: number } {
  const { ratio, min: minRatio, max: maxRatio } = getCompressionRatio(content, type)
  const estimate = roughTokenCountEstimation(content, ratio)
  const min = roughTokenCountEstimation(content, maxRatio)
  const max = roughTokenCountEstimation(content, minRatio)
  return { estimate, min, max }
 }
 /**
 * Estimates token count for a Message object by extracting and analyzing its text content.
 * This provides a more reliable estimate than getTokenUsage for messages that may have been compacted.
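A minimal sketch of how the bounds in `estimateWithBounds` relate to the ratio range. It assumes a hypothetical `roughTokens` helper that divides character length by bytes-per-token and rounds; the real `roughTokenCountEstimation` may round differently. A larger bytes-per-token ratio yields fewer tokens, which is why the lower bound is computed from `max` and the upper bound from `min`.

```typescript
// Hypothetical stand-in for roughTokenCountEstimation (an assumption, not the repo's helper).
function roughTokens(content: string, bytesPerToken: number): number {
  return Math.round(content.length / bytesPerToken)
}

type RatioRange = { min: number; max: number; typical: number }

// Values copied from the 'prose' entry of COMPRESSION_RATIOS in the diff.
const PROSE_RATIOS: RatioRange = { min: 3.5, max: 4.5, typical: 4 }

function boundsFor(
  content: string,
  ratios: RatioRange,
): { estimate: number; min: number; max: number } {
  return {
    estimate: roughTokens(content, ratios.typical),
    // Highest ratio -> fewest tokens -> lower bound.
    min: roughTokens(content, ratios.max),
    // Lowest ratio -> most tokens -> upper bound.
    max: roughTokens(content, ratios.min),
  }
}

const sampleBounds = boundsFor('x'.repeat(360), PROSE_RATIOS)
```

With 360 characters of prose, the sketch gives a lower bound of 80, an estimate of 90, and an upper bound of 103 tokens.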
--- a/src/services/tokenModelCompression.test.ts
+++ b/src/services/tokenModelCompression.test.ts
@@ -0,0 +1,100 @@
 import { describe, expect, it } from 'bun:test'
 import {
  getTokenizerConfig,
  getBytesPerTokenForModel,
  detectContentType,
  getCompressionRatio,
  estimateWithBounds,
 } from './tokenEstimation.js'
 describe('Model Tokenizers', () => {
  describe('getTokenizerConfig', () => {
    it('returns config for claude models', () => {
      const config = getTokenizerConfig('claude-sonnet-4-5-20250514')
      expect(config.modelFamily).toBe('claude')
      expect(config.bytesPerToken).toBe(3.5)
    })
    it('returns config for gpt models', () => {
      const config = getTokenizerConfig('gpt-4')
      expect(config.modelFamily).toBe('gpt-4')
      expect(config.bytesPerToken).toBe(4)
    })
    it('returns default for unknown models', () => {
      const config = getTokenizerConfig('unknown-model')
      expect(config.modelFamily).toBe('unknown')
      expect(config.bytesPerToken).toBe(4)
    })
  })
  describe('getBytesPerTokenForModel', () => {
    it('returns bytes per token for model', () => {
      expect(getBytesPerTokenForModel('claude-opus-3-5-20250214')).toBe(3.5)
      expect(getBytesPerTokenForModel('gpt-4o')).toBe(4)
      expect(getBytesPerTokenForModel('deepseek-chat')).toBe(3.5)
      expect(getBytesPerTokenForModel('minimax-M2.7')).toBe(3.2)
    })
  })
 })
 describe('Content Type Detection', () => {
  describe('detectContentType', () => {
    it('detects JSON', () => {
      expect(detectContentType('{"key": "value"}')).toBe('json')
      expect(detectContentType('[1, 2, 3]')).toBe('json')
    })
    it('detects code', () => {
      expect(detectContentType('function test() { return 1 + 2; }')).toBe('code')
      expect(detectContentType('const x = () => {}')).toBe('code')
    })
    it('detects prose', () => {
      expect(detectContentType('This is a natural language response.')).toBe('prose')
      expect(detectContentType('Hello world how are you?')).toBe('prose')
    })
    it('detects code-like technical', () => {
      // Has both code chars and technical - higher code char ratio wins
      expect(detectContentType('margin: 10px; padding: 5px;')).toBe('code')
    })
    it('detects list', () => {
      expect(detectContentType('- item 1\n- item 2')).toBe('list')
      expect(detectContentType('1. first\n2. second')).toBe('list')
    })
    it('detects prose by default', () => {
      // Single column with newlines = prose
      expect(detectContentType('a b c\n1 2 3')).toBe('prose')
    })
  })
 })
 describe('Compression Ratio', () => {
  describe('getCompressionRatio', () => {
    it('returns appropriate ratios', () => {
      expect(getCompressionRatio('{"a":1}').ratio).toBe(2)
      expect(getCompressionRatio('code here {} []').ratio).toBe(3.5)
      expect(getCompressionRatio('Hello world').ratio).toBe(4)
    })
  })
  describe('estimateWithBounds', () => {
    it('returns estimate with bounds', () => {
      const result = estimateWithBounds('Hello world')
      expect(result.min).toBeLessThanOrEqual(result.estimate)
      expect(result.max).toBeGreaterThanOrEqual(result.estimate)
      expect(result.min).toBeLessThan(result.max)
    })
    it('handles JSON with tighter bounds', () => {
      const result = estimateWithBounds('{"key": "value"}')
      // JSON has smaller ratio range
      expect(result.max).toBeLessThan(10)
    })
  })
 })
--- a/src/services/tools/toolExecution.ts
+++ b/src/services/tools/toolExecution.ts
@@ -1241,6 +1241,7 @@ async function checkPermissionsAndCallTool(
      {
        ...toolUseContext,
        toolUseId: toolUseID,
        hookChainsCanUseTool: canUseTool,
        userModified: permissionDecision.userModified ?? false,
      },
      canUseTool,
@@ -1729,19 +1730,29 @@ async function checkPermissionsAndCallTool(
    const hookMessages: MessageUpdateLazy<
      AttachmentMessage | ProgressMessage<HookProgress>
    >[] = []
-    for await (const hookResult of runPostToolUseFailureHooks(
-      toolUseContext,
-      tool,
-      toolUseID,
-      messageId,
-      processedInput,
-      content,
-      isInterrupt,
-      requestId,
-      mcpServerType,
-      mcpServerBaseUrl,
-    )) {
-      hookMessages.push(hookResult)
+    const hookChainsContext = toolUseContext as ToolUseContext & {
+      hookChainsCanUseTool?: CanUseToolFn
+    }
+    hookChainsContext.hookChainsCanUseTool = canUseTool
+    try {
+      for await (const hookResult of runPostToolUseFailureHooks(
+        toolUseContext,
+        tool,
+        toolUseID,
+        messageId,
+        processedInput,
+        content,
+        isInterrupt,
+        requestId,
+        mcpServerType,
+        mcpServerBaseUrl,
+      )) {
+        hookMessages.push(hookResult)
+      }
+    } finally {
+      if (hookChainsContext.hookChainsCanUseTool === canUseTool) {
+        delete hookChainsContext.hookChainsCanUseTool
+      }
    }
    return [
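The hunk above attaches `canUseTool` to the shared context only for the duration of the failure hooks, and removes it in `finally` only if nobody replaced it in the meantime. A minimal sketch of that pattern follows; `Ctx`, `canUse`, and `withTemporaryCallback` are hypothetical names for illustration, not identifiers from the repo.

```typescript
// Hypothetical context shape; the real ToolUseContext is much larger.
type Ctx = { canUse?: () => boolean }

// Attach a callback for the duration of `body`, then remove it only if the
// property still points at the callback we installed.
function withTemporaryCallback<T>(ctx: Ctx, cb: () => boolean, body: () => T): T {
  ctx.canUse = cb
  try {
    return body()
  } finally {
    if (ctx.canUse === cb) {
      delete ctx.canUse
    }
  }
}

// Case 1: the callback is cleaned up after the body runs.
const cleanCtx: Ctx = {}
const bodyResult = withTemporaryCallback(cleanCtx, () => true, () => 'done')

// Case 2: the body swaps in a different callback, so cleanup leaves it alone.
const swappedCtx: Ctx = {}
const replacement = () => false
withTemporaryCallback(swappedCtx, () => true, () => {
  swappedCtx.canUse = replacement
})
```

The identity check in `finally` is what keeps a nested or concurrent caller's callback from being deleted by an outer cleanup.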
--- a/src/services/tools/toolHooks.ts
+++ b/src/services/tools/toolHooks.ts
@@ -284,6 +284,7 @@ export async function* runPostToolUseFailureHooks<Input extends AnyObject>(
      isInterrupt,
      permissionMode,
      toolUseContext.abortController.signal,
      undefined,
    )) {
      try {
        // Check if we were aborted during hook execution
--- a/src/services/wiki/init.test.ts
+++ b/src/services/wiki/init.test.ts
@@ -26,10 +26,10 @@ test('initializeWiki creates the expected wiki scaffold', async () => {
  expect(result.alreadyExisted).toBe(false)
  expect(result.createdFiles).toEqual([
-    '.openclaude/wiki/schema.md',
+    join('.openclaude', 'wiki', 'schema.md'),
-    '.openclaude/wiki/index.md',
+    join('.openclaude', 'wiki', 'index.md'),
-    '.openclaude/wiki/log.md',
+    join('.openclaude', 'wiki', 'log.md'),
-    '.openclaude/wiki/pages/architecture.md',
+    join('.openclaude', 'wiki', 'pages', 'architecture.md'),
  ])
  expect(await readFile(paths.schemaFile, 'utf8')).toContain(
    '# OpenClaude Wiki Schema',
--- a/src/tools/ConfigTool/prompt.ts
+++ b/src/tools/ConfigTool/prompt.ts
@@ -59,7 +59,7 @@ export function generatePrompt(): string {
 ## Configurable settings list
 The following settings are available for you to change:
-### Global Settings (stored in ~/.claude.json)
+### Global Settings (stored in ~/.openclaude.json)
 ${globalSettings.join('\n')}
 ### Project Settings (stored in settings.json)
--- a/src/tools/WebFetchTool/utils.ts
+++ b/src/tools/WebFetchTool/utils.ts
@@ -15,6 +15,7 @@ import {
 } from '../../utils/mcpOutputStorage.js'
 import { getSettings_DEPRECATED } from '../../utils/settings/settings.js'
 import { asSystemPrompt } from '../../utils/systemPromptType.js'
 import { ssrfGuardedLookup } from '../../utils/hooks/ssrfGuard.js'
 import { isPreapprovedHost } from './preapproved.js'
 import { makeSecondaryModelPrompt } from './prompt.js'
@@ -281,6 +282,7 @@ export async function getWithPermittedRedirects(
      maxRedirects: 0,
      responseType: 'arraybuffer',
      maxContentLength: MAX_HTTP_CONTENT_LENGTH,
      lookup: ssrfGuardedLookup,
      headers: {
        Accept: 'text/markdown, text/html, */*',
        'User-Agent': getWebFetchUserAgent(),
--- a/src/tools/WebSearchTool/providers/duckduckgo.ts
+++ b/src/tools/WebSearchTool/providers/duckduckgo.ts
@@ -1,6 +1,23 @@
 import type { SearchInput, SearchProvider } from './types.js'
 import { applyDomainFilters, type ProviderOutput } from './types.js'
 // DuckDuckGo's HTML scraper aggressively blocks datacenter / repeat IPs with
 // an "anomaly in the request" response. When that happens we surface an
 // actionable error instead of the opaque scraper message so users know how
 // to configure a working backend.
 const DDG_ANOMALY_HINT =
  'DuckDuckGo scraping is rate-limited from this network. ' +
  'Configure a search backend with one of: ' +
  'FIRECRAWL_API_KEY, TAVILY_API_KEY, EXA_API_KEY, YOU_API_KEY, ' +
  'JINA_API_KEY, BING_API_KEY, MOJEEK_API_KEY, LINKUP_API_KEY — ' +
  'or use an Anthropic / Vertex / Foundry provider for native web search.'
 function isAnomalyError(message: string): boolean {
  return /anomaly in the request|likely making requests too quickly/i.test(
    message,
  )
 }
 export const duckduckgoProvider: SearchProvider = {
  name: 'duckduckgo',
@@ -20,7 +37,16 @@ export const duckduckgoProvider: SearchProvider = {
    }
    if (signal?.aborted) throw new DOMException('Aborted', 'AbortError')
    // TODO: duck-duck-scrape doesn't accept AbortSignal — can't cancel in-flight searches
-    const response = await search(input.query, { safeSearch: SafeSearchType.STRICT })
+    let response: Awaited<ReturnType<typeof search>>
+    try {
+      response = await search(input.query, { safeSearch: SafeSearchType.STRICT })
+    } catch (err) {
+      const msg = err instanceof Error ? err.message : String(err)
+      if (isAnomalyError(msg)) {
+        throw new Error(DDG_ANOMALY_HINT)
+      }
+      throw err
+    }
    const hits = applyDomainFilters(
      response.results.map(r => ({
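The error translation in this hunk can be looked at in isolation. The sketch below, a simplified illustration rather than the provider's actual code, factors the decision into a pure function; `toActionableError` and the shortened hint string are hypothetical, while the regular expression is the one from `isAnomalyError`.

```typescript
// Same pattern as isAnomalyError in the diff.
const ANOMALY_RE = /anomaly in the request|likely making requests too quickly/i

// Hypothetical helper: returns the error that should be rethrown.
// Matching scraper errors are replaced by the hint; anything else passes through.
function toActionableError(err: unknown, hint: string): unknown {
  const msg = err instanceof Error ? err.message : String(err)
  return ANOMALY_RE.test(msg) ? new Error(hint) : err
}

const hint = 'Configure a search backend (shortened hint for illustration).'

const blocked = new Error('DDG detected an anomaly in the request')
const translated = toActionableError(blocked, hint) as Error

const unrelated = new Error('socket hang up')
const passedThrough = toActionableError(unrelated, hint)
```

Unrelated failures keep their original error object, so stack traces and error types from other causes are not masked.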
--- a/src/utils/auth.ts
+++ b/src/utils/auth.ts
@@ -693,7 +693,7 @@ export function refreshAwsAuth(awsAuthRefresh: string): Promise<boolean> {
              'AWS auth refresh timed out after 3 minutes. Run your auth command manually in a separate terminal.',
            )
          : chalk.red(
-              'Error running awsAuthRefresh (in settings or ~/.claude.json):',
+              'Error running awsAuthRefresh (in settings or ~/.openclaude.json):',
            )
        // biome-ignore lint/suspicious/noConsole:: intentional console output
        console.error(message)
@@ -771,7 +771,7 @@ async function getAwsCredsFromCredentialExport(): Promise<{
      }
    } catch (e) {
      const message = chalk.red(
-        'Error getting AWS credentials from awsCredentialExport (in settings or ~/.claude.json):',
+        'Error getting AWS credentials from awsCredentialExport (in settings or ~/.openclaude.json):',
      )
      if (e instanceof Error) {
        // biome-ignore lint/suspicious/noConsole:: intentional console output
@@ -961,7 +961,7 @@ export function refreshGcpAuth(gcpAuthRefresh: string): Promise<boolean> {
              'GCP auth refresh timed out after 3 minutes. Run your auth command manually in a separate terminal.',
            )
          : chalk.red(
-              'Error running gcpAuthRefresh (in settings or ~/.claude.json):',
+              'Error running gcpAuthRefresh (in settings or ~/.openclaude.json):',
            )
        // biome-ignore lint/suspicious/noConsole:: intentional console output
        console.error(message)
@@ -1959,7 +1959,7 @@ export async function validateForceLoginOrg(): Promise<OrgValidationResult> {
  // Always fetch the authoritative org UUID from the profile endpoint.
  // Even keychain-sourced tokens verify server-side: the cached org UUID
-  // in ~/.claude.json is user-writable and cannot be trusted.
+  // in ~/.openclaude.json is user-writable and cannot be trusted.
  const { source } = getAuthTokenSource()
  const isEnvVarToken =
    source === 'CLAUDE_CODE_OAUTH_TOKEN' ||
--- a/src/utils/caCertsConfig.ts
+++ b/src/utils/caCertsConfig.ts
@@ -28,7 +28,7 @@ import { getSettingsForSource } from './settings/settings.js'
 * is lazy-initialized) and ensure Node.js compatibility.
 *
 * This is safe to call before the trust dialog because we only read from
- * user-controlled files (~/.claude/settings.json and ~/.claude.json),
+ * user-controlled files (~/.claude/settings.json and ~/.openclaude.json),
 * not from project-level settings.
 */
 export function applyExtraCACertsFromConfig(): void {
@@ -52,7 +52,7 @@ export function applyExtraCACertsFromConfig(): void {
 * after the trust dialog. But we need the CA cert early to establish the TLS
 * connection to an HTTPS proxy during init().
 *
- * We read from global config (~/.claude.json) and user settings
+ * We read from global config (~/.openclaude.json) and user settings
 * (~/.claude/settings.json). These are user-controlled files that don't
 * require trust approval.
 */
--- a/src/utils/claudeInChrome/setup.ts
+++ b/src/utils/claudeInChrome/setup.ts
@@ -355,7 +355,7 @@ exec ${command}
 *
 * Only positive detections are persisted. A negative result from the
 * filesystem scan is not cached, because it may come from a machine that
- * shares ~/.claude.json but has no local Chrome (e.g. a remote dev
+ * shares ~/.openclaude.json but has no local Chrome (e.g. a remote dev
 * environment using the bridge), and caching it would permanently poison
 * auto-enable for every session on every machine that reads that config.
 */
--- a/src/utils/config.ts
+++ b/src/utils/config.ts
@@ -244,6 +244,7 @@ export type GlobalConfig = {
  bypassPermissionsModeAccepted?: boolean
  hasUsedBackslashReturn?: boolean
  autoCompactEnabled: boolean // Controls whether auto-compact is enabled
  toolHistoryCompressionEnabled: boolean // Compress old tool_result content for small-context providers
  showTurnDuration: boolean // Controls whether to show turn duration message (e.g., "Cooked for 1m 6s")
  /**
   * @deprecated Use settings.env instead.
@@ -622,6 +623,7 @@ function createDefaultGlobalConfig(): GlobalConfig {
    verbose: false,
    editorMode: 'normal',
    autoCompactEnabled: true,
    toolHistoryCompressionEnabled: true,
    showTurnDuration: true,
    hasSeenTasksHint: false,
    hasUsedStash: false,
@@ -668,6 +670,7 @@ export const GLOBAL_CONFIG_KEYS = [
  'editorMode',
  'hasUsedBackslashReturn',
  'autoCompactEnabled',
  'toolHistoryCompressionEnabled',
  'showTurnDuration',
  'diffTool',
  'env',
@@ -918,7 +921,7 @@ let configCacheHits = 0
 let configCacheMisses = 0
 // Session-total count of actual disk writes to the global config file.
 // Exposed for internal-only dev diagnostics (see inc-4552) so anomalous write
-// rates surface in the UI before they corrupt ~/.claude.json.
+// rates surface in the UI before they corrupt ~/.openclaude.json.
 let globalConfigWriteCount = 0
 export function getGlobalConfigWriteCount(): number {
@@ -1257,7 +1260,7 @@ function saveConfigWithLock<A extends object>(
    const currentConfig = getConfig(file, createDefault)
    if (file === getGlobalClaudeFile() && wouldLoseAuthState(currentConfig)) {
      logForDebugging(
-        'saveConfigWithLock: re-read config is missing auth that cache has; refusing to write to avoid wiping ~/.claude.json. See GH #3117.',
+        'saveConfigWithLock: re-read config is missing auth that cache has; refusing to write to avoid wiping ~/.openclaude.json. See GH #3117.',
        { level: 'error' },
      )
      logEvent('tengu_config_auth_loss_prevented', {})
--- a/src/utils/deepLink/registerProtocol.ts
+++ b/src/utils/deepLink/registerProtocol.ts
@@ -253,7 +253,7 @@ async function resolveClaudePath(): Promise<string> {
 * Check whether the OS-level protocol handler is already registered AND
 * points at the expected `claude` binary. Reads the registration artifact
 * directly (symlink target, .desktop Exec line, registry value) rather than
- * a cached flag in ~/.claude.json, so:
+ * a cached flag in ~/.openclaude.json, so:
 *   - the check is per-machine (config can sync across machines; OS state can't)
 *   - stale paths self-heal (install-method change → re-register next session)
 *   - deleted artifacts self-heal
@@ -311,7 +311,7 @@ export async function ensureDeepLinkProtocolRegistered(): Promise<void> {
  // EACCES/ENOSPC are deterministic — retrying next session won't help.
  // Throttle to once per 24h so a read-only ~/.local/share/applications
  // doesn't generate a failure event on every startup. Marker lives in
-  // ~/.claude (per-machine, not synced) rather than ~/.claude.json (can sync).
+  // ~/.claude (per-machine, not synced) rather than ~/.openclaude.json (can sync).
  const failureMarkerPath = path.join(
    getClaudeConfigHomeDir(),
    '.deep-link-register-failed',
--- a/src/utils/env.test.ts
+++ b/src/utils/env.test.ts
@@ -0,0 +1,62 @@
 import { afterEach, beforeEach, expect, test } from 'bun:test'
 import { mkdtempSync, rmSync, writeFileSync } from 'fs'
 import { tmpdir } from 'os'
 import { join } from 'path'
 const originalEnv = {
  CLAUDE_CONFIG_DIR: process.env.CLAUDE_CONFIG_DIR,
  CLAUDE_CODE_CUSTOM_OAUTH_URL: process.env.CLAUDE_CODE_CUSTOM_OAUTH_URL,
  USER_TYPE: process.env.USER_TYPE,
 }
 let tempDir: string
 beforeEach(() => {
  tempDir = mkdtempSync(join(tmpdir(), 'openclaude-env-test-'))
  process.env.CLAUDE_CONFIG_DIR = tempDir
  delete process.env.CLAUDE_CODE_CUSTOM_OAUTH_URL
  delete process.env.USER_TYPE
 })
 afterEach(() => {
  rmSync(tempDir, { recursive: true, force: true })
  if (originalEnv.CLAUDE_CONFIG_DIR === undefined) {
    delete process.env.CLAUDE_CONFIG_DIR
  } else {
    process.env.CLAUDE_CONFIG_DIR = originalEnv.CLAUDE_CONFIG_DIR
  }
  if (originalEnv.CLAUDE_CODE_CUSTOM_OAUTH_URL === undefined) {
    delete process.env.CLAUDE_CODE_CUSTOM_OAUTH_URL
  } else {
    process.env.CLAUDE_CODE_CUSTOM_OAUTH_URL = originalEnv.CLAUDE_CODE_CUSTOM_OAUTH_URL
  }
  if (originalEnv.USER_TYPE === undefined) {
    delete process.env.USER_TYPE
  } else {
    process.env.USER_TYPE = originalEnv.USER_TYPE
  }
 })
 async function importFreshEnvModule() {
  return import(`./env.js?ts=${Date.now()}-${Math.random()}`)
 }
 // getGlobalClaudeFile — three migration branches
 test('getGlobalClaudeFile: new install returns .openclaude.json when neither file exists', async () => {
  const { getGlobalClaudeFile } = await importFreshEnvModule()
  expect(getGlobalClaudeFile()).toBe(join(tempDir, '.openclaude.json'))
 })
 test('getGlobalClaudeFile: existing user keeps .claude.json when only legacy file exists', async () => {
  writeFileSync(join(tempDir, '.claude.json'), '{}')
  const { getGlobalClaudeFile } = await importFreshEnvModule()
  expect(getGlobalClaudeFile()).toBe(join(tempDir, '.claude.json'))
 })
 test('getGlobalClaudeFile: migrated user uses .openclaude.json when both files exist', async () => {
  writeFileSync(join(tempDir, '.claude.json'), '{}')
  writeFileSync(join(tempDir, '.openclaude.json'), '{}')
  const { getGlobalClaudeFile } = await importFreshEnvModule()
  expect(getGlobalClaudeFile()).toBe(join(tempDir, '.openclaude.json'))
 })
--- a/src/utils/env.ts
+++ b/src/utils/env.ts
@@ -21,8 +21,21 @@ export const getGlobalClaudeFile = memoize((): string => {
    return join(getClaudeConfigHomeDir(), '.config.json')
  }
-  const filename = `.claude${fileSuffixForOauthConfig()}.json`
-  return join(process.env.CLAUDE_CONFIG_DIR || homedir(), filename)
+  const oauthSuffix = fileSuffixForOauthConfig()
+  const configDir = process.env.CLAUDE_CONFIG_DIR || homedir()
+  // Default to .openclaude.json. Fall back to .claude.json only if the new
+  // file doesn't exist yet and the legacy one does (same migration pattern
+  // as resolveClaudeConfigHomeDir for the config directory).
+  const newFilename = `.openclaude${oauthSuffix}.json`
+  const legacyFilename = `.claude${oauthSuffix}.json`
+  if (
+    !getFsImplementation().existsSync(join(configDir, newFilename)) &&
+    getFsImplementation().existsSync(join(configDir, legacyFilename))
+  ) {
+    return join(configDir, legacyFilename)
+  }
+  return join(configDir, newFilename)
 })
 const hasInternetAccess = memoize(async (): Promise<boolean> => {
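The three migration branches covered by `env.test.ts` reduce to one rule: prefer the new filename unless only the legacy file exists. A minimal sketch, assuming a hypothetical `pickGlobalConfigName` that takes an injected `exists` predicate instead of touching the filesystem:

```typescript
// Hypothetical pure version of the filename choice in getGlobalClaudeFile.
// `exists` stands in for getFsImplementation().existsSync on the config dir.
function pickGlobalConfigName(
  exists: (name: string) => boolean,
  oauthSuffix = '',
): string {
  const newFilename = `.openclaude${oauthSuffix}.json`
  const legacyFilename = `.claude${oauthSuffix}.json`
  // Fall back to the legacy name only when it is the sole file present.
  if (!exists(newFilename) && exists(legacyFilename)) {
    return legacyFilename
  }
  return newFilename
}

// Build a predicate from a fixed list of present files.
const present = (...files: string[]) => (name: string) => files.includes(name)

const freshInstall = pickGlobalConfigName(present())
const legacyOnly = pickGlobalConfigName(present('.claude.json'))
const bothPresent = pickGlobalConfigName(present('.claude.json', '.openclaude.json'))
```

Because the real function is memoized, the choice is made once per process; the tests above work around that by importing a fresh module instance for each case.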
--- a/src/utils/hookChains.integration.test.ts
+++ b/src/utils/hookChains.integration.test.ts
@@ -0,0 +1,350 @@
 import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
 import { mkdtemp, rm, writeFile } from 'node:fs/promises'
 import { tmpdir } from 'node:os'
 import { join } from 'node:path'
 type HookChainsModule = typeof import('./hookChains.js')
 type ImportHarnessOptions = {
  allowRemoteSessions?: boolean
  teamFile?:
    | {
        name: string
        members: Array<{ name: string }>
      }
    | null
  teamName?: string
  senderName?: string
  replBridgeHandle?: unknown
 }
 const tempDirs: string[] = []
 const originalHookChainsEnabled = process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
 async function createConfigFile(config: unknown): Promise<string> {
  const dir = await mkdtemp(join(tmpdir(), 'openclaude-hook-chains-int-'))
  tempDirs.push(dir)
  const filePath = join(dir, 'hook-chains.json')
  await writeFile(filePath, JSON.stringify(config, null, 2), 'utf-8')
  return filePath
 }
 async function importHookChainsHarness(
  options: ImportHarnessOptions = {},
 ): Promise<{
  mod: HookChainsModule
  writeToMailboxSpy: ReturnType<typeof mock>
  agentToolCallSpy: ReturnType<typeof mock>
 }> {
  mock.restore()
  const allowRemoteSessions = options.allowRemoteSessions ?? true
  const teamName = options.teamName ?? 'mesh-team'
  const senderName = options.senderName ?? 'mesh-lead'
  const replBridgeHandle = options.replBridgeHandle ?? null
  const writeToMailboxSpy = mock(async () => {})
  const agentToolCallSpy = mock(async () => ({
    data: {
      status: 'async_launched',
      agentId: 'agent-fallback-1',
    },
  }))
  mock.module('../services/analytics/index.js', () => ({
    logEvent: () => {},
  }))
  mock.module('./telemetry/events.js', () => ({
    logOTelEvent: async () => {},
  }))
  mock.module('../services/policyLimits/index.js', () => ({
    isPolicyAllowed: () => allowRemoteSessions,
  }))
  mock.module('./swarm/teamHelpers.js', () => ({
    readTeamFileAsync: async () => options.teamFile ?? null,
  }))
  mock.module('./teammateMailbox.js', () => ({
    writeToMailbox: writeToMailboxSpy,
  }))
  mock.module('./teammate.js', () => ({
    getAgentName: () => senderName,
    getTeamName: () => teamName,
    getTeammateColor: () => 'blue',
  }))
  mock.module('../bridge/replBridgeHandle.js', () => ({
    getReplBridgeHandle: () => replBridgeHandle,
  }))
  // Integration mock target requested in the task: fallback action can route
  // through this mocked tool launcher from runtime callback wiring.
  mock.module('../tools/AgentTool/AgentTool.js', () => ({
    AgentTool: {
      call: agentToolCallSpy,
    },
  }))
  const mod = await import(`./hookChains.js?integration=${Date.now()}-${Math.random()}`)
  return { mod, writeToMailboxSpy, agentToolCallSpy }
 }
 beforeEach(() => {
  process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS = '1'
 })
 afterEach(async () => {
  mock.restore()
  if (originalHookChainsEnabled === undefined) {
    delete process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
  } else {
    process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS = originalHookChainsEnabled
  }
  await Promise.all(
    tempDirs.splice(0).map(dir => rm(dir, { recursive: true, force: true })),
  )
 })
 describe('hookChains integration dispatch', () => {
  test('end-to-end rule evaluation + action dispatch on TaskCompleted failure', async () => {
    const { mod } = await importHookChainsHarness({
      teamName: 'mesh-team',
      senderName: 'mesh-lead',
      teamFile: {
        name: 'mesh-team',
        members: [{ name: 'mesh-lead' }, { name: 'worker-a' }, { name: 'worker-b' }],
      },
    })
    const configPath = await createConfigFile({
      version: 1,
      enabled: true,
      maxChainDepth: 3,
      defaultCooldownMs: 0,
      defaultDedupWindowMs: 0,
      rules: [
        {
          id: 'task-failure-recovery',
          trigger: { event: 'TaskCompleted', outcome: 'failed' },
          actions: [
            { type: 'spawn_fallback_agent' },
            { type: 'notify_team' },
          ],
        },
      ],
    })
    const spawnSpy = mock(async () => ({ launched: true, agentId: 'agent-e2e-1' }))
    const notifySpy = mock(async () => ({ sent: true, recipientCount: 2 }))
    const result = await mod.dispatchHookChainsForEvent({
      configPathOverride: configPath,
      event: {
        eventName: 'TaskCompleted',
        outcome: 'failed',
        payload: {
          task_id: 'task-001',
          task_subject: 'Patch flaky build',
          error: 'CI timeout',
        },
      },
      runtime: {
        onSpawnFallbackAgent: spawnSpy,
        onNotifyTeam: notifySpy,
      },
    })
    expect(result.enabled).toBe(true)
    expect(result.matchedRuleIds).toEqual(['task-failure-recovery'])
    expect(result.actionResults).toHaveLength(2)
    expect(result.actionResults[0]?.status).toBe('executed')
    expect(result.actionResults[1]?.status).toBe('executed')
    expect(spawnSpy).toHaveBeenCalledTimes(1)
    expect(notifySpy).toHaveBeenCalledTimes(1)
  })
  test('fallback spawn injects failure context into generated prompt', async () => {
    const { mod, agentToolCallSpy } = await importHookChainsHarness()
    const configPath = await createConfigFile({
      version: 1,
      enabled: true,
      maxChainDepth: 3,
      defaultCooldownMs: 0,
      defaultDedupWindowMs: 0,
      rules: [
        {
          id: 'fallback-context',
          trigger: { event: 'TaskCompleted', outcome: 'failed' },
          actions: [
            {
              type: 'spawn_fallback_agent',
              description: 'Fallback for failed task',
            },
          ],
        },
      ],
    })
    const result = await mod.dispatchHookChainsForEvent({
      configPathOverride: configPath,
      event: {
        eventName: 'TaskCompleted',
        outcome: 'failed',
        payload: {
          task_id: 'task-ctx-1',
          task_subject: 'Repair migration guard',
          task_description: 'Fix regression in check ordering',
          error: 'Task failed after retry budget exhausted',
        },
      },
      runtime: {
        onSpawnFallbackAgent: async request => {
          const { AgentTool } = await import('../tools/AgentTool/AgentTool.js')
          await (AgentTool.call as unknown as (...args: unknown[]) => Promise<unknown>)({
            prompt: request.prompt,
            description: request.description,
            run_in_background: request.runInBackground,
            subagent_type: request.agentType,
            model: request.model,
          })
          return { launched: true, agentId: 'agent-fallback-ctx' }
        },
      },
    })
    expect(result.actionResults[0]?.status).toBe('executed')
    expect(agentToolCallSpy).toHaveBeenCalledTimes(1)
    const callInput = agentToolCallSpy.mock.calls[0]?.[0] as {
      prompt: string
      description: string
      run_in_background: boolean
    }
    expect(callInput.description).toBe('Fallback for failed task')
    expect(callInput.run_in_background).toBe(true)
    expect(callInput.prompt).toContain('Event: TaskCompleted')
    expect(callInput.prompt).toContain('Outcome: failed')
    expect(callInput.prompt).toContain('Task subject: Repair migration guard')
    expect(callInput.prompt).toContain('Failure details: Task failed after retry budget exhausted')
  })
  test('notify_team dispatches mailbox writes when team exists and skips when absent', async () => {
    const withTeam = await importHookChainsHarness({
      teamName: 'mesh-a',
      senderName: 'lead-a',
      teamFile: {
        name: 'mesh-a',
        members: [{ name: 'lead-a' }, { name: 'worker-1' }, { name: 'worker-2' }],
      },
    })
    const configPathWithTeam = await createConfigFile({
      version: 1,
      enabled: true,
      maxChainDepth: 3,
      defaultCooldownMs: 0,
      defaultDedupWindowMs: 0,
      rules: [
        {
          id: 'notify-existing-team',
          trigger: { event: 'TaskCompleted', outcome: 'failed' },
          actions: [{ type: 'notify_team' }],
        },
      ],
    })
    const withTeamResult = await withTeam.mod.dispatchHookChainsForEvent({
      configPathOverride: configPathWithTeam,
      event: {
        eventName: 'TaskCompleted',
        outcome: 'failed',
        payload: { task_id: 'task-team-ok', error: 'boom' },
      },
    })
    expect(withTeamResult.actionResults[0]?.status).toBe('executed')
    expect(withTeam.writeToMailboxSpy).toHaveBeenCalledTimes(2)
    const recipients = withTeam.writeToMailboxSpy.mock.calls.map(
      call => call[0] as string,
    )
    expect(recipients.sort()).toEqual(['worker-1', 'worker-2'])
    const withoutTeam = await importHookChainsHarness({
      teamName: 'mesh-missing',
      senderName: 'lead-missing',
      teamFile: null,
    })
    const configPathWithoutTeam = await createConfigFile({
      version: 1,
      enabled: true,
      maxChainDepth: 3,
      defaultCooldownMs: 0,
      defaultDedupWindowMs: 0,
      rules: [
        {
          id: 'notify-missing-team',
          trigger: { event: 'TaskCompleted', outcome: 'failed' },
          actions: [{ type: 'notify_team' }],
        },
      ],
    })
    const withoutTeamResult = await withoutTeam.mod.dispatchHookChainsForEvent({
      configPathOverride: configPathWithoutTeam,
      event: {
        eventName: 'TaskCompleted',
        outcome: 'failed',
        payload: { task_id: 'task-team-missing', error: 'boom' },
      },
    })
    expect(withoutTeamResult.actionResults[0]?.status).toBe('skipped')
    expect(withoutTeamResult.actionResults[0]?.reason).toContain('Team file not found')
    expect(withoutTeam.writeToMailboxSpy).not.toHaveBeenCalled()
  })
  test('warm_remote_capacity is a safe no-op when bridge is inactive', async () => {
    const { mod } = await importHookChainsHarness({
      allowRemoteSessions: true,
      replBridgeHandle: null,
    })
    const configPath = await createConfigFile({
      version: 1,
      enabled: true,
      maxChainDepth: 3,
      defaultCooldownMs: 0,
      defaultDedupWindowMs: 0,
      rules: [
        {
          id: 'bridge-warmup-noop',
          trigger: { event: 'TaskCompleted', outcome: 'failed' },
          actions: [{ type: 'warm_remote_capacity' }],
        },
      ],
    })
    const result = await mod.dispatchHookChainsForEvent({
      configPathOverride: configPath,
      event: {
        eventName: 'TaskCompleted',
        outcome: 'failed',
        payload: { task_id: 'task-warm-1' },
      },
    })
    expect(result.actionResults).toHaveLength(1)
    expect(result.actionResults[0]?.status).toBe('skipped')
    expect(result.actionResults[0]?.reason).toContain('Bridge is not active')
  })
 })
--- a/src/utils/hookChains.test.ts
+++ b/src/utils/hookChains.test.ts
@@ -0,0 +1,476 @@
 import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
 import { mkdtemp, rm, writeFile } from 'node:fs/promises'
 import { tmpdir } from 'node:os'
 import { join } from 'node:path'
 type HookChainsModule = typeof import('./hookChains.js')
 const tempDirs: string[] = []
 const originalHookChainsEnabled = process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
 async function makeConfigFile(config: unknown): Promise<string> {
  const dir = await mkdtemp(join(tmpdir(), 'openclaude-hook-chains-'))
  tempDirs.push(dir)
  const filePath = join(dir, 'hook-chains.json')
  await writeFile(filePath, JSON.stringify(config, null, 2), 'utf-8')
  return filePath
 }
 async function importHookChainsModule(options?: {
  allowRemoteSessions?: boolean
 }): Promise<HookChainsModule> {
  mock.restore()
  const allowRemoteSessions = options?.allowRemoteSessions ?? true
  mock.module('../services/analytics/index.js', () => ({
    logEvent: () => {},
  }))
  mock.module('./telemetry/events.js', () => ({
    logOTelEvent: async () => {},
  }))
  mock.module('../services/policyLimits/index.js', () => ({
    isPolicyAllowed: () => allowRemoteSessions,
  }))
  return import(`./hookChains.js?test=${Date.now()}-${Math.random()}`)
 }
 beforeEach(() => {
  process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS = '1'
 })
 afterEach(async () => {
  mock.restore()
  if (originalHookChainsEnabled === undefined) {
    delete process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
  } else {
    process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS = originalHookChainsEnabled
  }
  await Promise.all(
    tempDirs.splice(0).map(dir => rm(dir, { recursive: true, force: true })),
  )
 })
 describe('hookChains schema validation', () => {
  test('returns disabled config when env gate is unset', async () => {
    delete process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
    const mod = await importHookChainsModule()
    const configPath = await makeConfigFile({
      version: 1,
      enabled: true,
      rules: [
        {
          id: 'env-gated-rule',
          trigger: { event: 'TaskCompleted', outcome: 'failed' },
          actions: [{ type: 'spawn_fallback_agent' }],
        },
      ],
    })
    const loaded = mod.loadHookChainsConfig({ pathOverride: configPath })
    expect(loaded.exists).toBe(false)
    expect(loaded.config.enabled).toBe(false)
    expect(loaded.config.rules).toHaveLength(0)
  })
  test('loads valid config and memoizes by mtime/size', async () => {
    const mod = await importHookChainsModule()
    const configPath = await makeConfigFile({
      version: 1,
      enabled: true,
      maxChainDepth: 3,
      defaultCooldownMs: 5000,
      defaultDedupWindowMs: 5000,
      rules: [
        {
          id: 'task-failure-fallback',
          trigger: { event: 'TaskCompleted', outcome: 'failed' },
          actions: [
            {
              type: 'spawn_fallback_agent',
              description: 'Fallback recovery agent',
            },
          ],
        },
      ],
    })
    const first = mod.loadHookChainsConfig({ pathOverride: configPath })
    expect(first.exists).toBe(true)
    expect(first.error).toBeUndefined()
    expect(first.fromCache).toBe(false)
    expect(first.config.enabled).toBe(true)
    expect(first.config.rules).toHaveLength(1)
    expect(first.config.rules[0]?.id).toBe('task-failure-fallback')
    const second = mod.loadHookChainsConfig({ pathOverride: configPath })
    expect(second.exists).toBe(true)
    expect(second.error).toBeUndefined()
    expect(second.fromCache).toBe(true)
    expect(second.config.rules).toHaveLength(1)
  })
  test('accepts wrapped { hookChains: ... } config shape', async () => {
    const mod = await importHookChainsModule()
    const configPath = await makeConfigFile({
      hookChains: {
        version: 1,
        enabled: true,
        rules: [
          {
            id: 'wrapped-shape',
            trigger: { event: 'PostToolUseFailure', outcomes: ['failed'] },
            actions: [{ type: 'notify_team' }],
          },
        ],
      },
    })
    const loaded = mod.loadHookChainsConfig({ pathOverride: configPath })
    expect(loaded.error).toBeUndefined()
    expect(loaded.config.enabled).toBe(true)
    expect(loaded.config.rules[0]?.id).toBe('wrapped-shape')
  })
  test('returns disabled config for invalid schema', async () => {
    const mod = await importHookChainsModule()
    const configPath = await makeConfigFile({
      version: 1,
      enabled: true,
      rules: [
        {
          id: 'invalid-rule',
          trigger: {
            event: 'TaskCompleted',
            outcome: 'failed',
            outcomes: ['failed'],
          },
          actions: [{ type: 'spawn_fallback_agent' }],
        },
      ],
    })
    const loaded = mod.loadHookChainsConfig({ pathOverride: configPath })
    expect(loaded.exists).toBe(true)
    expect(loaded.error).toBeDefined()
    expect(loaded.config.enabled).toBe(false)
    expect(loaded.config.rules).toHaveLength(0)
  })
 })
 describe('evaluateHookChainRules', () => {
  test('matches by event + outcome + condition', async () => {
    const mod = await importHookChainsModule()
    const rules = [
      {
        id: 'post-tool-failure-rule',
        trigger: { event: 'PostToolUseFailure', outcome: 'failed' },
        condition: {
          toolNames: ['Edit'],
          errorIncludes: ['permission'],
          eventFieldEquals: { 'meta.source': 'scheduler' },
        },
        actions: [{ type: 'spawn_fallback_agent' }],
      },
    ]
    const matches = mod.evaluateHookChainRules(rules as never, {
      eventName: 'PostToolUseFailure',
      outcome: 'failed',
      payload: {
        tool_name: 'Edit',
        error: 'Permission denied by policy',
        meta: { source: 'scheduler' },
      },
    })
    expect(matches).toHaveLength(1)
    expect(matches[0]?.rule.id).toBe('post-tool-failure-rule')
  })
  test('does not match when event/condition fail', async () => {
    const mod = await importHookChainsModule()
    const rules = [
      {
        id: 'rule-no-match',
        trigger: { event: 'PostToolUseFailure', outcomes: ['failed'] },
        condition: { toolNames: ['Write'] },
        actions: [{ type: 'spawn_fallback_agent' }],
      },
    ]
    const wrongEvent = mod.evaluateHookChainRules(rules as never, {
      eventName: 'TaskCompleted',
      outcome: 'failed',
      payload: { tool_name: 'Write' },
    })
    expect(wrongEvent).toHaveLength(0)
    const wrongCondition = mod.evaluateHookChainRules(rules as never, {
      eventName: 'PostToolUseFailure',
      outcome: 'failed',
      payload: { tool_name: 'Edit' },
    })
    expect(wrongCondition).toHaveLength(0)
  })
 })
 describe('dispatchHookChainsForEvent guard logic', () => {
  test('dedup skips duplicate event/action within dedup window', async () => {
    const mod = await importHookChainsModule()
    const configPath = await makeConfigFile({
      version: 1,
      enabled: true,
      maxChainDepth: 4,
      defaultCooldownMs: 0,
      defaultDedupWindowMs: 60_000,
      rules: [
        {
          id: 'dedup-rule',
          trigger: { event: 'TaskCompleted', outcome: 'failed' },
          cooldownMs: 0,
          dedupWindowMs: 60_000,
          actions: [{ id: 'spawn-1', type: 'spawn_fallback_agent' }],
        },
      ],
    })
    const spawn = mock(async () => ({ launched: true, agentId: 'agent-1' }))
    const first = await mod.dispatchHookChainsForEvent({
      configPathOverride: configPath,
      event: {
        eventName: 'TaskCompleted',
        outcome: 'failed',
        payload: { task_id: 'task-123', error: 'boom' },
      },
      runtime: { onSpawnFallbackAgent: spawn },
    })
    const second = await mod.dispatchHookChainsForEvent({
      configPathOverride: configPath,
      event: {
        eventName: 'TaskCompleted',
        outcome: 'failed',
        payload: { task_id: 'task-123', error: 'boom' },
      },
      runtime: { onSpawnFallbackAgent: spawn },
    })
    expect(first.actionResults[0]?.status).toBe('executed')
    expect(second.actionResults[0]?.status).toBe('skipped')
    expect(second.actionResults[0]?.reason).toContain('dedup')
    expect(spawn).toHaveBeenCalledTimes(1)
  })
  test('cooldown skips second dispatch when rule cooldown is active', async () => {
    const mod = await importHookChainsModule()
    const configPath = await makeConfigFile({
      version: 1,
      enabled: true,
      maxChainDepth: 4,
      defaultCooldownMs: 60_000,
      defaultDedupWindowMs: 0,
      rules: [
        {
          id: 'cooldown-rule',
          trigger: { event: 'TaskCompleted', outcome: 'failed' },
          cooldownMs: 60_000,
          dedupWindowMs: 0,
          actions: [{ type: 'spawn_fallback_agent' }],
        },
      ],
    })
    const spawn = mock(async () => ({ launched: true, agentId: 'agent-2' }))
    const first = await mod.dispatchHookChainsForEvent({
      configPathOverride: configPath,
      event: {
        eventName: 'TaskCompleted',
        outcome: 'failed',
        payload: { task_id: 'task-456' },
      },
      runtime: { onSpawnFallbackAgent: spawn },
    })
    const second = await mod.dispatchHookChainsForEvent({
      configPathOverride: configPath,
      event: {
        eventName: 'TaskCompleted',
        outcome: 'failed',
        payload: { task_id: 'task-789' },
      },
      runtime: { onSpawnFallbackAgent: spawn },
    })
    expect(first.actionResults[0]?.status).toBe('executed')
    expect(second.actionResults[0]?.status).toBe('skipped')
    expect(second.actionResults[0]?.reason).toContain('cooldown')
    expect(spawn).toHaveBeenCalledTimes(1)
  })
  test('depth limit blocks dispatch when chain depth reaches max', async () => {
    const mod = await importHookChainsModule()
    const configPath = await makeConfigFile({
      version: 1,
      enabled: true,
      maxChainDepth: 1,
      defaultCooldownMs: 0,
      defaultDedupWindowMs: 0,
      rules: [
        {
          id: 'depth-rule',
          trigger: { event: 'TaskCompleted', outcome: 'failed' },
          actions: [{ type: 'spawn_fallback_agent' }],
        },
      ],
    })
    const spawn = mock(async () => ({ launched: true, agentId: 'agent-3' }))
    const result = await mod.dispatchHookChainsForEvent({
      configPathOverride: configPath,
      event: {
        eventName: 'TaskCompleted',
        outcome: 'failed',
        payload: { task_id: 'task-depth' },
      },
      runtime: {
        chainDepth: 1,
        onSpawnFallbackAgent: spawn,
      },
    })
    expect(result.enabled).toBe(true)
    expect(result.matchedRuleIds).toHaveLength(0)
    expect(result.actionResults).toHaveLength(0)
    expect(spawn).not.toHaveBeenCalled()
  })
 })
 describe('action dispatch skip scenarios', () => {
  test('fails spawn_fallback_agent when launcher callback is missing', async () => {
    const mod = await importHookChainsModule()
    const configPath = await makeConfigFile({
      version: 1,
      enabled: true,
      maxChainDepth: 3,
      defaultCooldownMs: 0,
      defaultDedupWindowMs: 0,
      rules: [
        {
          id: 'missing-launcher',
          trigger: { event: 'TaskCompleted', outcome: 'failed' },
          actions: [{ type: 'spawn_fallback_agent' }],
        },
      ],
    })
    const result = await mod.dispatchHookChainsForEvent({
      configPathOverride: configPath,
      event: {
        eventName: 'TaskCompleted',
        outcome: 'failed',
        payload: { task_id: 'task-missing-launcher' },
      },
      runtime: {},
    })
    expect(result.actionResults[0]?.status).toBe('failed')
    expect(result.actionResults[0]?.reason).toContain('launcher')
  })
  test('skips disabled action and does not execute callback', async () => {
    const mod = await importHookChainsModule()
    const configPath = await makeConfigFile({
      version: 1,
      enabled: true,
      maxChainDepth: 3,
      defaultCooldownMs: 0,
      defaultDedupWindowMs: 0,
      rules: [
        {
          id: 'disabled-action-rule',
          trigger: { event: 'TaskCompleted', outcome: 'failed' },
          actions: [
            {
              type: 'spawn_fallback_agent',
              enabled: false,
            },
          ],
        },
      ],
    })
    const spawn = mock(async () => ({ launched: true, agentId: 'agent-4' }))
    const result = await mod.dispatchHookChainsForEvent({
      configPathOverride: configPath,
      event: {
        eventName: 'TaskCompleted',
        outcome: 'failed',
        payload: { task_id: 'task-disabled' },
      },
      runtime: { onSpawnFallbackAgent: spawn },
    })
    expect(result.actionResults[0]?.status).toBe('skipped')
    expect(result.actionResults[0]?.reason).toContain('disabled')
    expect(spawn).not.toHaveBeenCalled()
  })
  test('skips warm_remote_capacity when policy denies remote sessions', async () => {
    const mod = await importHookChainsModule({ allowRemoteSessions: false })
    const configPath = await makeConfigFile({
      version: 1,
      enabled: true,
      maxChainDepth: 3,
      defaultCooldownMs: 0,
      defaultDedupWindowMs: 0,
      rules: [
        {
          id: 'policy-denied-remote-warm',
          trigger: { event: 'TaskCompleted', outcome: 'failed' },
          actions: [{ type: 'warm_remote_capacity' }],
        },
      ],
    })
    const warm = mock(async () => ({
      warmed: true,
      environmentId: 'env-123',
    }))
    const result = await mod.dispatchHookChainsForEvent({
      configPathOverride: configPath,
      event: {
        eventName: 'TaskCompleted',
        outcome: 'failed',
        payload: { task_id: 'task-policy-denied' },
      },
      runtime: { onWarmRemoteCapacity: warm },
    })
    expect(result.actionResults[0]?.status).toBe('skipped')
    expect(result.actionResults[0]?.reason).toContain('policy')
    expect(warm).not.toHaveBeenCalled()
  })
 })
--- a/src/utils/hookChains.ts
+++ b/src/utils/hookChains.ts
--- a/src/utils/hooks.ts
+++ b/src/utils/hooks.ts
@@ -10,6 +10,7 @@ import { wrapSpawn } from './ShellCommand.js'
 import { TaskOutput } from './task/TaskOutput.js'
 import { getCwd } from './cwd.js'
 import { randomUUID } from 'crypto'
+import { feature } from 'bun:bundle'
 import { formatShellPrefixCommand } from './bash/shellPrefix.js'
 import {
  getHookEnvFilePath,
@@ -134,6 +135,7 @@ import { registerPendingAsyncHook } from './hooks/AsyncHookRegistry.js'
 import { enqueuePendingNotification } from './messageQueueManager.js'
 import {
  extractTextContent,
+  createAssistantMessage,
  getLastAssistantMessage,
  wrapInSystemReminder,
 } from './messages.js'
@@ -145,6 +147,7 @@ import {
 import { createAttachmentMessage } from './attachments.js'
 import { all } from './generators.js'
 import { findToolByName, type Tools, type ToolUseContext } from '../Tool.js'
+import type { CanUseToolFn } from '../hooks/useCanUseTool.js'
 import { execPromptHook } from './hooks/execPromptHook.js'
 import type { Message, AssistantMessage } from '../types/message.js'
 import { execAgentHook } from './hooks/execAgentHook.js'
@@ -162,9 +165,147 @@ import type { AppState } from '../state/AppState.js'
 import { jsonStringify, jsonParse } from './slowOperations.js'
 import { isEnvTruthy } from './envUtils.js'
 import { errorMessage, getErrnoCode } from './errors.js'
 import { getAgentName, getTeamName, getTeammateColor } from './teammate.js'
 import type {
  HookChainOutcome,
  HookChainRuntimeContext,
  SpawnFallbackAgentRequest,
  SpawnFallbackAgentResponse,
 } from './hookChains.js'
 const TOOL_HOOK_EXECUTION_TIMEOUT_MS = 10 * 60 * 1000
 function normalizeFallbackAgentModel(
  model: string | undefined,
 ): 'sonnet' | 'opus' | 'haiku' | undefined {
  if (model === 'sonnet' || model === 'opus' || model === 'haiku') {
    return model
  }
  return undefined
 }
 async function launchFallbackAgentFromHookChains(
  request: SpawnFallbackAgentRequest,
  toolUseContext: ToolUseContext,
  canUseTool: CanUseToolFn,
 ): Promise<SpawnFallbackAgentResponse> {
  try {
    const { AgentTool } = await import('../tools/AgentTool/AgentTool.js')
    const normalizedModel = normalizeFallbackAgentModel(request.model)
    const result = await AgentTool.call(
      {
        prompt: request.prompt,
        description: request.description,
        run_in_background: true,
        ...(request.agentType ? { subagent_type: request.agentType } : {}),
        ...(normalizedModel ? { model: normalizedModel } : {}),
      },
      toolUseContext,
      canUseTool,
      createAssistantMessage({ content: [] }),
    )
    const data = result.data as
      | {
          status?: string
          agentId?: string
          agent_id?: string
        }
      | undefined
    const status = data?.status
    if (
      status === 'async_launched' ||
      status === 'completed' ||
      status === 'remote_launched' ||
      status === 'teammate_spawned'
    ) {
      return {
        launched: true,
        agentId: data?.agentId ?? data?.agent_id,
      }
    }
    return {
      launched: true,
      reason:
        status !== undefined
          ? `Fallback launched with status ${status}`
          : undefined,
    }
  } catch (error) {
    return {
      launched: false,
      reason: `Fallback launch failed: ${errorMessage(error)}`,
    }
  }
 }
 async function dispatchHookChainFromHookRuntime(args: {
  eventName: 'PostToolUseFailure' | 'TaskCompleted'
  outcome: HookChainOutcome
  payload: Record<string, unknown>
  signal?: AbortSignal
  toolUseContext?: ToolUseContext
 }): Promise<void> {
  try {
    if (!feature('HOOK_CHAINS')) {
      return
    }
    const { dispatchHookChainsForEvent } = await import('./hookChains.js')
    const runtime: HookChainRuntimeContext = {
      signal: args.signal,
      senderName: getAgentName() ?? undefined,
      senderColor: getTeammateColor() ?? undefined,
      teamName: getTeamName() ?? undefined,
    }
    const chainDepth = args.toolUseContext?.queryTracking?.depth
    if (typeof chainDepth === 'number' && Number.isFinite(chainDepth)) {
      runtime.chainDepth = chainDepth
    }
    const hookChainsCanUseTool = (
      args.toolUseContext as
        | (ToolUseContext & { hookChainsCanUseTool?: CanUseToolFn })
        | undefined
    )?.hookChainsCanUseTool
    if (args.toolUseContext) {
      runtime.onSpawnFallbackAgent = request => {
        if (!hookChainsCanUseTool) {
          return Promise.resolve({
            launched: false,
            reason:
              'Fallback action requires canUseTool in this hook runtime context',
          })
        }
        return launchFallbackAgentFromHookChains(
          request,
          args.toolUseContext!,
          hookChainsCanUseTool,
        )
      }
    }
    await dispatchHookChainsForEvent({
      event: {
        eventName: args.eventName,
        outcome: args.outcome,
        payload: args.payload,
      },
      runtime,
    })
  } catch (error) {
    logForDebugging(
      `[hook-chains] Dispatch failed for ${args.eventName}: ${errorMessage(error)}`,
    )
  }
 }
 /**
 * SessionEnd hooks run during shutdown/clear and need a much tighter bound
 * than TOOL_HOOK_EXECUTION_TIMEOUT_MS. This value is used by callers as both
@@ -3502,9 +3643,11 @@ export async function* executePostToolUseFailureHooks<ToolInput>(
 ): AsyncGenerator<AggregatedHookResult> {
  const appState = toolUseContext.getAppState()
  const sessionId = toolUseContext.agentId ?? getSessionId()
-  if (!hasHookForEvent('PostToolUseFailure', appState, sessionId)) {
-    return
-  }
+  const hasPostToolFailureHooks = hasHookForEvent(
+    'PostToolUseFailure',
+    appState,
+    sessionId,
+  )
  const hookInput: PostToolUseFailureHookInput = {
    ...createBaseHookInput(permissionMode, undefined, toolUseContext),
@@ -3516,12 +3659,33 @@ export async function* executePostToolUseFailureHooks<ToolInput>(
    is_interrupt: isInterrupt,
  }
-  yield* executeHooks({
-    hookInput,
-    toolUseID,
-    matchQuery: toolName,
+  let blockingHookCount = 0
+
+  if (hasPostToolFailureHooks) {
+    for await (const result of executeHooks({
+      hookInput,
+      toolUseID,
+      matchQuery: toolName,
+      signal,
+      timeoutMs,
+      toolUseContext,
+    })) {
+      if (result.blockingError) {
+        blockingHookCount++
+      }
+      yield result
+    }
+  }
+  await dispatchHookChainFromHookRuntime({
+    eventName: 'PostToolUseFailure',
+    outcome: 'failed',
+    payload: {
+      ...hookInput,
+      hook_blocking_error_count: blockingHookCount,
+      hook_execution_skipped: !hasPostToolFailureHooks,
+    },
    signal,
    timeoutMs,
    toolUseContext,
  })
 }
@@ -3807,12 +3971,36 @@ export async function* executeTaskCompletedHooks(
    team_name: teamName,
  }
-  yield* executeHooks({
+  let blockingHookCount = 0
+  let preventedContinuation = false
+  for await (const result of executeHooks({
    hookInput,
    toolUseID: randomUUID(),
    signal,
    timeoutMs,
    toolUseContext,
+  })) {
+    if (result.blockingError) {
+      blockingHookCount++
+    }
+    if (result.preventContinuation) {
+      preventedContinuation = true
+    }
+    yield result
+  }
+  await dispatchHookChainFromHookRuntime({
+    eventName: 'TaskCompleted',
+    outcome:
+      blockingHookCount > 0 || preventedContinuation ? 'failed' : 'success',
+    payload: {
+      ...hookInput,
+      hook_blocking_error_count: blockingHookCount,
+      hook_prevented_continuation: preventedContinuation,
+    },
+    signal,
+    toolUseContext,
  })
 }
--- a/src/utils/json.ts
+++ b/src/utils/json.ts
@@ -24,7 +24,7 @@ type CachedParse = { ok: true; value: unknown } | { ok: false }
 // lodash memoize default resolver = first arg only).
 // Skip caching above this size — the LRU stores the full string as the key,
 // so a 200KB config file would pin ~10MB in #keyList across 50 slots. Large
-// inputs like ~/.claude.json also change between reads (numStartups bumps on
+// inputs like ~/.openclaude.json also change between reads (numStartups bumps on
 // every CC startup), so the cache never hits anyway.
 const PARSE_CACHE_MAX_KEY_BYTES = 8 * 1024
--- a/src/utils/localInstaller.ts
+++ b/src/utils/localInstaller.ts
@@ -44,9 +44,10 @@ function getCandidateLocalBinaryPaths(localInstallDir: string): string[] {
 }
 export function isManagedLocalInstallationPath(execPath: string): boolean {
+  const normalizedExecPath = execPath.replace(/\\+/g, '/')
  return (
-    execPath.includes('/.openclaude/local/node_modules/') ||
-    execPath.includes('/.claude/local/node_modules/')
+    normalizedExecPath.includes('/.openclaude/local/node_modules/') ||
+    normalizedExecPath.includes('/.claude/local/node_modules/')
  )
 }
--- a/src/utils/managedEnv.ts
+++ b/src/utils/managedEnv.ts
@@ -131,7 +131,7 @@ export function applySafeConfigEnvironmentVariables(): void {
        : null
  }
-  // Global config (~/.claude.json) is user-controlled. In CCD mode,
+  // Global config (~/.openclaude.json) is user-controlled. In CCD mode,
  // filterSettingsEnv strips keys that were in the spawn env snapshot so
  // the desktop host's operational vars (OTEL, etc.) are not overridden.
  Object.assign(process.env, filterSettingsEnv(getGlobalConfig().env))
--- a/src/utils/managedEnvConstants.ts
+++ b/src/utils/managedEnvConstants.ts
@@ -123,7 +123,6 @@ export const SAFE_ENV_VARS = new Set([
  'ANTHROPIC_DEFAULT_SONNET_MODEL_DESCRIPTION',
  'ANTHROPIC_DEFAULT_SONNET_MODEL_NAME',
  'ANTHROPIC_DEFAULT_SONNET_MODEL_SUPPORTED_CAPABILITIES',
-  'ANTHROPIC_FOUNDRY_API_KEY',
  'ANTHROPIC_MODEL',
  'ANTHROPIC_SMALL_FAST_MODEL_AWS_REGION',
  'ANTHROPIC_SMALL_FAST_MODEL',
--- a/src/utils/model/benchmark.ts
+++ b/src/utils/model/benchmark.ts
@@ -0,0 +1,205 @@
 /**
 * Model Benchmarking for OpenClaude
 * 
 * Tests and compares model speed/quality for informed model selection.
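 * Requests go to an OpenAI-compatible /chat/completions endpoint (OpenAI, Ollama,
 * NVIDIA NIM, or MiniMax via OPENAI_BASE_URL). When no endpoint or OPENAI_API_KEY
 * is available, the result reports the benchmark as unsupported.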
 */
 import { getAPIProvider } from './providers.js'
 export interface BenchmarkResult {
  model: string
  provider: string
  firstTokenMs: number
  totalTokens: number
  tokensPerSecond: number
  success: boolean
  error?: string
 }
 const TEST_PROMPT = 'Write a short hello world in Python.'
 const MAX_TOKENS = 50
 const TIMEOUT_MS = 30000
 function getBenchmarkEndpoint(): string | null {
  const provider = getAPIProvider()
  const baseUrl = process.env.OPENAI_BASE_URL
  // Check for Ollama (local)
  if (baseUrl?.includes('localhost:11434') || baseUrl?.includes('localhost:11435')) {
    return `${baseUrl}/chat/completions`
  }
  // OpenAI-compatible endpoints
  if (provider === 'openai' || provider === 'firstParty') {
    return `${baseUrl || 'https://api.openai.com/v1'}/chat/completions`
  }
  // NVIDIA NIM or MiniMax via OPENAI_BASE_URL
  if (baseUrl?.includes('nvidia') || baseUrl?.includes('minimax')) {
    return `${baseUrl}/chat/completions`
  }
  return null
 }
 function getBenchmarkAuthHeader(): string | null {
  const apiKey = process.env.OPENAI_API_KEY
  if (!apiKey) return null
  return `Bearer ${apiKey}`
 }
 export async function benchmarkModel(
  model: string,
  onChunk?: (text: string) => void,
 ): Promise<BenchmarkResult> {
  const endpoint = getBenchmarkEndpoint()
  const authHeader = getBenchmarkAuthHeader()
  if (!endpoint || !authHeader) {
    return {
      model,
      provider: getAPIProvider(),
      firstTokenMs: 0,
      totalTokens: 0,
      tokensPerSecond: 0,
      success: false,
      error: 'Benchmark not supported for this provider',
    }
  }
  const startTime = performance.now()
  let totalTokens = 0
  let firstTokenMs: number | null = null
  try {
    const response = await fetch(endpoint, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': authHeader,
      },
      body: JSON.stringify({
        model,
        messages: [{ role: 'user', content: TEST_PROMPT }],
        max_tokens: MAX_TOKENS,
        stream: true,
      }),
      signal: AbortSignal.timeout(TIMEOUT_MS),
    })
    if (!response.ok) {
      let errorMsg = `HTTP ${response.status}`
      try {
        const error = await response.json()
        errorMsg = error.error?.message || errorMsg
      } catch {
        // ignore
      }
      return {
        model,
        provider: getAPIProvider(),
        firstTokenMs: 0,
        totalTokens: 0,
        tokensPerSecond: 0,
        success: false,
        error: errorMsg,
      }
    }
    const reader = response.body?.getReader()
    if (!reader) {
      throw new Error('No response body')
    }
    const decoder = new TextDecoder()
    let buffer = ''
    while (true) {
      const { done, value } = await reader.read()
      if (done) break
      buffer += decoder.decode(value, { stream: true })
      const lines = buffer.split('\n')
      buffer = lines.pop() || ''
      for (const line of lines) {
        if (line.startsWith('data: ')) {
          const data = line.slice(6)
          if (data === '[DONE]') continue
          try {
            const json = JSON.parse(data)
            const content = json.choices?.[0]?.delta?.content
            if (content) {
              if (firstTokenMs === null) {
                firstTokenMs = performance.now() - startTime
              }
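              // Approximation: roughly 4 characters per token; this request does not ask the API for exact usage.
              totalTokens += content.length / 4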
              onChunk?.(content)
            }
          } catch {
            // skip invalid JSON
          }
        }
      }
    }
    const totalMs = performance.now() - startTime
    const tokensPerSecond = totalMs > 0 ? (totalTokens / totalMs) * 1000 : 0
    return {
      model,
      provider: getAPIProvider(),
      firstTokenMs: firstTokenMs ?? 0,
      totalTokens,
      tokensPerSecond,
      success: true,
    }
  } catch (error) {
    return {
      model,
      provider: getAPIProvider(),
      firstTokenMs: 0,
      totalTokens: 0,
      tokensPerSecond: 0,
      success: false,
      error: error instanceof Error ? error.message : 'Unknown error',
    }
  }
 }
 export async function benchmarkMultipleModels(
  models: string[],
  onProgress?: (completed: number, total: number, result: BenchmarkResult) => void,
 ): Promise<BenchmarkResult[]> {
  const results: BenchmarkResult[] = []
  for (let i = 0; i < models.length; i++) {
    const result = await benchmarkModel(models[i])
    results.push(result)
    onProgress?.(i + 1, models.length, result)
  }
  return results
 }
 export function formatBenchmarkResults(results: BenchmarkResult[]): string {
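  const header = 'Model'.padEnd(40) + 'TPS'.padStart(6) + '  ' + 'First Token'.padStart(12) + '  Status'
  const divider = '-'.repeat(70)
  // Copy before sorting so the caller's array is not reordered in place.
  const rows = [...results]
    .sort((a, b) => b.tokensPerSecond - a.tokensPerSecond)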
    .map(r => {
      const name = r.model.length > 38 ? r.model.slice(0, 37) + '…' : r.model
      const tps = r.tokensPerSecond.toFixed(1).padStart(6)
      const first = r.firstTokenMs > 0 ? `${r.firstTokenMs.toFixed(0)}ms`.padStart(12) : 'N/A'.padStart(12)
      const status = r.success ? '✓' : '✗'
      return name.padEnd(40) + tps + '  ' + first + '  ' + status
    })
  return [header, divider, ...rows].join('\n')
 }
 export function isBenchmarkSupported(): boolean {
  const endpoint = getBenchmarkEndpoint()
  const authHeader = getBenchmarkAuthHeader()
  return endpoint !== null && authHeader !== null
 }
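Reviewer note: the streaming loop in `benchmarkModel` relies on carrying a partial line between network chunks. The sketch below isolates that idea so it can be checked on its own. `createSseLineParser` and `estimateTokens` are hypothetical names that do not exist in the diff, and the four-characters-per-token ratio is the same rough estimate the benchmark uses, not an exact count.

```typescript
// Hypothetical helper, not part of the diff: splits streamed SSE text into
// complete `data:` payloads while carrying partial lines between chunks.
function createSseLineParser(): (chunk: string) => string[] {
  let buffer = ''
  return (chunk: string): string[] => {
    buffer += chunk
    const lines = buffer.split('\n')
    // The last element is '' when the chunk ended on a newline,
    // otherwise it is an incomplete line that must wait for more data.
    buffer = lines.pop() ?? ''
    const payloads: string[] = []
    for (const line of lines) {
      if (!line.startsWith('data: ')) continue
      const data = line.slice(6).trim()
      if (data === '[DONE]') continue
      payloads.push(data)
    }
    return payloads
  }
}

// Rough estimate only: about four characters per token.
function estimateTokens(text: string): number {
  return text.length / 4
}

const parseChunk = createSseLineParser()
// The second payload is split across two chunks on purpose.
const firstPayloads = parseChunk('data: {"a":1}\ndata: {"b"')
const secondPayloads = parseChunk(':2}\ndata: [DONE]\n')
```

Keeping the parser separate from `fetch` would also make the benchmark's parsing testable without a live endpoint.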
--- a/src/utils/model/configs.ts
+++ b/src/utils/model/configs.ts
@@ -20,7 +20,7 @@ export const OPENAI_MODEL_DEFAULTS = {
 // Override with GEMINI_MODEL env var.
 // ---------------------------------------------------------------------------
 export const GEMINI_MODEL_DEFAULTS = {
-  opus: 'gemini-2.5-pro-preview-03-25',   // most capable
+  opus: 'gemini-2.5-pro',   // most capable
  sonnet: 'gemini-2.0-flash',              // balanced
  haiku: 'gemini-2.0-flash-lite',          // fast & cheap
 } as const
@@ -112,7 +112,7 @@ export const CLAUDE_OPUS_4_CONFIG = {
  vertex: 'claude-opus-4@20250514',
  foundry: 'claude-opus-4',
  openai: 'gpt-4o',
-  gemini: 'gemini-2.5-pro-preview-03-25',
+  gemini: 'gemini-2.5-pro',
  github: 'github:copilot',
  codex: 'gpt-5.4',
  'nvidia-nim': 'nvidia/llama-3.1-nemotron-70b-instruct',
@@ -125,7 +125,7 @@ export const CLAUDE_OPUS_4_1_CONFIG = {
  vertex: 'claude-opus-4-1@20250805',
  foundry: 'claude-opus-4-1',
  openai: 'gpt-4o',
-  gemini: 'gemini-2.5-pro-preview-03-25',
+  gemini: 'gemini-2.5-pro',
  github: 'github:copilot',
  codex: 'gpt-5.4',
  'nvidia-nim': 'nvidia/llama-3.1-nemotron-70b-instruct',
@@ -138,7 +138,7 @@ export const CLAUDE_OPUS_4_5_CONFIG = {
  vertex: 'claude-opus-4-5@20251101',
  foundry: 'claude-opus-4-5',
  openai: 'gpt-4o',
-  gemini: 'gemini-2.5-pro-preview-03-25',
+  gemini: 'gemini-2.5-pro',
  github: 'github:copilot',
  codex: 'gpt-5.4',
  'nvidia-nim': 'nvidia/llama-3.1-nemotron-70b-instruct',
@@ -151,7 +151,7 @@ export const CLAUDE_OPUS_4_6_CONFIG = {
  vertex: 'claude-opus-4-6',
  foundry: 'claude-opus-4-6',
  openai: 'gpt-4o',
-  gemini: 'gemini-2.5-pro-preview-03-25',
+  gemini: 'gemini-2.5-pro',
  github: 'github:copilot',
  codex: 'gpt-5.4',
  'nvidia-nim': 'nvidia/llama-3.1-nemotron-70b-instruct',
--- a/src/utils/model/model.openai-shim-providers.test.ts
+++ b/src/utils/model/model.openai-shim-providers.test.ts
@@ -0,0 +1,115 @@
 import { afterEach, beforeEach, expect, test } from 'bun:test'
 import { saveGlobalConfig } from '../config.js'
 import { getUserSpecifiedModelSetting } from './model.js'
 const SAVED_ENV = {
  CLAUDE_CODE_USE_OPENAI: process.env.CLAUDE_CODE_USE_OPENAI,
  CLAUDE_CODE_USE_GEMINI: process.env.CLAUDE_CODE_USE_GEMINI,
  CLAUDE_CODE_USE_GITHUB: process.env.CLAUDE_CODE_USE_GITHUB,
  CLAUDE_CODE_USE_MISTRAL: process.env.CLAUDE_CODE_USE_MISTRAL,
  CLAUDE_CODE_USE_BEDROCK: process.env.CLAUDE_CODE_USE_BEDROCK,
  CLAUDE_CODE_USE_VERTEX: process.env.CLAUDE_CODE_USE_VERTEX,
  CLAUDE_CODE_USE_FOUNDRY: process.env.CLAUDE_CODE_USE_FOUNDRY,
  NVIDIA_NIM: process.env.NVIDIA_NIM,
  MINIMAX_API_KEY: process.env.MINIMAX_API_KEY,
  OPENAI_MODEL: process.env.OPENAI_MODEL,
  OPENAI_BASE_URL: process.env.OPENAI_BASE_URL,
  CODEX_API_KEY: process.env.CODEX_API_KEY,
  CHATGPT_ACCOUNT_ID: process.env.CHATGPT_ACCOUNT_ID,
 }
 function restoreEnv(key: keyof typeof SAVED_ENV): void {
  if (SAVED_ENV[key] === undefined) {
    delete process.env[key]
  } else {
    process.env[key] = SAVED_ENV[key]
  }
 }
 beforeEach(() => {
  delete process.env.CLAUDE_CODE_USE_OPENAI
  delete process.env.CLAUDE_CODE_USE_GEMINI
  delete process.env.CLAUDE_CODE_USE_GITHUB
  delete process.env.CLAUDE_CODE_USE_MISTRAL
  delete process.env.CLAUDE_CODE_USE_BEDROCK
  delete process.env.CLAUDE_CODE_USE_VERTEX
  delete process.env.CLAUDE_CODE_USE_FOUNDRY
  delete process.env.NVIDIA_NIM
  delete process.env.MINIMAX_API_KEY
  delete process.env.OPENAI_MODEL
  delete process.env.OPENAI_BASE_URL
  delete process.env.CODEX_API_KEY
  delete process.env.CHATGPT_ACCOUNT_ID
  saveGlobalConfig(current => ({
    ...current,
    model: undefined,
  }))
 })
 afterEach(() => {
  for (const key of Object.keys(SAVED_ENV) as Array<keyof typeof SAVED_ENV>) {
    restoreEnv(key)
  }
  saveGlobalConfig(current => ({
    ...current,
    model: undefined,
  }))
 })
 test('codex provider reads OPENAI_MODEL, not stale settings.model', () => {
  // Regression: switching from Moonshot (settings.model='kimi-k2.6' persisted
  // from that session) to the Codex profile. Codex profile correctly sets
  // OPENAI_MODEL=codexplan + base URL to chatgpt.com/backend-api/codex.
  // getUserSpecifiedModelSetting previously ignored env for 'codex' provider
  // and returned settings.model='kimi-k2.6', causing Codex's API to reject
  // the request: "The 'kimi-k2.6' model is not supported when using Codex".
  saveGlobalConfig(current => ({ ...current, model: 'kimi-k2.6' }))
  process.env.CLAUDE_CODE_USE_OPENAI = '1'
  process.env.OPENAI_BASE_URL = 'https://chatgpt.com/backend-api/codex'
  process.env.OPENAI_MODEL = 'codexplan'
  process.env.CODEX_API_KEY = 'codex-test'
  process.env.CHATGPT_ACCOUNT_ID = 'acct_test'
  const model = getUserSpecifiedModelSetting()
  expect(model).toBe('codexplan')
 })
 test('nvidia-nim provider reads OPENAI_MODEL, not stale settings.model', () => {
  saveGlobalConfig(current => ({ ...current, model: 'kimi-k2.6' }))
  process.env.NVIDIA_NIM = '1'
  process.env.CLAUDE_CODE_USE_OPENAI = '1'
  process.env.OPENAI_MODEL = 'nvidia/llama-3.1-nemotron-70b-instruct'
  const model = getUserSpecifiedModelSetting()
  expect(model).toBe('nvidia/llama-3.1-nemotron-70b-instruct')
 })
 test('minimax provider reads OPENAI_MODEL, not stale settings.model', () => {
  saveGlobalConfig(current => ({ ...current, model: 'kimi-k2.6' }))
  process.env.MINIMAX_API_KEY = 'minimax-test'
  process.env.CLAUDE_CODE_USE_OPENAI = '1'
  process.env.OPENAI_MODEL = 'MiniMax-M2.5'
  const model = getUserSpecifiedModelSetting()
  expect(model).toBe('MiniMax-M2.5')
 })
 test('openai provider still reads OPENAI_MODEL (regression guard)', () => {
  saveGlobalConfig(current => ({ ...current, model: 'stale-default' }))
  process.env.CLAUDE_CODE_USE_OPENAI = '1'
  process.env.OPENAI_MODEL = 'gpt-4o'
  const model = getUserSpecifiedModelSetting()
  expect(model).toBe('gpt-4o')
 })
 test('github provider still reads OPENAI_MODEL (regression guard)', () => {
  saveGlobalConfig(current => ({ ...current, model: 'stale-default' }))
  process.env.CLAUDE_CODE_USE_GITHUB = '1'
  process.env.OPENAI_MODEL = 'github:copilot'
  const model = getUserSpecifiedModelSetting()
  expect(model).toBe('github:copilot')
 })
--- a/src/utils/model/model.ts
+++ b/src/utils/model/model.ts
@@ -91,11 +91,24 @@ export function getUserSpecifiedModelSetting(): ModelSetting | undefined {
    const setting = normalizeModelSetting(settings.model)
    // Read the model env var that matches the active provider to prevent
    // cross-provider leaks (e.g. ANTHROPIC_MODEL sent to the OpenAI API).
    //
    // All OpenAI-shim providers (openai, codex, github, nvidia-nim, minimax)
    // set CLAUDE_CODE_USE_OPENAI=1 + OPENAI_MODEL via
    // applyProviderProfileToProcessEnv. Earlier this check only included
    // openai/github — codex/nvidia-nim/minimax fell through to the stale
    // settings.model, so switching from (say) Moonshot to Codex kept firing
    // `kimi-k2.6` at the Codex endpoint and getting 400s.
    const provider = getAPIProvider()
    const isOpenAIShimProvider =
      provider === 'openai' ||
      provider === 'codex' ||
      provider === 'github' ||
      provider === 'nvidia-nim' ||
      provider === 'minimax'
    specifiedModel =
      (provider === 'gemini' ? process.env.GEMINI_MODEL : undefined) ||
      (provider === 'mistral' ? process.env.MISTRAL_MODEL : undefined) ||
-      (provider === 'openai' || provider === 'gemini' || provider === 'mistral' || provider === 'github' ? process.env.OPENAI_MODEL : undefined) ||
+      (isOpenAIShimProvider ? process.env.OPENAI_MODEL : undefined) ||
      (provider === 'firstParty' ? process.env.ANTHROPIC_MODEL : undefined) ||
      setting ||
      undefined
@@ -140,7 +153,7 @@ export function getDefaultOpusModel(): ModelName {
  }
  // Gemini provider
  if (getAPIProvider() === 'gemini') {
-    return process.env.GEMINI_MODEL || 'gemini-2.5-pro-preview-03-25'
+    return process.env.GEMINI_MODEL || 'gemini-2.5-pro'
  }
  // Mistral provider
  if (getAPIProvider() === 'mistral') {
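Reviewer note: the precedence that `getUserSpecifiedModelSetting` now applies can be sketched as a pure function. This is a minimal illustration under assumptions: `resolveModel`, `ProviderName`, and `OPENAI_SHIM_PROVIDERS` are hypothetical names, the provider list is copied from the comment in the hunk above, and the real function reads `process.env` and settings directly rather than taking them as arguments.

```typescript
// Hypothetical names for illustration; the provider list mirrors the hunk above.
type ProviderName =
  | 'firstParty'
  | 'openai'
  | 'codex'
  | 'github'
  | 'nvidia-nim'
  | 'minimax'
  | 'gemini'
  | 'mistral'

const OPENAI_SHIM_PROVIDERS: ReadonlySet<ProviderName> = new Set<ProviderName>([
  'openai',
  'codex',
  'github',
  'nvidia-nim',
  'minimax',
])

// Picks the env var that matches the active provider, then falls back to the
// persisted setting. Env vars belonging to other providers are ignored.
function resolveModel(
  provider: ProviderName,
  env: Record<string, string | undefined>,
  savedModel?: string,
): string | undefined {
  return (
    (provider === 'gemini' ? env.GEMINI_MODEL : undefined) ||
    (provider === 'mistral' ? env.MISTRAL_MODEL : undefined) ||
    (OPENAI_SHIM_PROVIDERS.has(provider) ? env.OPENAI_MODEL : undefined) ||
    (provider === 'firstParty' ? env.ANTHROPIC_MODEL : undefined) ||
    savedModel ||
    undefined
  )
}

// Codex with OPENAI_MODEL set: the env var wins over the stale saved model.
const codexModel = resolveModel('codex', { OPENAI_MODEL: 'codexplan' }, 'kimi-k2.6')
// Codex without OPENAI_MODEL: the saved model is still the fallback.
const fallbackModel = resolveModel('codex', {}, 'kimi-k2.6')
// Gemini no longer picks up OPENAI_MODEL.
const geminiModel = resolveModel('gemini', { OPENAI_MODEL: 'gpt-4o' })
```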
--- a/src/utils/model/modelCache.test.ts
+++ b/src/utils/model/modelCache.test.ts
@@ -0,0 +1,30 @@
 import { describe, expect, it, beforeEach, afterEach, vi } from 'bun:test'
 import { isModelCacheValid, getCachedModelsFromDisk, saveModelsToCache } from '../model/modelCache.js'
 vi.mock('../model/ollamaModels.js', () => ({
  isOllamaProvider: vi.fn(() => true),
 }))
 describe('modelCache', () => {
  const mockModel = { value: 'llama3', label: 'Llama 3', description: 'Test model' }
  describe('isModelCacheValid', () => {
    it('returns false for non-existent cache', async () => {
      const result = await isModelCacheValid('ollama')
      expect(result).toBe(false)
    })
  })
  describe('getCachedModelsFromDisk', () => {
    it('returns null when no cache is available', async () => {
      const result = await getCachedModelsFromDisk()
      expect(result).toBeNull()
    })
  })
  describe('saveModelsToCache', () => {
    it('has saveModelsToCache function', () => {
      expect(typeof saveModelsToCache).toBe('function')
    })
  })
 })
--- a/src/utils/model/modelCache.ts
+++ b/src/utils/model/modelCache.ts
@@ -0,0 +1,165 @@
 /**
 * Model Caching for OpenClaude
 * 
 * Caches model lists to disk for faster startup and offline access.
 * Uses async fs operations to avoid blocking the event loop.
 */
 import { access, readFile, writeFile, unlink } from 'node:fs/promises'
 import { existsSync, mkdirSync } from 'node:fs'
 import { join } from 'node:path'
 import { homedir } from 'node:os'
 import { getAPIProvider } from './providers.js'
 const CACHE_VERSION = '1'
 const CACHE_TTL_HOURS = 24
 const CACHE_DIR_NAME = '.openclaude-model-cache'
 interface ModelCache {
  version: string
  timestamp: number
  provider: string
  models: Array<{ value: string; label: string; description: string }>
 }
 function getCacheDir(): string {
  const home = homedir()
  const cacheDir = join(home, CACHE_DIR_NAME)
  if (!existsSync(cacheDir)) {
    // Create synchronously: an un-awaited async mkdir could still be pending
    // (or reject unhandled) when the first cache write runs.
    try {
      mkdirSync(cacheDir, { recursive: true })
    } catch {
      // Best effort: later reads and writes already handle a missing directory.
    }
  }
  return cacheDir
 }
 function getCacheFilePath(provider: string): string {
  return join(getCacheDir(), `${provider}.json`)
 }
 function isOpenAICompatibleProvider(): boolean {
  const baseUrl = process.env.OPENAI_BASE_URL || ''
  return baseUrl.includes('localhost') || baseUrl.includes('nvidia') || baseUrl.includes('minimax') || getAPIProvider() === 'openai'
 }
 export async function isModelCacheValid(provider: string): Promise<boolean> {
  const cachePath = getCacheFilePath(provider)
  try {
    await access(cachePath)
  } catch {
    return false
  }
  try {
    const data = JSON.parse(await readFile(cachePath, 'utf-8')) as ModelCache
    if (data.version !== CACHE_VERSION) {
      return false
    }
    if (data.provider !== provider) {
      return false
    }
    const ageHours = (Date.now() - data.timestamp) / (1000 * 60 * 60)
    return ageHours < CACHE_TTL_HOURS
  } catch {
    return false
  }
 }
 export async function getCachedModelsFromDisk<T>(): Promise<T[] | null> {
  const provider = getAPIProvider()
  const baseUrl = process.env.OPENAI_BASE_URL || ''
  const isLocalOllama = baseUrl.includes('localhost:11434') || baseUrl.includes('localhost:11435')
  const isNvidia = baseUrl.includes('nvidia')
  const isMiniMax = baseUrl.includes('minimax')
  if (!isLocalOllama && !isNvidia && !isMiniMax && provider !== 'openai') {
    return null
  }
  const cachePath = getCacheFilePath(provider)
  if (!(await isModelCacheValid(provider))) {
    return null
  }
  try {
    const data = JSON.parse(await readFile(cachePath, 'utf-8')) as ModelCache
    return data.models as T[]
  } catch {
    return null
  }
 }
 export async function saveModelsToCache(
  models: Array<{ value: string; label: string; description: string }>,
 ): Promise<void> {
  const provider = getAPIProvider()
  if (!provider) return
  const cachePath = getCacheFilePath(provider)
  const cacheData: ModelCache = {
    version: CACHE_VERSION,
    timestamp: Date.now(),
    provider,
    models,
  }
  try {
    await writeFile(cachePath, JSON.stringify(cacheData, null, 2), 'utf-8')
  } catch (error) {
    console.warn('[ModelCache] Failed to save cache:', error)
  }
 }
 export async function clearModelCache(provider?: string): Promise<void> {
  if (provider) {
    const cachePath = getCacheFilePath(provider)
    try {
      await unlink(cachePath)
    } catch {
      // ignore if doesn't exist
    }
  } else {
    const cacheDir = getCacheDir()
    // Remove each file independently so one missing file does not stop the rest.
    await Promise.allSettled(
      ['ollama.json', 'nvidia-nim.json', 'minimax.json'].map(file =>
        unlink(join(cacheDir, file)),
      ),
    )
  }
 }
 export async function getModelCacheInfo(): Promise<{ provider: string; age: string } | null> {
  const provider = getAPIProvider()
  const cachePath = getCacheFilePath(provider)
  try {
    await access(cachePath)
  } catch {
    return null
  }
  try {
    const data = JSON.parse(await readFile(cachePath, 'utf-8')) as ModelCache
    const ageMs = Date.now() - data.timestamp
    const ageHours = Math.floor(ageMs / (1000 * 60 * 60))
    const ageMins = Math.floor((ageMs % (1000 * 60 * 60)) / (1000 * 60))
    return {
      provider: data.provider,
      age: ageHours > 0 ? `${ageHours}h ${ageMins}m` : `${ageMins}m`,
    }
  } catch {
    return null
  }
 }
 export function isCacheAvailable(): boolean {
  const baseUrl = process.env.OPENAI_BASE_URL || ''
  const isLocalOllama = baseUrl.includes('localhost:11434') || baseUrl.includes('localhost:11435')
  const isNvidia = baseUrl.includes('nvidia')
  const isMiniMax = baseUrl.includes('minimax')
  return isLocalOllama || isNvidia || isMiniMax || getAPIProvider() === 'openai'
 }
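Reviewer note: the freshness rule in `isModelCacheValid` (version match, provider match, age under the TTL) is easier to test when separated from disk access. The sketch below assumes the same constants as the diff ('1' and 24 hours). `CacheEnvelope` and `isEnvelopeFresh` are hypothetical names, and the clock is passed in so the check is deterministic.

```typescript
// Hypothetical shape: the metadata fields of the cache file, without the model list.
interface CacheEnvelope {
  version: string
  timestamp: number
  provider: string
}

// Assumed to match CACHE_VERSION and CACHE_TTL_HOURS in the diff.
const ASSUMED_CACHE_VERSION = '1'
const ASSUMED_TTL_HOURS = 24
const MS_PER_HOUR = 1000 * 60 * 60

// Returns true only when version and provider match and the entry is
// strictly younger than the TTL.
function isEnvelopeFresh(
  envelope: CacheEnvelope,
  provider: string,
  nowMs: number,
): boolean {
  if (envelope.version !== ASSUMED_CACHE_VERSION) return false
  if (envelope.provider !== provider) return false
  const ageHours = (nowMs - envelope.timestamp) / MS_PER_HOUR
  return ageHours < ASSUMED_TTL_HOURS
}

const nowMs = 1_700_000_000_000
const recentEntry: CacheEnvelope = {
  version: '1',
  timestamp: nowMs - 23 * MS_PER_HOUR,
  provider: 'openai',
}
const expiredEntry: CacheEnvelope = { ...recentEntry, timestamp: nowMs - 24 * MS_PER_HOUR }

const recentIsFresh = isEnvelopeFresh(recentEntry, 'openai', nowMs)
const expiredIsFresh = isEnvelopeFresh(expiredEntry, 'openai', nowMs)
const wrongProviderIsFresh = isEnvelopeFresh(recentEntry, 'minimax', nowMs)
```

A pure check like this could give `modelCache.test.ts` cases for expiry and provider mismatch, which the current tests do not cover.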
--- a/src/utils/model/openaiContextWindows.ts
+++ b/src/utils/model/openaiContextWindows.ts
@@ -219,6 +219,17 @@ const OPENAI_CONTEXT_WINDOWS: Record<string, number> = {
  'kimi-k2.5':                262_144,
  'glm-5':                    202_752,
  'glm-4.7':                  202_752,
  // Moonshot AI direct API (api.moonshot.ai/v1). Values from Moonshot's
  // published model card: K2.6 and K2-thinking are 256K, base K2 is 128K.
  // Prefix matching in lookupByKey catches variants like "kimi-k2.6-preview".
  'kimi-k2.6':                262_144,
  'kimi-k2':                  131_072,
  'kimi-k2-instruct':         131_072,
  'kimi-k2-thinking':         262_144,
  'moonshot-v1-8k':             8_192,
  'moonshot-v1-32k':           32_768,
  'moonshot-v1-128k':         131_072,
 }
 /**
@@ -391,6 +402,15 @@ const OPENAI_MAX_OUTPUT_TOKENS: Record<string, number> = {
  'kimi-k2.5':                 32_768,
  'glm-5':                     16_384,
  'glm-4.7':                   16_384,
  // Moonshot AI direct API
  'kimi-k2.6':                 32_768,
  'kimi-k2':                   32_768,
  'kimi-k2-instruct':          32_768,
  'kimi-k2-thinking':          32_768,
  'moonshot-v1-8k':             4_096,
  'moonshot-v1-32k':           16_384,
  'moonshot-v1-128k':          32_768,
 }
 function lookupByModel<T>(table: Record<string, T>, model: string): T | undefined {
--- a/src/utils/model/providers.test.ts
+++ b/src/utils/model/providers.test.ts
@@ -107,3 +107,60 @@ test('official OpenAI base URLs now keep provider detection on openai for aliase
  const { getAPIProvider } = await importFreshProvidersModule()
  expect(getAPIProvider()).toBe('openai')
 })
 // isGithubNativeAnthropicMode
 test('isGithubNativeAnthropicMode: false when CLAUDE_CODE_USE_GITHUB is not set', async () => {
  clearProviderEnv()
  process.env.OPENAI_MODEL = 'claude-sonnet-4-5'
  const { isGithubNativeAnthropicMode } = await importFreshProvidersModule()
  expect(isGithubNativeAnthropicMode()).toBe(false)
 })
 test('isGithubNativeAnthropicMode: true for bare claude- model via OPENAI_MODEL', async () => {
  clearProviderEnv()
  process.env.CLAUDE_CODE_USE_GITHUB = '1'
  process.env.OPENAI_MODEL = 'claude-sonnet-4-5'
  const { isGithubNativeAnthropicMode } = await importFreshProvidersModule()
  expect(isGithubNativeAnthropicMode()).toBe(true)
 })
 test('isGithubNativeAnthropicMode: true for github:copilot:claude- compound format', async () => {
  clearProviderEnv()
  process.env.CLAUDE_CODE_USE_GITHUB = '1'
  process.env.OPENAI_MODEL = 'github:copilot:claude-sonnet-4'
  const { isGithubNativeAnthropicMode } = await importFreshProvidersModule()
  expect(isGithubNativeAnthropicMode()).toBe(true)
 })
 test('isGithubNativeAnthropicMode: true when resolvedModel is a claude- model', async () => {
  clearProviderEnv()
  process.env.CLAUDE_CODE_USE_GITHUB = '1'
  process.env.OPENAI_MODEL = 'github:copilot'
  const { isGithubNativeAnthropicMode } = await importFreshProvidersModule()
  expect(isGithubNativeAnthropicMode('claude-haiku-4-5')).toBe(true)
 })
 test('isGithubNativeAnthropicMode: false for generic github:copilot alias', async () => {
  clearProviderEnv()
  process.env.CLAUDE_CODE_USE_GITHUB = '1'
  process.env.OPENAI_MODEL = 'github:copilot'
  const { isGithubNativeAnthropicMode } = await importFreshProvidersModule()
  expect(isGithubNativeAnthropicMode()).toBe(false)
 })
 test('isGithubNativeAnthropicMode: false for non-Claude model', async () => {
  clearProviderEnv()
  process.env.CLAUDE_CODE_USE_GITHUB = '1'
  process.env.OPENAI_MODEL = 'gpt-4o'
  const { isGithubNativeAnthropicMode } = await importFreshProvidersModule()
  expect(isGithubNativeAnthropicMode()).toBe(false)
 })
 test('isGithubNativeAnthropicMode: false for github:copilot:gpt- model', async () => {
  clearProviderEnv()
  process.env.CLAUDE_CODE_USE_GITHUB = '1'
  process.env.OPENAI_MODEL = 'github:copilot:gpt-4o'
  const { isGithubNativeAnthropicMode } = await importFreshProvidersModule()
  expect(isGithubNativeAnthropicMode()).toBe(false)
 })
--- a/src/utils/model/providers.ts
+++ b/src/utils/model/providers.ts
@@ -45,6 +45,24 @@ export function getAPIProvider(): APIProvider {
 export function usesAnthropicAccountFlow(): boolean {
  return getAPIProvider() === 'firstParty'
 }
 /**
 * Returns true when the GitHub provider should use Anthropic's native API
 * format instead of the OpenAI-compatible shim.
 *
 * Enabled when CLAUDE_CODE_USE_GITHUB=1 and the model string contains "claude-"
 * anywhere (handles bare names like "claude-sonnet-4" and compound formats like
 * "github:copilot:claude-sonnet-4" or any future provider-prefixed variants).
 *
 * api.githubcopilot.com supports Anthropic native format for Claude models,
 * enabling prompt caching via cache_control blocks which significantly reduces
 * per-turn token costs by caching the system prompt and tool definitions.
 */
 export function isGithubNativeAnthropicMode(resolvedModel?: string): boolean {
  if (!isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB)) return false
  const model = resolvedModel?.trim() || process.env.OPENAI_MODEL?.trim() || ''
  return model.toLowerCase().includes('claude-')
 }
 function isCodexModel(): boolean {
  return shouldUseCodexTransport(
    process.env.OPENAI_MODEL || '',
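Reviewer note: the decision made by `isGithubNativeAnthropicMode` can be reproduced without touching `process.env`, which makes the cases in `providers.test.ts` easy to reason about. This is a sketch under assumptions: `wouldUseNativeAnthropic` and `isFlagSet` are hypothetical names, and `isFlagSet` is a simplified stand-in for `isEnvTruthy` that accepts only `1` and `true`; the real helper may accept other spellings.

```typescript
type EnvLike = Record<string, string | undefined>

// Assumption: simplified stand-in for isEnvTruthy, accepting only '1' and 'true'.
function isFlagSet(value: string | undefined): boolean {
  const normalized = (value ?? '').trim().toLowerCase()
  return normalized === '1' || normalized === 'true'
}

// Mirrors the logic in the hunk above: GitHub mode must be on, and the
// resolved model (or OPENAI_MODEL as a fallback) must contain "claude-".
function wouldUseNativeAnthropic(env: EnvLike, resolvedModel?: string): boolean {
  if (!isFlagSet(env.CLAUDE_CODE_USE_GITHUB)) return false
  const model = resolvedModel?.trim() || env.OPENAI_MODEL?.trim() || ''
  return model.toLowerCase().includes('claude-')
}

const githubEnv: EnvLike = {
  CLAUDE_CODE_USE_GITHUB: '1',
  OPENAI_MODEL: 'github:copilot',
}

// Generic alias with no resolved model: stays on the OpenAI-compatible shim.
const aliasOnly = wouldUseNativeAnthropic(githubEnv)
// A resolved Claude model takes precedence over the alias in OPENAI_MODEL.
const resolvedClaude = wouldUseNativeAnthropic(githubEnv, 'claude-haiku-4-5')
// Compound format: "claude-" may appear anywhere in the string.
const compoundClaude = wouldUseNativeAnthropic({
  CLAUDE_CODE_USE_GITHUB: '1',
  OPENAI_MODEL: 'github:copilot:claude-sonnet-4',
})
// Without the GitHub flag, the model name alone is not enough.
const flagMissing = wouldUseNativeAnthropic({ OPENAI_MODEL: 'claude-sonnet-4-5' })
```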
--- a/src/utils/permissions/filesystem.ts
+++ b/src/utils/permissions/filesystem.ts
@@ -64,6 +64,7 @@ export const DANGEROUS_FILES = [
  '.profile',
  '.ripgreprc',
  '.mcp.json',
  '.openclaude.json',
  '.claude.json',
 ] as const
--- a/src/utils/plugins/marketplaceManager.ts
+++ b/src/utils/plugins/marketplaceManager.ts
@@ -532,6 +532,7 @@ export async function gitPull(
 ): Promise<{ code: number; stderr: string }> {
  logForDebugging(`git pull: cwd=${cwd} ref=${ref ?? 'default'}`)
  const env = { ...process.env, ...GIT_NO_PROMPT_ENV }
  const baseArgs = ['-c', 'core.hooksPath=/dev/null']
  const credentialArgs = options?.disableCredentialHelper
    ? ['-c', 'credential.helper=']
    : []
@@ -539,7 +540,7 @@ export async function gitPull(
  if (ref) {
    const fetchResult = await execFileNoThrowWithCwd(
      gitExe(),
-      [...credentialArgs, 'fetch', 'origin', ref],
+      [...baseArgs, ...credentialArgs, 'fetch', 'origin', ref],
      { cwd, timeout: getPluginGitTimeoutMs(), stdin: 'ignore', env },
    )
@@ -549,7 +550,7 @@ export async function gitPull(
    const checkoutResult = await execFileNoThrowWithCwd(
      gitExe(),
-      [...credentialArgs, 'checkout', ref],
+      [...baseArgs, ...credentialArgs, 'checkout', ref],
      { cwd, timeout: getPluginGitTimeoutMs(), stdin: 'ignore', env },
    )
@@ -559,7 +560,7 @@ export async function gitPull(
    const pullResult = await execFileNoThrowWithCwd(
      gitExe(),
-      [...credentialArgs, 'pull', 'origin', ref],
+      [...baseArgs, ...credentialArgs, 'pull', 'origin', ref],
      { cwd, timeout: getPluginGitTimeoutMs(), stdin: 'ignore', env },
    )
    if (pullResult.code !== 0) {
@@ -571,7 +572,7 @@ export async function gitPull(
  const result = await execFileNoThrowWithCwd(
    gitExe(),
-    [...credentialArgs, 'pull', 'origin', 'HEAD'],
+    [...baseArgs, ...credentialArgs, 'pull', 'origin', 'HEAD'],
    { cwd, timeout: getPluginGitTimeoutMs(), stdin: 'ignore', env },
  )
  if (result.code !== 0) {
@@ -625,6 +626,8 @@ async function gitSubmoduleUpdate(
    [
      '-c',
      'core.sshCommand=ssh -o BatchMode=yes -o StrictHostKeyChecking=yes',
      '-c',
      'core.hooksPath=/dev/null',
      ...credentialArgs,
      'submodule',
      'update',
@@ -810,6 +813,8 @@ export async function gitClone(
  const args = [
    '-c',
    'core.sshCommand=ssh -o BatchMode=yes -o StrictHostKeyChecking=yes',
    '-c',
    'core.hooksPath=/dev/null',
    'clone',
    '--depth',
    '1',
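Reviewer note: the same `-c core.hooksPath=/dev/null` prefix is now repeated across `gitPull`, `gitSubmoduleUpdate`, and `gitClone`. One possible way to keep the ordering consistent is a small builder, sketched below. `buildHardenedGitArgs` is a hypothetical helper that is not in the diff; it only illustrates that `-c` overrides are placed before the subcommand.

```typescript
// Hypothetical helper, not part of the diff: prepends the config overrides
// used in marketplaceManager.ts to any git subcommand.
function buildHardenedGitArgs(
  subcommand: string[],
  options: { disableCredentialHelper?: boolean } = {},
): string[] {
  // Point hooksPath at an empty location so repository hooks are not run.
  const baseArgs = ['-c', 'core.hooksPath=/dev/null']
  const credentialArgs = options.disableCredentialHelper
    ? ['-c', 'credential.helper=']
    : []
  // `-c name=value` pairs must come before the subcommand.
  return [...baseArgs, ...credentialArgs, ...subcommand]
}

const fetchArgs = buildHardenedGitArgs(['fetch', 'origin', 'main'])
const pullArgs = buildHardenedGitArgs(['pull', 'origin', 'HEAD'], {
  disableCredentialHelper: true,
})
```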
--- a/src/utils/providerAutoDetect.test.ts
+++ b/src/utils/providerAutoDetect.test.ts
@@ -0,0 +1,299 @@
 import { describe, expect, test } from 'bun:test'
 import {
  detectBestProvider,
  detectLocalService,
  detectProviderFromEnv,
 } from './providerAutoDetect.ts'
 // Hermetic env scan: always report "no Codex auth on disk" so tests don't
 // depend on the dev machine's ~/.codex/auth.json state.
 function scan(env: Record<string, string | undefined>) {
  return detectProviderFromEnv({ env, hasCodexAuth: () => false })
 }
 describe('detectProviderFromEnv — priority order', () => {
  test('ANTHROPIC_API_KEY wins over all others', () => {
    expect(
      scan({
        ANTHROPIC_API_KEY: 'sk-ant-x',
        OPENAI_API_KEY: 'sk-x',
        GEMINI_API_KEY: 'gem-x',
      }),
    ).toEqual({ kind: 'anthropic', source: 'ANTHROPIC_API_KEY set' })
  })
  test('CODEX_API_KEY beats OpenAI/Gemini/etc', () => {
    expect(
      scan({
        CODEX_API_KEY: 'codex-x',
        OPENAI_API_KEY: 'sk-x',
      }),
    ).toEqual({ kind: 'codex', source: 'CODEX_API_KEY set' })
  })
  test('CHATGPT_ACCOUNT_ID alone is enough for Codex', () => {
    expect(
      scan({
        CHATGPT_ACCOUNT_ID: 'acct-123',
      }),
    ).toEqual({ kind: 'codex', source: 'CHATGPT_ACCOUNT_ID set' })
  })
  test('Codex auth file on disk is detected without any env', () => {
    expect(
      detectProviderFromEnv({ env: {}, hasCodexAuth: () => true }),
    ).toEqual({ kind: 'codex', source: '~/.codex/auth.json present' })
  })
  test('GITHUB_TOKEN wins over OpenAI', () => {
    expect(
      scan({
        GITHUB_TOKEN: 'ghp-x',
        OPENAI_API_KEY: 'sk-x',
      }),
    ).toEqual({ kind: 'github', source: 'GITHUB_TOKEN set (GitHub Copilot)' })
  })
  test('GH_TOKEN is equivalent to GITHUB_TOKEN', () => {
    expect(
      scan({
        GH_TOKEN: 'ghp-x',
      }),
    ).toEqual({ kind: 'github', source: 'GH_TOKEN set (GitHub Copilot)' })
  })
  test('OPENAI_API_KEYS (plural) detected', () => {
    expect(
      scan({
        OPENAI_API_KEYS: 'sk-a,sk-b',
      }),
    ).toEqual({ kind: 'openai', source: 'OPENAI_API_KEYS set' })
  })
  test('OPENAI_API_KEY reports baseUrl when set', () => {
    expect(
      scan({
        OPENAI_API_KEY: 'sk-x',
        OPENAI_BASE_URL: 'https://openrouter.ai/api/v1',
      }),
    ).toEqual({
      kind: 'openai',
      source: 'OPENAI_API_KEY set',
      baseUrl: 'https://openrouter.ai/api/v1',
    })
  })
  test('GEMINI_API_KEY detected', () => {
    expect(scan({ GEMINI_API_KEY: 'gem-x' })).toEqual({
      kind: 'gemini',
      source: 'GEMINI_API_KEY set',
    })
  })
  test('GOOGLE_API_KEY also detects Gemini', () => {
    expect(scan({ GOOGLE_API_KEY: 'gk-x' })).toEqual({
      kind: 'gemini',
      source: 'GOOGLE_API_KEY set',
    })
  })
  test('MISTRAL_API_KEY detected', () => {
    expect(scan({ MISTRAL_API_KEY: 'mis-x' })).toEqual({
      kind: 'mistral',
      source: 'MISTRAL_API_KEY set',
    })
  })
  test('MINIMAX_API_KEY detected', () => {
    expect(scan({ MINIMAX_API_KEY: 'mm-x' })).toEqual({
      kind: 'minimax',
      source: 'MINIMAX_API_KEY set',
    })
  })
  test('empty-string values are ignored', () => {
    expect(
      scan({
        ANTHROPIC_API_KEY: '',
        OPENAI_API_KEY: '   ',
        GEMINI_API_KEY: 'gem-x',
      }),
    ).toEqual({ kind: 'gemini', source: 'GEMINI_API_KEY set' })
  })
  test('no credentials → null', () => {
    expect(scan({})).toBeNull()
  })
 })
 describe('detectLocalService', () => {
  test('returns Ollama when its /api/tags responds ok', async () => {
    const fetchImpl = (async (input: URL | RequestInfo) => {
      const url = typeof input === 'string' ? input : (input as URL).toString()
      if (url.includes(':11434')) {
        return new Response('{"models":[]}', { status: 200 })
      }
      return new Response('', { status: 404 })
    }) as typeof fetch
    const result = await detectLocalService({
      env: {},
      fetchImpl,
      timeoutMs: 200,
    })
    expect(result?.kind).toBe('ollama')
    expect(result?.baseUrl).toBe('http://localhost:11434')
  })
  test('Ollama wins over LM Studio even when both are reachable', async () => {
    const fetchImpl = (async () => new Response('{}', { status: 200 })) as typeof fetch
    const result = await detectLocalService({
      env: {},
      fetchImpl,
      timeoutMs: 200,
    })
    expect(result?.kind).toBe('ollama')
  })
  test('falls back to LM Studio when Ollama is unreachable', async () => {
    const fetchImpl = (async (input: URL | RequestInfo) => {
      const url = typeof input === 'string' ? input : (input as URL).toString()
      if (url.includes(':1234')) {
        return new Response('{"data":[]}', { status: 200 })
      }
      return new Response('', { status: 404 })
    }) as typeof fetch
    const result = await detectLocalService({
      env: {},
      fetchImpl,
      timeoutMs: 200,
    })
    expect(result?.kind).toBe('lm-studio')
    expect(result?.baseUrl).toBe('http://localhost:1234')
  })
  test('returns null when no local services respond', async () => {
    const fetchImpl = (async () =>
      new Response('', { status: 500 })) as typeof fetch
    const result = await detectLocalService({
      env: {},
      fetchImpl,
      timeoutMs: 200,
    })
    expect(result).toBeNull()
  })
  test('honors OLLAMA_BASE_URL override', async () => {
    const probedUrls: string[] = []
    const fetchImpl = (async (input: URL | RequestInfo) => {
      const url = typeof input === 'string' ? input : (input as URL).toString()
      probedUrls.push(url)
      return new Response('{"models":[]}', { status: 200 })
    }) as typeof fetch
    const result = await detectLocalService({
      env: { OLLAMA_BASE_URL: 'http://10.0.0.5:11434' },
      fetchImpl,
      timeoutMs: 200,
    })
    expect(result?.baseUrl).toBe('http://10.0.0.5:11434')
    expect(probedUrls).toContain('http://10.0.0.5:11434/api/tags')
  })
  test('probe timeout does not throw — returns null', async () => {
    const fetchImpl = (async (_input: URL | RequestInfo, init?: RequestInit) => {
      // Respect the caller's abort signal so the race with timeoutMs is fair.
      return new Promise<Response>((_resolve, reject) => {
        const onAbort = () => reject(new Error('aborted'))
        init?.signal?.addEventListener('abort', onAbort)
        setTimeout(() => {
          init?.signal?.removeEventListener('abort', onAbort)
          _resolve(new Response('ok'))
        }, 500)
      })
    }) as typeof fetch
    const result = await detectLocalService({
      env: {},
      fetchImpl,
      timeoutMs: 50,
    })
    expect(result).toBeNull()
  })
  test('network errors do not throw', async () => {
    const fetchImpl = (async () => {
      throw new Error('ECONNREFUSED')
    }) as typeof fetch
    const result = await detectLocalService({
      env: {},
      fetchImpl,
      timeoutMs: 200,
    })
    expect(result).toBeNull()
  })
 })
 describe('detectBestProvider — orchestrator', () => {
  test('env match short-circuits the local probe', async () => {
    let probeCalled = false
    const fetchImpl = (async () => {
      probeCalled = true
      return new Response('{}', { status: 200 })
    }) as typeof fetch
    const result = await detectBestProvider({
      env: { ANTHROPIC_API_KEY: 'sk-ant' },
      fetchImpl,
      timeoutMs: 200,
      hasCodexAuth: () => false,
    })
    expect(result?.kind).toBe('anthropic')
    expect(probeCalled).toBe(false)
  })
  test('env miss falls through to local-service probe', async () => {
    const fetchImpl = (async () => new Response('{}', { status: 200 })) as typeof fetch
    const result = await detectBestProvider({
      env: {},
      fetchImpl,
      timeoutMs: 200,
      hasCodexAuth: () => false,
    })
    expect(result?.kind).toBe('ollama')
  })
  test('skipLocal prevents network probes', async () => {
    let probeCalled = false
    const fetchImpl = (async () => {
      probeCalled = true
      return new Response('{}', { status: 200 })
    }) as typeof fetch
    const result = await detectBestProvider({
      env: {},
      fetchImpl,
      skipLocal: true,
      hasCodexAuth: () => false,
    })
    expect(result).toBeNull()
    expect(probeCalled).toBe(false)
  })
  test('completely empty environment returns null', async () => {
    const fetchImpl = (async () => {
      throw new Error('nothing reachable')
    }) as typeof fetch
    const result = await detectBestProvider({
      env: {},
      fetchImpl,
      timeoutMs: 100,
      hasCodexAuth: () => false,
    })
    expect(result).toBeNull()
  })
 })
--- a/src/utils/providerAutoDetect.ts
+++ b/src/utils/providerAutoDetect.ts
@@ -0,0 +1,283 @@
 /**
 * Zero-config provider autodetection.
 *
 * Scans the environment (API keys, OAuth tokens, stored credentials) and local
 * network (Ollama, LM Studio) to pick the best provider for first-run users
 * who have not explicitly configured one. Returns a structured detection
 * result that callers can consume to build a launch-ready profile env, or
 * null when nothing is detected — in which case the existing onboarding /
 * picker flow should take over.
 *
 * Detection priority (first match wins):
 *   1. ANTHROPIC_API_KEY → first-party Claude (most capable default)
 *   2. Codex: CODEX_API_KEY, CHATGPT_ACCOUNT_ID, CODEX_ACCOUNT_ID, or an
 *      existing auth file (CODEX_AUTH_PATH or ~/.codex/auth.json)
 *   3. GitHub Copilot: GITHUB_TOKEN or GH_TOKEN
 *   4. OPENAI_API_KEY / OPENAI_API_KEYS
 *   5. GEMINI_API_KEY or GOOGLE_API_KEY
 *   6. MISTRAL_API_KEY
 *   7. MINIMAX_API_KEY
 *   8. Local Ollama reachable (default localhost:11434)
 *   9. Local LM Studio reachable (default localhost:1234)
 *
 * Local-service probes are parallelized and cheap (short timeout, no
 * request body). Env scans are synchronous and run first so we don't make
 * network calls when a credential is already present.
 *
 * This module intentionally does NOT decide whether to apply the detection;
 * callers should gate on hasExplicitProviderSelection() (providerProfile.ts)
 * and the presence of a persisted profile file.
 */
 import { existsSync } from 'fs'
 import { homedir } from 'os'
 import { join } from 'path'
 export type DetectedProviderKind =
  | 'anthropic'
  | 'codex'
  | 'github'
  | 'openai'
  | 'gemini'
  | 'mistral'
  | 'minimax'
  | 'ollama'
  | 'lm-studio'
 export type DetectedProvider = {
  kind: DetectedProviderKind
  /** One-line human-readable reason, e.g. "ANTHROPIC_API_KEY set". */
  source: string
  /** Present when the detection already resolved a usable base URL. */
  baseUrl?: string
  /** Present when detection also narrowed down a specific model. */
  model?: string
 }
 type EnvLike = NodeJS.ProcessEnv | Record<string, string | undefined>
 function envHasNonEmpty(env: EnvLike, key: string): boolean {
  const value = env[key]
  return typeof value === 'string' && value.trim().length > 0
 }
 function firstSet(env: EnvLike, keys: readonly string[]): string | undefined {
  for (const key of keys) {
    if (envHasNonEmpty(env, key)) return key
  }
  return undefined
 }
 function defaultHasCodexAuthFile(): boolean {
  const paths = [
    process.env.CODEX_AUTH_PATH,
    join(homedir(), '.codex', 'auth.json'),
  ]
  return paths.some(p => p && existsSync(p))
 }
 export type DetectProviderFromEnvOptions = {
  env?: EnvLike
  /**
   * Override Codex auth-file detection. Primarily for tests — the default
   * implementation checks ~/.codex/auth.json and CODEX_AUTH_PATH on disk.
   */
  hasCodexAuth?: () => boolean
 }
 function isOptionsObject(
  value: EnvLike | DetectProviderFromEnvOptions | undefined,
 ): value is DetectProviderFromEnvOptions {
  if (!value || typeof value !== 'object') return false
  if ('hasCodexAuth' in value && typeof value.hasCodexAuth === 'function') {
    return true
  }
  if ('env' in value && typeof (value as { env?: unknown }).env === 'object') {
    return true
  }
  return false
 }
 /**
 * Synchronous env-only scan. Returns the highest-priority env-provided
 * provider, or null if nothing is present. Intentionally does not touch
 * the network — fast path for the common case where a user has exported
 * one of the standard API-key env vars.
 */
 export function detectProviderFromEnv(
  envOrOptions: EnvLike | DetectProviderFromEnvOptions = process.env,
 ): DetectedProvider | null {
  const options: DetectProviderFromEnvOptions = isOptionsObject(envOrOptions)
    ? envOrOptions
    : { env: envOrOptions as EnvLike }
  const env = options.env ?? process.env
  const hasCodexAuth = options.hasCodexAuth ?? defaultHasCodexAuthFile
  if (envHasNonEmpty(env, 'ANTHROPIC_API_KEY')) {
    return { kind: 'anthropic', source: 'ANTHROPIC_API_KEY set' }
  }
  if (
    envHasNonEmpty(env, 'CODEX_API_KEY') ||
    envHasNonEmpty(env, 'CHATGPT_ACCOUNT_ID') ||
    envHasNonEmpty(env, 'CODEX_ACCOUNT_ID') ||
    hasCodexAuth()
  ) {
    const sourceEnv =
      firstSet(env, ['CODEX_API_KEY', 'CHATGPT_ACCOUNT_ID', 'CODEX_ACCOUNT_ID'])
    return {
      kind: 'codex',
      source: sourceEnv ? `${sourceEnv} set` : '~/.codex/auth.json present',
    }
  }
  const githubKey = firstSet(env, ['GITHUB_TOKEN', 'GH_TOKEN'])
  if (githubKey) {
    return {
      kind: 'github',
      source: `${githubKey} set (GitHub Copilot)`,
    }
  }
  const openaiKey = firstSet(env, ['OPENAI_API_KEYS', 'OPENAI_API_KEY'])
  if (openaiKey) {
    return {
      kind: 'openai',
      source: `${openaiKey} set`,
      baseUrl: env.OPENAI_BASE_URL ?? env.OPENAI_API_BASE,
    }
  }
  const geminiKey = firstSet(env, ['GEMINI_API_KEY', 'GOOGLE_API_KEY'])
  if (geminiKey) {
    return { kind: 'gemini', source: `${geminiKey} set` }
  }
  if (envHasNonEmpty(env, 'MISTRAL_API_KEY')) {
    return { kind: 'mistral', source: 'MISTRAL_API_KEY set' }
  }
  if (envHasNonEmpty(env, 'MINIMAX_API_KEY')) {
    return { kind: 'minimax', source: 'MINIMAX_API_KEY set' }
  }
  return null
 }
 type LocalProbe = {
  kind: DetectedProviderKind
  url: string
  timeoutMs: number
  source: string
  baseUrl: string
 }
 const DEFAULT_LOCAL_PROBE_TIMEOUT_MS = 1200
 async function probeReachable(
  url: string,
  timeoutMs: number,
  fetchImpl: typeof fetch,
 ): Promise<boolean> {
  const controller = new AbortController()
  const timer = setTimeout(() => controller.abort(), timeoutMs)
  try {
    const response = await fetchImpl(url, {
      method: 'GET',
      signal: controller.signal,
    })
    return response.ok
  } catch {
    return false
  } finally {
    clearTimeout(timer)
  }
 }
 /**
 * Returns the highest-priority local service reachable from the host.
 * Runs probes in parallel and picks by priority rather than first-response,
 * so slow-but-preferred services still win over fast-but-lower-priority ones.
 */
 export async function detectLocalService(options?: {
  env?: EnvLike
  fetchImpl?: typeof fetch
  timeoutMs?: number
 }): Promise<DetectedProvider | null> {
  const env = options?.env ?? process.env
  const fetchImpl = options?.fetchImpl ?? globalThis.fetch
  const timeoutMs = options?.timeoutMs ?? DEFAULT_LOCAL_PROBE_TIMEOUT_MS
  const ollamaBase = (env.OLLAMA_BASE_URL ?? 'http://localhost:11434').replace(
    /\/+$/,
    '',
  )
  const lmStudioBase = (env.LM_STUDIO_BASE_URL ?? 'http://localhost:1234').replace(
    /\/+$/,
    '',
  )
  const probes: LocalProbe[] = [
    {
      kind: 'ollama',
      url: `${ollamaBase}/api/tags`,
      timeoutMs,
      source: `Ollama reachable at ${ollamaBase}`,
      baseUrl: ollamaBase,
    },
    {
      kind: 'lm-studio',
      url: `${lmStudioBase}/v1/models`,
      timeoutMs,
      source: `LM Studio reachable at ${lmStudioBase}`,
      baseUrl: lmStudioBase,
    },
  ]
  const results = await Promise.all(
    probes.map(async probe => ({
      probe,
      reachable: await probeReachable(probe.url, probe.timeoutMs, fetchImpl),
    })),
  )
  for (const { probe, reachable } of results) {
    if (reachable) {
      return {
        kind: probe.kind,
        source: probe.source,
        baseUrl: probe.baseUrl,
      }
    }
  }
  return null
 }
 /**
 * Orchestrator: env scan first (sync, free), then local-service probes
 * (async, ~1-2s worst case) only if nothing was found in env.
 */
 export async function detectBestProvider(options?: {
  env?: EnvLike
  fetchImpl?: typeof fetch
  timeoutMs?: number
  /** Skip local-service probes — useful for tests or offline smoke checks. */
  skipLocal?: boolean
  /** Override for Codex auth-file detection. See detectProviderFromEnv. */
  hasCodexAuth?: () => boolean
 }): Promise<DetectedProvider | null> {
  const env = options?.env ?? process.env
  const fromEnv = detectProviderFromEnv({
    env,
    hasCodexAuth: options?.hasCodexAuth,
  })
  if (fromEnv) return fromEnv
  if (options?.skipLocal) return null
  return detectLocalService({
    env,
    fetchImpl: options?.fetchImpl,
    timeoutMs: options?.timeoutMs,
  })
 }
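The env-scan priority documented in providerAutoDetect.ts above can also be read as an ordered table. The following is a minimal sketch, not the module's implementation: `ENV_PRIORITY`, `EnvMap`, and `pickFromEnv` are hypothetical names, the Codex auth-file check is left out, and the `source` string is simplified. Only the env-var names and their order are taken from the diff.

```typescript
// Hypothetical table-driven restatement of the env scan in the diff above.
// Only the env-var names and their ordering come from the source.
type EnvKind =
  | 'anthropic'
  | 'codex'
  | 'github'
  | 'openai'
  | 'gemini'
  | 'mistral'
  | 'minimax'
type EnvMap = Record<string, string | undefined>

const ENV_PRIORITY: ReadonlyArray<{ kind: EnvKind; keys: readonly string[] }> = [
  { kind: 'anthropic', keys: ['ANTHROPIC_API_KEY'] },
  { kind: 'codex', keys: ['CODEX_API_KEY', 'CHATGPT_ACCOUNT_ID', 'CODEX_ACCOUNT_ID'] },
  { kind: 'github', keys: ['GITHUB_TOKEN', 'GH_TOKEN'] },
  { kind: 'openai', keys: ['OPENAI_API_KEYS', 'OPENAI_API_KEY'] },
  { kind: 'gemini', keys: ['GEMINI_API_KEY', 'GOOGLE_API_KEY'] },
  { kind: 'mistral', keys: ['MISTRAL_API_KEY'] },
  { kind: 'minimax', keys: ['MINIMAX_API_KEY'] },
]

// Walks the table top to bottom; the first row with a non-blank value wins.
function pickFromEnv(env: EnvMap): { kind: EnvKind; source: string } | null {
  for (const { kind, keys } of ENV_PRIORITY) {
    const hit = keys.find(key => (env[key] ?? '').trim().length > 0)
    if (hit) return { kind, source: `${hit} set` }
  }
  return null
}
```

Under these assumptions, an environment that sets both OPENAI_API_KEY and ANTHROPIC_API_KEY resolves to `anthropic`, and a whitespace-only value counts as unset, matching `envHasNonEmpty` in the diff.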
--- a/src/utils/providerDiscovery.test.ts
+++ b/src/utils/providerDiscovery.test.ts
@@ -1,9 +1,9 @@
 import { afterEach, expect, mock, test } from 'bun:test'
-import {
-  getLocalOpenAICompatibleProviderLabel,
-  listOpenAICompatibleModels,
-} from './providerDiscovery.js'
+async function loadProviderDiscoveryModule() {
+  // @ts-expect-error cache-busting query string for Bun module mocks
+  return import(`./providerDiscovery.js?ts=${Date.now()}-${Math.random()}`)
+}
 const originalFetch = globalThis.fetch
 const originalEnv = {
@@ -16,6 +16,8 @@ afterEach(() => {
 })
 test('lists models from a local openai-compatible /models endpoint', async () => {
  const { listOpenAICompatibleModels } = await loadProviderDiscoveryModule()
  globalThis.fetch = mock((input, init) => {
    const url = typeof input === 'string' ? input : input.url
    expect(url).toBe('http://localhost:1234/v1/models')
@@ -47,6 +49,8 @@ test('lists models from a local openai-compatible /models endpoint', async () =>
 })
 test('returns null when a local openai-compatible /models request fails', async () => {
  const { listOpenAICompatibleModels } = await loadProviderDiscoveryModule()
  globalThis.fetch = mock(() =>
    Promise.resolve(new Response('not available', { status: 503 })),
  ) as typeof globalThis.fetch
@@ -56,13 +60,19 @@ test('returns null when a local openai-compatible /models request fails', async
  ).resolves.toBeNull()
 })
-test('detects LM Studio from the default localhost port', () => {
+test('detects LM Studio from the default localhost port', async () => {
  const { getLocalOpenAICompatibleProviderLabel } =
    await loadProviderDiscoveryModule()
  expect(getLocalOpenAICompatibleProviderLabel('http://localhost:1234/v1')).toBe(
    'LM Studio',
  )
 })
-test('detects common local openai-compatible providers by hostname', () => {
+test('detects common local openai-compatible providers by hostname', async () => {
  const { getLocalOpenAICompatibleProviderLabel } =
    await loadProviderDiscoveryModule()
  expect(
    getLocalOpenAICompatibleProviderLabel('http://localai.local:8080/v1'),
  ).toBe('LocalAI')
@@ -71,8 +81,283 @@ test('detects common local openai-compatible providers by hostname', () => {
  ).toBe('vLLM')
 })
-test('falls back to a generic local openai-compatible label', () => {
+test('detects Moonshot (Kimi) from api.moonshot.ai hostname', async () => {
  const { getLocalOpenAICompatibleProviderLabel } =
    await loadProviderDiscoveryModule()
  expect(
    getLocalOpenAICompatibleProviderLabel('https://api.moonshot.ai/v1'),
  ).toBe('Moonshot (Kimi)')
 })
 test('falls back to a generic local openai-compatible label', async () => {
  const { getLocalOpenAICompatibleProviderLabel } =
    await loadProviderDiscoveryModule()
  expect(
    getLocalOpenAICompatibleProviderLabel('http://127.0.0.1:8080/v1'),
  ).toBe('Local OpenAI-compatible')
 })
 test('ollama generation readiness reports unreachable when tags endpoint is down', async () => {
  const { probeOllamaGenerationReadiness } = await loadProviderDiscoveryModule()
  const calledUrls: string[] = []
  globalThis.fetch = mock(input => {
    const url = typeof input === 'string' ? input : input.url
    calledUrls.push(url)
    return Promise.resolve(new Response('not available', { status: 503 }))
  }) as typeof globalThis.fetch
  await expect(
    probeOllamaGenerationReadiness({
      baseUrl: 'http://localhost:11434',
    }),
  ).resolves.toMatchObject({
    state: 'unreachable',
    models: [],
  })
  expect(calledUrls).toEqual([
    'http://localhost:11434/api/tags',
  ])
 })
 test('ollama generation readiness reports no models when server is reachable', async () => {
  const { probeOllamaGenerationReadiness } = await loadProviderDiscoveryModule()
  const calledUrls: string[] = []
  globalThis.fetch = mock(input => {
    const url = typeof input === 'string' ? input : input.url
    calledUrls.push(url)
    return Promise.resolve(
      new Response(JSON.stringify({ models: [] }), {
        status: 200,
        headers: { 'Content-Type': 'application/json' },
      }),
    )
  }) as typeof globalThis.fetch
  await expect(
    probeOllamaGenerationReadiness({
      baseUrl: 'http://localhost:11434',
    }),
  ).resolves.toMatchObject({
    state: 'no_models',
    models: [],
  })
  expect(calledUrls).toEqual([
    'http://localhost:11434/api/tags',
  ])
 })
 test('ollama generation readiness reports generation_failed when requested model is missing', async () => {
  const { probeOllamaGenerationReadiness } = await loadProviderDiscoveryModule()
  const calledUrls: string[] = []
  globalThis.fetch = mock(input => {
    const url = typeof input === 'string' ? input : input.url
    calledUrls.push(url)
    return Promise.resolve(
      new Response(
        JSON.stringify({
          models: [{ name: 'llama3.1:8b', size: 1024 }],
        }),
        {
          status: 200,
          headers: { 'Content-Type': 'application/json' },
        },
      ),
    )
  }) as typeof globalThis.fetch
  await expect(
    probeOllamaGenerationReadiness({
      baseUrl: 'http://localhost:11434',
      model: 'qwen2.5-coder:7b',
    }),
  ).resolves.toMatchObject({
    state: 'generation_failed',
    probeModel: 'qwen2.5-coder:7b',
    detail: 'requested model not installed: qwen2.5-coder:7b',
  })
  expect(calledUrls).toEqual(['http://localhost:11434/api/tags'])
 })
 test('ollama generation readiness reports generation failures when chat probe fails', async () => {
  const { probeOllamaGenerationReadiness } = await loadProviderDiscoveryModule()
  globalThis.fetch = mock(input => {
    const url = typeof input === 'string' ? input : input.url
    if (url.endsWith('/api/tags')) {
      return Promise.resolve(
        new Response(
          JSON.stringify({
            models: [{ name: 'qwen2.5-coder:7b', size: 42 }],
          }),
          {
            status: 200,
            headers: { 'Content-Type': 'application/json' },
          },
        ),
      )
    }
    return Promise.resolve(new Response('model not found', { status: 404 }))
  }) as typeof globalThis.fetch
  await expect(
    probeOllamaGenerationReadiness({
      baseUrl: 'http://localhost:11434',
      model: 'qwen2.5-coder:7b',
    }),
  ).resolves.toMatchObject({
    state: 'generation_failed',
    probeModel: 'qwen2.5-coder:7b',
  })
 })
 test('ollama generation readiness reports generation_failed when chat probe returns invalid JSON', async () => {
  const { probeOllamaGenerationReadiness } = await loadProviderDiscoveryModule()
  globalThis.fetch = mock(input => {
    const url = typeof input === 'string' ? input : input.url
    if (url.endsWith('/api/tags')) {
      return Promise.resolve(
        new Response(
          JSON.stringify({
            models: [{ name: 'llama3.1:8b', size: 1024 }],
          }),
          {
            status: 200,
            headers: { 'Content-Type': 'application/json' },
          },
        ),
      )
    }
    return Promise.resolve(
      new Response('<html>proxy error</html>', {
        status: 200,
        headers: { 'Content-Type': 'text/html' },
      }),
    )
  }) as typeof globalThis.fetch
  await expect(
    probeOllamaGenerationReadiness({
      baseUrl: 'http://localhost:11434',
    }),
  ).resolves.toMatchObject({
    state: 'generation_failed',
    probeModel: 'llama3.1:8b',
    detail: 'invalid JSON response',
  })
 })
 test('ollama generation readiness reports ready when chat probe succeeds', async () => {
  const { probeOllamaGenerationReadiness } = await loadProviderDiscoveryModule()
  globalThis.fetch = mock(input => {
    const url = typeof input === 'string' ? input : input.url
    if (url.endsWith('/api/tags')) {
      return Promise.resolve(
        new Response(
          JSON.stringify({
            models: [{ name: 'llama3.1:8b', size: 1024 }],
          }),
          {
            status: 200,
            headers: { 'Content-Type': 'application/json' },
          },
        ),
      )
    }
    return Promise.resolve(
      new Response(
        JSON.stringify({
          message: { role: 'assistant', content: 'OK' },
          done: true,
        }),
        {
          status: 200,
          headers: { 'Content-Type': 'application/json' },
        },
      ),
    )
  }) as typeof globalThis.fetch
  await expect(
    probeOllamaGenerationReadiness({
      baseUrl: 'http://localhost:11434',
    }),
  ).resolves.toMatchObject({
    state: 'ready',
    probeModel: 'llama3.1:8b',
  })
 })
 test('atomic chat readiness reports unreachable when /v1/models is down', async () => {
  const { probeAtomicChatReadiness } = await loadProviderDiscoveryModule()
  const calledUrls: string[] = []
  globalThis.fetch = mock(input => {
    const url = typeof input === 'string' ? input : input.url
    calledUrls.push(url)
    return Promise.resolve(new Response('unavailable', { status: 503 }))
  }) as typeof globalThis.fetch
  await expect(
    probeAtomicChatReadiness({ baseUrl: 'http://127.0.0.1:1337' }),
  ).resolves.toEqual({ state: 'unreachable' })
  expect(calledUrls[0]).toBe('http://127.0.0.1:1337/v1/models')
 })
 test('atomic chat readiness reports no_models when server is reachable but empty', async () => {
  const { probeAtomicChatReadiness } = await loadProviderDiscoveryModule()
  globalThis.fetch = mock(() =>
    Promise.resolve(
      new Response(JSON.stringify({ data: [] }), {
        status: 200,
        headers: { 'Content-Type': 'application/json' },
      }),
    ),
  ) as typeof globalThis.fetch
  await expect(
    probeAtomicChatReadiness({ baseUrl: 'http://127.0.0.1:1337' }),
  ).resolves.toEqual({ state: 'no_models' })
 })
 test('atomic chat readiness returns loaded model ids when ready', async () => {
  const { probeAtomicChatReadiness } = await loadProviderDiscoveryModule()
  globalThis.fetch = mock(() =>
    Promise.resolve(
      new Response(
        JSON.stringify({
          data: [
            { id: 'Qwen3_5-4B_Q4_K_M' },
            { id: 'llama-3.1-8b-instruct' },
          ],
        }),
        {
          status: 200,
          headers: { 'Content-Type': 'application/json' },
        },
      ),
    ),
  ) as typeof globalThis.fetch
  await expect(
    probeAtomicChatReadiness({ baseUrl: 'http://127.0.0.1:1337' }),
  ).resolves.toEqual({
    state: 'ready',
    models: ['Qwen3_5-4B_Q4_K_M', 'llama-3.1-8b-instruct'],
  })
 })
--- a/src/utils/providerDiscovery.ts
+++ b/src/utils/providerDiscovery.ts
@@ -4,6 +4,13 @@ import { DEFAULT_OPENAI_BASE_URL } from '../services/api/providerConfig.js'
 export const DEFAULT_OLLAMA_BASE_URL = 'http://localhost:11434'
 export const DEFAULT_ATOMIC_CHAT_BASE_URL = 'http://127.0.0.1:1337'
 export type OllamaGenerationReadiness = {
  state: 'ready' | 'unreachable' | 'no_models' | 'generation_failed'
  models: OllamaModelDescriptor[]
  probeModel?: string
  detail?: string
 }
 function withTimeoutSignal(timeoutMs: number): {
  signal: AbortSignal
  clear: () => void
@@ -20,6 +27,83 @@ function trimTrailingSlash(value: string): string {
  return value.replace(/\/+$/, '')
 }
 function compactDetail(value: string, maxLength = 180): string {
  const compact = value.trim().replace(/\s+/g, ' ')
  if (!compact) {
    return ''
  }
  if (compact.length <= maxLength) {
    return compact
  }
  return `${compact.slice(0, maxLength)}...`
 }
 type OllamaTagsPayload = {
  models?: Array<{
    name?: string
    size?: number
    details?: {
      family?: string
      families?: string[]
      parameter_size?: string
      quantization_level?: string
    }
  }>
 }
 function normalizeOllamaModels(
  payload: OllamaTagsPayload,
 ): OllamaModelDescriptor[] {
  return (payload.models ?? [])
    .filter(model => Boolean(model.name))
    .map(model => ({
      name: model.name!,
      sizeBytes: typeof model.size === 'number' ? model.size : null,
      family: model.details?.family ?? null,
      families: model.details?.families ?? [],
      parameterSize: model.details?.parameter_size ?? null,
      quantizationLevel: model.details?.quantization_level ?? null,
    }))
 }
 async function fetchOllamaModelsProbe(
  baseUrl?: string,
  timeoutMs = 5000,
 ): Promise<{
  reachable: boolean
  models: OllamaModelDescriptor[]
 }> {
  const { signal, clear } = withTimeoutSignal(timeoutMs)
  try {
    const response = await fetch(`${getOllamaApiBaseUrl(baseUrl)}/api/tags`, {
      method: 'GET',
      signal,
    })
    if (!response.ok) {
      return {
        reachable: false,
        models: [],
      }
    }
    const payload = (await response.json().catch(() => ({}))) as OllamaTagsPayload
    return {
      reachable: true,
      models: normalizeOllamaModels(payload),
    }
  } catch {
    return {
      reachable: false,
      models: [],
    }
  } finally {
    clear()
  }
 }
 export function getOllamaApiBaseUrl(baseUrl?: string): string {
  const parsed = new URL(
    baseUrl || process.env.OLLAMA_BASE_URL || DEFAULT_OLLAMA_BASE_URL,
@@ -113,6 +197,10 @@ export function getLocalOpenAICompatibleProviderLabel(baseUrl?: string): string
    if (host.includes('minimax') || haystack.includes('minimax')) {
      return 'MiniMax'
    }
    // Moonshot AI (Kimi) direct API
    if (host.includes('moonshot') || haystack.includes('moonshot') || haystack.includes('kimi')) {
      return 'Moonshot (Kimi)'
    }
  } catch {
    // Fall back to the generic label when the base URL is malformed.
  }
@@ -121,61 +209,15 @@ export function getLocalOpenAICompatibleProviderLabel(baseUrl?: string): string
 }
 export async function hasLocalOllama(baseUrl?: string): Promise<boolean> {
-  const { signal, clear } = withTimeoutSignal(1200)
-  try {
-    const response = await fetch(`${getOllamaApiBaseUrl(baseUrl)}/api/tags`, {
-      method: 'GET',
-      signal,
-    })
-    return response.ok
-  } catch {
-    return false
-  } finally {
-    clear()
-  }
+  const { reachable } = await fetchOllamaModelsProbe(baseUrl, 1200)
+  return reachable
 }
 export async function listOllamaModels(
  baseUrl?: string,
 ): Promise<OllamaModelDescriptor[]> {
-  const { signal, clear } = withTimeoutSignal(5000)
-  try {
-    const response = await fetch(`${getOllamaApiBaseUrl(baseUrl)}/api/tags`, {
-      method: 'GET',
-      signal,
-    })
-    if (!response.ok) {
-      return []
-    }
-    const data = (await response.json()) as {
-      models?: Array<{
-        name?: string
-        size?: number
-        details?: {
-          family?: string
-          families?: string[]
-          parameter_size?: string
-          quantization_level?: string
-        }
-      }>
-    }
-    return (data.models ?? [])
-      .filter(model => Boolean(model.name))
-      .map(model => ({
-        name: model.name!,
-        sizeBytes: typeof model.size === 'number' ? model.size : null,
-        family: model.details?.family ?? null,
-        families: model.details?.families ?? [],
-        parameterSize: model.details?.parameter_size ?? null,
-        quantizationLevel: model.details?.quantization_level ?? null,
-      }))
-  } catch {
-    return []
-  } finally {
-    clear()
-  }
+  const { models } = await fetchOllamaModelsProbe(baseUrl, 5000)
+  return models
 }
 export async function listOpenAICompatibleModels(options?: {
@@ -260,6 +302,24 @@ export async function listAtomicChatModels(
  }
 }
 export type AtomicChatReadiness =
  | { state: 'unreachable' }
  | { state: 'no_models' }
  | { state: 'ready'; models: string[] }
 export async function probeAtomicChatReadiness(options?: {
  baseUrl?: string
 }): Promise<AtomicChatReadiness> {
  if (!(await hasLocalAtomicChat(options?.baseUrl))) {
    return { state: 'unreachable' }
  }
  const models = await listAtomicChatModels(options?.baseUrl)
  if (models.length === 0) {
    return { state: 'no_models' }
  }
  return { state: 'ready', models }
 }
 export async function benchmarkOllamaModel(
  modelName: string,
  baseUrl?: string,
@@ -294,3 +354,106 @@ export async function benchmarkOllamaModel(
    clear()
  }
 }
 export async function probeOllamaGenerationReadiness(options?: {
  baseUrl?: string
  model?: string
  timeoutMs?: number
 }): Promise<OllamaGenerationReadiness> {
  const timeoutMs = options?.timeoutMs ?? 8000
  const { reachable, models } = await fetchOllamaModelsProbe(
    options?.baseUrl,
    timeoutMs,
  )
  if (!reachable) {
    return {
      state: 'unreachable',
      models: [],
    }
  }
  if (models.length === 0) {
    return {
      state: 'no_models',
      models: [],
    }
  }
  const requestedModel = options?.model?.trim() || undefined
  if (requestedModel && !models.some(model => model.name === requestedModel)) {
    return {
      state: 'generation_failed',
      models,
      probeModel: requestedModel,
      detail: `requested model not installed: ${requestedModel}`,
    }
  }
  const probeModel = requestedModel ?? models[0]!.name
  const { signal, clear } = withTimeoutSignal(timeoutMs)
  try {
    const response = await fetch(`${getOllamaApiBaseUrl(options?.baseUrl)}/api/chat`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
      },
      signal,
      body: JSON.stringify({
        model: probeModel,
        stream: false,
        messages: [{ role: 'user', content: 'Reply with OK.' }],
        options: {
          temperature: 0,
          num_predict: 8,
        },
      }),
    })
    if (!response.ok) {
      const responseBody = await response.text().catch(() => '')
      const detailSuffix = compactDetail(responseBody)
      return {
        state: 'generation_failed',
        models,
        probeModel,
        detail: detailSuffix
          ? `status ${response.status}: ${detailSuffix}`
          : `status ${response.status}`,
      }
    }
    try {
      await response.json()
    } catch {
      return {
        state: 'generation_failed',
        models,
        probeModel,
        detail: 'invalid JSON response',
      }
    }
    return {
      state: 'ready',
      models,
      probeModel,
    }
  } catch (error) {
    const detail =
      error instanceof Error
        ? error.name === 'AbortError'
          ? 'request timed out'
          : error.message
        : String(error)
    return {
      state: 'generation_failed',
      models,
      probeModel,
      detail,
    }
  } finally {
    clear()
  }
 }
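A caller of `probeOllamaGenerationReadiness` has four states to handle. The sketch below shows one way to turn a result into a user-facing message. The `state` values mirror `OllamaGenerationReadiness` from the diff above; `ReadinessResult`, `describeOllamaReadiness`, and all message wording are hypothetical and do not appear in the repository.

```typescript
// Mirrors the shape of OllamaGenerationReadiness in the diff above; the model
// descriptor is reduced to its name because nothing else is needed here.
type ReadinessResult = {
  state: 'ready' | 'unreachable' | 'no_models' | 'generation_failed'
  models: Array<{ name: string }>
  probeModel?: string
  detail?: string
}

// Hypothetical helper: maps each readiness state to a one-line message.
function describeOllamaReadiness(result: ReadinessResult): string {
  switch (result.state) {
    case 'unreachable':
      return 'Ollama is not reachable; check that the server is running.'
    case 'no_models':
      return 'Ollama is running but has no models installed.'
    case 'generation_failed': {
      const model = result.probeModel ?? 'unknown model'
      const suffix = result.detail ? ` (${result.detail})` : ''
      return `Ollama could not generate with ${model}${suffix}.`
    }
    case 'ready':
      return `Ollama is ready (probed with ${result.probeModel ?? 'default model'}).`
  }
}
```

Keeping `generation_failed` separate from `unreachable` lets a caller distinguish "server down" from "server up, but the requested model is missing or failing", which is the distinction the tests in providerDiscovery.test.ts exercise.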
--- a/src/utils/providerProfile.test.ts
+++ b/src/utils/providerProfile.test.ts
@@ -572,31 +572,64 @@ test('buildStartupEnvFromProfile leaves explicit provider selections untouched',
  assert.equal(env.OPENAI_API_KEY, undefined)
 })
-test('buildStartupEnvFromProfile lets saved startup profile override profile-managed env', async () => {
+test('buildStartupEnvFromProfile preserves plural-profile env when the legacy file is stale', async () => {
  // Regression: a user saves a provider via /provider (plural system).
  // addProviderProfile does NOT sync the legacy .openclaude-profile.json,
  // so the legacy file retains whatever it had from an earlier setup (e.g.
  // OpenAI defaults). At startup, applyActiveProviderProfileFromConfig()
  // correctly applies the active plural profile (Moonshot) first, marking
  // env with CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED=1. The legacy-file
  // load must NOT overwrite that env — it previously did, surfacing as
  // "banner shows the wrong provider / model".
  const processEnv = {
    CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED: '1',
-    CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID: 'saved_ollama',
+    CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID: 'saved_moonshot',
    CLAUDE_CODE_USE_OPENAI: '1',
-    OPENAI_BASE_URL: 'http://localhost:11434/v1',
+    OPENAI_BASE_URL: 'https://api.moonshot.ai/v1',
-    OPENAI_MODEL: 'llama3.1:8b',
+    OPENAI_MODEL: 'kimi-k2.6',
  }
  const env = await buildStartupEnvFromProfile({
    // Stale legacy file — points at SambaNova, but user's active plural
    // profile is Moonshot and was just applied.
    persisted: profile('openai', {
-      OPENAI_API_KEY: 'sk-persisted',
+      OPENAI_API_KEY: 'sk-stale',
      OPENAI_MODEL: 'Meta-Llama-3.1-70B-Instruct',
      OPENAI_BASE_URL: 'https://api.sambanova.ai/v1',
    }),
    processEnv,
  })
  assert.equal(env, processEnv)
  assert.equal(env.OPENAI_BASE_URL, 'https://api.moonshot.ai/v1')
  assert.equal(env.OPENAI_MODEL, 'kimi-k2.6')
  // Plural markers are retained — downstream code uses them to verify the
  // env still belongs to the profile it was applied from.
  assert.equal(env.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED, '1')
  assert.equal(env.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID, 'saved_moonshot')
 })
 test('buildStartupEnvFromProfile falls back to legacy file when plural system has not applied', async () => {
  // Counter-example: first-run user with only the legacy file (no plural
  // active profile yet). The legacy file is the correct source, so the
  // load must proceed as before.
  const processEnv = {
    CLAUDE_CODE_USE_OPENAI: '1',
  }
  const env = await buildStartupEnvFromProfile({
    persisted: profile('openai', {
      OPENAI_API_KEY: 'sk-legacy',
      OPENAI_MODEL: 'gpt-4o',
      OPENAI_BASE_URL: 'https://api.openai.com/v1',
    }),
    processEnv,
  })
  assert.notEqual(env, processEnv)
-  assert.equal(env.CLAUDE_CODE_USE_OPENAI, '1')
-  assert.equal(env.OPENAI_API_KEY, 'sk-persisted')
-  assert.equal(env.OPENAI_MODEL, 'Meta-Llama-3.1-70B-Instruct')
-  assert.equal(env.OPENAI_BASE_URL, 'https://api.sambanova.ai/v1')
+  assert.equal(env.OPENAI_API_KEY, 'sk-legacy')
+  assert.equal(env.OPENAI_BASE_URL, 'https://api.openai.com/v1')
+  assert.equal(env.OPENAI_MODEL, 'gpt-4o')
  assert.equal(env.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED, undefined)
  assert.equal(env.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID, undefined)
 })
 test('buildStartupEnvFromProfile treats explicit falsey provider flags as user intent', async () => {
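The precedence that the two regression tests above pin down can be reduced to a small decision function. This is a sketch under stated assumptions, not code from the repository: `chooseStartupEnvSource`, `StartupEnv`, and `StartupEnvSource` are hypothetical names, and the explicit-provider-selection checks that `buildStartupEnvFromProfile` also performs are deliberately ignored. Only the marker variable and the early-return order come from the diff.

```typescript
type StartupEnv = Record<string, string | undefined>
type StartupEnvSource = 'process-env' | 'legacy-profile'

// Hypothetical reduction of the early returns in buildStartupEnvFromProfile.
function chooseStartupEnvSource(
  processEnv: StartupEnv,
  hasLegacyProfile: boolean,
): StartupEnvSource {
  // The plural /provider system marks env it has already applied.
  const pluralApplied =
    processEnv['CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED'] === '1'
  // Trust env applied by the plural system; a stale legacy file must not win.
  if (pluralApplied) return 'process-env'
  // Nothing persisted: there is no legacy profile to overlay.
  if (!hasLegacyProfile) return 'process-env'
  // First-run / fallback path: the legacy file is the only source.
  return 'legacy-profile'
}
```

With the marker set, the function returns `process-env` even when a legacy file exists, which corresponds to the stale-file regression test; without the marker and with a legacy file present, it returns `legacy-profile`, which corresponds to the fallback test.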
--- a/src/utils/providerProfile.ts
+++ b/src/utils/providerProfile.ts
@@ -841,43 +841,35 @@ export async function buildStartupEnvFromProfile(options?: {
  const processEnv = options?.processEnv ?? process.env
  const persisted = options?.persisted ?? loadProfileFile()
  // Saved /provider profiles should still win over provider-manager env that was
  // auto-applied during startup. Only an explicit shell/flag provider selection
  // should bypass the persisted startup profile.
  //
  const profileManagedEnv = processEnv.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED === '1'
-  // If the user explicitly selected a provider via env, allow it to bypass
-  // the persisted profile only when we can prove it was managed by the
-  // persisted profile env itself.
+  // The legacy single-profile file (~/.openclaude-profile.json) is a
+  // first-run / fallback mechanism. The newer plural provider-profile
+  // system (`/provider` presets + activeProviderProfileId in config) is
+  // applied earlier in the bootstrap via applyActiveProviderProfileFromConfig
+  // and signals completion with CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED=1.
  //
-  // Practically: on initial startup, provider routing env vars can already
-  // be present due to earlier auto-application steps. We should still apply
-  // the persisted profile rather than returning early.
+  // If the plural system has already set env, trust it — do NOT overlay the
+  // legacy file. addProviderProfile() does not sync the legacy file, so a
+  // stale legacy file (e.g. OpenAI defaults from an earlier manual setup)
+  // would otherwise overwrite the correct plural env and surface as the
+  // "banner shows gpt-4o / api.openai.com even though my saved profile is
+  // Moonshot" bug.
  if (profileManagedEnv) {
    return processEnv
  }
  if (!persisted) {
    return processEnv
  }
  const launchProcessEnv = profileManagedEnv
    ? (() => {
        const cleanedEnv = { ...processEnv }
        for (const key of PROFILE_ENV_KEYS) {
          delete cleanedEnv[key]
        }
        delete cleanedEnv.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED
        delete cleanedEnv.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID
        return cleanedEnv
      })()
    : processEnv
  return buildLaunchEnv({
    profile: persisted.profile,
    persisted,
    goal:
      options?.goal ??
      normalizeRecommendationGoal(processEnv.OPENCLAUDE_PROFILE_GOAL),
-    processEnv: launchProcessEnv,
+    processEnv,
    getOllamaChatBaseUrl:
      options?.getOllamaChatBaseUrl ?? getOllamaChatBaseUrl,
    resolveOllamaDefaultModel: options?.resolveOllamaDefaultModel,
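
The precedence rule described in the comment above can be sketched in isolation. This is a minimal illustration, not the project's implementation: `resolveStartupEnv`, `LegacyProfile`, and the plain object spread are hypothetical stand-ins for `buildStartupEnvFromProfile`, the persisted profile file, and `buildLaunchEnv`, which do considerably more in the real code.

```typescript
// Hypothetical, simplified types; the real profile file carries more fields.
type Env = Record<string, string | undefined>
type LegacyProfile = { env: Env }

const APPLIED_FLAG = 'CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED'

// Sketch of the precedence rule only: env already managed by the plural
// system is returned untouched; otherwise the legacy profile is overlaid
// on a copy of the process env.
function resolveStartupEnv(processEnv: Env, legacy: LegacyProfile | null): Env {
  if (processEnv[APPLIED_FLAG] === '1') return processEnv // plural system already applied
  if (!legacy) return processEnv // nothing to overlay
  return { ...processEnv, ...legacy.env } // first-run / fallback path
}

const managed: Env = { [APPLIED_FLAG]: '1', OPENAI_MODEL: 'kimi-k2.6' }
const stale: LegacyProfile = { env: { OPENAI_MODEL: 'gpt-4o' } }

console.log(resolveStartupEnv(managed, stale).OPENAI_MODEL) // kimi-k2.6
console.log(resolveStartupEnv({}, stale).OPENAI_MODEL) // gpt-4o
```

The identity checks in the accompanying tests (`assert.equal(env, processEnv)` versus `assert.notEqual(env, processEnv)`) correspond to the first and third return paths in this sketch.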
--- a/src/utils/providerProfiles.test.ts
+++ b/src/utils/providerProfiles.test.ts
@@ -256,6 +256,83 @@ describe('applyActiveProviderProfileFromConfig', () => {
    expect(process.env.OPENAI_MODEL).toBe('qwen2.5:3b')
  })
  test('applies active profile when a bare CLAUDE_CODE_USE_OPENAI flag is stale (no BASE_URL/MODEL)', async () => {
    // Regression: a leftover `CLAUDE_CODE_USE_OPENAI=1` in the shell with no
    // paired OPENAI_BASE_URL / OPENAI_MODEL is not a real explicit selection
    // — it's a stale export. The previous guard treated it as intent and
    // skipped the saved profile, causing the startup banner to show hardcoded
    // defaults (gpt-4o @ api.openai.com) instead of the user's active
    // profile.
    const { applyActiveProviderProfileFromConfig } =
      await importFreshProviderProfileModules()
    process.env.CLAUDE_CODE_USE_OPENAI = '1'
    delete process.env.OPENAI_BASE_URL
    delete process.env.OPENAI_API_BASE
    delete process.env.OPENAI_MODEL
    const applied = applyActiveProviderProfileFromConfig({
      providerProfiles: [
        buildProfile({
          id: 'saved_moonshot',
          baseUrl: 'https://api.moonshot.ai/v1',
          model: 'kimi-k2.6',
        }),
      ],
      activeProviderProfileId: 'saved_moonshot',
    } as any)
    expect(applied?.id).toBe('saved_moonshot')
    expect(process.env.OPENAI_BASE_URL).toBe('https://api.moonshot.ai/v1')
    expect(process.env.OPENAI_MODEL).toBe('kimi-k2.6')
  })
  test('still respects complete shell selection with USE flag + BASE_URL', async () => {
    // Counter-example: when the user really did set both the flag AND a
    // concrete BASE_URL, that IS explicit intent and wins over the saved
    // profile. This preserves the original "explicit startup wins" semantic.
    const { applyActiveProviderProfileFromConfig } =
      await importFreshProviderProfileModules()
    process.env.CLAUDE_CODE_USE_OPENAI = '1'
    process.env.OPENAI_BASE_URL = 'http://192.168.1.1:8080/v1'
    delete process.env.OPENAI_MODEL
    const applied = applyActiveProviderProfileFromConfig({
      providerProfiles: [
        buildProfile({
          id: 'saved_moonshot',
          baseUrl: 'https://api.moonshot.ai/v1',
          model: 'kimi-k2.6',
        }),
      ],
      activeProviderProfileId: 'saved_moonshot',
    } as any)
    expect(applied).toBeUndefined()
    expect(process.env.OPENAI_BASE_URL).toBe('http://192.168.1.1:8080/v1')
  })
  test('still respects complete shell selection with USE flag + MODEL', async () => {
    const { applyActiveProviderProfileFromConfig } =
      await importFreshProviderProfileModules()
    process.env.CLAUDE_CODE_USE_OPENAI = '1'
    process.env.OPENAI_MODEL = 'gpt-4o-mini'
    delete process.env.OPENAI_BASE_URL
    const applied = applyActiveProviderProfileFromConfig({
      providerProfiles: [
        buildProfile({
          id: 'saved_moonshot',
          baseUrl: 'https://api.moonshot.ai/v1',
          model: 'kimi-k2.6',
        }),
      ],
      activeProviderProfileId: 'saved_moonshot',
    } as any)
    expect(applied).toBeUndefined()
    expect(process.env.OPENAI_MODEL).toBe('gpt-4o-mini')
  })
  test('does not override explicit startup selection when profile marker is stale', async () => {
    const { applyActiveProviderProfileFromConfig } =
      await importFreshProviderProfileModules()
@@ -450,6 +527,18 @@ describe('getProviderPresetDefaults', () => {
    expect(defaults.baseUrl).toBe('http://localhost:11434/v1')
    expect(defaults.model).toBe('llama3.1:8b')
  })
  test('atomic-chat preset defaults to a local Atomic Chat endpoint', async () => {
    const { getProviderPresetDefaults } = await importFreshProviderProfileModules()
    delete process.env.OPENAI_MODEL
    const defaults = getProviderPresetDefaults('atomic-chat')
    expect(defaults.provider).toBe('openai')
    expect(defaults.name).toBe('Atomic Chat')
    expect(defaults.baseUrl).toBe('http://127.0.0.1:1337/v1')
    expect(defaults.requiresApiKey).toBe(false)
  })
 })
 describe('setActiveProviderProfile', () => {
--- a/src/utils/providerProfiles.ts
+++ b/src/utils/providerProfiles.ts
@@ -33,6 +33,7 @@ export type ProviderPreset =
  | 'custom'
  | 'nvidia-nim'
  | 'minimax'
  | 'atomic-chat'
 export type ProviderProfileInput = {
  provider?: ProviderProfile['provider']
@@ -285,6 +286,15 @@ export function getProviderPresetDefaults(
        apiKey: process.env.MINIMAX_API_KEY ?? '',
        requiresApiKey: true,
      }
    case 'atomic-chat':
      return {
        provider: 'openai',
        name: 'Atomic Chat',
        baseUrl: 'http://127.0.0.1:1337/v1',
        model: process.env.OPENAI_MODEL ?? 'local-model',
        apiKey: '',
        requiresApiKey: false,
      }
    case 'ollama':
    default:
      return {
@@ -322,6 +332,58 @@ function hasProviderSelectionFlags(
  )
 }
 /**
 * A "complete" explicit provider selection = a USE flag AND at least one
 * concrete config value that tells us WHERE to route (a base URL) or WHAT
 * to run (a model id). A bare `CLAUDE_CODE_USE_OPENAI=1` with nothing else
 * is almost always a stale shell export from a previous session, not real
 * intent — and if we respect it, we skip the user's saved active profile
 * and fall back to hardcoded defaults (gpt-4o / api.openai.com), which is
 * the exact bug users report as "my saved provider isn't picked up".
 *
 * Used to gate whether saved-profile env should override shell state at
 * startup. The weaker `hasProviderSelectionFlags` is still used for the
 * anthropic-profile conflict check (any flag is a conflict for
 * first-party anthropic) and for alignment fingerprinting.
 */
 function hasCompleteProviderSelection(
  processEnv: NodeJS.ProcessEnv = process.env,
 ): boolean {
  if (!hasProviderSelectionFlags(processEnv)) return false
  if (processEnv.CLAUDE_CODE_USE_OPENAI !== undefined) {
    return (
      trimOrUndefined(processEnv.OPENAI_BASE_URL) !== undefined ||
      trimOrUndefined(processEnv.OPENAI_API_BASE) !== undefined ||
      trimOrUndefined(processEnv.OPENAI_MODEL) !== undefined
    )
  }
  if (processEnv.CLAUDE_CODE_USE_GEMINI !== undefined) {
    return (
      trimOrUndefined(processEnv.GEMINI_BASE_URL) !== undefined ||
      trimOrUndefined(processEnv.GEMINI_MODEL) !== undefined ||
      trimOrUndefined(processEnv.GEMINI_API_KEY) !== undefined ||
      trimOrUndefined(processEnv.GOOGLE_API_KEY) !== undefined
    )
  }
  if (processEnv.CLAUDE_CODE_USE_MISTRAL !== undefined) {
    return (
      trimOrUndefined(processEnv.MISTRAL_BASE_URL) !== undefined ||
      trimOrUndefined(processEnv.MISTRAL_MODEL) !== undefined ||
      trimOrUndefined(processEnv.MISTRAL_API_KEY) !== undefined
    )
  }
  if (processEnv.CLAUDE_CODE_USE_GITHUB !== undefined) {
    return (
      trimOrUndefined(processEnv.GITHUB_TOKEN) !== undefined ||
      trimOrUndefined(processEnv.GH_TOKEN) !== undefined ||
      trimOrUndefined(processEnv.OPENAI_MODEL) !== undefined
    )
  }
  // Bedrock / Vertex / Foundry signal cloud-provider routing in env; treat
  // the flag alone as complete (these paths rely on ambient AWS/GCP creds).
  return true
 }
 function hasConflictingProviderFlagsForProfile(
  processEnv: NodeJS.ProcessEnv,
  profile: ProviderProfile,
@@ -564,9 +626,15 @@ export function applyActiveProviderProfileFromConfig(
    processEnv[PROFILE_ENV_APPLIED_FLAG] === '1' &&
    trimOrUndefined(processEnv[PROFILE_ENV_APPLIED_ID]) === activeProfile.id
-  if (!options?.force && (hasProviderSelectionFlags(processEnv) || processEnv[PROFILE_ENV_APPLIED_FLAG] === '1')) {
+  if (!options?.force && (hasCompleteProviderSelection(processEnv) || processEnv[PROFILE_ENV_APPLIED_FLAG] === '1')) {
    // Respect explicit startup provider intent. Auto-heal only when this
    // exact active profile previously applied the current env.
    // NOTE: we gate on hasCompleteProviderSelection (flag + concrete config)
    // rather than hasProviderSelectionFlags alone. A bare CLAUDE_CODE_USE_*=1
    // with no BASE_URL/MODEL is almost always a stale shell export, not
    // intent — respecting it would skip the saved profile and fall through
    // to hardcoded provider defaults, which surfaces as "my saved provider
    // isn't being picked up at startup".
    if (!isCurrentEnvProfileManaged) {
      return undefined
    }
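
To make the "flag plus concrete config" gate easier to reason about, here is a reduced sketch covering only the OpenAI branch. `isCompleteOpenAISelection` is a hypothetical name, and the local `trimOrUndefined` is an assumed approximation of the helper used in the diff (blank or whitespace-only values count as unset); the real `hasCompleteProviderSelection` also handles Gemini, Mistral, GitHub, and the cloud-provider flags.

```typescript
type Env = Record<string, string | undefined>

// Local stand-in for the diff's trimOrUndefined helper (assumed behavior:
// blank or whitespace-only values are treated as unset).
const trimOrUndefined = (value: string | undefined): string | undefined => {
  const trimmed = value?.trim()
  return trimmed ? trimmed : undefined
}

// OpenAI branch only: the USE flag must be paired with a base URL or a model.
function isCompleteOpenAISelection(env: Env): boolean {
  if (env.CLAUDE_CODE_USE_OPENAI === undefined) return false
  return ['OPENAI_BASE_URL', 'OPENAI_API_BASE', 'OPENAI_MODEL'].some(
    key => trimOrUndefined(env[key]) !== undefined,
  )
}

// A bare flag is treated as a stale export, so the saved profile still applies.
console.log(isCompleteOpenAISelection({ CLAUDE_CODE_USE_OPENAI: '1' })) // false

// Flag plus a concrete model is explicit intent and wins over the saved profile.
console.log(
  isCompleteOpenAISelection({ CLAUDE_CODE_USE_OPENAI: '1', OPENAI_MODEL: 'gpt-4o-mini' }),
) // true

// Under the assumed trimming behavior, a whitespace-only value does not count.
console.log(
  isCompleteOpenAISelection({ CLAUDE_CODE_USE_OPENAI: '1', OPENAI_BASE_URL: '   ' }),
) // false
```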
--- a/src/utils/requestLogging.test.ts
+++ b/src/utils/requestLogging.test.ts
@@ -0,0 +1,86 @@
 import { describe, expect, it, beforeEach } from 'bun:test'
 import {
  createCorrelationId,
  logApiCallStart,
  logApiCallEnd,
 } from './requestLogging.js'
 describe('requestLogging', () => {
  describe('createCorrelationId', () => {
    it('returns a non-empty string', () => {
      const id = createCorrelationId()
      expect(id).toBeTruthy()
      expect(typeof id).toBe('string')
    })
    it('returns unique IDs', () => {
      const id1 = createCorrelationId()
      const id2 = createCorrelationId()
      expect(id1).not.toBe(id2)
    })
  })
  describe('logApiCallStart', () => {
    it('returns correlation ID and start time', () => {
      const result = logApiCallStart('openai', 'gpt-4o')
      expect(result.correlationId).toBeTruthy()
      expect(result.startTime).toBeGreaterThan(0)
    })
    it('logs without throwing', () => {
      expect(() => logApiCallStart('ollama', 'llama3')).not.toThrow()
    })
  })
  describe('logApiCallEnd', () => {
    it('logs success without throwing', () => {
      const { correlationId, startTime } = logApiCallStart('openai', 'gpt-4o')
      expect(() =>
        logApiCallEnd(
          correlationId,
          startTime,
          'gpt-4o',
          'success',
          100,
          50,
          false,
        ),
      ).not.toThrow()
    })
    it('logs error without throwing', () => {
      const { correlationId, startTime } = logApiCallStart('openai', 'gpt-4o')
      expect(() =>
        logApiCallEnd(
          correlationId,
          startTime,
          'gpt-4o',
          'error',
          0,
          0,
          false,
          undefined,
          undefined,
          'Network error',
        ),
      ).not.toThrow()
    })
    it('logs with all parameters without throwing', () => {
      const { correlationId, startTime } = logApiCallStart('openai', 'gpt-4o')
      expect(() =>
        logApiCallEnd(
          correlationId,
          startTime,
          'gpt-4o',
          'success',
          100,
          50,
          true,
          120,
          8,
          'error message',
        ),
      ).not.toThrow()
    })
  })
 })
--- a/src/utils/requestLogging.ts
+++ b/src/utils/requestLogging.ts
@@ -0,0 +1,89 @@
 /**
 * Structured Request Logging
 * 
 * Uses existing logForDebugging for structured logging.
 */
 import { randomUUID } from 'crypto'
 import { logForDebugging } from './debug.js'
 export interface RequestLog {
  correlationId: string
  timestamp: number
  provider: string
  model: string
  duration: number
  status: 'success' | 'error'
  tokensIn: number
  tokensOut: number
  error?: string
  streaming: boolean
 }
 export function createCorrelationId(): string {
  return randomUUID()
 }
 export function logApiCallStart(
  provider: string,
  model: string,
 ): { correlationId: string; startTime: number } {
  const correlationId = createCorrelationId()
  const startTime = Date.now()
  logForDebugging(
    JSON.stringify({
      type: 'api_call_start',
      correlationId,
      provider,
      model,
      timestamp: startTime,
    }),
    { level: 'debug' },
  )
  return { correlationId, startTime }
 }
 export function logApiCallEnd(
  correlationId: string,
  startTime: number,
  model: string,
  status: 'success' | 'error',
  tokensIn: number,
  tokensOut: number,
  streaming: boolean,
  firstTokenMs?: number,
  totalChunks?: number,
  error?: string,
 ): void {
  const duration = Date.now() - startTime
  const logData: Record<string, unknown> = {
    type: status === 'error' ? 'api_call_error' : 'api_call_end',
    correlationId,
    model,
    duration_ms: duration,
    status,
    tokens_in: tokensIn,
    tokens_out: tokensOut,
    streaming,
  }
  if (firstTokenMs !== undefined) {
    logData.first_token_ms = firstTokenMs
  }
  if (totalChunks !== undefined) {
    logData.total_chunks = totalChunks
  }
  if (error) {
    logData.error = error
  }
  logForDebugging(
    JSON.stringify(logData),
    { level: status === 'error' ? 'error' : 'debug' },
  )
 }
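
One way the start/end pair might be used at a call site is sketched below. `withRequestLogging` is a hypothetical wrapper that does not appear in the diff, and the two logging functions are local stand-ins that mirror the signatures above but collect entries in an array instead of calling `logForDebugging`, so the example runs on its own. Token counts are passed as 0 purely for illustration.

```typescript
import { randomUUID } from 'node:crypto'

// Stand-ins mirroring the signatures in the diff; entries are collected in
// memory rather than sent through logForDebugging.
const entries: Record<string, unknown>[] = []

function logApiCallStart(provider: string, model: string) {
  const correlationId = randomUUID()
  const startTime = Date.now()
  entries.push({ type: 'api_call_start', correlationId, provider, model })
  return { correlationId, startTime }
}

function logApiCallEnd(
  correlationId: string,
  startTime: number,
  model: string,
  status: 'success' | 'error',
  tokensIn: number,
  tokensOut: number,
  streaming: boolean,
  firstTokenMs?: number,
  totalChunks?: number,
  error?: string,
): void {
  entries.push({
    type: status === 'error' ? 'api_call_error' : 'api_call_end',
    correlationId,
    model,
    duration_ms: Date.now() - startTime,
    tokens_in: tokensIn,
    tokens_out: tokensOut,
    streaming,
    ...(firstTokenMs !== undefined ? { first_token_ms: firstTokenMs } : {}),
    ...(totalChunks !== undefined ? { total_chunks: totalChunks } : {}),
    ...(error ? { error } : {}),
  })
}

// Hypothetical wrapper: every start entry is paired with exactly one end
// entry that shares its correlation ID, on both the success and error paths.
async function withRequestLogging<T>(
  provider: string,
  model: string,
  call: () => Promise<T>,
): Promise<T> {
  const { correlationId, startTime } = logApiCallStart(provider, model)
  try {
    const result = await call()
    logApiCallEnd(correlationId, startTime, model, 'success', 0, 0, false)
    return result
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err)
    logApiCallEnd(correlationId, startTime, model, 'error', 0, 0, false, undefined, undefined, message)
    throw err
  }
}
```

Because `error` is the tenth positional parameter, the error path has to pass two explicit `undefined` values for `firstTokenMs` and `totalChunks`; callers need to keep that ordering in mind.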
--- a/src/utils/sandbox/sandbox-adapter.ts
+++ b/src/utils/sandbox/sandbox-adapter.ts
@@ -456,10 +456,19 @@ const checkDependencies = memoize((): SandboxDependencyCheck => {
  })
 })
 /**
 * Read sandbox.enabled only from trusted settings sources.
 * projectSettings is intentionally excluded — a malicious repo could
 * otherwise disable the sandbox via .claude/settings.json.
 */
 function getSandboxEnabledSetting(): boolean {
  try {
-    const settings = getSettings_DEPRECATED()
-    return settings?.sandbox?.enabled ?? false
+    return !!(
+      getSettingsForSource('userSettings')?.sandbox?.enabled ||
+      getSettingsForSource('localSettings')?.sandbox?.enabled ||
+      getSettingsForSource('flagSettings')?.sandbox?.enabled ||
+      getSettingsForSource('policySettings')?.sandbox?.enabled
+    )
  } catch (error) {
    logForDebugging(`Failed to get settings for sandbox check: ${error}`)
    return false
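
The trusted-source rule in the doc comment above can be illustrated with a small standalone sketch. The `SettingSource` union, the `Settings` shape, and the injected `getSettingsForSource` parameter are assumptions made so the example runs without the project's settings module; only the source names come from the diff.

```typescript
// Assumed shapes for illustration; the project's real types are richer.
type SettingSource =
  | 'userSettings'
  | 'projectSettings'
  | 'localSettings'
  | 'flagSettings'
  | 'policySettings'
type Settings = { sandbox?: { enabled?: boolean } }

// projectSettings is deliberately absent: a checked-in settings file must
// not be able to influence the sandbox decision.
const TRUSTED_SOURCES: SettingSource[] = [
  'userSettings',
  'localSettings',
  'flagSettings',
  'policySettings',
]

// The sandbox is on if any trusted source enables it.
function isSandboxEnabled(
  getSettingsForSource: (source: SettingSource) => Settings | undefined,
): boolean {
  return TRUSTED_SOURCES.some(source =>
    Boolean(getSettingsForSource(source)?.sandbox?.enabled),
  )
}

// A repository that tries to disable the sandbox is ignored when the user
// has enabled it.
const userOnProjectOff = (source: SettingSource): Settings | undefined => {
  if (source === 'userSettings') return { sandbox: { enabled: true } }
  if (source === 'projectSettings') return { sandbox: { enabled: false } }
  return undefined
}
console.log(isSandboxEnabled(userOnProjectOff)) // true

// A project-level setting on its own is never consulted.
const projectOnly = (source: SettingSource): Settings | undefined =>
  source === 'projectSettings' ? { sandbox: { enabled: true } } : undefined
console.log(isSandboxEnabled(projectOnly)) // false
```

Because the sources are combined with a logical OR, an explicit `enabled: false` in one trusted source does not override `enabled: true` in another; this matches the `||` chain in the diff.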