Compare commits


1 Commit

Author: Juan Camilo
SHA1: 02599e0b6f

fix(api): consolidate 3P provider compatibility fixes
- Strip store field from request body for local providers (Ollama, vLLM)
  that reject unknown JSON fields with 400 errors
- Add Gemini 3.x model context windows and output token limits
  (gemini-3-flash-preview, gemini-3.1-pro-preview, google/ OpenRouter variants)
- Preserve reasoning_content on assistant tool-call message replays
  for providers that require it (Kimi k2.5, DeepSeek reasoner)
- Use conservative max_output_tokens fallback (4096/16384) for unknown
  3P models to prevent vLLM/Ollama 400 errors from exceeding max_model_len

Consolidates fixes from: #258, #268, #237, #643, #666, #677

Co-authored-by: auriti <auriti@users.noreply.github.com>
Co-authored-by: Gustavo-Falci <Gustavo-Falci@users.noreply.github.com>
Co-authored-by: lttlin <lttlin@users.noreply.github.com>
Co-authored-by: Durannd <Durannd@users.noreply.github.com>
Date: 2026-04-20 10:08:09 +02:00
117 changed files with 873 additions and 11801 deletions


@@ -149,23 +149,6 @@ ANTHROPIC_API_KEY=sk-ant-your-key-here
# Use a custom OpenAI-compatible endpoint (optional — defaults to api.openai.com)
# OPENAI_BASE_URL=https://api.openai.com/v1
# Fallback context window size (tokens) when the model is not found in the
# built-in table (default: 128000). Increase this for models with larger
# context windows (e.g. 200000 for Claude-sized contexts).
# CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW=128000
# Per-model context window overrides as a JSON object.
# Takes precedence over the built-in table, so you can register new or
# custom models without patching source.
# Example: CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS={"my-corp/llm-v3":262144,"gpt-4o-mini":128000}
# CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS=
# Per-model maximum output token overrides as a JSON object.
# Use this alongside CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS when your model
# supports a different output limit than what the built-in table specifies.
# Example: CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS={"my-corp/llm-v3":8192}
# CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS=
# -----------------------------------------------------------------------------
# Option 3: Google Gemini
@@ -284,16 +267,6 @@ ANTHROPIC_API_KEY=sk-ant-your-key-here
# Disable "Co-authored-by" line in git commits made by OpenClaude
# OPENCLAUDE_DISABLE_CO_AUTHORED_BY=1
# Disable strict tool schema normalization for non-Gemini providers
# Useful when MCP tools with complex optional params (e.g. list[dict])
# trigger "Extra required key ... supplied" errors from OpenAI-compatible endpoints
# OPENCLAUDE_DISABLE_STRICT_TOOLS=1
# Disable hidden <system-reminder> messages injected into tool output
# Suppresses the file-read cyber-risk reminder and the todo/task tool nudges
# Useful for users who want full transparency over what the model sees
# OPENCLAUDE_DISABLE_TOOL_REMINDERS=1
# Custom timeout for API requests in milliseconds (default: varies)
# API_TIMEOUT_MS=60000

.gitignore

@@ -7,8 +7,6 @@ dist/
.openclaude-profile.json
reports/
GEMINI.md
CLAUDE.md
package-lock.json
/.claude
coverage/
agent.log


@@ -1,3 +1,3 @@
{
".": "0.6.0"
".": "0.5.2"
}


@@ -1,33 +1,5 @@
# Changelog
## [0.6.0](https://github.com/Gitlawb/openclaude/compare/v0.5.2...v0.6.0) (2026-04-22)
### Features
* add model caching and benchmarking utilities ([#671](https://github.com/Gitlawb/openclaude/issues/671)) ([2b15e16](https://github.com/Gitlawb/openclaude/commit/2b15e16421f793f954a92c53933a07094544b29d))
* add thinking token extraction ([#798](https://github.com/Gitlawb/openclaude/issues/798)) ([268c039](https://github.com/Gitlawb/openclaude/commit/268c0398e4bf1ab898069c61500a2b3c226a0322))
* **api:** compress old tool_result content for small-context providers ([#801](https://github.com/Gitlawb/openclaude/issues/801)) ([a6a3de5](https://github.com/Gitlawb/openclaude/commit/a6a3de5ac155fe9d00befbfcab98d439314effd8))
* **api:** improve local provider reliability with readiness and self-healing ([#738](https://github.com/Gitlawb/openclaude/issues/738)) ([4cb963e](https://github.com/Gitlawb/openclaude/commit/4cb963e660dbd6ee438c04042700db05a9d32c59))
* **api:** smart model routing primitive (cheap-for-simple, strong-for-hard) ([#785](https://github.com/Gitlawb/openclaude/issues/785)) ([e908864](https://github.com/Gitlawb/openclaude/commit/e908864da7e7c987a98053ac5d18d702e192db2b))
* enable 15 additional feature flags in open build ([#667](https://github.com/Gitlawb/openclaude/issues/667)) ([6a62e3f](https://github.com/Gitlawb/openclaude/commit/6a62e3ff76ba9ba446b8e20cf2bb139ee76a9387))
* native Anthropic API mode for Claude models on GitHub Copilot ([#579](https://github.com/Gitlawb/openclaude/issues/579)) ([fdef4a1](https://github.com/Gitlawb/openclaude/commit/fdef4a1b4ce218ded4937ca83b30acce7c726472))
* **provider:** expose Atomic Chat in /provider picker with autodetect ([#810](https://github.com/Gitlawb/openclaude/issues/810)) ([ee19159](https://github.com/Gitlawb/openclaude/commit/ee19159c17b3de3b4a8b4a4541a6569f4261d54e))
* **provider:** zero-config autodetection primitive ([#784](https://github.com/Gitlawb/openclaude/issues/784)) ([a5bfcbb](https://github.com/Gitlawb/openclaude/commit/a5bfcbbadf8e9a1fd42f3e103d295524b8da64b0))
### Bug Fixes
* **api:** ensure strict role sequence and filter empty assistant messages after interruption ([#745](https://github.com/Gitlawb/openclaude/issues/745) regression) ([#794](https://github.com/Gitlawb/openclaude/issues/794)) ([06e7684](https://github.com/Gitlawb/openclaude/commit/06e7684eb56df8e694ac784575e163641931c44c))
* Collapse all-text arrays to string for DeepSeek compatibility ([#806](https://github.com/Gitlawb/openclaude/issues/806)) ([761924d](https://github.com/Gitlawb/openclaude/commit/761924daa7e225fe8acf41651408c7cae639a511))
* **model:** codex/nvidia-nim/minimax now read OPENAI_MODEL env ([#815](https://github.com/Gitlawb/openclaude/issues/815)) ([4581208](https://github.com/Gitlawb/openclaude/commit/458120889f6ce54cc9f0b287461d5e38eae48a20))
* **provider:** saved profile ignored when stale CLAUDE_CODE_USE_* in shell ([#807](https://github.com/Gitlawb/openclaude/issues/807)) ([13de4e8](https://github.com/Gitlawb/openclaude/commit/13de4e85df7f5fadc8cd15a76076374dc112360b))
* rename .claude.json to .openclaude.json with legacy fallback ([#582](https://github.com/Gitlawb/openclaude/issues/582)) ([4d4fb28](https://github.com/Gitlawb/openclaude/commit/4d4fb2880e4d0e3a62d8715e1ec13d932e736279))
* replace discontinued gemini-2.5-pro-preview-03-25 with stable gemini-2.5-pro ([#802](https://github.com/Gitlawb/openclaude/issues/802)) ([64582c1](https://github.com/Gitlawb/openclaude/commit/64582c119d5d0278195271379da4a68d59a89c1f)), closes [#398](https://github.com/Gitlawb/openclaude/issues/398)
* **security:** harden project settings trust boundary + MCP sanitization ([#789](https://github.com/Gitlawb/openclaude/issues/789)) ([ae3b723](https://github.com/Gitlawb/openclaude/commit/ae3b723f3b297b49925cada4728f3174aee8bf12))
* **test:** autoCompact floor assertion is flag-sensitive ([#816](https://github.com/Gitlawb/openclaude/issues/816)) ([c13842e](https://github.com/Gitlawb/openclaude/commit/c13842e91c7227246520955de6ae0636b30def9a))
* **ui:** prevent provider manager lag by deferring sync I/O ([#803](https://github.com/Gitlawb/openclaude/issues/803)) ([85eab27](https://github.com/Gitlawb/openclaude/commit/85eab2751e7d351bb0ed6a3fe0e15461d241c9cb))
## [0.5.2](https://github.com/Gitlawb/openclaude/compare/v0.5.1...v0.5.2) (2026-04-20)


@@ -125,7 +125,7 @@ Advanced and source-build guides:
| Codex OAuth | `/provider` | Opens ChatGPT sign-in in your browser and stores Codex credentials securely |
| Codex | `/provider` | Uses existing Codex CLI auth, OpenClaude secure storage, or env credentials |
| Ollama | `/provider`, env vars, or `ollama launch` | Local inference with no API key |
| Atomic Chat | `/provider`, env vars, or `bun run dev:atomic-chat` | Local Model Provider; auto-detects loaded models |
| Atomic Chat | advanced setup | Local Apple Silicon backend |
| Bedrock / Vertex / Foundry | env vars | Additional provider integrations for supported environments |
## What Works


@@ -1,333 +0,0 @@
# Hook Chains (Self-Healing Agent Mesh MVP)
Hook Chains provide an event-driven recovery layer for important workflow failures.
When a matching hook event occurs, OpenClaude evaluates declarative rules and can dispatch remediation actions such as:
- `spawn_fallback_agent`
- `notify_team`
- `warm_remote_capacity`
## Disabled-By-Default Rollout
> **Rollout recommendation:** keep Hook Chains disabled until you validate rules in your environment.
>
> - Set top-level config to `"enabled": false` initially.
> - Enable per environment when ready.
> - Dispatch is gated by `feature('HOOK_CHAINS')`.
> - Env gate defaults to off unless `CLAUDE_CODE_ENABLE_HOOK_CHAINS=1` is set.
This keeps existing workflows unchanged while you tune guard windows and action behavior.
## Feature Overview
Hook Chains are loaded from a deterministic config file and evaluated on dispatched hook events.
MVP runtime trigger wiring:
- `PostToolUseFailure` hooks dispatch Hook Chains with outcome `failed`.
- `TaskCompleted` hooks dispatch Hook Chains with outcome:
- `success` when completion hooks did not block.
- `failed` when completion hooks returned blocking errors or prevented continuation.
Default config path:
- `.openclaude/hook-chains.json`
Override path:
- `CLAUDE_CODE_HOOK_CHAINS_CONFIG_PATH=/abs/or/relative/path/to/hook-chains.json`
Global gate:
- `feature('HOOK_CHAINS')` must be enabled in the build
- `CLAUDE_CODE_ENABLE_HOOK_CHAINS=0|1` (defaults to disabled when unset)
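All three gates must pass before any rule is evaluated. A minimal sketch of how they might combine, assuming a simple AND; the helper and field names here are illustrative, not the actual source:
```ts
// Sketch only: the real gate wiring lives in the runtime, not shown here.
function hookChainsEnabled(opts: {
  buildFlag: boolean // stand-in for the build-time feature('HOOK_CHAINS') literal
  env: Record<string, string | undefined>
  config: { enabled?: boolean }
}): boolean {
  if (!opts.buildFlag) return false // build gate
  if (opts.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS !== '1') return false // env gate, off by default
  return opts.config.enabled === true // config-file gate
}
```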
## Safety Guarantees
The runtime is intentionally conservative:
- **Depth guard:** chain dispatch is blocked when `chainDepth >= maxChainDepth`.
- **Rule cooldown:** each rule can only re-fire after cooldown expires.
- **Dedup window:** identical event/action combinations are suppressed for a window.
- **Abort-safe behavior:** if the current signal is aborted, actions skip safely.
- **Policy-aware remote warm:** `warm_remote_capacity` skips when remote sessions are policy denied.
- **Bridge inactive no-op:** `warm_remote_capacity` safely skips when no active bridge handle exists.
- **Missing team context safety:** `notify_team` skips with structured reason if no team context/team file is available.
- **Fallback launcher safety:** `spawn_fallback_agent` fails with a structured reason when launch permissions/context are unavailable.
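A minimal sketch of how the depth, cooldown, and dedup guards might compose; the field names are assumptions informed by the schema below, not the actual runtime:
```ts
// Sketch: returns a skip reason, or null when dispatch is allowed.
type GuardDefaults = {
  maxChainDepth: number
  cooldownMs: number
  dedupWindowMs: number
}

function guardReason(
  rule: { id: string; maxDepth?: number; cooldownMs?: number },
  actionKey: string, // identity of the event/action combination
  state: {
    chainDepth: number
    lastFiredAtByRule: Map<string, number>
    lastActionAtByKey: Map<string, number>
  },
  defaults: GuardDefaults,
  now: number,
): string | null {
  if (state.chainDepth >= (rule.maxDepth ?? defaults.maxChainDepth)) {
    return 'max chain depth reached'
  }
  const lastFired = state.lastFiredAtByRule.get(rule.id)
  if (lastFired !== undefined && now - lastFired < (rule.cooldownMs ?? defaults.cooldownMs)) {
    return 'rule cooldown active'
  }
  const lastAction = state.lastActionAtByKey.get(actionKey)
  if (lastAction !== undefined && now - lastAction < defaults.dedupWindowMs) {
    return 'dedup window active'
  }
  return null
}
```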
## Configuration Schema Reference
Top-level object:
```json
{
"version": 1,
"enabled": true,
"maxChainDepth": 2,
"defaultCooldownMs": 30000,
"defaultDedupWindowMs": 30000,
"rules": []
}
```
### Top-Level Fields
| Field | Type | Required | Notes |
|---|---|---:|---|
| `version` | `1` | No | Defaults to `1`. |
| `enabled` | `boolean` | No | Global feature switch for this config file. |
| `maxChainDepth` | `integer` | No | Global depth guard (default `2`, max `10`). |
| `defaultCooldownMs` | `integer` | No | Default rule cooldown in ms (default `30000`). |
| `defaultDedupWindowMs` | `integer` | No | Default action dedup window in ms (default `30000`). |
| `rules` | `HookChainRule[]` | No | Defaults to `[]`. May be omitted or empty; when no rules are present, dispatch is a no-op and returns `enabled: false`. |
> **Note:** An empty ruleset is valid and can be used to keep Hook Chains configured but effectively disabled until rules are added.
### Rule Object (`HookChainRule`)
```json
{
"id": "task-failure-recovery",
"enabled": true,
"trigger": {
"event": "TaskCompleted",
"outcome": "failed"
},
"condition": {
"toolNames": ["Edit"],
"taskStatuses": ["failed"],
"errorIncludes": ["timeout", "permission denied"],
"eventFieldEquals": {
"meta.source": "scheduler"
}
},
"cooldownMs": 60000,
"dedupWindowMs": 30000,
"maxDepth": 2,
"actions": []
}
```
| Field | Type | Required | Notes |
|---|---|---:|---|
| `id` | `string` | Yes | Stable identifier used in telemetry/guards. |
| `enabled` | `boolean` | No | Per-rule switch. |
| `trigger.event` | `HookEvent` | Yes | Event name to match. |
| `trigger.outcome` | `"success"\|"failed"\|"timeout"\|"unknown"` | No | Single outcome matcher. |
| `trigger.outcomes` | `Outcome[]` | No | Multi-outcome matcher. Use either `outcome` or `outcomes`. |
| `condition` | `object` | No | Optional extra matching constraints. |
| `cooldownMs` | `integer` | No | Overrides global cooldown for this rule. |
| `dedupWindowMs` | `integer` | No | Overrides global dedup for this rule. |
| `maxDepth` | `integer` | No | Per-rule depth cap. |
| `actions` | `HookChainAction[]` | Yes | One or more actions to execute in order. |
### Condition Fields
| Field | Type | Notes |
|---|---|---|
| `toolNames` | `string[]` | Matches `tool_name` / `toolName` in event payload. |
| `taskStatuses` | `string[]` | Matches `task_status` / `taskStatus` / `status`. |
| `errorIncludes` | `string[]` | Case-insensitive substring match against `error` / `reason` / `message`. |
| `eventFieldEquals` | `Record<string, string\|number\|boolean>` | Dot-path equality against payload (example: `"meta.source": "scheduler"`). |
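`eventFieldEquals` walks the payload by dot path and compares with strict equality. A hedged sketch of that matching (the actual matcher is not part of this document):
```ts
// Resolve a dot path such as "meta.source" against a payload object.
function getByDotPath(payload: unknown, path: string): unknown {
  return path.split('.').reduce<unknown>(
    (node, key) =>
      node != null && typeof node === 'object'
        ? (node as Record<string, unknown>)[key]
        : undefined,
    payload,
  )
}

function eventFieldEqualsMatches(
  payload: unknown,
  expected: Record<string, string | number | boolean>,
): boolean {
  return Object.entries(expected).every(
    ([path, value]) => getByDotPath(payload, path) === value,
  )
}

// eventFieldEqualsMatches({ meta: { source: 'scheduler' } }, { 'meta.source': 'scheduler' }) === true
```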
### Actions
#### `spawn_fallback_agent`
```json
{
"type": "spawn_fallback_agent",
"id": "fallback-1",
"enabled": true,
"dedupWindowMs": 30000,
"description": "Fallback recovery for failed task",
"promptTemplate": "Recover task ${TASK_SUBJECT}. Event=${EVENT_NAME}, outcome=${OUTCOME}, error=${ERROR}. Payload=${PAYLOAD_JSON}",
"agentType": "general-purpose",
"model": "sonnet"
}
```
#### `notify_team`
```json
{
"type": "notify_team",
"id": "notify-ops",
"enabled": true,
"dedupWindowMs": 30000,
"teamName": "mesh-team",
"recipients": ["*"],
"summary": "Hook chain ${RULE_ID} fired",
"messageTemplate": "Event=${EVENT_NAME} outcome=${OUTCOME}\nTask=${TASK_ID}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}"
}
```
#### `warm_remote_capacity`
```json
{
"type": "warm_remote_capacity",
"id": "warm-bridge",
"enabled": true,
"dedupWindowMs": 60000,
"createDefaultEnvironmentIfMissing": false
}
```
## Complete Example Configs
### 1) Retry via Fallback Agent
```json
{
"version": 1,
"enabled": true,
"maxChainDepth": 2,
"defaultCooldownMs": 30000,
"defaultDedupWindowMs": 30000,
"rules": [
{
"id": "retry-task-via-fallback",
"trigger": {
"event": "TaskCompleted",
"outcome": "failed"
},
"cooldownMs": 60000,
"actions": [
{
"type": "spawn_fallback_agent",
"id": "spawn-retry-agent",
"description": "Retry failed task with fallback agent",
"promptTemplate": "A task failed. Recover it safely.\nTask=${TASK_SUBJECT}\nDescription=${TASK_DESCRIPTION}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}",
"agentType": "general-purpose",
"model": "sonnet"
}
]
}
]
}
```
### 2) Notify Only
```json
{
"version": 1,
"enabled": true,
"maxChainDepth": 2,
"defaultCooldownMs": 30000,
"defaultDedupWindowMs": 30000,
"rules": [
{
"id": "notify-on-tool-failure",
"trigger": {
"event": "PostToolUseFailure",
"outcome": "failed"
},
"condition": {
"toolNames": ["Edit", "Write", "Bash"]
},
"actions": [
{
"type": "notify_team",
"id": "notify-team-failure",
"recipients": ["*"],
"summary": "Tool failure detected",
"messageTemplate": "Tool failure detected.\nEvent=${EVENT_NAME} outcome=${OUTCOME}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}"
}
]
}
]
}
```
### 3) Combined Fallback + Notify + Bridge Warm
```json
{
"version": 1,
"enabled": true,
"maxChainDepth": 2,
"defaultCooldownMs": 45000,
"defaultDedupWindowMs": 30000,
"rules": [
{
"id": "full-recovery-chain",
"trigger": {
"event": "TaskCompleted",
"outcomes": ["failed", "timeout"]
},
"condition": {
"errorIncludes": ["timeout", "capacity", "connection"]
},
"cooldownMs": 90000,
"actions": [
{
"type": "spawn_fallback_agent",
"id": "fallback-agent",
"description": "Recover failed task execution",
"promptTemplate": "Recover failed task and produce a concise fix summary.\nTask=${TASK_SUBJECT}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}"
},
{
"type": "notify_team",
"id": "notify-team",
"recipients": ["*"],
"summary": "Recovery chain triggered",
"messageTemplate": "Recovery chain ${RULE_ID} fired.\nOutcome=${OUTCOME}\nTask=${TASK_SUBJECT}\nError=${ERROR}"
},
{
"type": "warm_remote_capacity",
"id": "warm-capacity",
"createDefaultEnvironmentIfMissing": false
}
]
}
]
}
```
## Template Variables
The following placeholders are supported by `promptTemplate`, `summary`, and `messageTemplate`:
- `${EVENT_NAME}`
- `${OUTCOME}`
- `${RULE_ID}`
- `${TASK_SUBJECT}`
- `${TASK_DESCRIPTION}`
- `${TASK_ID}`
- `${ERROR}`
- `${PAYLOAD_JSON}`
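As a rough illustration of how these placeholders substitute into a template string (the real renderer is not shown here; unknown placeholders are assumed to pass through unchanged):
```ts
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\$\{([A-Z_]+)\}/g, (match, name: string) =>
    Object.hasOwn(vars, name) ? vars[name] : match,
  )
}

renderTemplate('Event=${EVENT_NAME} outcome=${OUTCOME}', {
  EVENT_NAME: 'TaskCompleted',
  OUTCOME: 'failed',
})
// => 'Event=TaskCompleted outcome=failed'
```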
## Troubleshooting
### Rule never triggers
- Verify `trigger.event` and `trigger.outcome`/`trigger.outcomes` exactly match dispatched event data.
- Check `condition` filters (especially `toolNames` and `eventFieldEquals` dot-path keys).
- Confirm the config file is valid JSON and schema-valid.
### Actions show as skipped
Common skip reasons:
- `action disabled`
- `rule cooldown active ...`
- `dedup window active ...`
- `max chain depth reached ...`
- `No team context is available ...`
- `Team file not found ...`
- `Remote sessions are blocked by policy`
- `Bridge is not active; warm_remote_capacity is a safe no-op`
- `No fallback agent launcher is registered in runtime context`
### Config changes not reflected
- Loader uses memoization by file mtime/size.
- Ensure your editor writes the file fully and updates mtime.
- If needed, force reload from the caller side with `forceReloadConfig: true`.
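A minimal sketch of mtime/size-keyed memoization consistent with the behavior above (names are illustrative, not from the loader itself):
```ts
import { readFileSync, statSync } from 'node:fs'

let cached: { key: string; config: unknown } | undefined

function loadHookChainsConfig(
  path: string,
  options?: { forceReloadConfig?: boolean },
): unknown {
  const stats = statSync(path)
  const key = `${stats.mtimeMs}:${stats.size}` // reload only when either changes
  if (!options?.forceReloadConfig && cached?.key === key) return cached.config
  const config = JSON.parse(readFileSync(path, 'utf-8')) as unknown
  cached = { key, config }
  return config
}
```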
### Existing workflows changed unexpectedly
- Set `"enabled": false` at top-level.
- Or globally disable with `CLAUDE_CODE_ENABLE_HOOK_CHAINS=0`.
- Re-enable gradually after validating one rule at a time.


@@ -1,6 +1,6 @@
{
"name": "@gitlawb/openclaude",
"version": "0.6.0",
"version": "0.5.2",
"description": "Claude Code opened to any LLM — OpenAI, Gemini, DeepSeek, Ollama, and 200+ models",
"type": "module",
"bin": {


@@ -19,46 +19,30 @@ const version = pkg.version
// Most Anthropic-internal features stay off; open-build features can be
// selectively enabled here when their full source exists in the mirror.
const featureFlags: Record<string, boolean> = {
// ── Disabled: require Anthropic infrastructure or missing source ─────
VOICE_MODE: false, // Push-to-talk STT via claude.ai OAuth endpoint
PROACTIVE: false, // Autonomous agent mode (missing proactive/ module)
KAIROS: false, // Persistent assistant/session mode (cloud backend)
BRIDGE_MODE: false, // Remote desktop bridge via CCR infrastructure
DAEMON: false, // Background daemon process (stubbed in open build)
AGENT_TRIGGERS: false, // Scheduled remote agent triggers
ABLATION_BASELINE: false, // A/B testing harness for eval experiments
CONTEXT_COLLAPSE: false, // Context collapsing optimization (stubbed)
COMMIT_ATTRIBUTION: false, // Co-Authored-By metadata in git commits
UDS_INBOX: false, // Unix Domain Socket inter-session messaging
BG_SESSIONS: false, // Background sessions via tmux (stubbed)
WEB_BROWSER_TOOL: false, // Built-in browser automation (source not mirrored)
CHICAGO_MCP: false, // Computer-use MCP (native Swift modules stubbed)
COWORKER_TYPE_TELEMETRY: false, // Telemetry for agent/coworker type classification
MCP_SKILLS: false, // Dynamic MCP skill discovery (src/skills/mcpSkills.ts not mirrored; enabling this causes "fetchMcpSkillsForClient is not a function" when MCP servers with resources connect — see #856)
// ── Enabled: upstream defaults ──────────────────────────────────────
COORDINATOR_MODE: true, // Multi-agent coordinator with worker delegation
BUILTIN_EXPLORE_PLAN_AGENTS: true, // Built-in Explore/Plan specialized subagents
BUDDY: true, // Buddy mode for paired programming
MONITOR_TOOL: true, // MCP server monitoring/streaming tool
TEAMMEM: true, // Team memory management
MESSAGE_ACTIONS: true, // Message action buttons in the UI
// ── Enabled: new activations ────────────────────────────────────────
DUMP_SYSTEM_PROMPT: true, // --dump-system-prompt CLI flag for debugging
CACHED_MICROCOMPACT: true, // Cache-aware tool result truncation optimization
AWAY_SUMMARY: true, // "While you were away" recap after 5min blur
TRANSCRIPT_CLASSIFIER: true, // Auto-approval classifier for safe tool uses
ULTRATHINK: true, // Deep thinking mode — type "ultrathink" to boost reasoning
TOKEN_BUDGET: true, // Token budget tracking with usage warnings
HISTORY_PICKER: true, // Enhanced interactive prompt history picker
QUICK_SEARCH: true, // Ctrl+G quick search across prompts
SHOT_STATS: true, // Shot distribution stats in session summary
EXTRACT_MEMORIES: true, // Auto-extract durable memories from conversations
FORK_SUBAGENT: true, // Implicit context-forking when omitting subagent_type
VERIFICATION_AGENT: true, // Built-in read-only agent for test/verification
PROMPT_CACHE_BREAK_DETECTION: true, // Detect & log unexpected prompt cache invalidations
HOOK_PROMPTS: true, // Allow tools to request interactive user prompts
VOICE_MODE: false,
PROACTIVE: false,
KAIROS: false,
BRIDGE_MODE: false,
DAEMON: false,
AGENT_TRIGGERS: false,
MONITOR_TOOL: true,
ABLATION_BASELINE: false,
DUMP_SYSTEM_PROMPT: false,
CACHED_MICROCOMPACT: false,
COORDINATOR_MODE: true,
BUILTIN_EXPLORE_PLAN_AGENTS: true,
CONTEXT_COLLAPSE: false,
COMMIT_ATTRIBUTION: false,
TEAMMEM: true,
UDS_INBOX: false,
BG_SESSIONS: false,
AWAY_SUMMARY: false,
TRANSCRIPT_CLASSIFIER: false,
WEB_BROWSER_TOOL: false,
MESSAGE_ACTIONS: true,
BUDDY: true,
CHICAGO_MCP: false,
COWORKER_TYPE_TELEMETRY: false,
}
// ── Pre-process: replace feature() calls with boolean literals ──────


@@ -1,47 +0,0 @@
import { existsSync, readFileSync } from 'fs'
import { join } from 'path'
import { expect, test } from 'bun:test'
// Regression guard for #856. Several build feature flags require source files
// that are not mirrored into the open build. When such a flag is set to `true`
// without the source present, the bundler falls back to a missing-module stub
// that only exports `default`, which causes runtime errors like
// `fetchMcpSkillsForClient is not a function` when downstream code reaches
// through the `require()` to a named export.
//
// This test fails fast at test-time if someone re-enables one of these flags
// without first mirroring the corresponding source file.
const BUILD_SCRIPT = join(import.meta.dir, 'build.ts')
const REPO_ROOT = join(import.meta.dir, '..')
type FlagGuard = {
flag: string
source: string // path relative to repo root
}
const FLAG_REQUIRES_SOURCE: FlagGuard[] = [
{ flag: 'MCP_SKILLS', source: 'src/skills/mcpSkills.ts' },
]
test('build feature flags are not enabled without their source files', () => {
const buildScript = readFileSync(BUILD_SCRIPT, 'utf-8')
for (const { flag, source } of FLAG_REQUIRES_SOURCE) {
const enabledRe = new RegExp(`^\\s*${flag}\\s*:\\s*true\\b`, 'm')
const isEnabled = enabledRe.test(buildScript)
const sourceExists = existsSync(join(REPO_ROOT, source))
if (isEnabled && !sourceExists) {
throw new Error(
`Feature flag ${flag} is enabled in scripts/build.ts, but its required source file "${source}" does not exist. ` +
`Enabling this flag without the source will cause runtime errors (missing named exports from the missing-module stub). ` +
`Either mirror the source file or set ${flag}: false.`,
)
}
// When the source IS present, the flag can be either true or false; either
// is fine. We only care about the "enabled but missing" combination.
expect(true).toBe(true)
}
})


@@ -50,23 +50,6 @@ describe('growthbook stub — local feature flag overrides', () => {
expect(stub.getAllGrowthBookFeatures()).toEqual({})
})
// ── Open-build defaults (_openBuildDefaults) ────────────────────
test('returns open-build default when flags file is absent', () => {
// tengu_passport_quail is in _openBuildDefaults as true; without a
// flags file the stub should return the open-build override, not
// the call-site defaultValue.
expect(stub.getFeatureValue_CACHED_MAY_BE_STALE('tengu_passport_quail', false)).toBe(true)
expect(stub.getFeatureValue_CACHED_MAY_BE_STALE('tengu_coral_fern', false)).toBe(true)
})
test('flags file overrides open-build defaults', () => {
// User-provided feature-flags.json takes priority over _openBuildDefaults.
writeFileSync(flagsFile, JSON.stringify({ tengu_passport_quail: false }))
expect(stub.getFeatureValue_CACHED_MAY_BE_STALE('tengu_passport_quail', true)).toBe(false)
})
// ── Valid JSON object ────────────────────────────────────────────
test('loads and returns values from a valid JSON file', () => {


@@ -40,151 +40,6 @@ import _os from 'node:os';
let _flags = undefined;
// ── Open-build GrowthBook overrides ───────────────────────────────────
// Override upstream defaultValue for runtime gates tied to build-time
// features. Only keys that DIFFER from upstream belong here — the
// catalog below is pure documentation and does NOT affect resolution.
//
// Priority: ~/.claude/feature-flags.json > _openBuildDefaults > defaultValue
//
// To override at runtime, create ~/.claude/feature-flags.json:
// { "tengu_some_flag": true }
const _openBuildDefaults = {
'tengu_sedge_lantern': true, // AWAY_SUMMARY — "while you were away" recap (upstream: false)
'tengu_hive_evidence': true, // VERIFICATION_AGENT — read-only test/verification agent (upstream: false)
'tengu_passport_quail': true, // EXTRACT_MEMORIES — enable memory extraction (upstream: false)
'tengu_coral_fern': true, // EXTRACT_MEMORIES — enable memory search in past context (upstream: false)
};
/* ── Known runtime feature keys (reference only) ───────────────────────
* This catalog does NOT participate in flag resolution. It documents
* the known GrowthBook keys and their upstream default values, scraped
* from src/ call sites. It is NOT exhaustive — new keys may be added
* upstream between catalog updates.
*
* Some keys have different defaults at different call sites — this is
* intentional upstream (the server unifies the value at runtime).
*
* To activate any of these, add them to ~/.claude/feature-flags.json
* or to _openBuildDefaults above.
*
* ── Reasoning & thinking ──────────────────────────────────────────────
* tengu_turtle_carbon = true ULTRATHINK deep thinking runtime gate
* tengu_thinkback = gate /thinkback replay command
*
* ── Agents & orchestration ────────────────────────────────────────────
* tengu_amber_flint = true Agent swarms coordination
* tengu_amber_stoat = true Built-in agent availability (Explore, Plan, etc.)
* tengu_agent_list_attach = true Attach file context to agent list
* tengu_auto_background_agents = false Auto-spawn background agents
* tengu_slim_subagent_claudemd = true Lighter ClaudeMD for subagents
* tengu_hive_evidence = false Verification agent / evidence tracking (4 call sites)
* tengu_ultraplan_model = model cfg ULTRAPLAN model selection (dynamic config)
*
* ── Memory & context ──────────────────────────────────────────────────
* tengu_passport_quail = false EXTRACT_MEMORIES main gate (isExtractModeActive)
* tengu_coral_fern = false EXTRACT_MEMORIES search in past context
* tengu_slate_thimble = false Memory dir paths (non-interactive sessions)
* tengu_herring_clock = true/false Team memory paths (varies by call site)
* tengu_bramble_lintel = null Extract memories throttle (null → every turn)
* tengu_sedge_lantern = false AWAY_SUMMARY "while you were away" recap
* tengu_session_memory = false Session memory service
* tengu_sm_config = {} Session memory config (dynamic)
* tengu_sm_compact_config = {} Session memory compaction config (dynamic)
* tengu_cobalt_raccoon = false Reactive compaction (suppress auto-compact)
* tengu_pebble_leaf_prune = false Session storage pruning
*
* ── Kairos & cron ─────────────────────────────────────────────────────
* tengu_kairos_brief = false Brief layout mode (KAIROS)
* tengu_kairos_brief_config = {} Brief config (dynamic)
* tengu_kairos_cron = true Cron scheduler enable
* tengu_kairos_cron_durable = true Durable (disk-persistent) cron tasks
* tengu_kairos_cron_config = {} Cron jitter config (dynamic)
*
* ── Bridge & remote (require Anthropic infra) ─────────────────────────
* tengu_ccr_bridge = false CCR bridge connection
* tengu_ccr_bridge_multi_session = gate Multi-session spawn mode
* tengu_ccr_mirror = false CCR session mirroring
* tengu_ccr_bundle_seed_enabled = gate Git bundle seeding for CCR
* tengu_ccr_bundle_max_bytes = null Bundle size limit (null → default)
* tengu_bridge_repl_v2 = false Environment-less REPL bridge v2
* tengu_bridge_repl_v2_cse_shim_enabled = true CSE→Session tag retag shim
* tengu_bridge_min_version = {min:'0'} Min CLI version for bridge (dynamic)
* tengu_bridge_initial_history_cap = 200 Initial history cap for bridge
* tengu_bridge_system_init = false Bridge system initialization
* tengu_cobalt_harbor = false Auto-connect CCR at startup
* tengu_cobalt_lantern = false Remote setup preconditions
* tengu_remote_backend = false Remote TUI backend
* tengu_surreal_dali = false Remote agent tasks / triggers
*
* ── Prompt & API ──────────────────────────────────────────────────────
* tengu_attribution_header = true Attribution header in API requests
* tengu_basalt_3kr = true MCP instructions delta
* tengu_slate_prism = true/false Message formatting (varies by call site)
* tengu_amber_prism = false Message content formatting
* tengu_amber_json_tools = false JSON format for tool schemas
* tengu_fgts = false API feature gates
* tengu_otk_slot_v1 = false One-time key slots for API auth
* tengu_cicada_nap_ms = 0 Background GrowthBook refresh throttle (ms)
* tengu_miraculo_the_bard = false Service initialization gate
* tengu_immediate_model_command = false Immediate /model command execution
* tengu_chomp_inflection = false Prompt suggestions after responses
* tengu_tool_pear = gate API betas for tool use
* tengu-off-switch = {act:false} Service kill switch (dynamic; uses dash)
*
* ── Permissions & security ────────────────────────────────────────────
* tengu_birch_trellis = true Bash auto-mode permissions config
* tengu_auto_mode_config = {} Auto-mode configuration (dynamic, many call sites)
* tengu_iron_gate_closed = true Permission iron gate (with refresh)
* tengu_destructive_command_warning = false Warning for destructive bash commands
* tengu_disable_bypass_permissions_mode = security Security killswitch (always false in open build)
*
* ── UI & UX ───────────────────────────────────────────────────────────
* tengu_willow_mode = 'off' REPL rendering mode
* tengu_terminal_panel = false Terminal panel keybinding
* tengu_terminal_sidebar = false Terminal sidebar in REPL/config
* tengu_marble_sandcastle = false Fast mode gate
* tengu_jade_anvil_4 = false Rate limit options UI ordering
* tengu_collage_kaleidoscope = true Native clipboard image paste (macOS)
* tengu_lapis_finch = false Plugin/hint recommendation
* tengu_lodestone_enabled = false Deep links claude-cli:// protocol
* tengu_copper_panda = false Skill improvement suggestions
* tengu_desktop_upsell = {} Desktop app upsell config (dynamic)
* tengu-top-of-feed-tip = {} Emergency tip of feed (dynamic; uses dash)
*
* ── File operations ───────────────────────────────────────────────────
* tengu_quartz_lantern = false File read/write dedup optimization
* tengu_moth_copse = false Attachments handling (variant A)
* tengu_marble_fox = false Attachments handling (variant B)
* tengu_scratch = gate Scratchpad filesystem access / coordinator
*
* ── MCP & plugins ─────────────────────────────────────────────────────
* tengu_harbor = false MCP channel allowlist verification
* tengu_harbor_permissions = false MCP channel permissions enforcement
* tengu_copper_bridge = false Chrome MCP bridge
* tengu_chrome_auto_enable = false Auto-enable Chrome MCP on startup
* tengu_glacier_2xr = false Enhanced tool search / ToolSearchTool
* tengu_malort_pedway = {} Computer-use (Chicago) config (dynamic)
*
* ── VSCode / IDE ──────────────────────────────────────────────────────
* tengu_quiet_fern = false VSCode browser support
* tengu_vscode_cc_auth = false VSCode in-band OAuth via claude_authenticate
* tengu_vscode_review_upsell = gate VSCode review upsell
* tengu_vscode_onboarding = gate VSCode onboarding experience
*
* ── Voice ─────────────────────────────────────────────────────────────
* tengu_amber_quartz_disabled = false VOICE_MODE kill-switch (false = voice allowed)
*
* ── Auto-updater (stubbed in open build) ──────────────────────────────
* tengu_version_config = {min:'0'} Min version enforcement (dynamic)
* tengu_max_version_config = {} Max version / deprecation config (dynamic)
*
* ── Telemetry & tracing ───────────────────────────────────────────────
* tengu_trace_lantern = false Beta session tracing
* tengu_chair_sermon = gate Analytics / message formatting gate
* tengu_strap_foyer = false Settings sync to cloud
*/
function _loadFlags() {
if (_flags !== undefined) return;
try {
@@ -200,7 +55,6 @@ function _loadFlags() {
function _getFlagValue(key, defaultValue) {
_loadFlags();
if (_flags != null && Object.hasOwn(_flags, key)) return _flags[key];
if (Object.hasOwn(_openBuildDefaults, key)) return _openBuildDefaults[key];
return defaultValue;
}


@@ -20,23 +20,6 @@ describe('formatReachabilityFailureDetail', () => {
)
})
test('redacts credentials and sensitive query parameters in endpoint details', () => {
const detail = formatReachabilityFailureDetail(
'http://user:pass@localhost:11434/v1/models?token=abc123&mode=test',
502,
'bad gateway',
{
transport: 'chat_completions',
requestedModel: 'llama3.1:8b',
resolvedModel: 'llama3.1:8b',
},
)
expect(detail).toBe(
'Unexpected status 502 from http://redacted:redacted@localhost:11434/v1/models?token=redacted&mode=test. Body: bad gateway',
)
})
test('adds alias/entitlement hint for codex model support 400s', () => {
const detail = formatReachabilityFailureDetail(
'https://chatgpt.com/backend-api/codex/responses',


@@ -7,11 +7,6 @@ import {
resolveProviderRequest,
isLocalProviderUrl as isProviderLocalUrl,
} from '../src/services/api/providerConfig.js'
import {
getLocalOpenAICompatibleProviderLabel,
probeOllamaGenerationReadiness,
} from '../src/utils/providerDiscovery.js'
import { redactUrlForDisplay } from '../src/utils/urlRedaction.js'
type CheckResult = {
ok: boolean
@@ -74,7 +69,7 @@ export function formatReachabilityFailureDetail(
},
): string {
const compactBody = responseBody.trim().replace(/\s+/g, ' ').slice(0, 240)
const base = `Unexpected status ${status} from ${redactUrlForDisplay(endpoint)}.`
const base = `Unexpected status ${status} from ${endpoint}.`
const bodySuffix = compactBody ? ` Body: ${compactBody}` : ''
if (request.transport !== 'codex_responses' || status !== 400) {
@@ -260,7 +255,7 @@ function checkOpenAIEnv(): CheckResult[] {
results.push(pass('OPENAI_MODEL', process.env.OPENAI_MODEL))
}
results.push(pass('OPENAI_BASE_URL', redactUrlForDisplay(request.baseUrl)))
results.push(pass('OPENAI_BASE_URL', request.baseUrl))
if (request.transport === 'codex_responses') {
const credentials = resolveCodexApiCredentials(process.env)
@@ -313,7 +308,7 @@ async function checkBaseUrlReachability(): Promise<CheckResult> {
return pass('Provider reachability', 'Skipped (OpenAI-compatible mode disabled).')
}
if (useGithub && !useOpenAI) {
if (useGithub) {
return pass(
'Provider reachability',
'Skipped for GitHub Models (inference endpoint differs from OpenAI /models probe).',
@@ -331,7 +326,6 @@ async function checkBaseUrlReachability(): Promise<CheckResult> {
const endpoint = request.transport === 'codex_responses'
? `${request.baseUrl}/responses`
: `${request.baseUrl}/models`
const redactedEndpoint = redactUrlForDisplay(endpoint)
const controller = new AbortController()
const timeout = setTimeout(() => controller.abort(), 4000)
@@ -381,10 +375,7 @@ async function checkBaseUrlReachability(): Promise<CheckResult> {
})
if (response.status === 200 || response.status === 401 || response.status === 403) {
return pass(
'Provider reachability',
`Reached ${redactedEndpoint} (status ${response.status}).`,
)
return pass('Provider reachability', `Reached ${endpoint} (status ${response.status}).`)
}
const responseBody = await response.text().catch(() => '')
@@ -400,100 +391,12 @@ async function checkBaseUrlReachability(): Promise<CheckResult> {
)
} catch (error) {
const message = error instanceof Error ? error.message : String(error)
return fail(
'Provider reachability',
`Failed to reach ${redactedEndpoint}: ${message}`,
)
return fail('Provider reachability', `Failed to reach ${endpoint}: ${message}`)
} finally {
clearTimeout(timeout)
}
}
async function checkProviderGenerationReadiness(): Promise<CheckResult> {
const useGemini = isTruthy(process.env.CLAUDE_CODE_USE_GEMINI)
const useOpenAI = isTruthy(process.env.CLAUDE_CODE_USE_OPENAI)
const useGithub = isTruthy(process.env.CLAUDE_CODE_USE_GITHUB)
const useMistral = isTruthy(process.env.CLAUDE_CODE_USE_MISTRAL)
if (!useGemini && !useOpenAI && !useGithub && !useMistral) {
return pass('Provider generation readiness', 'Skipped (OpenAI-compatible mode disabled).')
}
if (useGithub && !useOpenAI) {
return pass(
'Provider generation readiness',
'Skipped for GitHub Models (runtime generation uses a different endpoint flow).',
)
}
if (useGemini || useMistral) {
return pass(
'Provider generation readiness',
'Skipped for managed provider mode.',
)
}
if (!useOpenAI) {
return pass('Provider generation readiness', 'Skipped (OpenAI-compatible mode disabled).')
}
const request = resolveProviderRequest({
model: process.env.OPENAI_MODEL,
baseUrl: process.env.OPENAI_BASE_URL,
})
if (request.transport === 'codex_responses') {
return pass(
'Provider generation readiness',
'Skipped for Codex responses (reachability probe already performs a lightweight generation request).',
)
}
if (!isLocalBaseUrl(request.baseUrl)) {
return pass('Provider generation readiness', 'Skipped for non-local provider URL.')
}
const localProviderLabel = getLocalOpenAICompatibleProviderLabel(request.baseUrl)
if (localProviderLabel !== 'Ollama') {
return pass(
'Provider generation readiness',
`Skipped for ${localProviderLabel} (no provider-specific generation probe).`,
)
}
const readiness = await probeOllamaGenerationReadiness({
baseUrl: request.baseUrl,
model: request.requestedModel,
})
if (readiness.state === 'ready') {
return pass(
'Provider generation readiness',
`Generated a test response with ${readiness.probeModel ?? request.requestedModel}.`,
)
}
if (readiness.state === 'unreachable') {
return fail(
'Provider generation readiness',
`Could not reach Ollama at ${redactUrlForDisplay(request.baseUrl)}.`,
)
}
if (readiness.state === 'no_models') {
return fail(
'Provider generation readiness',
'Ollama is reachable, but no installed models were found. Pull a model first (for example: ollama pull qwen2.5-coder:7b).',
)
}
const detailSuffix = readiness.detail ? ` Detail: ${readiness.detail}.` : ''
return fail(
'Provider generation readiness',
`Ollama is reachable, but generation failed for ${readiness.probeModel ?? request.requestedModel}.${detailSuffix}`,
)
}
function isAtomicChatUrl(baseUrl: string): boolean {
try {
const parsed = new URL(baseUrl)
@@ -664,7 +567,6 @@ async function main(): Promise<void> {
results.push(checkBuildArtifacts())
results.push(...checkOpenAIEnv())
results.push(await checkBaseUrlReachability())
results.push(await checkProviderGenerationReadiness())
results.push(checkOllamaProcessorMode())
if (!options.json) {


@@ -249,11 +249,6 @@ export type ToolUseContext = {
/** When true, canUseTool must always be called even when hooks auto-approve.
* Used by speculation for overlay file path rewriting. */
requireCanUseTool?: boolean
/**
* Optional callback used by hook-chain fallback actions that launch
* AgentTool from hook runtime paths.
*/
hookChainsCanUseTool?: CanUseToolFn
messages: Message[]
fileReadingLimits?: {
maxTokens?: number


@@ -21,11 +21,11 @@ describe('Gemini store field fix', () => {
test('isGeminiMode is imported and used in openaiShim', async () => {
const content = await file('services/api/openaiShim.ts').text()
// Verify the fix: store deletion should check for Gemini mode
// Verify the fix: store deletion should check for Gemini mode and local providers
expect(content).toContain('isGeminiMode()')
expect(content).toContain("mistral and gemini don't recognize body.store")
// Ensure the delete body.store is guarded for both Mistral and Gemini
expect(content).toMatch(/isMistral\s*\|\|\s*isGeminiMode\(\)/)
expect(content).toContain("Strip store for providers that don't recognize it")
// Ensure the delete body.store is guarded for Mistral, Gemini, and local providers
expect(content).toMatch(/isMistral\s*\|\|\s*isGeminiMode\(\)\s*\|\|\s*isLocal/)
})
test('store: false is still set by default (OpenAI needs it)', async () => {
@@ -169,14 +169,6 @@ describe('Web search result count improvements', () => {
expect(content).toMatch(/max_uses:\s*15/)
})
test('codex web search path guarantees a non-empty result body', async () => {
const content = await file(
'tools/WebSearchTool/WebSearchTool.ts',
).text()
expect(content).toContain("results.push('No results found.')")
})
})
// ---------------------------------------------------------------------------


@@ -1,191 +0,0 @@
/**
* Security hardening regression tests.
*
* Covers:
* 1. MCP tool result Unicode sanitization
* 2. Sandbox settings source filtering (exclude projectSettings)
* 3. Plugin git clone/pull hooks disabled
* 4. ANTHROPIC_FOUNDRY_API_KEY removed from SAFE_ENV_VARS
* 5. WebFetch SSRF protection via ssrfGuardedLookup
*/
import { describe, test, expect } from 'bun:test'
import { resolve } from 'path'
const SRC = resolve(import.meta.dir, '..')
const file = (relative: string) => Bun.file(resolve(SRC, relative))
// ---------------------------------------------------------------------------
// Fix 1: MCP tool result Unicode sanitization
// ---------------------------------------------------------------------------
describe('MCP tool result sanitization', () => {
test('transformResultContent sanitizes text content', async () => {
const content = await file('services/mcp/client.ts').text()
// Tool definitions are already sanitized (line ~1798)
expect(content).toContain('recursivelySanitizeUnicode(result.tools)')
// Tool results must also be sanitized
expect(content).toMatch(
/case 'text':[\s\S]*?recursivelySanitizeUnicode\(resultContent\.text\)/,
)
})
test('resource text content is also sanitized', async () => {
const content = await file('services/mcp/client.ts').text()
expect(content).toMatch(
/recursivelySanitizeUnicode\(\s*`\$\{prefix\}\$\{resource\.text\}`/,
)
})
})
// ---------------------------------------------------------------------------
// Fix 2: Sandbox settings source filtering
// ---------------------------------------------------------------------------
describe('Sandbox settings trust boundary', () => {
test('getSandboxEnabledSetting does not use getSettings_DEPRECATED', async () => {
const content = await file('utils/sandbox/sandbox-adapter.ts').text()
// Extract the getSandboxEnabledSetting function body
const fnMatch = content.match(
/function getSandboxEnabledSetting\(\)[^{]*\{([\s\S]*?)\n\}/,
)
expect(fnMatch).not.toBeNull()
const fnBody = fnMatch![1]
// Must NOT use getSettings_DEPRECATED (reads all sources including project)
expect(fnBody).not.toContain('getSettings_DEPRECATED')
// Must use getSettingsForSource for individual trusted sources
expect(fnBody).toContain("getSettingsForSource('userSettings')")
expect(fnBody).toContain("getSettingsForSource('policySettings')")
// Must NOT read from projectSettings
expect(fnBody).not.toContain("'projectSettings'")
})
})
// ---------------------------------------------------------------------------
// Fix 3: Plugin git hooks disabled
// ---------------------------------------------------------------------------
describe('Plugin git operations disable hooks', () => {
test('gitClone includes core.hooksPath=/dev/null', async () => {
const content = await file('utils/plugins/marketplaceManager.ts').text()
// The clone args must disable hooks
const cloneSection = content.slice(
content.indexOf('export async function gitClone('),
content.indexOf('export async function gitClone(') + 2000,
)
expect(cloneSection).toContain("'core.hooksPath=/dev/null'")
})
test('gitPull includes core.hooksPath=/dev/null', async () => {
const content = await file('utils/plugins/marketplaceManager.ts').text()
const pullSection = content.slice(
content.indexOf('export async function gitPull('),
content.indexOf('export async function gitPull(') + 2000,
)
expect(pullSection).toContain("'core.hooksPath=/dev/null'")
})
test('gitSubmoduleUpdate includes core.hooksPath=/dev/null', async () => {
const content = await file('utils/plugins/marketplaceManager.ts').text()
const subSection = content.slice(
content.indexOf('async function gitSubmoduleUpdate('),
content.indexOf('async function gitSubmoduleUpdate(') + 1000,
)
expect(subSection).toContain("'core.hooksPath=/dev/null'")
})
})
// ---------------------------------------------------------------------------
// Fix 4: ANTHROPIC_FOUNDRY_API_KEY not in SAFE_ENV_VARS
// ---------------------------------------------------------------------------
describe('SAFE_ENV_VARS excludes credentials', () => {
test('ANTHROPIC_FOUNDRY_API_KEY is not in SAFE_ENV_VARS', async () => {
const content = await file('utils/managedEnvConstants.ts').text()
// Extract the SAFE_ENV_VARS set definition
const safeStart = content.indexOf('export const SAFE_ENV_VARS')
const safeEnd = content.indexOf('])', safeStart)
const safeSection = content.slice(safeStart, safeEnd)
expect(safeSection).not.toContain('ANTHROPIC_FOUNDRY_API_KEY')
})
})
// ---------------------------------------------------------------------------
// Fix 5: WebFetch SSRF protection
// ---------------------------------------------------------------------------
describe('WebFetch SSRF guard', () => {
test('getWithPermittedRedirects uses ssrfGuardedLookup', async () => {
const content = await file('tools/WebFetchTool/utils.ts').text()
expect(content).toContain(
"import { ssrfGuardedLookup } from '../../utils/hooks/ssrfGuard.js'",
)
// The axios.get call in getWithPermittedRedirects must include lookup
const fnSection = content.slice(
content.indexOf('export async function getWithPermittedRedirects('),
content.indexOf('export async function getWithPermittedRedirects(') +
1000,
)
expect(fnSection).toContain('lookup: ssrfGuardedLookup')
})
})
// ---------------------------------------------------------------------------
// Fix 6: Swarm permission file polling removed (security hardening)
// ---------------------------------------------------------------------------
describe('Swarm permission file polling removed', () => {
test('useSwarmPermissionPoller hook no longer exists', async () => {
const content = await file(
'hooks/useSwarmPermissionPoller.ts',
).text()
// The file-based polling hook must not exist — it read from an
// unauthenticated resolved/ directory where any local process could
// forge approval files.
expect(content).not.toContain('function useSwarmPermissionPoller(')
// The file-based processResponse must not exist
expect(content).not.toContain('function processResponse(')
})
test('poller does not import from permissionSync', async () => {
const content = await file(
'hooks/useSwarmPermissionPoller.ts',
).text()
// Must not import anything from permissionSync — all file-based
// functions have been removed from this module's dependencies
expect(content).not.toContain('permissionSync')
})
test('file-based permission functions are marked deprecated', async () => {
const content = await file(
'utils/swarm/permissionSync.ts',
).text()
// All file-based functions must have @deprecated JSDoc
const deprecatedFns = [
'writePermissionRequest',
'readPendingPermissions',
'readResolvedPermission',
'resolvePermission',
'pollForResponse',
'removeWorkerResponse',
]
for (const fn of deprecatedFns) {
// Find the function and check that @deprecated appears before it
const fnIndex = content.indexOf(`export async function ${fn}(`)
if (fnIndex === -1) continue // submitPermissionRequest is a const, not async function
const preceding = content.slice(Math.max(0, fnIndex - 500), fnIndex)
expect(preceding).toContain('@deprecated')
}
})
test('mailbox-based functions are NOT deprecated', async () => {
const content = await file(
'utils/swarm/permissionSync.ts',
).text()
// These are the active path — must not be deprecated
const activeFns = [
'sendPermissionRequestViaMailbox',
'sendPermissionResponseViaMailbox',
]
for (const fn of activeFns) {
const fnIndex = content.indexOf(`export async function ${fn}(`)
expect(fnIndex).not.toBe(-1)
const preceding = content.slice(Math.max(0, fnIndex - 300), fnIndex)
expect(preceding).not.toContain('@deprecated')
}
})
})


@@ -1,56 +0,0 @@
import type { ToolUseContext } from '../Tool.js'
import type { Command } from '../types/command.js'
import {
benchmarkModel,
benchmarkMultipleModels,
formatBenchmarkResults,
isBenchmarkSupported,
} from '../utils/model/benchmark.js'
import { getOllamaModelOptions } from '../utils/model/ollamaModels.js'
async function runBenchmark(
model?: string,
context?: ToolUseContext,
): Promise<void> {
if (!isBenchmarkSupported()) {
context?.stdout?.write(
'Benchmark not supported for this provider.\n' +
'Supported: OpenAI-compatible endpoints (Ollama, NVIDIA NIM, MiniMax)\n',
)
return
}
let modelsToBenchmark: string[]
if (model) {
modelsToBenchmark = [model]
} else {
const ollamaModels = getOllamaModelOptions()
modelsToBenchmark = ollamaModels.slice(0, 3).map((m) => m.value)
}
context?.stdout?.write(`Benchmarking ${modelsToBenchmark.length} model(s)...\n`)
const results = await benchmarkMultipleModels(
modelsToBenchmark,
(completed, total, result) => {
context?.stdout?.write(
`[${completed}/${total}] ${result.model}: ` +
`${result.success ? result.tokensPerSecond.toFixed(1) + ' tps' : 'FAILED'}\n`,
)
},
)
context?.stdout?.write('\n' + formatBenchmarkResults(results) + '\n')
}
export const benchmark: Command = {
name: 'benchmark',
async onExecute(context: ToolUseContext): Promise<void> {
const args = context.args ?? {}
const model = args.model as string | undefined
await runBenchmark(model, context)
},
}


@@ -66,44 +66,10 @@ import {
import {
getOllamaChatBaseUrl,
getLocalOpenAICompatibleProviderLabel,
probeOllamaGenerationReadiness,
type OllamaGenerationReadiness,
hasLocalOllama,
listOllamaModels,
} from '../../utils/providerDiscovery.js'
function describeOllamaReadinessIssue(
readiness: OllamaGenerationReadiness,
options?: {
baseUrl?: string
allowManualFallback?: boolean
},
): string {
const endpoint = options?.baseUrl ?? 'http://localhost:11434'
if (readiness.state === 'unreachable') {
return `Could not reach Ollama at ${endpoint}. Start Ollama first, then run /provider again.`
}
if (readiness.state === 'no_models') {
const manualSuffix = options?.allowManualFallback
? ', or enter details manually'
: ''
return `Ollama is running, but no installed models were found. Pull a chat model such as qwen2.5-coder:7b or llama3.1:8b first${manualSuffix}.`
}
if (readiness.state === 'generation_failed') {
const modelHint = readiness.probeModel ?? 'the selected model'
const detailSuffix = readiness.detail
? ` Details: ${readiness.detail}.`
: ''
const manualSuffix = options?.allowManualFallback
? ' You can also enter details manually.'
: ''
return `Ollama is reachable and models are installed, but a generation probe failed for ${modelHint}.${detailSuffix} Run "ollama run ${modelHint}" once and retry.${manualSuffix}`
}
return ''
}
type ProviderChoice = 'auto' | ProviderProfile | 'codex-oauth' | 'clear'
type Step =
@@ -749,7 +715,6 @@ function AutoRecommendationStep({
| {
state: 'openai'
defaultModel: string
reason: string
}
| {
state: 'error'
@@ -763,27 +728,19 @@ function AutoRecommendationStep({
void (async () => {
const defaultModel = getGoalDefaultOpenAIModel(goal)
try {
const readiness = await probeOllamaGenerationReadiness()
if (readiness.state !== 'ready') {
const ollamaAvailable = await hasLocalOllama()
if (!ollamaAvailable) {
if (!cancelled) {
setStatus({
state: 'openai',
defaultModel,
reason: describeOllamaReadinessIssue(readiness),
})
setStatus({ state: 'openai', defaultModel })
}
return
}
const recommended = recommendOllamaModel(readiness.models, goal)
const models = await listOllamaModels()
const recommended = recommendOllamaModel(models, goal)
if (!recommended) {
if (!cancelled) {
setStatus({
state: 'openai',
defaultModel,
reason:
'Ollama responded to a generation probe, but no recommended chat model matched this goal.',
})
setStatus({ state: 'openai', defaultModel })
}
return
}
@@ -839,10 +796,10 @@ function AutoRecommendationStep({
<Dialog title="Auto setup fallback" onCancel={onCancel}>
<Box flexDirection="column" gap={1}>
<Text>
Auto setup can continue into OpenAI-compatible setup with a default model of{' '}
No viable local Ollama chat model was detected. Auto setup can
continue into OpenAI-compatible setup with a default model of{' '}
{status.defaultModel}.
</Text>
<Text dimColor>{status.reason}</Text>
<Select
options={[
{ label: 'Continue to OpenAI-compatible setup', value: 'continue' },
@@ -926,19 +883,32 @@ function OllamaModelStep({
let cancelled = false
void (async () => {
const readiness = await probeOllamaGenerationReadiness()
if (readiness.state !== 'ready') {
const available = await hasLocalOllama()
if (!available) {
if (!cancelled) {
setStatus({
state: 'unavailable',
message: describeOllamaReadinessIssue(readiness),
message:
'Could not reach Ollama at http://localhost:11434. Start Ollama first, then run /provider again.',
})
}
return
}
const ranked = rankOllamaModels(readiness.models, 'balanced')
const recommended = recommendOllamaModel(readiness.models, 'balanced')
const models = await listOllamaModels()
if (models.length === 0) {
if (!cancelled) {
setStatus({
state: 'unavailable',
message:
'Ollama is running, but no installed models were found. Pull a chat model such as qwen2.5-coder:7b or llama3.1:8b first.',
})
}
return
}
const ranked = rankOllamaModels(models, 'balanced')
const recommended = recommendOllamaModel(models, 'balanced')
if (!cancelled) {
setStatus({
state: 'ready',


@@ -112,10 +112,8 @@ test('third-party provider branch opens the first-run provider manager', async (
)
expect(output).toContain('Set up provider')
// Use alphabetically-early sentinels so they remain visible in the
// 13-row test frame after the provider list was sorted A→Z.
expect(output).toContain('Anthropic')
expect(output).toContain('Azure OpenAI')
expect(output).toContain('DeepSeek')
expect(output).toContain('Google Gemini')
expect(output).toContain('OpenAI')
expect(output).toContain('Ollama')
expect(output).toContain('LM Studio')
})


@@ -97,47 +97,6 @@ async function waitForCondition(
throw new Error('Timed out waiting for ProviderManager test condition')
}
// Provider list is sorted alphabetically by label in the preset picker, so
// reaching a given provider takes more keypresses than it used to. Keep the
// target-by-label indirection here so these tests survive future list edits
// without further churn.
//
// Order matches ProviderManager.renderPresetSelection() when
// canUseCodexOAuth === true (default in mocked tests).
const PRESET_ORDER = [
'Alibaba Coding Plan',
'Alibaba Coding Plan (China)',
'Anthropic',
'Atomic Chat',
'Azure OpenAI',
'Codex OAuth',
'DeepSeek',
'Google Gemini',
'Groq',
'LM Studio',
'MiniMax',
'Mistral',
'Moonshot AI',
'NVIDIA NIM',
'Ollama',
'OpenAI',
'OpenRouter',
'Together AI',
'Custom',
] as const
async function navigateToPreset(
stdin: { write: (data: string) => void },
label: (typeof PRESET_ORDER)[number],
): Promise<void> {
const index = PRESET_ORDER.indexOf(label)
if (index < 0) throw new Error(`Unknown preset label: ${label}`)
for (let i = 0; i < index; i++) {
stdin.write('j')
await Bun.sleep(25)
}
}
function createDeferred<T>(): {
promise: Promise<T>
resolve: (value: T) => void
@@ -190,21 +149,17 @@ function mockProviderManagerDependencies(
applySavedProfileToCurrentSession?: (...args: unknown[]) => Promise<string | null>
clearCodexCredentials?: () => { success: boolean; warning?: string }
getProviderProfiles?: () => unknown[]
probeOllamaGenerationReadiness?: () => Promise<{
state: 'ready' | 'unreachable' | 'no_models' | 'generation_failed'
models: Array<
{
name: string
sizeBytes?: number | null
family?: string | null
families?: string[]
parameterSize?: string | null
quantizationLevel?: string | null
}
>
probeModel?: string
detail?: string
}>
hasLocalOllama?: () => Promise<boolean>
listOllamaModels?: () => Promise<
Array<{
name: string
sizeBytes?: number | null
family?: string | null
families?: string[]
parameterSize?: string | null
quantizationLevel?: string | null
}>
>
codexSyncRead?: () => unknown
codexAsyncRead?: () => Promise<unknown>
updateProviderProfile?: (...args: unknown[]) => unknown
@@ -234,12 +189,8 @@ function mockProviderManagerDependencies(
})
mock.module('../utils/providerDiscovery.js', () => ({
probeOllamaGenerationReadiness:
options?.probeOllamaGenerationReadiness ??
(async () => ({
state: 'unreachable' as const,
models: [],
})),
hasLocalOllama: options?.hasLocalOllama ?? (async () => false),
listOllamaModels: options?.listOllamaModels ?? (async () => []),
}))
mock.module('../utils/githubModelsCredentials.js', () => ({
@@ -504,22 +455,19 @@ test('ProviderManager first-run Ollama preset auto-detects installed models', as
async () => undefined,
{
addProviderProfile,
probeOllamaGenerationReadiness: async () => ({
state: 'ready',
models: [
{
name: 'gemma4:31b-cloud',
family: 'gemma',
parameterSize: '31b',
},
{
name: 'kimi-k2.5:cloud',
family: 'kimi',
parameterSize: '2.5b',
},
],
probeModel: 'gemma4:31b-cloud',
}),
hasLocalOllama: async () => true,
listOllamaModels: async () => [
{
name: 'gemma4:31b-cloud',
family: 'gemma',
parameterSize: '31b',
},
{
name: 'kimi-k2.5:cloud',
family: 'kimi',
parameterSize: '2.5b',
},
],
},
)
@@ -532,10 +480,11 @@ test('ProviderManager first-run Ollama preset auto-detects installed models', as
await waitForFrameOutput(
mounted.getOutput,
frame => frame.includes('Set up provider'),
frame => frame.includes('Set up provider') && frame.includes('Ollama'),
)
await navigateToPreset(mounted.stdin, 'Ollama')
mounted.stdin.write('j')
await Bun.sleep(50)
mounted.stdin.write('\r')
const modelFrame = await waitForFrameOutput(
@@ -630,7 +579,12 @@ test('ProviderManager first-run Codex OAuth switches the current session after l
frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
)
await navigateToPreset(mounted.stdin, 'Codex OAuth')
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('\r')
await waitForCondition(() => onDone.mock.calls.length > 0)
@@ -722,7 +676,12 @@ test('ProviderManager first-run Codex OAuth reports next-startup fallback when s
frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
)
await navigateToPreset(mounted.stdin, 'Codex OAuth')
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('\r')
await waitForCondition(() => onDone.mock.calls.length > 0)
@@ -816,7 +775,12 @@ test('ProviderManager does not hijack a manual Codex profile when OAuth credenti
frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
)
await navigateToPreset(mounted.stdin, 'Codex OAuth')
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('\r')
await waitForCondition(() => onDone.mock.calls.length > 0)

View File

@@ -37,16 +37,13 @@ import {
readGithubModelsTokenAsync,
} from '../utils/githubModelsCredentials.js'
import {
probeAtomicChatReadiness,
probeOllamaGenerationReadiness,
type AtomicChatReadiness,
type OllamaGenerationReadiness,
hasLocalOllama,
listOllamaModels,
} from '../utils/providerDiscovery.js'
import {
rankOllamaModels,
recommendOllamaModel,
} from '../utils/providerRecommendation.js'
import { redactUrlForDisplay } from '../utils/urlRedaction.js'
import { updateSettingsForSource } from '../utils/settings/settings.js'
import {
type OptionWithDescription,
@@ -55,6 +52,7 @@ import {
import { Pane } from './design-system/Pane.js'
import TextInput from './TextInput.js'
import { useCodexOAuthFlow } from './useCodexOAuthFlow.js'
import { useSetAppState } from '../state/AppState.js'
export type ProviderManagerResult = {
action: 'saved' | 'cancelled'
@@ -71,7 +69,6 @@ type Screen =
| 'menu'
| 'select-preset'
| 'select-ollama-model'
| 'select-atomic-chat-model'
| 'codex-oauth'
| 'form'
| 'select-active'
@@ -92,16 +89,6 @@ type OllamaSelectionState =
}
| { state: 'unavailable'; message: string }
type AtomicChatSelectionState =
| { state: 'idle' }
| { state: 'loading' }
| {
state: 'ready'
options: OptionWithDescription<string>[]
defaultValue?: string
}
| { state: 'unavailable'; message: string }
const FORM_STEPS: Array<{
key: DraftField
label: string
@@ -235,44 +222,6 @@ function getGithubProviderSummary(
return `github-models · ${GITHUB_PROVIDER_DEFAULT_BASE_URL} · ${getGithubProviderModel(processEnv)} · ${credentialSummary}${activeSuffix}`
}
function describeAtomicChatSelectionIssue(
readiness: AtomicChatReadiness,
baseUrl: string,
): string {
if (readiness.state === 'unreachable') {
return `Could not reach Atomic Chat at ${redactUrlForDisplay(baseUrl)}. Start the Atomic Chat app first, or enter the endpoint manually.`
}
if (readiness.state === 'no_models') {
return 'Atomic Chat is running, but no models are loaded. Download and load a model inside the Atomic Chat app first, or enter details manually.'
}
return ''
}
function describeOllamaSelectionIssue(
readiness: OllamaGenerationReadiness,
baseUrl: string,
): string {
if (readiness.state === 'unreachable') {
return `Could not reach Ollama at ${redactUrlForDisplay(baseUrl)}. Start Ollama first, or enter the endpoint manually.`
}
if (readiness.state === 'no_models') {
return 'Ollama is running, but no installed models were found. Pull a chat model such as qwen2.5-coder:7b or llama3.1:8b first, or enter details manually.'
}
if (readiness.state === 'generation_failed') {
const modelHint = readiness.probeModel ?? 'the selected model'
const detailSuffix = readiness.detail
? ` Details: ${readiness.detail}.`
: ''
return `Ollama is reachable and models are installed, but a generation probe failed for ${modelHint}.${detailSuffix} Run "ollama run ${modelHint}" once and retry, or enter details manually.`
}
return ''
}
function findCodexOAuthProfile(
profiles: ProviderProfile[],
profileId?: string,
@@ -384,12 +333,10 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
const initialIsGithubActive = isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB)
const initialHasGithubCredential = initialGithubCredentialSource !== 'none'
// Deferred initialization: useState initializers run synchronously during
// render, so getProviderProfiles() and getActiveProviderProfile() would block
// the UI on first mount (sync file I/O). Use empty initial values and load
// asynchronously in useEffect with queueMicrotask to keep UI responsive.
const [profiles, setProfiles] = React.useState<ProviderProfile[]>([])
const [activeProfileId, setActiveProfileId] = React.useState<string | undefined>()
const [profiles, setProfiles] = React.useState(() => getProviderProfiles())
const [activeProfileId, setActiveProfileId] = React.useState(
() => getActiveProviderProfile()?.id,
)
const [githubProviderAvailable, setGithubProviderAvailable] = React.useState(
() => isGithubProviderAvailable(initialGithubCredentialSource),
)
@@ -423,88 +370,11 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
const [ollamaSelection, setOllamaSelection] = React.useState<OllamaSelectionState>({
state: 'idle',
})
const [atomicChatSelection, setAtomicChatSelection] =
React.useState<AtomicChatSelectionState>({ state: 'idle' })
// Deferred initialization: useState initializers run synchronously during
// render, so getProviderProfiles() and getActiveProviderProfile() would block
// the UI (sync file I/O). Defer to queueMicrotask after first render.
// In test environment, skip defer to avoid timing issues with mocks.
const [isInitializing, setIsInitializing] = React.useState(
process.env.NODE_ENV !== 'test',
)
const [isActivating, setIsActivating] = React.useState(false)
const isRefreshingRef = React.useRef(false)
React.useEffect(() => {
// Skip deferred initialization in test environment (mocks are synchronous)
if (process.env.NODE_ENV === 'test') {
setProfiles(getProviderProfiles())
setActiveProfileId(getActiveProviderProfile()?.id)
setIsInitializing(false)
return
}
queueMicrotask(() => {
const profilesData = getProviderProfiles()
const activeId = getActiveProviderProfile()?.id
setProfiles(profilesData)
setActiveProfileId(activeId)
setIsInitializing(false)
})
}, [])
const currentStep = FORM_STEPS[formStepIndex] ?? FORM_STEPS[0]
const currentStepKey = currentStep.key
const currentValue = draft[currentStepKey]
// Memoize menu options to prevent unnecessary re-renders when navigating
// the select menu. Without this, each arrow key press creates a new options
// array reference, causing Select to re-render and feel sluggish.
const hasProfiles = profiles.length > 0
const hasSelectableProviders = hasProfiles || githubProviderAvailable
const menuOptions = React.useMemo(
() => [
{
value: 'add',
label: 'Add provider',
description: 'Create a new provider profile',
},
{
value: 'activate',
label: 'Set active provider',
description: 'Switch the active provider profile',
disabled: !hasSelectableProviders,
},
{
value: 'edit',
label: 'Edit provider',
description: 'Update URL, model, or key',
disabled: !hasProfiles,
},
{
value: 'delete',
label: 'Delete provider',
description: 'Remove a provider profile',
disabled: !hasSelectableProviders,
},
...(hasStoredCodexOAuthCredentials
? [
{
value: 'logout-codex-oauth',
label: 'Log out Codex OAuth',
description: 'Clear securely stored Codex OAuth credentials',
},
]
: []),
{
value: 'done',
label: 'Done',
description: 'Return to chat',
},
],
[hasSelectableProviders, hasProfiles, hasStoredCodexOAuthCredentials],
)
const refreshGithubProviderState = React.useCallback((): void => {
const envCredentialSource = getGithubCredentialSourceFromEnv()
const githubActive = isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB)
@@ -580,21 +450,32 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
setOllamaSelection({ state: 'loading' })
void (async () => {
const readiness = await probeOllamaGenerationReadiness({
baseUrl: draft.baseUrl,
})
if (readiness.state !== 'ready') {
const available = await hasLocalOllama(draft.baseUrl)
if (!available) {
if (!cancelled) {
setOllamaSelection({
state: 'unavailable',
message: describeOllamaSelectionIssue(readiness, draft.baseUrl),
message:
'Could not reach Ollama. Start Ollama first, or enter the endpoint manually.',
})
}
return
}
const ranked = rankOllamaModels(readiness.models, 'balanced')
const recommended = recommendOllamaModel(readiness.models, 'balanced')
const models = await listOllamaModels(draft.baseUrl)
if (models.length === 0) {
if (!cancelled) {
setOllamaSelection({
state: 'unavailable',
message:
'Ollama is running, but no installed models were found. Pull a chat model such as qwen2.5-coder:7b or llama3.1:8b first, or enter details manually.',
})
}
return
}
const ranked = rankOllamaModels(models, 'balanced')
const recommended = recommendOllamaModel(models, 'balanced')
if (!cancelled) {
setOllamaSelection({
state: 'ready',
@@ -613,61 +494,12 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
}
}, [draft.baseUrl, screen])
React.useEffect(() => {
if (screen !== 'select-atomic-chat-model') {
return
}
let cancelled = false
setAtomicChatSelection({ state: 'loading' })
void (async () => {
const readiness = await probeAtomicChatReadiness({
baseUrl: draft.baseUrl,
})
if (readiness.state !== 'ready') {
if (!cancelled) {
setAtomicChatSelection({
state: 'unavailable',
message: describeAtomicChatSelectionIssue(readiness, draft.baseUrl),
})
}
return
}
if (!cancelled) {
setAtomicChatSelection({
state: 'ready',
defaultValue: readiness.models[0],
options: readiness.models.map(model => ({
label: model,
value: model,
})),
})
}
})()
return () => {
cancelled = true
}
}, [draft.baseUrl, screen])
function refreshProfiles(): void {
// Defer sync I/O to next microtask to prevent UI freeze.
// getProviderProfiles() and getActiveProviderProfile() read config files
// synchronously, which can block the main thread on Windows (antivirus, disk cache).
// queueMicrotask ensures the current render completes first.
if (isRefreshingRef.current) return
isRefreshingRef.current = true
queueMicrotask(() => {
const nextProfiles = getProviderProfiles()
setProfiles(nextProfiles)
setActiveProfileId(getActiveProviderProfile()?.id)
refreshGithubProviderState()
refreshCodexOAuthCredentialState()
isRefreshingRef.current = false
})
const nextProfiles = getProviderProfiles()
setProfiles(nextProfiles)
setActiveProfileId(getActiveProviderProfile()?.id)
refreshGithubProviderState()
refreshCodexOAuthCredentialState()
}
function clearStartupProviderOverrideFromUserSettings(): string | null {
@@ -740,24 +572,12 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
async function activateSelectedProvider(profileId: string): Promise<void> {
let providerLabel = 'provider'
// Set loading state before sync I/O to keep UI responsive
setIsActivating(true)
setStatusMessage('Activating provider...')
try {
// Defer sync I/O to next microtask - UI renders loading state first.
// setActiveProviderProfile(), activateGithubProvider(), and
// clearStartupProviderOverrideFromUserSettings() all perform sync file writes
// (saveGlobalConfig, saveProfileFile, updateSettingsForSource) which can
// block the main thread on Windows (antivirus, disk cache, NTFS metadata).
await new Promise<void>(resolve => queueMicrotask(resolve))
if (profileId === GITHUB_PROVIDER_ID) {
providerLabel = GITHUB_PROVIDER_LABEL
const githubError = activateGithubProvider()
if (githubError) {
setErrorMessage(`Could not activate GitHub provider: ${githubError}`)
setIsActivating(false)
returnToMenu()
return
}
@@ -773,7 +593,6 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
mainLoopModel: GITHUB_PROVIDER_DEFAULT_MODEL,
}))
setStatusMessage(`Active provider: ${GITHUB_PROVIDER_LABEL}`)
setIsActivating(false)
returnToMenu()
return
}
@@ -781,7 +600,6 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
const active = setActiveProviderProfile(profileId)
if (!active) {
setErrorMessage('Could not change active provider.')
setIsActivating(false)
returnToMenu()
return
}
@@ -829,12 +647,10 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
? `Active provider: ${active.name}. Warning: could not clear startup provider override (${settingsOverrideError}).`
: `Active provider: ${active.name}`,
)
setIsActivating(false)
returnToMenu()
} catch (error) {
refreshProfiles()
setStatusMessage(undefined)
setIsActivating(false)
const detail = error instanceof Error ? error.message : String(error)
setErrorMessage(`Could not finish activating ${providerLabel}: ${detail}`)
returnToMenu()
@@ -958,12 +774,6 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
return
}
if (preset === 'atomic-chat') {
setAtomicChatSelection({ state: 'loading' })
setScreen('select-atomic-chat-model')
return
}
setScreen('form')
}
@@ -1039,86 +849,6 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
returnToMenu()
}
function renderAtomicChatSelection(): React.ReactNode {
if (
atomicChatSelection.state === 'loading' ||
atomicChatSelection.state === 'idle'
) {
return (
<Box flexDirection="column" gap={1}>
<Text color="remember" bold>
Checking Atomic Chat
</Text>
<Text dimColor>Looking for loaded Atomic Chat models...</Text>
</Box>
)
}
if (atomicChatSelection.state === 'unavailable') {
return (
<Box flexDirection="column" gap={1}>
<Text color="remember" bold>
Atomic Chat setup
</Text>
<Text dimColor>{atomicChatSelection.message}</Text>
<Select
options={[
{
value: 'manual',
label: 'Enter manually',
description: 'Fill in the base URL and model yourself',
},
{
value: 'back',
label: 'Back',
description: 'Choose another provider preset',
},
]}
onChange={(value: string) => {
if (value === 'manual') {
setFormStepIndex(0)
setCursorOffset(draft.name.length)
setScreen('form')
return
}
setScreen('select-preset')
}}
onCancel={() => setScreen('select-preset')}
visibleOptionCount={2}
/>
</Box>
)
}
return (
<Box flexDirection="column" gap={1}>
<Text color="remember" bold>
Choose an Atomic Chat model
</Text>
<Text dimColor>
Pick one of the models loaded in Atomic Chat to save into a local
provider profile.
</Text>
<Select
options={atomicChatSelection.options}
defaultValue={atomicChatSelection.defaultValue}
defaultFocusValue={atomicChatSelection.defaultValue}
inlineDescriptions
visibleOptionCount={Math.min(8, atomicChatSelection.options.length)}
onChange={(value: string) => {
const nextDraft = {
...draft,
model: value,
}
setDraft(nextDraft)
persistDraft(nextDraft)
}}
onCancel={() => setScreen('select-preset')}
/>
</Box>
)
}
function renderOllamaSelection(): React.ReactNode {
if (ollamaSelection.state === 'loading' || ollamaSelection.state === 'idle') {
return (
@@ -1249,35 +979,21 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
function renderPresetSelection(): React.ReactNode {
const canUseCodexOAuth = !isBareMode()
// Providers sorted alphabetically by label. `Custom` is pinned to the end
// because it's the catch-all / escape hatch — users scanning the list
// should always find known providers first. `Skip for now` (first-run
// only) comes last, after Custom.
const options = [
{
value: 'dashscope-intl',
label: 'Alibaba Coding Plan',
description: 'Alibaba DashScope International endpoint',
},
{
value: 'dashscope-cn',
label: 'Alibaba Coding Plan (China)',
description: 'Alibaba DashScope China endpoint',
},
{
value: 'anthropic',
label: 'Anthropic',
description: 'Native Claude API (x-api-key auth)',
},
{
value: 'atomic-chat',
label: 'Atomic Chat',
description: 'Local Model Provider',
value: 'ollama',
label: 'Ollama',
description: 'Local or remote Ollama endpoint',
},
{
value: 'azure-openai',
label: 'Azure OpenAI',
description: 'Azure OpenAI endpoint (model=deployment name)',
value: 'openai',
label: 'OpenAI',
description: 'OpenAI API with API key',
},
...(canUseCodexOAuth
? [
@@ -1289,6 +1005,11 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
},
]
: []),
{
value: 'moonshotai',
label: 'Moonshot AI',
description: 'Kimi OpenAI-compatible endpoint',
},
{
value: 'deepseek',
label: 'DeepSeek',
@@ -1299,45 +1020,25 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
label: 'Google Gemini',
description: 'Gemini OpenAI-compatible endpoint',
},
{
value: 'together',
label: 'Together AI',
description: 'Together chat/completions endpoint',
},
{
value: 'groq',
label: 'Groq',
description: 'Groq OpenAI-compatible endpoint',
},
{
value: 'lmstudio',
label: 'LM Studio',
description: 'Local LM Studio endpoint',
},
{
value: 'minimax',
label: 'MiniMax',
description: 'MiniMax API endpoint',
},
{
value: 'mistral',
label: 'Mistral',
description: 'Mistral OpenAI-compatible endpoint',
},
{
value: 'moonshotai',
label: 'Moonshot AI',
description: 'Kimi OpenAI-compatible endpoint',
},
{
value: 'nvidia-nim',
label: 'NVIDIA NIM',
description: 'NVIDIA NIM endpoint',
},
{
value: 'ollama',
label: 'Ollama',
description: 'Local or remote Ollama endpoint',
},
{
value: 'openai',
label: 'OpenAI',
description: 'OpenAI API with API key',
value: 'azure-openai',
label: 'Azure OpenAI',
description: 'Azure OpenAI endpoint (model=deployment name)',
},
{
value: 'openrouter',
@@ -1345,15 +1046,35 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
description: 'OpenRouter OpenAI-compatible endpoint',
},
{
value: 'together',
label: 'Together AI',
description: 'Together chat/completions endpoint',
value: 'lmstudio',
label: 'LM Studio',
description: 'Local LM Studio endpoint',
},
{
value: 'dashscope-cn',
label: 'Alibaba Coding Plan (China)',
description: 'Alibaba DashScope China endpoint',
},
{
value: 'dashscope-intl',
label: 'Alibaba Coding Plan',
description: 'Alibaba DashScope International endpoint',
},
{
value: 'custom',
label: 'Custom',
description: 'Any OpenAI-compatible provider',
},
{
value: 'nvidia-nim',
label: 'NVIDIA NIM',
description: 'NVIDIA NIM endpoint',
},
{
value: 'minimax',
label: 'MiniMax',
description: 'MiniMax API endpoint',
},
...(mode === 'first-run'
? [
{
@@ -1444,10 +1165,49 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
}
function renderMenu(): React.ReactNode {
// Use memoized menuOptions from component scope
const hasProfiles = profiles.length > 0
const hasSelectableProviders = hasProfiles || githubProviderAvailable
const options = [
{
value: 'add',
label: 'Add provider',
description: 'Create a new provider profile',
},
{
value: 'activate',
label: 'Set active provider',
description: 'Switch the active provider profile',
disabled: !hasSelectableProviders,
},
{
value: 'edit',
label: 'Edit provider',
description: 'Update URL, model, or key',
disabled: !hasProfiles,
},
{
value: 'delete',
label: 'Delete provider',
description: 'Remove a provider profile',
disabled: !hasSelectableProviders,
},
...(hasStoredCodexOAuthCredentials
? [
{
value: 'logout-codex-oauth',
label: 'Log out Codex OAuth',
description: 'Clear securely stored Codex OAuth credentials',
},
]
: []),
{
value: 'done',
label: 'Done',
description: 'Return to chat',
},
]
return (
<Box flexDirection="column" gap={1}>
<Text color="remember" bold>
@@ -1484,7 +1244,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
)}
</Box>
<Select
options={menuOptions}
options={options}
onChange={(value: string) => {
setErrorMessage(undefined)
switch (value) {
@@ -1497,7 +1257,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
}
break
case 'edit':
if (hasProfiles) {
if (profiles.length > 0) {
setScreen('select-edit')
}
break
@@ -1554,7 +1314,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
}}
onCancel={() => closeWithCancelled('Provider manager closed')}
defaultFocusValue={menuFocusValue}
visibleOptionCount={menuOptions.length}
visibleOptionCount={options.length}
/>
</Box>
)
@@ -1633,9 +1393,6 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
case 'select-ollama-model':
content = renderOllamaSelection()
break
case 'select-atomic-chat-model':
content = renderAtomicChatSelection()
break
case 'codex-oauth':
content = (
<CodexOAuthSetup
@@ -1793,21 +1550,5 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
break
}
return (
<Pane color="permission">
{isInitializing ? (
<Box flexDirection="column" gap={1}>
<Text color="remember" bold>Loading providers...</Text>
<Text dimColor>Reading provider profiles from disk.</Text>
</Box>
) : isActivating ? (
<Box flexDirection="column" gap={1}>
<Text color="remember" bold>Activating provider...</Text>
<Text dimColor>Please wait while the provider is being configured.</Text>
</Box>
) : (
content
)}
</Pane>
)
return <Pane color="permission">{content}</Pane>
}

View File

@@ -281,24 +281,6 @@ export function Config({
enabled: autoCompactEnabled
});
}
}, {
id: 'toolHistoryCompressionEnabled',
label: 'Tool history compression',
value: globalConfig.toolHistoryCompressionEnabled,
type: 'boolean' as const,
onChange(toolHistoryCompressionEnabled: boolean) {
saveGlobalConfig(current => ({
...current,
toolHistoryCompressionEnabled
}));
setGlobalConfig({
...getGlobalConfig(),
toolHistoryCompressionEnabled
});
logEvent('tengu_tool_history_compression_setting_changed', {
enabled: toolHistoryCompressionEnabled
});
}
}, {
id: 'spinnerTipsEnabled',
label: 'Show tips',
@@ -1176,9 +1158,6 @@ export function Config({
if (globalConfig.autoCompactEnabled !== initialConfig.current.autoCompactEnabled) {
formattedChanges.push(`${globalConfig.autoCompactEnabled ? 'Enabled' : 'Disabled'} auto-compact`);
}
if (globalConfig.toolHistoryCompressionEnabled !== initialConfig.current.toolHistoryCompressionEnabled) {
formattedChanges.push(`${globalConfig.toolHistoryCompressionEnabled ? 'Enabled' : 'Disabled'} tool history compression`);
}
if (globalConfig.respectGitignore !== initialConfig.current.respectGitignore) {
formattedChanges.push(`${globalConfig.respectGitignore ? 'Enabled' : 'Disabled'} respect .gitignore in file picker`);
}

View File

@@ -1,158 +0,0 @@
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
import { detectProvider } from './StartupScreen.js'
const ENV_KEYS = [
'CLAUDE_CODE_USE_OPENAI',
'CLAUDE_CODE_USE_GEMINI',
'CLAUDE_CODE_USE_GITHUB',
'CLAUDE_CODE_USE_BEDROCK',
'CLAUDE_CODE_USE_VERTEX',
'CLAUDE_CODE_USE_MISTRAL',
'OPENAI_BASE_URL',
'OPENAI_API_KEY',
'OPENAI_MODEL',
'GEMINI_MODEL',
'MISTRAL_MODEL',
'ANTHROPIC_MODEL',
'NVIDIA_NIM',
'MINIMAX_API_KEY',
]
const originalEnv: Record<string, string | undefined> = {}
beforeEach(() => {
for (const key of ENV_KEYS) {
originalEnv[key] = process.env[key]
delete process.env[key]
}
})
afterEach(() => {
for (const key of ENV_KEYS) {
if (originalEnv[key] === undefined) {
delete process.env[key]
} else {
process.env[key] = originalEnv[key]
}
}
})
function setupOpenAIMode(baseUrl: string, model: string): void {
process.env.CLAUDE_CODE_USE_OPENAI = '1'
process.env.OPENAI_BASE_URL = baseUrl
process.env.OPENAI_MODEL = model
process.env.OPENAI_API_KEY = 'test-key'
}
// --- Issue #855: aggregator URL must win over vendor-prefixed model name ---
describe('detectProvider — aggregator URL authoritative over model-name substring (#855)', () => {
test('OpenRouter + deepseek/deepseek-chat labels as OpenRouter', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'deepseek/deepseek-chat')
expect(detectProvider().name).toBe('OpenRouter')
})
test('OpenRouter + moonshotai/kimi-k2 labels as OpenRouter', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'moonshotai/kimi-k2')
expect(detectProvider().name).toBe('OpenRouter')
})
test('OpenRouter + mistralai/mistral-large labels as OpenRouter', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'mistralai/mistral-large')
expect(detectProvider().name).toBe('OpenRouter')
})
test('OpenRouter + meta-llama/llama-3.3 labels as OpenRouter', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'meta-llama/llama-3.3-70b-instruct')
expect(detectProvider().name).toBe('OpenRouter')
})
test('Together + deepseek-ai/DeepSeek-V3 labels as Together AI', () => {
setupOpenAIMode('https://api.together.xyz/v1', 'deepseek-ai/DeepSeek-V3')
expect(detectProvider().name).toBe('Together AI')
})
test('Together + meta-llama/Llama-3.3 labels as Together AI', () => {
setupOpenAIMode('https://api.together.xyz/v1', 'meta-llama/Llama-3.3-70B-Instruct-Turbo')
expect(detectProvider().name).toBe('Together AI')
})
test('Groq + deepseek-r1-distill-llama-70b labels as Groq', () => {
setupOpenAIMode('https://api.groq.com/openai/v1', 'deepseek-r1-distill-llama-70b')
expect(detectProvider().name).toBe('Groq')
})
test('Groq + llama-3.3-70b-versatile labels as Groq', () => {
setupOpenAIMode('https://api.groq.com/openai/v1', 'llama-3.3-70b-versatile')
expect(detectProvider().name).toBe('Groq')
})
test('Azure + any deepseek deployment labels as Azure OpenAI', () => {
setupOpenAIMode('https://my-resource.openai.azure.com/', 'deepseek-chat')
expect(detectProvider().name).toBe('Azure OpenAI')
})
})
// --- Direct vendor endpoints still label correctly (regression) ---
describe('detectProvider — direct vendor endpoints', () => {
test('api.deepseek.com labels as DeepSeek', () => {
setupOpenAIMode('https://api.deepseek.com/v1', 'deepseek-chat')
expect(detectProvider().name).toBe('DeepSeek')
})
test('api.moonshot.cn labels as Moonshot (Kimi)', () => {
setupOpenAIMode('https://api.moonshot.cn/v1', 'moonshot-v1-8k')
expect(detectProvider().name).toBe('Moonshot (Kimi)')
})
test('api.mistral.ai labels as Mistral', () => {
setupOpenAIMode('https://api.mistral.ai/v1', 'mistral-large-latest')
expect(detectProvider().name).toBe('Mistral')
})
test('default OpenAI URL + gpt-4o labels as OpenAI', () => {
setupOpenAIMode('https://api.openai.com/v1', 'gpt-4o')
expect(detectProvider().name).toBe('OpenAI')
})
})
// --- rawModel fallback for generic/custom endpoints ---
describe('detectProvider — rawModel fallback when URL is generic', () => {
test('custom proxy + deepseek-chat falls back to DeepSeek', () => {
setupOpenAIMode('https://my-proxy.internal/v1', 'deepseek-chat')
expect(detectProvider().name).toBe('DeepSeek')
})
test('custom proxy + kimi-k2 falls back to Moonshot (Kimi)', () => {
setupOpenAIMode('https://my-proxy.internal/v1', 'kimi-k2-instruct')
expect(detectProvider().name).toBe('Moonshot (Kimi)')
})
test('custom proxy + llama-3.3 falls back to Meta Llama', () => {
setupOpenAIMode('https://my-proxy.internal/v1', 'llama-3.3-70b')
expect(detectProvider().name).toBe('Meta Llama')
})
test('custom proxy + mistral-large falls back to Mistral', () => {
setupOpenAIMode('https://my-proxy.internal/v1', 'mistral-large-latest')
expect(detectProvider().name).toBe('Mistral')
})
})
// --- Explicit env flags win over URL heuristics ---
describe('detectProvider — explicit dedicated-provider env flags', () => {
test('NVIDIA_NIM=1 overrides aggregator URL', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'some-nim-model')
process.env.NVIDIA_NIM = '1'
expect(detectProvider().name).toBe('NVIDIA NIM')
})
test('MINIMAX_API_KEY overrides aggregator URL', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'any-model')
process.env.MINIMAX_API_KEY = 'test-key'
expect(detectProvider().name).toBe('MiniMax')
})
})

View File

@@ -83,7 +83,7 @@ const LOGO_CLAUDE = [
// ─── Provider detection ───────────────────────────────────────────────────────
export function detectProvider(): { name: string; model: string; baseUrl: string; isLocal: boolean } {
function detectProvider(): { name: string; model: string; baseUrl: string; isLocal: boolean } {
const useGemini = process.env.CLAUDE_CODE_USE_GEMINI === '1' || process.env.CLAUDE_CODE_USE_GEMINI === 'true'
const useGithub = process.env.CLAUDE_CODE_USE_GITHUB === '1' || process.env.CLAUDE_CODE_USE_GITHUB === 'true'
const useOpenAI = process.env.CLAUDE_CODE_USE_OPENAI === '1' || process.env.CLAUDE_CODE_USE_OPENAI === 'true'
@@ -117,34 +117,28 @@ export function detectProvider(): { name: string; model: string; baseUrl: string
const baseUrl = resolvedRequest.baseUrl
const isLocal = isLocalProviderUrl(baseUrl)
let name = 'OpenAI'
// Explicit dedicated-provider env flags win.
if (process.env.NVIDIA_NIM) name = 'NVIDIA NIM'
else if (process.env.MINIMAX_API_KEY) name = 'MiniMax'
else if (
resolvedRequest.transport === 'codex_responses' ||
baseUrl.includes('chatgpt.com/backend-api/codex')
)
if (/nvidia/i.test(baseUrl) || /nvidia/i.test(rawModel) || process.env.NVIDIA_NIM)
name = 'NVIDIA NIM'
else if (/minimax/i.test(baseUrl) || /minimax/i.test(rawModel) || process.env.MINIMAX_API_KEY)
name = 'MiniMax'
else if (resolvedRequest.transport === 'codex_responses' || baseUrl.includes('chatgpt.com/backend-api/codex'))
name = 'Codex'
// Base URL is authoritative — must precede rawModel checks so aggregators
// (OpenRouter/Together/Groq) aren't mislabelled as DeepSeek/Kimi/etc.
// when routed to models whose IDs contain a vendor prefix. See issue #855.
else if (/openrouter/i.test(baseUrl)) name = 'OpenRouter'
else if (/together/i.test(baseUrl)) name = 'Together AI'
else if (/groq/i.test(baseUrl)) name = 'Groq'
else if (/azure/i.test(baseUrl)) name = 'Azure OpenAI'
else if (/nvidia/i.test(baseUrl)) name = 'NVIDIA NIM'
else if (/minimax/i.test(baseUrl)) name = 'MiniMax'
else if (/moonshot/i.test(baseUrl)) name = 'Moonshot (Kimi)'
else if (/deepseek/i.test(baseUrl)) name = 'DeepSeek'
else if (/mistral/i.test(baseUrl)) name = 'Mistral'
// rawModel fallback — fires only when base URL is generic/custom.
else if (/nvidia/i.test(rawModel)) name = 'NVIDIA NIM'
else if (/minimax/i.test(rawModel)) name = 'MiniMax'
else if (/kimi/i.test(rawModel)) name = 'Moonshot (Kimi)'
else if (/deepseek/i.test(rawModel)) name = 'DeepSeek'
else if (/mistral/i.test(rawModel)) name = 'Mistral'
else if (/llama/i.test(rawModel)) name = 'Meta Llama'
else if (isLocal) name = getLocalOpenAICompatibleProviderLabel(baseUrl)
else if (/deepseek/i.test(baseUrl) || /deepseek/i.test(rawModel))
name = 'DeepSeek'
else if (/openrouter/i.test(baseUrl))
name = 'OpenRouter'
else if (/together/i.test(baseUrl))
name = 'Together AI'
else if (/groq/i.test(baseUrl))
name = 'Groq'
else if (/mistral/i.test(baseUrl) || /mistral/i.test(rawModel))
name = 'Mistral'
else if (/azure/i.test(baseUrl))
name = 'Azure OpenAI'
else if (/llama/i.test(rawModel))
name = 'Meta Llama'
else if (isLocal)
name = getLocalOpenAICompatibleProviderLabel(baseUrl)
// Resolve model alias to actual model name + reasoning effort
let displayModel = resolvedRequest.resolvedModel
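// Why the check order above matters, with hypothetical values: given
// baseUrl=https://openrouter.ai/api/v1 and rawModel=deepseek/deepseek-chat,
// testing /deepseek/i against rawModel before the /openrouter/i base-URL
// check yields 'DeepSeek', while testing the base URL first yields
// 'OpenRouter'. That ordering difference is exactly what issue #855 and the
// StartupScreen detectProvider tests elsewhere in this diff pin down.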

View File

@@ -53,20 +53,17 @@ describe('getProjectMemoryPathForSelector', () => {
})
test('defaults to a new AGENTS.md in the current cwd when no project file is loaded', () => {
const cwd = join('/repo', 'packages', 'app')
expect(getProjectMemoryPathForSelector([], cwd)).toBe(
join(cwd, 'AGENTS.md'),
expect(getProjectMemoryPathForSelector([], '/repo/packages/app')).toBe(
'/repo/packages/app/AGENTS.md',
)
})
test('ignores loaded project instruction files outside the current cwd ancestry', () => {
const outsideRepoPath = join('/other-worktree', 'AGENTS.md')
const cwd = join('/repo', 'packages', 'app')
expect(
getProjectMemoryPathForSelector(
[projectFile(outsideRepoPath)],
cwd,
[projectFile('/other-worktree/AGENTS.md')],
'/repo/packages/app',
),
).toBe(join(cwd, 'AGENTS.md'))
).toBe('/repo/packages/app/AGENTS.md')
})
})

View File

@@ -823,11 +823,6 @@ function getFunctionResultClearingSection(model: string): string | null {
return null
}
const config = getCachedMCConfigForFRC()
if (!config) {
// External/stub builds return null from getCachedMCConfig — abort the
// section rather than trying to read .supportedModels off null.
return null
}
const isModelSupported = config.supportedModels?.some(pattern =>
model.includes(pattern),
)

View File

@@ -19,7 +19,7 @@ async function _temp() {
logForDebugging("Showing marketplace config save failure notification");
notifs.push({
key: "marketplace-config-save-failed",
jsx: <Text color="error">Failed to save marketplace retry info · Check ~/.openclaude.json permissions</Text>,
jsx: <Text color="error">Failed to save marketplace retry info · Check ~/.claude.json permissions</Text>,
priority: "immediate",
timeoutMs: 10000
});

View File

@@ -1,8 +1,5 @@
import { expect, test } from 'bun:test'
import {
shouldHandleInputAsPaste,
supportsClipboardImageFallback,
} from './usePasteHandler.ts'
import { supportsClipboardImageFallback } from './usePasteHandler.ts'
test('supports clipboard image fallback on Windows', () => {
expect(supportsClipboardImageFallback('windows')).toBe(true)
@@ -23,42 +20,3 @@ test('does not support clipboard image fallback on WSL', () => {
test('does not support clipboard image fallback on unknown platforms', () => {
expect(supportsClipboardImageFallback('unknown')).toBe(false)
})
test('does not treat a bracketed paste as pending when no paste handlers are provided', () => {
expect(
shouldHandleInputAsPaste({
hasTextPasteHandler: false,
hasImagePasteHandler: false,
inputLength: 'kimi-k2.5'.length,
pastePending: false,
hasImageFilePath: false,
isFromPaste: true,
}),
).toBe(false)
})
test('treats bracketed text paste as pending when a text paste handler exists', () => {
expect(
shouldHandleInputAsPaste({
hasTextPasteHandler: true,
hasImagePasteHandler: false,
inputLength: 'kimi-k2.5'.length,
pastePending: false,
hasImageFilePath: false,
isFromPaste: true,
}),
).toBe(true)
})
test('treats image path paste as pending when only an image handler exists', () => {
expect(
shouldHandleInputAsPaste({
hasTextPasteHandler: false,
hasImagePasteHandler: true,
inputLength: 'C:\\Users\\jat\\image.png'.length,
pastePending: false,
hasImageFilePath: true,
isFromPaste: false,
}),
).toBe(true)
})

View File

@@ -35,24 +35,6 @@ type PasteHandlerProps = {
) => void
}
export function shouldHandleInputAsPaste(options: {
hasTextPasteHandler: boolean
hasImagePasteHandler: boolean
inputLength: number
pastePending: boolean
hasImageFilePath: boolean
isFromPaste: boolean
}): boolean {
return (
(options.hasTextPasteHandler &&
(options.inputLength > PASTE_THRESHOLD ||
options.pastePending ||
options.hasImageFilePath ||
options.isFromPaste)) ||
(options.hasImagePasteHandler && options.hasImageFilePath)
)
}
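// Behavior at a glance, mirroring the usePasteHandler tests in this diff
// (the inputLength of 9 matches their 'kimi-k2.5'.length): a bracketed paste
// with no handlers registered is ignored, while the same event with a text
// paste handler becomes a pending paste:
//   shouldHandleInputAsPaste({ hasTextPasteHandler: false, hasImagePasteHandler: false,
//     inputLength: 9, pastePending: false, hasImageFilePath: false, isFromPaste: true })
//     // → false
//   shouldHandleInputAsPaste({ hasTextPasteHandler: true, hasImagePasteHandler: false,
//     inputLength: 9, pastePending: false, hasImageFilePath: false, isFromPaste: true })
//     // → true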
export function usePasteHandler({
onPaste,
onInput,
@@ -254,6 +236,11 @@ export function usePasteHandler({
// The keypress parser sets isPasted=true for content within bracketed paste.
const isFromPaste = event.keypress.isPasted
// If this is pasted content, set isPasting state for UI feedback
if (isFromPaste) {
setIsPasting(true)
}
// Handle large pastes (>PASTE_THRESHOLD chars)
// Usually we get one or two input characters at a time. If we
// get more than the threshold, the user has probably pasted.
@@ -281,7 +268,6 @@ export function usePasteHandler({
canFallbackToClipboardImage &&
onImagePaste
) {
setIsPasting(true)
checkClipboardForImage()
// Reset isPasting since there's no text content to process
setIsPasting(false)
@@ -289,17 +275,14 @@ export function usePasteHandler({
}
// Check if we should handle as paste (from bracketed paste, large input, or continuation)
const shouldHandleAsPaste = shouldHandleInputAsPaste({
hasTextPasteHandler: Boolean(onPaste),
hasImagePasteHandler: Boolean(onImagePaste),
inputLength: input.length,
pastePending: pastePendingRef.current,
hasImageFilePath,
isFromPaste,
})
const shouldHandleAsPaste =
onPaste &&
(input.length > PASTE_THRESHOLD ||
pastePendingRef.current ||
hasImageFilePath ||
isFromPaste)
if (shouldHandleAsPaste) {
setIsPasting(true)
pastePendingRef.current = true
setPasteState(({ chunks, timeoutId }) => {
return {

View File

@@ -1,23 +1,34 @@
/**
* Swarm Permission Callback Registry
* Swarm Permission Poller Hook
*
* Manages callback registrations for permission requests and responses
* in agent swarms. Responses are delivered exclusively via the mailbox
* system (useInboxPoller → processMailboxPermissionResponse).
* This hook polls for permission responses from the team leader when running
* as a worker agent in a swarm. When a response is received, it calls the
* appropriate callback (onAllow/onReject) to continue execution.
*
* The legacy file-based polling (resolved/ directory) has been removed
* because it created an unauthenticated attack surface — any local process
* could forge approval files. The mailbox path is the sole active channel.
* This hook should be used in conjunction with the worker-side integration
* in useCanUseTool.ts, which creates pending requests that this hook monitors.
*/
import { useCallback, useEffect, useRef } from 'react'
import { useInterval } from 'usehooks-ts'
import { logForDebugging } from '../utils/debug.js'
import { errorMessage } from '../utils/errors.js'
import {
type PermissionUpdate,
permissionUpdateSchema,
} from '../utils/permissions/PermissionUpdateSchema.js'
import {
isSwarmWorker,
type PermissionResponse,
pollForResponse,
removeWorkerResponse,
} from '../utils/swarm/permissionSync.js'
import { getAgentName, getTeamName } from '../utils/teammate.js'
const POLL_INTERVAL_MS = 500
/**
* Validate permissionUpdates from external sources (mailbox IPC).
* Validate permissionUpdates from external sources (mailbox IPC, disk polling).
* Malformed entries from buggy/old teammate processes are filtered out rather
* than propagated unchecked into callback.onAllow().
*/
@@ -214,9 +225,106 @@ export function processSandboxPermissionResponse(params: {
return true
}
// Legacy file-based polling (useSwarmPermissionPoller, processResponse)
// has been removed. Permission responses are now delivered exclusively
// via the mailbox system:
// Leader: sendPermissionResponseViaMailbox() → writeToMailbox()
// Worker: useInboxPoller → processMailboxPermissionResponse()
// See: fix(security) — remove unauthenticated file-based permission channel
/**
* Process a permission response by invoking the registered callback
*/
function processResponse(response: PermissionResponse): boolean {
const callback = pendingCallbacks.get(response.requestId)
if (!callback) {
logForDebugging(
`[SwarmPermissionPoller] No callback registered for request ${response.requestId}`,
)
return false
}
logForDebugging(
`[SwarmPermissionPoller] Processing response for request ${response.requestId}: ${response.decision}`,
)
// Remove from registry before invoking callback
pendingCallbacks.delete(response.requestId)
if (response.decision === 'approved') {
const permissionUpdates = parsePermissionUpdates(response.permissionUpdates)
const updatedInput = response.updatedInput
callback.onAllow(updatedInput, permissionUpdates)
} else {
callback.onReject(response.feedback)
}
return true
}
/**
* Hook that polls for permission responses when running as a swarm worker.
*
* This hook:
* 1. Only activates when isSwarmWorker() returns true
* 2. Polls every 500ms for responses
* 3. When a response is found, invokes the registered callback
* 4. Cleans up the response file after processing
*/
export function useSwarmPermissionPoller(): void {
const isProcessingRef = useRef(false)
const poll = useCallback(async () => {
// Don't poll if not a swarm worker
if (!isSwarmWorker()) {
return
}
// Prevent concurrent polling
if (isProcessingRef.current) {
return
}
// Don't poll if no callbacks are registered
if (pendingCallbacks.size === 0) {
return
}
isProcessingRef.current = true
try {
const agentName = getAgentName()
const teamName = getTeamName()
if (!agentName || !teamName) {
return
}
// Check each pending request for a response
for (const [requestId, _callback] of pendingCallbacks) {
const response = await pollForResponse(requestId, agentName, teamName)
if (response) {
// Process the response
const processed = processResponse(response)
if (processed) {
// Clean up the response from the worker's inbox
await removeWorkerResponse(requestId, agentName, teamName)
}
}
}
} catch (error) {
logForDebugging(
`[SwarmPermissionPoller] Error during poll: ${errorMessage(error)}`,
)
} finally {
isProcessingRef.current = false
}
}, [])
// Only poll if we're a swarm worker
const shouldPoll = isSwarmWorker()
useInterval(() => void poll(), shouldPoll ? POLL_INTERVAL_MS : null)
// Initial poll on mount
useEffect(() => {
if (isSwarmWorker()) {
void poll()
}
}, [poll])
}

View File

@@ -11,16 +11,14 @@ const execFileNoThrowMock = mock(
async () => ({ code: 0, stdout: '', stderr: '' }),
)
function installOscMocks(): void {
mock.module('../../utils/execFileNoThrow.js', () => ({
execFileNoThrow: execFileNoThrowMock,
execFileNoThrowWithCwd: execFileNoThrowMock,
}))
mock.module('../../utils/execFileNoThrow.js', () => ({
execFileNoThrow: execFileNoThrowMock,
execFileNoThrowWithCwd: execFileNoThrowMock,
}))
mock.module('../../utils/tempfile.js', () => ({
generateTempFilePath: generateTempFilePathMock,
}))
}
mock.module('../../utils/tempfile.js', () => ({
generateTempFilePath: generateTempFilePathMock,
}))
async function importFreshOscModule() {
return import(`./osc.ts?ts=${Date.now()}-${Math.random()}`)
@@ -47,7 +45,6 @@ async function waitForExecCall(
describe('Windows clipboard fallback', () => {
beforeEach(() => {
installOscMocks()
execFileNoThrowMock.mockClear()
generateTempFilePathMock.mockClear()
process.env = { ...originalEnv }
@@ -65,12 +62,14 @@ describe('Windows clipboard fallback', () => {
const { setClipboard } = await importFreshOscModule()
await setClipboard('Привет мир')
const windowsCall = await waitForExecCall('powershell')
await flushClipboardCopy()
expect(execFileNoThrowMock.mock.calls.some(([cmd]) => cmd === 'clip')).toBe(
false,
)
expect(windowsCall).toBeDefined()
expect(
execFileNoThrowMock.mock.calls.some(([cmd]) => cmd === 'powershell'),
).toBe(true)
})
test('passes Windows clipboard text through a UTF-8 temp file instead of stdin', async () => {
@@ -98,7 +97,6 @@ describe('Windows clipboard fallback', () => {
describe('clipboard path behavior remains stable', () => {
beforeEach(() => {
installOscMocks()
execFileNoThrowMock.mockClear()
process.env = { ...originalEnv }
delete process.env['SSH_CONNECTION']

View File

@@ -12,7 +12,7 @@ import {
* One-shot migration: clear skipAutoPermissionPrompt for users who accepted
* the old 2-option AutoModeOptInDialog but don't have auto as their default.
* Re-surfaces the dialog so they see the new "make it my default mode" option.
* Guard lives in GlobalConfig (~/.openclaude.json), not settings.json, so it
* Guard lives in GlobalConfig (~/.claude.json), not settings.json, so it
* survives settings resets and doesn't re-arm itself.
*
* Only runs when tengu_auto_mode_config.enabled === 'enabled'. For 'opt-in'

View File

@@ -3873,7 +3873,7 @@ export function REPL({
// empty to non-empty, not on every length change -- otherwise a render loop
// (concurrent onQuery thrashing, etc.) spams saveGlobalConfig, which hits
// ELOCKED under concurrent sessions and falls back to unlocked writes.
// That write storm is the primary trigger for ~/.openclaude.json corruption
// That write storm is the primary trigger for ~/.claude.json corruption
// (GH #3117).
const hasCountedQueueUseRef = useRef(false);
useEffect(() => {

View File

@@ -334,7 +334,7 @@ async function processRemoteEvalPayload(
// Empty object is truthy — without the length check, `{features: {}}`
// (transient server bug, truncated response) would pass, clear the maps
// below, return true, and syncRemoteEvalToDisk would wholesale-write `{}`
// to disk: total flag blackout for every process sharing ~/.openclaude.json.
// to disk: total flag blackout for every process sharing ~/.claude.json.
if (!payload?.features || Object.keys(payload.features).length === 0) {
return false
}
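// The pitfall in isolation (standalone illustration, not repo code):
//   Boolean({})             // true: an empty features object is still truthy
//   Object.keys({}).length  // 0: only the explicit length check rejects a
//                           // transient `{ features: {} }` payload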

View File

@@ -23,7 +23,6 @@ import { randomUUID } from 'crypto'
import {
getAPIProvider,
isFirstPartyAnthropicBaseUrl,
isGithubNativeAnthropicMode,
} from 'src/utils/model/providers.js'
import {
getAttributionHeader,
@@ -335,13 +334,8 @@ export function getPromptCachingEnabled(model: string): boolean {
// Prompt caching is an Anthropic-specific feature. Third-party providers
// do not understand cache_control blocks and strict backends (e.g. Azure
// Foundry) reject or flag requests that contain them.
//
// Exception: when the GitHub provider is configured in native Anthropic API
// mode (CLAUDE_CODE_GITHUB_ANTHROPIC_API=1), requests are sent in Anthropic
// format, so cache_control blocks are supported.
const provider = getAPIProvider()
const isNativeGithub = isGithubNativeAnthropicMode(model)
if (provider !== 'firstParty' && provider !== 'bedrock' && provider !== 'vertex' && !isNativeGithub) {
if (provider !== 'firstParty' && provider !== 'bedrock' && provider !== 'vertex') {
return false
}
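// For reference, the kind of annotation strict backends reject: an Anthropic
// prompt-caching marker on a content block, e.g. (shape per Anthropic's
// Messages API; the request assembly itself lives elsewhere):
//   { type: 'text', text: '...large stable system prompt...',
//     cache_control: { type: 'ephemeral' } }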
@@ -1217,7 +1211,7 @@ async function* queryModel(
cachedMCEnabled = featureEnabled && modelSupported
const config = getCachedMCConfig()
logForDebugging(
`Cached MC gate: enabled=${featureEnabled} modelSupported=${modelSupported} model=${options.model} supportedModels=${jsonStringify(config?.supportedModels)}`,
`Cached MC gate: enabled=${featureEnabled} modelSupported=${modelSupported} model=${options.model} supportedModels=${jsonStringify(config.supportedModels)}`,
)
}

View File

@@ -14,7 +14,6 @@ import { getSmallFastModel } from 'src/utils/model/model.js'
import {
getAPIProvider,
isFirstPartyAnthropicBaseUrl,
isGithubNativeAnthropicMode,
} from 'src/utils/model/providers.js'
import { getProxyFetchOptions } from 'src/utils/proxy.js'
import {
@@ -175,25 +174,6 @@ export async function getAnthropicClient({
providerOverride,
}) as unknown as Anthropic
}
// GitHub provider in native Anthropic API mode: send requests in Anthropic
// format so cache_control blocks are honoured and prompt caching works.
// Requires the GitHub endpoint (OPENAI_BASE_URL) to support Anthropic's
// messages API — set CLAUDE_CODE_GITHUB_ANTHROPIC_API=1 to opt in.
if (isGithubNativeAnthropicMode(model)) {
const githubBaseUrl =
process.env.OPENAI_BASE_URL?.replace(/\/$/, '') ??
'https://api.githubcopilot.com'
const githubToken =
process.env.GITHUB_TOKEN ?? process.env.GH_TOKEN ?? ''
const nativeArgs: ConstructorParameters<typeof Anthropic>[0] = {
...ARGS,
baseURL: githubBaseUrl,
authToken: githubToken,
// No apiKey — we authenticate via Bearer token (authToken)
apiKey: null,
}
return new Anthropic(nativeArgs)
}
if (
isEnvTruthy(process.env.CLAUDE_CODE_USE_OPENAI) ||
isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB) ||

View File

@@ -8,7 +8,6 @@ import {
convertCodexResponseToAnthropicMessage,
convertToolsToResponsesTools,
} from './codexShim.js'
import { __test as webSearchToolTest } from '../../tools/WebSearchTool/WebSearchTool.js'
const tempDirs: string[] = []
const originalEnv = {
@@ -610,164 +609,6 @@ describe('Codex request translation', () => {
])
})
test('recovers Codex web search text and sources from sparse completed response', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
sources: [
{
title: 'OpenClaude repo',
url: 'https://github.com/example/openclaude',
},
],
},
{
type: 'message',
role: 'assistant',
content: [
{
type: 'text',
text: 'OpenClaude is available on GitHub.',
sources: [
{
title: 'Docs',
url: 'https://docs.example.com/openclaude',
},
],
},
],
},
],
},
'OpenClaude GitHub 2026',
0.42,
)
expect(output.results).toEqual([
'OpenClaude is available on GitHub.',
{
tool_use_id: 'codex-web-search',
content: [
{
title: 'OpenClaude repo',
url: 'https://github.com/example/openclaude',
},
{
title: 'Docs',
url: 'https://docs.example.com/openclaude',
},
],
},
])
})
test('falls back to a non-empty Codex web search result message', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{ output: [] },
'OpenClaude GitHub 2026',
0.11,
)
expect(output.results).toEqual(['No results found.'])
})
test('surfaces Codex web search failure reason with a message', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
status: 'failed',
error: { message: 'upstream search provider rate-limited' },
},
],
},
'OpenClaude GitHub 2026',
0.05,
)
expect(output.results).toEqual([
'Web search failed: upstream search provider rate-limited',
])
})
test('surfaces Codex web search failure reason nested under action.error', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
status: 'failed',
action: { error: { message: 'query blocked' } },
},
],
},
'OpenClaude GitHub 2026',
0.05,
)
expect(output.results).toEqual(['Web search failed: query blocked'])
})
test('handles Codex web search failure with no reason attached', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
status: 'failed',
},
],
},
'OpenClaude GitHub 2026',
0.05,
)
expect(output.results).toEqual(['Web search failed.'])
})
test('a failure item does not suppress sources from a later message item', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
status: 'failed',
error: { message: 'partial outage' },
},
{
type: 'message',
role: 'assistant',
content: [
{
type: 'output_text',
text: 'Partial results below.',
sources: [
{ title: 'Docs', url: 'https://docs.example.com/openclaude' },
],
},
],
},
],
},
'OpenClaude GitHub 2026',
0.05,
)
expect(output.results).toEqual([
'Web search failed: partial outage',
'Partial results below.',
{
tool_use_id: 'codex-web-search',
content: [
{ title: 'Docs', url: 'https://docs.example.com/openclaude' },
],
},
])
})
test('translates Codex SSE text stream into Anthropic events', async () => {
const responseText = [
'event: response.output_item.added',

View File

@@ -1,5 +1,4 @@
import { APIError } from '@anthropic-ai/sdk'
import { compressToolHistory } from './compressToolHistory.js'
import { fetchWithProxyRetry } from './fetchWithProxyRetry.js'
import type {
ResolvedCodexCredentials,
@@ -485,15 +484,13 @@ export async function performCodexRequest(options: {
defaultHeaders: Record<string, string>
signal?: AbortSignal
}): Promise<Response> {
const compressedMessages = compressToolHistory(
const input = convertAnthropicMessagesToResponsesInput(
options.params.messages as Array<{
role?: string
message?: { role?: string; content?: unknown }
content?: unknown
}>,
options.request.resolvedModel,
)
const input = convertAnthropicMessagesToResponsesInput(compressedMessages)
const body: Record<string, unknown> = {
model: options.request.resolvedModel,
input: input.length > 0

View File

@@ -1,572 +0,0 @@
import { afterEach, beforeEach, expect, mock, test } from 'bun:test'
import { compressToolHistory, getTiers } from './compressToolHistory.js'
// Mock the two dependencies so tests are deterministic and don't read disk config.
const mockState = {
enabled: true,
effectiveWindow: 100_000,
}
mock.module('../../utils/config.js', () => ({
getGlobalConfig: () => ({
toolHistoryCompressionEnabled: mockState.enabled,
}),
}))
mock.module('../compact/autoCompact.js', () => ({
getEffectiveContextWindowSize: () => mockState.effectiveWindow,
}))
beforeEach(() => {
mockState.enabled = true
mockState.effectiveWindow = 100_000
})
afterEach(() => {
mockState.enabled = true
mockState.effectiveWindow = 100_000
})
type Block = Record<string, unknown>
type Msg = { role: string; content: Block[] | string }
function bigText(n: number): string {
return 'x'.repeat(n)
}
function buildToolExchange(id: number, resultLength: number): Msg[] {
return [
{
role: 'assistant',
content: [
{
type: 'tool_use',
id: `toolu_${id}`,
name: 'Read',
input: { file_path: `/path/to/file${id}.ts` },
},
],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: `toolu_${id}`,
content: bigText(resultLength),
},
],
},
]
}
function buildConversation(numToolExchanges: number, resultLength = 5_000): Msg[] {
const out: Msg[] = [{ role: 'user', content: 'Initial request' }]
for (let i = 0; i < numToolExchanges; i++) {
out.push(...buildToolExchange(i, resultLength))
}
return out
}
function getResultMessages(messages: Msg[]): Msg[] {
return messages.filter(
m => Array.isArray(m.content) && m.content.some((b: any) => b.type === 'tool_result'),
)
}
function getResultBlock(msg: Msg): Block {
return (msg.content as Block[]).find((b: any) => b.type === 'tool_result') as Block
}
function getResultText(msg: Msg): string {
const block = getResultBlock(msg)
const c = block.content
if (typeof c === 'string') return c
if (Array.isArray(c)) {
return c
.filter((b: any) => b.type === 'text')
.map((b: any) => b.text)
.join('\n')
}
return ''
}
// ---------- getTiers ----------
test('getTiers: < 16k window → recent=2, mid=3', () => {
expect(getTiers(8_000)).toEqual({ recent: 2, mid: 3 })
})
test('getTiers: 16k–32k → recent=3, mid=5', () => {
expect(getTiers(20_000)).toEqual({ recent: 3, mid: 5 })
})
test('getTiers: 32k–64k → recent=4, mid=8', () => {
expect(getTiers(48_000)).toEqual({ recent: 4, mid: 8 })
})
test('getTiers: 64k–128k (Copilot gpt-4o) → recent=5, mid=10', () => {
expect(getTiers(100_000)).toEqual({ recent: 5, mid: 10 })
})
test('getTiers: 128k–256k (Copilot Claude) → recent=8, mid=15', () => {
expect(getTiers(200_000)).toEqual({ recent: 8, mid: 15 })
})
test('getTiers: 256k–500k → recent=12, mid=25', () => {
expect(getTiers(400_000)).toEqual({ recent: 12, mid: 25 })
})
test('getTiers: ≥ 500k (gpt-4.1 1M) → recent=25, mid=50', () => {
expect(getTiers(1_000_000)).toEqual({ recent: 25, mid: 50 })
})
// ---------- master switch ----------
test('pass-through when toolHistoryCompressionEnabled is false', () => {
mockState.enabled = false
const messages = buildConversation(20)
const result = compressToolHistory(messages, 'gpt-4o')
expect(result).toBe(messages) // same reference (no transformation)
})
test('pass-through when total tool_results <= recent tier', () => {
// 100k effective → recent=5; only 4 exchanges → no compression
const messages = buildConversation(4)
const result = compressToolHistory(messages, 'gpt-4o')
expect(result).toBe(messages)
})
// ---------- per-tier behavior ----------
test('recent tier: tool_result content untouched', () => {
// 100k effective → recent=5, mid=10. With 6 exchanges, only the oldest is touched.
const messages = buildConversation(6, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// Last 5 should be untouched (full 5000 chars)
for (let i = resultMsgs.length - 5; i < resultMsgs.length; i++) {
expect(getResultText(resultMsgs[i]).length).toBe(5_000)
}
})
test('mid tier: long content truncated to MID_MAX_CHARS with marker', () => {
// 100k → recent=5, mid=10. 10 exchanges: 5 recent + 5 mid (none old).
const messages = buildConversation(10, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// First 5 are mid tier — should be truncated to ~2000 chars + marker
for (let i = 0; i < 5; i++) {
const text = getResultText(resultMsgs[i])
expect(text).toContain('[…truncated')
expect(text).toContain('chars from tool history]')
// Should be roughly 2000 chars + marker (under 2200)
expect(text.length).toBeLessThan(2_200)
expect(text.length).toBeGreaterThan(2_000)
}
})
test('mid tier: short content (< MID_MAX_CHARS) untouched', () => {
const messages = buildConversation(10, 500) // 500 < MID_MAX_CHARS
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
for (let i = 0; i < 5; i++) {
expect(getResultText(resultMsgs[i])).toBe(bigText(500))
}
})
test('old tier: content replaced with stub [name args={...} → N chars omitted]', () => {
// 100k → recent=5, mid=10, old=rest. 20 exchanges → 5 old + 10 mid + 5 recent.
const messages = buildConversation(20, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// First 5 are old tier — should be stubs
for (let i = 0; i < 5; i++) {
const text = getResultText(resultMsgs[i])
expect(text).toMatch(/^\[Read args=\{.*\} → 5000 chars omitted\]$/)
}
})
test('old tier: stub args truncated to 200 chars', () => {
const longArg = bigText(500)
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [
{
type: 'tool_use',
id: 'toolu_x',
name: 'Bash',
input: { command: longArg },
},
],
},
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'toolu_x', content: 'output' },
],
},
// Pad with enough recent exchanges to push the above into old tier
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
const text = getResultText(resultMsgs[0])
// Stub format: [Bash args=<json≤200chars> → N chars omitted]
// The args portion (between args= and →) must be ≤ 200 chars.
const argsMatch = text.match(/args=(.*?) →/)
expect(argsMatch).not.toBeNull()
expect(argsMatch![1].length).toBeLessThanOrEqual(200)
})
test('old tier: orphan tool_result (no matching tool_use) falls back to "tool"', () => {
const messages: Msg[] = [
{ role: 'user', content: 'start' },
// Orphan: tool_result without matching tool_use in history
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'orphan_id', content: 'data' },
],
},
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
const text = getResultText(resultMsgs[0])
expect(text).toMatch(/^\[tool args=\{\} → 4 chars omitted\]$/)
})
// ---------- structural preservation ----------
test('tool_use blocks always preserved', () => {
const messages = buildConversation(20, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const useCount = (msgs: Msg[]) =>
msgs.reduce((sum, m) => {
if (!Array.isArray(m.content)) return sum
return sum + m.content.filter((b: any) => b.type === 'tool_use').length
}, 0)
expect(useCount(result as Msg[])).toBe(useCount(messages))
})
test('text blocks always preserved', () => {
const messages: Msg[] = [
{ role: 'user', content: 'first' },
{
role: 'assistant',
content: [
{ type: 'text', text: 'reasoning before tool' },
{ type: 'tool_use', id: 'toolu_1', name: 'Read', input: {} },
],
},
{
role: 'user',
content: [{ type: 'tool_result', tool_use_id: 'toolu_1', content: bigText(5000) }],
},
...buildConversation(20, 5_000).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const assistantMsg = (result as Msg[])[1]
const textBlock = (assistantMsg.content as Block[]).find((b: any) => b.type === 'text')
expect(textBlock).toEqual({ type: 'text', text: 'reasoning before tool' })
})
test('thinking blocks always preserved', () => {
const messages: Msg[] = [
{ role: 'user', content: 'first' },
{
role: 'assistant',
content: [
{ type: 'thinking', thinking: 'internal reasoning', signature: 'sig' },
{ type: 'tool_use', id: 'toolu_1', name: 'Read', input: {} },
],
},
{
role: 'user',
content: [{ type: 'tool_result', tool_use_id: 'toolu_1', content: bigText(5000) }],
},
...buildConversation(20, 5_000).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const assistantMsg = (result as Msg[])[1]
const thinking = (assistantMsg.content as Block[]).find((b: any) => b.type === 'thinking')
expect(thinking).toEqual({
type: 'thinking',
thinking: 'internal reasoning',
signature: 'sig',
})
})
test('non-array content (string) handled gracefully', () => {
const messages: Msg[] = [
{ role: 'user', content: 'plain string content' },
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
expect((result as Msg[])[0].content).toBe('plain string content')
})
test('empty content array handled gracefully', () => {
const messages: Msg[] = [
{ role: 'user', content: [] },
...buildConversation(20, 100).slice(1),
]
expect(() => compressToolHistory(messages, 'gpt-4o')).not.toThrow()
})
// ---------- message shape compatibility ----------
test('wrapped shape ({ message: { role, content } }) handled', () => {
type WrappedMsg = { message: { role: string; content: Block[] | string } }
const wrap = (m: Msg): WrappedMsg => ({ message: { role: m.role, content: m.content } })
const messages = buildConversation(20, 5_000).map(wrap)
const result = compressToolHistory(messages as any, 'gpt-4o')
// First wrapped tool-result message should have stub content (old tier)
const firstResultMsg = (result as WrappedMsg[]).find(
m =>
Array.isArray(m.message.content) &&
m.message.content.some((b: any) => b.type === 'tool_result'),
)
const block = (firstResultMsg!.message.content as Block[]).find(
(b: any) => b.type === 'tool_result',
) as Block
const text = ((block.content as Block[])[0] as any).text
expect(text).toMatch(/^\[Read args=.*→ 5000 chars omitted\]$/)
})
test('flat shape ({ role, content }) handled', () => {
const messages = buildConversation(20, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
expect(getResultText(resultMsgs[0])).toMatch(/^\[Read args=.*→ 5000 chars omitted\]$/)
})
// ---------- tier boundary correctness ----------
test('tier boundaries: 6 exchanges → 1 mid + 5 recent (recent=5)', () => {
const messages = buildConversation(6, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// Oldest: mid (truncated)
expect(getResultText(resultMsgs[0])).toContain('[…truncated')
// Last 5: untouched
for (let i = 1; i < 6; i++) {
expect(getResultText(resultMsgs[i]).length).toBe(5_000)
}
})
test('tier boundaries: 16 exchanges → 1 old + 10 mid + 5 recent', () => {
const messages = buildConversation(16, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// Oldest 1: stub (old tier)
expect(getResultText(resultMsgs[0])).toMatch(/^\[Read .*chars omitted\]$/)
// Next 10: mid (truncated)
for (let i = 1; i < 11; i++) {
expect(getResultText(resultMsgs[i])).toContain('[…truncated')
}
// Last 5: untouched
for (let i = 11; i < 16; i++) {
expect(getResultText(resultMsgs[i]).length).toBe(5_000)
}
})
test('large window (1M) with 30 exchanges: last 25 untouched (recent=25)', () => {
// ≥500k → recent=25, mid=50. 30 exchanges → 5 mid + 25 recent. None old.
mockState.effectiveWindow = 1_000_000
const messages = buildConversation(30, 5_000)
const result = compressToolHistory(messages, 'gpt-4.1')
const resultMsgs = getResultMessages(result)
// Last 25: untouched
for (let i = 5; i < 30; i++) {
expect(getResultText(resultMsgs[i]).length).toBe(5_000)
}
})
// ---------- attribute preservation ----------
test('is_error flag preserved in mid tier', () => {
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [{ type: 'tool_use', id: 'toolu_err', name: 'Bash', input: {} }],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'toolu_err',
is_error: true,
content: bigText(5_000),
},
],
},
// Pad with enough recent exchanges to push the above into MID tier
...buildConversation(10, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
const block = getResultBlock(resultMsgs[0]) as { is_error?: boolean; content: unknown }
expect(block.is_error).toBe(true)
expect(getResultText(resultMsgs[0])).toContain('[…truncated')
})
test('is_error flag preserved in old tier (stub)', () => {
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [{ type: 'tool_use', id: 'toolu_err', name: 'Bash', input: {} }],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'toolu_err',
is_error: true,
content: bigText(5_000),
},
],
},
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
const block = getResultBlock(resultMsgs[0]) as { is_error?: boolean; content: unknown }
expect(block.is_error).toBe(true)
expect(getResultText(resultMsgs[0])).toMatch(/^\[Bash .*chars omitted\]$/)
})
// ---------- COMPACTABLE_TOOLS filter ----------
test('non-compactable tool (e.g. Task/Agent) is NEVER compressed', () => {
// Build conversation where the OLDEST exchange uses a non-compactable tool name
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [
{ type: 'tool_use', id: 'task_1', name: 'Task', input: { goal: 'plan' } },
],
},
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'task_1', content: bigText(5_000) },
],
},
// Pad with 20 compactable exchanges to push Task into old tier
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// First tool_result is for Task (non-compactable) → must remain full
expect(getResultText(resultMsgs[0]).length).toBe(5_000)
expect(getResultText(resultMsgs[0])).not.toContain('chars omitted')
expect(getResultText(resultMsgs[0])).not.toContain('[…truncated')
})
test('mcp__ prefixed tools ARE compactable (matches microCompact behavior)', () => {
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [
{ type: 'tool_use', id: 'mcp_1', name: 'mcp__github__get_issue', input: {} },
],
},
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'mcp_1', content: bigText(5_000) },
],
},
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// MCP tool result is compressed (gets stub since it's in old tier)
expect(getResultText(resultMsgs[0])).toMatch(/^\[mcp__github__get_issue .*chars omitted\]$/)
})
// ---------- skip already-cleared blocks ----------
test('blocks already cleared by microCompact are NOT re-compressed', () => {
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [{ type: 'tool_use', id: 'cleared_1', name: 'Read', input: {} }],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'cleared_1',
content: '[Old tool result content cleared]', // microCompact's marker
},
],
},
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// Already-cleared marker survives untouched (no double processing)
expect(getResultText(resultMsgs[0])).toBe('[Old tool result content cleared]')
})
test('extra block attributes (e.g. cache_control) preserved across rewrites', () => {
const cacheControl = { type: 'ephemeral' }
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [{ type: 'tool_use', id: 'toolu_cc', name: 'Read', input: {} }],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'toolu_cc',
cache_control: cacheControl,
content: bigText(5_000),
},
],
},
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
const block = getResultBlock(resultMsgs[0]) as { cache_control?: unknown }
// The custom attribute survived the stub rewrite via ...block spread
expect(block.cache_control).toEqual(cacheControl)
})

View File

@@ -1,255 +0,0 @@
/**
* Compresses old tool_result content for stateless OpenAI-compatible providers
* (Copilot, Mistral, Ollama). Preserves all conversation structure — tool_use,
* tool_result pairing, text, thinking, and is_error all survive intact. Only
* the BULK text of older tool_results is shrunk to delay context saturation.
*
* Tier sizes scale with the model's effective context window via
* getEffectiveContextWindowSize() — same calculation used by auto-compact, so
* the two systems stay aligned.
*
* Complements (does not replace) microCompact.ts:
* - microCompact: time/cache-based, runs from query.ts, binary clear/keep,
* limited to Claude (cache editing) or idle gaps (time-based).
* - compressToolHistory: size-based, runs at the shim layer, tiered
* compression, covers the gap for active sessions on non-Claude providers.
*
* Reuses isCompactableTool from microCompact to avoid touching tools the
* project already classifies as unsafe to compress (e.g. Task, Agent).
* Skips blocks already cleared by microCompact (TOOL_RESULT_CLEARED_MESSAGE).
*
* Anthropic native bypasses both shims, so it is unaffected by this module.
*/
import { getEffectiveContextWindowSize } from '../compact/autoCompact.js'
import { isCompactableTool } from '../compact/microCompact.js'
import { TOOL_RESULT_CLEARED_MESSAGE } from '../../utils/toolResultStorage.js'
import { getGlobalConfig } from '../../utils/config.js'
// Mid-tier truncation budget. 2k chars ≈ 500 tokens, enough to preserve the
// shape of most tool outputs (file headers, command stderr, top grep hits)
// without ballooning context. Bump too high and the tier loses its purpose.
const MID_MAX_CHARS = 2_000
// Stub args budget. JSON.stringify of a typical tool input fits in 200 chars
// (file paths, short commands, small queries). Long inputs are rare and clamping
// here keeps the stub size bounded even when callers pass oversized arguments.
const STUB_ARGS_MAX_CHARS = 200
type AnyMessage = {
role?: string
message?: { role?: string; content?: unknown }
content?: unknown
}
type ToolResultBlock = {
type: 'tool_result'
tool_use_id?: string
is_error?: boolean
content?: unknown
}
type ToolUseBlock = {
type: 'tool_use'
id?: string
name?: string
input?: unknown
}
type Tiers = { recent: number; mid: number }
// Tier sizes scale with effective window. Targets roughly:
// - recent tier stays under ~25% of available window (full fidelity kept)
// - recent + mid tier stays under ~50% of available window (bounded bulk)
// - everything older collapses to ~15-token stubs
// Values assume ~5KB avg tool_result, which matches the Copilot default case
// (parallel_tool_calls=true means multiple Read/Bash outputs per turn). For
// ≥ 500k models the tiers are so generous that compression is effectively
// inert for any realistic session — see compressToolHistory.test.ts.
export function getTiers(effectiveWindow: number): Tiers {
if (effectiveWindow < 16_000) return { recent: 2, mid: 3 }
if (effectiveWindow < 32_000) return { recent: 3, mid: 5 }
if (effectiveWindow < 64_000) return { recent: 4, mid: 8 }
if (effectiveWindow < 128_000) return { recent: 5, mid: 10 }
if (effectiveWindow < 256_000) return { recent: 8, mid: 15 }
if (effectiveWindow < 500_000) return { recent: 12, mid: 25 }
return { recent: 25, mid: 50 }
}
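// Worked example for the thresholds above (illustrative numbers, using the
// ~5KB average tool_result assumed in the comment): a 100k effective window
// lands in the 64k–128k bucket → recent=5, mid=10. The recent tier costs
// about 5 × 5,000 chars ≈ 6k tokens; the mid tier adds at most
// 10 × MID_MAX_CHARS ≈ 5k tokens. Retained bulk stays near ~11% of the
// window, and every older result collapses to a ~15-token stub.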
function extractText(content: unknown): string {
if (typeof content === 'string') return content
if (Array.isArray(content)) {
return content
.filter(
(b: { type?: string; text?: string }) =>
b?.type === 'text' && typeof b.text === 'string',
)
.map((b: { text?: string }) => b.text ?? '')
.join('\n')
}
return ''
}
// Old-tier compression strategy. Replaces content entirely with a one-line
// metadata marker that is ~10× more token-efficient than a 500-char
// truncation AND unambiguous — partial truncations can look authoritative to
// the model. The stub format encodes tool name + args so the model can
// re-invoke the same tool if it needs the omitted output back.
function buildStub(
block: ToolResultBlock,
toolUsesById: Map<string, ToolUseBlock>,
): ToolResultBlock {
const original = extractText(block.content)
const toolUse = toolUsesById.get(block.tool_use_id ?? '')
const name = toolUse?.name ?? 'tool'
const args = toolUse?.input
? JSON.stringify(toolUse.input).slice(0, STUB_ARGS_MAX_CHARS)
: '{}'
return {
...block,
content: [
{
type: 'text',
        text: `[${name} args=${args} → ${original.length} chars omitted]`,
},
],
}
}
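// Illustrative output (mirrors the unit-test expectations in this diff): a
// Read whose 5,000-char result was stubbed becomes
//   [Read args={"file_path":"/path/to/file0.ts"} → 5000 chars omitted]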
// Mid-tier compression. The trailing marker is load-bearing: without it, the
// model can't distinguish "tool returned 2000 chars" from "tool returned 20k
// chars that we cut to 2000". Distinguishing those matters for the model's
// decision to re-invoke the tool.
function truncateBlock(
block: ToolResultBlock,
maxChars: number,
): ToolResultBlock {
const text = extractText(block.content)
if (text.length <= maxChars) return block
const omitted = text.length - maxChars
return {
...block,
content: [
{
type: 'text',
text: `${text.slice(0, maxChars)}\n[…truncated ${omitted} chars from tool history]`,
},
],
}
}
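// Illustrative output: a 20,000-char result truncated at MID_MAX_CHARS keeps
// its first 2,000 chars and ends with the line
//   […truncated 18000 chars from tool history]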
function getInner(msg: AnyMessage): { role?: string; content?: unknown } {
return (msg.message ?? msg) as { role?: string; content?: unknown }
}
function indexToolUses(messages: AnyMessage[]): Map<string, ToolUseBlock> {
const map = new Map<string, ToolUseBlock>()
for (const msg of messages) {
const content = getInner(msg).content
if (!Array.isArray(content)) continue
for (const b of content as Array<{ type?: string; id?: string }>) {
if (b?.type === 'tool_use' && b.id) {
map.set(b.id, b as ToolUseBlock)
}
}
}
return map
}
function indexToolResultMessages(messages: AnyMessage[]): number[] {
const indices: number[] = []
for (let i = 0; i < messages.length; i++) {
const inner = getInner(messages[i])
const role = inner.role ?? messages[i].role
const content = inner.content
if (
role === 'user' &&
Array.isArray(content) &&
content.some((b: { type?: string }) => b?.type === 'tool_result')
) {
indices.push(i)
}
}
return indices
}
function rewriteMessage<T extends AnyMessage>(
msg: T,
newContent: unknown[],
): T {
if (msg.message) {
return { ...msg, message: { ...msg.message, content: newContent } }
}
return { ...msg, content: newContent }
}
// microCompact.maybeTimeBasedMicrocompact may have already replaced old
// tool_result content with TOOL_RESULT_CLEARED_MESSAGE before we see it.
// Re-compressing would produce a stub over a marker (e.g. `[Read args={} →
// 40 chars omitted]`), which is wasteful and less informative than the
// canonical marker.
function isAlreadyCleared(block: ToolResultBlock): boolean {
const text = extractText(block.content)
return text === TOOL_RESULT_CLEARED_MESSAGE
}
function shouldCompressBlock(
block: ToolResultBlock,
toolUsesById: Map<string, ToolUseBlock>,
): boolean {
if (isAlreadyCleared(block)) return false
const toolUse = toolUsesById.get(block.tool_use_id ?? '')
// Unknown tool name (orphan tool_result with no matching tool_use) falls
// through to compression with a generic "tool" stub. Safer default: the
// original tool_use vanished so there's no downstream use for the output.
if (!toolUse?.name) return true
// Respect microCompact's curated safe-to-compress set (Read/Bash/Grep/…/
// mcp__*) so user-facing flow tools (Task, Agent, custom) stay intact.
return isCompactableTool(toolUse.name)
}
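// Decision table, matching the tests in this diff: Read/Bash/Grep/mcp__*
// results are compressed per tier; Task/Agent results are kept at full
// fidelity; an orphan tool_result (no matching tool_use) is compressed and
// stubs under the generic "tool" name; content equal to
// TOOL_RESULT_CLEARED_MESSAGE is skipped entirely.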
export function compressToolHistory<T extends AnyMessage>(
messages: T[],
model: string,
): T[] {
// Master kill-switch. Returns the original reference so callers skip a
// defensive copy when the feature is disabled.
if (!getGlobalConfig().toolHistoryCompressionEnabled) return messages
const tiers = getTiers(getEffectiveContextWindowSize(model))
const toolResultIndices = indexToolResultMessages(messages)
const total = toolResultIndices.length
  // If every tool-result fits in the recent tier, no tier boundary is
  // crossed; return the same reference for the same copy-elision reason.
if (total <= tiers.recent) return messages
// O(1) lookup: messageIndex → tool-result position (0 = oldest). Replaces
// the naive Array.indexOf(i) that was O(n²) across the .map below.
const positionByIndex = new Map<number, number>()
for (let pos = 0; pos < toolResultIndices.length; pos++) {
positionByIndex.set(toolResultIndices[pos], pos)
}
const toolUsesById = indexToolUses(messages)
return messages.map((msg, i) => {
const pos = positionByIndex.get(i)
if (pos === undefined) return msg
const fromEnd = total - 1 - pos
if (fromEnd < tiers.recent) return msg
const inMidWindow = fromEnd < tiers.recent + tiers.mid
const content = getInner(msg).content as unknown[]
const newContent = content.map(block => {
const b = block as { type?: string }
if (b?.type !== 'tool_result') return block
const tr = block as ToolResultBlock
if (!shouldCompressBlock(tr, toolUsesById)) return block
return inMidWindow
? truncateBlock(tr, MID_MAX_CHARS)
: buildStub(tr, toolUsesById)
})
return rewriteMessage(msg, newContent)
})
}
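// Minimal caller sketch (hypothetical wiring; the real call site is the
// openaiShim request path, per the import removed later in this diff):
//
//   import { compressToolHistory } from './compressToolHistory.js'
//
//   function prepareOutgoing(messages: AnyMessage[], model: string) {
//     // Run before the Anthropic→OpenAI message conversion so tool_use /
//     // tool_result pairing is still intact when tiers are applied.
//     return compressToolHistory(messages, model)
//   }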

View File

@@ -320,7 +320,10 @@ export function classifyOpenAIHttpFailure(options: {
}
}
if (options.status >= 400 && isMalformedProviderResponse(body)) {
if (
(options.status >= 200 && options.status < 300 && isMalformedProviderResponse(body)) ||
(options.status >= 400 && isMalformedProviderResponse(body))
) {
return {
source: 'http',
category: 'malformed_provider_response',

View File

@@ -1,317 +0,0 @@
import { afterEach, beforeEach, expect, mock, test } from 'bun:test'
import { createOpenAIShimClient } from './openaiShim.js'
type FetchType = typeof globalThis.fetch
const originalFetch = globalThis.fetch
const originalEnv = {
OPENAI_BASE_URL: process.env.OPENAI_BASE_URL,
OPENAI_API_KEY: process.env.OPENAI_API_KEY,
OPENAI_MODEL: process.env.OPENAI_MODEL,
}
// Mock config + autoCompact so the shim sees deterministic state.
const mockState = {
enabled: true,
effectiveWindow: 100_000, // Copilot gpt-4o tier
}
mock.module('../../utils/config.js', () => ({
getGlobalConfig: () => ({
toolHistoryCompressionEnabled: mockState.enabled,
autoCompactEnabled: false,
}),
}))
mock.module('../compact/autoCompact.js', () => ({
getEffectiveContextWindowSize: () => mockState.effectiveWindow,
}))
type OpenAIShimClient = {
beta: {
messages: {
create: (
params: Record<string, unknown>,
options?: Record<string, unknown>,
) => Promise<unknown>
}
}
}
function bigText(n: number): string {
return 'A'.repeat(n)
}
function buildToolExchange(id: number, resultLength: number) {
return [
{
role: 'assistant',
content: [
{
type: 'tool_use',
id: `toolu_${id}`,
name: 'Read',
input: { file_path: `/path/to/file${id}.ts` },
},
],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: `toolu_${id}`,
content: bigText(resultLength),
},
],
},
]
}
function buildLongConversation(numExchanges: number, resultLength = 5_000) {
const out: Array<{ role: string; content: unknown }> = [
{ role: 'user', content: 'start the work' },
]
for (let i = 0; i < numExchanges; i++) {
out.push(...buildToolExchange(i, resultLength))
}
return out
}
function makeFakeResponse(): Response {
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'gpt-4o',
choices: [
{
message: { role: 'assistant', content: 'done' },
finish_reason: 'stop',
},
],
usage: { prompt_tokens: 8, completion_tokens: 2, total_tokens: 10 },
}),
{ headers: { 'Content-Type': 'application/json' } },
)
}
beforeEach(() => {
process.env.OPENAI_BASE_URL = 'http://example.test/v1'
process.env.OPENAI_API_KEY = 'test-key'
delete process.env.OPENAI_MODEL
mockState.enabled = true
mockState.effectiveWindow = 100_000
})
afterEach(() => {
if (originalEnv.OPENAI_BASE_URL === undefined) delete process.env.OPENAI_BASE_URL
else process.env.OPENAI_BASE_URL = originalEnv.OPENAI_BASE_URL
if (originalEnv.OPENAI_API_KEY === undefined) delete process.env.OPENAI_API_KEY
else process.env.OPENAI_API_KEY = originalEnv.OPENAI_API_KEY
if (originalEnv.OPENAI_MODEL === undefined) delete process.env.OPENAI_MODEL
else process.env.OPENAI_MODEL = originalEnv.OPENAI_MODEL
globalThis.fetch = originalFetch
})
async function captureRequestBody(
messages: Array<{ role: string; content: unknown }>,
model: string,
): Promise<Record<string, unknown>> {
let captured: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
captured = JSON.parse(String(init?.body))
return makeFakeResponse()
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model,
system: 'system prompt',
messages,
})
if (!captured) throw new Error('request not captured')
return captured
}
function getToolMessages(body: Record<string, unknown>): Array<{ content: string }> {
const messages = body.messages as Array<{ role: string; content: string }>
return messages.filter(m => m.role === 'tool')
}
function getAssistantToolCalls(body: Record<string, unknown>): unknown[] {
const messages = body.messages as Array<{
role: string
tool_calls?: unknown[]
}>
return messages
.filter(m => m.role === 'assistant' && Array.isArray(m.tool_calls))
.flatMap(m => m.tool_calls ?? [])
}
// ============================================================================
// BUG REPRO: without compression, full tool history is resent every turn
// ============================================================================
test('BUG REPRO: without compression, all 30 tool results are sent at full size', async () => {
mockState.enabled = false
const messages = buildLongConversation(30, 5_000)
const body = await captureRequestBody(messages, 'gpt-4o')
const toolMessages = getToolMessages(body)
const payloadSize = JSON.stringify(body).length
// All 30 tool results present, none truncated
expect(toolMessages.length).toBe(30)
for (const m of toolMessages) {
expect(m.content.length).toBeGreaterThanOrEqual(5_000)
expect(m.content).not.toContain('[…truncated')
expect(m.content).not.toContain('chars omitted')
}
// Total payload is large (~150KB raw) — this is the cost being paid every turn
expect(payloadSize).toBeGreaterThan(150_000)
})
// ============================================================================
// FIX: with compression, recent kept full, mid truncated, old stubbed
// ============================================================================
test('FIX: with compression on Copilot gpt-4o (tier 5/10/rest), 30 turns shrinks dramatically', async () => {
mockState.enabled = true
  mockState.effectiveWindow = 100_000 // 64–128k → recent=5, mid=10
const messages = buildLongConversation(30, 5_000)
const body = await captureRequestBody(messages, 'gpt-4o')
const toolMessages = getToolMessages(body)
const payloadSize = JSON.stringify(body).length
// Structure preserved: still 30 tool messages, no orphan tool_calls
expect(toolMessages.length).toBe(30)
expect(getAssistantToolCalls(body).length).toBe(30)
// Tier breakdown (oldest → newest):
// indices 0..14 → old tier (stubs)
// indices 15..24 → mid tier (truncated)
// indices 25..29 → recent (full)
for (let i = 0; i <= 14; i++) {
expect(toolMessages[i].content).toMatch(/^\[Read args=.*chars omitted\]$/)
}
for (let i = 15; i <= 24; i++) {
expect(toolMessages[i].content).toContain('[…truncated')
}
for (let i = 25; i <= 29; i++) {
expect(toolMessages[i].content.length).toBe(5_000)
expect(toolMessages[i].content).not.toContain('[…truncated')
expect(toolMessages[i].content).not.toContain('chars omitted')
}
// Significant reduction: from ~150KB to <60KB (10 mid×2KB + structure overhead)
expect(payloadSize).toBeLessThan(60_000)
})
// ============================================================================
// FIX: large-context model gets generous tiers — compression effectively inert
// ============================================================================
test('FIX: gpt-4.1 (1M context) with 25 exchanges keeps all full (recent tier=25)', async () => {
mockState.enabled = true
mockState.effectiveWindow = 1_000_000 // ≥500k → recent=25, mid=50
const messages = buildLongConversation(25, 5_000)
const body = await captureRequestBody(messages, 'gpt-4.1')
const toolMessages = getToolMessages(body)
expect(toolMessages.length).toBe(25)
for (const m of toolMessages) {
expect(m.content.length).toBe(5_000)
expect(m.content).not.toContain('[…truncated')
expect(m.content).not.toContain('chars omitted')
}
})
test('FIX: gpt-4.1 (1M context) with 30 exchanges → only first 5 mid-truncated', async () => {
mockState.enabled = true
mockState.effectiveWindow = 1_000_000 // recent=25, mid=50
const messages = buildLongConversation(30, 5_000)
const body = await captureRequestBody(messages, 'gpt-4.1')
const toolMessages = getToolMessages(body)
// 30 total: indices 0..4 mid, indices 5..29 recent
for (let i = 0; i < 5; i++) {
expect(toolMessages[i].content).toContain('[…truncated')
}
for (let i = 5; i < 30; i++) {
expect(toolMessages[i].content.length).toBe(5_000)
}
})
// ============================================================================
// FIX: stub preserves tool name and args — model can re-invoke if needed
// ============================================================================
test('FIX: stub format includes original tool name and arguments', async () => {
mockState.enabled = true
mockState.effectiveWindow = 100_000
const messages = buildLongConversation(30, 5_000)
const body = await captureRequestBody(messages, 'gpt-4o')
const toolMessages = getToolMessages(body)
const oldestStub = toolMessages[0].content
// Format: [<tool_name> args=<json> → <N> chars omitted]
expect(oldestStub).toMatch(/^\[Read /)
expect(oldestStub).toMatch(/file_path/)
expect(oldestStub).toMatch(/→ 5000 chars omitted\]$/)
})
// ============================================================================
// FIX: tool_use blocks (assistant tool_calls) are never modified
// ============================================================================
test('FIX: every tool_call retains its full id, name, and arguments', async () => {
mockState.enabled = true
mockState.effectiveWindow = 100_000
const messages = buildLongConversation(30, 5_000)
const body = await captureRequestBody(messages, 'gpt-4o')
const toolCalls = getAssistantToolCalls(body) as Array<{
id: string
function: { name: string; arguments: string }
}>
expect(toolCalls.length).toBe(30)
for (let i = 0; i < toolCalls.length; i++) {
expect(toolCalls[i].id).toBe(`toolu_${i}`)
expect(toolCalls[i].function.name).toBe('Read')
expect(JSON.parse(toolCalls[i].function.arguments)).toEqual({
file_path: `/path/to/file${i}.ts`,
})
}
})
// ============================================================================
// FIX: small-context provider (Mistral 32k) gets aggressive compression
// ============================================================================
test('FIX: 32k window (Mistral tier) → recent=3 keeps last 3 only', async () => {
mockState.enabled = true
  mockState.effectiveWindow = 24_000 // 16–32k → recent=3, mid=5
const messages = buildLongConversation(15, 3_000)
const body = await captureRequestBody(messages, 'mistral-large-latest')
const toolMessages = getToolMessages(body)
// 15 total: indices 0..6 old, 7..11 mid, 12..14 recent
for (let i = 0; i <= 6; i++) {
expect(toolMessages[i].content).toContain('chars omitted')
}
for (let i = 7; i <= 11; i++) {
expect(toolMessages[i].content).toContain('[…truncated')
}
for (let i = 12; i <= 14; i++) {
expect(toolMessages[i].content.length).toBe(3_000)
}
})

View File

@@ -117,170 +117,3 @@ test('redacts credentials in transport diagnostic URL logs', async () => {
expect(logLine).not.toContain('user:supersecret')
expect(logLine).not.toContain('supersecret@')
})
test('logs self-heal localhost fallback with redacted from/to URLs', async () => {
const debugSpy = mock(() => {})
mock.module('../../utils/debug.js', () => ({
logForDebugging: debugSpy,
}))
const nonce = `${Date.now()}-${Math.random()}`
const { createOpenAIShimClient } = await import(`./openaiShim.ts?ts=${nonce}`)
process.env.OPENAI_BASE_URL = 'http://user:supersecret@localhost:11434/v1'
process.env.OPENAI_API_KEY = 'supersecret'
globalThis.fetch = mock(async (input: string | Request) => {
const url = typeof input === 'string' ? input : input.url
if (url.includes('localhost')) {
throw Object.assign(new TypeError('fetch failed'), {
code: 'ENOTFOUND',
})
}
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'qwen2.5-coder:7b',
choices: [
{
message: {
role: 'assistant',
content: 'ok',
},
finish_reason: 'stop',
},
],
usage: {
prompt_tokens: 5,
completion_tokens: 2,
total_tokens: 7,
},
}),
{
status: 200,
headers: {
'Content-Type': 'application/json',
},
},
)
}) as typeof globalThis.fetch
const client = createOpenAIShimClient({}) as {
beta: {
messages: {
create: (params: Record<string, unknown>) => Promise<unknown>
}
}
}
await expect(
client.beta.messages.create({
model: 'qwen2.5-coder:7b',
messages: [{ role: 'user', content: 'hello' }],
max_tokens: 64,
stream: false,
}),
).resolves.toBeDefined()
const fallbackLog = debugSpy.mock.calls.find(call =>
typeof call?.[0] === 'string' &&
call[0].includes('self-heal retry reason=localhost_resolution_failed'),
)
expect(fallbackLog).toBeDefined()
const logLine = String(fallbackLog?.[0])
expect(logLine).toContain('from=http://redacted:redacted@localhost:11434/v1/chat/completions')
expect(logLine).toContain('to=http://redacted:redacted@127.0.0.1:11434/v1/chat/completions')
expect(logLine).not.toContain('supersecret')
})
test('logs self-heal toolless retry for local tool-call incompatibility', async () => {
const debugSpy = mock(() => {})
mock.module('../../utils/debug.js', () => ({
logForDebugging: debugSpy,
}))
const nonce = `${Date.now()}-${Math.random()}`
const { createOpenAIShimClient } = await import(`./openaiShim.ts?ts=${nonce}`)
process.env.OPENAI_BASE_URL = 'http://localhost:11434/v1'
process.env.OPENAI_API_KEY = 'ollama'
let callCount = 0
globalThis.fetch = mock(async () => {
callCount += 1
if (callCount === 1) {
return new Response('tool_calls are not supported', {
status: 400,
headers: {
'Content-Type': 'text/plain',
},
})
}
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'qwen2.5-coder:7b',
choices: [
{
message: {
role: 'assistant',
content: 'ok',
},
finish_reason: 'stop',
},
],
usage: {
prompt_tokens: 7,
completion_tokens: 3,
total_tokens: 10,
},
}),
{
status: 200,
headers: {
'Content-Type': 'application/json',
},
},
)
}) as typeof globalThis.fetch
const client = createOpenAIShimClient({}) as {
beta: {
messages: {
create: (params: Record<string, unknown>) => Promise<unknown>
}
}
}
await expect(
client.beta.messages.create({
model: 'qwen2.5-coder:7b',
messages: [{ role: 'user', content: 'hello' }],
tools: [
{
name: 'Read',
description: 'Read file',
input_schema: {
type: 'object',
properties: {
filePath: { type: 'string' },
},
required: ['filePath'],
},
},
],
max_tokens: 64,
stream: false,
}),
).resolves.toBeDefined()
const fallbackLog = debugSpy.mock.calls.find(call =>
typeof call?.[0] === 'string' &&
call[0].includes('self-heal retry reason=tool_call_incompatible mode=toolless'),
)
expect(fallbackLog).toBeDefined()
expect(fallbackLog?.[1]).toEqual({ level: 'warn' })
})

View File

@@ -2931,204 +2931,6 @@ test('classifies chat-completions endpoint 404 failures with endpoint_not_found
}),
).rejects.toThrow('openai_category=endpoint_not_found')
})
test('self-heals localhost resolution failures by retrying local loopback base URL', async () => {
process.env.OPENAI_BASE_URL = 'http://localhost:11434/v1'
const requestUrls: string[] = []
globalThis.fetch = (async (input, _init) => {
const url = typeof input === 'string' ? input : input.url
requestUrls.push(url)
if (url.includes('localhost')) {
const error = Object.assign(new TypeError('fetch failed'), {
code: 'ENOTFOUND',
})
throw error
}
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'qwen2.5-coder:7b',
choices: [
{
message: {
role: 'assistant',
content: 'hello from loopback',
},
finish_reason: 'stop',
},
],
usage: {
prompt_tokens: 4,
completion_tokens: 3,
total_tokens: 7,
},
}),
{
status: 200,
headers: {
'Content-Type': 'application/json',
},
},
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await expect(
client.beta.messages.create({
model: 'qwen2.5-coder:7b',
messages: [{ role: 'user', content: 'hello' }],
max_tokens: 64,
stream: false,
}),
).resolves.toBeDefined()
expect(requestUrls[0]).toBe('http://localhost:11434/v1/chat/completions')
expect(requestUrls).toContain('http://127.0.0.1:11434/v1/chat/completions')
})
test('self-heals local endpoint_not_found by retrying with /v1 base URL', async () => {
process.env.OPENAI_BASE_URL = 'http://localhost:11434'
const requestUrls: string[] = []
globalThis.fetch = (async (input, _init) => {
const url = typeof input === 'string' ? input : input.url
requestUrls.push(url)
if (url === 'http://localhost:11434/chat/completions') {
return new Response('Not Found', {
status: 404,
headers: {
'Content-Type': 'text/plain',
},
})
}
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'qwen2.5-coder:7b',
choices: [
{
message: {
role: 'assistant',
content: 'hello from /v1',
},
finish_reason: 'stop',
},
],
usage: {
prompt_tokens: 5,
completion_tokens: 2,
total_tokens: 7,
},
}),
{
status: 200,
headers: {
'Content-Type': 'application/json',
},
},
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await expect(
client.beta.messages.create({
model: 'qwen2.5-coder:7b',
messages: [{ role: 'user', content: 'hello' }],
max_tokens: 64,
stream: false,
}),
).resolves.toBeDefined()
expect(requestUrls).toEqual([
'http://localhost:11434/chat/completions',
'http://localhost:11434/v1/chat/completions',
])
})
test('self-heals tool-call incompatibility by retrying local Ollama requests without tools', async () => {
process.env.OPENAI_BASE_URL = 'http://localhost:11434/v1'
const requestBodies: Array<Record<string, unknown>> = []
globalThis.fetch = (async (_input, init) => {
const requestBody = JSON.parse(String(init?.body)) as Record<string, unknown>
requestBodies.push(requestBody)
if (requestBodies.length === 1) {
return new Response('tool_calls are not supported', {
status: 400,
headers: {
'Content-Type': 'text/plain',
},
})
}
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'qwen2.5-coder:7b',
choices: [
{
message: {
role: 'assistant',
content: 'fallback without tools',
},
finish_reason: 'stop',
},
],
usage: {
prompt_tokens: 8,
completion_tokens: 4,
total_tokens: 12,
},
}),
{
status: 200,
headers: {
'Content-Type': 'application/json',
},
},
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await expect(
client.beta.messages.create({
model: 'qwen2.5-coder:7b',
messages: [{ role: 'user', content: 'hello' }],
tools: [
{
name: 'Read',
description: 'Read a file',
input_schema: {
type: 'object',
properties: {
filePath: { type: 'string' },
},
required: ['filePath'],
},
},
],
max_tokens: 64,
stream: false,
}),
).resolves.toBeDefined()
expect(requestBodies).toHaveLength(2)
expect(Array.isArray(requestBodies[0]?.tools)).toBe(true)
expect(requestBodies[0]?.tool_choice).toBeUndefined()
expect(
requestBodies[1]?.tools === undefined ||
(Array.isArray(requestBodies[1]?.tools) && requestBodies[1]?.tools.length === 0),
).toBe(true)
expect(requestBodies[1]?.tool_choice).toBeUndefined()
})
test('preserves valid tool_result and drops orphan tool_result', async () => {
let requestBody: Record<string, unknown> | undefined
@@ -3197,7 +2999,7 @@ test('preserves valid tool_result and drops orphan tool_result', async () => {
{
role: 'user',
content: 'What happened?',
},
}
],
},
],
@@ -3206,526 +3008,134 @@ test('preserves valid tool_result and drops orphan tool_result', async () => {
})
const messages = requestBody?.messages as Array<Record<string, unknown>>
// Should have: system, user, assistant (tool_use), tool (valid_call_1), user
// Should NOT have: tool (orphan_call_2)
const toolMessages = messages.filter(m => m.role === 'tool')
expect(toolMessages.length).toBe(1)
expect(toolMessages[0].tool_call_id).toBe('valid_call_1')
const orphanMessage = toolMessages.find(m => m.tool_call_id === 'orphan_call_2')
expect(orphanMessage).toBeUndefined()
// Actually, the semantic message IS injected here because the user block with orphan
// tool result is converted to:
// 1. Tool result (valid_call_1) -> role 'tool'
// 2. User content ("What happened?") -> role 'user'
// This triggers the tool -> assistant injection.
const assistantMessages = messages.filter(m => m.role === 'assistant')
expect(assistantMessages.some(m => m.content === '[Tool execution interrupted by user]')).toBe(true)
})
test('drops empty assistant message when only thinking block was present and stripped', async () => {
test('request body does not contain store field for local providers', async () => {
process.env.CLAUDE_CODE_USE_OPENAI = '1'
process.env.OPENAI_BASE_URL = 'http://localhost:11434/v1'
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(JSON.stringify({
id: 'chatcmpl-1',
object: 'chat.completion',
created: 123456789,
model: 'mistral-large-latest',
choices: [{ message: { role: 'assistant', content: 'hi' }, finish_reason: 'stop' }],
usage: { prompt_tokens: 1, completion_tokens: 1, total_tokens: 2 }
}), { headers: { 'Content-Type': 'application/json' } })
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
object: 'chat.completion',
model: 'test-model',
choices: [{ index: 0, message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' }],
usage: { prompt_tokens: 10, completion_tokens: 2, total_tokens: 12 },
}),
{ headers: { 'Content-Type': 'application/json' } },
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
const client = createOpenAIShimClient({ defaultHeaders: {} }) as unknown as OpenAIShimClient
await client.beta.messages.create({
model: 'mistral-large-latest',
model: 'some-model',
messages: [{ role: 'user', content: [{ type: 'text', text: 'hi' }] }],
max_tokens: 64,
stream: false,
})
expect(requestBody).toBeDefined()
expect('store' in requestBody!).toBe(false)
})
test('preserves reasoning_content on assistant messages with tool_calls during replay', async () => {
process.env.CLAUDE_CODE_USE_OPENAI = '1'
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
object: 'chat.completion',
model: 'test-model',
choices: [{ index: 0, message: { role: 'assistant', content: 'done' }, finish_reason: 'stop' }],
usage: { prompt_tokens: 10, completion_tokens: 2, total_tokens: 12 },
}),
{ headers: { 'Content-Type': 'application/json' } },
)
}) as FetchType
const client = createOpenAIShimClient({ defaultHeaders: {} }) as unknown as OpenAIShimClient
await client.beta.messages.create({
model: 'kimi-k2.5',
messages: [
{ role: 'user', content: 'Initial' },
{ role: 'assistant', content: [{ type: 'thinking', thinking: 'I am thinking...', signature: 'sig' }] },
{ role: 'user', content: 'Interrupting query' },
{ role: 'user', content: [{ type: 'text', text: 'read file' }] },
{
role: 'assistant',
content: [
{ type: 'thinking', thinking: 'I should use the read tool' },
{ type: 'tool_use', id: 'call_1', name: 'Read', input: { file_path: 'test.ts' } },
],
},
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'call_1', content: 'file contents here' },
],
},
],
max_tokens: 64,
stream: false,
})
const messages = requestBody?.messages as Array<Record<string, unknown>>
// The assistant msg is dropped because thinking is stripped.
// The two user messages are coalesced.
expect(messages.length).toBe(1)
expect(messages[0].role).toBe('user')
expect(String(messages[0].content)).toContain('Initial')
expect(String(messages[0].content)).toContain('Interrupting query')
const assistantMsg = messages.find(m => m.role === 'assistant' && m.tool_calls)
expect(assistantMsg).toBeDefined()
expect(assistantMsg!.reasoning_content).toBe('I should use the read tool')
})
test('injects semantic assistant message when tool result is followed by user message', async () => {
test('does not add reasoning_content on assistant messages without tool_calls', async () => {
process.env.CLAUDE_CODE_USE_OPENAI = '1'
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(JSON.stringify({
id: 'chatcmpl-2',
object: 'chat.completion',
created: 123456789,
model: 'mistral-large-latest',
choices: [{ message: { role: 'assistant', content: 'hi' }, finish_reason: 'stop' }],
usage: { prompt_tokens: 1, completion_tokens: 1, total_tokens: 2 }
}), { headers: { 'Content-Type': 'application/json' } })
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'mistral-large-latest',
messages: [
{
role: 'assistant',
content: [{ type: 'tool_use', id: 'call_1', name: 'search', input: {} }]
},
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'call_1', content: 'Result' }
]
},
{ role: 'user', content: 'Next user query' },
],
max_tokens: 64,
stream: false,
})
const messages = requestBody?.messages as Array<Record<string, unknown>>
// Roles should be: assistant (tool_calls) -> tool -> assistant (semantic) -> user
const roles = messages.map(m => m.role)
expect(roles).toEqual(['assistant', 'tool', 'assistant', 'user'])
const semanticMsg = messages[2]
expect(semanticMsg.role).toBe('assistant')
expect(semanticMsg.content).toBe('[Tool execution interrupted by user]')
})
test('Moonshot: uses max_tokens (not max_completion_tokens) and strips store', async () => {
process.env.OPENAI_BASE_URL = 'https://api.moonshot.ai/v1'
process.env.OPENAI_API_KEY = 'sk-moonshot-test'
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'kimi-k2.6',
choices: [
{ message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
],
usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
object: 'chat.completion',
model: 'test-model',
choices: [{ index: 0, message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' }],
usage: { prompt_tokens: 10, completion_tokens: 2, total_tokens: 12 },
}),
{ headers: { 'Content-Type': 'application/json' } },
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'kimi-k2.6',
system: 'you are kimi',
messages: [{ role: 'user', content: 'hi' }],
max_tokens: 256,
stream: false,
})
expect(requestBody?.max_tokens).toBe(256)
expect(requestBody?.max_completion_tokens).toBeUndefined()
expect(requestBody?.store).toBeUndefined()
})
test('Moonshot: echoes reasoning_content on assistant tool-call messages', async () => {
// Regression for: "API Error: 400 {"error":{"message":"thinking is enabled
// but reasoning_content is missing in assistant tool call message at index
// N"}}" when the agent sends a prior-turn assistant response back to Kimi.
// The thinking block captured from the inbound response must round-trip
// as reasoning_content on the outgoing echoed assistant message.
process.env.OPENAI_BASE_URL = 'https://api.moonshot.ai/v1'
process.env.OPENAI_API_KEY = 'sk-moonshot-test'
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'kimi-k2.6',
choices: [
{ message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
],
usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
}),
{ headers: { 'Content-Type': 'application/json' } },
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'kimi-k2.6',
system: 'you are kimi',
messages: [
{ role: 'user', content: 'check the logs' },
{
role: 'assistant',
content: [
{
type: 'thinking',
thinking: 'Need to inspect logs via Bash; running a cat.',
},
{ type: 'text', text: "I'll inspect the logs." },
{
type: 'tool_use',
id: 'call_bash_1',
name: 'Bash',
input: { command: 'cat /tmp/app.log' },
},
],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'call_bash_1',
content: 'log line 1\nlog line 2',
},
],
},
],
max_tokens: 256,
stream: false,
})
const messages = requestBody?.messages as Array<Record<string, unknown>>
const assistantWithToolCall = messages.find(
m => m.role === 'assistant' && Array.isArray(m.tool_calls),
)
expect(assistantWithToolCall).toBeDefined()
expect(assistantWithToolCall?.reasoning_content).toBe(
'Need to inspect logs via Bash; running a cat.',
)
})
test('non-Moonshot providers do NOT receive reasoning_content on assistant messages', async () => {
// Guard: only Moonshot opts in. DeepSeek/OpenRouter/etc. receive the
// outgoing assistant message without reasoning_content to avoid
// unknown-field rejections from strict servers.
process.env.OPENAI_BASE_URL = 'https://api.deepseek.com/v1'
process.env.OPENAI_API_KEY = 'sk-deepseek'
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'deepseek-chat',
choices: [
{ message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
],
usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
}),
{ headers: { 'Content-Type': 'application/json' } },
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'deepseek-chat',
system: 'test',
messages: [
{ role: 'user', content: 'hi' },
{
role: 'assistant',
content: [
{ type: 'thinking', thinking: 'thought' },
{ type: 'text', text: 'hello' },
{
type: 'tool_use',
id: 'call_1',
name: 'Bash',
input: { command: 'ls' },
},
],
},
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'call_1', content: 'files' },
],
},
],
max_tokens: 32,
stream: false,
})
const messages = requestBody?.messages as Array<Record<string, unknown>>
const assistantWithToolCall = messages.find(
m => m.role === 'assistant' && Array.isArray(m.tool_calls),
)
expect(assistantWithToolCall).toBeDefined()
expect(assistantWithToolCall?.reasoning_content).toBeUndefined()
})
test('Moonshot: cn host is also detected', async () => {
process.env.OPENAI_BASE_URL = 'https://api.moonshot.cn/v1'
process.env.OPENAI_API_KEY = 'sk-moonshot-test'
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'kimi-k2.6',
choices: [
{ message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
],
usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
}),
{ headers: { 'Content-Type': 'application/json' } },
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'kimi-k2.6',
system: 'you are kimi',
messages: [{ role: 'user', content: 'hi' }],
max_tokens: 256,
stream: false,
})
expect(requestBody?.store).toBeUndefined()
})
test('collapses multiple text blocks in tool_result to string for DeepSeek compatibility (issue #774)', async () => {
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'deepseek-reasoner',
choices: [
{
message: {
role: 'assistant',
content: 'done',
},
finish_reason: 'stop',
},
],
usage: {
prompt_tokens: 12,
completion_tokens: 4,
total_tokens: 16,
},
}),
{
headers: {
'Content-Type': 'application/json',
},
},
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
const client = createOpenAIShimClient({ defaultHeaders: {} }) as unknown as OpenAIShimClient
await client.beta.messages.create({
model: 'deepseek-reasoner',
system: 'test system',
messages: [
{ role: 'user', content: 'Run ls' },
{ role: 'user', content: [{ type: 'text', text: 'explain' }] },
{
role: 'assistant',
content: [
{
type: 'tool_use',
id: 'call_1',
name: 'Bash',
input: { command: 'ls' },
},
],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'call_1',
content: [
{ type: 'text', text: 'line one' },
{ type: 'text', text: 'line two' },
],
},
{ type: 'thinking', thinking: 'Let me think about this' },
{ type: 'text', text: 'Here is the explanation' },
],
},
{ role: 'user', content: [{ type: 'text', text: 'thanks' }] },
],
max_tokens: 64,
stream: false,
})
const messages = requestBody?.messages as Array<Record<string, unknown>>
const toolMessages = messages.filter(m => m.role === 'tool')
expect(toolMessages.length).toBe(1)
expect(toolMessages[0].tool_call_id).toBe('call_1')
expect(typeof toolMessages[0].content).toBe('string')
expect(toolMessages[0].content).toBe('line one\n\nline two')
})
test('collapses multiple text blocks into a single string for DeepSeek compatibility (issue #774)', async () => {
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'deepseek-reasoner',
choices: [
{
message: {
role: 'assistant',
content: 'done',
},
finish_reason: 'stop',
},
],
usage: {
prompt_tokens: 12,
completion_tokens: 4,
total_tokens: 16,
},
}),
{
headers: {
'Content-Type': 'application/json',
},
},
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'deepseek-reasoner',
system: 'test system',
messages: [
{
role: 'user',
content: [
{ type: 'text', text: 'Hello!' },
{ type: 'text', text: 'How are you?' },
],
},
],
max_tokens: 64,
stream: false,
})
const messages = requestBody?.messages as Array<Record<string, unknown>>
expect(messages.length).toBe(2) // system + user
expect(messages[1].role).toBe('user')
expect(typeof messages[1].content).toBe('string')
expect(messages[1].content).toBe('Hello!\n\nHow are you?')
})
test('preserves mixed text and image tool results as multipart content', async () => {
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'gpt-4o',
choices: [
{
message: {
role: 'assistant',
content: 'done',
},
finish_reason: 'stop',
},
],
usage: {
prompt_tokens: 12,
completion_tokens: 4,
total_tokens: 16,
},
}),
{
headers: {
'Content-Type': 'application/json',
},
},
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'gpt-4o',
system: 'test system',
messages: [
{ role: 'user', content: 'Show me' },
{
role: 'assistant',
content: [
{
type: 'tool_use',
id: 'call_1',
name: 'Bash',
input: { command: 'cat image.png' },
},
],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'call_1',
content: [
{ type: 'text', text: 'Here is the image:' },
{
type: 'image',
source: {
type: 'base64',
media_type: 'image/png',
data: 'iVBORw0KGgo=',
},
},
],
},
],
},
],
max_tokens: 64,
stream: false,
})
const messages = requestBody?.messages as Array<Record<string, unknown>>
const toolMessages = messages.filter(m => m.role === 'tool')
expect(toolMessages.length).toBe(1)
expect(Array.isArray(toolMessages[0].content)).toBe(true)
const content = toolMessages[0].content as Array<Record<string, unknown>>
expect(content.length).toBe(2)
expect(content[0].type).toBe('text')
expect(content[1].type).toBe('image_url')
})
const assistantMsg = messages.find(m => m.role === 'assistant' && !m.tool_calls)
expect(assistantMsg).toBeDefined()
expect(assistantMsg!.reasoning_content).toBeUndefined()
})

View File

@@ -46,15 +46,12 @@ import {
type AnthropicUsage,
type ShimCreateParams,
} from './codexShim.js'
import { compressToolHistory } from './compressToolHistory.js'
import { fetchWithProxyRetry } from './fetchWithProxyRetry.js'
import {
getLocalProviderRetryBaseUrls,
getGithubEndpointType,
isLocalProviderUrl,
resolveRuntimeCodexCredentials,
resolveProviderRequest,
shouldAttemptLocalToollessRetry,
getGithubEndpointType,
} from './providerConfig.js'
import {
buildOpenAICompatibilityErrorMessage,
@@ -67,8 +64,6 @@ import {
normalizeToolArguments,
hasToolFieldMapping,
} from './toolArgumentNormalization.js'
import { logApiCallStart, logApiCallEnd } from '../../utils/requestLogging.js'
import { createStreamState, processStreamChunk, getStreamStats } from '../../utils/streamingOptimizer.js'
type SecretValueSource = Partial<{
OPENAI_API_KEY: string
@@ -84,10 +79,6 @@ const GITHUB_429_MAX_RETRIES = 3
const GITHUB_429_BASE_DELAY_SEC = 1
const GITHUB_429_MAX_DELAY_SEC = 32
const GEMINI_API_HOST = 'generativelanguage.googleapis.com'
const MOONSHOT_API_HOSTS = new Set([
'api.moonshot.ai',
'api.moonshot.cn',
])
const COPILOT_HEADERS: Record<string, string> = {
'User-Agent': 'GitHubCopilotChat/0.26.7',
@@ -153,15 +144,6 @@ function hasGeminiApiHost(baseUrl: string | undefined): boolean {
}
}
function isMoonshotBaseUrl(baseUrl: string | undefined): boolean {
if (!baseUrl) return false
try {
return MOONSHOT_API_HOSTS.has(new URL(baseUrl).hostname.toLowerCase())
} catch {
return false
}
}
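A quick illustration of the host check above (outcomes follow directly from MOONSHOT_API_HOSTS):

// isMoonshotBaseUrl('https://api.moonshot.ai/v1') => true
// isMoonshotBaseUrl('https://api.moonshot.cn')    => true
// isMoonshotBaseUrl('https://api.openai.com/v1')  => false (not a Moonshot host)
// isMoonshotBaseUrl(undefined)                    => false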
function formatRetryAfterHint(response: Response): string {
const ra = response.headers.get('retry-after')
return ra ? ` (Retry-After: ${ra})` : ''
@@ -210,6 +192,7 @@ function sleepMs(ms: number): Promise<void> {
interface OpenAIMessage {
role: 'system' | 'user' | 'assistant' | 'tool'
content?: string | Array<{ type: string; text?: string; image_url?: { url: string } }>
reasoning_content?: string
tool_calls?: Array<{
id: string
type: 'function'
@@ -218,14 +201,6 @@ interface OpenAIMessage {
}>
tool_call_id?: string
name?: string
/**
* Per-assistant-message chain-of-thought, attached when echoing an
* assistant message back to providers that require it (notably Moonshot:
* "thinking is enabled but reasoning_content is missing in assistant
* tool call message at index N" 400). Derived from the Anthropic thinking
* block captured when the original response was translated.
*/
reasoning_content?: string
}
interface OpenAITool {
@@ -301,15 +276,6 @@ function convertToolResultContent(
const text = parts[0].text ?? ''
return isError ? `Error: ${text}` : text
}
// Collapse arrays of only text blocks into a single string for DeepSeek
// compatibility (issue #774). DeepSeek rejects arrays in role: "tool" messages.
const allText = parts.every(p => p.type === 'text')
if (allText) {
const text = parts.map(p => p.text ?? '').join('\n\n')
return isError ? `Error: ${text}` : text
}
if (isError && parts[0]?.type === 'text') {
parts[0] = { ...parts[0], text: `Error: ${parts[0].text ?? ''}` }
} else if (isError) {
@@ -368,14 +334,6 @@ function convertContentBlocks(
if (parts.length === 0) return ''
if (parts.length === 1 && parts[0].type === 'text') return parts[0].text ?? ''
// Collapse arrays of only text blocks into a single string for DeepSeek
// compatibility (issue #774).
const allText = parts.every(p => p.type === 'text')
if (allText) {
return parts.map(p => p.text ?? '').join('\n\n')
}
return parts
}
@@ -387,45 +345,19 @@ function isGeminiMode(): boolean {
}
function convertMessages(
messages: Array<{
role: string
message?: { role?: string; content?: unknown }
content?: unknown
}>,
messages: Array<{ role: string; message?: { role?: string; content?: unknown }; content?: unknown }>,
system: unknown,
options?: { preserveReasoningContent?: boolean },
): OpenAIMessage[] {
const preserveReasoningContent = options?.preserveReasoningContent === true
const result: OpenAIMessage[] = []
const knownToolCallIds = new Set<string>()
// Pre-scan for all tool results in the history to identify valid tool calls
const toolResultIds = new Set<string>()
for (const msg of messages) {
const inner = msg.message ?? msg
const content = (inner as { content?: unknown }).content
if (Array.isArray(content)) {
for (const block of content) {
if (
(block as { type?: string }).type === 'tool_result' &&
(block as { tool_use_id?: string }).tool_use_id
) {
toolResultIds.add((block as { tool_use_id: string }).tool_use_id)
}
}
}
}
// System message first
const sysText = convertSystemPrompt(system)
if (sysText) {
result.push({ role: 'system', content: sysText })
}
for (let i = 0; i < messages.length; i++) {
const msg = messages[i]
const isLastInHistory = i === messages.length - 1
for (const msg of messages) {
// Claude Code wraps messages in { role, message: { role, content } }
const inner = msg.message ?? msg
const role = (inner as { role?: string }).role ?? msg.role
@@ -434,12 +366,8 @@ function convertMessages(
if (role === 'user') {
// Check for tool_result blocks in user messages
if (Array.isArray(content)) {
const toolResults = content.filter(
(b: { type?: string }) => b.type === 'tool_result',
)
const otherContent = content.filter(
(b: { type?: string }) => b.type !== 'tool_result',
)
const toolResults = content.filter((b: { type?: string }) => b.type === 'tool_result')
const otherContent = content.filter((b: { type?: string }) => b.type !== 'tool_result')
// Emit tool results as tool messages, but ONLY if we have a matching tool_use ID.
// Mistral/OpenAI strictly require tool messages to follow an assistant message with tool_calls.
@@ -454,9 +382,7 @@ function convertMessages(
content: convertToolResultContent(tr.content, tr.is_error),
})
} else {
logForDebugging(
`Dropping orphan tool_result for ID: ${id} to prevent API error`,
)
logForDebugging(`Dropping orphan tool_result for ID: ${id} to prevent API error`)
}
}
@@ -476,12 +402,8 @@ function convertMessages(
} else if (role === 'assistant') {
// Check for tool_use blocks
if (Array.isArray(content)) {
const toolUses = content.filter(
(b: { type?: string }) => b.type === 'tool_use',
)
const thinkingBlock = content.find(
(b: { type?: string }) => b.type === 'thinking',
)
const toolUses = content.filter((b: { type?: string }) => b.type === 'tool_use')
const thinkingBlock = content.find((b: { type?: string }) => b.type === 'thinking')
const textContent = content.filter(
(b: { type?: string }) => b.type !== 'tool_use' && b.type !== 'thinking',
)
@@ -490,123 +412,80 @@ function convertMessages(
role: 'assistant',
content: (() => {
const c = convertContentBlocks(textContent)
return typeof c === 'string'
? c
: Array.isArray(c)
? c.map((p: { text?: string }) => p.text ?? '').join('')
: ''
return typeof c === 'string' ? c : Array.isArray(c) ? c.map((p: { text?: string }) => p.text ?? '').join('') : ''
})(),
}
// Providers that validate reasoning continuity (Moonshot: "thinking
// is enabled but reasoning_content is missing in assistant tool call
// message at index N" 400) need the original chain-of-thought echoed
// back on each assistant message that carries a tool_call. We kept
// the thinking block on the Anthropic side; re-attach it here as the
// `reasoning_content` field on the outgoing OpenAI-shaped message.
// Gated per-provider because other endpoints either ignore the field
// (harmless) or strict-reject unknown fields (harmful).
if (preserveReasoningContent) {
const thinkingText = (thinkingBlock as { thinking?: string } | undefined)?.thinking
if (typeof thinkingText === 'string' && thinkingText.trim().length > 0) {
assistantMsg.reasoning_content = thinkingText
}
}
if (toolUses.length > 0) {
const mappedToolCalls = toolUses
.map(
(tu: {
id?: string
name?: string
input?: unknown
extra_content?: Record<string, unknown>
signature?: string
}) => {
const id = tu.id ?? `call_${crypto.randomUUID().replace(/-/g, '')}`
// Preserve thinking text as reasoning_content for providers that
// require it on replayed assistant tool-call messages (e.g. Kimi,
// DeepSeek). Without this, follow-up requests fail with 400:
// "reasoning_content is missing in assistant tool call message".
// Note: only the first thinking block per turn is captured (.find);
// Anthropic's API typically produces one thinking block per turn.
if (thinkingBlock) {
assistantMsg.reasoning_content = (thinkingBlock as { thinking?: string }).thinking ?? ''
}
// Only keep tool calls that have a corresponding result in the history,
// or if it's the last message (prefill scenario).
// Orphaned tool calls (e.g. from user interruption) cause 400 errors.
if (!toolResultIds.has(id) && !isLastInHistory) {
return null
}
assistantMsg.tool_calls = toolUses.map(
(tu: {
id?: string
name?: string
input?: unknown
extra_content?: Record<string, unknown>
signature?: string
}) => {
const id = tu.id ?? `call_${crypto.randomUUID().replace(/-/g, '')}`
knownToolCallIds.add(id)
const toolCall: NonNullable<OpenAIMessage['tool_calls']>[number] = {
id,
type: 'function' as const,
function: {
name: tu.name ?? 'unknown',
arguments:
typeof tu.input === 'string'
? tu.input
: JSON.stringify(tu.input ?? {}),
},
}
knownToolCallIds.add(id)
const toolCall: NonNullable<
OpenAIMessage['tool_calls']
>[number] = {
id,
type: 'function' as const,
function: {
name: tu.name ?? 'unknown',
arguments:
typeof tu.input === 'string'
? tu.input
: JSON.stringify(tu.input ?? {}),
},
}
// Preserve existing extra_content if present
if (tu.extra_content) {
toolCall.extra_content = { ...tu.extra_content }
}
// Preserve existing extra_content if present
if (tu.extra_content) {
toolCall.extra_content = { ...tu.extra_content }
}
// Handle Gemini thought_signature
if (isGeminiMode()) {
// If the model provided a signature in the tool_use block itself (e.g. from a previous Turn/Step)
// Use thinkingBlock.signature for ALL tool calls in the same assistant turn if available.
// The API requires the same signature on every replayed function call part in a parallel set.
const signature = tu.signature ?? (thinkingBlock as any)?.signature
// Handle Gemini thought_signature
if (isGeminiMode()) {
// If the model provided a signature in the tool_use block itself (e.g. from a previous Turn/Step)
// Use thinkingBlock.signature for ALL tool calls in the same assistant turn if available.
// The API requires the same signature on every replayed function call part in a parallel set.
const signature =
tu.signature ?? (thinkingBlock as any)?.signature
// Merge into existing google-specific metadata if present
const existingGoogle =
(toolCall.extra_content?.google as Record<
string,
unknown
>) ?? {}
toolCall.extra_content = {
...toolCall.extra_content,
google: {
...existingGoogle,
thought_signature:
signature ?? 'skip_thought_signature_validator',
},
// Merge into existing google-specific metadata if present
const existingGoogle = (toolCall.extra_content?.google as Record<string, unknown>) ?? {}
toolCall.extra_content = {
...toolCall.extra_content,
google: {
...existingGoogle,
thought_signature: signature ?? "skip_thought_signature_validator"
}
}
}
return toolCall
},
)
.filter((tc): tc is NonNullable<typeof tc> => tc !== null)
if (mappedToolCalls.length > 0) {
assistantMsg.tool_calls = mappedToolCalls
}
return toolCall
},
)
}
// Only push assistant message if it has content or tool calls.
// Stripped thinking-only blocks from user interruptions are empty and cause 400s.
if (assistantMsg.content || assistantMsg.tool_calls?.length) {
result.push(assistantMsg)
}
result.push(assistantMsg)
} else {
const assistantMsg: OpenAIMessage = {
result.push({
role: 'assistant',
content: (() => {
const c = convertContentBlocks(content)
return typeof c === 'string'
? c
: Array.isArray(c)
? c.map((p: { text?: string }) => p.text ?? '').join('')
: ''
return typeof c === 'string' ? c : Array.isArray(c) ? c.map((p: { text?: string }) => p.text ?? '').join('') : ''
})(),
}
if (assistantMsg.content) {
result.push(assistantMsg)
}
})
}
}
}
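For orientation, a sketch of a replayed assistant tool-call message once reasoning_content has been re-attached (all values illustrative):

const replayed: OpenAIMessage = {
  role: 'assistant',
  content: '',
  reasoning_content: 'The build failed, so inspect the log before retrying.',
  tool_calls: [
    {
      id: 'call_1',
      type: 'function',
      function: { name: 'Bash', arguments: '{"command":"cat build.log"}' },
    },
  ],
}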
@@ -620,56 +499,25 @@ function convertMessages(
for (const msg of result) {
const prev = coalesced[coalesced.length - 1]
// Mistral/Devstral: 'tool' message must be followed by an 'assistant' message.
// If a 'tool' result is followed by a 'user' message, we must inject a semantic
// assistant response to satisfy the strict role sequence:
// ... -> assistant (calls) -> tool (results) -> assistant (semantic) -> user (next)
if (prev && prev.role === 'tool' && msg.role === 'user') {
coalesced.push({
role: 'assistant',
content: '[Tool execution interrupted by user]',
})
}
const lastAfterPossibleInjection = coalesced[coalesced.length - 1]
if (
lastAfterPossibleInjection &&
lastAfterPossibleInjection.role === msg.role &&
msg.role !== 'tool' &&
msg.role !== 'system'
) {
const prevContent = lastAfterPossibleInjection.content
if (prev && prev.role === msg.role && msg.role !== 'tool' && msg.role !== 'system') {
const prevContent = prev.content
const curContent = msg.content
if (typeof prevContent === 'string' && typeof curContent === 'string') {
lastAfterPossibleInjection.content =
prevContent + (prevContent && curContent ? '\n' : '') + curContent
prev.content = prevContent + (prevContent && curContent ? '\n' : '') + curContent
} else {
const toArray = (
c:
| string
| Array<{ type: string; text?: string; image_url?: { url: string } }>
| undefined,
): Array<{
type: string
text?: string
image_url?: { url: string }
}> => {
c: string | Array<{ type: string; text?: string; image_url?: { url: string } }> | undefined,
): Array<{ type: string; text?: string; image_url?: { url: string } }> => {
if (!c) return []
if (typeof c === 'string') return c ? [{ type: 'text', text: c }] : []
return c
}
lastAfterPossibleInjection.content = [
...toArray(prevContent),
...toArray(curContent),
]
prev.content = [...toArray(prevContent), ...toArray(curContent)]
}
if (msg.tool_calls?.length) {
lastAfterPossibleInjection.tool_calls = [
...(lastAfterPossibleInjection.tool_calls ?? []),
...msg.tool_calls,
]
prev.tool_calls = [...(prev.tool_calls ?? []), ...msg.tool_calls]
}
} else {
coalesced.push(msg)
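Concretely, the injected assistant turn repairs role sequences like this (a sketch of the role stream, not literal payloads):

// before: assistant(tool_calls) -> tool -> user   // Mistral/Devstral reject tool -> user
// after:  assistant(tool_calls) -> tool
//           -> assistant('[Tool execution interrupted by user]') -> user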
@@ -884,7 +732,6 @@ async function* openaiStreamToAnthropic(
let lastStopReason: 'tool_use' | 'max_tokens' | 'end_turn' | null = null
let hasEmittedFinalUsage = false
let hasProcessedFinishReason = false
const streamState = createStreamState()
// Emit message_start
yield {
@@ -1048,7 +895,6 @@ async function* openaiStreamToAnthropic(
delta: { type: 'text_delta', text: visible },
}
}
processStreamChunk(streamState, delta.content)
}
// Tool calls
@@ -1068,7 +914,6 @@ async function* openaiStreamToAnthropic(
const toolBlockIndex = contentBlockIndex
const initialArguments = tc.function.arguments ?? ''
const normalizeAtStop = hasToolFieldMapping(tc.function.name)
processStreamChunk(streamState, tc.function.arguments ?? '')
activeToolCalls.set(tc.index, {
id: tc.id,
name: tc.function.name,
@@ -1266,20 +1111,6 @@ async function* openaiStreamToAnthropic(
reader.releaseLock()
}
const stats = getStreamStats(streamState)
if (stats.totalChunks > 0) {
logForDebugging(
JSON.stringify({
type: 'stream_stats',
model,
total_chunks: stats.totalChunks,
first_token_ms: stats.firstTokenMs,
duration_ms: stats.durationMs,
}),
{ level: 'debug' },
)
}
yield { type: 'message_stop' }
}
@@ -1477,20 +1308,14 @@ class OpenAIShimMessages {
params: ShimCreateParams,
options?: { signal?: AbortSignal; headers?: Record<string, string> },
): Promise<Response> {
const compressedMessages = compressToolHistory(
const openaiMessages = convertMessages(
params.messages as Array<{
role: string
message?: { role?: string; content?: unknown }
content?: unknown
}>,
request.resolvedModel,
params.system,
)
const openaiMessages = convertMessages(compressedMessages, params.system, {
// Moonshot requires every assistant tool-call message to carry
// reasoning_content when its thinking feature is active. Echo it back
// from the thinking block we captured on the inbound response.
preserveReasoningContent: isMoonshotBaseUrl(request.baseUrl),
})
const body: Record<string, unknown> = {
model: request.resolvedModel,
@@ -1526,19 +1351,15 @@ class OpenAIShimMessages {
const isGithubCopilot = isGithub && githubEndpointType === 'copilot'
const isGithubModels = isGithub && (githubEndpointType === 'models' || githubEndpointType === 'custom')
const isMoonshot = isMoonshotBaseUrl(request.baseUrl)
if ((isGithub || isMistral || isLocal || isMoonshot) && body.max_completion_tokens !== undefined) {
if ((isGithub || isMistral || isLocal) && body.max_completion_tokens !== undefined) {
body.max_tokens = body.max_completion_tokens
delete body.max_completion_tokens
}
// mistral and gemini don't recognize body.store — Gemini returns 400
// "Invalid JSON payload received. Unknown name 'store': Cannot find field."
// Moonshot (api.moonshot.ai/.cn) has not published support for the
// parameter either; strip it preemptively to avoid the same class of
// error on strict-parse providers.
if (isMistral || isGeminiMode() || isMoonshot) {
// Strip store for providers that don't recognize it. Only OpenAI's own
// API supports this field — Gemini returns 400, local servers (vLLM,
// Ollama) reject unknown fields, and other providers silently ignore it.
if (isMistral || isGeminiMode() || isLocal) {
delete body.store
}
@@ -1618,95 +1439,48 @@ class OpenAIShimMessages {
headers['X-GitHub-Api-Version'] = '2022-11-28'
}
const buildChatCompletionsUrl = (baseUrl: string): string => {
// Azure Cognitive Services / Azure OpenAI require a deployment-specific
// path and an api-version query parameter.
if (isAzure) {
const apiVersion = process.env.AZURE_OPENAI_API_VERSION ?? '2024-12-01-preview'
const deployment = request.resolvedModel ?? process.env.OPENAI_MODEL ?? 'gpt-4o'
// If base URL already contains /deployments/, use it as-is with api-version.
if (/\/deployments\//i.test(baseUrl)) {
const normalizedBase = baseUrl.replace(/\/+$/, '')
return `${normalizedBase}/chat/completions?api-version=${apiVersion}`
}
// Strip trailing /v1 or /openai/v1 if present, then build Azure path.
const normalizedBase = baseUrl
.replace(/\/(openai\/)?v1\/?$/, '')
.replace(/\/+$/, '')
return `${normalizedBase}/openai/deployments/${deployment}/chat/completions?api-version=${apiVersion}`
// Build the chat completions URL
// Azure Cognitive Services / Azure OpenAI require a deployment-specific path
// and an api-version query parameter.
// Standard format: {base}/openai/deployments/{model}/chat/completions?api-version={version}
// Non-Azure: {base}/chat/completions
let chatCompletionsUrl: string
if (isAzure) {
const apiVersion = process.env.AZURE_OPENAI_API_VERSION ?? '2024-12-01-preview'
const deployment = request.resolvedModel ?? process.env.OPENAI_MODEL ?? 'gpt-4o'
// If base URL already contains /deployments/, use it as-is with api-version
if (/\/deployments\//i.test(request.baseUrl)) {
const base = request.baseUrl.replace(/\/+$/, '')
chatCompletionsUrl = `${base}/chat/completions?api-version=${apiVersion}`
} else {
// Strip trailing /v1 or /openai/v1 if present, then build Azure path
const base = request.baseUrl.replace(/\/(openai\/)?v1\/?$/, '').replace(/\/+$/, '')
chatCompletionsUrl = `${base}/openai/deployments/${deployment}/chat/completions?api-version=${apiVersion}`
}
return `${baseUrl}/chat/completions`
} else {
chatCompletionsUrl = `${request.baseUrl}/chat/completions`
}
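As a sketch of what the Azure branch produces (resource and deployment names here are made-up examples):

// base = 'https://myres.openai.azure.com', model = 'gpt-4o'
//   => https://myres.openai.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2024-12-01-preview
// base = 'https://myres.openai.azure.com/openai/deployments/prod-gpt4o' (already deployment-specific)
//   => https://myres.openai.azure.com/openai/deployments/prod-gpt4o/chat/completions?api-version=2024-12-01-preview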
const localRetryBaseUrls = isLocal
? getLocalProviderRetryBaseUrls(request.baseUrl)
: []
let activeBaseUrl = request.baseUrl
let chatCompletionsUrl = buildChatCompletionsUrl(activeBaseUrl)
const attemptedLocalBaseUrls = new Set<string>([activeBaseUrl])
let didRetryWithoutTools = false
const promoteNextLocalBaseUrl = (
reason: 'endpoint_not_found' | 'localhost_resolution_failed',
): boolean => {
for (const candidateBaseUrl of localRetryBaseUrls) {
if (attemptedLocalBaseUrls.has(candidateBaseUrl)) {
continue
}
const previousUrl = chatCompletionsUrl
attemptedLocalBaseUrls.add(candidateBaseUrl)
activeBaseUrl = candidateBaseUrl
chatCompletionsUrl = buildChatCompletionsUrl(activeBaseUrl)
logForDebugging(
`[OpenAIShim] self-heal retry reason=${reason} method=POST from=${redactUrlForDiagnostics(previousUrl)} to=${redactUrlForDiagnostics(chatCompletionsUrl)} model=${request.resolvedModel}`,
{ level: 'warn' },
)
return true
}
return false
}
let serializedBody = JSON.stringify(body)
const refreshSerializedBody = (): void => {
serializedBody = JSON.stringify(body)
}
const buildFetchInit = () => ({
const fetchInit = {
method: 'POST' as const,
headers,
body: serializedBody,
body: JSON.stringify(body),
signal: options?.signal,
})
}
const maxSelfHealAttempts = isLocal
? localRetryBaseUrls.length + 1
: 0
const maxAttempts = (isGithub ? GITHUB_429_MAX_RETRIES : 1) + maxSelfHealAttempts
const maxAttempts = isGithub ? GITHUB_429_MAX_RETRIES : 1
const throwClassifiedTransportError = (
error: unknown,
requestUrl: string,
preclassifiedFailure?: ReturnType<typeof classifyOpenAINetworkFailure>,
): never => {
if (options?.signal?.aborted) {
throw error
}
const failure =
preclassifiedFailure ??
classifyOpenAINetworkFailure(error, {
url: requestUrl,
})
const failure = classifyOpenAINetworkFailure(error, {
url: requestUrl,
})
const redactedUrl = redactUrlForDiagnostics(requestUrl)
const safeMessage =
redactSecretValueForDisplay(
@@ -1737,14 +1511,11 @@ class OpenAIShimMessages {
responseHeaders: Headers,
requestUrl: string,
rateHint = '',
preclassifiedFailure?: ReturnType<typeof classifyOpenAIHttpFailure>,
): never => {
const failure =
preclassifiedFailure ??
classifyOpenAIHttpFailure({
status,
body: errorBody,
})
const failure = classifyOpenAIHttpFailure({
status,
body: errorBody,
})
const redactedUrl = redactUrlForDiagnostics(requestUrl)
logForDebugging(
@@ -1764,21 +1535,12 @@ class OpenAIShimMessages {
}
let response: Response | undefined
const provider = request.baseUrl.includes('nvidia') ? 'nvidia-nim'
: request.baseUrl.includes('minimax') ? 'minimax'
: request.baseUrl.includes('localhost:11434') || request.baseUrl.includes('localhost:11435') ? 'ollama'
: request.baseUrl.includes('anthropic') ? 'anthropic'
: 'openai'
const { correlationId, startTime } = logApiCallStart(provider, request.resolvedModel)
for (let attempt = 0; attempt < maxAttempts; attempt++) {
try {
response = await fetchWithProxyRetry(
chatCompletionsUrl,
buildFetchInit(),
)
response = await fetchWithProxyRetry(chatCompletionsUrl, fetchInit)
} catch (error) {
const isAbortError =
options?.signal?.aborted === true ||
fetchInit.signal?.aborted === true ||
(typeof DOMException !== 'undefined' &&
error instanceof DOMException &&
error.name === 'AbortError') ||
@@ -1791,36 +1553,10 @@ class OpenAIShimMessages {
throw error
}
const failure = classifyOpenAINetworkFailure(error, {
url: chatCompletionsUrl,
})
if (
isLocal &&
failure.category === 'localhost_resolution_failed' &&
promoteNextLocalBaseUrl('localhost_resolution_failed')
) {
continue
}
throwClassifiedTransportError(error, chatCompletionsUrl, failure)
throwClassifiedTransportError(error, chatCompletionsUrl)
}
if (response.ok) {
let tokensIn = 0
let tokensOut = 0
// Skip clone() for streaming responses - it blocks until full body is received,
// defeating the purpose of streaming. Usage data is already sent via
// stream_options: { include_usage: true } and can be extracted from the stream.
if (!params.stream) {
try {
const clone = response.clone()
const data = await clone.json()
tokensIn = data.usage?.prompt_tokens ?? 0
tokensOut = data.usage?.completion_tokens ?? 0
} catch { /* ignore */ }
}
logApiCallEnd(correlationId, startTime, request.resolvedModel, 'success', tokensIn, tokensOut, false)
return response
}
@@ -1909,10 +1645,6 @@ class OpenAIShimMessages {
return responsesResponse
}
const responsesErrorBody = await responsesResponse.text().catch(() => 'unknown error')
const responsesFailure = classifyOpenAIHttpFailure({
status: responsesResponse.status,
body: responsesErrorBody,
})
let responsesErrorResponse: object | undefined
try { responsesErrorResponse = JSON.parse(responsesErrorBody) } catch { /* raw text */ }
throwClassifiedHttpError(
@@ -1921,49 +1653,10 @@ class OpenAIShimMessages {
responsesErrorResponse,
responsesResponse.headers,
responsesUrl,
'',
responsesFailure,
)
}
}
const failure = classifyOpenAIHttpFailure({
status: response.status,
body: errorBody,
})
if (
isLocal &&
failure.category === 'endpoint_not_found' &&
promoteNextLocalBaseUrl('endpoint_not_found')
) {
continue
}
const hasToolsPayload =
Array.isArray(body.tools) &&
body.tools.length > 0
if (
!didRetryWithoutTools &&
failure.category === 'tool_call_incompatible' &&
shouldAttemptLocalToollessRetry({
baseUrl: activeBaseUrl,
hasTools: hasToolsPayload,
})
) {
didRetryWithoutTools = true
delete body.tools
delete body.tool_choice
refreshSerializedBody()
logForDebugging(
`[OpenAIShim] self-heal retry reason=tool_call_incompatible mode=toolless method=POST url=${redactUrlForDiagnostics(chatCompletionsUrl)} model=${request.resolvedModel}`,
{ level: 'warn' },
)
continue
}
let errorResponse: object | undefined
try { errorResponse = JSON.parse(errorBody) } catch { /* raw text */ }
throwClassifiedHttpError(
@@ -1973,7 +1666,6 @@ class OpenAIShimMessages {
response.headers as unknown as Headers,
chatCompletionsUrl,
rateHint,
failure,
)
}

View File

@@ -2,10 +2,8 @@ import { afterEach, expect, test } from 'bun:test'
import {
getAdditionalModelOptionsCacheScope,
getLocalProviderRetryBaseUrls,
isLocalProviderUrl,
resolveProviderRequest,
shouldAttemptLocalToollessRetry,
} from './providerConfig.js'
const originalEnv = {
@@ -85,42 +83,3 @@ test('skips local model cache scope for remote openai-compatible providers', ()
expect(getAdditionalModelOptionsCacheScope()).toBeNull()
})
test('derives local retry base URLs with /v1 and loopback fallback candidates', () => {
expect(getLocalProviderRetryBaseUrls('http://localhost:11434')).toEqual([
'http://localhost:11434/v1',
'http://127.0.0.1:11434',
'http://127.0.0.1:11434/v1',
])
})
test('does not derive local retry base URLs for remote providers', () => {
expect(getLocalProviderRetryBaseUrls('https://api.openai.com/v1')).toEqual([])
})
test('enables local toolless retry for likely Ollama endpoints with tools', () => {
expect(
shouldAttemptLocalToollessRetry({
baseUrl: 'http://localhost:11434/v1',
hasTools: true,
}),
).toBe(true)
})
test('disables local toolless retry when no tools are present', () => {
expect(
shouldAttemptLocalToollessRetry({
baseUrl: 'http://localhost:11434/v1',
hasTools: false,
}),
).toBe(false)
})
test('disables local toolless retry for non-Ollama local endpoints', () => {
expect(
shouldAttemptLocalToollessRetry({
baseUrl: 'http://localhost:1234/v1',
hasTools: true,
}),
).toBe(false)
})

View File

@@ -305,101 +305,6 @@ export function isLocalProviderUrl(baseUrl: string | undefined): boolean {
}
}
function trimTrailingSlash(value: string): string {
return value.replace(/\/+$/, '')
}
function normalizePathWithV1(pathname: string): string {
const trimmed = trimTrailingSlash(pathname)
if (!trimmed || trimmed === '/') {
return '/v1'
}
if (trimmed.toLowerCase().endsWith('/v1')) {
return trimmed
}
return `${trimmed}/v1`
}
function isLikelyOllamaEndpoint(baseUrl: string): boolean {
try {
const parsed = new URL(baseUrl)
const hostname = parsed.hostname.toLowerCase()
const pathname = parsed.pathname.toLowerCase()
if (parsed.port === '11434') {
return true
}
return (
hostname.includes('ollama') ||
pathname.includes('ollama')
)
} catch {
return false
}
}
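Assuming the heuristics above, a few illustrative calls:

// isLikelyOllamaEndpoint('http://localhost:11434')      => true  (default Ollama port)
// isLikelyOllamaEndpoint('http://ollama.internal:8080') => true  (hostname contains 'ollama')
// isLikelyOllamaEndpoint('http://localhost:1234/v1')    => false (some other local server)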
export function getLocalProviderRetryBaseUrls(baseUrl: string): string[] {
if (!isLocalProviderUrl(baseUrl)) {
return []
}
try {
const parsed = new URL(baseUrl)
const original = trimTrailingSlash(parsed.toString())
const seen = new Set<string>([original])
const candidates: string[] = []
const addCandidate = (hostname: string, pathname: string): void => {
const next = new URL(parsed.toString())
next.hostname = hostname
next.pathname = pathname
next.search = ''
next.hash = ''
const normalized = trimTrailingSlash(next.toString())
if (seen.has(normalized)) {
return
}
seen.add(normalized)
candidates.push(normalized)
}
const v1Pathname = normalizePathWithV1(parsed.pathname)
if (v1Pathname !== trimTrailingSlash(parsed.pathname)) {
addCandidate(parsed.hostname, v1Pathname)
}
const hostname = parsed.hostname.toLowerCase().replace(/^\[|\]$/g, '')
if (hostname === 'localhost' || hostname === '::1') {
addCandidate('127.0.0.1', parsed.pathname || '/')
addCandidate('127.0.0.1', v1Pathname)
}
return candidates
} catch {
return []
}
}
export function shouldAttemptLocalToollessRetry(options: {
baseUrl: string
hasTools: boolean
}): boolean {
if (!options.hasTools) {
return false
}
if (!isLocalProviderUrl(options.baseUrl)) {
return false
}
return isLikelyOllamaEndpoint(options.baseUrl)
}
export function isCodexBaseUrl(baseUrl: string | undefined): boolean {
if (!baseUrl) return false
try {
@@ -507,9 +412,6 @@ export function resolveProviderRequest(options?: {
? normalizedGeminiEnvBaseUrl
: asNamedEnvUrl(process.env.OPENAI_BASE_URL, 'OPENAI_BASE_URL')
// In Mistral mode, a literal "undefined" MISTRAL_BASE_URL is treated as
// misconfiguration and falls back to OPENAI_API_BASE, then
// DEFAULT_MISTRAL_BASE_URL for a safe default endpoint.
const fallbackEnvBaseUrl = isMistralMode
? (primaryEnvBaseUrl === undefined
? asNamedEnvUrl(process.env.OPENAI_API_BASE, 'OPENAI_API_BASE') ?? DEFAULT_MISTRAL_BASE_URL

View File

@@ -1,191 +0,0 @@
import { describe, expect, test } from 'bun:test'
import {
routeModel,
type SmartRoutingConfig,
} from './smartModelRouting.ts'
const ENABLED: SmartRoutingConfig = {
enabled: true,
simpleModel: 'claude-haiku-4-5',
strongModel: 'claude-opus-4-7',
}
describe('routeModel — disabled / misconfigured', () => {
test('disabled config routes to strong', () => {
const decision = routeModel(
{ userText: 'hi' },
{ ...ENABLED, enabled: false },
)
expect(decision.model).toBe('claude-opus-4-7')
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('disabled')
})
test('missing simpleModel falls back to strong', () => {
const decision = routeModel(
{ userText: 'hi' },
{ ...ENABLED, simpleModel: '' },
)
expect(decision.model).toBe('claude-opus-4-7')
expect(decision.complexity).toBe('strong')
})
test('simpleModel === strongModel routes to strong (no-op)', () => {
const decision = routeModel(
{ userText: 'hi' },
{ ...ENABLED, simpleModel: 'claude-opus-4-7' },
)
expect(decision.model).toBe('claude-opus-4-7')
expect(decision.complexity).toBe('strong')
})
})
describe('routeModel — simple path', () => {
test('short greeting routes to simple', () => {
const decision = routeModel({ userText: 'thanks!', turnNumber: 5 }, ENABLED)
expect(decision.model).toBe('claude-haiku-4-5')
expect(decision.complexity).toBe('simple')
})
test('empty input routes to simple', () => {
const decision = routeModel({ userText: ' ' }, ENABLED)
expect(decision.model).toBe('claude-haiku-4-5')
expect(decision.complexity).toBe('simple')
})
test('mid-length chatter routes to simple', () => {
const decision = routeModel(
{ userText: 'yep looks good, go ahead', turnNumber: 10 },
ENABLED,
)
expect(decision.complexity).toBe('simple')
})
})
describe('routeModel — strong path', () => {
test('first turn always routes to strong, even when short', () => {
const decision = routeModel(
{ userText: 'fix the bug', turnNumber: 1 },
ENABLED,
)
expect(decision.model).toBe('claude-opus-4-7')
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('first turn')
})
test('code fence routes to strong', () => {
const decision = routeModel(
{
userText: 'change this:\n```\nfoo()\n```',
turnNumber: 5,
},
ENABLED,
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('code')
})
test('inline code span routes to strong', () => {
const decision = routeModel(
{ userText: 'rename `foo` to `bar`', turnNumber: 5 },
ENABLED,
)
expect(decision.complexity).toBe('strong')
})
test('reasoning keyword "plan" routes to strong even when short', () => {
const decision = routeModel(
{ userText: 'plan the refactor', turnNumber: 5 },
ENABLED,
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('keyword')
})
test('reasoning keyword "debug" routes to strong', () => {
const decision = routeModel(
{ userText: 'debug the test', turnNumber: 5 },
ENABLED,
)
expect(decision.complexity).toBe('strong')
})
test('"root cause" multi-word keyword routes to strong', () => {
const decision = routeModel(
{ userText: 'find the root cause', turnNumber: 5 },
ENABLED,
)
expect(decision.complexity).toBe('strong')
})
test('multi-paragraph input routes to strong', () => {
const decision = routeModel(
{
userText: 'first thought.\n\nsecond thought.',
turnNumber: 5,
},
ENABLED,
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('multi-paragraph')
})
test('over-long input routes to strong', () => {
const long = 'ok '.repeat(100) // ~300 chars, 100 words
const decision = routeModel(
{ userText: long, turnNumber: 5 },
ENABLED,
)
expect(decision.complexity).toBe('strong')
})
test('exactly at the boundary stays simple', () => {
const text = 'a'.repeat(160)
const decision = routeModel(
{ userText: text, turnNumber: 5 },
{ ...ENABLED, simpleMaxChars: 160, simpleMaxWords: 28 },
)
expect(decision.complexity).toBe('simple')
})
test('one char over the boundary routes to strong', () => {
const text = 'a'.repeat(161)
const decision = routeModel(
{ userText: text, turnNumber: 5 },
{ ...ENABLED, simpleMaxChars: 160, simpleMaxWords: 28 },
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('160 chars')
})
})
describe('routeModel — config overrides', () => {
test('custom simpleMaxChars is honored', () => {
const decision = routeModel(
{ userText: 'abcdefghijklmnop', turnNumber: 5 },
{ ...ENABLED, simpleMaxChars: 10 },
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('10 chars')
})
test('custom simpleMaxWords is honored', () => {
const decision = routeModel(
{ userText: 'one two three four five', turnNumber: 5 },
{ ...ENABLED, simpleMaxWords: 3 },
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('3 words')
})
})
describe('routeModel — reason strings', () => {
test('simple decisions include char + word counts', () => {
const decision = routeModel(
{ userText: 'sounds good', turnNumber: 5 },
ENABLED,
)
expect(decision.reason).toMatch(/\d+ chars, \d+ words/)
})
})

View File

@@ -1,215 +0,0 @@
/**
* Smart model routing — cheap-for-simple, strong-for-hard.
*
* For everyday short chatter ("ok", "thanks", "what does this do?") the
* incremental quality of Opus/GPT-5 over Haiku/Mini is negligible while the
* cost and latency are an order of magnitude worse. Smart routing opts a
* user into routing such "obviously simple" turns to a cheaper model while
* keeping the strong model for the anything-non-trivial path.
*
* This module is a pure primitive: it takes a turn description (the user's
* text + light context) and returns which model to use, based on config.
* It never reads env vars or state directly — caller supplies everything.
*
* Off by default. Users opt in via settings.smartRouting.enabled. Intent:
* make this a small, copy-pasteable config block rather than a hidden
* heuristic, so the tradeoff is visible and the user controls it.
*/
export type SmartRoutingConfig = {
enabled: boolean
/** Model to use for turns classified as "simple". */
simpleModel: string
/** Model to use for turns classified as "strong" (or when unsure). */
strongModel: string
/** Max characters in user input to qualify as "simple". Default 160. */
simpleMaxChars?: number
/** Max whitespace-separated words to qualify as "simple". Default 28. */
simpleMaxWords?: number
}
export type RoutingDecision = {
model: string
complexity: 'simple' | 'strong'
/** Human-readable reason — useful for the UI indicator and debug logs. */
reason: string
}
export type RoutingInput = {
/** The user's message text for this turn. */
userText: string
/**
* Optional: how many tool-use blocks the assistant has emitted in the
* recent conversation. High values correlate with "continue this work"
* follow-ups that can still be cheap, UNLESS the user also typed code
* or strong-keyword text.
*/
recentToolUses?: number
/**
* Optional: turn number within the current session (1-indexed). The first
* turn is often task-setup and benefits from the strong model even if
* short — a bare "build X" opens the whole task.
*/
turnNumber?: number
}
const DEFAULT_SIMPLE_MAX_CHARS = 160
const DEFAULT_SIMPLE_MAX_WORDS = 28
// Keywords that strongly suggest reasoning/planning/design work.
// Matching is word-boundary / case-insensitive. Must include enough anchors
// that short prompts like "plan the refactor" route to strong even under
// the char/word cutoff.
const STRONG_KEYWORDS = [
'plan',
'design',
'architect',
'architecture',
'refactor',
'debug',
'investigate',
'analyze',
'analyse',
'implement',
'optimize',
'optimise',
'review',
'audit',
'diagnose',
'root cause',
'root-cause',
'why does',
'why is',
'how should',
'why did',
'propose',
'trace',
'reproduce',
]
const STRONG_KEYWORD_RE = new RegExp(
`\\b(?:${STRONG_KEYWORDS.map(k => k.replace(/[-]/g, '[-\\s]')).join('|')})\\b`,
'i',
)
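Sample matches for the resulting pattern (the alternation is built from STRONG_KEYWORDS above):

// STRONG_KEYWORD_RE.test('plan the refactor')   => true  ('plan')
// STRONG_KEYWORD_RE.test('find the root cause') => true  ('root cause' and 'root-cause' both match)
// STRONG_KEYWORD_RE.test('thanks, looks good')  => false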
const CODE_FENCE_RE = /```[\s\S]*?```|`[^`\n]+`/
function countWords(text: string): number {
const trimmed = text.trim()
if (!trimmed) return 0
return trimmed.split(/\s+/).length
}
function hasMultiParagraph(text: string): boolean {
return /\n\s*\n/.test(text)
}
function hasCode(text: string): boolean {
return CODE_FENCE_RE.test(text)
}
function hasStrongKeyword(text: string): boolean {
return STRONG_KEYWORD_RE.test(text)
}
/**
* Decide whether to route to the simple or strong model based on heuristics.
* Returns the chosen model + a reason. When routing is disabled or both
* models match, the strong model is used (safe default).
*/
export function routeModel(
input: RoutingInput,
config: SmartRoutingConfig,
): RoutingDecision {
if (!config.enabled) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'smart-routing disabled',
}
}
if (!config.simpleModel || !config.strongModel) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'simpleModel or strongModel missing from config',
}
}
if (config.simpleModel === config.strongModel) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'simpleModel equals strongModel',
}
}
const text = input.userText ?? ''
const trimmed = text.trim()
if (!trimmed) {
// Empty input (e.g. resuming a tool-use chain) — cheap by default.
return {
model: config.simpleModel,
complexity: 'simple',
reason: 'empty user text',
}
}
// First turn of a session is task-setup — always use strong.
if (input.turnNumber === 1) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'first turn of session',
}
}
const maxChars = config.simpleMaxChars ?? DEFAULT_SIMPLE_MAX_CHARS
const maxWords = config.simpleMaxWords ?? DEFAULT_SIMPLE_MAX_WORDS
if (hasCode(trimmed)) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'contains code block or inline code',
}
}
if (hasStrongKeyword(trimmed)) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'contains reasoning/planning keyword',
}
}
if (hasMultiParagraph(trimmed)) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'multi-paragraph input',
}
}
if (trimmed.length > maxChars) {
return {
model: config.strongModel,
complexity: 'strong',
reason: `input > ${maxChars} chars`,
}
}
if (countWords(trimmed) > maxWords) {
return {
model: config.strongModel,
complexity: 'strong',
reason: `input > ${maxWords} words`,
}
}
return {
model: config.simpleModel,
complexity: 'simple',
reason: `short (${trimmed.length} chars, ${countWords(trimmed)} words)`,
}
}
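A minimal usage sketch (model names reused from the tests above; the reason string is what routeModel would actually produce):

const decision = routeModel(
  { userText: 'thanks!', turnNumber: 5 },
  { enabled: true, simpleModel: 'claude-haiku-4-5', strongModel: 'claude-opus-4-7' },
)
// => { model: 'claude-haiku-4-5', complexity: 'simple', reason: 'short (7 chars, 1 words)' }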

View File

@@ -70,7 +70,7 @@ describe('runAutoFixCheck', () => {
test('handles timeout gracefully', async () => {
const result = await runAutoFixCheck({
lint: 'node -e "setTimeout(() => {}, 10000)"',
lint: 'sleep 10',
timeout: 100,
cwd: '/tmp',

View File

@@ -46,31 +46,14 @@ async function runCommand(
const killTree = () => {
try {
if (isWindows && proc.pid) {
// shell=true on Windows can leave child commands running unless we
// terminate the full process tree.
const killer = spawn('taskkill', ['/pid', String(proc.pid), '/T', '/F'], {
windowsHide: true,
stdio: 'ignore',
})
killer.unref()
return
}
if (proc.pid) {
if (!isWindows && proc.pid) {
// Kill the entire process group
process.kill(-proc.pid, 'SIGTERM')
return
}
proc.kill('SIGTERM')
} catch {
// Process may have already exited; fallback to direct child kill.
try {
} else {
proc.kill('SIGTERM')
} catch {
// Ignore final fallback errors.
}
} catch {
// Process may have already exited
}
}

View File

@@ -16,21 +16,12 @@ describe('getEffectiveContextWindowSize', () => {
// 8k minus 20k summary reservation = -12k, causing infinite auto-compact.
// Now the fallback is 128k and there's a floor, so effective is always
// at least reservedTokensForSummary + buffer.
//
// The exact floor depends on the max-output-tokens slot-reservation cap
// (tengu_otk_slot_v1 GrowthBook flag). With cap enabled, the model's
// default output cap drops to CAPPED_DEFAULT_MAX_TOKENS (8k), so the
// summary reservation is 8k and the floor is 8k + 13k = 21k. With cap
// disabled it's 20k + 13k = 33k. Assert the worst case so the test is
// stable regardless of flag state in CI vs local.
process.env.CLAUDE_CODE_USE_OPENAI = '1'
try {
const effective = getEffectiveContextWindowSize('some-unknown-3p-model')
expect(effective).toBeGreaterThan(0)
// 21k = CAPPED_DEFAULT_MAX_TOKENS (8k) + AUTOCOMPACT_BUFFER_TOKENS (13k).
// Covers the anti-regression intent of issue #635 without assuming
// the GrowthBook flag state.
expect(effective).toBeGreaterThanOrEqual(21_000)
// Must be at least summary reservation (20k) + buffer (13k) = 33k
expect(effective).toBeGreaterThanOrEqual(33_000)
} finally {
delete process.env.CLAUDE_CODE_USE_OPENAI
}

View File

@@ -38,7 +38,7 @@ export const TIME_BASED_MC_CLEARED_MESSAGE = '[Old tool result content cleared]'
const IMAGE_MAX_TOKEN_SIZE = 2000
// Only compact these built-in tools (MCP tools are also compactable via prefix match)
export const COMPACTABLE_TOOLS = new Set<string>([
const COMPACTABLE_TOOLS = new Set<string>([
FILE_READ_TOOL_NAME,
...SHELL_TOOL_NAMES,
GREP_TOOL_NAME,
@@ -51,7 +51,7 @@ export const COMPACTABLE_TOOLS = new Set<string>([
const MCP_TOOL_PREFIX = 'mcp__'
export function isCompactableTool(name: string): boolean {
function isCompactableTool(name: string): boolean {
return COMPACTABLE_TOOLS.has(name) || name.startsWith(MCP_TOOL_PREFIX)
}

View File

@@ -2524,7 +2524,7 @@ export async function transformResultContent(
return [
{
type: 'text',
text: recursivelySanitizeUnicode(resultContent.text) as string,
text: resultContent.text,
},
]
case 'audio': {
@@ -2569,9 +2569,7 @@ export async function transformResultContent(
return [
{
type: 'text',
text: recursivelySanitizeUnicode(
`${prefix}${resource.text}`,
) as string,
text: `${prefix}${resource.text}`,
},
]
} else if ('blob' in resource) {

View File

@@ -223,49 +223,6 @@ export function bytesPerTokenForFileType(fileExtension: string): number {
}
}
/**
* Tokenizer ratio by model family.
* Different models have different encodings.
*/
export interface ModelTokenizerConfig {
modelFamily: string
bytesPerToken: number
supportsJson: boolean
supportsCode: boolean
}
export const MODEL_TOKENIZER_CONFIGS: ModelTokenizerConfig[] = [
{ modelFamily: 'claude', bytesPerToken: 3.5, supportsJson: true, supportsCode: true },
{ modelFamily: 'gpt-4', bytesPerToken: 4, supportsJson: true, supportsCode: true },
{ modelFamily: 'gpt-3.5', bytesPerToken: 4, supportsJson: true, supportsCode: true },
{ modelFamily: 'gemini', bytesPerToken: 3.5, supportsJson: true, supportsCode: true },
{ modelFamily: 'llama', bytesPerToken: 3.8, supportsJson: true, supportsCode: true },
{ modelFamily: 'deepseek', bytesPerToken: 3.5, supportsJson: true, supportsCode: true },
{ modelFamily: 'minimax', bytesPerToken: 3.2, supportsJson: true, supportsCode: true },
]
/**
* Get tokenizer config for a model.
*/
export function getTokenizerConfig(model: string): ModelTokenizerConfig {
const lower = model.toLowerCase()
for (const config of MODEL_TOKENIZER_CONFIGS) {
if (lower.includes(config.modelFamily)) {
return config
}
}
return { modelFamily: 'unknown', bytesPerToken: 4, supportsJson: true, supportsCode: true }
}
/**
* Get bytes-per-token ratio for a model.
*/
export function getBytesPerTokenForModel(model: string): number {
return getTokenizerConfig(model).bytesPerToken
}
/**
* Like {@link roughTokenCountEstimation} but uses a more accurate
* bytes-per-token ratio when the file type is known.
@@ -284,106 +241,6 @@ export function roughTokenCountEstimationForFileType(
)
}
/**
* Content type classification for compression ratio.
*/
export type ContentType =
| 'json' | 'code' | 'prose' | 'technical'
| 'list' | 'table' | 'mixed'
/**
* Compression ratio by content type.
* Measured empirically - denser content = lower ratio.
*/
export const COMPRESSION_RATIOS: Record<ContentType, { min: number; max: number; typical: number }> = {
json: { min: 1.5, max: 2.5, typical: 2 },
code: { min: 3, max: 4.5, typical: 3.5 },
prose: { min: 3.5, max: 4.5, typical: 4 },
technical: { min: 2.5, max: 3.5, typical: 3 },
list: { min: 2, max: 3, typical: 2.5 },
table: { min: 1.8, max: 2.8, typical: 2.2 },
mixed: { min: 3, max: 4, typical: 3.5 },
}
/**
* Detect content type from content.
*/
export function detectContentType(content: string): ContentType {
const trimmed = content.trim()
// JSON
if ((trimmed.startsWith('{') && trimmed.endsWith('}')) ||
(trimmed.startsWith('[') && trimmed.endsWith(']'))) {
try {
JSON.parse(trimmed)
return 'json'
} catch { /* not valid json */ }
}
// Table (tabs or consistent delimiters)
const lines = trimmed.split('\n')
if (lines.length > 2) {
const hasTabs = lines[0].includes('\t')
const hasCommas = lines[0].includes(',')
if (hasTabs || hasCommas) {
const consistent = lines.slice(1).every(l => l.includes('\t') || l.includes(','))
if (consistent) return 'table'
}
}
// List
if (/^[\d\-\*\•]/.test(trimmed) || /^[\d\-\*\•]/.test(lines[0])) {
return 'list'
}
// Code (high density of special chars)
const codeChars = (content.match(/[{}()\[\];=]/g) || []).length
const codeRatio = codeChars / content.length
if (codeRatio > 0.05) return 'code'
// Technical (has numbers and units)
if (/\d+\s*(px|em|rem|%|ms|s|kb|mb|gb)/i.test(content)) {
return 'technical'
}
// Prose (default - natural language)
return 'prose'
}
/**
* Get compression ratio for content.
*/
export function getCompressionRatio(content: string, type?: ContentType): { ratio: number; min: number; max: number } {
const detectedType = type ?? detectContentType(content)
const { min, max, typical } = COMPRESSION_RATIOS[detectedType]
// Adjust based on actual content length
// Shorter content = higher variance
const lengthBonus = content.length < 100 ? 0.5 : 0
return {
ratio: typical,
min: min + lengthBonus,
max: max + lengthBonus,
}
}
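A worked example of the ratio selection (values follow COMPRESSION_RATIOS plus the short-content bonus):

// detectContentType('{"a":1}') => 'json' (parses as valid JSON)
// getCompressionRatio('{"a":1}')
//   => { ratio: 2, min: 2, max: 3 }  // json typical = 2; content under 100 chars adds +0.5 to min/max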
/**
* Estimate tokens with confidence bounds.
*/
export function estimateWithBounds(
content: string,
type?: ContentType,
): { estimate: number; min: number; max: number } {
const { ratio, min: minRatio, max: maxRatio } = getCompressionRatio(content, type)
const estimate = roughTokenCountEstimation(content, ratio)
const min = roughTokenCountEstimation(content, maxRatio)
const max = roughTokenCountEstimation(content, minRatio)
return { estimate, min, max }
}
/**
* Estimates token count for a Message object by extracting and analyzing its text content.
* This provides a more reliable estimate than getTokenUsage for messages that may have been compacted.

View File

@@ -1,100 +0,0 @@
import { describe, expect, it } from 'bun:test'
import {
getTokenizerConfig,
getBytesPerTokenForModel,
detectContentType,
getCompressionRatio,
estimateWithBounds,
} from './tokenEstimation.js'
describe('Model Tokenizers', () => {
describe('getTokenizerConfig', () => {
it('returns config for claude models', () => {
const config = getTokenizerConfig('claude-sonnet-4-5-20250514')
expect(config.modelFamily).toBe('claude')
expect(config.bytesPerToken).toBe(3.5)
})
it('returns config for gpt models', () => {
const config = getTokenizerConfig('gpt-4')
expect(config.modelFamily).toBe('gpt-4')
expect(config.bytesPerToken).toBe(4)
})
it('returns default for unknown models', () => {
const config = getTokenizerConfig('unknown-model')
expect(config.modelFamily).toBe('unknown')
expect(config.bytesPerToken).toBe(4)
})
})
describe('getBytesPerTokenForModel', () => {
it('returns bytes per token for model', () => {
expect(getBytesPerTokenForModel('claude-opus-3-5-20250214')).toBe(3.5)
expect(getBytesPerTokenForModel('gpt-4o')).toBe(4)
expect(getBytesPerTokenForModel('deepseek-chat')).toBe(3.5)
expect(getBytesPerTokenForModel('minimax-M2.7')).toBe(3.2)
})
})
})
describe('Content Type Detection', () => {
describe('detectContentType', () => {
it('detects JSON', () => {
expect(detectContentType('{"key": "value"}')).toBe('json')
expect(detectContentType('[1, 2, 3]')).toBe('json')
})
it('detects code', () => {
expect(detectContentType('function test() { return 1 + 2; }')).toBe('code')
expect(detectContentType('const x = () => {}')).toBe('code')
})
it('detects prose', () => {
expect(detectContentType('This is a natural language response.')).toBe('prose')
expect(detectContentType('Hello world how are you?')).toBe('prose')
})
it('detects code-like technical', () => {
// Has both code chars and technical units; the code check runs before the technical check, so 'code' wins
expect(detectContentType('margin: 10px; padding: 5px;')).toBe('code')
})
it('detects list', () => {
expect(detectContentType('- item 1\n- item 2')).toBe('list')
expect(detectContentType('1. first\n2. second')).toBe('list')
})
it('detects prose by default', () => {
// Two short lines with no delimiters: the table check needs >2 lines and the list check needs a leading marker, so it falls through to prose
expect(detectContentType('a b c\n1 2 3')).toBe('prose')
})
})
})
describe('Compression Ratio', () => {
describe('getCompressionRatio', () => {
it('returns appropriate ratios', () => {
expect(getCompressionRatio('{"a":1}').ratio).toBe(2)
expect(getCompressionRatio('code here {} []').ratio).toBe(3.5)
expect(getCompressionRatio('Hello world').ratio).toBe(4)
})
})
describe('estimateWithBounds', () => {
it('returns estimate with bounds', () => {
const result = estimateWithBounds('Hello world')
expect(result.min).toBeLessThanOrEqual(result.estimate)
expect(result.max).toBeGreaterThanOrEqual(result.estimate)
expect(result.min).toBeLessThan(result.max)
})
it('handles JSON with tighter bounds', () => {
const result = estimateWithBounds('{"key": "value"}')
// JSON has smaller ratio range
expect(result.max).toBeLessThan(10)
})
})
})

View File

@@ -1241,7 +1241,6 @@ async function checkPermissionsAndCallTool(
{
...toolUseContext,
toolUseId: toolUseID,
hookChainsCanUseTool: canUseTool,
userModified: permissionDecision.userModified ?? false,
},
canUseTool,
@@ -1730,29 +1729,19 @@ async function checkPermissionsAndCallTool(
const hookMessages: MessageUpdateLazy<
AttachmentMessage | ProgressMessage<HookProgress>
>[] = []
const hookChainsContext = toolUseContext as ToolUseContext & {
hookChainsCanUseTool?: CanUseToolFn
}
hookChainsContext.hookChainsCanUseTool = canUseTool
try {
for await (const hookResult of runPostToolUseFailureHooks(
toolUseContext,
tool,
toolUseID,
messageId,
processedInput,
content,
isInterrupt,
requestId,
mcpServerType,
mcpServerBaseUrl,
)) {
hookMessages.push(hookResult)
}
} finally {
if (hookChainsContext.hookChainsCanUseTool === canUseTool) {
delete hookChainsContext.hookChainsCanUseTool
}
for await (const hookResult of runPostToolUseFailureHooks(
toolUseContext,
tool,
toolUseID,
messageId,
processedInput,
content,
isInterrupt,
requestId,
mcpServerType,
mcpServerBaseUrl,
)) {
hookMessages.push(hookResult)
}
return [

View File

@@ -284,7 +284,6 @@ export async function* runPostToolUseFailureHooks<Input extends AnyObject>(
isInterrupt,
permissionMode,
toolUseContext.abortController.signal,
undefined,
)) {
try {
// Check if we were aborted during hook execution

View File

@@ -26,10 +26,10 @@ test('initializeWiki creates the expected wiki scaffold', async () => {
expect(result.alreadyExisted).toBe(false)
expect(result.createdFiles).toEqual([
join('.openclaude', 'wiki', 'schema.md'),
join('.openclaude', 'wiki', 'index.md'),
join('.openclaude', 'wiki', 'log.md'),
join('.openclaude', 'wiki', 'pages', 'architecture.md'),
'.openclaude/wiki/schema.md',
'.openclaude/wiki/index.md',
'.openclaude/wiki/log.md',
'.openclaude/wiki/pages/architecture.md',
])
expect(await readFile(paths.schemaFile, 'utf8')).toContain(
'# OpenClaude Wiki Schema',

View File

@@ -59,7 +59,7 @@ export function generatePrompt(): string {
## Configurable settings list
The following settings are available for you to change:
### Global Settings (stored in ~/.openclaude.json)
### Global Settings (stored in ~/.claude.json)
${globalSettings.join('\n')}
### Project Settings (stored in settings.json)

View File

@@ -733,9 +733,6 @@ export const CYBER_RISK_MITIGATION_REMINDER =
const MITIGATION_EXEMPT_MODELS = new Set(['claude-opus-4-6'])
function shouldIncludeFileReadMitigation(): boolean {
if (isEnvTruthy(process.env.OPENCLAUDE_DISABLE_TOOL_REMINDERS)) {
return false
}
const shortName = getCanonicalName(getMainLoopModel())
return !MITIGATION_EXEMPT_MODELS.has(shortName)
}

View File

@@ -1,87 +0,0 @@
import { afterEach, beforeEach, expect, mock, test } from 'bun:test'
// Mock the Anthropic-API-side before importing the module under test, so
// queryHaiku resolves into whatever the individual test wants (slow, failing,
// or successful). We preserve every other export from claude.js so unrelated
// transitive imports still work.
const haikuMock = mock()
beforeEach(async () => {
haikuMock.mockReset()
const actual = await import('../../services/api/claude.js')
mock.module('../../services/api/claude.js', () => ({
...actual,
queryHaiku: haikuMock,
}))
})
afterEach(() => {
mock.restore()
})
async function runApply(markdown = 'Hello world.', signal?: AbortSignal): Promise<string> {
const nonce = `${Date.now()}-${Math.random()}`
const { applyPromptToMarkdown } =
await import(`./utils.js?ts=${nonce}`)
const ctrl = new AbortController()
return applyPromptToMarkdown(
'summarize',
markdown,
signal ?? ctrl.signal,
false,
false,
)
}
test('returns raw truncated markdown when queryHaiku throws', async () => {
haikuMock.mockImplementation(async () => {
throw new Error('MiniMax rejected the model name')
})
const output = await runApply('Gitlawb homepage content.')
expect(output).toContain('[Secondary-model summarization unavailable')
expect(output).toContain('Gitlawb homepage content.')
})
test('returns raw truncated markdown when queryHaiku simulates a timeout', async () => {
// Simulating raceWithTimeout's rejection path directly — we can't actually
// wait 45s in a test. The error shape matches what raceWithTimeout produces.
haikuMock.mockImplementation(async () => {
const err = new Error('Secondary-model summarization timed out after 45000ms')
;(err as NodeJS.ErrnoException).code = 'SECONDARY_MODEL_TIMEOUT'
throw err
})
const output = await runApply('Slow provider content.')
expect(output).toContain('[Secondary-model summarization unavailable')
expect(output).toContain('Slow provider content.')
})
test('returns the model response when queryHaiku succeeds', async () => {
haikuMock.mockImplementation(async () => ({
message: {
content: [{ type: 'text', text: 'This page is about GitLawb, an AI legal platform.' }],
},
}))
const output = await runApply('some page content')
expect(output).toBe('This page is about GitLawb, an AI legal platform.')
})
test('returns fallback when queryHaiku resolves with empty content', async () => {
haikuMock.mockImplementation(async () => ({ message: { content: [] } }))
const output = await runApply('some page content')
expect(output).toContain('[Secondary-model summarization unavailable')
expect(output).toContain('some page content')
})
test('propagates AbortError from the caller signal', async () => {
const ctrl = new AbortController()
haikuMock.mockImplementation(async () => {
ctrl.abort()
return new Promise(() => {})
})
await expect(runApply('content', ctrl.signal)).rejects.toThrow()
})

View File

@@ -20,11 +20,8 @@ afterEach(() => {
describe('checkDomainBlocklist', () => {
test('returns allowed without API call in OpenAI mode', async () => {
process.env.CLAUDE_CODE_USE_OPENAI = '1'
const actual = await import('../../utils/model/providers.js')
mock.module('../../utils/model/providers.js', () => ({
...actual,
getAPIProvider: () => 'openai',
isFirstPartyAnthropicBaseUrl: () => false,
}))
const getSpy = mock(() =>
Promise.resolve({ status: 200, data: { can_fetch: true } }),
@@ -40,11 +37,8 @@ describe('checkDomainBlocklist', () => {
test('returns allowed without API call in Gemini mode', async () => {
process.env.CLAUDE_CODE_USE_GEMINI = '1'
const actual = await import('../../utils/model/providers.js')
mock.module('../../utils/model/providers.js', () => ({
...actual,
getAPIProvider: () => 'gemini',
isFirstPartyAnthropicBaseUrl: () => false,
}))
const getSpy = mock(() =>
Promise.resolve({ status: 200, data: { can_fetch: true } }),
@@ -63,11 +57,8 @@ describe('checkDomainBlocklist', () => {
delete process.env.CLAUDE_CODE_USE_GEMINI
delete process.env.CLAUDE_CODE_USE_GITHUB
const actual = await import('../../utils/model/providers.js')
mock.module('../../utils/model/providers.js', () => ({
...actual,
getAPIProvider: () => 'firstParty',
isFirstPartyAnthropicBaseUrl: () => true,
}))
const getSpy = mock(() =>
Promise.resolve({ status: 200, data: { can_fetch: true } }),

View File

@@ -15,7 +15,6 @@ import {
} from '../../utils/mcpOutputStorage.js'
import { getSettings_DEPRECATED } from '../../utils/settings/settings.js'
import { asSystemPrompt } from '../../utils/systemPromptType.js'
import { ssrfGuardedLookup } from '../../utils/hooks/ssrfGuard.js'
import { isPreapprovedHost } from './preapproved.js'
import { makeSecondaryModelPrompt } from './prompt.js'
@@ -275,76 +274,19 @@ export async function getWithPermittedRedirects(
if (depth > MAX_REDIRECTS) {
throw new Error(`Too many redirects (exceeded ${MAX_REDIRECTS})`)
}
const axiosConfig = {
signal,
timeout: FETCH_TIMEOUT_MS,
maxRedirects: 0,
responseType: 'arraybuffer' as const,
maxContentLength: MAX_HTTP_CONTENT_LENGTH,
lookup: ssrfGuardedLookup,
headers: {
Accept: 'text/markdown, text/html, */*',
'User-Agent': getWebFetchUserAgent(),
},
}
try {
return await axios.get(url, axiosConfig)
return await axios.get(url, {
signal,
timeout: FETCH_TIMEOUT_MS,
maxRedirects: 0,
responseType: 'arraybuffer',
maxContentLength: MAX_HTTP_CONTENT_LENGTH,
headers: {
Accept: 'text/markdown, text/html, */*',
'User-Agent': getWebFetchUserAgent(),
},
})
} catch (error) {
// Try native fetch as a fallback for timeout / network errors
// (Bun/Node bundled contexts occasionally hang with axios + custom lookup.)
const isTimeoutLike =
axios.isAxiosError(error) &&
(!error.response &&
(error.code === 'ECONNABORTED' ||
error.code === 'ETIMEDOUT' ||
error.message?.toLowerCase().includes('timeout')))
if (isTimeoutLike && !signal.aborted) {
try {
const fetchResponse = await fetch(url, {
signal,
redirect: 'manual',
headers: axiosConfig.headers,
})
// Handle redirects manually
if ([301, 302, 307, 308].includes(fetchResponse.status)) {
const redirectLocation = fetchResponse.headers.get('location')
if (!redirectLocation) {
throw new Error('Redirect missing Location header')
}
const redirectUrl = new URL(redirectLocation, url).toString()
if (redirectChecker(url, redirectUrl)) {
return getWithPermittedRedirects(
redirectUrl,
signal,
redirectChecker,
depth + 1,
)
} else {
return {
type: 'redirect' as const,
originalUrl: url,
redirectUrl,
statusCode: fetchResponse.status,
}
}
}
const arrayBuffer = await fetchResponse.arrayBuffer()
// Build an AxiosResponse-like shape so downstream code stays happy
return {
data: new Uint8Array(arrayBuffer),
status: fetchResponse.status,
statusText: fetchResponse.statusText,
headers: Object.fromEntries(fetchResponse.headers.entries()),
config: axiosConfig,
request: undefined,
} as unknown as AxiosResponse<ArrayBuffer>
} catch {
// Fall through to original error handling
}
}
if (
axios.isAxiosError(error) &&
error.response &&
@@ -545,58 +487,6 @@ export async function getURLMarkdownContent(
return entry
}
// Budget for the secondary-model summarization after fetch. If the small-
// fast model is slow (e.g. a 200k-context third-party running a reasoning
// pass over ~100KB of markdown), we'd rather fall back to raw truncated
// markdown than hang the tool. Also keeps the worst-case WebFetch bounded
// to FETCH_TIMEOUT_MS + SECONDARY_MODEL_TIMEOUT_MS regardless of provider.
const SECONDARY_MODEL_TIMEOUT_MS = 45_000
function raceWithTimeout<T>(
promise: Promise<T>,
timeoutMs: number,
signal: AbortSignal,
): Promise<T> {
return new Promise<T>((resolve, reject) => {
const timer = setTimeout(() => {
const err = new Error(`Secondary-model summarization timed out after ${timeoutMs}ms`)
;(err as NodeJS.ErrnoException).code = 'SECONDARY_MODEL_TIMEOUT'
reject(err)
}, timeoutMs)
const onAbort = () => {
clearTimeout(timer)
reject(new AbortError())
}
if (signal.aborted) {
clearTimeout(timer)
reject(new AbortError())
return
}
signal.addEventListener('abort', onAbort, { once: true })
promise.then(
value => {
clearTimeout(timer)
signal.removeEventListener('abort', onAbort)
resolve(value)
},
err => {
clearTimeout(timer)
signal.removeEventListener('abort', onAbort)
reject(err)
},
)
})
}
function buildFallbackMarkdownSummary(truncatedContent: string): string {
return [
'[Secondary-model summarization unavailable — returning raw fetched content.',
'This typically means the configured small-fast model took too long or errored.]',
'',
truncatedContent,
].join('\n')
}
export async function applyPromptToMarkdown(
prompt: string,
markdownContent: string,
@@ -616,35 +506,18 @@ export async function applyPromptToMarkdown(
prompt,
isPreapprovedDomain,
)
let assistantMessage
try {
assistantMessage = await raceWithTimeout(
queryHaiku({
systemPrompt: asSystemPrompt([]),
userPrompt: modelPrompt,
signal,
options: {
querySource: 'web_fetch_apply',
agents: [],
isNonInteractiveSession,
hasAppendSystemPrompt: false,
mcpTools: [],
},
}),
SECONDARY_MODEL_TIMEOUT_MS,
signal,
)
} catch (err) {
// User interrupts and SIGINTs still propagate. Everything else (timeout,
// provider-side error, unsupported model on third-party endpoint) falls
// back to raw markdown so the user still gets usable content rather than
// a hang. Log so it's visible in debug traces.
if (err instanceof AbortError || (err as Error)?.name === 'AbortError') {
throw err
}
logError(err)
return buildFallbackMarkdownSummary(truncatedContent)
}
const assistantMessage = await queryHaiku({
systemPrompt: asSystemPrompt([]),
userPrompt: modelPrompt,
signal,
options: {
querySource: 'web_fetch_apply',
agents: [],
isNonInteractiveSession,
hasAppendSystemPrompt: false,
mcpTools: [],
},
})
// We need to bubble this up so that the tool call throws, causing us to return
// an is_error tool_result block to the server and render a red dot in the UI.
@@ -659,5 +532,5 @@ export async function applyPromptToMarkdown(
return contentBlock.text
}
}
return buildFallbackMarkdownSummary(truncatedContent)
return 'No response from model'
}

View File

@@ -203,61 +203,6 @@ function buildCodexWebSearchInstructions(): string {
].join(' ')
}
function pushCodexTextResult(
results: (SearchResult | string)[],
value: unknown,
): void {
if (typeof value !== 'string') return
const trimmed = value.trim()
if (trimmed) {
results.push(trimmed)
}
}
function addCodexSource(
sourceMap: Map<string, { title: string; url: string }>,
source: unknown,
): void {
// Narrow the unknown payload before property access; `unknown` cannot be
// dereferenced directly under strict TypeScript.
const s = source as { url?: unknown; title?: unknown } | null | undefined
if (typeof s?.url !== 'string' || !s.url) return
sourceMap.set(s.url, {
title: typeof s.title === 'string' && s.title ? s.title : s.url,
url: s.url,
})
}
function getCodexSources(item: Record<string, any>): unknown[] {
if (Array.isArray(item.action?.sources)) {
return item.action.sources
}
if (Array.isArray(item.sources)) {
return item.sources
}
if (Array.isArray(item.result?.sources)) {
return item.result.sources
}
return []
}
function extractCodexWebSearchFailure(item: Record<string, any>): string | undefined {
// Codex web_search_call items can carry a status field. When the tool
// call fails (rate limit, upstream error, model-side guardrail), the
// parser should surface a meaningful error rather than the generic
// "No results found." fallback. Shape observed across recent payloads:
// { type: 'web_search_call', status: 'failed', error: { message?: string } }
// { type: 'web_search_call', status: 'failed', action: { error?: { message?: string } } }
if (item?.status !== 'failed') return undefined
const reason =
(typeof item.error?.message === 'string' && item.error.message) ||
(typeof item.action?.error?.message === 'string' &&
item.action.error.message) ||
(typeof item.error === 'string' && item.error) ||
undefined
return reason ? `Web search failed: ${reason}` : 'Web search failed.'
}
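// Minimal worked example of the mapping above (hypothetical payload, not
// captured from a live response):
//   extractCodexWebSearchFailure({
//     type: 'web_search_call',
//     status: 'failed',
//     error: { message: 'rate limited' },
//   })
//   // → 'Web search failed: rate limited'
//   extractCodexWebSearchFailure({ type: 'web_search_call', status: 'failed' })
//   // → 'Web search failed.'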
function makeOutputFromCodexWebSearchResponse(
response: Record<string, unknown>,
query: string,
@@ -269,12 +214,18 @@ function makeOutputFromCodexWebSearchResponse(
for (const item of output) {
if (item?.type === 'web_search_call') {
const failure = extractCodexWebSearchFailure(item)
if (failure) {
results.push(failure)
}
for (const source of getCodexSources(item)) {
addCodexSource(sourceMap, source)
const sources = Array.isArray(item.action?.sources)
? item.action.sources
: []
for (const source of sources) {
if (typeof source?.url !== 'string' || !source.url) continue
sourceMap.set(source.url, {
title:
typeof source.title === 'string' && source.title
? source.title
: source.url,
url: source.url,
})
}
continue
}
@@ -284,12 +235,11 @@ function makeOutputFromCodexWebSearchResponse(
}
for (const part of item.content) {
if (part?.type === 'output_text' || part?.type === 'text') {
pushCodexTextResult(results, part.text)
}
for (const source of getCodexSources(part)) {
addCodexSource(sourceMap, source)
if (part?.type === 'output_text' && typeof part.text === 'string') {
const trimmed = part.text.trim()
if (trimmed) {
results.push(trimmed)
}
}
const annotations = Array.isArray(part?.annotations)
@@ -297,13 +247,23 @@ function makeOutputFromCodexWebSearchResponse(
: []
for (const annotation of annotations) {
if (annotation?.type !== 'url_citation') continue
addCodexSource(sourceMap, annotation)
if (typeof annotation.url !== 'string' || !annotation.url) continue
sourceMap.set(annotation.url, {
title:
typeof annotation.title === 'string' && annotation.title
? annotation.title
: annotation.url,
url: annotation.url,
})
}
}
}
if (results.length === 0) {
pushCodexTextResult(results, response.output_text)
if (results.length === 0 && typeof response.output_text === 'string') {
const trimmed = response.output_text.trim()
if (trimmed) {
results.push(trimmed)
}
}
if (sourceMap.size > 0) {
@@ -313,10 +273,6 @@ function makeOutputFromCodexWebSearchResponse(
})
}
if (results.length === 0) {
results.push('No results found.')
}
return {
query,
results,
@@ -324,10 +280,6 @@ function makeOutputFromCodexWebSearchResponse(
}
}
export const __test = {
makeOutputFromCodexWebSearchResponse,
}
async function runCodexWebSearch(
input: Input,
signal: AbortSignal,
@@ -505,19 +457,6 @@ function shouldUseAdapterProvider(): boolean {
return getAvailableProviders().length > 0
}
/**
* Returns true when the current provider has a working native or Codex
* web-search fallback after an adapter failure. OpenAI shim providers
* (moonshot, minimax, nvidia-nim, openai, github, etc.) do NOT support
* Anthropic's web_search_20250305 tool, so falling through to the native
* path silently produces "Did 0 searches".
*/
function hasNativeSearchFallback(): boolean {
if (isCodexResponsesWebSearchEnabled()) return true
const provider = getAPIProvider()
return provider === 'firstParty' || provider === 'vertex' || provider === 'foundry'
}
// ---------------------------------------------------------------------------
// Tool export
// ---------------------------------------------------------------------------
@@ -670,17 +609,6 @@ export const WebSearchTool = buildTool({
// Auto mode: only fall through on transient errors (network, timeout, 5xx).
// Config / guardrail errors (SSRF, HTTPS, bad URL, etc.) must surface.
if (!isTransientError(err)) throw err
// No viable fallback for this provider — surface the adapter error
// instead of falling through to a broken native path.
if (!hasNativeSearchFallback()) {
const provider = getAPIProvider()
const errMsg = err instanceof Error ? err.message : String(err)
throw new Error(
`Web search is unavailable for provider "${provider}". ` +
`The search adapter failed (${errMsg}). ` +
`Try switching to a provider with built-in web search (e.g. Anthropic, Codex) or try again later.`,
)
}
console.error(
`[web-search] Adapter failed, falling through to native: ${err}`,
)

View File

@@ -1,44 +1,6 @@
import type { SearchInput, SearchProvider } from './types.js'
import { applyDomainFilters, type ProviderOutput } from './types.js'
// DuckDuckGo's HTML scraper aggressively blocks datacenter / repeat IPs with
// an "anomaly in the request" response. When that happens we surface an
// actionable error instead of the opaque scraper message so users know how
// to configure a working backend.
const DDG_ANOMALY_HINT =
'DuckDuckGo scraping is rate-limited from this network. ' +
'Configure a search backend with one of: ' +
'FIRECRAWL_API_KEY, TAVILY_API_KEY, EXA_API_KEY, YOU_API_KEY, ' +
'JINA_API_KEY, BING_API_KEY, MOJEEK_API_KEY, LINKUP_API_KEY — ' +
'or use an Anthropic / Vertex / Foundry provider for native web search.'
const MAX_RETRIES = 3
const INITIAL_BACKOFF_MS = 1000
function isAnomalyError(message: string): boolean {
return /anomaly in the request|likely making requests too quickly/i.test(
message,
)
}
function isRetryableDDGError(err: unknown): boolean {
if (!(err instanceof Error)) return false
const msg = err.message.toLowerCase()
return (
msg.includes('anomaly') ||
msg.includes('too quickly') ||
msg.includes('rate limit') ||
msg.includes('timeout') ||
msg.includes('econnreset') ||
msg.includes('etimedout') ||
msg.includes('econnaborted')
)
}
function sleep(ms: number): Promise<void> {
return new Promise(r => setTimeout(r, ms))
}
export const duckduckgoProvider: SearchProvider = {
name: 'duckduckgo',
@@ -57,44 +19,22 @@ export const duckduckgoProvider: SearchProvider = {
throw new Error('duck-duck-scrape package not installed. Run: npm install duck-duck-scrape')
}
if (signal?.aborted) throw new DOMException('Aborted', 'AbortError')
// TODO: duck-duck-scrape doesn't accept AbortSignal — can't cancel in-flight searches
const response = await search(input.query, { safeSearch: SafeSearchType.STRICT })
let lastErr: unknown
for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
if (signal?.aborted) throw new DOMException('Aborted', 'AbortError')
try {
// TODO: duck-duck-scrape doesn't accept AbortSignal — can't cancel in-flight searches
const response = await search(input.query, { safeSearch: SafeSearchType.STRICT })
const hits = applyDomainFilters(
response.results.map(r => ({
title: r.title || r.url,
url: r.url,
description: r.description ?? undefined,
})),
input,
)
const hits = applyDomainFilters(
response.results.map(r => ({
title: r.title || r.url,
url: r.url,
description: r.description ?? undefined,
})),
input,
)
return {
hits,
providerName: 'duckduckgo',
durationSeconds: (performance.now() - start) / 1000,
}
} catch (err) {
lastErr = err
const msg = err instanceof Error ? err.message : String(err)
if (isAnomalyError(msg)) {
throw new Error(DDG_ANOMALY_HINT)
}
if (!isRetryableDDGError(err) || attempt === MAX_RETRIES - 1) {
throw err
}
// Exponential backoff with jitter: 1s, 2s, 4s +/- 20%
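// e.g. attempt 0 sleeps 800–1200ms, attempt 1 sleeps 1600–2400ms; the final
// attempt rethrows above instead of sleeping.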
const baseDelay = INITIAL_BACKOFF_MS * Math.pow(2, attempt)
const jitter = baseDelay * 0.2 * (Math.random() * 2 - 1)
await sleep(baseDelay + jitter)
}
return {
hits,
providerName: 'duckduckgo',
durationSeconds: (performance.now() - start) / 1000,
}
throw lastErr
},
}

View File

@@ -693,7 +693,7 @@ export function refreshAwsAuth(awsAuthRefresh: string): Promise<boolean> {
'AWS auth refresh timed out after 3 minutes. Run your auth command manually in a separate terminal.',
)
: chalk.red(
'Error running awsAuthRefresh (in settings or ~/.openclaude.json):',
'Error running awsAuthRefresh (in settings or ~/.claude.json):',
)
// biome-ignore lint/suspicious/noConsole: intentional console output
console.error(message)
@@ -771,7 +771,7 @@ async function getAwsCredsFromCredentialExport(): Promise<{
}
} catch (e) {
const message = chalk.red(
'Error getting AWS credentials from awsCredentialExport (in settings or ~/.openclaude.json):',
'Error getting AWS credentials from awsCredentialExport (in settings or ~/.claude.json):',
)
if (e instanceof Error) {
// biome-ignore lint/suspicious/noConsole: intentional console output
@@ -961,7 +961,7 @@ export function refreshGcpAuth(gcpAuthRefresh: string): Promise<boolean> {
'GCP auth refresh timed out after 3 minutes. Run your auth command manually in a separate terminal.',
)
: chalk.red(
'Error running gcpAuthRefresh (in settings or ~/.openclaude.json):',
'Error running gcpAuthRefresh (in settings or ~/.claude.json):',
)
// biome-ignore lint/suspicious/noConsole: intentional console output
console.error(message)
@@ -1959,7 +1959,7 @@ export async function validateForceLoginOrg(): Promise<OrgValidationResult> {
// Always fetch the authoritative org UUID from the profile endpoint.
// Even keychain-sourced tokens verify server-side: the cached org UUID
// in ~/.openclaude.json is user-writable and cannot be trusted.
// in ~/.claude.json is user-writable and cannot be trusted.
const { source } = getAuthTokenSource()
const isEnvVarToken =
source === 'CLAUDE_CODE_OAUTH_TOKEN' ||

View File

@@ -28,7 +28,7 @@ import { getSettingsForSource } from './settings/settings.js'
* is lazy-initialized) and ensure Node.js compatibility.
*
* This is safe to call before the trust dialog because we only read from
* user-controlled files (~/.claude/settings.json and ~/.openclaude.json),
* user-controlled files (~/.claude/settings.json and ~/.claude.json),
* not from project-level settings.
*/
export function applyExtraCACertsFromConfig(): void {
@@ -52,7 +52,7 @@ export function applyExtraCACertsFromConfig(): void {
* after the trust dialog. But we need the CA cert early to establish the TLS
* connection to an HTTPS proxy during init().
*
* We read from global config (~/.openclaude.json) and user settings
* We read from global config (~/.claude.json) and user settings
* (~/.claude/settings.json). These are user-controlled files that don't
* require trust approval.
*/

View File

@@ -355,7 +355,7 @@ exec ${command}
*
* Only positive detections are persisted. A negative result from the
* filesystem scan is not cached, because it may come from a machine that
* shares ~/.openclaude.json but has no local Chrome (e.g. a remote dev
* shares ~/.claude.json but has no local Chrome (e.g. a remote dev
* environment using the bridge), and caching it would permanently poison
* auto-enable for every session on every machine that reads that config.
*/

View File

@@ -244,7 +244,6 @@ export type GlobalConfig = {
bypassPermissionsModeAccepted?: boolean
hasUsedBackslashReturn?: boolean
autoCompactEnabled: boolean // Controls whether auto-compact is enabled
toolHistoryCompressionEnabled: boolean // Compress old tool_result content for small-context providers
showTurnDuration: boolean // Controls whether to show turn duration message (e.g., "Cooked for 1m 6s")
/**
* @deprecated Use settings.env instead.
@@ -623,7 +622,6 @@ function createDefaultGlobalConfig(): GlobalConfig {
verbose: false,
editorMode: 'normal',
autoCompactEnabled: true,
toolHistoryCompressionEnabled: true,
showTurnDuration: true,
hasSeenTasksHint: false,
hasUsedStash: false,
@@ -670,7 +668,6 @@ export const GLOBAL_CONFIG_KEYS = [
'editorMode',
'hasUsedBackslashReturn',
'autoCompactEnabled',
'toolHistoryCompressionEnabled',
'showTurnDuration',
'diffTool',
'env',
@@ -921,7 +918,7 @@ let configCacheHits = 0
let configCacheMisses = 0
// Session-total count of actual disk writes to the global config file.
// Exposed for internal-only dev diagnostics (see inc-4552) so anomalous write
// rates surface in the UI before they corrupt ~/.openclaude.json.
// rates surface in the UI before they corrupt ~/.claude.json.
let globalConfigWriteCount = 0
export function getGlobalConfigWriteCount(): number {
@@ -1260,7 +1257,7 @@ function saveConfigWithLock<A extends object>(
const currentConfig = getConfig(file, createDefault)
if (file === getGlobalClaudeFile() && wouldLoseAuthState(currentConfig)) {
logForDebugging(
'saveConfigWithLock: re-read config is missing auth that cache has; refusing to write to avoid wiping ~/.openclaude.json. See GH #3117.',
'saveConfigWithLock: re-read config is missing auth that cache has; refusing to write to avoid wiping ~/.claude.json. See GH #3117.',
{ level: 'error' },
)
logEvent('tengu_config_auth_loss_prevented', {})

View File

@@ -12,12 +12,7 @@ export const MODEL_CONTEXT_WINDOW_DEFAULT = 200_000
// Fallback context window for unknown 3P models. Must be large enough that
// the effective context (this minus output token reservation) stays positive,
// otherwise auto-compact fires on every message (issue #635).
// Override via CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW env var to avoid
// hardcoding when deploying models not yet in openaiContextWindows.ts.
export const OPENAI_FALLBACK_CONTEXT_WINDOW = (() => {
const v = parseInt(process.env.CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW ?? '', 10)
return !isNaN(v) && v > 0 ? v : 128_000
})()
export const OPENAI_FALLBACK_CONTEXT_WINDOW = 128_000
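// Worked example of the sizing constraint above (illustrative numbers only):
// with a 32_000-token fallback window and a 32_000-token output reservation,
// the effective context is 32_000 - 32_000 = 0, so auto-compact would fire on
// every message. 128_000 - 32_000 leaves ~96_000 usable tokens.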
// Maximum output tokens for compact operations
export const COMPACT_MAX_OUTPUT_TOKENS = 20_000
@@ -195,16 +190,20 @@ export function getModelMaxOutputTokens(model: string): {
}
// OpenAI-compatible provider — use known output limits to avoid 400 errors
if (
const isOpenAICompatProvider =
isEnvTruthy(process.env.CLAUDE_CODE_USE_OPENAI) ||
isEnvTruthy(process.env.CLAUDE_CODE_USE_GEMINI) ||
isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB) ||
isEnvTruthy(process.env.CLAUDE_CODE_USE_MISTRAL)
) {
if (isOpenAICompatProvider) {
const openaiMax = getOpenAIMaxOutputTokens(model)
if (openaiMax !== undefined) {
return { default: openaiMax, upperLimit: openaiMax }
}
// Unknown 3P model — use conservative default to avoid vLLM/Ollama 400
// errors when the default 32k exceeds the model's max_model_len.
// Users can override with CLAUDE_CODE_MAX_OUTPUT_TOKENS.
return { default: 4_096, upperLimit: 16_384 }
}
const m = getCanonicalName(model)

View File

@@ -253,7 +253,7 @@ async function resolveClaudePath(): Promise<string> {
* Check whether the OS-level protocol handler is already registered AND
* points at the expected `claude` binary. Reads the registration artifact
* directly (symlink target, .desktop Exec line, registry value) rather than
* a cached flag in ~/.openclaude.json, so:
* a cached flag in ~/.claude.json, so:
* - the check is per-machine (config can sync across machines; OS state can't)
* - stale paths self-heal (install-method change → re-register next session)
* - deleted artifacts self-heal
@@ -311,7 +311,7 @@ export async function ensureDeepLinkProtocolRegistered(): Promise<void> {
// EACCES/ENOSPC are deterministic — retrying next session won't help.
// Throttle to once per 24h so a read-only ~/.local/share/applications
// doesn't generate a failure event on every startup. Marker lives in
// ~/.claude (per-machine, not synced) rather than ~/.openclaude.json (can sync).
// ~/.claude (per-machine, not synced) rather than ~/.claude.json (can sync).
const failureMarkerPath = path.join(
getClaudeConfigHomeDir(),
'.deep-link-register-failed',

View File

@@ -1,62 +0,0 @@
import { afterEach, beforeEach, expect, test } from 'bun:test'
import { mkdtempSync, rmSync, writeFileSync } from 'fs'
import { tmpdir } from 'os'
import { join } from 'path'
const originalEnv = {
CLAUDE_CONFIG_DIR: process.env.CLAUDE_CONFIG_DIR,
CLAUDE_CODE_CUSTOM_OAUTH_URL: process.env.CLAUDE_CODE_CUSTOM_OAUTH_URL,
USER_TYPE: process.env.USER_TYPE,
}
let tempDir: string
beforeEach(() => {
tempDir = mkdtempSync(join(tmpdir(), 'openclaude-env-test-'))
process.env.CLAUDE_CONFIG_DIR = tempDir
delete process.env.CLAUDE_CODE_CUSTOM_OAUTH_URL
delete process.env.USER_TYPE
})
afterEach(() => {
rmSync(tempDir, { recursive: true, force: true })
if (originalEnv.CLAUDE_CONFIG_DIR === undefined) {
delete process.env.CLAUDE_CONFIG_DIR
} else {
process.env.CLAUDE_CONFIG_DIR = originalEnv.CLAUDE_CONFIG_DIR
}
if (originalEnv.CLAUDE_CODE_CUSTOM_OAUTH_URL === undefined) {
delete process.env.CLAUDE_CODE_CUSTOM_OAUTH_URL
} else {
process.env.CLAUDE_CODE_CUSTOM_OAUTH_URL = originalEnv.CLAUDE_CODE_CUSTOM_OAUTH_URL
}
if (originalEnv.USER_TYPE === undefined) {
delete process.env.USER_TYPE
} else {
process.env.USER_TYPE = originalEnv.USER_TYPE
}
})
async function importFreshEnvModule() {
return import(`./env.js?ts=${Date.now()}-${Math.random()}`)
}
// getGlobalClaudeFile — three migration branches
test('getGlobalClaudeFile: new install returns .openclaude.json when neither file exists', async () => {
const { getGlobalClaudeFile } = await importFreshEnvModule()
expect(getGlobalClaudeFile()).toBe(join(tempDir, '.openclaude.json'))
})
test('getGlobalClaudeFile: existing user keeps .claude.json when only legacy file exists', async () => {
writeFileSync(join(tempDir, '.claude.json'), '{}')
const { getGlobalClaudeFile } = await importFreshEnvModule()
expect(getGlobalClaudeFile()).toBe(join(tempDir, '.claude.json'))
})
test('getGlobalClaudeFile: migrated user uses .openclaude.json when both files exist', async () => {
writeFileSync(join(tempDir, '.claude.json'), '{}')
writeFileSync(join(tempDir, '.openclaude.json'), '{}')
const { getGlobalClaudeFile } = await importFreshEnvModule()
expect(getGlobalClaudeFile()).toBe(join(tempDir, '.openclaude.json'))
})

View File

@@ -21,21 +21,8 @@ export const getGlobalClaudeFile = memoize((): string => {
return join(getClaudeConfigHomeDir(), '.config.json')
}
const oauthSuffix = fileSuffixForOauthConfig()
const configDir = process.env.CLAUDE_CONFIG_DIR || homedir()
// Default to .openclaude.json. Fall back to .claude.json only if the new
// file doesn't exist yet and the legacy one does (same migration pattern
// as resolveClaudeConfigHomeDir for the config directory).
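// Decision table: neither file → new; new only → new; legacy only → legacy;
// both (already migrated) → new.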
const newFilename = `.openclaude${oauthSuffix}.json`
const legacyFilename = `.claude${oauthSuffix}.json`
if (
!getFsImplementation().existsSync(join(configDir, newFilename)) &&
getFsImplementation().existsSync(join(configDir, legacyFilename))
) {
return join(configDir, legacyFilename)
}
return join(configDir, newFilename)
const filename = `.claude${fileSuffixForOauthConfig()}.json`
return join(process.env.CLAUDE_CONFIG_DIR || homedir(), filename)
})
const hasInternetAccess = memoize(async (): Promise<boolean> => {

View File

@@ -1,357 +0,0 @@
import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
import { mkdtemp, rm, writeFile } from 'node:fs/promises'
import { tmpdir } from 'node:os'
import { join } from 'node:path'
type HookChainsModule = typeof import('./hookChains.js')
type ImportHarnessOptions = {
allowRemoteSessions?: boolean
teamFile?:
| {
name: string
members: Array<{ name: string }>
}
| null
teamName?: string
senderName?: string
replBridgeHandle?: unknown
}
const tempDirs: string[] = []
const originalHookChainsEnabled = process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
async function createConfigFile(config: unknown): Promise<string> {
const dir = await mkdtemp(join(tmpdir(), 'openclaude-hook-chains-int-'))
tempDirs.push(dir)
const filePath = join(dir, 'hook-chains.json')
await writeFile(filePath, JSON.stringify(config, null, 2), 'utf-8')
return filePath
}
async function importHookChainsHarness(
options: ImportHarnessOptions = {},
): Promise<{
mod: HookChainsModule
writeToMailboxSpy: ReturnType<typeof mock>
agentToolCallSpy: ReturnType<typeof mock>
}> {
mock.restore()
const allowRemoteSessions = options.allowRemoteSessions ?? true
const teamName = options.teamName ?? 'mesh-team'
const senderName = options.senderName ?? 'mesh-lead'
const replBridgeHandle = options.replBridgeHandle ?? null
const writeToMailboxSpy = mock(async () => {})
const agentToolCallSpy = mock(async () => ({
data: {
status: 'async_launched',
agentId: 'agent-fallback-1',
},
}))
mock.module('../services/analytics/index.js', () => ({
logEvent: () => {},
}))
mock.module('./telemetry/events.js', () => ({
logOTelEvent: async () => {},
}))
mock.module('../services/policyLimits/index.js', () => ({
isPolicyAllowed: () => allowRemoteSessions,
}))
mock.module('./swarm/teamHelpers.js', () => ({
readTeamFileAsync: async () => options.teamFile ?? null,
}))
mock.module('./teammateMailbox.js', () => ({
writeToMailbox: writeToMailboxSpy,
}))
mock.module('./teammate.js', () => ({
getAgentName: () => senderName,
getTeamName: () => teamName,
getTeammateColor: () => 'blue',
// Keep parity with the real module's surface so later tests that
// run after this file (mock.module is process-global and mock.restore
// does not undo module mocks in Bun) do not see undefined members.
isTeammate: () => false,
isPlanModeRequired: () => false,
getAgentId: () => undefined,
getParentSessionId: () => undefined,
}))
mock.module('../bridge/replBridgeHandle.js', () => ({
getReplBridgeHandle: () => replBridgeHandle,
}))
// Integration mock target requested in the task: fallback action can route
// through this mocked tool launcher from runtime callback wiring.
mock.module('../tools/AgentTool/AgentTool.js', () => ({
AgentTool: {
call: agentToolCallSpy,
},
}))
const mod = await import(`./hookChains.js?integration=${Date.now()}-${Math.random()}`)
return { mod, writeToMailboxSpy, agentToolCallSpy }
}
beforeEach(() => {
process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS = '1'
})
afterEach(async () => {
mock.restore()
if (originalHookChainsEnabled === undefined) {
delete process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
} else {
process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS = originalHookChainsEnabled
}
await Promise.all(
tempDirs.splice(0).map(dir => rm(dir, { recursive: true, force: true })),
)
})
describe('hookChains integration dispatch', () => {
test('end-to-end rule evaluation + action dispatch on TaskCompleted failure', async () => {
const { mod } = await importHookChainsHarness({
teamName: 'mesh-team',
senderName: 'mesh-lead',
teamFile: {
name: 'mesh-team',
members: [{ name: 'mesh-lead' }, { name: 'worker-a' }, { name: 'worker-b' }],
},
})
const configPath = await createConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'task-failure-recovery',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [
{ type: 'spawn_fallback_agent' },
{ type: 'notify_team' },
],
},
],
})
const spawnSpy = mock(async () => ({ launched: true, agentId: 'agent-e2e-1' }))
const notifySpy = mock(async () => ({ sent: true, recipientCount: 2 }))
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: {
task_id: 'task-001',
task_subject: 'Patch flaky build',
error: 'CI timeout',
},
},
runtime: {
onSpawnFallbackAgent: spawnSpy,
onNotifyTeam: notifySpy,
},
})
expect(result.enabled).toBe(true)
expect(result.matchedRuleIds).toEqual(['task-failure-recovery'])
expect(result.actionResults).toHaveLength(2)
expect(result.actionResults[0]?.status).toBe('executed')
expect(result.actionResults[1]?.status).toBe('executed')
expect(spawnSpy).toHaveBeenCalledTimes(1)
expect(notifySpy).toHaveBeenCalledTimes(1)
})
test('fallback spawn injects failure context into generated prompt', async () => {
const { mod, agentToolCallSpy } = await importHookChainsHarness()
const configPath = await createConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'fallback-context',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [
{
type: 'spawn_fallback_agent',
description: 'Fallback for failed task',
},
],
},
],
})
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: {
task_id: 'task-ctx-1',
task_subject: 'Repair migration guard',
task_description: 'Fix regression in check ordering',
error: 'Task failed after retry budget exhausted',
},
},
runtime: {
onSpawnFallbackAgent: async request => {
const { AgentTool } = await import('../tools/AgentTool/AgentTool.js')
await (AgentTool.call as unknown as (...args: unknown[]) => Promise<unknown>)({
prompt: request.prompt,
description: request.description,
run_in_background: request.runInBackground,
subagent_type: request.agentType,
model: request.model,
})
return { launched: true, agentId: 'agent-fallback-ctx' }
},
},
})
expect(result.actionResults[0]?.status).toBe('executed')
expect(agentToolCallSpy).toHaveBeenCalledTimes(1)
const callInput = agentToolCallSpy.mock.calls[0]?.[0] as {
prompt: string
description: string
run_in_background: boolean
}
expect(callInput.description).toBe('Fallback for failed task')
expect(callInput.run_in_background).toBe(true)
expect(callInput.prompt).toContain('Event: TaskCompleted')
expect(callInput.prompt).toContain('Outcome: failed')
expect(callInput.prompt).toContain('Task subject: Repair migration guard')
expect(callInput.prompt).toContain('Failure details: Task failed after retry budget exhausted')
})
test('notify_team dispatches mailbox writes when team exists and skips when absent', async () => {
const withTeam = await importHookChainsHarness({
teamName: 'mesh-a',
senderName: 'lead-a',
teamFile: {
name: 'mesh-a',
members: [{ name: 'lead-a' }, { name: 'worker-1' }, { name: 'worker-2' }],
},
})
const configPathWithTeam = await createConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'notify-existing-team',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'notify_team' }],
},
],
})
const withTeamResult = await withTeam.mod.dispatchHookChainsForEvent({
configPathOverride: configPathWithTeam,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-team-ok', error: 'boom' },
},
})
expect(withTeamResult.actionResults[0]?.status).toBe('executed')
expect(withTeam.writeToMailboxSpy).toHaveBeenCalledTimes(2)
const recipients = withTeam.writeToMailboxSpy.mock.calls.map(
call => call[0] as string,
)
expect(recipients.sort()).toEqual(['worker-1', 'worker-2'])
const withoutTeam = await importHookChainsHarness({
teamName: 'mesh-missing',
senderName: 'lead-missing',
teamFile: null,
})
const configPathWithoutTeam = await createConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'notify-missing-team',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'notify_team' }],
},
],
})
const withoutTeamResult = await withoutTeam.mod.dispatchHookChainsForEvent({
configPathOverride: configPathWithoutTeam,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-team-missing', error: 'boom' },
},
})
expect(withoutTeamResult.actionResults[0]?.status).toBe('skipped')
expect(withoutTeamResult.actionResults[0]?.reason).toContain('Team file not found')
expect(withoutTeam.writeToMailboxSpy).not.toHaveBeenCalled()
})
test('warm_remote_capacity is a safe no-op when bridge is inactive', async () => {
const { mod } = await importHookChainsHarness({
allowRemoteSessions: true,
replBridgeHandle: null,
})
const configPath = await createConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'bridge-warmup-noop',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'warm_remote_capacity' }],
},
],
})
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-warm-1' },
},
})
expect(result.actionResults).toHaveLength(1)
expect(result.actionResults[0]?.status).toBe('skipped')
expect(result.actionResults[0]?.reason).toContain('Bridge is not active')
})
})

View File

@@ -1,476 +0,0 @@
import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
import { mkdtemp, rm, writeFile } from 'node:fs/promises'
import { tmpdir } from 'node:os'
import { join } from 'node:path'
type HookChainsModule = typeof import('./hookChains.js')
const tempDirs: string[] = []
const originalHookChainsEnabled = process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
async function makeConfigFile(config: unknown): Promise<string> {
const dir = await mkdtemp(join(tmpdir(), 'openclaude-hook-chains-'))
tempDirs.push(dir)
const filePath = join(dir, 'hook-chains.json')
await writeFile(filePath, JSON.stringify(config, null, 2), 'utf-8')
return filePath
}
async function importHookChainsModule(options?: {
allowRemoteSessions?: boolean
}): Promise<HookChainsModule> {
mock.restore()
const allowRemoteSessions = options?.allowRemoteSessions ?? true
mock.module('../services/analytics/index.js', () => ({
logEvent: () => {},
}))
mock.module('./telemetry/events.js', () => ({
logOTelEvent: async () => {},
}))
mock.module('../services/policyLimits/index.js', () => ({
isPolicyAllowed: () => allowRemoteSessions,
}))
return import(`./hookChains.js?test=${Date.now()}-${Math.random()}`)
}
beforeEach(() => {
process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS = '1'
})
afterEach(async () => {
mock.restore()
if (originalHookChainsEnabled === undefined) {
delete process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
} else {
process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS = originalHookChainsEnabled
}
await Promise.all(
tempDirs.splice(0).map(dir => rm(dir, { recursive: true, force: true })),
)
})
describe('hookChains schema validation', () => {
test('returns disabled config when env gate is unset', async () => {
delete process.env.CLAUDE_CODE_ENABLE_HOOK_CHAINS
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
rules: [
{
id: 'env-gated-rule',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'spawn_fallback_agent' }],
},
],
})
const loaded = mod.loadHookChainsConfig({ pathOverride: configPath })
expect(loaded.exists).toBe(false)
expect(loaded.config.enabled).toBe(false)
expect(loaded.config.rules).toHaveLength(0)
})
test('loads valid config and memoizes by mtime/size', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 5000,
defaultDedupWindowMs: 5000,
rules: [
{
id: 'task-failure-fallback',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [
{
type: 'spawn_fallback_agent',
description: 'Fallback recovery agent',
},
],
},
],
})
const first = mod.loadHookChainsConfig({ pathOverride: configPath })
expect(first.exists).toBe(true)
expect(first.error).toBeUndefined()
expect(first.fromCache).toBe(false)
expect(first.config.enabled).toBe(true)
expect(first.config.rules).toHaveLength(1)
expect(first.config.rules[0]?.id).toBe('task-failure-fallback')
const second = mod.loadHookChainsConfig({ pathOverride: configPath })
expect(second.exists).toBe(true)
expect(second.error).toBeUndefined()
expect(second.fromCache).toBe(true)
expect(second.config.rules).toHaveLength(1)
})
test('accepts wrapped { hookChains: ... } config shape', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
hookChains: {
version: 1,
enabled: true,
rules: [
{
id: 'wrapped-shape',
trigger: { event: 'PostToolUseFailure', outcomes: ['failed'] },
actions: [{ type: 'notify_team' }],
},
],
},
})
const loaded = mod.loadHookChainsConfig({ pathOverride: configPath })
expect(loaded.error).toBeUndefined()
expect(loaded.config.enabled).toBe(true)
expect(loaded.config.rules[0]?.id).toBe('wrapped-shape')
})
test('returns disabled config for invalid schema', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
rules: [
{
id: 'invalid-rule',
trigger: {
event: 'TaskCompleted',
outcome: 'failed',
outcomes: ['failed'],
},
actions: [{ type: 'spawn_fallback_agent' }],
},
],
})
const loaded = mod.loadHookChainsConfig({ pathOverride: configPath })
expect(loaded.exists).toBe(true)
expect(loaded.error).toBeDefined()
expect(loaded.config.enabled).toBe(false)
expect(loaded.config.rules).toHaveLength(0)
})
})
describe('evaluateHookChainRules', () => {
test('matches by event + outcome + condition', async () => {
const mod = await importHookChainsModule()
const rules = [
{
id: 'post-tool-failure-rule',
trigger: { event: 'PostToolUseFailure', outcome: 'failed' },
condition: {
toolNames: ['Edit'],
errorIncludes: ['permission'],
eventFieldEquals: { 'meta.source': 'scheduler' },
},
actions: [{ type: 'spawn_fallback_agent' }],
},
]
const matches = mod.evaluateHookChainRules(rules as never, {
eventName: 'PostToolUseFailure',
outcome: 'failed',
payload: {
tool_name: 'Edit',
error: 'Permission denied by policy',
meta: { source: 'scheduler' },
},
})
expect(matches).toHaveLength(1)
expect(matches[0]?.rule.id).toBe('post-tool-failure-rule')
})
test('does not match when event/condition fail', async () => {
const mod = await importHookChainsModule()
const rules = [
{
id: 'rule-no-match',
trigger: { event: 'PostToolUseFailure', outcomes: ['failed'] },
condition: { toolNames: ['Write'] },
actions: [{ type: 'spawn_fallback_agent' }],
},
]
const wrongEvent = mod.evaluateHookChainRules(rules as never, {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { tool_name: 'Write' },
})
expect(wrongEvent).toHaveLength(0)
const wrongCondition = mod.evaluateHookChainRules(rules as never, {
eventName: 'PostToolUseFailure',
outcome: 'failed',
payload: { tool_name: 'Edit' },
})
expect(wrongCondition).toHaveLength(0)
})
})
describe('dispatchHookChainsForEvent guard logic', () => {
test('dedup skips duplicate event/action within dedup window', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 4,
defaultCooldownMs: 0,
defaultDedupWindowMs: 60_000,
rules: [
{
id: 'dedup-rule',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
cooldownMs: 0,
dedupWindowMs: 60_000,
actions: [{ id: 'spawn-1', type: 'spawn_fallback_agent' }],
},
],
})
const spawn = mock(async () => ({ launched: true, agentId: 'agent-1' }))
const first = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-123', error: 'boom' },
},
runtime: { onSpawnFallbackAgent: spawn },
})
const second = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-123', error: 'boom' },
},
runtime: { onSpawnFallbackAgent: spawn },
})
expect(first.actionResults[0]?.status).toBe('executed')
expect(second.actionResults[0]?.status).toBe('skipped')
expect(second.actionResults[0]?.reason).toContain('dedup')
expect(spawn).toHaveBeenCalledTimes(1)
})
test('cooldown skips second dispatch when rule cooldown is active', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 4,
defaultCooldownMs: 60_000,
defaultDedupWindowMs: 0,
rules: [
{
id: 'cooldown-rule',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
cooldownMs: 60_000,
dedupWindowMs: 0,
actions: [{ type: 'spawn_fallback_agent' }],
},
],
})
const spawn = mock(async () => ({ launched: true, agentId: 'agent-2' }))
const first = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-456' },
},
runtime: { onSpawnFallbackAgent: spawn },
})
const second = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-789' },
},
runtime: { onSpawnFallbackAgent: spawn },
})
expect(first.actionResults[0]?.status).toBe('executed')
expect(second.actionResults[0]?.status).toBe('skipped')
expect(second.actionResults[0]?.reason).toContain('cooldown')
expect(spawn).toHaveBeenCalledTimes(1)
})
test('depth limit blocks dispatch when chain depth reaches max', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 1,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'depth-rule',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'spawn_fallback_agent' }],
},
],
})
const spawn = mock(async () => ({ launched: true, agentId: 'agent-3' }))
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-depth' },
},
runtime: {
chainDepth: 1,
onSpawnFallbackAgent: spawn,
},
})
expect(result.enabled).toBe(true)
expect(result.matchedRuleIds).toHaveLength(0)
expect(result.actionResults).toHaveLength(0)
expect(spawn).not.toHaveBeenCalled()
})
})
describe('action dispatch skip scenarios', () => {
test('fails spawn_fallback_agent when launcher callback is missing', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'missing-launcher',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'spawn_fallback_agent' }],
},
],
})
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-missing-launcher' },
},
runtime: {},
})
expect(result.actionResults[0]?.status).toBe('failed')
expect(result.actionResults[0]?.reason).toContain('launcher')
})
test('skips disabled action and does not execute callback', async () => {
const mod = await importHookChainsModule()
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'disabled-action-rule',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [
{
type: 'spawn_fallback_agent',
enabled: false,
},
],
},
],
})
const spawn = mock(async () => ({ launched: true, agentId: 'agent-4' }))
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-disabled' },
},
runtime: { onSpawnFallbackAgent: spawn },
})
expect(result.actionResults[0]?.status).toBe('skipped')
expect(result.actionResults[0]?.reason).toContain('disabled')
expect(spawn).not.toHaveBeenCalled()
})
test('skips warm_remote_capacity when policy denies remote sessions', async () => {
const mod = await importHookChainsModule({ allowRemoteSessions: false })
const configPath = await makeConfigFile({
version: 1,
enabled: true,
maxChainDepth: 3,
defaultCooldownMs: 0,
defaultDedupWindowMs: 0,
rules: [
{
id: 'policy-denied-remote-warm',
trigger: { event: 'TaskCompleted', outcome: 'failed' },
actions: [{ type: 'warm_remote_capacity' }],
},
],
})
const warm = mock(async () => ({
warmed: true,
environmentId: 'env-123',
}))
const result = await mod.dispatchHookChainsForEvent({
configPathOverride: configPath,
event: {
eventName: 'TaskCompleted',
outcome: 'failed',
payload: { task_id: 'task-policy-denied' },
},
runtime: { onWarmRemoteCapacity: warm },
})
expect(result.actionResults[0]?.status).toBe('skipped')
expect(result.actionResults[0]?.reason).toContain('policy')
expect(warm).not.toHaveBeenCalled()
})
})

File diff suppressed because it is too large

View File

@@ -10,7 +10,6 @@ import { wrapSpawn } from './ShellCommand.js'
import { TaskOutput } from './task/TaskOutput.js'
import { getCwd } from './cwd.js'
import { randomUUID } from 'crypto'
import { feature } from 'bun:bundle'
import { formatShellPrefixCommand } from './bash/shellPrefix.js'
import {
getHookEnvFilePath,
@@ -135,7 +134,6 @@ import { registerPendingAsyncHook } from './hooks/AsyncHookRegistry.js'
import { enqueuePendingNotification } from './messageQueueManager.js'
import {
extractTextContent,
createAssistantMessage,
getLastAssistantMessage,
wrapInSystemReminder,
} from './messages.js'
@@ -147,7 +145,6 @@ import {
import { createAttachmentMessage } from './attachments.js'
import { all } from './generators.js'
import { findToolByName, type Tools, type ToolUseContext } from '../Tool.js'
import type { CanUseToolFn } from '../hooks/useCanUseTool.js'
import { execPromptHook } from './hooks/execPromptHook.js'
import type { Message, AssistantMessage } from '../types/message.js'
import { execAgentHook } from './hooks/execAgentHook.js'
@@ -165,147 +162,9 @@ import type { AppState } from '../state/AppState.js'
import { jsonStringify, jsonParse } from './slowOperations.js'
import { isEnvTruthy } from './envUtils.js'
import { errorMessage, getErrnoCode } from './errors.js'
import { getAgentName, getTeamName, getTeammateColor } from './teammate.js'
import type {
HookChainOutcome,
HookChainRuntimeContext,
SpawnFallbackAgentRequest,
SpawnFallbackAgentResponse,
} from './hookChains.js'
const TOOL_HOOK_EXECUTION_TIMEOUT_MS = 10 * 60 * 1000
function normalizeFallbackAgentModel(
model: string | undefined,
): 'sonnet' | 'opus' | 'haiku' | undefined {
if (model === 'sonnet' || model === 'opus' || model === 'haiku') {
return model
}
return undefined
}
async function launchFallbackAgentFromHookChains(
request: SpawnFallbackAgentRequest,
toolUseContext: ToolUseContext,
canUseTool: CanUseToolFn,
): Promise<SpawnFallbackAgentResponse> {
try {
const { AgentTool } = await import('../tools/AgentTool/AgentTool.js')
const normalizedModel = normalizeFallbackAgentModel(request.model)
const result = await AgentTool.call(
{
prompt: request.prompt,
description: request.description,
run_in_background: true,
...(request.agentType ? { subagent_type: request.agentType } : {}),
...(normalizedModel ? { model: normalizedModel } : {}),
},
toolUseContext,
canUseTool,
createAssistantMessage({ content: [] }),
)
const data = result.data as
| {
status?: string
agentId?: string
agent_id?: string
}
| undefined
const status = data?.status
if (
status === 'async_launched' ||
status === 'completed' ||
status === 'remote_launched' ||
status === 'teammate_spawned'
) {
return {
launched: true,
agentId: data?.agentId ?? data?.agent_id,
}
}
return {
launched: true,
reason:
status !== undefined
? `Fallback launched with status ${status}`
: undefined,
}
} catch (error) {
return {
launched: false,
reason: `Fallback launch failed: ${errorMessage(error)}`,
}
}
}
async function dispatchHookChainFromHookRuntime(args: {
eventName: 'PostToolUseFailure' | 'TaskCompleted'
outcome: HookChainOutcome
payload: Record<string, unknown>
signal?: AbortSignal
toolUseContext?: ToolUseContext
}): Promise<void> {
try {
if (!feature('HOOK_CHAINS')) {
return
}
const { dispatchHookChainsForEvent } = await import('./hookChains.js')
const runtime: HookChainRuntimeContext = {
signal: args.signal,
senderName: getAgentName() ?? undefined,
senderColor: getTeammateColor() ?? undefined,
teamName: getTeamName() ?? undefined,
}
const chainDepth = args.toolUseContext?.queryTracking?.depth
if (typeof chainDepth === 'number' && Number.isFinite(chainDepth)) {
runtime.chainDepth = chainDepth
}
const hookChainsCanUseTool = (
args.toolUseContext as
| (ToolUseContext & { hookChainsCanUseTool?: CanUseToolFn })
| undefined
)?.hookChainsCanUseTool
if (args.toolUseContext) {
runtime.onSpawnFallbackAgent = request => {
if (!hookChainsCanUseTool) {
return Promise.resolve({
launched: false,
reason:
'Fallback action requires canUseTool in this hook runtime context',
})
}
return launchFallbackAgentFromHookChains(
request,
args.toolUseContext!,
hookChainsCanUseTool,
)
}
}
await dispatchHookChainsForEvent({
event: {
eventName: args.eventName,
outcome: args.outcome,
payload: args.payload,
},
runtime,
})
} catch (error) {
logForDebugging(
`[hook-chains] Dispatch failed for ${args.eventName}: ${errorMessage(error)}`,
)
}
}
/**
* SessionEnd hooks run during shutdown/clear and need a much tighter bound
* than TOOL_HOOK_EXECUTION_TIMEOUT_MS. This value is used by callers as both
@@ -3643,11 +3502,9 @@ export async function* executePostToolUseFailureHooks<ToolInput>(
): AsyncGenerator<AggregatedHookResult> {
const appState = toolUseContext.getAppState()
const sessionId = toolUseContext.agentId ?? getSessionId()
const hasPostToolFailureHooks = hasHookForEvent(
'PostToolUseFailure',
appState,
sessionId,
)
if (!hasHookForEvent('PostToolUseFailure', appState, sessionId)) {
return
}
const hookInput: PostToolUseFailureHookInput = {
...createBaseHookInput(permissionMode, undefined, toolUseContext),
@@ -3659,33 +3516,12 @@ export async function* executePostToolUseFailureHooks<ToolInput>(
is_interrupt: isInterrupt,
}
let blockingHookCount = 0
if (hasPostToolFailureHooks) {
for await (const result of executeHooks({
hookInput,
toolUseID,
matchQuery: toolName,
signal,
timeoutMs,
toolUseContext,
})) {
if (result.blockingError) {
blockingHookCount++
}
yield result
}
}
await dispatchHookChainFromHookRuntime({
eventName: 'PostToolUseFailure',
outcome: 'failed',
payload: {
...hookInput,
hook_blocking_error_count: blockingHookCount,
hook_execution_skipped: !hasPostToolFailureHooks,
},
yield* executeHooks({
hookInput,
toolUseID,
matchQuery: toolName,
signal,
timeoutMs,
toolUseContext,
})
}
@@ -3971,36 +3807,12 @@ export async function* executeTaskCompletedHooks(
team_name: teamName,
}
let blockingHookCount = 0
let preventedContinuation = false
for await (const result of executeHooks({
yield* executeHooks({
hookInput,
toolUseID: randomUUID(),
signal,
timeoutMs,
toolUseContext,
})) {
if (result.blockingError) {
blockingHookCount++
}
if (result.preventContinuation) {
preventedContinuation = true
}
yield result
}
await dispatchHookChainFromHookRuntime({
eventName: 'TaskCompleted',
outcome:
blockingHookCount > 0 || preventedContinuation ? 'failed' : 'success',
payload: {
...hookInput,
hook_blocking_error_count: blockingHookCount,
hook_prevented_continuation: preventedContinuation,
},
signal,
toolUseContext,
})
}

View File

@@ -24,7 +24,7 @@ type CachedParse = { ok: true; value: unknown } | { ok: false }
// lodash memoize default resolver = first arg only).
// Skip caching above this size — the LRU stores the full string as the key,
// so a 200KB config file would pin ~10MB in #keyList across 50 slots. Large
// inputs like ~/.openclaude.json also change between reads (numStartups bumps on
// inputs like ~/.claude.json also change between reads (numStartups bumps on
// every CC startup), so the cache never hits anyway.
const PARSE_CACHE_MAX_KEY_BYTES = 8 * 1024

View File

@@ -44,10 +44,9 @@ function getCandidateLocalBinaryPaths(localInstallDir: string): string[] {
}
export function isManagedLocalInstallationPath(execPath: string): boolean {
const normalizedExecPath = execPath.replace(/\\+/g, '/')
return (
normalizedExecPath.includes('/.openclaude/local/node_modules/') ||
normalizedExecPath.includes('/.claude/local/node_modules/')
execPath.includes('/.openclaude/local/node_modules/') ||
execPath.includes('/.claude/local/node_modules/')
)
}

View File

@@ -131,7 +131,7 @@ export function applySafeConfigEnvironmentVariables(): void {
: null
}
// Global config (~/.openclaude.json) is user-controlled. In CCD mode,
// Global config (~/.claude.json) is user-controlled. In CCD mode,
// filterSettingsEnv strips keys that were in the spawn env snapshot so
// the desktop host's operational vars (OTEL, etc.) are not overridden.
Object.assign(process.env, filterSettingsEnv(getGlobalConfig().env))

View File

@@ -123,6 +123,7 @@ export const SAFE_ENV_VARS = new Set([
'ANTHROPIC_DEFAULT_SONNET_MODEL_DESCRIPTION',
'ANTHROPIC_DEFAULT_SONNET_MODEL_NAME',
'ANTHROPIC_DEFAULT_SONNET_MODEL_SUPPORTED_CAPABILITIES',
'ANTHROPIC_FOUNDRY_API_KEY',
'ANTHROPIC_MODEL',
'ANTHROPIC_SMALL_FAST_MODEL_AWS_REGION',
'ANTHROPIC_SMALL_FAST_MODEL',

View File

@@ -75,7 +75,6 @@ import type {
import { isAdvisorBlock } from './advisor.js'
import { isAgentSwarmsEnabled } from './agentSwarmsEnabled.js'
import { count } from './array.js'
import { isEnvTruthy } from './envUtils.js'
import {
type Attachment,
type HookAttachment,
@@ -3667,9 +3666,6 @@ Read the team config to discover your teammates' names. Check the task list peri
])
}
case 'todo_reminder': {
if (isEnvTruthy(process.env.OPENCLAUDE_DISABLE_TOOL_REMINDERS)) {
return []
}
const todoItems = attachment.content
.map((todo, index) => `${index + 1}. [${todo.status}] ${todo.content}`)
.join('\n')
@@ -3690,9 +3686,6 @@ Read the team config to discover your teammates' names. Check the task list peri
if (!isTodoV2Enabled()) {
return []
}
if (isEnvTruthy(process.env.OPENCLAUDE_DISABLE_TOOL_REMINDERS)) {
return []
}
const taskItems = attachment.content
.map(task => `#${task.id}. [${task.status}] ${task.subject}`)
.join('\n')

View File

@@ -1,205 +0,0 @@
/**
* Model Benchmarking for OpenClaude
*
 * Measures model throughput and first-token latency for informed model selection.
 * Supports OpenAI-compatible endpoints (OpenAI, Ollama, NVIDIA NIM, MiniMax);
 * other providers report "Benchmark not supported for this provider".
*/
import { getAPIProvider } from './providers.js'
export interface BenchmarkResult {
model: string
provider: string
firstTokenMs: number
totalTokens: number
tokensPerSecond: number
success: boolean
error?: string
}
const TEST_PROMPT = 'Write a short hello world in Python.'
const MAX_TOKENS = 50
const TIMEOUT_MS = 30000
function getBenchmarkEndpoint(): string | null {
const provider = getAPIProvider()
const baseUrl = process.env.OPENAI_BASE_URL
// Check for Ollama (local)
if (baseUrl?.includes('localhost:11434') || baseUrl?.includes('localhost:11435')) {
return `${baseUrl}/chat/completions`
}
// OpenAI-compatible endpoints
if (provider === 'openai' || provider === 'firstParty') {
return `${baseUrl || 'https://api.openai.com/v1'}/chat/completions`
}
// NVIDIA NIM or MiniMax via OPENAI_BASE_URL
if (baseUrl?.includes('nvidia') || baseUrl?.includes('minimax')) {
return `${baseUrl}/chat/completions`
}
return null
}
function getBenchmarkAuthHeader(): string | null {
const apiKey = process.env.OPENAI_API_KEY
if (!apiKey) return null
return `Bearer ${apiKey}`
}
export async function benchmarkModel(
model: string,
onChunk?: (text: string) => void,
): Promise<BenchmarkResult> {
const endpoint = getBenchmarkEndpoint()
const authHeader = getBenchmarkAuthHeader()
if (!endpoint || !authHeader) {
return {
model,
provider: getAPIProvider(),
firstTokenMs: 0,
totalTokens: 0,
tokensPerSecond: 0,
success: false,
error: 'Benchmark not supported for this provider',
}
}
const startTime = performance.now()
let totalTokens = 0
let firstTokenMs: number | null = null
try {
const response = await fetch(endpoint, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': authHeader,
},
body: JSON.stringify({
model,
messages: [{ role: 'user', content: TEST_PROMPT }],
max_tokens: MAX_TOKENS,
stream: true,
}),
signal: AbortSignal.timeout(TIMEOUT_MS),
})
if (!response.ok) {
let errorMsg = `HTTP ${response.status}`
try {
const error = await response.json()
errorMsg = error.error?.message || errorMsg
} catch {
// ignore
}
return {
model,
provider: getAPIProvider(),
firstTokenMs: 0,
totalTokens: 0,
tokensPerSecond: 0,
success: false,
error: errorMsg,
}
}
const reader = response.body?.getReader()
if (!reader) {
throw new Error('No response body')
}
const decoder = new TextDecoder()
let buffer = ''
while (true) {
const { done, value } = await reader.read()
if (done) break
buffer += decoder.decode(value, { stream: true })
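// SSE events are newline-delimited and may arrive split across network
// chunks; process only complete lines and keep the partial tail buffered.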
const lines = buffer.split('\n')
buffer = lines.pop() || ''
for (const line of lines) {
if (line.startsWith('data: ')) {
const data = line.slice(6)
if (data === '[DONE]') continue
try {
const json = JSON.parse(data)
const content = json.choices?.[0]?.delta?.content
if (content) {
if (firstTokenMs === null) {
firstTokenMs = performance.now() - startTime
}
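// Rough estimate: ~4 characters per token, avoiding a tokenizer
// dependency for what is only a relative speed comparison.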
totalTokens += content.length / 4
onChunk?.(content)
}
} catch {
// skip invalid JSON
}
}
}
}
const totalMs = performance.now() - startTime
const tokensPerSecond = totalMs > 0 ? (totalTokens / totalMs) * 1000 : 0
return {
model,
provider: getAPIProvider(),
firstTokenMs: firstTokenMs ?? 0,
totalTokens,
tokensPerSecond,
success: true,
}
} catch (error) {
return {
model,
provider: getAPIProvider(),
firstTokenMs: 0,
totalTokens: 0,
tokensPerSecond: 0,
success: false,
error: error instanceof Error ? error.message : 'Unknown error',
}
}
}
export async function benchmarkMultipleModels(
models: string[],
onProgress?: (completed: number, total: number, result: BenchmarkResult) => void,
): Promise<BenchmarkResult[]> {
const results: BenchmarkResult[] = []
for (let i = 0; i < models.length; i++) {
const result = await benchmarkModel(models[i])
results.push(result)
onProgress?.(i + 1, models.length, result)
}
return results
}
export function formatBenchmarkResults(results: BenchmarkResult[]): string {
const header = 'Model'.padEnd(40) + 'TPS'.padStart(6) + ' ' + 'First Token'.padStart(12) + ' Status'
const divider = '-'.repeat(70)
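// Example output row (illustrative values):
//   llama3.3:70b                              42.3        812ms ✓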
const rows = results
.sort((a, b) => b.tokensPerSecond - a.tokensPerSecond)
.map(r => {
const name = r.model.length > 38 ? r.model.slice(0, 37) + '…' : r.model
const tps = r.tokensPerSecond.toFixed(1).padStart(6)
const first = r.firstTokenMs > 0 ? `${r.firstTokenMs.toFixed(0)}ms`.padStart(12) : 'N/A'.padStart(12)
const status = r.success ? '✓' : '✗'
return name.padEnd(40) + tps + ' ' + first + ' ' + status
})
return [header, divider, ...rows].join('\n')
}
export function isBenchmarkSupported(): boolean {
const endpoint = getBenchmarkEndpoint()
const authHeader = getBenchmarkAuthHeader()
return endpoint !== null && authHeader !== null
}
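// Illustrative usage (model names are examples):
//   if (isBenchmarkSupported()) {
//     const results = await benchmarkMultipleModels(['llama3.1:8b', 'qwen2.5-coder:7b'])
//     console.log(formatBenchmarkResults(results))
//   }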

View File

@@ -20,7 +20,7 @@ export const OPENAI_MODEL_DEFAULTS = {
// Override with GEMINI_MODEL env var.
// ---------------------------------------------------------------------------
export const GEMINI_MODEL_DEFAULTS = {
opus: 'gemini-2.5-pro', // most capable
opus: 'gemini-2.5-pro-preview-03-25', // most capable
sonnet: 'gemini-2.0-flash', // balanced
haiku: 'gemini-2.0-flash-lite', // fast & cheap
} as const
@@ -112,7 +112,7 @@ export const CLAUDE_OPUS_4_CONFIG = {
vertex: 'claude-opus-4@20250514',
foundry: 'claude-opus-4',
openai: 'gpt-4o',
gemini: 'gemini-2.5-pro',
gemini: 'gemini-2.5-pro-preview-03-25',
github: 'github:copilot',
codex: 'gpt-5.4',
'nvidia-nim': 'nvidia/llama-3.1-nemotron-70b-instruct',
@@ -125,7 +125,7 @@ export const CLAUDE_OPUS_4_1_CONFIG = {
vertex: 'claude-opus-4-1@20250805',
foundry: 'claude-opus-4-1',
openai: 'gpt-4o',
gemini: 'gemini-2.5-pro',
gemini: 'gemini-2.5-pro-preview-03-25',
github: 'github:copilot',
codex: 'gpt-5.4',
'nvidia-nim': 'nvidia/llama-3.1-nemotron-70b-instruct',
@@ -138,7 +138,7 @@ export const CLAUDE_OPUS_4_5_CONFIG = {
vertex: 'claude-opus-4-5@20251101',
foundry: 'claude-opus-4-5',
openai: 'gpt-4o',
gemini: 'gemini-2.5-pro',
gemini: 'gemini-2.5-pro-preview-03-25',
github: 'github:copilot',
codex: 'gpt-5.4',
'nvidia-nim': 'nvidia/llama-3.1-nemotron-70b-instruct',
@@ -151,7 +151,7 @@ export const CLAUDE_OPUS_4_6_CONFIG = {
vertex: 'claude-opus-4-6',
foundry: 'claude-opus-4-6',
openai: 'gpt-4o',
gemini: 'gemini-2.5-pro',
gemini: 'gemini-2.5-pro-preview-03-25',
github: 'github:copilot',
codex: 'gpt-5.4',
'nvidia-nim': 'nvidia/llama-3.1-nemotron-70b-instruct',

View File

@@ -1,199 +0,0 @@
import { afterEach, beforeEach, expect, mock, test } from 'bun:test'
import { saveGlobalConfig } from '../config.js'
import {
getDefaultHaikuModel,
getDefaultOpusModel,
getDefaultSonnetModel,
getSmallFastModel,
getUserSpecifiedModelSetting,
} from './model.js'
const SAVED_ENV = {
CLAUDE_CODE_USE_OPENAI: process.env.CLAUDE_CODE_USE_OPENAI,
CLAUDE_CODE_USE_GEMINI: process.env.CLAUDE_CODE_USE_GEMINI,
CLAUDE_CODE_USE_GITHUB: process.env.CLAUDE_CODE_USE_GITHUB,
CLAUDE_CODE_USE_MISTRAL: process.env.CLAUDE_CODE_USE_MISTRAL,
CLAUDE_CODE_USE_BEDROCK: process.env.CLAUDE_CODE_USE_BEDROCK,
CLAUDE_CODE_USE_VERTEX: process.env.CLAUDE_CODE_USE_VERTEX,
CLAUDE_CODE_USE_FOUNDRY: process.env.CLAUDE_CODE_USE_FOUNDRY,
NVIDIA_NIM: process.env.NVIDIA_NIM,
MINIMAX_API_KEY: process.env.MINIMAX_API_KEY,
OPENAI_MODEL: process.env.OPENAI_MODEL,
OPENAI_BASE_URL: process.env.OPENAI_BASE_URL,
CODEX_API_KEY: process.env.CODEX_API_KEY,
CHATGPT_ACCOUNT_ID: process.env.CHATGPT_ACCOUNT_ID,
}
function restoreEnv(key: keyof typeof SAVED_ENV): void {
if (SAVED_ENV[key] === undefined) {
delete process.env[key]
} else {
process.env[key] = SAVED_ENV[key]
}
}
beforeEach(() => {
// Other test files (notably modelOptions.github.test.ts) install a
// persistent mock.module for './providers.js' that overrides getAPIProvider
// globally. Without mock.restore() here, those overrides bleed into this
// suite and the provider-kind branches we're testing become unreachable.
mock.restore()
delete process.env.CLAUDE_CODE_USE_OPENAI
delete process.env.CLAUDE_CODE_USE_GEMINI
delete process.env.CLAUDE_CODE_USE_GITHUB
delete process.env.CLAUDE_CODE_USE_MISTRAL
delete process.env.CLAUDE_CODE_USE_BEDROCK
delete process.env.CLAUDE_CODE_USE_VERTEX
delete process.env.CLAUDE_CODE_USE_FOUNDRY
delete process.env.NVIDIA_NIM
delete process.env.MINIMAX_API_KEY
delete process.env.OPENAI_MODEL
delete process.env.OPENAI_BASE_URL
delete process.env.CODEX_API_KEY
delete process.env.CHATGPT_ACCOUNT_ID
saveGlobalConfig(current => ({
...current,
model: undefined,
}))
})
afterEach(() => {
for (const key of Object.keys(SAVED_ENV) as Array<keyof typeof SAVED_ENV>) {
restoreEnv(key)
}
saveGlobalConfig(current => ({
...current,
model: undefined,
}))
})
test('codex provider reads OPENAI_MODEL, not stale settings.model', () => {
// Regression: switching from Moonshot (settings.model='kimi-k2.6' persisted
// from that session) to the Codex profile. Codex profile correctly sets
// OPENAI_MODEL=codexplan + base URL to chatgpt.com/backend-api/codex.
// getUserSpecifiedModelSetting previously ignored env for 'codex' provider
// and returned settings.model='kimi-k2.6', causing Codex's API to reject
// the request: "The 'kimi-k2.6' model is not supported when using Codex".
saveGlobalConfig(current => ({ ...current, model: 'kimi-k2.6' }))
process.env.CLAUDE_CODE_USE_OPENAI = '1'
process.env.OPENAI_BASE_URL = 'https://chatgpt.com/backend-api/codex'
process.env.OPENAI_MODEL = 'codexplan'
process.env.CODEX_API_KEY = 'codex-test'
process.env.CHATGPT_ACCOUNT_ID = 'acct_test'
const model = getUserSpecifiedModelSetting()
expect(model).toBe('codexplan')
})
test('nvidia-nim provider reads OPENAI_MODEL, not stale settings.model', () => {
saveGlobalConfig(current => ({ ...current, model: 'kimi-k2.6' }))
process.env.NVIDIA_NIM = '1'
process.env.CLAUDE_CODE_USE_OPENAI = '1'
process.env.OPENAI_MODEL = 'nvidia/llama-3.1-nemotron-70b-instruct'
const model = getUserSpecifiedModelSetting()
expect(model).toBe('nvidia/llama-3.1-nemotron-70b-instruct')
})
test('minimax provider reads OPENAI_MODEL, not stale settings.model', () => {
saveGlobalConfig(current => ({ ...current, model: 'kimi-k2.6' }))
process.env.MINIMAX_API_KEY = 'minimax-test'
process.env.CLAUDE_CODE_USE_OPENAI = '1'
process.env.OPENAI_MODEL = 'MiniMax-M2.5'
const model = getUserSpecifiedModelSetting()
expect(model).toBe('MiniMax-M2.5')
})
test('openai provider still reads OPENAI_MODEL (regression guard)', () => {
saveGlobalConfig(current => ({ ...current, model: 'stale-default' }))
process.env.CLAUDE_CODE_USE_OPENAI = '1'
process.env.OPENAI_MODEL = 'gpt-4o'
const model = getUserSpecifiedModelSetting()
expect(model).toBe('gpt-4o')
})
test('github provider still reads OPENAI_MODEL (regression guard)', () => {
saveGlobalConfig(current => ({ ...current, model: 'stale-default' }))
process.env.CLAUDE_CODE_USE_GITHUB = '1'
process.env.OPENAI_MODEL = 'github:copilot'
const model = getUserSpecifiedModelSetting()
expect(model).toBe('github:copilot')
})
// ---------------------------------------------------------------------------
// Default model helpers — must not fall through to claude-haiku-4-5 etc. for
// OpenAI-shim providers whose endpoints don't speak Anthropic model names.
// Hitting that fallthrough caused WebFetch to hang for 60s on MiniMax/Codex
// because queryHaiku() shipped an unknown model id to the shim endpoint.
// ---------------------------------------------------------------------------
test('getSmallFastModel returns OPENAI_MODEL for MiniMax (regression: WebFetch hang)', () => {
process.env.MINIMAX_API_KEY = 'minimax-test'
process.env.OPENAI_MODEL = 'MiniMax-M2.5-highspeed'
expect(getSmallFastModel()).toBe('MiniMax-M2.5-highspeed')
})
test('getSmallFastModel returns OPENAI_MODEL for Codex (regression)', () => {
process.env.CLAUDE_CODE_USE_OPENAI = '1'
process.env.OPENAI_BASE_URL = 'https://chatgpt.com/backend-api/codex'
process.env.OPENAI_MODEL = 'codexspark'
process.env.CODEX_API_KEY = 'codex-test'
process.env.CHATGPT_ACCOUNT_ID = 'acct_test'
expect(getSmallFastModel()).toBe('codexspark')
})
test('getSmallFastModel returns OPENAI_MODEL for NVIDIA NIM (regression)', () => {
process.env.NVIDIA_NIM = '1'
process.env.CLAUDE_CODE_USE_OPENAI = '1'
process.env.OPENAI_MODEL = 'nvidia/llama-3.1-nemotron-70b-instruct'
expect(getSmallFastModel()).toBe('nvidia/llama-3.1-nemotron-70b-instruct')
})
test('getDefaultOpusModel returns OPENAI_MODEL for MiniMax', () => {
process.env.MINIMAX_API_KEY = 'minimax-test'
process.env.OPENAI_MODEL = 'MiniMax-M2.7'
expect(getDefaultOpusModel()).toBe('MiniMax-M2.7')
})
test('getDefaultSonnetModel returns OPENAI_MODEL for NVIDIA NIM', () => {
process.env.NVIDIA_NIM = '1'
process.env.CLAUDE_CODE_USE_OPENAI = '1'
process.env.OPENAI_MODEL = 'nvidia/llama-3.1-nemotron-70b-instruct'
expect(getDefaultSonnetModel()).toBe('nvidia/llama-3.1-nemotron-70b-instruct')
})
test('getDefaultHaikuModel returns OPENAI_MODEL for MiniMax', () => {
process.env.MINIMAX_API_KEY = 'minimax-test'
process.env.OPENAI_MODEL = 'MiniMax-M2.5-highspeed'
expect(getDefaultHaikuModel()).toBe('MiniMax-M2.5-highspeed')
})
test('default helpers do not leak claude-* names to shim providers', () => {
// Umbrella guard: for each OpenAI-shim provider, none of the default-model
// helpers may return an Anthropic-branded model name. That was the source
// of the WebFetch 60s hang — MiniMax received "claude-haiku-4-5" and sat
// on the connection.
process.env.MINIMAX_API_KEY = 'minimax-test'
process.env.OPENAI_MODEL = 'MiniMax-M2.7'
for (const fn of [
getSmallFastModel,
getDefaultOpusModel,
getDefaultSonnetModel,
getDefaultHaikuModel,
]) {
const model = fn()
expect(model.toLowerCase()).not.toContain('claude')
}
})

View File

@@ -52,25 +52,10 @@ export function getSmallFastModel(): ModelName {
if (getAPIProvider() === 'openai') {
return process.env.OPENAI_MODEL || 'gpt-4o-mini'
}
// Codex provider — OPENAI_MODEL is always set for Codex profiles; only fall
// back to a codex-spark alias when an override env strips it.
if (getAPIProvider() === 'codex') {
return process.env.OPENAI_MODEL || 'codexspark'
}
// For GitHub Copilot provider
if (getAPIProvider() === 'github') {
return process.env.OPENAI_MODEL || 'github:copilot'
}
// NVIDIA NIM — OPENAI_MODEL carries the user's active NIM model; use a
// small Meta Llama variant as the conservative fallback.
if (getAPIProvider() === 'nvidia-nim') {
return process.env.OPENAI_MODEL || 'meta/llama-3.1-8b-instruct'
}
// MiniMax — OPENAI_MODEL carries the active MiniMax model; fall back to
// the fastest tier (M2.5-highspeed) when missing.
if (getAPIProvider() === 'minimax') {
return process.env.OPENAI_MODEL || 'MiniMax-M2.5-highspeed'
}
return getDefaultHaikuModel()
}
@@ -106,24 +91,11 @@ export function getUserSpecifiedModelSetting(): ModelSetting | undefined {
const setting = normalizeModelSetting(settings.model)
// Read the model env var that matches the active provider to prevent
// cross-provider leaks (e.g. ANTHROPIC_MODEL sent to the OpenAI API).
//
// All OpenAI-shim providers (openai, codex, github, nvidia-nim, minimax)
// set CLAUDE_CODE_USE_OPENAI=1 + OPENAI_MODEL via
// applyProviderProfileToProcessEnv. Earlier this check only included
// openai/github — codex/nvidia-nim/minimax fell through to the stale
// settings.model, so switching from (say) Moonshot to Codex kept firing
// `kimi-k2.6` at the Codex endpoint and getting 400s.
const provider = getAPIProvider()
const isOpenAIShimProvider =
provider === 'openai' ||
provider === 'codex' ||
provider === 'github' ||
provider === 'nvidia-nim' ||
provider === 'minimax'
specifiedModel =
(provider === 'gemini' ? process.env.GEMINI_MODEL : undefined) ||
(provider === 'mistral' ? process.env.MISTRAL_MODEL : undefined) ||
(isOpenAIShimProvider ? process.env.OPENAI_MODEL : undefined) ||
(provider === 'openai' || provider === 'gemini' || provider === 'mistral' || provider === 'github' ? process.env.OPENAI_MODEL : undefined) ||
(provider === 'firstParty' ? process.env.ANTHROPIC_MODEL : undefined) ||
setting ||
undefined
@@ -168,7 +140,7 @@ export function getDefaultOpusModel(): ModelName {
}
// Gemini provider
if (getAPIProvider() === 'gemini') {
return process.env.GEMINI_MODEL || 'gemini-2.5-pro'
return process.env.GEMINI_MODEL || 'gemini-2.5-pro-preview-03-25'
}
// Mistral provider
if (getAPIProvider() === 'mistral') {
@@ -186,14 +158,6 @@ export function getDefaultOpusModel(): ModelName {
if (getAPIProvider() === 'github') {
return process.env.OPENAI_MODEL || 'github:copilot'
}
// NVIDIA NIM
if (getAPIProvider() === 'nvidia-nim') {
return process.env.OPENAI_MODEL || 'nvidia/llama-3.1-nemotron-70b-instruct'
}
// MiniMax — flagship tier for "opus"-equivalent.
if (getAPIProvider() === 'minimax') {
return process.env.OPENAI_MODEL || 'MiniMax-M2.7'
}
// 3P providers (Bedrock, Vertex, Foundry) — kept as a separate branch
// even when values match, since 3P availability lags firstParty and
// these will diverge again at the next model launch.
@@ -228,14 +192,6 @@ export function getDefaultSonnetModel(): ModelName {
if (getAPIProvider() === 'github') {
return process.env.OPENAI_MODEL || 'github:copilot'
}
// NVIDIA NIM
if (getAPIProvider() === 'nvidia-nim') {
return process.env.OPENAI_MODEL || 'nvidia/llama-3.1-nemotron-70b-instruct'
}
// MiniMax — mid tier for "sonnet"-equivalent.
if (getAPIProvider() === 'minimax') {
return process.env.OPENAI_MODEL || 'MiniMax-M2.5'
}
// Default to Sonnet 4.5 for 3P since they may not have 4.6 yet
if (getAPIProvider() !== 'firstParty') {
return getModelStrings().sonnet45
@@ -268,14 +224,6 @@ export function getDefaultHaikuModel(): ModelName {
if (getAPIProvider() === 'gemini') {
return process.env.GEMINI_MODEL || 'gemini-2.0-flash-lite'
}
// NVIDIA NIM
if (getAPIProvider() === 'nvidia-nim') {
return process.env.OPENAI_MODEL || 'meta/llama-3.1-8b-instruct'
}
// MiniMax — fastest tier for "haiku"-equivalent.
if (getAPIProvider() === 'minimax') {
return process.env.OPENAI_MODEL || 'MiniMax-M2.5-highspeed'
}
// Haiku 4.5 is available on all platforms (first-party, Foundry, Bedrock, Vertex)
return getModelStrings().haiku45

View File

@@ -1,30 +0,0 @@
import { describe, expect, it, beforeEach, afterEach, vi } from 'bun:test'
import { isModelCacheValid, getCachedModelsFromDisk, saveModelsToCache } from '../model/modelCache.js'
vi.mock('../model/ollamaModels.js', () => ({
isOllamaProvider: vi.fn(() => true),
}))
describe('modelCache', () => {
const mockModel = { value: 'llama3', label: 'Llama 3', description: 'Test model' }
describe('isModelCacheValid', () => {
it('returns false for non-existent cache', async () => {
const result = await isModelCacheValid('ollama')
expect(result).toBe(false)
})
})
describe('getCachedModelsFromDisk', () => {
it('returns null when no cache is available', async () => {
const result = await getCachedModelsFromDisk()
expect(result).toBeNull()
})
})
describe('saveModelsToCache', () => {
it('has saveModelsToCache function', () => {
expect(typeof saveModelsToCache).toBe('function')
})
})
})

View File

@@ -1,165 +0,0 @@
/**
* Model Caching for OpenClaude
*
* Caches model lists to disk for faster startup and offline access.
* Uses async fs operations to avoid blocking the event loop.
*/
import { access, readFile, writeFile, unlink } from 'node:fs/promises'
import { existsSync, mkdirSync } from 'node:fs'
import { join } from 'node:path'
import { homedir } from 'node:os'
import { getAPIProvider } from './providers.js'
const CACHE_VERSION = '1'
const CACHE_TTL_HOURS = 24
const CACHE_DIR_NAME = '.openclaude-model-cache'
interface ModelCache {
version: string
timestamp: number
provider: string
models: Array<{ value: string; label: string; description: string }>
}
function getCacheDir(): string {
const home = homedir()
const cacheDir = join(home, CACHE_DIR_NAME)
if (!existsSync(cacheDir)) {
// Create synchronously: getCacheDir is called from sync paths, so the
// directory must exist before the first cache write.
mkdirSync(cacheDir, { recursive: true })
}
return cacheDir
}
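// e.g. ~/.openclaude-model-cache/ollama.json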
function getCacheFilePath(provider: string): string {
return join(getCacheDir(), `${provider}.json`)
}
function isOpenAICompatibleProvider(): boolean {
const baseUrl = process.env.OPENAI_BASE_URL || ''
return baseUrl.includes('localhost') || baseUrl.includes('nvidia') || baseUrl.includes('minimax') || getAPIProvider() === 'openai'
}
export async function isModelCacheValid(provider: string): Promise<boolean> {
const cachePath = getCacheFilePath(provider)
try {
await access(cachePath)
} catch {
return false
}
try {
const data = JSON.parse(await readFile(cachePath, 'utf-8')) as ModelCache
if (data.version !== CACHE_VERSION) {
return false
}
if (data.provider !== provider) {
return false
}
const ageHours = (Date.now() - data.timestamp) / (1000 * 60 * 60)
return ageHours < CACHE_TTL_HOURS
} catch {
return false
}
}
export async function getCachedModelsFromDisk<T>(): Promise<T[] | null> {
const provider = getAPIProvider()
const baseUrl = process.env.OPENAI_BASE_URL || ''
const isLocalOllama = baseUrl.includes('localhost:11434') || baseUrl.includes('localhost:11435')
const isNvidia = baseUrl.includes('nvidia') || baseUrl.includes('integrate.api.nvidia')
const isMiniMax = baseUrl.includes('minimax')
if (!isLocalOllama && !isNvidia && !isMiniMax && provider !== 'openai') {
return null
}
const cachePath = getCacheFilePath(provider)
if (!(await isModelCacheValid(provider))) {
return null
}
try {
const data = JSON.parse(await readFile(cachePath, 'utf-8')) as ModelCache
return data.models as T[]
} catch {
return null
}
}
export async function saveModelsToCache(
models: Array<{ value: string; label: string; description: string }>,
): Promise<void> {
const provider = getAPIProvider()
if (!provider) return
const cachePath = getCacheFilePath(provider)
const cacheData: ModelCache = {
version: CACHE_VERSION,
timestamp: Date.now(),
provider,
models,
}
try {
await writeFile(cachePath, JSON.stringify(cacheData, null, 2), 'utf-8')
} catch (error) {
console.warn('[ModelCache] Failed to save cache:', error)
}
}
export async function clearModelCache(provider?: string): Promise<void> {
if (provider) {
const cachePath = getCacheFilePath(provider)
try {
await unlink(cachePath)
} catch {
// ignore if doesn't exist
}
} else {
const cacheDir = getCacheDir()
try {
await unlink(join(cacheDir, 'ollama.json'))
await unlink(join(cacheDir, 'nvidia-nim.json'))
await unlink(join(cacheDir, 'minimax.json'))
} catch {
// ignore
}
}
}
export async function getModelCacheInfo(): Promise<{ provider: string; age: string } | null> {
const provider = getAPIProvider()
const cachePath = getCacheFilePath(provider)
try {
await access(cachePath)
} catch {
return null
}
try {
const data = JSON.parse(await readFile(cachePath, 'utf-8')) as ModelCache
const ageMs = Date.now() - data.timestamp
const ageHours = Math.floor(ageMs / (1000 * 60 * 60))
const ageMins = Math.floor((ageMs % (1000 * 60 * 60)) / (1000 * 60))
return {
provider: data.provider,
age: ageHours > 0 ? `${ageHours}h ${ageMins}m` : `${ageMins}m`,
}
} catch {
return null
}
}
export function isCacheAvailable(): boolean {
const baseUrl = process.env.OPENAI_BASE_URL || ''
const isLocalOllama = baseUrl.includes('localhost:11434') || baseUrl.includes('localhost:11435')
const isNvidia = baseUrl.includes('nvidia') || baseUrl.includes('integrate.api.nvidia')
const isMiniMax = baseUrl.includes('minimax')
return isLocalOllama || isNvidia || isMiniMax || getAPIProvider() === 'openai'
}

View File

@@ -177,15 +177,19 @@ const OPENAI_CONTEXT_WINDOWS: Record<string, number> = {
'MiniMax-M2': 204_800,
// Google (via OpenRouter)
'google/gemini-2.0-flash':1_048_576,
'google/gemini-2.5-pro': 1_048_576,
'google/gemini-2.0-flash': 1_048_576,
'google/gemini-2.5-pro': 1_048_576,
'google/gemini-3-flash-preview': 1_048_576,
'google/gemini-3.1-pro-preview': 1_048_576,
// Google (native via CLAUDE_CODE_USE_GEMINI)
'gemini-2.0-flash': 1_048_576,
'gemini-2.5-pro': 1_048_576,
'gemini-2.5-flash': 1_048_576,
'gemini-3.1-pro': 1_048_576,
'gemini-3.1-flash-lite-preview': 1_048_576,
'gemini-2.0-flash': 1_048_576,
'gemini-2.5-pro': 1_048_576,
'gemini-2.5-flash': 1_048_576,
'gemini-3-flash-preview': 1_048_576,
'gemini-3.1-pro': 1_048_576,
'gemini-3.1-pro-preview': 1_048_576,
'gemini-3.1-flash-lite-preview': 1_048_576,
// Ollama local models
// Llama 3.1+ models support 128k context natively (Meta official specs).
@@ -219,17 +223,6 @@ const OPENAI_CONTEXT_WINDOWS: Record<string, number> = {
'kimi-k2.5': 262_144,
'glm-5': 202_752,
'glm-4.7': 202_752,
// Moonshot AI direct API (api.moonshot.ai/v1). Values from Moonshot's
// published model cards: K2.6 and K2-thinking use a 256K context, while the
// base K2 models are 128K. Prefix matching in lookupByKey catches variants
// like "kimi-k2.6-preview".
'kimi-k2.6': 262_144,
'kimi-k2': 131_072,
'kimi-k2-instruct': 131_072,
'kimi-k2-thinking': 262_144,
'moonshot-v1-8k': 8_192,
'moonshot-v1-32k': 32_768,
'moonshot-v1-128k': 131_072,
}
/**
@@ -340,15 +333,19 @@ const OPENAI_MAX_OUTPUT_TOKENS: Record<string, number> = {
'MiniMax-Vision-01-Fast': 16_384,
// Google (via OpenRouter)
'google/gemini-2.0-flash': 8_192,
'google/gemini-2.5-pro': 65_536,
'google/gemini-2.0-flash': 8_192,
'google/gemini-2.5-pro': 65_536,
'google/gemini-3-flash-preview': 65_536,
'google/gemini-3.1-pro-preview': 65_536,
// Google (native via CLAUDE_CODE_USE_GEMINI)
'gemini-2.0-flash': 8_192,
'gemini-2.5-pro': 65_536,
'gemini-2.5-flash': 65_536,
'gemini-3.1-pro': 65_536,
'gemini-3.1-flash-lite-preview': 65_536,
'gemini-2.0-flash': 8_192,
'gemini-2.5-pro': 65_536,
'gemini-2.5-flash': 65_536,
'gemini-3-flash-preview': 65_536,
'gemini-3.1-pro': 65_536,
'gemini-3.1-pro-preview': 65_536,
'gemini-3.1-flash-lite-preview': 65_536,
// Ollama local models (conservative safe defaults)
'llama3.3:70b': 4_096,
@@ -402,62 +399,18 @@ const OPENAI_MAX_OUTPUT_TOKENS: Record<string, number> = {
'kimi-k2.5': 32_768,
'glm-5': 16_384,
'glm-4.7': 16_384,
// Moonshot AI direct API
'kimi-k2.6': 32_768,
'kimi-k2': 32_768,
'kimi-k2-instruct': 32_768,
'kimi-k2-thinking': 32_768,
'moonshot-v1-8k': 4_096,
'moonshot-v1-32k': 16_384,
'moonshot-v1-128k': 32_768,
}
// External context-window overrides loaded once at startup.
// Set CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS to a JSON object mapping model name
// → context-window token count to add or override entries without editing
// this file. Example:
// CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS='{"my-corp/llm-v2":200000}'
const OPENAI_EXTERNAL_CONTEXT_WINDOWS: Record<string, number> = (() => {
try {
const raw = process.env.CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS
if (raw) {
const parsed = JSON.parse(raw)
if (typeof parsed === 'object' && parsed !== null) return parsed as Record<string, number>
}
} catch { /* ignore malformed JSON */ }
return {}
})()
// External max-output-token overrides.
// Set CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS to a JSON object mapping model name
// → max output token count.
const OPENAI_EXTERNAL_MAX_OUTPUT_TOKENS: Record<string, number> = (() => {
try {
const raw = process.env.CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS
if (raw) {
const parsed = JSON.parse(raw)
if (typeof parsed === 'object' && parsed !== null) return parsed as Record<string, number>
}
} catch { /* ignore malformed JSON */ }
return {}
})()
function lookupByModel<T>(table: Record<string, T>, externalTable: Record<string, T>, model: string): T | undefined {
function lookupByModel<T>(table: Record<string, T>, model: string): T | undefined {
// Try provider-qualified key first: "{OPENAI_MODEL}:{model}" so that
// e.g. "github:copilot:claude-haiku-4.5" can have different limits than
// a bare "claude-haiku-4.5" served by another provider.
const providerModel = process.env.OPENAI_MODEL?.trim()
if (providerModel && providerModel !== model) {
const qualified = `${providerModel}:${model}`
// External table takes precedence over the built-in table.
const externalQualified = lookupByKey(externalTable, qualified)
if (externalQualified !== undefined) return externalQualified
const qualifiedResult = lookupByKey(table, qualified)
if (qualifiedResult !== undefined) return qualifiedResult
}
const externalResult = lookupByKey(externalTable, model)
if (externalResult !== undefined) return externalResult
return lookupByKey(table, model)
}
@@ -481,7 +434,7 @@ function lookupByKey<T>(table: Record<string, T>, model: string): T | undefined
* "gpt-4o-2024-11-20" resolve to the base "gpt-4o" entry.
*/
export function getOpenAIContextWindow(model: string): number | undefined {
return lookupByModel(OPENAI_CONTEXT_WINDOWS, OPENAI_EXTERNAL_CONTEXT_WINDOWS, model)
return lookupByModel(OPENAI_CONTEXT_WINDOWS, model)
}
/**
@@ -489,5 +442,5 @@ export function getOpenAIContextWindow(model: string): number | undefined {
* Returns undefined if the model is not in the table.
*/
export function getOpenAIMaxOutputTokens(model: string): number | undefined {
return lookupByModel(OPENAI_MAX_OUTPUT_TOKENS, OPENAI_EXTERNAL_MAX_OUTPUT_TOKENS, model)
return lookupByModel(OPENAI_MAX_OUTPUT_TOKENS, model)
}

View File

@@ -107,60 +107,3 @@ test('official OpenAI base URLs now keep provider detection on openai for aliase
const { getAPIProvider } = await importFreshProvidersModule()
expect(getAPIProvider()).toBe('openai')
})
// isGithubNativeAnthropicMode
test('isGithubNativeAnthropicMode: false when CLAUDE_CODE_USE_GITHUB is not set', async () => {
clearProviderEnv()
process.env.OPENAI_MODEL = 'claude-sonnet-4-5'
const { isGithubNativeAnthropicMode } = await importFreshProvidersModule()
expect(isGithubNativeAnthropicMode()).toBe(false)
})
test('isGithubNativeAnthropicMode: true for bare claude- model via OPENAI_MODEL', async () => {
clearProviderEnv()
process.env.CLAUDE_CODE_USE_GITHUB = '1'
process.env.OPENAI_MODEL = 'claude-sonnet-4-5'
const { isGithubNativeAnthropicMode } = await importFreshProvidersModule()
expect(isGithubNativeAnthropicMode()).toBe(true)
})
test('isGithubNativeAnthropicMode: true for github:copilot:claude- compound format', async () => {
clearProviderEnv()
process.env.CLAUDE_CODE_USE_GITHUB = '1'
process.env.OPENAI_MODEL = 'github:copilot:claude-sonnet-4'
const { isGithubNativeAnthropicMode } = await importFreshProvidersModule()
expect(isGithubNativeAnthropicMode()).toBe(true)
})
test('isGithubNativeAnthropicMode: true when resolvedModel is a claude- model', async () => {
clearProviderEnv()
process.env.CLAUDE_CODE_USE_GITHUB = '1'
process.env.OPENAI_MODEL = 'github:copilot'
const { isGithubNativeAnthropicMode } = await importFreshProvidersModule()
expect(isGithubNativeAnthropicMode('claude-haiku-4-5')).toBe(true)
})
test('isGithubNativeAnthropicMode: false for generic github:copilot alias', async () => {
clearProviderEnv()
process.env.CLAUDE_CODE_USE_GITHUB = '1'
process.env.OPENAI_MODEL = 'github:copilot'
const { isGithubNativeAnthropicMode } = await importFreshProvidersModule()
expect(isGithubNativeAnthropicMode()).toBe(false)
})
test('isGithubNativeAnthropicMode: false for non-Claude model', async () => {
clearProviderEnv()
process.env.CLAUDE_CODE_USE_GITHUB = '1'
process.env.OPENAI_MODEL = 'gpt-4o'
const { isGithubNativeAnthropicMode } = await importFreshProvidersModule()
expect(isGithubNativeAnthropicMode()).toBe(false)
})
test('isGithubNativeAnthropicMode: false for github:copilot:gpt- model', async () => {
clearProviderEnv()
process.env.CLAUDE_CODE_USE_GITHUB = '1'
process.env.OPENAI_MODEL = 'github:copilot:gpt-4o'
const { isGithubNativeAnthropicMode } = await importFreshProvidersModule()
expect(isGithubNativeAnthropicMode()).toBe(false)
})

View File

@@ -19,12 +19,7 @@ export function getAPIProvider(): APIProvider {
if (isEnvTruthy(process.env.NVIDIA_NIM)) {
return 'nvidia-nim'
}
// MiniMax is signalled by a real API key, not a '1'/'true' flag. Using
// isEnvTruthy() here silently treated every MiniMax user as 'firstParty'
// (or 'openai' once they set CLAUDE_CODE_USE_OPENAI via the profile),
// making every provider-kind-specific branch for 'minimax' elsewhere in
// the codebase unreachable. Presence check is the correct signal.
if (typeof process.env.MINIMAX_API_KEY === 'string' && process.env.MINIMAX_API_KEY.trim() !== '') {
if (isEnvTruthy(process.env.MINIMAX_API_KEY)) {
return 'minimax'
}
return isEnvTruthy(process.env.CLAUDE_CODE_USE_GEMINI)
@@ -50,24 +45,6 @@ export function getAPIProvider(): APIProvider {
export function usesAnthropicAccountFlow(): boolean {
return getAPIProvider() === 'firstParty'
}
/**
* Returns true when the GitHub provider should use Anthropic's native API
* format instead of the OpenAI-compatible shim.
*
* Enabled when CLAUDE_CODE_USE_GITHUB=1 and the model string contains "claude-"
* anywhere (handles bare names like "claude-sonnet-4" and compound formats like
* "github:copilot:claude-sonnet-4" or any future provider-prefixed variants).
*
* api.githubcopilot.com supports Anthropic native format for Claude models,
* enabling prompt caching via cache_control blocks which significantly reduces
* per-turn token costs by caching the system prompt and tool definitions.
*/
export function isGithubNativeAnthropicMode(resolvedModel?: string): boolean {
if (!isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB)) return false
const model = resolvedModel?.trim() || process.env.OPENAI_MODEL?.trim() || ''
return model.toLowerCase().includes('claude-')
}
function isCodexModel(): boolean {
return shouldUseCodexTransport(
process.env.OPENAI_MODEL || '',

View File

@@ -64,7 +64,6 @@ export const DANGEROUS_FILES = [
'.profile',
'.ripgreprc',
'.mcp.json',
'.openclaude.json',
'.claude.json',
] as const

View File

@@ -532,7 +532,6 @@ export async function gitPull(
): Promise<{ code: number; stderr: string }> {
logForDebugging(`git pull: cwd=${cwd} ref=${ref ?? 'default'}`)
const env = { ...process.env, ...GIT_NO_PROMPT_ENV }
const baseArgs = ['-c', 'core.hooksPath=/dev/null']
const credentialArgs = options?.disableCredentialHelper
? ['-c', 'credential.helper=']
: []
@@ -540,7 +539,7 @@ export async function gitPull(
if (ref) {
const fetchResult = await execFileNoThrowWithCwd(
gitExe(),
[...baseArgs, ...credentialArgs, 'fetch', 'origin', ref],
[...credentialArgs, 'fetch', 'origin', ref],
{ cwd, timeout: getPluginGitTimeoutMs(), stdin: 'ignore', env },
)
@@ -550,7 +549,7 @@ export async function gitPull(
const checkoutResult = await execFileNoThrowWithCwd(
gitExe(),
[...baseArgs, ...credentialArgs, 'checkout', ref],
[...credentialArgs, 'checkout', ref],
{ cwd, timeout: getPluginGitTimeoutMs(), stdin: 'ignore', env },
)
@@ -560,7 +559,7 @@ export async function gitPull(
const pullResult = await execFileNoThrowWithCwd(
gitExe(),
[...baseArgs, ...credentialArgs, 'pull', 'origin', ref],
[...credentialArgs, 'pull', 'origin', ref],
{ cwd, timeout: getPluginGitTimeoutMs(), stdin: 'ignore', env },
)
if (pullResult.code !== 0) {
@@ -572,7 +571,7 @@ export async function gitPull(
const result = await execFileNoThrowWithCwd(
gitExe(),
[...baseArgs, ...credentialArgs, 'pull', 'origin', 'HEAD'],
[...credentialArgs, 'pull', 'origin', 'HEAD'],
{ cwd, timeout: getPluginGitTimeoutMs(), stdin: 'ignore', env },
)
if (result.code !== 0) {
@@ -626,8 +625,6 @@ async function gitSubmoduleUpdate(
[
'-c',
'core.sshCommand=ssh -o BatchMode=yes -o StrictHostKeyChecking=yes',
'-c',
'core.hooksPath=/dev/null',
...credentialArgs,
'submodule',
'update',
@@ -813,8 +810,6 @@ export async function gitClone(
const args = [
'-c',
'core.sshCommand=ssh -o BatchMode=yes -o StrictHostKeyChecking=yes',
'-c',
'core.hooksPath=/dev/null',
'clone',
'--depth',
'1',

View File

@@ -1,299 +0,0 @@
import { describe, expect, test } from 'bun:test'
import {
detectBestProvider,
detectLocalService,
detectProviderFromEnv,
} from './providerAutoDetect.ts'
// Hermetic env scan: always report "no Codex auth on disk" so tests don't
// depend on the dev machine's ~/.codex/auth.json state.
function scan(env: Record<string, string | undefined>) {
return detectProviderFromEnv({ env, hasCodexAuth: () => false })
}
describe('detectProviderFromEnv — priority order', () => {
test('ANTHROPIC_API_KEY wins over all others', () => {
expect(
scan({
ANTHROPIC_API_KEY: 'sk-ant-x',
OPENAI_API_KEY: 'sk-x',
GEMINI_API_KEY: 'gem-x',
}),
).toEqual({ kind: 'anthropic', source: 'ANTHROPIC_API_KEY set' })
})
test('CODEX_API_KEY beats OpenAI/Gemini/etc', () => {
expect(
scan({
CODEX_API_KEY: 'codex-x',
OPENAI_API_KEY: 'sk-x',
}),
).toEqual({ kind: 'codex', source: 'CODEX_API_KEY set' })
})
test('CHATGPT_ACCOUNT_ID alone is enough for Codex', () => {
expect(
scan({
CHATGPT_ACCOUNT_ID: 'acct-123',
}),
).toEqual({ kind: 'codex', source: 'CHATGPT_ACCOUNT_ID set' })
})
test('Codex auth file on disk is detected without any env', () => {
expect(
detectProviderFromEnv({ env: {}, hasCodexAuth: () => true }),
).toEqual({ kind: 'codex', source: '~/.codex/auth.json present' })
})
test('GITHUB_TOKEN wins over OpenAI', () => {
expect(
scan({
GITHUB_TOKEN: 'ghp-x',
OPENAI_API_KEY: 'sk-x',
}),
).toEqual({ kind: 'github', source: 'GITHUB_TOKEN set (GitHub Copilot)' })
})
test('GH_TOKEN is equivalent to GITHUB_TOKEN', () => {
expect(
scan({
GH_TOKEN: 'ghp-x',
}),
).toEqual({ kind: 'github', source: 'GH_TOKEN set (GitHub Copilot)' })
})
test('OPENAI_API_KEYS (plural) detected', () => {
expect(
scan({
OPENAI_API_KEYS: 'sk-a,sk-b',
}),
).toEqual({ kind: 'openai', source: 'OPENAI_API_KEYS set' })
})
test('OPENAI_API_KEY reports baseUrl when set', () => {
expect(
scan({
OPENAI_API_KEY: 'sk-x',
OPENAI_BASE_URL: 'https://openrouter.ai/api/v1',
}),
).toEqual({
kind: 'openai',
source: 'OPENAI_API_KEY set',
baseUrl: 'https://openrouter.ai/api/v1',
})
})
test('GEMINI_API_KEY detected', () => {
expect(scan({ GEMINI_API_KEY: 'gem-x' })).toEqual({
kind: 'gemini',
source: 'GEMINI_API_KEY set',
})
})
test('GOOGLE_API_KEY also detects Gemini', () => {
expect(scan({ GOOGLE_API_KEY: 'gk-x' })).toEqual({
kind: 'gemini',
source: 'GOOGLE_API_KEY set',
})
})
test('MISTRAL_API_KEY detected', () => {
expect(scan({ MISTRAL_API_KEY: 'mis-x' })).toEqual({
kind: 'mistral',
source: 'MISTRAL_API_KEY set',
})
})
test('MINIMAX_API_KEY detected', () => {
expect(scan({ MINIMAX_API_KEY: 'mm-x' })).toEqual({
kind: 'minimax',
source: 'MINIMAX_API_KEY set',
})
})
test('empty-string values are ignored', () => {
expect(
scan({
ANTHROPIC_API_KEY: '',
OPENAI_API_KEY: ' ',
GEMINI_API_KEY: 'gem-x',
}),
).toEqual({ kind: 'gemini', source: 'GEMINI_API_KEY set' })
})
test('no credentials → null', () => {
expect(scan({})).toBeNull()
})
})
describe('detectLocalService', () => {
test('returns Ollama when its /api/tags responds ok', async () => {
const fetchImpl = (async (input: URL | RequestInfo) => {
const url = typeof input === 'string' ? input : (input as URL).toString()
if (url.includes(':11434')) {
return new Response('{"models":[]}', { status: 200 })
}
return new Response('', { status: 404 })
}) as typeof fetch
const result = await detectLocalService({
env: {},
fetchImpl,
timeoutMs: 200,
})
expect(result?.kind).toBe('ollama')
expect(result?.baseUrl).toBe('http://localhost:11434')
})
test('Ollama wins over LM Studio even when both are reachable', async () => {
const fetchImpl = (async () => new Response('{}', { status: 200 })) as typeof fetch
const result = await detectLocalService({
env: {},
fetchImpl,
timeoutMs: 200,
})
expect(result?.kind).toBe('ollama')
})
test('falls back to LM Studio when Ollama is unreachable', async () => {
const fetchImpl = (async (input: URL | RequestInfo) => {
const url = typeof input === 'string' ? input : (input as URL).toString()
if (url.includes(':1234')) {
return new Response('{"data":[]}', { status: 200 })
}
return new Response('', { status: 404 })
}) as typeof fetch
const result = await detectLocalService({
env: {},
fetchImpl,
timeoutMs: 200,
})
expect(result?.kind).toBe('lm-studio')
expect(result?.baseUrl).toBe('http://localhost:1234')
})
test('returns null when no local services respond', async () => {
const fetchImpl = (async () =>
new Response('', { status: 500 })) as typeof fetch
const result = await detectLocalService({
env: {},
fetchImpl,
timeoutMs: 200,
})
expect(result).toBeNull()
})
test('honors OLLAMA_BASE_URL override', async () => {
const probedUrls: string[] = []
const fetchImpl = (async (input: URL | RequestInfo) => {
const url = typeof input === 'string' ? input : (input as URL).toString()
probedUrls.push(url)
return new Response('{"models":[]}', { status: 200 })
}) as typeof fetch
const result = await detectLocalService({
env: { OLLAMA_BASE_URL: 'http://10.0.0.5:11434' },
fetchImpl,
timeoutMs: 200,
})
expect(result?.baseUrl).toBe('http://10.0.0.5:11434')
expect(probedUrls).toContain('http://10.0.0.5:11434/api/tags')
})
test('probe timeout does not throw — returns null', async () => {
const fetchImpl = (async (_input: URL | RequestInfo, init?: RequestInit) => {
// Respect the caller's abort signal so the race with timeoutMs is fair.
return new Promise<Response>((_resolve, reject) => {
const onAbort = () => reject(new Error('aborted'))
init?.signal?.addEventListener('abort', onAbort)
setTimeout(() => {
init?.signal?.removeEventListener('abort', onAbort)
_resolve(new Response('ok'))
}, 500)
})
}) as typeof fetch
const result = await detectLocalService({
env: {},
fetchImpl,
timeoutMs: 50,
})
expect(result).toBeNull()
})
test('network errors do not throw', async () => {
const fetchImpl = (async () => {
throw new Error('ECONNREFUSED')
}) as typeof fetch
const result = await detectLocalService({
env: {},
fetchImpl,
timeoutMs: 200,
})
expect(result).toBeNull()
})
})
describe('detectBestProvider — orchestrator', () => {
test('env match short-circuits the local probe', async () => {
let probeCalled = false
const fetchImpl = (async () => {
probeCalled = true
return new Response('{}', { status: 200 })
}) as typeof fetch
const result = await detectBestProvider({
env: { ANTHROPIC_API_KEY: 'sk-ant' },
fetchImpl,
timeoutMs: 200,
hasCodexAuth: () => false,
})
expect(result?.kind).toBe('anthropic')
expect(probeCalled).toBe(false)
})
test('env miss falls through to local-service probe', async () => {
const fetchImpl = (async () => new Response('{}', { status: 200 })) as typeof fetch
const result = await detectBestProvider({
env: {},
fetchImpl,
timeoutMs: 200,
hasCodexAuth: () => false,
})
expect(result?.kind).toBe('ollama')
})
test('skipLocal prevents network probes', async () => {
let probeCalled = false
const fetchImpl = (async () => {
probeCalled = true
return new Response('{}', { status: 200 })
}) as typeof fetch
const result = await detectBestProvider({
env: {},
fetchImpl,
skipLocal: true,
hasCodexAuth: () => false,
})
expect(result).toBeNull()
expect(probeCalled).toBe(false)
})
test('completely empty environment returns null', async () => {
const fetchImpl = (async () => {
throw new Error('nothing reachable')
}) as typeof fetch
const result = await detectBestProvider({
env: {},
fetchImpl,
timeoutMs: 100,
hasCodexAuth: () => false,
})
expect(result).toBeNull()
})
})

View File

@@ -1,283 +0,0 @@
/**
* Zero-config provider autodetection.
*
* Scans the environment (API keys, OAuth tokens, stored credentials) and local
* network (Ollama, LM Studio) to pick the best provider for first-run users
* who have not explicitly configured one. Returns a structured detection
* result that callers can consume to build a launch-ready profile env, or
* null when nothing is detected — in which case the existing onboarding /
* picker flow should take over.
*
* Detection priority (first match wins):
* 1. ANTHROPIC_API_KEY → first-party Claude (most capable default)
* 2. Codex: CODEX_API_KEY, CHATGPT_ACCOUNT_ID, or valid ~/.codex/auth.json
* 3. GitHub Copilot: GITHUB_TOKEN or GH_TOKEN
* 4. OPENAI_API_KEY / OPENAI_API_KEYS
* 5. GEMINI_API_KEY or GOOGLE_API_KEY
* 6. MISTRAL_API_KEY
* 7. MINIMAX_API_KEY
* 8. Local Ollama reachable (default localhost:11434)
* 9. Local LM Studio reachable (default localhost:1234)
*
* Local-service probes are parallelized and cheap (short timeout, no
* request body). Env scans are synchronous and run first so we don't make
* network calls when a credential is already present.
*
* This module intentionally does NOT decide whether to apply the detection;
* callers should gate on hasExplicitProviderSelection() (providerProfile.ts)
* and the presence of a persisted profile file.
*/
import { existsSync } from 'fs'
import { homedir } from 'os'
import { join } from 'path'
export type DetectedProviderKind =
| 'anthropic'
| 'codex'
| 'github'
| 'openai'
| 'gemini'
| 'mistral'
| 'minimax'
| 'ollama'
| 'lm-studio'
export type DetectedProvider = {
kind: DetectedProviderKind
/** One-line human-readable reason, e.g. "ANTHROPIC_API_KEY set". */
source: string
/** Present when the detection already resolved a usable base URL. */
baseUrl?: string
/** Present when detection also narrowed down a specific model. */
model?: string
}
type EnvLike = NodeJS.ProcessEnv | Record<string, string | undefined>
function envHasNonEmpty(env: EnvLike, key: string): boolean {
const value = env[key]
return typeof value === 'string' && value.trim().length > 0
}
function firstSet(env: EnvLike, keys: readonly string[]): string | undefined {
for (const key of keys) {
if (envHasNonEmpty(env, key)) return key
}
return undefined
}
function defaultHasCodexAuthFile(): boolean {
const paths = [
process.env.CODEX_AUTH_PATH,
join(homedir(), '.codex', 'auth.json'),
]
return paths.some(p => p && existsSync(p))
}
export type DetectProviderFromEnvOptions = {
env?: EnvLike
/**
* Override Codex auth-file detection. Primarily for tests — the default
* implementation checks ~/.codex/auth.json and CODEX_AUTH_PATH on disk.
*/
hasCodexAuth?: () => boolean
}
/**
* Synchronous env-only scan. Returns the highest-priority env-provided
* provider, or null if nothing is present. Intentionally does not touch
* the network — fast path for the common case where a user has exported
* one of the standard API-key env vars.
*/
function isOptionsObject(
value: EnvLike | DetectProviderFromEnvOptions | undefined,
): value is DetectProviderFromEnvOptions {
if (!value || typeof value !== 'object') return false
if ('hasCodexAuth' in value && typeof value.hasCodexAuth === 'function') {
return true
}
if ('env' in value && typeof (value as { env?: unknown }).env === 'object') {
return true
}
return false
}
export function detectProviderFromEnv(
envOrOptions: EnvLike | DetectProviderFromEnvOptions = process.env,
): DetectedProvider | null {
const options: DetectProviderFromEnvOptions = isOptionsObject(envOrOptions)
? envOrOptions
: { env: envOrOptions as EnvLike }
const env = options.env ?? process.env
const hasCodexAuth = options.hasCodexAuth ?? defaultHasCodexAuthFile
if (envHasNonEmpty(env, 'ANTHROPIC_API_KEY')) {
return { kind: 'anthropic', source: 'ANTHROPIC_API_KEY set' }
}
if (
envHasNonEmpty(env, 'CODEX_API_KEY') ||
envHasNonEmpty(env, 'CHATGPT_ACCOUNT_ID') ||
envHasNonEmpty(env, 'CODEX_ACCOUNT_ID') ||
hasCodexAuth()
) {
const sourceEnv =
firstSet(env, ['CODEX_API_KEY', 'CHATGPT_ACCOUNT_ID', 'CODEX_ACCOUNT_ID'])
return {
kind: 'codex',
source: sourceEnv ? `${sourceEnv} set` : '~/.codex/auth.json present',
}
}
const githubKey = firstSet(env, ['GITHUB_TOKEN', 'GH_TOKEN'])
if (githubKey) {
return {
kind: 'github',
source: `${githubKey} set (GitHub Copilot)`,
}
}
const openaiKey = firstSet(env, ['OPENAI_API_KEYS', 'OPENAI_API_KEY'])
if (openaiKey) {
return {
kind: 'openai',
source: `${openaiKey} set`,
baseUrl: env.OPENAI_BASE_URL ?? env.OPENAI_API_BASE,
}
}
const geminiKey = firstSet(env, ['GEMINI_API_KEY', 'GOOGLE_API_KEY'])
if (geminiKey) {
return { kind: 'gemini', source: `${geminiKey} set` }
}
if (envHasNonEmpty(env, 'MISTRAL_API_KEY')) {
return { kind: 'mistral', source: 'MISTRAL_API_KEY set' }
}
if (envHasNonEmpty(env, 'MINIMAX_API_KEY')) {
return { kind: 'minimax', source: 'MINIMAX_API_KEY set' }
}
return null
}
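// e.g. detectProviderFromEnv({ env: { GH_TOKEN: 'ghp-x' } })
//   → { kind: 'github', source: 'GH_TOKEN set (GitHub Copilot)' }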
type LocalProbe = {
kind: DetectedProviderKind
url: string
timeoutMs: number
source: string
baseUrl: string
}
const DEFAULT_LOCAL_PROBE_TIMEOUT_MS = 1200
async function probeReachable(
url: string,
timeoutMs: number,
fetchImpl: typeof fetch,
): Promise<boolean> {
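// Fail closed: timeouts, network errors, and non-2xx responses all count
// as unreachable, so a flaky local service never blocks detection.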
const controller = new AbortController()
const timer = setTimeout(() => controller.abort(), timeoutMs)
try {
const response = await fetchImpl(url, {
method: 'GET',
signal: controller.signal,
})
return response.ok
} catch {
return false
} finally {
clearTimeout(timer)
}
}
/**
* Returns the highest-priority local service reachable from the host.
* Runs probes in parallel and picks by priority rather than first-response,
* so slow-but-preferred services still win over fast-but-lower-priority ones.
*/
export async function detectLocalService(options?: {
env?: EnvLike
fetchImpl?: typeof fetch
timeoutMs?: number
}): Promise<DetectedProvider | null> {
const env = options?.env ?? process.env
const fetchImpl = options?.fetchImpl ?? globalThis.fetch
const timeoutMs = options?.timeoutMs ?? DEFAULT_LOCAL_PROBE_TIMEOUT_MS
const ollamaBase = (env.OLLAMA_BASE_URL ?? 'http://localhost:11434').replace(
/\/+$/,
'',
)
const lmStudioBase = (env.LM_STUDIO_BASE_URL ?? 'http://localhost:1234').replace(
/\/+$/,
'',
)
const probes: LocalProbe[] = [
{
kind: 'ollama',
url: `${ollamaBase}/api/tags`,
timeoutMs,
source: `Ollama reachable at ${ollamaBase}`,
baseUrl: ollamaBase,
},
{
kind: 'lm-studio',
url: `${lmStudioBase}/v1/models`,
timeoutMs,
source: `LM Studio reachable at ${lmStudioBase}`,
baseUrl: lmStudioBase,
},
]
const results = await Promise.all(
probes.map(async probe => ({
probe,
reachable: await probeReachable(probe.url, probe.timeoutMs, fetchImpl),
})),
)
for (const { probe, reachable } of results) {
if (reachable) {
return {
kind: probe.kind,
source: probe.source,
baseUrl: probe.baseUrl,
}
}
}
return null
}
/**
* Orchestrator: env scan first (sync, free), then local-service probes
* (async, ~1-2s worst case) only if nothing was found in env.
*/
export async function detectBestProvider(options?: {
env?: EnvLike
fetchImpl?: typeof fetch
timeoutMs?: number
/** Skip local-service probes — useful for tests or offline smoke checks. */
skipLocal?: boolean
/** Override for Codex auth-file detection. See detectProviderFromEnv. */
hasCodexAuth?: () => boolean
}): Promise<DetectedProvider | null> {
const env = options?.env ?? process.env
const fromEnv = detectProviderFromEnv({
env,
hasCodexAuth: options?.hasCodexAuth,
})
if (fromEnv) return fromEnv
if (options?.skipLocal) return null
return detectLocalService({
env,
fetchImpl: options?.fetchImpl,
timeoutMs: options?.timeoutMs,
})
}

View File

@@ -1,9 +1,9 @@
import { afterEach, expect, mock, test } from 'bun:test'
async function loadProviderDiscoveryModule() {
// @ts-expect-error cache-busting query string for Bun module mocks
return import(`./providerDiscovery.js?ts=${Date.now()}-${Math.random()}`)
}
import {
getLocalOpenAICompatibleProviderLabel,
listOpenAICompatibleModels,
} from './providerDiscovery.js'
const originalFetch = globalThis.fetch
const originalEnv = {
@@ -16,8 +16,6 @@ afterEach(() => {
})
test('lists models from a local openai-compatible /models endpoint', async () => {
const { listOpenAICompatibleModels } = await loadProviderDiscoveryModule()
globalThis.fetch = mock((input, init) => {
const url = typeof input === 'string' ? input : input.url
expect(url).toBe('http://localhost:1234/v1/models')
@@ -49,8 +47,6 @@ test('lists models from a local openai-compatible /models endpoint', async () =>
})
test('returns null when a local openai-compatible /models request fails', async () => {
const { listOpenAICompatibleModels } = await loadProviderDiscoveryModule()
globalThis.fetch = mock(() =>
Promise.resolve(new Response('not available', { status: 503 })),
) as typeof globalThis.fetch
@@ -60,19 +56,13 @@ test('returns null when a local openai-compatible /models request fails', async
).resolves.toBeNull()
})
test('detects LM Studio from the default localhost port', async () => {
const { getLocalOpenAICompatibleProviderLabel } =
await loadProviderDiscoveryModule()
test('detects LM Studio from the default localhost port', () => {
expect(getLocalOpenAICompatibleProviderLabel('http://localhost:1234/v1')).toBe(
'LM Studio',
)
})
test('detects common local openai-compatible providers by hostname', async () => {
const { getLocalOpenAICompatibleProviderLabel } =
await loadProviderDiscoveryModule()
test('detects common local openai-compatible providers by hostname', () => {
expect(
getLocalOpenAICompatibleProviderLabel('http://localai.local:8080/v1'),
).toBe('LocalAI')
@@ -81,283 +71,8 @@ test('detects common local openai-compatible providers by hostname', async () =>
).toBe('vLLM')
})
test('detects Moonshot (Kimi) from api.moonshot.ai hostname', async () => {
const { getLocalOpenAICompatibleProviderLabel } =
await loadProviderDiscoveryModule()
expect(
getLocalOpenAICompatibleProviderLabel('https://api.moonshot.ai/v1'),
).toBe('Moonshot (Kimi)')
})
test('falls back to a generic local openai-compatible label', async () => {
const { getLocalOpenAICompatibleProviderLabel } =
await loadProviderDiscoveryModule()
test('falls back to a generic local openai-compatible label', () => {
expect(
getLocalOpenAICompatibleProviderLabel('http://127.0.0.1:8080/v1'),
).toBe('Local OpenAI-compatible')
})
test('ollama generation readiness reports unreachable when tags endpoint is down', async () => {
const { probeOllamaGenerationReadiness } = await loadProviderDiscoveryModule()
const calledUrls: string[] = []
globalThis.fetch = mock(input => {
const url = typeof input === 'string' ? input : input.url
calledUrls.push(url)
return Promise.resolve(new Response('not available', { status: 503 }))
}) as typeof globalThis.fetch
await expect(
probeOllamaGenerationReadiness({
baseUrl: 'http://localhost:11434',
}),
).resolves.toMatchObject({
state: 'unreachable',
models: [],
})
expect(calledUrls).toEqual([
'http://localhost:11434/api/tags',
])
})
test('ollama generation readiness reports no models when server is reachable', async () => {
const { probeOllamaGenerationReadiness } = await loadProviderDiscoveryModule()
const calledUrls: string[] = []
globalThis.fetch = mock(input => {
const url = typeof input === 'string' ? input : input.url
calledUrls.push(url)
return Promise.resolve(
new Response(JSON.stringify({ models: [] }), {
status: 200,
headers: { 'Content-Type': 'application/json' },
}),
)
}) as typeof globalThis.fetch
await expect(
probeOllamaGenerationReadiness({
baseUrl: 'http://localhost:11434',
}),
).resolves.toMatchObject({
state: 'no_models',
models: [],
})
expect(calledUrls).toEqual([
'http://localhost:11434/api/tags',
])
})
test('ollama generation readiness reports generation_failed when requested model is missing', async () => {
const { probeOllamaGenerationReadiness } = await loadProviderDiscoveryModule()
const calledUrls: string[] = []
globalThis.fetch = mock(input => {
const url = typeof input === 'string' ? input : input.url
calledUrls.push(url)
return Promise.resolve(
new Response(
JSON.stringify({
models: [{ name: 'llama3.1:8b', size: 1024 }],
}),
{
status: 200,
headers: { 'Content-Type': 'application/json' },
},
),
)
}) as typeof globalThis.fetch
await expect(
probeOllamaGenerationReadiness({
baseUrl: 'http://localhost:11434',
model: 'qwen2.5-coder:7b',
}),
).resolves.toMatchObject({
state: 'generation_failed',
probeModel: 'qwen2.5-coder:7b',
detail: 'requested model not installed: qwen2.5-coder:7b',
})
expect(calledUrls).toEqual(['http://localhost:11434/api/tags'])
})
test('ollama generation readiness reports generation failures when chat probe fails', async () => {
const { probeOllamaGenerationReadiness } = await loadProviderDiscoveryModule()
globalThis.fetch = mock(input => {
const url = typeof input === 'string' ? input : input.url
if (url.endsWith('/api/tags')) {
return Promise.resolve(
new Response(
JSON.stringify({
models: [{ name: 'qwen2.5-coder:7b', size: 42 }],
}),
{
status: 200,
headers: { 'Content-Type': 'application/json' },
},
),
)
}
return Promise.resolve(new Response('model not found', { status: 404 }))
}) as typeof globalThis.fetch
await expect(
probeOllamaGenerationReadiness({
baseUrl: 'http://localhost:11434',
model: 'qwen2.5-coder:7b',
}),
).resolves.toMatchObject({
state: 'generation_failed',
probeModel: 'qwen2.5-coder:7b',
})
})
test('ollama generation readiness reports generation_failed when chat probe returns invalid JSON', async () => {
const { probeOllamaGenerationReadiness } = await loadProviderDiscoveryModule()
globalThis.fetch = mock(input => {
const url = typeof input === 'string' ? input : input.url
if (url.endsWith('/api/tags')) {
return Promise.resolve(
new Response(
JSON.stringify({
models: [{ name: 'llama3.1:8b', size: 1024 }],
}),
{
status: 200,
headers: { 'Content-Type': 'application/json' },
},
),
)
}
return Promise.resolve(
new Response('<html>proxy error</html>', {
status: 200,
headers: { 'Content-Type': 'text/html' },
}),
)
}) as typeof globalThis.fetch
await expect(
probeOllamaGenerationReadiness({
baseUrl: 'http://localhost:11434',
}),
).resolves.toMatchObject({
state: 'generation_failed',
probeModel: 'llama3.1:8b',
detail: 'invalid JSON response',
})
})
test('ollama generation readiness reports ready when chat probe succeeds', async () => {
const { probeOllamaGenerationReadiness } = await loadProviderDiscoveryModule()
globalThis.fetch = mock(input => {
const url = typeof input === 'string' ? input : input.url
if (url.endsWith('/api/tags')) {
return Promise.resolve(
new Response(
JSON.stringify({
models: [{ name: 'llama3.1:8b', size: 1024 }],
}),
{
status: 200,
headers: { 'Content-Type': 'application/json' },
},
),
)
}
return Promise.resolve(
new Response(
JSON.stringify({
message: { role: 'assistant', content: 'OK' },
done: true,
}),
{
status: 200,
headers: { 'Content-Type': 'application/json' },
},
),
)
}) as typeof globalThis.fetch
await expect(
probeOllamaGenerationReadiness({
baseUrl: 'http://localhost:11434',
}),
).resolves.toMatchObject({
state: 'ready',
probeModel: 'llama3.1:8b',
})
})
test('atomic chat readiness reports unreachable when /v1/models is down', async () => {
const { probeAtomicChatReadiness } = await loadProviderDiscoveryModule()
const calledUrls: string[] = []
globalThis.fetch = mock(input => {
const url = typeof input === 'string' ? input : input.url
calledUrls.push(url)
return Promise.resolve(new Response('unavailable', { status: 503 }))
}) as typeof globalThis.fetch
await expect(
probeAtomicChatReadiness({ baseUrl: 'http://127.0.0.1:1337' }),
).resolves.toEqual({ state: 'unreachable' })
expect(calledUrls[0]).toBe('http://127.0.0.1:1337/v1/models')
})
test('atomic chat readiness reports no_models when server is reachable but empty', async () => {
const { probeAtomicChatReadiness } = await loadProviderDiscoveryModule()
globalThis.fetch = mock(() =>
Promise.resolve(
new Response(JSON.stringify({ data: [] }), {
status: 200,
headers: { 'Content-Type': 'application/json' },
}),
),
) as typeof globalThis.fetch
await expect(
probeAtomicChatReadiness({ baseUrl: 'http://127.0.0.1:1337' }),
).resolves.toEqual({ state: 'no_models' })
})
test('atomic chat readiness returns loaded model ids when ready', async () => {
const { probeAtomicChatReadiness } = await loadProviderDiscoveryModule()
globalThis.fetch = mock(() =>
Promise.resolve(
new Response(
JSON.stringify({
data: [
{ id: 'Qwen3_5-4B_Q4_K_M' },
{ id: 'llama-3.1-8b-instruct' },
],
}),
{
status: 200,
headers: { 'Content-Type': 'application/json' },
},
),
),
) as typeof globalThis.fetch
await expect(
probeAtomicChatReadiness({ baseUrl: 'http://127.0.0.1:1337' }),
).resolves.toEqual({
state: 'ready',
models: ['Qwen3_5-4B_Q4_K_M', 'llama-3.1-8b-instruct'],
})
})
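
The tests above each reassign globalThis.fetch. The excerpt does not show the suite's setup or teardown, so here is a minimal restore harness as an assumption, written for bun:test (which the mock() helper suggests); the real suite may handle this differently.

import { afterEach } from 'bun:test'

// Hypothetical teardown, not part of this diff: capture the real fetch once
// and restore it after each test so a mock from one test cannot leak into
// the next.
const realFetch = globalThis.fetch

afterEach(() => {
  globalThis.fetch = realFetch
})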

View File

@@ -4,13 +4,6 @@ import { DEFAULT_OPENAI_BASE_URL } from '../services/api/providerConfig.js'
export const DEFAULT_OLLAMA_BASE_URL = 'http://localhost:11434'
export const DEFAULT_ATOMIC_CHAT_BASE_URL = 'http://127.0.0.1:1337'
export type OllamaGenerationReadiness = {
state: 'ready' | 'unreachable' | 'no_models' | 'generation_failed'
models: OllamaModelDescriptor[]
probeModel?: string
detail?: string
}
function withTimeoutSignal(timeoutMs: number): {
signal: AbortSignal
clear: () => void
@@ -27,83 +20,6 @@ function trimTrailingSlash(value: string): string {
return value.replace(/\/+$/, '')
}
function compactDetail(value: string, maxLength = 180): string {
const compact = value.trim().replace(/\s+/g, ' ')
if (!compact) {
return ''
}
if (compact.length <= maxLength) {
return compact
}
return `${compact.slice(0, maxLength)}...`
}
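// Two worked examples of compactDetail, derived from the implementation
// above as a reading aid (not part of the diff): whitespace runs collapse
// to single spaces, and anything past maxLength is cut and suffixed.
// compactDetail(' model \n not   found ')  -> 'model not found'
// compactDetail('x'.repeat(200))           -> 180 x's followed by '...'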
type OllamaTagsPayload = {
models?: Array<{
name?: string
size?: number
details?: {
family?: string
families?: string[]
parameter_size?: string
quantization_level?: string
}
}>
}
function normalizeOllamaModels(
payload: OllamaTagsPayload,
): OllamaModelDescriptor[] {
return (payload.models ?? [])
.filter(model => Boolean(model.name))
.map(model => ({
name: model.name!,
sizeBytes: typeof model.size === 'number' ? model.size : null,
family: model.details?.family ?? null,
families: model.details?.families ?? [],
parameterSize: model.details?.parameter_size ?? null,
quantizationLevel: model.details?.quantization_level ?? null,
}))
}
async function fetchOllamaModelsProbe(
baseUrl?: string,
timeoutMs = 5000,
): Promise<{
reachable: boolean
models: OllamaModelDescriptor[]
}> {
const { signal, clear } = withTimeoutSignal(timeoutMs)
try {
const response = await fetch(`${getOllamaApiBaseUrl(baseUrl)}/api/tags`, {
method: 'GET',
signal,
})
if (!response.ok) {
return {
reachable: false,
models: [],
}
}
const payload = (await response.json().catch(() => ({}))) as OllamaTagsPayload
return {
reachable: true,
models: normalizeOllamaModels(payload),
}
} catch {
return {
reachable: false,
models: [],
}
} finally {
clear()
}
}
export function getOllamaApiBaseUrl(baseUrl?: string): string {
const parsed = new URL(
baseUrl || process.env.OLLAMA_BASE_URL || DEFAULT_OLLAMA_BASE_URL,
@@ -197,10 +113,6 @@ export function getLocalOpenAICompatibleProviderLabel(baseUrl?: string): string
if (host.includes('minimax') || haystack.includes('minimax')) {
return 'MiniMax'
}
// Moonshot AI (Kimi) direct API
if (host.includes('moonshot') || haystack.includes('moonshot') || haystack.includes('kimi')) {
return 'Moonshot (Kimi)'
}
} catch {
// Fall back to the generic label when the base URL is malformed.
}
@@ -209,15 +121,61 @@ export function getLocalOpenAICompatibleProviderLabel(baseUrl?: string): string
}
export async function hasLocalOllama(baseUrl?: string): Promise<boolean> {
const { reachable } = await fetchOllamaModelsProbe(baseUrl, 1200)
return reachable
const { signal, clear } = withTimeoutSignal(1200)
try {
const response = await fetch(`${getOllamaApiBaseUrl(baseUrl)}/api/tags`, {
method: 'GET',
signal,
})
return response.ok
} catch {
return false
} finally {
clear()
}
}
export async function listOllamaModels(
baseUrl?: string,
): Promise<OllamaModelDescriptor[]> {
const { models } = await fetchOllamaModelsProbe(baseUrl, 5000)
return models
const { signal, clear } = withTimeoutSignal(5000)
try {
const response = await fetch(`${getOllamaApiBaseUrl(baseUrl)}/api/tags`, {
method: 'GET',
signal,
})
if (!response.ok) {
return []
}
const data = (await response.json()) as {
models?: Array<{
name?: string
size?: number
details?: {
family?: string
families?: string[]
parameter_size?: string
quantization_level?: string
}
}>
}
return (data.models ?? [])
.filter(model => Boolean(model.name))
.map(model => ({
name: model.name!,
sizeBytes: typeof model.size === 'number' ? model.size : null,
family: model.details?.family ?? null,
families: model.details?.families ?? [],
parameterSize: model.details?.parameter_size ?? null,
quantizationLevel: model.details?.quantization_level ?? null,
}))
} catch {
return []
} finally {
clear()
}
}
export async function listOpenAICompatibleModels(options?: {
@@ -302,24 +260,6 @@ export async function listAtomicChatModels(
}
}
export type AtomicChatReadiness =
| { state: 'unreachable' }
| { state: 'no_models' }
| { state: 'ready'; models: string[] }
export async function probeAtomicChatReadiness(options?: {
baseUrl?: string
}): Promise<AtomicChatReadiness> {
if (!(await hasLocalAtomicChat(options?.baseUrl))) {
return { state: 'unreachable' }
}
const models = await listAtomicChatModels(options?.baseUrl)
if (models.length === 0) {
return { state: 'no_models' }
}
return { state: 'ready', models }
}
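// As a reading aid, a hypothetical caller for probeAtomicChatReadiness; the
// status strings and function name below are assumptions, while the three
// states come from the AtomicChatReadiness union above. TypeScript narrows
// the discriminated union, so readiness.models is only visible in the
// ready arm.
async function atomicChatStatusLine(baseUrl?: string): Promise<string> {
  const readiness = await probeAtomicChatReadiness({ baseUrl })
  if (readiness.state === 'unreachable') return 'Atomic Chat: server not reachable'
  if (readiness.state === 'no_models') return 'Atomic Chat: reachable, but no models loaded'
  return `Atomic Chat: ready (${readiness.models.join(', ')})`
}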
export async function benchmarkOllamaModel(
modelName: string,
baseUrl?: string,
@@ -354,106 +294,3 @@ export async function benchmarkOllamaModel(
clear()
}
}
export async function probeOllamaGenerationReadiness(options?: {
baseUrl?: string
model?: string
timeoutMs?: number
}): Promise<OllamaGenerationReadiness> {
const timeoutMs = options?.timeoutMs ?? 8000
const { reachable, models } = await fetchOllamaModelsProbe(
options?.baseUrl,
timeoutMs,
)
if (!reachable) {
return {
state: 'unreachable',
models: [],
}
}
if (models.length === 0) {
return {
state: 'no_models',
models: [],
}
}
const requestedModel = options?.model?.trim() || undefined
if (requestedModel && !models.some(model => model.name === requestedModel)) {
return {
state: 'generation_failed',
models,
probeModel: requestedModel,
detail: `requested model not installed: ${requestedModel}`,
}
}
const probeModel = requestedModel ?? models[0]!.name
const { signal, clear } = withTimeoutSignal(timeoutMs)
try {
const response = await fetch(`${getOllamaApiBaseUrl(options?.baseUrl)}/api/chat`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
signal,
body: JSON.stringify({
model: probeModel,
stream: false,
messages: [{ role: 'user', content: 'Reply with OK.' }],
options: {
temperature: 0,
num_predict: 8,
},
}),
})
if (!response.ok) {
const responseBody = await response.text().catch(() => '')
const detailSuffix = compactDetail(responseBody)
return {
state: 'generation_failed',
models,
probeModel,
detail: detailSuffix
? `status ${response.status}: ${detailSuffix}`
: `status ${response.status}`,
}
}
try {
await response.json()
} catch {
return {
state: 'generation_failed',
models,
probeModel,
detail: 'invalid JSON response',
}
}
return {
state: 'ready',
models,
probeModel,
}
} catch (error) {
const detail =
error instanceof Error
? error.name === 'AbortError'
? 'request timed out'
: error.message
: String(error)
return {
state: 'generation_failed',
models,
probeModel,
detail,
}
} finally {
clear()
}
}
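
For context, a caller sketch for probeOllamaGenerationReadiness as it appears in this hunk. The import path and the message strings are assumptions; the four states and the probeModel and detail fields come from the code above.

// Hypothetical caller, not part of this diff. The switch is exhaustive over
// the state union, so every branch returns a status line.
import { probeOllamaGenerationReadiness } from './providerDiscovery.js' // path assumed

async function describeOllamaReadiness(baseUrl?: string): Promise<string> {
  const readiness = await probeOllamaGenerationReadiness({ baseUrl })
  switch (readiness.state) {
    case 'unreachable':
      return 'Ollama is not reachable; is the daemon running?'
    case 'no_models':
      return 'Ollama is running but no models are installed'
    case 'generation_failed':
      return `Ollama cannot generate (${readiness.probeModel ?? '?'}): ${readiness.detail ?? 'unknown error'}`
    case 'ready':
      return `Ollama ready; probed ${readiness.probeModel ?? '?'}`
  }
}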

View File

@@ -572,64 +572,31 @@ test('buildStartupEnvFromProfile leaves explicit provider selections untouched',
assert.equal(env.OPENAI_API_KEY, undefined)
})
test('buildStartupEnvFromProfile preserves plural-profile env when the legacy file is stale', async () => {
// Regression: a user saves a provider via /provider (plural system).
// addProviderProfile does NOT sync the legacy .openclaude-profile.json,
// so the legacy file retains whatever it had from an earlier setup (e.g.
// OpenAI defaults). At startup, applyActiveProviderProfileFromConfig()
// correctly applies the active plural profile (Moonshot) first, marking
// env with CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED=1. The legacy-file
// load must NOT overwrite that env — it previously did, surfacing as
// "banner shows the wrong provider / model".
test('buildStartupEnvFromProfile lets saved startup profile override profile-managed env', async () => {
const processEnv = {
CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED: '1',
CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID: 'saved_moonshot',
CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID: 'saved_ollama',
CLAUDE_CODE_USE_OPENAI: '1',
OPENAI_BASE_URL: 'https://api.moonshot.ai/v1',
OPENAI_MODEL: 'kimi-k2.6',
OPENAI_BASE_URL: 'http://localhost:11434/v1',
OPENAI_MODEL: 'llama3.1:8b',
}
const env = await buildStartupEnvFromProfile({
// Stale legacy file — points at SambaNova, but user's active plural
// profile is Moonshot and was just applied.
persisted: profile('openai', {
OPENAI_API_KEY: 'sk-stale',
OPENAI_API_KEY: 'sk-persisted',
OPENAI_MODEL: 'Meta-Llama-3.1-70B-Instruct',
OPENAI_BASE_URL: 'https://api.sambanova.ai/v1',
}),
processEnv,
})
assert.equal(env, processEnv)
assert.equal(env.OPENAI_BASE_URL, 'https://api.moonshot.ai/v1')
assert.equal(env.OPENAI_MODEL, 'kimi-k2.6')
// Plural markers are retained — downstream code uses them to verify the
// env still belongs to the profile it was applied from.
assert.equal(env.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED, '1')
assert.equal(env.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID, 'saved_moonshot')
})
test('buildStartupEnvFromProfile falls back to legacy file when plural system has not applied', async () => {
// Counter-example: first-run user with only the legacy file (no plural
// active profile yet). The legacy file is the correct source, so the
// load must proceed as before.
const processEnv = {
CLAUDE_CODE_USE_OPENAI: '1',
}
const env = await buildStartupEnvFromProfile({
persisted: profile('openai', {
OPENAI_API_KEY: 'sk-legacy',
OPENAI_MODEL: 'gpt-4o',
OPENAI_BASE_URL: 'https://api.openai.com/v1',
}),
processEnv,
})
assert.notEqual(env, processEnv)
assert.equal(env.OPENAI_API_KEY, 'sk-legacy')
assert.equal(env.OPENAI_BASE_URL, 'https://api.openai.com/v1')
assert.equal(env.OPENAI_MODEL, 'gpt-4o')
assert.equal(env.CLAUDE_CODE_USE_OPENAI, '1')
assert.equal(env.OPENAI_API_KEY, 'sk-persisted')
assert.equal(env.OPENAI_MODEL, 'Meta-Llama-3.1-70B-Instruct')
assert.equal(env.OPENAI_BASE_URL, 'https://api.sambanova.ai/v1')
assert.equal(env.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED, undefined)
assert.equal(env.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID, undefined)
})
test('buildStartupEnvFromProfile treats explicit falsey provider flags as user intent', async () => {
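
The profile(...) helper these tests call is defined outside the hunks shown here. Purely as a hypothetical reconstruction to make the assertions readable, it plausibly bundles a provider kind with the env vars the persisted startup profile restores:

// Hypothetical reconstruction; the real helper is not part of this excerpt,
// and the persisted-record shape may differ.
function profile(kind: string, env: Record<string, string>) {
  return { profile: kind, env }
}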

View File

@@ -841,35 +841,43 @@ export async function buildStartupEnvFromProfile(options?: {
const processEnv = options?.processEnv ?? process.env
const persisted = options?.persisted ?? loadProfileFile()
// Saved /provider profiles should still win over provider-manager env that was
// auto-applied during startup. Only an explicit shell/flag provider selection
// should bypass the persisted startup profile.
//
const profileManagedEnv = processEnv.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED === '1'
// The legacy single-profile file (~/.openclaude-profile.json) is a
// first-run / fallback mechanism. The newer plural provider-profile
// system (`/provider` presets + activeProviderProfileId in config) is
// applied earlier in the bootstrap via applyActiveProviderProfileFromConfig
// and signals completion with CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED=1.
// If the user explicitly selected a provider via env, allow it to bypass
// the persisted profile only when we can prove it was managed by the
// persisted profile env itself.
//
// If the plural system has already set env, trust it — do NOT overlay the
// legacy file. addProviderProfile() does not sync the legacy file, so a
// stale legacy file (e.g. OpenAI defaults from an earlier manual setup)
// would otherwise overwrite the correct plural env and surface as the
// "banner shows gpt-4o / api.openai.com even though my saved profile is
// Moonshot" bug.
if (profileManagedEnv) {
return processEnv
}
// Practically: on initial startup, provider routing env vars can already
// be present due to earlier auto-application steps. We should still apply
// the persisted profile rather than returning early.
if (!persisted) {
return processEnv
}
const launchProcessEnv = profileManagedEnv
? (() => {
const cleanedEnv = { ...processEnv }
for (const key of PROFILE_ENV_KEYS) {
delete cleanedEnv[key]
}
delete cleanedEnv.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED
delete cleanedEnv.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID
return cleanedEnv
})()
: processEnv
return buildLaunchEnv({
profile: persisted.profile,
persisted,
goal:
options?.goal ??
normalizeRecommendationGoal(processEnv.OPENCLAUDE_PROFILE_GOAL),
processEnv,
processEnv: launchProcessEnv,
getOllamaChatBaseUrl:
options?.getOllamaChatBaseUrl ?? getOllamaChatBaseUrl,
resolveOllamaDefaultModel: options?.resolveOllamaDefaultModel,

Some files were not shown because too many files have changed in this diff.