Compare commits


69 Commits

Author SHA1 Message Date
gnanam1990
bd73bcc9d7 fix(mcp): disable MCP_SKILLS feature flag — source not mirrored
Closes #856.

MCP servers that expose resources (e.g. RepoPrompt) failed to load
their tools in the open build with:

    Error fetching tools/commands/resources:
    fetchMcpSkillsForClient is not a function

Root cause: scripts/build.ts set MCP_SKILLS: true, which made
feature('MCP_SKILLS') evaluate to true at build time. The guards
around the dynamic skill discovery path therefore stayed live. The
underlying source file src/skills/mcpSkills.ts is not mirrored into
the open tree, so the bundler fell back to its generic missing-module
stub — which only exports `default` for require()-style imports, not
the named `fetchMcpSkillsForClient` binding. At runtime the require
returned an object without that property, and calling it threw.

`openclaude mcp doctor` reported RepoPrompt as healthy because doctor
does not exercise the skills-fetch path.

Fix: flip MCP_SKILLS to false and move it into the "Disabled: missing
source" group. With the flag off, every `if (feature('MCP_SKILLS'))`
guard becomes a no-op at build time, the require() branch is dead
code, and MCP servers with resources load normally via the existing
`Promise.resolve([])` fallbacks already present at each call site.

Also adds scripts/feature-flags-source-guard.test.ts to fail fast if
MCP_SKILLS (or any future flag in the same category) is re-enabled
without the corresponding source file being mirrored first.

Verification:
  - Test fails on main, passes with this fix
  - `bun run build` produces a bundle with no
    `missing-module-stub:../../skills/mcpSkills.js` reference
  - Full `bun test` — 1222 pass / 12 fail (same pre-existing 12 as
    main; new test adds the +1 pass)
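
A minimal sketch of the guard mechanics, assuming a constant flag map
(names illustrative, not the mirrored scripts/build.ts code):

  const FEATURE_FLAGS = {
    MCP_SKILLS: false, // off: src/skills/mcpSkills.ts is not mirrored
  }

  function feature(name: keyof typeof FEATURE_FLAGS): boolean {
    return FEATURE_FLAGS[name]
  }

  async function loadMcpSkills(client: unknown): Promise<unknown[]> {
    // Constant-false guard: the bundler drops this branch, so the
    // missing-module stub for mcpSkills.js is never required at runtime.
    if (feature('MCP_SKILLS')) {
      const { fetchMcpSkillsForClient } = require('../../skills/mcpSkills.js')
      return fetchMcpSkillsForClient(client)
    }
    return Promise.resolve([]) // the fallback already present at call sites
  }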
2026-04-24 07:54:11 +05:30
0xfandom
e346b8d5ec fix(startup): url authoritative over model name in banner provider detect (#864)
The banner provider branch tested model-name substrings (`/deepseek/`, `/kimi/`,
`/mistral/`, `/llama/`) before aggregator base-URL substrings (`/openrouter/`,
`/together/`, `/groq/`, `/azure/`). When running OpenRouter/Together/Groq with
vendor-prefixed model IDs (e.g. `deepseek/deepseek-chat`, `moonshotai/kimi-k2`,
`deepseek-r1-distill-llama-70b`), the banner mislabelled the provider.

Reorder: explicit env flags (NVIDIA_NIM, MINIMAX_API_KEY) and codex transport
win first; base-URL host checks run before rawModel fallback; rawModel only
fires when the base URL is generic/custom. Add unit tests covering the
aggregator × vendor-prefixed-model matrix plus direct-vendor regressions.

Closes #855
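
A sketch of the reordered precedence; the function shape and branch list
are illustrative, not the actual banner code:

  function detectBannerProvider(env: NodeJS.ProcessEnv,
                                baseUrl: string, rawModel: string): string {
    if (env.NVIDIA_NIM) return 'NVIDIA NIM'       // explicit env flags win
    if (env.MINIMAX_API_KEY) return 'MiniMax'
    if (baseUrl.includes('openrouter')) return 'OpenRouter' // host beats model id
    if (baseUrl.includes('together')) return 'Together'
    if (baseUrl.includes('groq')) return 'Groq'
    // rawModel fires only when no known host matched (generic/custom URL)
    if (rawModel.includes('deepseek')) return 'DeepSeek'
    if (rawModel.includes('kimi')) return 'Moonshot (Kimi)'
    return 'OpenAI'
  }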
2026-04-24 01:52:27 +08:00
hika, maeng
b750e9e97d fix: make OpenAI fallback context window configurable + support external model lookup (#861)
* fix: make OpenAI fallback context window configurable and support external lookup table

Unknown OpenAI-compatible models fell back to a hardcoded 128k constant,
causing auto-compact to fire prematurely on models with larger windows
(issue #635 follow-up). Three escape hatches are added without touching
the built-in table:

- CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW (number): overrides the 128k
  default for all unknown models.
- CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS (JSON object): per-model overrides that
  take precedence over the built-in OPENAI_CONTEXT_WINDOWS table; supports
  the same provider-qualified and prefix-matching lookup as the built-in path.
- CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS (JSON object): same pattern for output
  token limits.

This lets operators deploy new or private models without patching
openaiContextWindows.ts on every model release.
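
A sketch of the resulting lookup order, assuming exact-match parsing (the
real path also supports provider-qualified and prefix matching):

  declare const OPENAI_CONTEXT_WINDOWS: Record<string, number> // built-in table

  function resolveContextWindow(model: string): number {
    const overrides = JSON.parse(
      process.env.CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS ?? '{}',
    ) as Record<string, number>
    if (overrides[model] !== undefined) return overrides[model]  // env wins
    if (OPENAI_CONTEXT_WINDOWS[model] !== undefined) {
      return OPENAI_CONTEXT_WINDOWS[model]                       // built-in
    }
    const fallback = Number(process.env.CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW)
    return Number.isFinite(fallback) && fallback > 0 ? fallback : 128_000
  }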

* docs: add new OpenAI context window env vars to .env.example

Document CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW,
CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS, and
CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS with usage examples.

Addresses reviewer feedback on PR #861.

---------

Co-authored-by: opencode <dev@example.com>
2026-04-24 00:34:08 +08:00
0xfandom
28de94df5d feat: add OPENCLAUDE_DISABLE_TOOL_REMINDERS env var to suppress hidden tool-output reminders (#837)
Gates three injection sites behind OPENCLAUDE_DISABLE_TOOL_REMINDERS:
- FileReadTool cyber-risk mitigation reminder (appended to every Read
  result when the model is not in MITIGATION_EXEMPT_MODELS)
- todo_reminder attachment for TodoWrite usage
- task_reminder attachment for TaskCreate/TaskUpdate usage

All three reminders are model-only side-channel instructions the user
cannot see today. Users who want full transparency over what the model
receives can now opt out without patching dist/cli.mjs on every upgrade.

Default behavior is unchanged when the flag is unset.

Closes #809
2026-04-23 01:37:02 +08:00
0xfandom
23e8cfbd5b fix(test): add missing teammate exports to hookChains integration mock (#840)
mock.module('./teammate.js', ...) only declared getAgentName/getTeamName/
getTeammateColor. Bun applies module mocks process-globally and
mock.restore() does not undo them, so whenever another test file ran
after hookChains.integration.test.ts and reached the real teammate
module it received undefined for isTeammate/isPlanModeRequired/
getAgentId/getParentSessionId.

This surfaced in CI as intermittent failures in
src/commands/provider/provider.test.tsx (TextEntryDialog / wizard
remount / ProviderWizard hides Codex OAuth), because getDefaultAppState
in AppStateStore.ts calls teammateUtils.isTeammate().

Match the mock surface to the real teammate.ts exports so downstream
consumers keep working even after the integration test pollutes the
module cache. Keeps the same behavioral overrides this test needed.

Closes #839
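
A sketch of the widened mock surface using Bun's mock.module; the
override values are placeholders:

  import { mock } from 'bun:test'

  // Mirror the full real export surface so test files that run later and
  // hit the process-globally mocked module still find every named export.
  mock.module('./teammate.js', () => ({
    getAgentName: () => 'mock-agent',   // behavioral overrides the test needs
    getTeamName: () => 'mock-team',
    getTeammateColor: () => 'cyan',
    isTeammate: () => false,            // previously missing -> undefined
    isPlanModeRequired: () => false,
    getAgentId: () => 'mock-id',
    getParentSessionId: () => undefined,
  }))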
2026-04-23 01:36:42 +08:00
Kevin Codex
531e3f1059 feat(tools): resilient web search and fetch across all providers (#836)
- Add exponential backoff retry to DuckDuckGo adapter (3 attempts with
  jitter) to handle transient rate-limiting and connection errors.
- Add native fetch() fallback in WebFetch when axios hangs with custom
  DNS lookup in bundled contexts.
- Prevent broken native-path fallback for web search on OpenAI shim
  providers (minimax, moonshot, nvidia-nim, etc.) that do not support
  Anthropic's web_search_20250305 tool.
- Cherry-pick existing fixes:
  - a48bd56: cover codex/minimax/nvidia-nim in getSmallFastModel()
  - 31f0b68: 45s budget + raw-markdown fallback for secondary model
  - 446c1e8: sparse Codex /responses payload parsing
  - ae3f0b2: echo reasoning_content on assistant tool-call messages
- Fix domainCheck.test.ts mock modules to include isFirstPartyAnthropicBaseUrl
  and isGithubNativeAnthropicMode exports.
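
A minimal sketch of the retry shape in the first bullet, assuming full
jitter on an exponential base delay:

  async function withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
    let lastError: unknown
    for (let i = 0; i < attempts; i++) {
      try {
        return await fn()
      } catch (err) {
        lastError = err                // rate limit / connection error
        const base = 500 * 2 ** i      // 500ms, 1s, 2s
        await new Promise(r => setTimeout(r, Math.random() * base))
      }
    }
    throw lastError
  }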

Co-authored-by: OpenClaude <openclaude@gitlawb.com>
2026-04-23 01:14:00 +08:00
KRATOS
3c4d8435c4 fix: surface actionable error when DuckDuckGo web search is rate-limited (#834)
Non-Anthropic / non-codex providers (minimax, kimi, generic OpenAI-compatible)
fell through to the DDG adapter when no paid search key was configured. DDG's
scraper is blocked on most IPs, so web_search surfaced an opaque "anomaly in
the request" error. Catch that response in the DDG provider and rethrow with
the exact env vars that would unblock the tool, or the option to switch to a
native-search provider.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 00:58:20 +08:00
Kevin Codex
67de6bd2cf fix(openai-shim): echo reasoning_content on assistant tool-call messages for Moonshot (#828)
Kimi / Moonshot's chat completions endpoint requires that every assistant
message carrying tool_calls also carry reasoning_content when the
"thinking" feature is active. When an agent sends prior-turn assistant
history back (standard multi-turn / subagent / Explore patterns), the
shim previously stripped the thinking block:

  case 'thinking':
  case 'redacted_thinking':
    // Strip thinking blocks for OpenAI-compatible providers.
    break

That's correct for providers that would misinterpret serialized
<thinking> tags, but Moonshot validates the schema strictly and rejects
with:

  API Error: 400 {"error":{"message":"thinking is enabled but
  reasoning_content is missing in assistant tool call message at
  index N","type":"invalid_request_error"}}

Reproducer: launch with Kimi profile, run any tool-using command
(Explore, Bash, etc.) — every request after the first 400s.

Fix: in convertMessages(), when the per-request flag
preserveReasoningContent is set (only for Moonshot baseUrls today),
attach the original thinking block's text as reasoning_content on the
outgoing OpenAI-shaped assistant message. Other providers continue to
strip (unknown-field rejection risk).

OpenAIMessage type grows a reasoning_content?: string field.
convertMessages() accepts an options object and threads the flag
through; the only call site (_doOpenAIRequest) gates via
isMoonshotBaseUrl(request.baseUrl).
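
A condensed sketch of the echo path; types are simplified relative to the
real convertMessages:

  interface OpenAIMessage {
    role: 'assistant' | 'user' | 'system'
    content: string | null
    tool_calls?: unknown[]
    reasoning_content?: string // new field
  }

  function applyThinkingBlock(
    block: { type: 'thinking'; thinking: string },
    out: OpenAIMessage,
    opts: { preserveReasoningContent?: boolean } = {},
  ): void {
    if (opts.preserveReasoningContent) {
      out.reasoning_content = block.thinking // Moonshot: echo instead of strip
    }
    // other providers: strip as before (unknown-field rejection risk)
  }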

Tests (openaiShim.test.ts):
  - Moonshot: echoes reasoning_content on assistant tool-call messages
    (regression for the reported 400)
  - non-Moonshot providers do NOT receive reasoning_content (guards
    against leaking the field to strict-parse endpoints)

Full suite: 1195/1195 pass under --max-concurrency=1. PR scan clean.

Co-authored-by: OpenClaude <openclaude@gitlawb.com>
2026-04-22 22:47:57 +08:00
0xfandom
4d559c9135 docs(env): document OPENCLAUDE_DISABLE_STRICT_TOOLS in .env.example (#826)
Code support was merged in #770 but the .env.example entry was
missed, leaving users without a discoverable way to find the flag.

Closes #737
2026-04-22 22:16:47 +08:00
JATMN
b7b83eff13 Fix bracketed paste blocking provider form submit (#818) 2026-04-22 19:48:33 +08:00
Urvish L.
44a2c30d5f feat: implement Hook Chains runtime integration for self-healing agent mesh MVP (#711)
* feat: implement Hook Chains runtime integration for self-healing agent mesh MVP

- Add Hook Chains config loader, evaluator, and dispatcher in src/utils/hookChains.ts
- Wire PostToolUseFailure hook dispatch in executePostToolUseFailureHooks()
- Wire TaskCompleted hook dispatch in executeTaskCompletedHooks()
- Integrate fallback-agent launcher with permission preservation (canUseTool threading)
- Add safety hardening for config-read errors (try-catch protection)
- Update docs with MVP runtime trigger explanation
- Add 10 unit tests and 4 integration tests covering config, rules, guards, and actions

This completes the self-healing agent mesh MVP by enabling declarative rule-based
responses to tool failures and task completions, with fallback agent spawning,
team notification, and capacity warming actions.
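
For illustration, a hypothetical rule shape (the real schema lives in
src/utils/hookChains.ts and may differ):

  interface HookChainRule {
    trigger: 'PostToolUseFailure' | 'TaskCompleted'
    match?: { toolName?: string; errorPattern?: string }
    actions: Array<
      | { type: 'spawn_fallback_agent'; agent: string }
      | { type: 'notify_team'; message: string }
      | { type: 'warm_capacity'; count: number }
    >
  }

  const exampleRule: HookChainRule = {
    trigger: 'PostToolUseFailure',
    match: { toolName: 'Bash', errorPattern: 'timed out' },
    actions: [{ type: 'spawn_fallback_agent', agent: 'debugger' }],
  }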

* Update docs/hook-chains.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/utils/hookChains.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: address PR #711 review blockers for Hook Chains

- Gate hook-chain dispatch behind feature('HOOK_CHAINS') and default env gate to off
- Remove committed local artifact (agent.log) and ignore it in .gitignore
- Revert hook dispatcher signature threading changes for canUseTool
- Use ToolUseContext metadata hookChainsCanUseTool for fallback launch permissions
- Make spawn_fallback_agent fail explicitly when launcher context is unavailable
- Add config cache max age and guard map size limits to bound runtime memory
- Update docs and tests for default-off gating and explicit fallback failure

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-22 19:40:23 +08:00
ArkhAngelLifeJiggy
5b9cd21e37 feat: add streaming optimizer and structured request logging (#703)
* Integrate request logging and streaming optimizer

- Add logApiCallStart/End for API request tracking with correlation IDs
- Add streaming state tracking with processStreamChunk
- Flush buffer and log stream stats at stream end
- Resolve merge conflict with main branch

* feat: add streaming optimizer and structured request logging

* fix: address PR review feedback

- Remove buffering from streamingOptimizer - now purely observational
- Use logForDebugging instead of console.log for structured logging
- Remove dead code (streamResponse, bufferedStreamResponse, etc.)
- Use existing logging infrastructure instead of raw console.log
- Keep only used functions: createStreamState, processStreamChunk, getStreamStats

* test: add unit tests for requestLogging and streamingOptimizer

- streamingOptimizer.test.ts: 6 tests for createStreamState, processStreamChunk, getStreamStats
- requestLogging.test.ts: 6 tests for createCorrelationId, logApiCallStart, logApiCallEnd

* fix: correct durationMs test to be >= 0 instead of exactly 0

* fix: address PR #703 blockers and non-blockers

1. BLOCKER FIX: Skip clone() for streaming responses
   - Only call response.clone() + .json() for non-streaming requests
   - For streaming, usage comes via stream chunks anyway

2. NON-BLOCKER: Document dead code in flushStreamBuffer
   - Added comment explaining it's a no-op kept for API compat

3. NON-BLOCKER: vi.mock in tests - left as-is (test framework issue)

* fix: address all remaining non-blockers for PR #703

1. Remove dead code: flushStreamBuffer call and unused import
2. Fix test for Bun: remove vi.mock, use simple no-throw tests
2026-04-22 15:36:07 +08:00
ArkhAngelLifeJiggy
e92e5274b2 feat: add model-specific tokenizers and compression ratio detection (#799)
- ModelTokenizerConfig for different model families
- getTokenizerConfig() / getBytesPerTokenForModel()
- Content type detection (json, code, prose, list, technical)
- COMPRESSION_RATIOS - empirical ratios per content type
- estimateWithBounds() - confidence intervals
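
A hypothetical sketch of the bounded estimate; field names and the
interval width are assumptions:

  interface TokenEstimate { tokens: number; low: number; high: number }

  function estimateWithBounds(
    text: string,
    bytesPerToken: number, // from getBytesPerTokenForModel()
    ratio: number,         // empirical compression ratio for the content type
    confidence = 0.2,      // +/-20% interval
  ): TokenEstimate {
    const tokens = Math.ceil((text.length / bytesPerToken) * ratio)
    return {
      tokens,
      low: Math.floor(tokens * (1 - confidence)),
      high: Math.ceil(tokens * (1 + confidence)),
    }
  }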

Features: 1.1, 1.14, 1.15
Tests: 13 passing
2026-04-22 13:24:12 +08:00
github-actions[bot]
86bce4ae74 chore(main): release 0.6.0 (#786)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-22 09:41:30 +08:00
Kevin Codex
c13842e91c fix(test): autoCompact floor assertion is flag-sensitive (#816)
The test "never returns negative even for unknown 3P models (issue #635)"
asserted that getEffectiveContextWindowSize() returns >= 33_000 for an
unknown 3P model under the OpenAI shim. That specific number assumes
reservedTokensForSummary = 20_000 (MAX_OUTPUT_TOKENS_FOR_SUMMARY), which
holds only when the tengu_otk_slot_v1 GrowthBook flag is disabled.

When the flag is ON — which is the case in CI but not always locally —
getMaxOutputTokensForModel() caps the model's default output at
CAPPED_DEFAULT_MAX_TOKENS (8_000). Then reservedTokensForSummary = 8_000,
floor = 8_000 + 13_000 = 21_000, and the test fails with 21_000 < 33_000.

The test reliably passes locally and reliably fails in CI, manifesting as
the intermittent PR-check failure.

Fix: relax the lower bound to 21_000 (cap-enabled worst case), which is
still well above zero — preserving the anti-regression intent of
issue #635 (no infinite auto-compact from a negative effective window)
without binding the test to GrowthBook flag state.

Co-authored-by: OpenClaude <openclaude@gitlawb.com>
2026-04-22 09:37:57 +08:00
Kevin Codex
458120889f fix(model): codex/nvidia-nim/minimax now read OPENAI_MODEL env (#815)
getUserSpecifiedModelSetting() decides which env var to consult based on
the active provider. The check included openai and github but omitted
codex, nvidia-nim, and minimax — even though all three use the OpenAI
shim transport and get their model routing via CLAUDE_CODE_USE_OPENAI=1
+ OPENAI_MODEL (set by applyProviderProfileToProcessEnv).

Concrete failure: user switches from Moonshot profile (which persisted
settings.model='kimi-k2.6') to the Codex profile. The new profile
correctly writes OPENAI_MODEL=codexplan + base URL to
chatgpt.com/backend-api/codex. Startup banner reflects Codex / gpt-5.4
correctly. But at request time getUserSpecifiedModelSetting() returns
early for provider='codex' (not in the env-consult list), falls through
to the stale settings.model='kimi-k2.6', and the Codex API rejects:

  API Error 400: "The 'kimi-k2.6' model is not supported when using
  Codex with a ChatGPT account."

Fix: extract an isOpenAIShimProvider flag covering openai|codex|github|
nvidia-nim|minimax — all providers that set OPENAI_MODEL as their model
env var. The Gemini and Mistral branches stay as-is (they use
GEMINI_MODEL / MISTRAL_MODEL).
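
A sketch of the extracted check (set membership shown for illustration):

  const OPENAI_SHIM_PROVIDERS = new Set([
    'openai', 'codex', 'github', 'nvidia-nim', 'minimax',
  ])

  function isOpenAIShimProvider(provider: string): boolean {
    return OPENAI_SHIM_PROVIDERS.has(provider)
  }

  // getUserSpecifiedModelSetting() then consults OPENAI_MODEL whenever
  // isOpenAIShimProvider(provider) holds, instead of falling through to
  // the stale settings.model.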

Five regression tests pin the fix for each OpenAI-shim provider plus
guard tests for openai and github that already worked.

Co-authored-by: OpenClaude <openclaude@gitlawb.com>
2026-04-22 09:01:44 +08:00
Mike
ee19159c17 feat(provider): expose Atomic Chat in /provider picker with autodetect (#810)
Adds Atomic Chat as a first-class preset inside the in-session /provider
slash command, mirroring the Ollama auto-detect flow. Picking it probes
127.0.0.1:1337/v1/models, lists loaded models for direct selection, and
falls back to "Enter manually" / "Back" when the server is unreachable
or no models are loaded. README updated to reflect the new setup path.

Made-with: Cursor
2026-04-22 07:55:53 +08:00
Kevin Codex
13de4e85df fix(provider): saved profile ignored when stale CLAUDE_CODE_USE_* in shell (#807)
* fix(provider): saved profile ignored when stale CLAUDE_CODE_USE_* in shell

Users reported "my saved /provider profile isn't picked up at startup —
the banner shows gpt-4o / api.openai.com even though I saved Moonshot".

Root cause: applyActiveProviderProfileFromConfig() bailed out whenever
hasProviderSelectionFlags(processEnv) was true — i.e. whenever ANY
CLAUDE_CODE_USE_* flag was present. But a bare `CLAUDE_CODE_USE_OPENAI=1`
with no paired OPENAI_BASE_URL / OPENAI_MODEL is almost always a stale
shell export left over from a prior manual setup, not genuine startup
intent. Respecting it skipped the saved profile and let StartupScreen.ts
fall through to the hardcoded `gpt-4o` / `https://api.openai.com/v1`
defaults — the exact symptom users see.

Fix: narrow the guard from "any flag set" to "flag set AND at least one
concrete config value (BASE_URL, MODEL, or API_KEY)". A bare stale flag
no longer blocks the saved profile. A real shell selection (flag + URL
or flag + model) still wins, preserving the "explicit startup intent
overrides saved profile" contract.

New helper: hasCompleteProviderSelection(env). Per-provider check for a
paired concrete value. Bedrock/Vertex/Foundry keep the flag-alone
semantic since they rely on ambient AWS/GCP credentials rather than env
config.
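
A sketch of the narrowed guard for the OpenAI case; the Bedrock flag name
and per-provider pairs are assumptions:

  function hasCompleteProviderSelection(env: NodeJS.ProcessEnv): boolean {
    if (env.CLAUDE_CODE_USE_OPENAI === '1') {
      // a bare flag is treated as stale; require a concrete paired value
      return Boolean(env.OPENAI_BASE_URL || env.OPENAI_MODEL || env.OPENAI_API_KEY)
    }
    // Bedrock/Vertex/Foundry keep flag-alone semantics (ambient credentials)
    if (env.CLAUDE_CODE_USE_BEDROCK === '1') return true
    return false
  }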

Three new tests cover the bug and the two counter-cases:
  - bare USE flag → profile applies (fixes the bug)
  - USE flag + BASE_URL → profile blocked (preserves explicit intent)
  - USE flag + MODEL → profile blocked (preserves explicit intent)

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>

* fix(provider): don't overlay stale legacy profile on plural-managed env

Second half of the "saved profile not picked up in banner" bug. The prior
commit fixed the guard that prevented applyActiveProviderProfileFromConfig()
from firing when a stale CLAUDE_CODE_USE_* flag was in the shell. But even
when the plural system applies correctly, buildStartupEnvFromProfile() was
then loading the legacy .openclaude-profile.json AND overwriting the
plural-managed env with whatever that file contained.

addProviderProfile() (the call path the /provider preset picker uses) does
NOT sync the legacy file, so a user who went:

  manual setup: CLAUDE_CODE_USE_OPENAI=1 + OPENAI_MODEL=gpt-4o
              → writes .openclaude-profile.json as { openai, gpt-4o, ... }
  /provider:   add Moonshot preset, mark active
              → writes plural config; legacy file UNCHANGED

would see startup reliably apply Moonshot env first, then get it clobbered
by the stale legacy file. Banner shows gpt-4o / api.openai.com while
runtime ends up with the correct env via a different code path — exactly
the user-reported symptom.

Fix: in buildStartupEnvFromProfile, when the plural system has already
set env (CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED === '1'), skip the
legacy-file overlay entirely and return processEnv unchanged. Legacy is
now strictly a first-run / fallback path for users who haven't adopted
the plural system.

Also removes the stripped-then-rebuilt env construction that was part of
the old overlay path — no longer needed.

Test updates:
  - Replaced "lets saved startup profile override profile-managed env"
    (encoded the old broken behavior) with a regression test that pins
    the new semantic: plural env survives when legacy is stale.
  - Added "falls back to legacy when plural hasn't applied" to pin the
    first-run path still works.

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>

---------

Co-authored-by: OpenClaude <openclaude@gitlawb.com>
2026-04-22 00:59:32 +08:00
Kevin Codex
a5bfcbbadf feat(provider): zero-config autodetection primitive (#784)
First-run users with a credential already exported (ANTHROPIC_API_KEY,
OPENAI_API_KEY, etc.) currently still have to navigate the provider picker
or set CLAUDE_CODE_USE_* flags manually. Selecting the right provider from
ambient state should be automatic.

New module src/utils/providerAutoDetect.ts:

- detectProviderFromEnv() — synchronous env scan in a deterministic priority
  order (anthropic → codex → github → openai → gemini → mistral → minimax).
  Also detects Codex via ~/.codex/auth.json presence.
- detectLocalService() — parallel probes for Ollama (:11434) and LM Studio
  (:1234), honoring OLLAMA_BASE_URL / LM_STUDIO_BASE_URL overrides.
  Short 1.2s default timeout so first-run latency stays low when no local
  service is running.
- detectBestProvider() — orchestrator. Env scan short-circuits the probe;
  only hits the network when env has nothing.

All detection paths are side-effect-free: returns a DetectedProvider
descriptor describing what was found and why. Callers decide whether to
apply it (gated on hasExplicitProviderSelection() / profile file existence)
and how to hydrate the launch env.

Codex auth-file check is injectable (hasCodexAuth option) so tests are
hermetic from the dev machine's ~/.codex/auth.json state.
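
A sketch of the priority scan; the non-Anthropic env var names are
assumptions:

  interface DetectedProvider { provider: string; reason: string }

  function detectProviderFromEnv(
    env: NodeJS.ProcessEnv,
    hasCodexAuth: () => boolean, // injectable so tests stay hermetic
  ): DetectedProvider | null {
    if (env.ANTHROPIC_API_KEY) return { provider: 'anthropic', reason: 'ANTHROPIC_API_KEY' }
    if (hasCodexAuth()) return { provider: 'codex', reason: '~/.codex/auth.json' }
    if (env.GITHUB_TOKEN) return { provider: 'github', reason: 'GITHUB_TOKEN' }
    if (env.OPENAI_API_KEY) return { provider: 'openai', reason: 'OPENAI_API_KEY' }
    if (env.GEMINI_API_KEY) return { provider: 'gemini', reason: 'GEMINI_API_KEY' }
    if (env.MISTRAL_API_KEY) return { provider: 'mistral', reason: 'MISTRAL_API_KEY' }
    if (env.MINIMAX_API_KEY) return { provider: 'minimax', reason: 'MINIMAX_API_KEY' }
    return null // nothing in env: caller proceeds to local-service probes
  }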

Co-authored-by: OpenClaude <openclaude@gitlawb.com>
2026-04-21 23:37:04 +08:00
ArkhAngelLifeJiggy
268c0398e4 feat: add thinking token extraction (#798)
* feat: add thinking token tracking and historical analytics

- extractThinkingTokens(): separate thinking from output tokens
- TokenUsageTracker class for historical analytics
- Track: cache hit rate, most used model, requests per hour/day
- Analytics: average tokens per request, totals
- Add tests (7 passing)

PR 4B: Features 1.10 + 1.11

* refactor: extract thinking and analytics to separate files

- Create thinkingTokenExtractor.ts with ThinkingTokenAnalyzer
- Create tokenAnalytics.ts with TokenUsageTracker
- Add production-grade methods and tests
- Update test imports
2026-04-21 23:25:12 +08:00
nickmesen
761924daa7 fix: Collapse all-text arrays to string for DeepSeek compatibility (#806)
Fixes #774. When tool_result content contains multiple text blocks,
they were serialized as arrays instead of strings, causing DeepSeek
to reject the request with 400 error.

Changes:
- convertToolResultContent: collapse all-text arrays to joined string
- convertContentBlocks: defensive collapse for user/assistant messages
- Arrays with images are preserved (not collapsed)
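
A sketch of the collapse rule, with block types simplified:

  type Block = { type: 'text'; text: string } | { type: 'image'; source: unknown }

  function convertToolResultContent(content: string | Block[]): string | Block[] {
    if (typeof content === 'string') return content
    if (content.every(b => b.type === 'text')) {
      // DeepSeek rejects array-shaped tool results; join into one string
      return content.map(b => (b as { text: string }).text).join('\n')
    }
    return content // arrays containing images are preserved as-is
  }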

Tests: 3 new tests added, 53 pass, 0 fail

Co-authored-by: nick.mesen <nickmesen@users.noreply.github.com>
2026-04-21 23:17:12 +08:00
Kevin Codex
e908864da7 feat(api): smart model routing primitive (cheap-for-simple, strong-for-hard) (#785)
Most everyday turns ("ok", "thanks", "yep go ahead", "what does that do?")
get no measurable quality improvement from Opus-tier models over Haiku-tier,
but cost ~10x more and stream slower. Smart routing opts a user into
automatically routing obviously-simple turns to a cheaper model while
keeping the strong model for anything non-trivial.

New module src/services/api/smartModelRouting.ts:

- routeModel(input, config) → { model, complexity, reason }
- Pure primitive: no env reads, no state, caller supplies everything.
- Config is opt-in (enabled: false by default).

Routes to strong (conservative) when ANY of:
  - First turn of session (task-setup is worth the quality)
  - Code fence or inline code span present
  - Reasoning/planning keyword (plan, design, refactor, debug, architect,
    investigate, root cause, etc. — 20+ anchors)
  - Multi-paragraph input
  - Over char/word cutoff (defaults: 160 chars, 28 words; matches hermes)

Routes to simple only for clearly-trivial chatter.
Decision includes a reason string for a future UI indicator that shows
which tier handled the turn.

Integration into query path is intentionally deferred to a follow-up PR so
the heuristics can be reviewed and tuned in isolation first.
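
A condensed sketch of the conservative rules, using the cutoffs quoted
above; the regexes are illustrative:

  function routeModel(input: string, cfg: {
    enabled: boolean; strongModel: string; simpleModel: string; isFirstTurn: boolean
  }): { model: string; reason: string } {
    const strong = (reason: string) => ({ model: cfg.strongModel, reason })
    if (!cfg.enabled) return strong('routing disabled')
    if (cfg.isFirstTurn) return strong('first turn')
    if (/```|`[^`]+`/.test(input)) return strong('code present')
    if (/\b(plan|design|refactor|debug|architect|investigate)\b/i.test(input))
      return strong('reasoning keyword')
    if (input.split(/\n\s*\n/).length > 1) return strong('multi-paragraph')
    if (input.length > 160 || input.split(/\s+/).length > 28)
      return strong('over cutoff')
    return { model: cfg.simpleModel, reason: 'trivial chatter' }
  }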

Co-authored-by: OpenClaude <openclaude@gitlawb.com>
2026-04-21 21:50:24 +08:00
Kevin Codex
b95d2221df Feat/kimi moonshot support (#805)
* feat(provider): first-class Moonshot (Kimi) direct-API support

Moonshot's direct API (api.moonshot.ai/v1) is OpenAI-compatible and works
today via the generic OpenAI shim, including the reasoning_content channel
that Kimi returns alongside the user-visible content. But the UX was rough:
unknown context window triggered the conservative 128k fallback + a warning,
and the provider displayed as "Local OpenAI-compatible".

Makes Moonshot a recognized provider:

- src/utils/model/openaiContextWindows.ts: add the Kimi K2 family and
  moonshot-v1-* variants to both the context-window and max-output tables.
  Values from Moonshot's model card — K2.6 and K2-thinking are 256K,
  K2/K2-instruct are 128K, moonshot-v1 sizes are embedded in the model id.
- src/utils/providerDiscovery.ts: recognize the api.moonshot.ai hostname
  and label it "Moonshot (Kimi)" in the startup banner and provider UI.

Users can now launch with:

  CLAUDE_CODE_USE_OPENAI=1 \
  OPENAI_BASE_URL=https://api.moonshot.ai/v1 \
  OPENAI_API_KEY=sk-... \
  OPENAI_MODEL=kimi-k2.6 \
  openclaude

and get accurate compaction + correct labeling + correct max_tokens out
of the box.

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>

* fix(openai-shim): Moonshot API compatibility — max_tokens + strip store

Moonshot's direct API (api.moonshot.ai and api.moonshot.cn) uses the
classic OpenAI `max_tokens` parameter, not the newer `max_completion_tokens`
that the shim defaults to. It also hasn't published support for `store`
and may reject it on strict-parse — same class of error as Gemini's
"Unknown name 'store': Cannot find field" 400.

- Adds isMoonshotBaseUrl() that recognizes both .ai and .cn hosts.
- Converts max_completion_tokens → max_tokens for Moonshot requests
  (alongside GitHub / Mistral / local providers).
- Strips body.store for Moonshot requests (alongside Mistral / Gemini).

Two shim tests cover both the .ai and .cn hostnames.
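
A sketch of the host check and body rewrite (bodies illustrative):

  function isMoonshotBaseUrl(baseUrl: string): boolean {
    try {
      const host = new URL(baseUrl).hostname
      return host.endsWith('moonshot.ai') || host.endsWith('moonshot.cn')
    } catch {
      return false
    }
  }

  function adaptBodyForMoonshot(body: Record<string, unknown>): void {
    body.max_tokens = body.max_completion_tokens // classic OpenAI parameter
    delete body.max_completion_tokens
    delete body.store // unpublished support; strict-parse rejection risk
  }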

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>

* fix: null-safe access on getCachedMCConfig() in external builds

External builds stub src/services/compact/cachedMicrocompact.ts so
getCachedMCConfig() returns null, but two call sites still dereferenced
config.supportedModels directly. The ?. operator was in the wrong place
(config.supportedModels? instead of config?.supportedModels), so the null
config threw "Cannot read properties of null (reading 'supportedModels')"
on every request.

Reproduces with any external-build provider (notably Kimi/Moonshot just
enabled in the sibling commits, but equally DeepSeek, Mistral, Groq,
Ollama, etc.):

  ❯ hey
  ⏺ Cannot read properties of null (reading 'supportedModels')

- prompts.ts: early-return from getFunctionResultClearingSection() when
  config is null, before touching .supportedModels.
- claude.ts: guard the debug-log jsonStringify with ?. so the log line
  never throws.

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>

* fix(startup): show "Moonshot (Kimi)" on the startup banner

The startup-screen provider detector had regex branches for OpenRouter,
DeepSeek, Groq, Together, Azure, etc., but nothing for Moonshot. Remote
Moonshot sessions fell through to the generic "OpenAI" label —
getLocalOpenAICompatibleProviderLabel() only runs for local URLs, and
api.moonshot.ai / api.moonshot.cn are not local.

Adds a Moonshot branch matching /moonshot/ in the base URL OR /kimi/ in
the model id. Now launches with:

  OPENAI_BASE_URL=https://api.moonshot.ai/v1 OPENAI_MODEL=kimi-k2.6

display the Provider row as "Moonshot (Kimi)" instead of "OpenAI".

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>

* refactor(provider): sort preset picker alphabetically; Custom at end

The /provider preset picker was in ad-hoc order (Anthropic, Ollama,
OpenAI, then a jumble of third-party / local / codex / Alibaba / custom /
nvidia / minimax). Hard to scan when you know the provider name you want.

Sorts the list alphabetically by label A→Z. Pins "Custom" to the end —
it's the catch-all / escape hatch so it's scanned last, not shuffled into
the alphabetical run where a user looking for a named provider might
grab it by mistake. First-run-only "Skip for now" stays at the very
bottom, after Custom.

Test churn:
- ProviderManager.test.tsx: four tests hardcoded press counts (1 or 3 'j'
  presses) that broke when targets moved. Replaces them with a
  navigateToPreset(stdin, label) helper driven from a declared
  PRESET_ORDER array, so future list edits only update the array.
- ConsoleOAuthFlow.test.tsx: the 13-row test frame only renders the first
  ~13 providers. "Ollama", "OpenAI", "LM Studio" sentinels moved below
  the fold; swap them for alphabetically-early providers still visible
  in-frame ("Azure OpenAI", "DeepSeek", "Google Gemini"). Test intent
  (picker opened with providers listed) is preserved.

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>

---------

Co-authored-by: OpenClaude <openclaude@gitlawb.com>
2026-04-21 21:20:54 +08:00
ArkhAngelLifeJiggy
2b15e16421 feat: add model caching and benchmarking utilities (#671)
* feat: add model caching and benchmarking utilities

- Add modelCache.ts for disk caching of model lists
- Add benchmark.ts for testing model speed/quality

* fix: address review feedback - async fs, multi-provider support, error handling

* feat: add /benchmark slash command and unit tests

* feat: add /benchmark slash command and unit tests
2026-04-21 18:36:16 +08:00
Nourrisse Florian
6a62e3ff76 feat: enable 15 additional feature flags in open build (#667)
* feat: enable 16 additional feature flags in open build

Activate features whose source is fully available in the mirror and
that have no Anthropic-internal infrastructure dependencies:

UI/UX: MESSAGE_ACTIONS, HISTORY_PICKER, QUICK_SEARCH, HOOK_PROMPTS
Reasoning: ULTRATHINK, TOKEN_BUDGET, SHOT_STATS
Agents: FORK_SUBAGENT, VERIFICATION_AGENT, MCP_SKILLS
Memory: EXTRACT_MEMORIES, AWAY_SUMMARY
Optimization: CACHED_MICROCOMPACT, PROMPT_CACHE_BREAK_DETECTION
Safety: TRANSCRIPT_CLASSIFIER
Debug: DUMP_SYSTEM_PROMPT

Also reorganize featureFlags into documented sections (disabled/upstream/new)
with inline comments explaining each flag's purpose.

* feat: add centralized GrowthBook defaults map for open build

Add _openBuildDefaults in the GrowthBook stub (no-telemetry-plugin.ts)
with all 66 runtime feature keys, organized by category with inline
comments describing each flag's purpose.

Override tengu_sedge_lantern (AWAY_SUMMARY) and tengu_hive_evidence
(VERIFICATION_AGENT) to true so these features work out of the box
without requiring manual ~/.claude/feature-flags.json setup.

Priority: feature-flags.json > _openBuildDefaults > upstream default
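
A sketch of that precedence as a lookup; names mirror the message, not
the code:

  function resolveRuntimeFlag(
    key: string,
    userFlags: Record<string, boolean>,         // ~/.claude/feature-flags.json
    openBuildDefaults: Record<string, boolean>, // _openBuildDefaults
    upstreamDefault: boolean,
  ): boolean {
    if (key in userFlags) return userFlags[key]
    if (key in openBuildDefaults) return openBuildDefaults[key]
    return upstreamDefault
  }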

* feat: replace refusal language with positive security guidance

Remove refusal instructions from CYBER_RISK_INSTRUCTION since they are
redundant for Anthropic models (applied server-side) and useless for
uncensored models in multi-provider setups. Keep positive guidance for
security testing contexts and add red teaming support.

* Revert "feat: replace refusal language with positive security guidance"

This reverts commit 0463676a8f.

* fix: add EXTRACT_MEMORIES runtime gate overrides to open-build defaults

EXTRACT_MEMORIES was enabled at build-time but its runtime GrowthBook
gates (tengu_passport_quail, tengu_coral_fern) still defaulted to false,
preventing the feature from activating. Add both keys to
_openBuildDefaults so memory extraction works out of the box.

Also adds test coverage for _openBuildDefaults precedence behavior.

* docs: update GrowthBook runtime keys catalog to 88 keys

Expand the reference catalog in no-telemetry-plugin.ts from ~62 to 88
unique keys, covering all tengu_* call sites found in src/. Adds 27
previously undocumented keys including VSCode gates, dynamic configs
(auto-mode, cron, bridge), security gates, and KAIROS cron keys.

Adds "not exhaustive" disclaimer as suggested by Copilot reviewer.
Reorganizes categories with section dividers for readability.
2026-04-21 18:34:51 +08:00
3kin0x
06e7684eb5 fix(api): ensure strict role sequence and filter empty assistant messages after interruption (#745 regression) (#794) 2026-04-21 18:28:57 +08:00
Juan Camilo Auriti
ae3b723f3b fix(security): harden project settings trust boundary + MCP sanitization (#789)
* fix(security): harden project settings trust boundary + MCP sanitization

- Sanitize MCP tool result text with recursivelySanitizeUnicode() to prevent
  Unicode injection via malicious MCP servers (tool definitions and prompts
  were already sanitized, but tool call results were not)
- Read sandbox.enabled only from trusted settings sources (user, local, flag,
  policy) — exclude projectSettings to prevent malicious repos from silently
  disabling the sandbox via .claude/settings.json
- Disable git hooks in plugin marketplace clone/pull/submodule operations
  with core.hooksPath=/dev/null to prevent code execution from cloned repos
- Remove ANTHROPIC_FOUNDRY_API_KEY from SAFE_ENV_VARS to prevent credential
  injection from project-scoped settings without trust verification
- Add ssrfGuardedLookup to WebFetch HTTP requests to block DNS rebinding
  attacks that could reach cloud metadata or internal services

Security: closes trust boundary gap where project settings could override
security-critical configuration. Follows the existing pattern established
by hasAllowBypassPermissionsMode() which already excludes projectSettings.
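
A sketch of the trusted-sources read; the source keys mirror the list
above, not necessarily the settings API:

  const TRUSTED_SETTINGS_SOURCES = ['user', 'local', 'flag', 'policy'] as const

  function readSandboxEnabled(
    bySource: Record<string, { sandbox?: { enabled?: boolean } } | undefined>,
  ): boolean | undefined {
    for (const source of TRUSTED_SETTINGS_SOURCES) {
      const value = bySource[source]?.sandbox?.enabled
      if (value !== undefined) return value
    }
    return undefined // projectSettings is deliberately never consulted
  }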

Co-authored-by: auriti <auriti@users.noreply.github.com>

* fix(security): remove unauthenticated file-based permission polling

Remove the legacy file-based permission polling from useSwarmPermissionPoller
that read from ~/.claude/teams/{name}/permissions/resolved/ — an unauthenticated
directory where any local process could forge approval files to auto-approve
tool uses for swarm teammates.

The file polling was dead code:
- The useSwarmPermissionPoller() hook was never mounted by any component
- resolvePermission() (the file writer) was never imported outside its module
- Permission responses are delivered exclusively via the mailbox system:
  Leader: sendPermissionResponseViaMailbox() → writeToMailbox()
  Worker: useInboxPoller → processMailboxPermissionResponse()

Changes:
- Remove file polling loop, processResponse(), and React hook imports from
  useSwarmPermissionPoller.ts (now a pure callback registry module)
- Mark 7 file-based functions as @deprecated in permissionSync.ts
- Add 4 regression tests verifying the removal

No exported functions removed — only deprecated. All 5 consumer modules
verified: they import only mailbox-based functions that remain unchanged.

---------

Co-authored-by: auriti <auriti@users.noreply.github.com>
2026-04-21 18:28:03 +08:00
viudes
a6a3de5ac1 feat(api): compress old tool_result content for small-context providers (#801)
* feat(api): compress old tool_result content for small-context providers

Adds a shim-layer pass that tiers tool_result content by age on
providers with small effective context windows (Copilot gpt-4o 128k,
Mistral, Ollama). Recent turns remain full; mid-tier results are
truncated to 2k chars; older results are replaced with a stub that
preserves tool name and arguments so the model can re-invoke if needed.

Tier sizes auto-tune via getEffectiveContextWindowSize, the same
calculation used by auto-compact. Reuses COMPACTABLE_TOOLS and
TOOL_RESULT_CLEARED_MESSAGE to complement (not duplicate) microCompact.
Configurable via /config toolHistoryCompressionEnabled.

Addresses active-session context accumulation on Copilot where
microCompact's time-based trigger never fires, which surfaces as
"tools appearing in a loop" and prompt_too_long errors after ~15 turns.
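
A condensed sketch of the age-tiering rule; tier boundaries and stub text
are illustrative:

  function compressToolResultByAge(
    text: string, turnsOld: number,
    recentTurns: number, midTurns: number, // auto-tuned from effective window
  ): string {
    if (turnsOld <= recentTurns) return text               // recent: full
    if (turnsOld <= midTurns) return text.slice(0, 2_000)  // mid-tier: 2k chars
    return '[tool result cleared; re-invoke the tool if the output is needed]'
  }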

* fix: config tool history
2026-04-21 17:36:26 +08:00
Juan Camilo Auriti
64582c119d fix: replace discontinued gemini-2.5-pro-preview-03-25 with stable gemini-2.5-pro (#802)
Updates both the model config mappings (configs.ts) and the runtime
fallback in getDefaultOpusModel() (model.ts) so Gemini mode no longer
falls back to the discontinued preview model when GEMINI_MODEL is unset.

Fixes #398
2026-04-21 17:01:33 +08:00
emsanakhchivan
85eab2751e fix(ui): prevent provider manager lag by deferring sync I/O (#803)
ProviderManager was blocking the main thread with synchronous file I/O
on mount (useState initializer), activation (setActiveProviderProfile),
and refresh (getProviderProfiles). This caused noticeable lag on Windows
where disk I/O can be slow due to antivirus scans, NTFS metadata, or
cache misses.

Changes to ProviderManager:
- Deferred initialization: useState now starts empty, loads via queueMicrotask
- Added isInitializing state with loading UI
- refreshProfiles() now defers reads via queueMicrotask
- activateSelectedProvider() now defers writes via queueMicrotask
- Memoized menuOptions array to prevent re-renders during navigation
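
A sketch of the deferral pattern; the loader name is illustrative:

  import { useEffect, useState } from 'react'

  function useProviderProfiles(loadProfilesSync: () => string[]) {
    const [profiles, setProfiles] = useState<string[]>([]) // start empty
    const [isInitializing, setIsInitializing] = useState(true)
    useEffect(() => {
      queueMicrotask(() => {  // move the sync read out of the render path
        setProfiles(loadProfilesSync())
        setIsInitializing(false)
      })
    }, [loadProfilesSync])
    return { profiles, isInitializing }
  }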

Note: ProviderChooser useMemo change was reverted as it's dead code
(ProviderWizard is not used in production - /provider uses ProviderManager).

Co-authored-by: Ali Alakbarli <ali.alakbarli@users.noreply.github.com>
2026-04-21 17:00:58 +08:00
Zartris
4d4fb2880e fix: rename .claude.json to .openclaude.json with legacy fallback (#582)
* fix: rename .claude.json to .openclaude.json with legacy fallback

Rename the global config file from ~/.claude.json to ~/.openclaude.json,
following the same migration pattern as the config directory
(~/.claude → ~/.openclaude).

- getGlobalClaudeFile() now prefers .openclaude.json; falls back to
  .claude.json only if the legacy file exists and the new one does not
- Add .openclaude.json to filesystem permissions allowlist (keep
  .claude.json for legacy file protection)
- Update all comment/string references from ~/.claude.json to
  ~/.openclaude.json across 12 files

New installs get .openclaude.json from the start. Existing users
continue using .claude.json until they rename it (or a future explicit
migration).
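
A sketch of the fallback order:

  import { existsSync } from 'node:fs'
  import { homedir } from 'node:os'
  import { join } from 'node:path'

  function getGlobalClaudeFile(): string {
    const preferred = join(homedir(), '.openclaude.json')
    const legacy = join(homedir(), '.claude.json')
    // legacy wins only when it exists and the new file does not
    if (!existsSync(preferred) && existsSync(legacy)) return legacy
    return preferred
  }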

* test: add unit tests for getGlobalClaudeFile migration branches

Covers the three cases:
- new install (neither file exists) → .openclaude.json
- existing user (only legacy .claude.json exists) → .claude.json
- migrated user (both files exist) → .openclaude.json

---------

Co-authored-by: Zartris <14197299+Zartris@users.noreply.github.com>
2026-04-20 17:13:09 +08:00
Zartris
fdef4a1b4c feat: native Anthropic API mode for Claude models on GitHub Copilot (#579)
* feat: native Anthropic API mode for Claude models on GitHub Copilot

When using Claude models through GitHub Copilot, automatically switch from
the OpenAI-compatible shim to Anthropic's native messages API format.

The Copilot proxy (api.githubcopilot.com) supports Anthropic's native API
for Claude models. This enables cache_control blocks to be sent and
honoured, allowing explicit prompt caching control (as opposed to relying
solely on server-side auto-caching).

Changes:
- Add isGithubNativeAnthropicMode() in providers.ts that auto-enables when
  the resolved model starts with "claude-" and the GitHub provider is active
- Create a native Anthropic client in client.ts using the GitHub base URL
  and Bearer token authentication when native mode is detected
- Enable prompt caching in claude.ts for native GitHub mode so cache_control
  blocks are sent (previously only allowed for firstParty/bedrock/vertex)
- CLAUDE_CODE_GITHUB_ANTHROPIC_API=1 env var to force native mode for any
  model

Benefits:
- Proper Anthropic message format (no lossy OpenAI translation)
- Explicit cache_control blocks for fine-grained caching control
- Potentially better Claude model behaviour with native format

Related: #515

* fix: scope force flag to Claude models and add isGithubNativeAnthropicMode tests

- CLAUDE_CODE_GITHUB_ANTHROPIC_API=1 now returns false for non-Claude models
  (force flag still useful for aliases like 'github:copilot' with no model
  resolved yet, where it returns true when model is empty)
- Add 7 focused tests covering mode detection: off without GitHub provider,
  auto-detect via OPENAI_MODEL and resolvedModel, non-Claude model rejection,
  and force-flag behaviour for claude/non-claude/no-model cases

* fix: detect github:copilot:claude- compound format, remove force flag

OPENAI_MODEL for GitHub Copilot uses the format 'github:copilot:MODEL'
(e.g. 'github:copilot:claude-sonnet-4'), which does not start with 'claude-'.
Auto-detection now handles both bare model names and the compound format.

The CLAUDE_CODE_GITHUB_ANTHROPIC_API force flag is removed: with proper
compound-format detection there is no remaining gap it could fill, and
keeping a broad override flag without a concrete use case invites misuse.

Tests updated to cover the compound format, generic alias (false), and
non-Claude compound model (github:copilot:gpt-4o → false).

* fix: use includes('claude-') for model detection, remove force flag

Detection was broken for the standard GitHub Copilot compound format
'github:copilot:claude-sonnet-4' which does not start with 'claude-'.
Using includes('claude-') handles bare names, compound names, and any
future variants without needing updates.

The CLAUDE_CODE_GITHUB_ANTHROPIC_API force flag is removed as it was
a workaround for the broken detection, not a genuine use case.

---------

Co-authored-by: Zartris <14197299+Zartris@users.noreply.github.com>
2026-04-20 16:34:58 +08:00
nehan
4cb963e660 feat(api): improve local provider reliability with readiness and self-healing (#738)
* feat(api): classify openai-compatible provider failures

* Update src/services/api/providerConfig.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/errors.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* feat(api): harden openai-compatible diagnostics and env fallback

* Update src/services/api/openaiShim.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/openaiShim.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/errors.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/errors.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix openaiShim duplicate requests and diagnostics

* remove unused url from http failure classifier

* dedupe env diagnostic warnings

* Remove hardcoded URLs from OpenAI error tests

Removed hardcoded URLs from network failure classification tests.

* Update providerConfig.envDiagnostics.test.ts

* fix(openai-shim): return successful responses and restore localhost classifier tests

* Update src/services/api/openaiShim.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/openaiShim.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/openaiShim.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* feat(provider): add truthful local generation readiness checks

Implement Phase 2 provider readiness behavior by adding structured Ollama generation probes, wiring setup flows to readiness states, extending system-check with generation readiness output, and updating focused tests.

* feat(api): add local self-healing fallback retries

Implement Phase 3 self-healing behavior for local OpenAI-compatible providers: retry base URL fallbacks for localhost resolution and endpoint mismatches, plus capability-gated toolless retry for tool-incompatible local models; include diagnostics and focused tests.

* fix(api): address review blockers for local provider reliability

* Update src/utils/providerDiscovery.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/openaiShim.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: harden readiness probes and cross-platform test stability

* fix: refresh toolless retry payload and stabilize osc clipboard test

* fix: harden Ollama readiness parsing and redact provider URLs

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-20 16:24:02 +08:00
github-actions[bot]
b09972f223 chore(main): release 0.5.2 (#781)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-20 15:25:42 +08:00
Kevin Codex
336ddcc50d fix(api): replace phrase-based reasoning sanitizer with tag-based filter (#779)
Reasoning models (MiniMax M2.7, GLM-4.5/5, DeepSeek, Kimi K2) inline
chain-of-thought inside <think>...</think> tags in the content field
rather than using the reasoning_content channel. The prior phrase-matching
sanitizer (looksLikeLeakedReasoningPrefix) only caught English-prose
preambles like "I should"/"the user asked", missed tag-based leaks
entirely, and risked false-stripping legitimate assistant output.

Replace with a structural tag-based approach (same pattern as hermes-agent):

- createThinkTagFilter() — streaming state machine that buffers partial
  tags across SSE delta boundaries (<th| + |ink>), so tags split mid-chunk
  still parse correctly.
- stripThinkTags() — whole-text cleanup for non-streaming responses and
  as a safety net. Handles closed pairs, unterminated opens at block
  boundaries, and orphan tags.
- Recognizes think, thinking, reasoning, thought, REASONING_SCRATCHPAD
  case-insensitively, including tags with attributes.
- False-negative bias: flush() discards buffered partial tags at stream
  end rather than leaking them.

Existing phrase-based shim tests updated to exercise the actual <think>
tag leak. Added regression tests confirming legitimate prose starting
with "I should..." is preserved (the old sanitizer's main false-positive).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 15:18:58 +08:00
github-actions[bot]
c0b8a59a23 chore(main): release 0.5.1 (#776)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-20 12:47:40 +08:00
Kevin Codex
aab489055c fix: require trusted approval for sandbox override (#778) 2026-04-20 12:01:44 +08:00
Kevin Codex
7002cb302b fix: enforce Bash path constraints after sandbox allow (#777) 2026-04-20 11:46:24 +08:00
Kevin Codex
739b8d1f40 fix: enforce MCP OAuth callback state before errors (#775) 2026-04-20 09:36:05 +08:00
github-actions[bot]
f166ec1a4e chore(main): release 0.5.0 (#758)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-20 08:30:58 +08:00
Kevin Codex
13e9f22a83 feat: mask provider api key input (#772) 2026-04-20 08:25:22 +08:00
Kevin Codex
f828171ef1 fix: allow provider recovery during startup (#765) 2026-04-20 06:46:05 +08:00
Allan Almeida
e6e8d9a248 feat: add OPENCLAUDE_DISABLE_STRICT_TOOLS env var to opt out of strict MCP tool schema normalization (#770)
When set, disables strict schema normalization for non-Gemini providers.
Useful for OpenAI-compatible endpoints that reject MCP tools with complex
optional params (e.g. list[dict]) with "Extra required key ... supplied"
errors.
2026-04-20 06:45:01 +08:00
Sreedhar Busanelli
2c98be7002 fix: remove cached mcpClient in diagnostic tracking to prevent stale references (#727)
* fix: remove cached mcpClient in diagnostic tracking to prevent stale references

Resolves TODO comment about not caching the connected mcpClient since it can change.

Changes:
- Remove cached mcpClient field from DiagnosticTrackingService
- Add currentMcpClients storage to track active clients
- Update beforeFileEdited, getNewDiagnostics, and ensureFileOpened to accept client parameter
- Add backward-compatible methods to maintain existing API
- Update all callers to use new methods
- Add comprehensive test coverage

This prevents using stale MCP client references during reconnections,
making diagnostic tracking more reliable.

Fixes #TODO

* docs: add my contributions section to README

Add fork-specific section highlighting:
- Diagnostic tracking enhancement (PR #727)
- Technical skills demonstrated
- Links to original project and my work
- Professional contribution showcase

* revert: remove README.md contributions section to comply with reviewer request

- Remove 'My Fork & Contributions' section from README.md
- Keep README.md focused on original project documentation
- Maintain clean, project-focused README as requested by reviewer
2026-04-19 09:02:52 +08:00
3kin0x
b786b765f0 fix(api): drop orphan tool results to satisfy strict role sequence (#745)
* fix(api): drop orphan tool results to satisfy Mistral/OpenAI strict role sequence

* test: add test for orphan tool results and restore gemini comments
2026-04-19 08:57:14 +08:00
bpawnzz
55c5f262a9 fix: use raw context window for auto-compact percentage display (#748)
Problem: After auto-compaction with DeepSeek models (e.g., deepseek-chat),
the status line displayed ~16% remaining until next auto-compact, but users
expected ~30% (since compaction reduces usage to roughly half of the full
128k context).

Root cause: calculateTokenWarningState() used the auto-compaction threshold
(effectiveContextWindow - 13k buffer) as the denominator for percentLeft.
For DeepSeek-chat:
- Raw context: 128,000
- Effective: 119,808 (128k - 8,192 output reservation)
- Threshold: 106,808 (effective - 13k buffer)
At 90k usage:
  - Old: (106,808 - 90k) / 106,808 ≈ 16%
  - Expected: (128,000 - 90k) / 128,000 ≈ 30%

Fix: Change percentLeft calculation to use raw context window from
getContextWindowForModel() as denominator, while keeping threshold-based
warnings/triggers unchanged. This makes the displayed percentage show
remaining capacity relative to the model's full context size.

Impact:
- UI now shows correct % of total context remaining
- Auto-compaction trigger point unchanged (still ~90% of effective window)
- All other threshold calculations unaffected

Testing:
- Manual verification: DeepSeek-chat at 90k tokens shows 30% remaining (was 16%)
- Manual verification: Threshold still triggers at ~106k tokens
- Build succeeds: npm run build
- No breaking changes: Callers only depend on percentLeft for display; threshold logic unchanged

Fixes the user-reported discrepancy for DeepSeek and other OpenAI-compatible models.
2026-04-19 08:55:41 +08:00
Kagura
002a8f1f6d fix(mcp): sync required array with properties in tool schemas (#754)
* fix(mcp): sync required array with properties in tool schemas

MCP servers can emit schemas where the required array contains keys
not present in properties. This causes API 400 errors:
"Extra required key 'X' supplied."

- Add sanitizeSchemaRequired() to filter required arrays
- Apply it to MCP tool inputJSONSchema before sending to API
- Also fix filterSwarmFieldsFromSchema to update required after
  removing properties

Fixes #525
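
A sketch of the sync; the schema type is narrowed for illustration:

  interface JsonSchema { properties?: Record<string, unknown>; required?: string[] }

  function sanitizeSchemaRequired(schema: JsonSchema): JsonSchema {
    if (!schema.required || !schema.properties) return schema
    return {
      ...schema,
      // keep only required keys that actually exist in properties
      required: schema.required.filter(key => key in schema.properties!),
    }
  }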

* test: add MCP schema required sanitization test
2026-04-19 06:44:25 +08:00
dhenuh
3d1979ff06 fix(help): prevent /help tab crash from undefined descriptions (#732)
- Guard formatDescriptionWithSource() so missing command descriptions become ''
- Harden truncate helpers to accept undefined text/path safely
- Add regression tests covering undefined input cases
2026-04-19 06:38:44 +08:00
lunamonke
b0d9fe7112 Provider loading fix (#623)
* add mistral and gemini provider type for profile provider field

* load latest locally selected

* env variables take precedence over json save

* add gemini context windows and fix gemini defaulting for env

* load on startup fix

* fix failing tests

* clarify test message

* fix variable mismatches

* fix failing test

* delete keys and set profile.apiKey for mistral and gemini

* switch model as well when switching provider

* set model when adding a new model
2026-04-18 01:46:20 +08:00
github-actions[bot]
651123db1f chore(main): release 0.4.0 (#704)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-17 19:36:32 +08:00
emsanakhchivan
34246635fb fix(ui): show correct endpoint URL in intro screen for custom Anthropic endpoints (#735)
Previously, the startup intro screen always displayed
'https://api.anthropic.com' as the endpoint for Anthropic provider,
even when a custom endpoint was configured via ANTHROPIC_BASE_URL.

This fix reads ANTHROPIC_BASE_URL from environment and displays the
actual configured endpoint, providing accurate information to users
about where their API requests will be sent (proxy gateways, staging,
custom Anthropic-compatible APIs).

Also adds isLocal detection for local endpoints to show appropriate
visual indicator in the startup banner.

Co-authored-by: Ali Alakbarli <ali.alakbarli@users.noreply.github.com>
2026-04-17 19:06:47 +08:00
regisksc
43ac6dba75 feat: add Alibaba Coding Plan (DashScope) provider support (#509)
* feat: add Alibaba Coding Plan provider presets

* fix: add DashScope presets to ProviderManager UI selection list

* feat: read DASHSCOPE_API_KEY env var for DashScope provider presets

* adds regression testing for alibaba models

* docs: add time descriptive comment

* feat(dashscope): add qwen3.6-plus model support

* fix(dashscope): remove MiniMax-M2.5 entries to prevent future key conflicts
2026-04-17 19:06:21 +08:00
nehan
80a00acc2c feat(api): classify openai-compatible provider failures (#708)
* feat(api): classify openai-compatible provider failures

* Update src/services/api/providerConfig.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/errors.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* feat(api): harden openai-compatible diagnostics and env fallback

* Update src/services/api/openaiShim.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/openaiShim.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/errors.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/errors.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix openaiShim duplicate requests and diagnostics

* remove unused url from http failure classifier

* dedupe env diagnostic warnings

* Remove hardcoded URLs from OpenAI error tests

Removed hardcoded URLs from network failure classification tests.

* Update providerConfig.envDiagnostics.test.ts

* fix(openai-shim): return successful responses and restore localhost classifier tests

* Update src/services/api/openaiShim.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/openaiShim.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/services/api/openaiShim.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-17 18:01:40 +08:00
Andrei Parshin
eed77e6579 fix: prevent crash in commands tab when description is undefined (#730)
This commit fixes a crash in the CLI that occurs when navigating to the
/help commands tab. The issue happens because the truncate function
receives an undefined value for the str parameter if a command lacks a
description, causing the .indexOf() method to throw an exception. To
resolve this, an early return check was added at the beginning of the
function to gracefully handle empty values and prevent the UI from
crashing.
2026-04-17 13:57:40 +08:00
guanjiawei
b280c740a6 fix serialize git worktree mutations and forward teammate PATH (#721) 2026-04-16 21:44:56 +08:00
guanjiawei
2ff5710329 fix retry Codex and OpenAI fetches via proxy-aware helper (#720) 2026-04-16 21:42:14 +08:00
emsanakhchivan
d6f5130c20 fix: focus "Done" option after completing provider manager actions (#718)
When returning to the provider manager menu after completing an action
(add, edit, delete, set active, etc.), the cursor now lands on "Done"
instead of the first option ("Add provider"). This prevents accidental
re-entry into the same action if the user presses Enter quickly.

On initial /provider invocation, the cursor still starts on the first
option ("Add provider") as expected.

Co-authored-by: Ali Alakbarli <ali.alakbarli@users.noreply.github.com>
2026-04-16 21:39:13 +08:00
Rubens Oliveira
d32a2a1329 docs: add Ollama launch integration documentation (#716)
Document the new `ollama launch openclaude` command as a shortcut
for running OpenClaude through a local Ollama instance. This is
now supported in Ollama's launch system and handles all environment
variable setup automatically — no manual env vars needed.

Changes:
- README.md: Add "Using Ollama's launch command" section after the
  manual Ollama env var setup, and update the provider table to
  list `ollama launch` as a setup path for Ollama
- docs/advanced-setup.md: Add `ollama launch` as the recommended
  method at the top of the Ollama section, with the manual env var
  approach kept below as an alternative
2026-04-16 21:23:44 +08:00
henriquepasquini2
fbcd928f7f feat(vscode): add full chat interface to OpenClaude extension (#608)
Add a Claude Code-like chat experience to the VS Code extension with:
- Streaming chat panel (sidebar + editor tab) with markdown rendering
- Tool use visualization with inline diffs (replace/with display)
- Session history browser with JSONL transcript parsing
- Thinking block indicator with elapsed time and token count
- Clickable file paths that open in the editor
- Permission mode setting (acceptEdits default)
- Multi-turn conversation support via NDJSON stream-json protocol
- Status bar with live activity indicators
- Ctrl+Shift+L keybinding to open chat panel

Made-with: Cursor

Co-authored-by: henriquepasquini2 <henriquepasquini2@users.noreply.github.com>
2026-04-16 05:04:31 +08:00
Yakout
77083d769b Fix/MCP exposure v2 TODO's (#675)
* fix: OAuth tokens secure storage for Windows & Linux

* fix(mcp): MCP Tool Re-exposure & Strict Input Validation

Fixes the MCP re-exposure bug by deduplicating tools correctly, validating inputs with Ajv (sketched at the end of this message), and handling structured output (including images). Also disables experimental API betas by default to prevent 500 errors on external accounts.

* fix(mcp): skip official registry prefetch in non-first-party mode

Prevents unnecessary calls to Anthropic's MCP registry when using other API providers.

* fix(cli): disable experimental API betas by default

This prevents 500 errors from Anthropic's API when tool-calling with non-Anthropic accounts or models that don't support certain beta features.

* fix: issues raised in the PR review for #675
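
A rough sketch of the Ajv-based input validation described above (schema handling and error formatting are illustrative, not the actual implementation):

```ts
import Ajv from 'ajv'

const ajv = new Ajv({ allErrors: true })

// Validate a tool call's input against the tool's declared JSON schema
// before dispatch; returns human-readable problems, empty when valid.
function validateToolInput(schema: object, input: unknown): string[] {
  const validate = ajv.compile(schema)
  if (validate(input)) return []
  return (validate.errors ?? []).map((e) => `${e.instancePath} ${e.message}`)
}
```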
2026-04-16 05:03:06 +08:00
emsanakhchivan
b66633ea4d Feat/multi model provider support (#692)
* test: add tests for provider model env updates and multi-model profiles

Add comprehensive tests covering:
- OPENAI_MODEL/ANTHROPIC_MODEL env updates on provider activation
- Cross-provider type switches (openai ↔ anthropic) clearing stale env
- Multi-model profile activation using only the first model for env vars
- Model options cache population from comma-separated model lists
- getProfileModelOptions generating correct ModelOption arrays

* feat: multi-model provider support and model auto-switch

Support comma-separated model names in provider profiles (e.g.
"glm-4.7, glm-4.7-flash"). The first model is used as default on
activation; all models appear in the /model picker for easy switching.

When switching active providers, the session model now automatically
updates to the new provider's first model. The multi-model list is
preserved across switches and /model selections.
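
A minimal sketch of the parsing behavior the utilities below are described as providing (bodies are illustrative; the real helpers live in the provider profile module):

```ts
// Hedged sketch of the comma-separated model-list helpers named in the changes.
function parseModelList(models: string): string[] {
  return models.split(',').map((m) => m.trim()).filter(Boolean)
}

function getPrimaryModel(models: string): string | undefined {
  return parseModelList(models)[0] // first model is the activation default
}

function hasMultipleModels(models: string): boolean {
  return parseModelList(models).length > 1
}

// parseModelList('glm-4.7, glm-4.7-flash') → ['glm-4.7', 'glm-4.7-flash']
// getPrimaryModel('glm-4.7, glm-4.7-flash') → 'glm-4.7'
```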

Changes:
- Add parseModelList, getPrimaryModel, hasMultipleModels utilities
  with full test coverage (19 tests)
- Use getPrimaryModel when applying profiles to process.env so only
  the primary model is set in OPENAI_MODEL/ANTHROPIC_MODEL
- Update ProviderManager UI to hint at multi-model syntax and show
  model count in provider list summaries
- Populate model options cache from multi-model profiles on activation
  so all models appear in /model picker regardless of base URL type
- Guard persistActiveProviderProfileModel against overwriting
  comma-separated lists: models already in the profile are session
  selections, not profile edits
- Set AppState.mainLoopModel to the actual model string on provider
  switch so Anthropic profiles use the configured model instead of
  falling back to the built-in default

* fix: only show profile models when provider profile env is applied

Guard the profile model picker options behind a
PROFILE_ENV_APPLIED check. getActiveProviderProfile() has a
?? profiles[0] fallback that returns the first profile even when
no profile is explicitly active, causing users with inactive
profiles to lose all standard model options (Opus, Haiku, etc.)
from the /model picker.

* fix: show all model names for profiles with 3 or fewer models

Instead of a summary format for multi-model profiles, display all
model names when there are 3 or fewer. Only use the "+ N more"
format for profiles with 4+ models.

* fix: preserve standard model options in picker alongside profile models

The previous implementation used an early return that replaced all
standard picker options (Opus, Haiku, Sonnet for Anthropic; Codex/GPT
models for OpenAI) with only the profile's custom models.

Changes:
- Collect profile models into a shared array instead of early returning
- Append profile models to firstParty path (Opus + Haiku + Sonnet + custom)
- Append profile models to PAYG 3P path (Codex + Sonnet + Opus + Haiku + custom)
- Guard collection behind PROFILE_ENV_APPLIED to avoid ?? profiles[0] fallback

Fixes review feedback: standard models are no longer hidden when a
provider profile with custom models is active. Users see both the
standard options and their profile's models.

---------

Co-authored-by: Ali Alakbarli <ali.alakbarli@users.noreply.github.com>
2026-04-16 05:01:55 +08:00
ArkhAngelLifeJiggy
51191d6132 feat: add NVIDIA NIM and MiniMax provider support (#552)
* feat: add NVIDIA NIM and MiniMax provider support

- Add nvidia-nim and minimax to --provider CLI flag
- Add model discovery for NVIDIA NIM (160+ models) and MiniMax
- Update /model picker to show provider-specific models
- Fix provider detection in startup banner
- Update .env.example with new provider options

Supported providers:
- NVIDIA NIM: https://integrate.api.nvidia.com/v1
- MiniMax: https://api.minimax.io/v1

* fix: resolve conflict in StartupScreen (keep NVIDIA/MiniMax + add Codex detection)

* fix: resolve providerProfile conflict (add imports from main, keep NVIDIA/MiniMax)

* fix: revert providerSecrets to match main (NVIDIA/MiniMax handled elsewhere)

* fix: add context window entries for NVIDIA NIM and new MiniMax models

* fix: use GLM-5 as NVIDIA NIM default and MiniMax-M2.5 for consistency

* fix: address remaining review items - add GLM/Kimi context entries, max output tokens, fix .env.example, revert to Nemotron default

* fix: filter NVIDIA NIM picker to chat/instruct models only, set provider-specific API keys from saved profiles

* chore: add more NVIDIA NIM context window entries for popular models

* fix: address remaining non-blocking items - fix base model, clear provider API keys on profile switch
2026-04-15 20:26:13 +08:00
Jeevan Mohan Pawar
6b2121da12 fix(models): prevent /models crash from non-string saved model values (#691)
* fix(models): guard GitHub default model setting against non-string values

* test(models): avoid brittle GitHub default assertion in model guard test
2026-04-15 19:47:02 +08:00
dhenuh
c207cdbdcc ci: skip release-please on fork repositories (#701) 2026-04-15 19:46:39 +08:00
Nourrisse Florian
a00b7928de fix: strip comments before scanning for missing imports (#676)
* fix: strip comments before scanning for missing imports

The scanForMissingImports regex matched require() and import() patterns
inside JSDoc comments, causing false-positive missing module detection.
A documented path like `require('./commands/proactive.js')` in a comment
was resolved from the wrong directory, marked as missing, then the global
onResolve handler intercepted ALL imports of that specifier — including
valid ones — replacing them with truthy noop stubs that broke runtime.

Strip block (/* */) and line (//) comments from source before scanning.

* fix: repair 10 pre-existing test failures

- promptIdentity.test.ts: define MACRO global (ISSUES_EXPLAINER etc.)
  for test mode where Bun.define build-time replacements aren't active
- context.test.ts: clear OPENAI_MODEL env var in each test — the user's
  environment (e.g. OPENAI_MODEL=github_copilot/gpt-5.4) polluted the
  provider-qualified lookup, returning wrong context windows
- openclaudePaths.test.ts: set CLAUDE_CONFIG_DIR to force .openclaude
  path when ~/.openclaude doesn't exist on the test machine
2026-04-15 19:42:26 +08:00
3kin0x
12dd3755c6 feat: add ripgrep to Dockerfile for faster file searching (#688) 2026-04-15 19:42:06 +08:00
dhenuh
114f772a4a tests: avoid global fetch mutation in GitHub device flow tests (#702) 2026-04-15 19:38:46 +08:00
Kevin Codex
7187fc007a docs: add Star History chart to README (#686)
Co-authored-by: OpenClaude <openclaude@gitlawb.com>
2026-04-15 02:38:05 +08:00
Fexiven
0ed50ccfe7 Fix Docker deployment (#685)
* feat: add Docker image build and push to GHCR on release

Add Dockerfile (multi-stage build with node:22-slim) and a new docker
job in the release workflow that builds and pushes to ghcr.io when
release-please creates a tag.

* feat(docker): run as non-root user and add smoke test

Run the container as a non-root appuser to reduce blast radius.
Add a smoke test step that runs --version before pushing to GHCR.

* fix(docker): use existing node user instead of creating appuser

Closes #681
2026-04-15 01:22:08 +08:00
221 changed files with 20039 additions and 3771 deletions

View File

@@ -149,6 +149,23 @@ ANTHROPIC_API_KEY=sk-ant-your-key-here
# Use a custom OpenAI-compatible endpoint (optional — defaults to api.openai.com)
# OPENAI_BASE_URL=https://api.openai.com/v1
# Fallback context window size (tokens) when the model is not found in the
# built-in table (default: 128000). Increase this for models with larger
# context windows (e.g. 200000 for Claude-sized contexts).
# CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW=128000
# Per-model context window overrides as a JSON object.
# Takes precedence over the built-in table, so you can register new or
# custom models without patching source.
# Example: CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS={"my-corp/llm-v3":262144,"gpt-4o-mini":128000}
# CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS=
# Per-model maximum output token overrides as a JSON object.
# Use this alongside CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS when your model
# supports a different output limit than what the built-in table specifies.
# Example: CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS={"my-corp/llm-v3":8192}
# CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS=
# -----------------------------------------------------------------------------
# Option 3: Google Gemini
@@ -225,6 +242,30 @@ ANTHROPIC_API_KEY=sk-ant-your-key-here
# GOOGLE_CLOUD_PROJECT=your-gcp-project-id
# -----------------------------------------------------------------------------
# Option 9: NVIDIA NIM
# -----------------------------------------------------------------------------
# NVIDIA NIM provides hosted inference endpoints for NVIDIA models.
# Get your API key from https://build.nvidia.com/
#
# CLAUDE_CODE_USE_OPENAI=1
# NVIDIA_API_KEY=nvapi-your-key-here
# OPENAI_BASE_URL=https://integrate.api.nvidia.com/v1
# OPENAI_MODEL=nvidia/llama-3.1-nemotron-70b-instruct
# -----------------------------------------------------------------------------
# Option 10: MiniMax
# -----------------------------------------------------------------------------
# MiniMax API provides text generation models.
# Get your API key from https://platform.minimax.io/
#
# CLAUDE_CODE_USE_OPENAI=1
# MINIMAX_API_KEY=your-minimax-key-here
# OPENAI_BASE_URL=https://api.minimax.io/v1
# OPENAI_MODEL=MiniMax-M2.5
# =============================================================================
# OPTIONAL TUNING
# =============================================================================
@@ -243,6 +284,16 @@ ANTHROPIC_API_KEY=sk-ant-your-key-here
# Disable "Co-authored-by" line in git commits made by OpenClaude
# OPENCLAUDE_DISABLE_CO_AUTHORED_BY=1
# Disable strict tool schema normalization for non-Gemini providers
# Useful when MCP tools with complex optional params (e.g. list[dict])
# trigger "Extra required key ... supplied" errors from OpenAI-compatible endpoints
# OPENCLAUDE_DISABLE_STRICT_TOOLS=1
# Disable hidden <system-reminder> messages injected into tool output
# Suppresses the file-read cyber-risk reminder and the todo/task tool nudges
# Useful for users who want full transparency over what the model sees
# OPENCLAUDE_DISABLE_TOOL_REMINDERS=1
# Custom timeout for API requests in milliseconds (default: varies)
# API_TIMEOUT_MS=60000

View File

@@ -11,6 +11,7 @@ concurrency:
jobs:
release-please:
if: ${{ github.repository == 'Gitlawb/openclaude' }}
name: Release Please
runs-on: ubuntu-latest
permissions:

.gitignore (2 changed lines)
View File

@@ -7,6 +7,8 @@ dist/
.openclaude-profile.json
reports/
GEMINI.md
CLAUDE.md
package-lock.json
/.claude
coverage/
agent.log

View File

@@ -1,3 +1,3 @@
{
".": "0.3.0"
".": "0.6.0"
}

View File

@@ -1,5 +1,87 @@
# Changelog
## [0.6.0](https://github.com/Gitlawb/openclaude/compare/v0.5.2...v0.6.0) (2026-04-22)
### Features
* add model caching and benchmarking utilities ([#671](https://github.com/Gitlawb/openclaude/issues/671)) ([2b15e16](https://github.com/Gitlawb/openclaude/commit/2b15e16421f793f954a92c53933a07094544b29d))
* add thinking token extraction ([#798](https://github.com/Gitlawb/openclaude/issues/798)) ([268c039](https://github.com/Gitlawb/openclaude/commit/268c0398e4bf1ab898069c61500a2b3c226a0322))
* **api:** compress old tool_result content for small-context providers ([#801](https://github.com/Gitlawb/openclaude/issues/801)) ([a6a3de5](https://github.com/Gitlawb/openclaude/commit/a6a3de5ac155fe9d00befbfcab98d439314effd8))
* **api:** improve local provider reliability with readiness and self-healing ([#738](https://github.com/Gitlawb/openclaude/issues/738)) ([4cb963e](https://github.com/Gitlawb/openclaude/commit/4cb963e660dbd6ee438c04042700db05a9d32c59))
* **api:** smart model routing primitive (cheap-for-simple, strong-for-hard) ([#785](https://github.com/Gitlawb/openclaude/issues/785)) ([e908864](https://github.com/Gitlawb/openclaude/commit/e908864da7e7c987a98053ac5d18d702e192db2b))
* enable 15 additional feature flags in open build ([#667](https://github.com/Gitlawb/openclaude/issues/667)) ([6a62e3f](https://github.com/Gitlawb/openclaude/commit/6a62e3ff76ba9ba446b8e20cf2bb139ee76a9387))
* native Anthropic API mode for Claude models on GitHub Copilot ([#579](https://github.com/Gitlawb/openclaude/issues/579)) ([fdef4a1](https://github.com/Gitlawb/openclaude/commit/fdef4a1b4ce218ded4937ca83b30acce7c726472))
* **provider:** expose Atomic Chat in /provider picker with autodetect ([#810](https://github.com/Gitlawb/openclaude/issues/810)) ([ee19159](https://github.com/Gitlawb/openclaude/commit/ee19159c17b3de3b4a8b4a4541a6569f4261d54e))
* **provider:** zero-config autodetection primitive ([#784](https://github.com/Gitlawb/openclaude/issues/784)) ([a5bfcbb](https://github.com/Gitlawb/openclaude/commit/a5bfcbbadf8e9a1fd42f3e103d295524b8da64b0))
### Bug Fixes
* **api:** ensure strict role sequence and filter empty assistant messages after interruption ([#745](https://github.com/Gitlawb/openclaude/issues/745) regression) ([#794](https://github.com/Gitlawb/openclaude/issues/794)) ([06e7684](https://github.com/Gitlawb/openclaude/commit/06e7684eb56df8e694ac784575e163641931c44c))
* Collapse all-text arrays to string for DeepSeek compatibility ([#806](https://github.com/Gitlawb/openclaude/issues/806)) ([761924d](https://github.com/Gitlawb/openclaude/commit/761924daa7e225fe8acf41651408c7cae639a511))
* **model:** codex/nvidia-nim/minimax now read OPENAI_MODEL env ([#815](https://github.com/Gitlawb/openclaude/issues/815)) ([4581208](https://github.com/Gitlawb/openclaude/commit/458120889f6ce54cc9f0b287461d5e38eae48a20))
* **provider:** saved profile ignored when stale CLAUDE_CODE_USE_* in shell ([#807](https://github.com/Gitlawb/openclaude/issues/807)) ([13de4e8](https://github.com/Gitlawb/openclaude/commit/13de4e85df7f5fadc8cd15a76076374dc112360b))
* rename .claude.json to .openclaude.json with legacy fallback ([#582](https://github.com/Gitlawb/openclaude/issues/582)) ([4d4fb28](https://github.com/Gitlawb/openclaude/commit/4d4fb2880e4d0e3a62d8715e1ec13d932e736279))
* replace discontinued gemini-2.5-pro-preview-03-25 with stable gemini-2.5-pro ([#802](https://github.com/Gitlawb/openclaude/issues/802)) ([64582c1](https://github.com/Gitlawb/openclaude/commit/64582c119d5d0278195271379da4a68d59a89c1f)), closes [#398](https://github.com/Gitlawb/openclaude/issues/398)
* **security:** harden project settings trust boundary + MCP sanitization ([#789](https://github.com/Gitlawb/openclaude/issues/789)) ([ae3b723](https://github.com/Gitlawb/openclaude/commit/ae3b723f3b297b49925cada4728f3174aee8bf12))
* **test:** autoCompact floor assertion is flag-sensitive ([#816](https://github.com/Gitlawb/openclaude/issues/816)) ([c13842e](https://github.com/Gitlawb/openclaude/commit/c13842e91c7227246520955de6ae0636b30def9a))
* **ui:** prevent provider manager lag by deferring sync I/O ([#803](https://github.com/Gitlawb/openclaude/issues/803)) ([85eab27](https://github.com/Gitlawb/openclaude/commit/85eab2751e7d351bb0ed6a3fe0e15461d241c9cb))
## [0.5.2](https://github.com/Gitlawb/openclaude/compare/v0.5.1...v0.5.2) (2026-04-20)
### Bug Fixes
* **api:** replace phrase-based reasoning sanitizer with tag-based filter ([#779](https://github.com/Gitlawb/openclaude/issues/779)) ([336ddcc](https://github.com/Gitlawb/openclaude/commit/336ddcc50d59d79ebff50993f2673652aecb0d7d))
## [0.5.1](https://github.com/Gitlawb/openclaude/compare/v0.5.0...v0.5.1) (2026-04-20)
### Bug Fixes
* enforce Bash path constraints after sandbox allow ([#777](https://github.com/Gitlawb/openclaude/issues/777)) ([7002cb3](https://github.com/Gitlawb/openclaude/commit/7002cb302b78ea2a19da3f26226de24e2903fa1d))
* enforce MCP OAuth callback state before errors ([#775](https://github.com/Gitlawb/openclaude/issues/775)) ([739b8d1](https://github.com/Gitlawb/openclaude/commit/739b8d1f40fde0e401a5cbd2b9a55d88bd5124ad))
* require trusted approval for sandbox override ([#778](https://github.com/Gitlawb/openclaude/issues/778)) ([aab4890](https://github.com/Gitlawb/openclaude/commit/aab489055c53dd64369414116fe93226d2656273))
## [0.5.0](https://github.com/Gitlawb/openclaude/compare/v0.4.0...v0.5.0) (2026-04-20)
### Features
* add OPENCLAUDE_DISABLE_STRICT_TOOLS env var to opt out of strict MCP tool schema normalization ([#770](https://github.com/Gitlawb/openclaude/issues/770)) ([e6e8d9a](https://github.com/Gitlawb/openclaude/commit/e6e8d9a24897e4c9ef08b72df20fabbf8ef27f38))
* mask provider api key input ([#772](https://github.com/Gitlawb/openclaude/issues/772)) ([13e9f22](https://github.com/Gitlawb/openclaude/commit/13e9f22a83a2b0f85f557b1e12c9442ba61241e4))
### Bug Fixes
* allow provider recovery during startup ([#765](https://github.com/Gitlawb/openclaude/issues/765)) ([f828171](https://github.com/Gitlawb/openclaude/commit/f828171ef1ab94e2acf73a28a292799e4e26cc0d))
* **api:** drop orphan tool results to satisfy strict role sequence ([#745](https://github.com/Gitlawb/openclaude/issues/745)) ([b786b76](https://github.com/Gitlawb/openclaude/commit/b786b765f01f392652eaf28ed3579a96b7260a53))
* **help:** prevent /help tab crash from undefined descriptions ([#732](https://github.com/Gitlawb/openclaude/issues/732)) ([3d1979f](https://github.com/Gitlawb/openclaude/commit/3d1979ff066db32415e0c8321af916d81f5f2621))
* **mcp:** sync required array with properties in tool schemas ([#754](https://github.com/Gitlawb/openclaude/issues/754)) ([002a8f1](https://github.com/Gitlawb/openclaude/commit/002a8f1f6de2fcfc917165d828501d3047bad61f))
* remove cached mcpClient in diagnostic tracking to prevent stale references ([#727](https://github.com/Gitlawb/openclaude/issues/727)) ([2c98be7](https://github.com/Gitlawb/openclaude/commit/2c98be700274a4241963b5f43530bf3bd8f8963f))
* use raw context window for auto-compact percentage display ([#748](https://github.com/Gitlawb/openclaude/issues/748)) ([55c5f26](https://github.com/Gitlawb/openclaude/commit/55c5f262a9a5a8be0aa9ae8dc6c7dafc465eb2c6))
## [0.4.0](https://github.com/Gitlawb/openclaude/compare/v0.3.0...v0.4.0) (2026-04-17)
### Features
* add Alibaba Coding Plan (DashScope) provider support ([#509](https://github.com/Gitlawb/openclaude/issues/509)) ([43ac6db](https://github.com/Gitlawb/openclaude/commit/43ac6dba75537282da1e2ad8f855082bc4e25f1e))
* add NVIDIA NIM and MiniMax provider support ([#552](https://github.com/Gitlawb/openclaude/issues/552)) ([51191d6](https://github.com/Gitlawb/openclaude/commit/51191d61326e1f8319d70b3a3c0d9229e185a564))
* add ripgrep to Dockerfile for faster file searching ([#688](https://github.com/Gitlawb/openclaude/issues/688)) ([12dd375](https://github.com/Gitlawb/openclaude/commit/12dd3755c619cc27af3b151ae8fdb9d425a7b9a2))
* **api:** classify openai-compatible provider failures ([#708](https://github.com/Gitlawb/openclaude/issues/708)) ([80a00ac](https://github.com/Gitlawb/openclaude/commit/80a00acc2c6dc4657a78de7366f7a9ebc920bfbb))
* **vscode:** add full chat interface to OpenClaude extension ([#608](https://github.com/Gitlawb/openclaude/issues/608)) ([fbcd928](https://github.com/Gitlawb/openclaude/commit/fbcd928f7f8511da795aea3ad318bddf0ab9a1a7))
### Bug Fixes
* focus "Done" option after completing provider manager actions ([#718](https://github.com/Gitlawb/openclaude/issues/718)) ([d6f5130](https://github.com/Gitlawb/openclaude/commit/d6f5130c204d8ffe582212466768706cd7fd6774))
* **models:** prevent /models crash from non-string saved model values ([#691](https://github.com/Gitlawb/openclaude/issues/691)) ([6b2121d](https://github.com/Gitlawb/openclaude/commit/6b2121da12189fa7ce1f33394d18abd24cf8a01b))
* prevent crash in commands tab when description is undefined ([#730](https://github.com/Gitlawb/openclaude/issues/730)) ([eed77e6](https://github.com/Gitlawb/openclaude/commit/eed77e6579866a98384dcc948a0ad6406614ede3))
* strip comments before scanning for missing imports ([#676](https://github.com/Gitlawb/openclaude/issues/676)) ([a00b792](https://github.com/Gitlawb/openclaude/commit/a00b7928de9662ffb7ef6abd8cd040afe6f4f122))
* **ui:** show correct endpoint URL in intro screen for custom Anthropic endpoints ([#735](https://github.com/Gitlawb/openclaude/issues/735)) ([3424663](https://github.com/Gitlawb/openclaude/commit/34246635fb9a09499047a52e7f96ca9b36c8a85a))
## [0.3.0](https://github.com/Gitlawb/openclaude/compare/v0.2.3...v0.3.0) (2026-04-14)

View File

@@ -36,14 +36,11 @@ COPY --from=build /app/node_modules/ node_modules/
COPY --from=build /app/package.json package.json
COPY README.md ./
# Install git — many CLI tool operations depend on it
RUN apt-get update && apt-get install -y --no-install-recommends git \
# Install git and ripgrep — many CLI tool operations depend on them
RUN apt-get update && apt-get install -y --no-install-recommends git ripgrep \
&& rm -rf /var/lib/apt/lists/*
# Run as non-root user
RUN groupadd --gid 1000 appuser && useradd --uid 1000 --gid appuser --shell /bin/bash --create-home appuser
USER appuser
WORKDIR /home/appuser
ENV HOME=/home/appuser
USER node
ENTRYPOINT ["node", "/app/dist/cli.mjs"]

View File

@@ -15,6 +15,10 @@ OpenClaude is also mirrored to GitLawb:
[Quick Start](#quick-start) | [Setup Guides](#setup-guides) | [Providers](#supported-providers) | [Source Build](#source-build-and-local-development) | [VS Code Extension](#vs-code-extension) | [Community](#community)
## Star History
[![Star History Chart](https://api.star-history.com/chart?repos=gitlawb/openclaude&type=date&legend=top-left)](https://www.star-history.com/?repos=gitlawb%2Fopenclaude&type=date&legend=top-left)
## Why OpenClaude
- Use one CLI across cloud APIs and local model backends
@@ -88,6 +92,16 @@ $env:OPENAI_MODEL="qwen2.5-coder:7b"
openclaude
```
### Using Ollama's launch command
If you have [Ollama](https://ollama.com) installed, you can skip the env var setup entirely:
```bash
ollama launch openclaude --model qwen2.5-coder:7b
```
This automatically sets `ANTHROPIC_BASE_URL`, model routing, and auth so all API traffic goes through your local Ollama instance. Works with any model you have pulled — local or cloud.
## Setup Guides
Beginner-friendly guides:
@@ -110,8 +124,8 @@ Advanced and source-build guides:
| GitHub Models | `/onboard-github` | Interactive onboarding with saved credentials |
| Codex OAuth | `/provider` | Opens ChatGPT sign-in in your browser and stores Codex credentials securely |
| Codex | `/provider` | Uses existing Codex CLI auth, OpenClaude secure storage, or env credentials |
| Ollama | `/provider` or env vars | Local inference with no API key |
| Atomic Chat | advanced setup | Local Apple Silicon backend |
| Ollama | `/provider`, env vars, or `ollama launch` | Local inference with no API key |
| Atomic Chat | `/provider`, env vars, or `bun run dev:atomic-chat` | Local Model Provider; auto-detects loaded models |
| Bedrock / Vertex / Foundry | env vars | Additional provider integrations for supported environments |
## What Works
@@ -317,7 +331,8 @@ For larger changes, open an issue first so the scope is clear before implementat
- `bun run build`
- `bun run test:coverage`
- `bun run smoke`
- focused `bun test ...` runs for touched areas
- focused `bun test ...` runs for files and flows you changed
## Disclaimer

View File

@@ -84,6 +84,16 @@ OpenRouter model availability changes over time. If a model stops working, try a
### Ollama
Using `ollama launch` (recommended if you have Ollama installed):
```bash
ollama launch openclaude --model llama3.3:70b
```
This handles all environment setup automatically — no env vars needed. Works with any local or cloud model available in your Ollama instance.
Using environment variables manually:
```bash
ollama pull llama3.3:70b

docs/hook-chains.md (new file, 333 lines)
View File

@@ -0,0 +1,333 @@
# Hook Chains (Self-Healing Agent Mesh MVP)
Hook Chains provide an event-driven recovery layer for important workflow failures.
When a matching hook event occurs, OpenClaude evaluates declarative rules and can dispatch remediation actions such as:
- `spawn_fallback_agent`
- `notify_team`
- `warm_remote_capacity`
## Disabled-By-Default Rollout
> **Rollout recommendation:** keep Hook Chains disabled until you validate rules in your environment.
>
> - Set top-level config to `"enabled": false` initially.
> - Enable per environment when ready.
> - Dispatch is gated by `feature('HOOK_CHAINS')`.
> - Env gate defaults to off unless `CLAUDE_CODE_ENABLE_HOOK_CHAINS=1` is set.
This keeps existing workflows unchanged while you tune guard windows and action behavior.
## Feature Overview
Hook Chains are loaded from a deterministic config file and evaluated on dispatched hook events.
MVP runtime trigger wiring:
- `PostToolUseFailure` hooks dispatch Hook Chains with outcome `failed`.
- `TaskCompleted` hooks dispatch Hook Chains with outcome:
- `success` when completion hooks did not block.
- `failed` when completion hooks returned blocking errors or prevented continuation.
Default config path:
- `.openclaude/hook-chains.json`
Override path:
- `CLAUDE_CODE_HOOK_CHAINS_CONFIG_PATH=/abs/or/relative/path/to/hook-chains.json`
Global gate:
- `feature('HOOK_CHAINS')` must be enabled in the build
- `CLAUDE_CODE_ENABLE_HOOK_CHAINS=0|1` (defaults to disabled when unset)
## Safety Guarantees
The runtime is intentionally conservative:
- **Depth guard:** chain dispatch is blocked when `chainDepth >= maxChainDepth`.
- **Rule cooldown:** each rule can only re-fire after cooldown expires.
- **Dedup window:** identical event/action combinations are suppressed for a window.
- **Abort-safe behavior:** if the current signal is aborted, actions skip safely.
- **Policy-aware remote warm:** `warm_remote_capacity` skips when remote sessions are denied by policy.
- **Bridge inactive no-op:** `warm_remote_capacity` safely skips when no active bridge handle exists.
- **Missing team context safety:** `notify_team` skips with structured reason if no team context/team file is available.
- **Fallback launcher safety:** `spawn_fallback_agent` fails with a structured reason when launch permissions/context are unavailable.
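As a mental model, the cooldown and dedup guards behave roughly like this (illustrative shapes, not the actual runtime types):
```ts
// Hedged sketch of the rule-cooldown and action-dedup guards described above.
type GuardState = {
  lastFiredByRule: Map<string, number>   // ruleId → last fire timestamp (ms)
  lastDispatchByKey: Map<string, number> // event/action key → last dispatch (ms)
}

function ruleMayFire(state: GuardState, ruleId: string, cooldownMs: number, now: number): boolean {
  const last = state.lastFiredByRule.get(ruleId)
  if (last !== undefined && now - last < cooldownMs) return false // "rule cooldown active"
  state.lastFiredByRule.set(ruleId, now)
  return true
}

function isDuplicateDispatch(state: GuardState, key: string, dedupWindowMs: number, now: number): boolean {
  const last = state.lastDispatchByKey.get(key)
  if (last !== undefined && now - last < dedupWindowMs) return true // "dedup window active"
  state.lastDispatchByKey.set(key, now)
  return false
}
```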
## Configuration Schema Reference
Top-level object:
```json
{
"version": 1,
"enabled": true,
"maxChainDepth": 2,
"defaultCooldownMs": 30000,
"defaultDedupWindowMs": 30000,
"rules": []
}
```
### Top-Level Fields
| Field | Type | Required | Notes |
|---|---|---:|---|
| `version` | `1` | No | Defaults to `1`. |
| `enabled` | `boolean` | No | Global feature switch for this config file. |
| `maxChainDepth` | `integer` | No | Global depth guard (default `2`, max `10`). |
| `defaultCooldownMs` | `integer` | No | Default rule cooldown in ms (default `30000`). |
| `defaultDedupWindowMs` | `integer` | No | Default action dedup window in ms (default `30000`). |
| `rules` | `HookChainRule[]` | No | Defaults to `[]`. May be omitted or empty; when no rules are present, dispatch is a no-op and returns `enabled: false`. |
> **Note:** An empty ruleset is valid and can be used to keep Hook Chains configured but effectively disabled until rules are added.
### Rule Object (`HookChainRule`)
```json
{
"id": "task-failure-recovery",
"enabled": true,
"trigger": {
"event": "TaskCompleted",
"outcome": "failed"
},
"condition": {
"toolNames": ["Edit"],
"taskStatuses": ["failed"],
"errorIncludes": ["timeout", "permission denied"],
"eventFieldEquals": {
"meta.source": "scheduler"
}
},
"cooldownMs": 60000,
"dedupWindowMs": 30000,
"maxDepth": 2,
"actions": []
}
```
| Field | Type | Required | Notes |
|---|---|---:|---|
| `id` | `string` | Yes | Stable identifier used in telemetry/guards. |
| `enabled` | `boolean` | No | Per-rule switch. |
| `trigger.event` | `HookEvent` | Yes | Event name to match. |
| `trigger.outcome` | `"success"\|"failed"\|"timeout"\|"unknown"` | No | Single outcome matcher. |
| `trigger.outcomes` | `Outcome[]` | No | Multi-outcome matcher. Use either `outcome` or `outcomes`. |
| `condition` | `object` | No | Optional extra matching constraints. |
| `cooldownMs` | `integer` | No | Overrides global cooldown for this rule. |
| `dedupWindowMs` | `integer` | No | Overrides global dedup for this rule. |
| `maxDepth` | `integer` | No | Per-rule depth cap. |
| `actions` | `HookChainAction[]` | Yes | One or more actions to execute in order. |
### Condition Fields
| Field | Type | Notes |
|---|---|---|
| `toolNames` | `string[]` | Matches `tool_name` / `toolName` in event payload. |
| `taskStatuses` | `string[]` | Matches `task_status` / `taskStatus` / `status`. |
| `errorIncludes` | `string[]` | Case-insensitive substring match against `error` / `reason` / `message`. |
| `eventFieldEquals` | `Record<string, string\|number\|boolean>` | Dot-path equality against payload (example: `"meta.source": "scheduler"`). |
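For reference, `eventFieldEquals` dot-path matching behaves roughly like this (illustrative sketch; the actual matcher may differ):
```ts
// Hedged sketch of dot-path lookup for eventFieldEquals.
function getByDotPath(payload: unknown, path: string): unknown {
  let current: unknown = payload
  for (const key of path.split('.')) {
    if (current === null || typeof current !== 'object') return undefined
    current = (current as Record<string, unknown>)[key]
  }
  return current
}

// getByDotPath({ meta: { source: 'scheduler' } }, 'meta.source') === 'scheduler'
```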
### Actions
#### `spawn_fallback_agent`
```json
{
"type": "spawn_fallback_agent",
"id": "fallback-1",
"enabled": true,
"dedupWindowMs": 30000,
"description": "Fallback recovery for failed task",
"promptTemplate": "Recover task ${TASK_SUBJECT}. Event=${EVENT_NAME}, outcome=${OUTCOME}, error=${ERROR}. Payload=${PAYLOAD_JSON}",
"agentType": "general-purpose",
"model": "sonnet"
}
```
#### `notify_team`
```json
{
"type": "notify_team",
"id": "notify-ops",
"enabled": true,
"dedupWindowMs": 30000,
"teamName": "mesh-team",
"recipients": ["*"],
"summary": "Hook chain ${RULE_ID} fired",
"messageTemplate": "Event=${EVENT_NAME} outcome=${OUTCOME}\nTask=${TASK_ID}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}"
}
```
#### `warm_remote_capacity`
```json
{
"type": "warm_remote_capacity",
"id": "warm-bridge",
"enabled": true,
"dedupWindowMs": 60000,
"createDefaultEnvironmentIfMissing": false
}
```
## Complete Example Configs
### 1) Retry via Fallback Agent
```json
{
"version": 1,
"enabled": true,
"maxChainDepth": 2,
"defaultCooldownMs": 30000,
"defaultDedupWindowMs": 30000,
"rules": [
{
"id": "retry-task-via-fallback",
"trigger": {
"event": "TaskCompleted",
"outcome": "failed"
},
"cooldownMs": 60000,
"actions": [
{
"type": "spawn_fallback_agent",
"id": "spawn-retry-agent",
"description": "Retry failed task with fallback agent",
"promptTemplate": "A task failed. Recover it safely.\nTask=${TASK_SUBJECT}\nDescription=${TASK_DESCRIPTION}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}",
"agentType": "general-purpose",
"model": "sonnet"
}
]
}
]
}
```
### 2) Notify Only
```json
{
"version": 1,
"enabled": true,
"maxChainDepth": 2,
"defaultCooldownMs": 30000,
"defaultDedupWindowMs": 30000,
"rules": [
{
"id": "notify-on-tool-failure",
"trigger": {
"event": "PostToolUseFailure",
"outcome": "failed"
},
"condition": {
"toolNames": ["Edit", "Write", "Bash"]
},
"actions": [
{
"type": "notify_team",
"id": "notify-team-failure",
"recipients": ["*"],
"summary": "Tool failure detected",
"messageTemplate": "Tool failure detected.\nEvent=${EVENT_NAME} outcome=${OUTCOME}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}"
}
]
}
]
}
```
### 3) Combined Fallback + Notify + Bridge Warm
```json
{
"version": 1,
"enabled": true,
"maxChainDepth": 2,
"defaultCooldownMs": 45000,
"defaultDedupWindowMs": 30000,
"rules": [
{
"id": "full-recovery-chain",
"trigger": {
"event": "TaskCompleted",
"outcomes": ["failed", "timeout"]
},
"condition": {
"errorIncludes": ["timeout", "capacity", "connection"]
},
"cooldownMs": 90000,
"actions": [
{
"type": "spawn_fallback_agent",
"id": "fallback-agent",
"description": "Recover failed task execution",
"promptTemplate": "Recover failed task and produce a concise fix summary.\nTask=${TASK_SUBJECT}\nError=${ERROR}\nPayload=${PAYLOAD_JSON}"
},
{
"type": "notify_team",
"id": "notify-team",
"recipients": ["*"],
"summary": "Recovery chain triggered",
"messageTemplate": "Recovery chain ${RULE_ID} fired.\nOutcome=${OUTCOME}\nTask=${TASK_SUBJECT}\nError=${ERROR}"
},
{
"type": "warm_remote_capacity",
"id": "warm-capacity",
"createDefaultEnvironmentIfMissing": false
}
]
}
]
}
```
## Template Variables
The following placeholders are supported by `promptTemplate`, `summary`, and `messageTemplate`:
- `${EVENT_NAME}`
- `${OUTCOME}`
- `${RULE_ID}`
- `${TASK_SUBJECT}`
- `${TASK_DESCRIPTION}`
- `${TASK_ID}`
- `${ERROR}`
- `${PAYLOAD_JSON}`
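Substitution can be pictured as a simple placeholder expansion (illustrative sketch; the real renderer may treat missing values or escaping differently):
```ts
// Hedged sketch of ${VAR} expansion over the placeholders listed above.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\$\{([A-Z_]+)\}/g, (whole: string, name: string) =>
    Object.hasOwn(vars, name) ? vars[name] : whole, // unknown placeholders left as-is (assumption)
  )
}

renderTemplate('Event=${EVENT_NAME} outcome=${OUTCOME}', {
  EVENT_NAME: 'TaskCompleted',
  OUTCOME: 'failed',
}) // → 'Event=TaskCompleted outcome=failed'
```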
## Troubleshooting
### Rule never triggers
- Verify `trigger.event` and `trigger.outcome`/`trigger.outcomes` exactly match dispatched event data.
- Check `condition` filters (especially `toolNames` and `eventFieldEquals` dot-path keys).
- Confirm the config file is valid JSON and schema-valid.
### Actions show as skipped
Common skip reasons:
- `action disabled`
- `rule cooldown active ...`
- `dedup window active ...`
- `max chain depth reached ...`
- `No team context is available ...`
- `Team file not found ...`
- `Remote sessions are blocked by policy`
- `Bridge is not active; warm_remote_capacity is a safe no-op`
- `No fallback agent launcher is registered in runtime context`
### Config changes not reflected
- Loader uses memoization by file mtime/size.
- Ensure your editor writes the file fully and updates mtime.
- If needed, force reload from the caller side with `forceReloadConfig: true`.
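The memoization described here can be pictured as follows (illustrative; the real loader's cache shape may differ):
```ts
import { readFileSync, statSync } from 'fs'

// Hedged sketch of mtime/size memoization with a forceReloadConfig escape hatch.
let cached: { mtimeMs: number; size: number; config: unknown } | undefined

function loadHookChainsConfig(path: string, forceReloadConfig = false): unknown {
  const stats = statSync(path)
  if (!forceReloadConfig && cached &&
      cached.mtimeMs === stats.mtimeMs && cached.size === stats.size) {
    return cached.config // unchanged mtime and size → reuse parsed config
  }
  const config: unknown = JSON.parse(readFileSync(path, 'utf-8'))
  cached = { mtimeMs: stats.mtimeMs, size: stats.size, config }
  return config
}
```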
### Existing workflows changed unexpectedly
- Set `"enabled": false` at top-level.
- Or globally disable with `CLAUDE_CODE_ENABLE_HOOK_CHAINS=0`.
- Re-enable gradually after validating one rule at a time.

View File

@@ -1,67 +0,0 @@
# Codebase Intelligence — Repo Map
The repo map feature gives the AI model structural awareness of your codebase at the start of each session. Instead of the model needing to explore the repository with `Grep`, `Glob`, and `Read` calls, it starts with a ranked summary of the most important files and their key signatures.
## How it works
1. **File enumeration** — Lists all tracked files via `git ls-files` (falls back to a manual directory walk when not in a git repo)
2. **Symbol extraction** — Parses each supported source file with tree-sitter to extract function, class, type, and interface definitions, plus cross-file references
3. **Reference graph** — Builds a directed graph where an edge from file A to file B means A references a symbol defined in B. Edges are weighted by reference count multiplied by the IDF (inverse document frequency) of the symbol name — common names like `get`, `set`, `value` contribute less
4. **PageRank** — Ranks files by structural importance using PageRank. Files imported by many others rank highest
5. **Rendering** — Walks ranked files top-down, emitting file paths and definition signatures, stopping when the token budget is reached
Results are cached to disk (`~/.openclaude/repomap-cache/`) keyed by file path, mtime, and size. Only changed files are re-parsed on subsequent runs.
## Supported languages
- TypeScript (`.ts`, `.tsx`)
- JavaScript (`.js`, `.jsx`, `.mjs`, `.cjs`)
- Python (`.py`)
Additional language grammars will be added in future releases.
## Enabling auto-injection
The repo map is gated behind the `REPO_MAP` feature flag, **off by default**. To enable auto-injection into the session context:
Set the environment variable before launching:
```bash
REPO_MAP=1 openclaude
```
Or add it to your shell profile for persistent use.
When enabled, the map is built once per session and prepended to the system context alongside git status and CLAUDE.md content. The default budget is 1024 tokens.
Auto-injection is skipped in:
- Bare mode (`--bare`)
- Remote sessions (`CLAUDE_CODE_REMOTE`)
## The /repomap slash command
The `/repomap` command is always available regardless of the feature flag. It lets you inspect and tune the map interactively.
```
/repomap # Show the map with default settings (1024 tokens)
/repomap --tokens 4096 # Increase the token budget for a larger map
/repomap --focus src/tools/ # Boost specific paths in the ranking
/repomap --focus src/context.ts # Can use multiple --focus flags
/repomap --stats # Show cache statistics
/repomap --invalidate # Clear cache and rebuild from scratch
```
## The RepoMap tool
The model can also call the `RepoMap` tool on demand during a session. This is useful when:
- The model needs structural context mid-conversation
- The user asks about specific areas (the model can pass `focus_files` or `focus_symbols`)
- A larger token budget is needed than the auto-injected default
## Known limitations
- **Signatures only** — The map shows function/class/type declarations, not implementations. The model still needs `Read` to see function bodies.
- **Cold build time** — First build on large repos (2000+ files) can take 20-30 seconds due to WASM-based parsing. Subsequent builds use the disk cache and complete in under 100ms.
- **Language coverage** — Only TypeScript, JavaScript, and Python are supported. Files in other languages are skipped.
- **TypeScript references** — The TypeScript tree-sitter query captures type annotations and `new` expressions as references, but not plain function calls. This means the ranking slightly favors type-heavy hub files.
- **Git dependency** — File enumeration uses `git ls-files` by default. Non-git repos fall back to a directory walk with hardcoded exclusions.

View File

@@ -1,6 +1,6 @@
{
"name": "@gitlawb/openclaude",
"version": "0.3.0",
"version": "0.6.0",
"description": "Claude Code opened to any LLM — OpenAI, Gemini, DeepSeek, Ollama, and 200+ models",
"type": "module",
"bin": {
@@ -95,12 +95,8 @@
"fuse.js": "7.1.0",
"get-east-asian-width": "1.5.0",
"google-auth-library": "9.15.1",
"graphology": "^0.26.0",
"graphology-operators": "^1.6.0",
"graphology-pagerank": "^1.1.0",
"https-proxy-agent": "7.0.6",
"ignore": "7.0.5",
"js-tiktoken": "^1.0.16",
"indent-string": "5.0.0",
"jsonc-parser": "3.3.1",
"lodash-es": "4.18.1",
@@ -121,13 +117,11 @@
"strip-ansi": "7.2.0",
"supports-hyperlinks": "3.2.0",
"tree-kill": "1.2.2",
"tree-sitter-wasms": "^0.1.12",
"turndown": "7.2.2",
"type-fest": "4.41.0",
"undici": "7.24.6",
"usehooks-ts": "3.1.1",
"vscode-languageserver-protocol": "3.17.5",
"web-tree-sitter": "^0.25.0",
"wrap-ansi": "9.0.2",
"ws": "8.20.0",
"xss": "1.0.15",

View File

@@ -19,30 +19,46 @@ const version = pkg.version
// Most Anthropic-internal features stay off; open-build features can be
// selectively enabled here when their full source exists in the mirror.
const featureFlags: Record<string, boolean> = {
VOICE_MODE: false,
PROACTIVE: false,
KAIROS: false,
BRIDGE_MODE: false,
DAEMON: false,
AGENT_TRIGGERS: false,
MONITOR_TOOL: true,
ABLATION_BASELINE: false,
DUMP_SYSTEM_PROMPT: false,
CACHED_MICROCOMPACT: false,
COORDINATOR_MODE: true,
BUILTIN_EXPLORE_PLAN_AGENTS: true,
CONTEXT_COLLAPSE: false,
COMMIT_ATTRIBUTION: false,
TEAMMEM: true,
UDS_INBOX: false,
BG_SESSIONS: false,
AWAY_SUMMARY: false,
TRANSCRIPT_CLASSIFIER: false,
WEB_BROWSER_TOOL: false,
MESSAGE_ACTIONS: true,
BUDDY: true,
CHICAGO_MCP: false,
COWORKER_TYPE_TELEMETRY: false,
// ── Disabled: require Anthropic infrastructure or missing source ─────
VOICE_MODE: false, // Push-to-talk STT via claude.ai OAuth endpoint
PROACTIVE: false, // Autonomous agent mode (missing proactive/ module)
KAIROS: false, // Persistent assistant/session mode (cloud backend)
BRIDGE_MODE: false, // Remote desktop bridge via CCR infrastructure
DAEMON: false, // Background daemon process (stubbed in open build)
AGENT_TRIGGERS: false, // Scheduled remote agent triggers
ABLATION_BASELINE: false, // A/B testing harness for eval experiments
CONTEXT_COLLAPSE: false, // Context collapsing optimization (stubbed)
COMMIT_ATTRIBUTION: false, // Co-Authored-By metadata in git commits
UDS_INBOX: false, // Unix Domain Socket inter-session messaging
BG_SESSIONS: false, // Background sessions via tmux (stubbed)
WEB_BROWSER_TOOL: false, // Built-in browser automation (source not mirrored)
CHICAGO_MCP: false, // Computer-use MCP (native Swift modules stubbed)
COWORKER_TYPE_TELEMETRY: false, // Telemetry for agent/coworker type classification
MCP_SKILLS: false, // Dynamic MCP skill discovery (src/skills/mcpSkills.ts not mirrored; enabling this causes "fetchMcpSkillsForClient is not a function" when MCP servers with resources connect — see #856)
// ── Enabled: upstream defaults ──────────────────────────────────────
COORDINATOR_MODE: true, // Multi-agent coordinator with worker delegation
BUILTIN_EXPLORE_PLAN_AGENTS: true, // Built-in Explore/Plan specialized subagents
BUDDY: true, // Buddy mode for paired programming
MONITOR_TOOL: true, // MCP server monitoring/streaming tool
TEAMMEM: true, // Team memory management
MESSAGE_ACTIONS: true, // Message action buttons in the UI
// ── Enabled: new activations ────────────────────────────────────────
DUMP_SYSTEM_PROMPT: true, // --dump-system-prompt CLI flag for debugging
CACHED_MICROCOMPACT: true, // Cache-aware tool result truncation optimization
AWAY_SUMMARY: true, // "While you were away" recap after 5min blur
TRANSCRIPT_CLASSIFIER: true, // Auto-approval classifier for safe tool uses
ULTRATHINK: true, // Deep thinking mode — type "ultrathink" to boost reasoning
TOKEN_BUDGET: true, // Token budget tracking with usage warnings
HISTORY_PICKER: true, // Enhanced interactive prompt history picker
QUICK_SEARCH: true, // Ctrl+G quick search across prompts
SHOT_STATS: true, // Shot distribution stats in session summary
EXTRACT_MEMORIES: true, // Auto-extract durable memories from conversations
FORK_SUBAGENT: true, // Implicit context-forking when omitting subagent_type
VERIFICATION_AGENT: true, // Built-in read-only agent for test/verification
PROMPT_CACHE_BREAK_DETECTION: true, // Detect & log unexpected prompt cache invalidations
HOOK_PROMPTS: true, // Allow tools to request interactive user prompts
}
// ── Pre-process: replace feature() calls with boolean literals ──────
@@ -367,9 +383,17 @@ export const SeverityNumber = {};
const full = pathMod.join(dir, ent.name)
if (ent.isDirectory()) { walk(full); continue }
if (!/\.(ts|tsx)$/.test(ent.name)) continue
const code: string = fs.readFileSync(full, 'utf-8')
const rawCode: string = fs.readFileSync(full, 'utf-8')
const fileDir = pathMod.dirname(full)
// Strip comments before scanning for imports/requires.
// The regex scanner matches require()/import() patterns
// inside JSDoc comments, causing false-positive missing
// module detection that breaks the build with noop stubs.
const code = rawCode
.replace(/\/\*[\s\S]*?\*\//g, '') // block comments
.replace(/\/\/.*$/gm, '') // line comments
// Collect static imports: import { X } from '...'
for (const m of code.matchAll(/import\s+(?:\{([^}]*)\}|(\w+))?\s*(?:,\s*\{([^}]*)\})?\s*from\s+['"](.*?)['"]/g)) {
checkAndRegister(m[4], fileDir, m[1] || m[3] || '')

View File

@@ -0,0 +1,47 @@
import { existsSync, readFileSync } from 'fs'
import { join } from 'path'
import { expect, test } from 'bun:test'
// Regression guard for #856. Several build feature flags require source files
// that are not mirrored into the open build. When such a flag is set to `true`
// without the source present, the bundler falls back to a missing-module stub
// that only exports `default`, which causes runtime errors like
// `fetchMcpSkillsForClient is not a function` when downstream code reaches
// through the `require()` to a named export.
//
// This test fails fast at test-time if someone re-enables one of these flags
// without first mirroring the corresponding source file.
const BUILD_SCRIPT = join(import.meta.dir, 'build.ts')
const REPO_ROOT = join(import.meta.dir, '..')
type FlagGuard = {
flag: string
source: string // path relative to repo root
}
const FLAG_REQUIRES_SOURCE: FlagGuard[] = [
{ flag: 'MCP_SKILLS', source: 'src/skills/mcpSkills.ts' },
]
test('build feature flags are not enabled without their source files', () => {
const buildScript = readFileSync(BUILD_SCRIPT, 'utf-8')
for (const { flag, source } of FLAG_REQUIRES_SOURCE) {
const enabledRe = new RegExp(`^\\s*${flag}\\s*:\\s*true\\b`, 'm')
const isEnabled = enabledRe.test(buildScript)
const sourceExists = existsSync(join(REPO_ROOT, source))
if (isEnabled && !sourceExists) {
throw new Error(
`Feature flag ${flag} is enabled in scripts/build.ts, but its required source file "${source}" does not exist. ` +
`Enabling this flag without the source will cause runtime errors (missing named exports from the missing-module stub). ` +
`Either mirror the source file or set ${flag}: false.`,
)
}
// When the source IS present, the flag can be either true or false; either
// is fine. We only care about the "enabled but missing" combination.
expect(true).toBe(true)
}
})

View File

@@ -50,6 +50,23 @@ describe('growthbook stub — local feature flag overrides', () => {
expect(stub.getAllGrowthBookFeatures()).toEqual({})
})
// ── Open-build defaults (_openBuildDefaults) ────────────────────
test('returns open-build default when flags file is absent', () => {
// tengu_passport_quail is in _openBuildDefaults as true; without a
// flags file the stub should return the open-build override, not
// the call-site defaultValue.
expect(stub.getFeatureValue_CACHED_MAY_BE_STALE('tengu_passport_quail', false)).toBe(true)
expect(stub.getFeatureValue_CACHED_MAY_BE_STALE('tengu_coral_fern', false)).toBe(true)
})
test('flags file overrides open-build defaults', () => {
// User-provided feature-flags.json takes priority over _openBuildDefaults.
writeFileSync(flagsFile, JSON.stringify({ tengu_passport_quail: false }))
expect(stub.getFeatureValue_CACHED_MAY_BE_STALE('tengu_passport_quail', true)).toBe(false)
})
// ── Valid JSON object ────────────────────────────────────────────
test('loads and returns values from a valid JSON file', () => {

View File

@@ -40,6 +40,151 @@ import _os from 'node:os';
let _flags = undefined;
// ── Open-build GrowthBook overrides ───────────────────────────────────
// Override upstream defaultValue for runtime gates tied to build-time
// features. Only keys that DIFFER from upstream belong here — the
// catalog below is pure documentation and does NOT affect resolution.
//
// Priority: ~/.claude/feature-flags.json > _openBuildDefaults > defaultValue
//
// To override at runtime, create ~/.claude/feature-flags.json:
// { "tengu_some_flag": true }
const _openBuildDefaults = {
'tengu_sedge_lantern': true, // AWAY_SUMMARY — "while you were away" recap (upstream: false)
'tengu_hive_evidence': true, // VERIFICATION_AGENT — read-only test/verification agent (upstream: false)
'tengu_passport_quail': true, // EXTRACT_MEMORIES — enable memory extraction (upstream: false)
'tengu_coral_fern': true, // EXTRACT_MEMORIES — enable memory search in past context (upstream: false)
};
/* ── Known runtime feature keys (reference only) ───────────────────────
* This catalog does NOT participate in flag resolution. It documents
* the known GrowthBook keys and their upstream default values, scraped
* from src/ call sites. It is NOT exhaustive — new keys may be added
* upstream between catalog updates.
*
* Some keys have different defaults at different call sites — this is
* intentional upstream (the server unifies the value at runtime).
*
* To activate any of these, add them to ~/.claude/feature-flags.json
* or to _openBuildDefaults above.
*
* ── Reasoning & thinking ──────────────────────────────────────────────
* tengu_turtle_carbon = true ULTRATHINK deep thinking runtime gate
* tengu_thinkback = gate /thinkback replay command
*
* ── Agents & orchestration ────────────────────────────────────────────
* tengu_amber_flint = true Agent swarms coordination
* tengu_amber_stoat = true Built-in agent availability (Explore, Plan, etc.)
* tengu_agent_list_attach = true Attach file context to agent list
* tengu_auto_background_agents = false Auto-spawn background agents
* tengu_slim_subagent_claudemd = true Lighter ClaudeMD for subagents
* tengu_hive_evidence = false Verification agent / evidence tracking (4 call sites)
* tengu_ultraplan_model = model cfg ULTRAPLAN model selection (dynamic config)
*
* ── Memory & context ──────────────────────────────────────────────────
* tengu_passport_quail = false EXTRACT_MEMORIES main gate (isExtractModeActive)
* tengu_coral_fern = false EXTRACT_MEMORIES search in past context
* tengu_slate_thimble = false Memory dir paths (non-interactive sessions)
* tengu_herring_clock = true/false Team memory paths (varies by call site)
* tengu_bramble_lintel = null Extract memories throttle (null → every turn)
* tengu_sedge_lantern = false AWAY_SUMMARY "while you were away" recap
* tengu_session_memory = false Session memory service
* tengu_sm_config = {} Session memory config (dynamic)
* tengu_sm_compact_config = {} Session memory compaction config (dynamic)
* tengu_cobalt_raccoon = false Reactive compaction (suppress auto-compact)
* tengu_pebble_leaf_prune = false Session storage pruning
*
* ── Kairos & cron ─────────────────────────────────────────────────────
* tengu_kairos_brief = false Brief layout mode (KAIROS)
* tengu_kairos_brief_config = {} Brief config (dynamic)
* tengu_kairos_cron = true Cron scheduler enable
* tengu_kairos_cron_durable = true Durable (disk-persistent) cron tasks
* tengu_kairos_cron_config = {} Cron jitter config (dynamic)
*
* ── Bridge & remote (require Anthropic infra) ─────────────────────────
* tengu_ccr_bridge = false CCR bridge connection
* tengu_ccr_bridge_multi_session = gate Multi-session spawn mode
* tengu_ccr_mirror = false CCR session mirroring
* tengu_ccr_bundle_seed_enabled = gate Git bundle seeding for CCR
* tengu_ccr_bundle_max_bytes = null Bundle size limit (null → default)
* tengu_bridge_repl_v2 = false Environment-less REPL bridge v2
* tengu_bridge_repl_v2_cse_shim_enabled = true CSE→Session tag retag shim
* tengu_bridge_min_version = {min:'0'} Min CLI version for bridge (dynamic)
* tengu_bridge_initial_history_cap = 200 Initial history cap for bridge
* tengu_bridge_system_init = false Bridge system initialization
* tengu_cobalt_harbor = false Auto-connect CCR at startup
* tengu_cobalt_lantern = false Remote setup preconditions
* tengu_remote_backend = false Remote TUI backend
* tengu_surreal_dali = false Remote agent tasks / triggers
*
* ── Prompt & API ──────────────────────────────────────────────────────
* tengu_attribution_header = true Attribution header in API requests
* tengu_basalt_3kr = true MCP instructions delta
* tengu_slate_prism = true/false Message formatting (varies by call site)
* tengu_amber_prism = false Message content formatting
* tengu_amber_json_tools = false JSON format for tool schemas
* tengu_fgts = false API feature gates
* tengu_otk_slot_v1 = false One-time key slots for API auth
* tengu_cicada_nap_ms = 0 Background GrowthBook refresh throttle (ms)
* tengu_miraculo_the_bard = false Service initialization gate
* tengu_immediate_model_command = false Immediate /model command execution
* tengu_chomp_inflection = false Prompt suggestions after responses
* tengu_tool_pear = gate API betas for tool use
* tengu-off-switch = {act:false} Service kill switch (dynamic; uses dash)
*
* ── Permissions & security ────────────────────────────────────────────
* tengu_birch_trellis = true Bash auto-mode permissions config
* tengu_auto_mode_config = {} Auto-mode configuration (dynamic, many call sites)
* tengu_iron_gate_closed = true Permission iron gate (with refresh)
* tengu_destructive_command_warning = false Warning for destructive bash commands
* tengu_disable_bypass_permissions_mode = security Security killswitch (always false in open build)
*
* ── UI & UX ───────────────────────────────────────────────────────────
* tengu_willow_mode = 'off' REPL rendering mode
* tengu_terminal_panel = false Terminal panel keybinding
* tengu_terminal_sidebar = false Terminal sidebar in REPL/config
* tengu_marble_sandcastle = false Fast mode gate
* tengu_jade_anvil_4 = false Rate limit options UI ordering
* tengu_collage_kaleidoscope = true Native clipboard image paste (macOS)
* tengu_lapis_finch = false Plugin/hint recommendation
* tengu_lodestone_enabled = false Deep links claude-cli:// protocol
* tengu_copper_panda = false Skill improvement suggestions
* tengu_desktop_upsell = {} Desktop app upsell config (dynamic)
* tengu-top-of-feed-tip = {} Emergency tip of feed (dynamic; uses dash)
*
* ── File operations ───────────────────────────────────────────────────
* tengu_quartz_lantern = false File read/write dedup optimization
* tengu_moth_copse = false Attachments handling (variant A)
* tengu_marble_fox = false Attachments handling (variant B)
* tengu_scratch = gate Scratchpad filesystem access / coordinator
*
* ── MCP & plugins ─────────────────────────────────────────────────────
* tengu_harbor = false MCP channel allowlist verification
* tengu_harbor_permissions = false MCP channel permissions enforcement
* tengu_copper_bridge = false Chrome MCP bridge
* tengu_chrome_auto_enable = false Auto-enable Chrome MCP on startup
* tengu_glacier_2xr = false Enhanced tool search / ToolSearchTool
* tengu_malort_pedway = {} Computer-use (Chicago) config (dynamic)
*
* ── VSCode / IDE ──────────────────────────────────────────────────────
* tengu_quiet_fern = false VSCode browser support
* tengu_vscode_cc_auth = false VSCode in-band OAuth via claude_authenticate
* tengu_vscode_review_upsell = gate VSCode review upsell
* tengu_vscode_onboarding = gate VSCode onboarding experience
*
* ── Voice ─────────────────────────────────────────────────────────────
* tengu_amber_quartz_disabled = false VOICE_MODE kill-switch (false = voice allowed)
*
* ── Auto-updater (stubbed in open build) ──────────────────────────────
* tengu_version_config = {min:'0'} Min version enforcement (dynamic)
* tengu_max_version_config = {} Max version / deprecation config (dynamic)
*
* ── Telemetry & tracing ───────────────────────────────────────────────
* tengu_trace_lantern = false Beta session tracing
* tengu_chair_sermon = gate Analytics / message formatting gate
* tengu_strap_foyer = false Settings sync to cloud
*/
function _loadFlags() {
if (_flags !== undefined) return;
try {
@@ -55,6 +200,7 @@ function _loadFlags() {
function _getFlagValue(key, defaultValue) {
_loadFlags();
if (_flags != null && Object.hasOwn(_flags, key)) return _flags[key];
if (Object.hasOwn(_openBuildDefaults, key)) return _openBuildDefaults[key];
return defaultValue;
}

View File

@@ -20,6 +20,23 @@ describe('formatReachabilityFailureDetail', () => {
)
})
test('redacts credentials and sensitive query parameters in endpoint details', () => {
const detail = formatReachabilityFailureDetail(
'http://user:pass@localhost:11434/v1/models?token=abc123&mode=test',
502,
'bad gateway',
{
transport: 'chat_completions',
requestedModel: 'llama3.1:8b',
resolvedModel: 'llama3.1:8b',
},
)
expect(detail).toBe(
'Unexpected status 502 from http://redacted:redacted@localhost:11434/v1/models?token=redacted&mode=test. Body: bad gateway',
)
})
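The `redactUrlForDisplay` helper itself is not part of this diff. A minimal sketch consistent with the expectation above, assuming the sensitive-parameter list (the shipped version in src/utils/urlRedaction.ts may cover more):

    // Sketch only; which query parameters count as sensitive is an assumption.
    const SENSITIVE_PARAMS = new Set(['token', 'key', 'api_key', 'apikey', 'secret'])

    export function redactUrlForDisplaySketch(rawUrl: string): string {
      let parsed: URL
      try {
        parsed = new URL(rawUrl)
      } catch {
        return rawUrl // not a parseable URL; leave unchanged
      }
      if (parsed.username) parsed.username = 'redacted'
      if (parsed.password) parsed.password = 'redacted'
      for (const name of [...parsed.searchParams.keys()]) {
        if (SENSITIVE_PARAMS.has(name.toLowerCase())) {
          parsed.searchParams.set(name, 'redacted')
        }
      }
      return parsed.toString()
    }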
test('adds alias/entitlement hint for codex model support 400s', () => {
const detail = formatReachabilityFailureDetail(
'https://chatgpt.com/backend-api/codex/responses',

View File

@@ -7,6 +7,11 @@ import {
resolveProviderRequest,
isLocalProviderUrl as isProviderLocalUrl,
} from '../src/services/api/providerConfig.js'
import {
getLocalOpenAICompatibleProviderLabel,
probeOllamaGenerationReadiness,
} from '../src/utils/providerDiscovery.js'
import { redactUrlForDisplay } from '../src/utils/urlRedaction.js'
type CheckResult = {
ok: boolean
@@ -69,7 +74,7 @@ export function formatReachabilityFailureDetail(
},
): string {
const compactBody = responseBody.trim().replace(/\s+/g, ' ').slice(0, 240)
const base = `Unexpected status ${status} from ${endpoint}.`
const base = `Unexpected status ${status} from ${redactUrlForDisplay(endpoint)}.`
const bodySuffix = compactBody ? ` Body: ${compactBody}` : ''
if (request.transport !== 'codex_responses' || status !== 400) {
@@ -255,7 +260,7 @@ function checkOpenAIEnv(): CheckResult[] {
results.push(pass('OPENAI_MODEL', process.env.OPENAI_MODEL))
}
results.push(pass('OPENAI_BASE_URL', request.baseUrl))
results.push(pass('OPENAI_BASE_URL', redactUrlForDisplay(request.baseUrl)))
if (request.transport === 'codex_responses') {
const credentials = resolveCodexApiCredentials(process.env)
@@ -308,7 +313,7 @@ async function checkBaseUrlReachability(): Promise<CheckResult> {
return pass('Provider reachability', 'Skipped (OpenAI-compatible mode disabled).')
}
if (useGithub) {
if (useGithub && !useOpenAI) {
return pass(
'Provider reachability',
'Skipped for GitHub Models (inference endpoint differs from OpenAI /models probe).',
@@ -326,6 +331,7 @@ async function checkBaseUrlReachability(): Promise<CheckResult> {
const endpoint = request.transport === 'codex_responses'
? `${request.baseUrl}/responses`
: `${request.baseUrl}/models`
const redactedEndpoint = redactUrlForDisplay(endpoint)
const controller = new AbortController()
const timeout = setTimeout(() => controller.abort(), 4000)
@@ -375,7 +381,10 @@ async function checkBaseUrlReachability(): Promise<CheckResult> {
})
if (response.status === 200 || response.status === 401 || response.status === 403) {
return pass('Provider reachability', `Reached ${endpoint} (status ${response.status}).`)
return pass(
'Provider reachability',
`Reached ${redactedEndpoint} (status ${response.status}).`,
)
}
const responseBody = await response.text().catch(() => '')
@@ -391,12 +400,100 @@ async function checkBaseUrlReachability(): Promise<CheckResult> {
)
} catch (error) {
const message = error instanceof Error ? error.message : String(error)
return fail('Provider reachability', `Failed to reach ${endpoint}: ${message}`)
return fail(
'Provider reachability',
`Failed to reach ${redactedEndpoint}: ${message}`,
)
} finally {
clearTimeout(timeout)
}
}
async function checkProviderGenerationReadiness(): Promise<CheckResult> {
const useGemini = isTruthy(process.env.CLAUDE_CODE_USE_GEMINI)
const useOpenAI = isTruthy(process.env.CLAUDE_CODE_USE_OPENAI)
const useGithub = isTruthy(process.env.CLAUDE_CODE_USE_GITHUB)
const useMistral = isTruthy(process.env.CLAUDE_CODE_USE_MISTRAL)
if (!useGemini && !useOpenAI && !useGithub && !useMistral) {
return pass('Provider generation readiness', 'Skipped (OpenAI-compatible mode disabled).')
}
if (useGithub && !useOpenAI) {
return pass(
'Provider generation readiness',
'Skipped for GitHub Models (runtime generation uses a different endpoint flow).',
)
}
if (useGemini || useMistral) {
return pass(
'Provider generation readiness',
'Skipped for managed provider mode.',
)
}
if (!useOpenAI) {
return pass('Provider generation readiness', 'Skipped (OpenAI-compatible mode disabled).')
}
const request = resolveProviderRequest({
model: process.env.OPENAI_MODEL,
baseUrl: process.env.OPENAI_BASE_URL,
})
if (request.transport === 'codex_responses') {
return pass(
'Provider generation readiness',
'Skipped for Codex responses (reachability probe already performs a lightweight generation request).',
)
}
if (!isLocalBaseUrl(request.baseUrl)) {
return pass('Provider generation readiness', 'Skipped for non-local provider URL.')
}
const localProviderLabel = getLocalOpenAICompatibleProviderLabel(request.baseUrl)
if (localProviderLabel !== 'Ollama') {
return pass(
'Provider generation readiness',
`Skipped for ${localProviderLabel} (no provider-specific generation probe).`,
)
}
const readiness = await probeOllamaGenerationReadiness({
baseUrl: request.baseUrl,
model: request.requestedModel,
})
if (readiness.state === 'ready') {
return pass(
'Provider generation readiness',
`Generated a test response with ${readiness.probeModel ?? request.requestedModel}.`,
)
}
if (readiness.state === 'unreachable') {
return fail(
'Provider generation readiness',
`Could not reach Ollama at ${redactUrlForDisplay(request.baseUrl)}.`,
)
}
if (readiness.state === 'no_models') {
return fail(
'Provider generation readiness',
'Ollama is reachable, but no installed models were found. Pull a model first (for example: ollama pull qwen2.5-coder:7b).',
)
}
const detailSuffix = readiness.detail ? ` Detail: ${readiness.detail}.` : ''
return fail(
'Provider generation readiness',
`Ollama is reachable, but generation failed for ${readiness.probeModel ?? request.requestedModel}.${detailSuffix}`,
)
}
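The probe called above is exported from src/utils/providerDiscovery.ts and is not shown in this diff. A plausible shape under the same state machine, sketched as an assumption: /api/tags and /api/generate are Ollama's documented endpoints, but the probe prompt and error mapping here are invented.

    type OllamaModelInfo = { name: string }
    type OllamaGenerationReadinessSketch =
      | { state: 'ready'; models: OllamaModelInfo[]; probeModel: string }
      | { state: 'unreachable'; models: [] }
      | { state: 'no_models'; models: [] }
      | { state: 'generation_failed'; models: OllamaModelInfo[]; probeModel: string; detail?: string }

    async function probeOllamaGenerationReadinessSketch(options: {
      baseUrl: string
      model?: string
    }): Promise<OllamaGenerationReadinessSketch> {
      const root = options.baseUrl.replace(/\/v1\/?$/, '') // strip OpenAI-compat suffix
      let models: OllamaModelInfo[]
      try {
        const res = await fetch(`${root}/api/tags`)
        if (!res.ok) return { state: 'unreachable', models: [] }
        models = ((await res.json()) as { models?: OllamaModelInfo[] }).models ?? []
      } catch {
        return { state: 'unreachable', models: [] }
      }
      if (models.length === 0) return { state: 'no_models', models: [] }
      const probeModel = options.model ?? models[0]!.name
      try {
        const res = await fetch(`${root}/api/generate`, {
          method: 'POST',
          headers: { 'content-type': 'application/json' },
          body: JSON.stringify({ model: probeModel, prompt: 'ping', stream: false }),
        })
        if (!res.ok) {
          return { state: 'generation_failed', models, probeModel, detail: `status ${res.status}` }
        }
        return { state: 'ready', models, probeModel }
      } catch (error) {
        const detail = error instanceof Error ? error.message : String(error)
        return { state: 'generation_failed', models, probeModel, detail }
      }
    }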
function isAtomicChatUrl(baseUrl: string): boolean {
try {
const parsed = new URL(baseUrl)
@@ -567,6 +664,7 @@ async function main(): Promise<void> {
results.push(checkBuildArtifacts())
results.push(...checkOpenAIEnv())
results.push(await checkBaseUrlReachability())
results.push(await checkProviderGenerationReadiness())
results.push(checkOllamaProcessorMode())
if (!options.json) {

View File

@@ -249,6 +249,11 @@ export type ToolUseContext = {
/** When true, canUseTool must always be called even when hooks auto-approve.
* Used by speculation for overlay file path rewriting. */
requireCanUseTool?: boolean
/**
* Optional callback used by hook-chain fallback actions that launch
* AgentTool from hook runtime paths.
*/
hookChainsCanUseTool?: CanUseToolFn
messages: Message[]
fileReadingLimits?: {
maxTokens?: number

View File

@@ -169,6 +169,14 @@ describe('Web search result count improvements', () => {
expect(content).toMatch(/max_uses:\s*15/)
})
test('codex web search path guarantees a non-empty result body', async () => {
const content = await file(
'tools/WebSearchTool/WebSearchTool.ts',
).text()
expect(content).toContain("results.push('No results found.')")
})
})
// ---------------------------------------------------------------------------

View File

@@ -0,0 +1,191 @@
/**
* Security hardening regression tests.
*
* Covers:
* 1. MCP tool result Unicode sanitization
* 2. Sandbox settings source filtering (exclude projectSettings)
* 3. Plugin git clone/pull hooks disabled
* 4. ANTHROPIC_FOUNDRY_API_KEY removed from SAFE_ENV_VARS
* 5. WebFetch SSRF protection via ssrfGuardedLookup
*/
import { describe, test, expect } from 'bun:test'
import { resolve } from 'path'
const SRC = resolve(import.meta.dir, '..')
const file = (relative: string) => Bun.file(resolve(SRC, relative))
// ---------------------------------------------------------------------------
// Fix 1: MCP tool result Unicode sanitization
// ---------------------------------------------------------------------------
describe('MCP tool result sanitization', () => {
test('transformResultContent sanitizes text content', async () => {
const content = await file('services/mcp/client.ts').text()
// Tool definitions are already sanitized (line ~1798)
expect(content).toContain('recursivelySanitizeUnicode(result.tools)')
// Tool results must also be sanitized
expect(content).toMatch(
/case 'text':[\s\S]*?recursivelySanitizeUnicode\(resultContent\.text\)/,
)
})
test('resource text content is also sanitized', async () => {
const content = await file('services/mcp/client.ts').text()
expect(content).toMatch(
/recursivelySanitizeUnicode\(\s*`\$\{prefix\}\$\{resource\.text\}`/,
)
})
})
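`recursivelySanitizeUnicode` is referenced by services/mcp/client.ts but its body is not shown in this diff. A sketch of the usual shape of such a sanitizer, assuming it strips the classic smuggling vectors (bidi controls, zero-width characters, BOM); the shipped character set may differ:

    // Assumption: strip bidi overrides, zero-width characters, and BOMs,
    // which are the usual vectors for hiding instructions in tool output.
    const UNSAFE_UNICODE = /[\u200B-\u200F\u202A-\u202E\u2066-\u2069\uFEFF]/g

    function recursivelySanitizeUnicodeSketch<T>(value: T): T {
      if (typeof value === 'string') {
        return value.replace(UNSAFE_UNICODE, '') as unknown as T
      }
      if (Array.isArray(value)) {
        return value.map(recursivelySanitizeUnicodeSketch) as unknown as T
      }
      if (value !== null && typeof value === 'object') {
        return Object.fromEntries(
          Object.entries(value).map(([k, v]) => [k, recursivelySanitizeUnicodeSketch(v)]),
        ) as unknown as T
      }
      return value
    }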
// ---------------------------------------------------------------------------
// Fix 2: Sandbox settings source filtering
// ---------------------------------------------------------------------------
describe('Sandbox settings trust boundary', () => {
test('getSandboxEnabledSetting does not use getSettings_DEPRECATED', async () => {
const content = await file('utils/sandbox/sandbox-adapter.ts').text()
// Extract the getSandboxEnabledSetting function body
const fnMatch = content.match(
/function getSandboxEnabledSetting\(\)[^{]*\{([\s\S]*?)\n\}/,
)
expect(fnMatch).not.toBeNull()
const fnBody = fnMatch![1]
// Must NOT use getSettings_DEPRECATED (reads all sources including project)
expect(fnBody).not.toContain('getSettings_DEPRECATED')
// Must use getSettingsForSource for individual trusted sources
expect(fnBody).toContain("getSettingsForSource('userSettings')")
expect(fnBody).toContain("getSettingsForSource('policySettings')")
// Must NOT read from projectSettings
expect(fnBody).not.toContain("'projectSettings'")
})
})
// ---------------------------------------------------------------------------
// Fix 3: Plugin git hooks disabled
// ---------------------------------------------------------------------------
describe('Plugin git operations disable hooks', () => {
test('gitClone includes core.hooksPath=/dev/null', async () => {
const content = await file('utils/plugins/marketplaceManager.ts').text()
// The clone args must disable hooks
const cloneSection = content.slice(
content.indexOf('export async function gitClone('),
content.indexOf('export async function gitClone(') + 2000,
)
expect(cloneSection).toContain("'core.hooksPath=/dev/null'")
})
test('gitPull includes core.hooksPath=/dev/null', async () => {
const content = await file('utils/plugins/marketplaceManager.ts').text()
const pullSection = content.slice(
content.indexOf('export async function gitPull('),
content.indexOf('export async function gitPull(') + 2000,
)
expect(pullSection).toContain("'core.hooksPath=/dev/null'")
})
test('gitSubmoduleUpdate includes core.hooksPath=/dev/null', async () => {
const content = await file('utils/plugins/marketplaceManager.ts').text()
const subSection = content.slice(
content.indexOf('async function gitSubmoduleUpdate('),
content.indexOf('async function gitSubmoduleUpdate(') + 1000,
)
expect(subSection).toContain("'core.hooksPath=/dev/null'")
})
})
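These tests pin the `-c core.hooksPath=/dev/null` override so that no repo-supplied or locally configured git hooks can run during plugin clone/pull/submodule operations. A rough sketch of the hardened invocation, assuming the argument order (only the `-c` override is asserted above):

    import { execFile } from 'node:child_process'
    import { promisify } from 'node:util'

    const run = promisify(execFile)

    // Sketch only: the real gitClone lives in utils/plugins/marketplaceManager.ts.
    // The essential part is the -c override, which the tests above lock in.
    async function gitCloneHardenedSketch(repoUrl: string, destination: string): Promise<void> {
      await run('git', ['-c', 'core.hooksPath=/dev/null', 'clone', '--', repoUrl, destination])
    }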
// ---------------------------------------------------------------------------
// Fix 4: ANTHROPIC_FOUNDRY_API_KEY not in SAFE_ENV_VARS
// ---------------------------------------------------------------------------
describe('SAFE_ENV_VARS excludes credentials', () => {
test('ANTHROPIC_FOUNDRY_API_KEY is not in SAFE_ENV_VARS', async () => {
const content = await file('utils/managedEnvConstants.ts').text()
// Extract the SAFE_ENV_VARS set definition
const safeStart = content.indexOf('export const SAFE_ENV_VARS')
const safeEnd = content.indexOf('])', safeStart)
const safeSection = content.slice(safeStart, safeEnd)
expect(safeSection).not.toContain('ANTHROPIC_FOUNDRY_API_KEY')
})
})
// ---------------------------------------------------------------------------
// Fix 5: WebFetch SSRF protection
// ---------------------------------------------------------------------------
describe('WebFetch SSRF guard', () => {
test('getWithPermittedRedirects uses ssrfGuardedLookup', async () => {
const content = await file('tools/WebFetchTool/utils.ts').text()
expect(content).toContain(
"import { ssrfGuardedLookup } from '../../utils/hooks/ssrfGuard.js'",
)
// The axios.get call in getWithPermittedRedirects must include lookup
const fnSection = content.slice(
content.indexOf('export async function getWithPermittedRedirects('),
content.indexOf('export async function getWithPermittedRedirects(') +
1000,
)
expect(fnSection).toContain('lookup: ssrfGuardedLookup')
})
})
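`ssrfGuardedLookup` is wired in through the request's `lookup` option, so the private-address check happens at DNS resolution time rather than on the URL string, which also covers redirect hops and DNS-rebinding hostnames. A minimal sketch, assuming the blocked ranges (the shipped guard in utils/hooks/ssrfGuard.ts may cover more):

    import dns from 'node:dns'

    // Assumption: block loopback, RFC 1918, link-local, and unique-local ranges.
    function isPrivateAddressSketch(address: string): boolean {
      return (
        /^(127\.|10\.|192\.168\.|169\.254\.|0\.)/.test(address) ||
        /^172\.(1[6-9]|2[0-9]|3[01])\./.test(address) ||
        address === '::1' ||
        /^f[cd]/i.test(address) ||
        /^fe80:/i.test(address)
      )
    }

    export function ssrfGuardedLookupSketch(
      hostname: string,
      options: dns.LookupOneOptions,
      callback: (err: NodeJS.ErrnoException | null, address: string, family: number) => void,
    ): void {
      dns.lookup(hostname, options, (err, address, family) => {
        if (err) return callback(err, '', 4)
        if (isPrivateAddressSketch(String(address))) {
          return callback(new Error(`SSRF guard: blocked ${address} for ${hostname}`), '', 4)
        }
        callback(null, String(address), Number(family))
      })
    }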
// ---------------------------------------------------------------------------
// Fix 6: Swarm permission file polling removed (security hardening)
// ---------------------------------------------------------------------------
describe('Swarm permission file polling removed', () => {
test('useSwarmPermissionPoller hook no longer exists', async () => {
const content = await file(
'hooks/useSwarmPermissionPoller.ts',
).text()
// The file-based polling hook must not exist — it read from an
// unauthenticated resolved/ directory where any local process could
// forge approval files.
expect(content).not.toContain('function useSwarmPermissionPoller(')
// The file-based processResponse must not exist
expect(content).not.toContain('function processResponse(')
})
test('poller does not import from permissionSync', async () => {
const content = await file(
'hooks/useSwarmPermissionPoller.ts',
).text()
// Must not import anything from permissionSync — all file-based
// functions have been removed from this module's dependencies
expect(content).not.toContain('permissionSync')
})
test('file-based permission functions are marked deprecated', async () => {
const content = await file(
'utils/swarm/permissionSync.ts',
).text()
// All file-based functions must have @deprecated JSDoc
const deprecatedFns = [
'writePermissionRequest',
'readPendingPermissions',
'readResolvedPermission',
'resolvePermission',
'pollForResponse',
'removeWorkerResponse',
]
for (const fn of deprecatedFns) {
// Find the function and check that @deprecated appears before it
const fnIndex = content.indexOf(`export async function ${fn}(`)
if (fnIndex === -1) continue // some of these are consts, not `export async function` declarations
const preceding = content.slice(Math.max(0, fnIndex - 500), fnIndex)
expect(preceding).toContain('@deprecated')
}
})
test('mailbox-based functions are NOT deprecated', async () => {
const content = await file(
'utils/swarm/permissionSync.ts',
).text()
// These are the active path — must not be deprecated
const activeFns = [
'sendPermissionRequestViaMailbox',
'sendPermissionResponseViaMailbox',
]
for (const fn of activeFns) {
const fnIndex = content.indexOf(`export async function ${fn}(`)
expect(fnIndex).not.toBe(-1)
const preceding = content.slice(Math.max(0, fnIndex - 300), fnIndex)
expect(preceding).not.toContain('@deprecated')
}
})
})

View File

@@ -11,7 +11,12 @@ import { MCPServerDesktopImportDialog } from '../../components/MCPServerDesktopI
import { render } from '../../ink.js';
import { KeybindingSetup } from '../../keybindings/KeybindingProviderSetup.js';
import { type AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS, logEvent } from '../../services/analytics/index.js';
import { clearMcpClientConfig, clearServerTokensFromLocalStorage, readClientSecret, saveMcpClientSecret } from '../../services/mcp/auth.js';
import {
clearMcpClientConfig,
clearServerTokensFromSecureStorage,
readClientSecret,
saveMcpClientSecret,
} from '../../services/mcp/auth.js'
import { doctorAllServers, doctorServer, type McpDoctorReport, type McpDoctorScopeFilter } from '../../services/mcp/doctor.js';
import { connectToServer, getMcpServerConnectionBatchSize } from '../../services/mcp/client.js';
import { addMcpConfig, getAllMcpConfigs, getMcpConfigByName, getMcpConfigsByScope, removeMcpConfig } from '../../services/mcp/config.js';

src/commands.test.ts Normal file · 30 lines
View File

@@ -0,0 +1,30 @@
import { describe, expect, test } from 'bun:test'
import { formatDescriptionWithSource } from './commands.js'
describe('formatDescriptionWithSource', () => {
test('returns empty text for prompt commands missing a description', () => {
const command = {
name: 'example',
type: 'prompt',
source: 'builtin',
description: undefined,
} as any
expect(formatDescriptionWithSource(command)).toBe('')
})
test('formats plugin commands with missing description safely', () => {
const command = {
name: 'example',
type: 'prompt',
source: 'plugin',
description: undefined,
pluginInfo: {
pluginManifest: {
name: 'MyPlugin',
},
},
} as any
expect(formatDescriptionWithSource(command)).toBe('(MyPlugin) ')
})
})

View File

@@ -22,7 +22,6 @@ import ctx_viz from './commands/ctx_viz/index.js'
import doctor from './commands/doctor/index.js'
import onboardGithub from './commands/onboard-github/index.js'
import memory from './commands/memory/index.js'
import repomap from './commands/repomap/index.js'
import help from './commands/help/index.js'
import ide from './commands/ide/index.js'
import init from './commands/init.js'
@@ -308,7 +307,6 @@ const COMMANDS = memoize((): Command[] => [
releaseNotes,
reloadPlugins,
rename,
repomap,
resume,
session,
skills,
@@ -742,23 +740,23 @@ export function getCommand(commandName: string, commands: Command[]): Command {
*/
export function formatDescriptionWithSource(cmd: Command): string {
if (cmd.type !== 'prompt') {
return cmd.description
return cmd.description ?? ''
}
if (cmd.kind === 'workflow') {
return `${cmd.description} (workflow)`
return `${cmd.description ?? ''} (workflow)`
}
if (cmd.source === 'plugin') {
const pluginName = cmd.pluginInfo?.pluginManifest.name
if (pluginName) {
return `(${pluginName}) ${cmd.description}`
return `(${pluginName}) ${cmd.description ?? ''}`
}
return `${cmd.description} (plugin)`
return `${cmd.description ?? ''} (plugin)`
}
if (cmd.source === 'builtin' || cmd.source === 'mcp') {
return cmd.description
return cmd.description ?? ''
}
if (cmd.source === 'bundled') {

src/commands/benchmark.ts Normal file · 56 lines
View File

@@ -0,0 +1,56 @@
import type { ToolUseContext } from '../Tool.js'
import type { Command } from '../types/command.js'
import {
benchmarkModel,
benchmarkMultipleModels,
formatBenchmarkResults,
isBenchmarkSupported,
} from '../utils/model/benchmark.js'
import { getOllamaModelOptions } from '../utils/model/ollamaModels.js'
async function runBenchmark(
model?: string,
context?: ToolUseContext,
): Promise<void> {
if (!isBenchmarkSupported()) {
context?.stdout?.write(
'Benchmark not supported for this provider.\n' +
'Supported: OpenAI-compatible endpoints (Ollama, NVIDIA NIM, MiniMax)\n',
)
return
}
let modelsToBenchmark: string[]
if (model) {
modelsToBenchmark = [model]
} else {
const ollamaModels = getOllamaModelOptions()
modelsToBenchmark = ollamaModels.slice(0, 3).map((m) => m.value)
}
context?.stdout?.write(`Benchmarking ${modelsToBenchmark.length} model(s)...\n`)
const results = await benchmarkMultipleModels(
modelsToBenchmark,
(completed, total, result) => {
context?.stdout?.write(
`[${completed}/${total}] ${result.model}: ` +
`${result.success ? result.tokensPerSecond.toFixed(1) + ' tps' : 'FAILED'}\n`,
)
},
)
context?.stdout?.write('\n' + formatBenchmarkResults(results) + '\n')
}
export const benchmark: Command = {
name: 'benchmark',
async onExecute(context: ToolUseContext): Promise<void> {
const args = context.args ?? {}
const model = args.model as string | undefined
await runBenchmark(model, context)
},
}
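`benchmarkModel`, `benchmarkMultipleModels`, and `formatBenchmarkResults` live in src/utils/model/benchmark.ts, which is not part of this diff. The tokens-per-second figure presumably reduces to tokens generated over wall-clock time, roughly as sketched below (endpoint and payload assume an OpenAI-compatible chat/completions API; all names here are placeholders):

    type BenchmarkResultSketch = {
      model: string
      success: boolean
      tokensPerSecond: number
    }

    // Sketch only: send one fixed prompt, time the round trip, and divide
    // completion tokens by elapsed seconds.
    async function benchmarkModelSketch(
      baseUrl: string,
      model: string,
    ): Promise<BenchmarkResultSketch> {
      const started = performance.now()
      try {
        const res = await fetch(`${baseUrl}/chat/completions`, {
          method: 'POST',
          headers: { 'content-type': 'application/json' },
          body: JSON.stringify({
            model,
            messages: [{ role: 'user', content: 'Write one sentence about code.' }],
            max_tokens: 128,
          }),
        })
        if (!res.ok) return { model, success: false, tokensPerSecond: 0 }
        const data = (await res.json()) as { usage?: { completion_tokens?: number } }
        const elapsedSeconds = (performance.now() - started) / 1000
        const tokens = data.usage?.completion_tokens ?? 0
        return { model, success: true, tokensPerSecond: tokens / Math.max(elapsedSeconds, 1e-3) }
      } catch {
        return { model, success: false, tokensPerSecond: 0 }
      }
    }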

View File

@@ -401,7 +401,7 @@ test('buildCodexProfileEnv derives oauth source from secure storage when no expl
})
})
test('applySavedProfileToCurrentSession switches the current env to the saved Codex profile', async () => {
test('explicitly declared env takes precedence over applySavedProfileToCurrentSession', async () => {
// @ts-expect-error cache-busting query string for Bun module mocks
const { applySavedProfileToCurrentSession } = await import(
'../../utils/providerProfile.js?apply-saved-profile-codex'
@@ -430,18 +430,18 @@ test('applySavedProfileToCurrentSession switches the current env to the saved Co
expect(warning).toBeNull()
expect(processEnv.CLAUDE_CODE_USE_OPENAI).toBe('1')
expect(processEnv.OPENAI_MODEL).toBe('codexplan')
expect(processEnv.OPENAI_MODEL).toBe('gpt-4o')
expect(processEnv.OPENAI_BASE_URL).toBe(
'https://chatgpt.com/backend-api/codex',
"https://api.openai.com/v1",
)
expect(processEnv.CODEX_API_KEY).toBe('codex-live')
expect(processEnv.CHATGPT_ACCOUNT_ID).toBe('acct_codex')
expect(processEnv.OPENAI_API_KEY).toBeUndefined()
expect(processEnv.CODEX_API_KEY).toBeUndefined()
expect(processEnv.CHATGPT_ACCOUNT_ID).toBeUndefined()
expect(processEnv.OPENAI_API_KEY).toBe("sk-openai")
expect(processEnv.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED).toBeUndefined()
expect(processEnv.CLAUDE_CODE_PROVIDER_PROFILE_ENV_APPLIED_ID).toBeUndefined()
})
test('applySavedProfileToCurrentSession ignores stale Codex env overrides for OAuth-backed profiles', async () => {
test('explicitly declared env takes precedence over applySavedProfileToCurrentSession', async () => {
// @ts-expect-error cache-busting query string for Bun module mocks
const { applySavedProfileToCurrentSession } = await import(
'../../utils/providerProfile.js?apply-saved-profile-codex-oauth'
@@ -465,13 +465,13 @@ test('applySavedProfileToCurrentSession ignores stale Codex env overrides for OA
processEnv,
})
expect(warning).toBeNull()
expect(processEnv.OPENAI_MODEL).toBe('codexplan')
expect(warning).not.toBeUndefined()
expect(processEnv.OPENAI_MODEL).toBe('gpt-4o')
expect(processEnv.OPENAI_BASE_URL).toBe(
'https://chatgpt.com/backend-api/codex',
"https://api.openai.com/v1",
)
expect(processEnv.CODEX_API_KEY).toBeUndefined()
expect(processEnv.CHATGPT_ACCOUNT_ID).not.toBe('acct_stale')
expect(processEnv.CODEX_API_KEY).toBe("stale-codex-key")
expect(processEnv.CHATGPT_ACCOUNT_ID).toBe('acct_stale')
expect(processEnv.CHATGPT_ACCOUNT_ID).toBeTruthy()
})
@@ -487,8 +487,8 @@ test('buildCurrentProviderSummary redacts poisoned model and endpoint values', (
})
expect(summary.providerLabel).toBe('OpenAI-compatible')
expect(summary.modelLabel).toBe('sk-...5678')
expect(summary.endpointLabel).toBe('sk-...5678')
expect(summary.modelLabel).toBe('sk-...678')
expect(summary.endpointLabel).toBe('sk-...678')
})
test('buildCurrentProviderSummary labels generic local openai-compatible providers', () => {

View File

@@ -66,10 +66,44 @@ import {
import {
getOllamaChatBaseUrl,
getLocalOpenAICompatibleProviderLabel,
hasLocalOllama,
listOllamaModels,
probeOllamaGenerationReadiness,
type OllamaGenerationReadiness,
} from '../../utils/providerDiscovery.js'
function describeOllamaReadinessIssue(
readiness: OllamaGenerationReadiness,
options?: {
baseUrl?: string
allowManualFallback?: boolean
},
): string {
const endpoint = options?.baseUrl ?? 'http://localhost:11434'
if (readiness.state === 'unreachable') {
return `Could not reach Ollama at ${endpoint}. Start Ollama first, then run /provider again.`
}
if (readiness.state === 'no_models') {
const manualSuffix = options?.allowManualFallback
? ', or enter details manually'
: ''
return `Ollama is running, but no installed models were found. Pull a chat model such as qwen2.5-coder:7b or llama3.1:8b first${manualSuffix}.`
}
if (readiness.state === 'generation_failed') {
const modelHint = readiness.probeModel ?? 'the selected model'
const detailSuffix = readiness.detail
? ` Details: ${readiness.detail}.`
: ''
const manualSuffix = options?.allowManualFallback
? ' You can also enter details manually.'
: ''
return `Ollama is reachable and models are installed, but a generation probe failed for ${modelHint}.${detailSuffix} Run "ollama run ${modelHint}" once and retry.${manualSuffix}`
}
return ''
}
type ProviderChoice = 'auto' | ProviderProfile | 'codex-oauth' | 'clear'
type Step =
@@ -715,6 +749,7 @@ function AutoRecommendationStep({
| {
state: 'openai'
defaultModel: string
reason: string
}
| {
state: 'error'
@@ -728,19 +763,27 @@ function AutoRecommendationStep({
void (async () => {
const defaultModel = getGoalDefaultOpenAIModel(goal)
try {
const ollamaAvailable = await hasLocalOllama()
if (!ollamaAvailable) {
const readiness = await probeOllamaGenerationReadiness()
if (readiness.state !== 'ready') {
if (!cancelled) {
setStatus({ state: 'openai', defaultModel })
setStatus({
state: 'openai',
defaultModel,
reason: describeOllamaReadinessIssue(readiness),
})
}
return
}
const models = await listOllamaModels()
const recommended = recommendOllamaModel(models, goal)
const recommended = recommendOllamaModel(readiness.models, goal)
if (!recommended) {
if (!cancelled) {
setStatus({ state: 'openai', defaultModel })
setStatus({
state: 'openai',
defaultModel,
reason:
'Ollama responded to a generation probe, but no recommended chat model matched this goal.',
})
}
return
}
@@ -796,10 +839,10 @@ function AutoRecommendationStep({
<Dialog title="Auto setup fallback" onCancel={onCancel}>
<Box flexDirection="column" gap={1}>
<Text>
No viable local Ollama chat model was detected. Auto setup can
continue into OpenAI-compatible setup with a default model of{' '}
Auto setup can continue into OpenAI-compatible setup with a default model of{' '}
{status.defaultModel}.
</Text>
<Text dimColor>{status.reason}</Text>
<Select
options={[
{ label: 'Continue to OpenAI-compatible setup', value: 'continue' },
@@ -883,32 +926,19 @@ function OllamaModelStep({
let cancelled = false
void (async () => {
const available = await hasLocalOllama()
if (!available) {
const readiness = await probeOllamaGenerationReadiness()
if (readiness.state !== 'ready') {
if (!cancelled) {
setStatus({
state: 'unavailable',
message:
'Could not reach Ollama at http://localhost:11434. Start Ollama first, then run /provider again.',
message: describeOllamaReadinessIssue(readiness),
})
}
return
}
const models = await listOllamaModels()
if (models.length === 0) {
if (!cancelled) {
setStatus({
state: 'unavailable',
message:
'Ollama is running, but no installed models were found. Pull a chat model such as qwen2.5-coder:7b or llama3.1:8b first.',
})
}
return
}
const ranked = rankOllamaModels(models, 'balanced')
const recommended = recommendOllamaModel(models, 'balanced')
const ranked = rankOllamaModels(readiness.models, 'balanced')
const recommended = recommendOllamaModel(readiness.models, 'balanced')
if (!cancelled) {
setStatus({
state: 'ready',

View File

@@ -1,17 +0,0 @@
/**
* /repomap command - minimal metadata only.
* Implementation is lazy-loaded from repomap.ts to reduce startup time.
*/
import type { Command } from '../../commands.js'
const repomap = {
type: 'local',
name: 'repomap',
description:
'Show or configure the repository structural map (codebase intelligence)',
isHidden: false,
supportsNonInteractive: true,
load: () => import('./repomap.js'),
} satisfies Command
export default repomap

View File

@@ -1,56 +0,0 @@
import { describe, expect, test } from 'bun:test'
import { parseArgs } from './repomap.js'
describe('/repomap argument parsing', () => {
test('defaults to 2048 tokens with no flags', () => {
const result = parseArgs('')
expect(result.tokens).toBe(2048)
expect(result.focus).toEqual([])
expect(result.invalidate).toBe(false)
expect(result.stats).toBe(false)
})
test('parses --tokens flag', () => {
const result = parseArgs('--tokens 4096')
expect(result.tokens).toBe(4096)
})
test('rejects --tokens below 256', () => {
const result = parseArgs('--tokens 100')
expect(result.tokens).toBe(2048) // falls back to default
})
test('rejects --tokens above 16384', () => {
const result = parseArgs('--tokens 20000')
expect(result.tokens).toBe(2048) // falls back to default
})
test('parses --focus flag', () => {
const result = parseArgs('--focus src/tools/')
expect(result.focus).toEqual(['src/tools/'])
})
test('parses multiple --focus flags', () => {
const result = parseArgs('--focus src/tools/ --focus src/context.ts')
expect(result.focus).toEqual(['src/tools/', 'src/context.ts'])
})
test('parses --invalidate flag', () => {
const result = parseArgs('--invalidate')
expect(result.invalidate).toBe(true)
expect(result.stats).toBe(false)
})
test('parses --stats flag', () => {
const result = parseArgs('--stats')
expect(result.stats).toBe(true)
expect(result.invalidate).toBe(false)
})
test('parses combined flags', () => {
const result = parseArgs('--tokens 2048 --focus src/tools/ --invalidate')
expect(result.tokens).toBe(2048)
expect(result.focus).toEqual(['src/tools/'])
expect(result.invalidate).toBe(true)
})
})

View File

@@ -1,93 +0,0 @@
import type { LocalCommandCall } from '../../types/command.js'
import { getCwd } from '../../utils/cwd.js'
/** Parse CLI-style arguments from the command string. */
export function parseArgs(args: string): {
tokens: number
focus: string[]
invalidate: boolean
stats: boolean
} {
const parts = args.trim().split(/\s+/).filter(Boolean)
let tokens = 2048
const focus: string[] = []
let invalidate = false
let stats = false
for (let i = 0; i < parts.length; i++) {
const part = parts[i]!
if (part === '--tokens' && i + 1 < parts.length) {
const n = parseInt(parts[i + 1]!, 10)
if (!isNaN(n) && n >= 256 && n <= 16384) {
tokens = n
}
i++
} else if (part === '--focus' && i + 1 < parts.length) {
focus.push(parts[i + 1]!)
i++
} else if (part === '--invalidate') {
invalidate = true
} else if (part === '--stats') {
stats = true
}
}
return { tokens, focus, invalidate, stats }
}
export const call: LocalCommandCall = async (args) => {
const root = getCwd()
const { tokens, focus, invalidate, stats } = parseArgs(args ?? '')
// Lazy import to avoid loading tree-sitter at startup
const {
buildRepoMap,
invalidateCache,
getCacheStats,
} = await import('../../context/repoMap/index.js')
if (stats) {
const cacheStats = getCacheStats(root)
const lines = [
`Repository map cache stats:`,
` Cache directory: ${cacheStats.cacheDir}`,
` Cache file: ${cacheStats.cacheFile ?? '(none)'}`,
` Cached entries: ${cacheStats.entryCount}`,
` Cache exists: ${cacheStats.exists}`,
]
return { type: 'text', value: lines.join('\n') }
}
if (invalidate) {
invalidateCache(root)
const result = await buildRepoMap({
root,
maxTokens: tokens,
focusFiles: focus.length > 0 ? focus : undefined,
})
return {
type: 'text',
value: [
`Cache invalidated and rebuilt.`,
`Files: ${result.fileCount} ranked (${result.totalFileCount} total) | Tokens: ${result.tokenCount} | Time: ${result.buildTimeMs}ms | Cache hit: ${result.cacheHit}`,
'',
result.map,
].join('\n'),
}
}
const result = await buildRepoMap({
root,
maxTokens: tokens,
focusFiles: focus.length > 0 ? focus : undefined,
})
return {
type: 'text',
value: [
`Repository map: ${result.fileCount} files ranked (${result.totalFileCount} total) | Tokens: ${result.tokenCount} | Time: ${result.buildTimeMs}ms | Cache hit: ${result.cacheHit}`,
'',
result.map,
].join('\n'),
}
}

View File

@@ -112,8 +112,10 @@ test('third-party provider branch opens the first-run provider manager', async (
)
expect(output).toContain('Set up provider')
// Use alphabetically-early sentinels so they remain visible in the
// 13-row test frame after the provider list was sorted A→Z.
expect(output).toContain('Anthropic')
expect(output).toContain('OpenAI')
expect(output).toContain('Ollama')
expect(output).toContain('LM Studio')
expect(output).toContain('Azure OpenAI')
expect(output).toContain('DeepSeek')
expect(output).toContain('Google Gemini')
})

View File

@@ -97,6 +97,47 @@ async function waitForCondition(
throw new Error('Timed out waiting for ProviderManager test condition')
}
// Provider list is sorted alphabetically by label in the preset picker, so
// reaching a given provider takes more keypresses than it used to. Keep the
// target-by-label indirection here so these tests survive future list edits
// without further churn.
//
// Order matches ProviderManager.renderPresetSelection() when
// canUseCodexOAuth === true (default in mocked tests).
const PRESET_ORDER = [
'Alibaba Coding Plan',
'Alibaba Coding Plan (China)',
'Anthropic',
'Atomic Chat',
'Azure OpenAI',
'Codex OAuth',
'DeepSeek',
'Google Gemini',
'Groq',
'LM Studio',
'MiniMax',
'Mistral',
'Moonshot AI',
'NVIDIA NIM',
'Ollama',
'OpenAI',
'OpenRouter',
'Together AI',
'Custom',
] as const
async function navigateToPreset(
stdin: { write: (data: string) => void },
label: (typeof PRESET_ORDER)[number],
): Promise<void> {
const index = PRESET_ORDER.indexOf(label)
if (index < 0) throw new Error(`Unknown preset label: ${label}`)
for (let i = 0; i < index; i++) {
stdin.write('j')
await Bun.sleep(25)
}
}
function createDeferred<T>(): {
promise: Promise<T>
resolve: (value: T) => void
@@ -149,17 +190,21 @@ function mockProviderManagerDependencies(
applySavedProfileToCurrentSession?: (...args: unknown[]) => Promise<string | null>
clearCodexCredentials?: () => { success: boolean; warning?: string }
getProviderProfiles?: () => unknown[]
hasLocalOllama?: () => Promise<boolean>
listOllamaModels?: () => Promise<
Array<{
probeOllamaGenerationReadiness?: () => Promise<{
state: 'ready' | 'unreachable' | 'no_models' | 'generation_failed'
models: Array<
{
name: string
sizeBytes?: number | null
family?: string | null
families?: string[]
parameterSize?: string | null
quantizationLevel?: string | null
}>
}
>
probeModel?: string
detail?: string
}>
codexSyncRead?: () => unknown
codexAsyncRead?: () => Promise<unknown>
updateProviderProfile?: (...args: unknown[]) => unknown
@@ -189,8 +234,12 @@ function mockProviderManagerDependencies(
})
mock.module('../utils/providerDiscovery.js', () => ({
hasLocalOllama: options?.hasLocalOllama ?? (async () => false),
listOllamaModels: options?.listOllamaModels ?? (async () => []),
probeOllamaGenerationReadiness:
options?.probeOllamaGenerationReadiness ??
(async () => ({
state: 'unreachable' as const,
models: [],
})),
}))
mock.module('../utils/githubModelsCredentials.js', () => ({
@@ -455,8 +504,9 @@ test('ProviderManager first-run Ollama preset auto-detects installed models', as
async () => undefined,
{
addProviderProfile,
hasLocalOllama: async () => true,
listOllamaModels: async () => [
probeOllamaGenerationReadiness: async () => ({
state: 'ready',
models: [
{
name: 'gemma4:31b-cloud',
family: 'gemma',
@@ -468,6 +518,8 @@ test('ProviderManager first-run Ollama preset auto-detects installed models', as
parameterSize: '2.5b',
},
],
probeModel: 'gemma4:31b-cloud',
}),
},
)
@@ -480,11 +532,10 @@ test('ProviderManager first-run Ollama preset auto-detects installed models', as
await waitForFrameOutput(
mounted.getOutput,
frame => frame.includes('Set up provider') && frame.includes('Ollama'),
frame => frame.includes('Set up provider'),
)
mounted.stdin.write('j')
await Bun.sleep(50)
await navigateToPreset(mounted.stdin, 'Ollama')
mounted.stdin.write('\r')
const modelFrame = await waitForFrameOutput(
@@ -579,12 +630,7 @@ test('ProviderManager first-run Codex OAuth switches the current session after l
frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
await navigateToPreset(mounted.stdin, 'Codex OAuth')
mounted.stdin.write('\r')
await waitForCondition(() => onDone.mock.calls.length > 0)
@@ -676,12 +722,7 @@ test('ProviderManager first-run Codex OAuth reports next-startup fallback when s
frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
await navigateToPreset(mounted.stdin, 'Codex OAuth')
mounted.stdin.write('\r')
await waitForCondition(() => onDone.mock.calls.length > 0)
@@ -775,12 +816,7 @@ test('ProviderManager does not hijack a manual Codex profile when OAuth credenti
frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
await navigateToPreset(mounted.stdin, 'Codex OAuth')
mounted.stdin.write('\r')
await waitForCondition(() => onDone.mock.calls.length > 0)

View File

@@ -3,12 +3,14 @@ import * as React from 'react'
import { DEFAULT_CODEX_BASE_URL } from '../services/api/providerConfig.js'
import { Box, Text } from '../ink.js'
import { useKeybinding } from '../keybindings/useKeybinding.js'
import { useSetAppState } from '../state/AppState.js'
import type { ProviderProfile } from '../utils/config.js'
import {
clearCodexCredentials,
readCodexCredentialsAsync,
} from '../utils/codexCredentials.js'
import { isBareMode, isEnvTruthy } from '../utils/envUtils.js'
import { getPrimaryModel, hasMultipleModels, parseModelList } from '../utils/providerModels.js'
import {
applySavedProfileToCurrentSession,
buildCodexOAuthProfileEnv,
@@ -35,13 +37,16 @@ import {
readGithubModelsTokenAsync,
} from '../utils/githubModelsCredentials.js'
import {
hasLocalOllama,
listOllamaModels,
probeAtomicChatReadiness,
probeOllamaGenerationReadiness,
type AtomicChatReadiness,
type OllamaGenerationReadiness,
} from '../utils/providerDiscovery.js'
import {
rankOllamaModels,
recommendOllamaModel,
} from '../utils/providerRecommendation.js'
import { redactUrlForDisplay } from '../utils/urlRedaction.js'
import { updateSettingsForSource } from '../utils/settings/settings.js'
import {
type OptionWithDescription,
@@ -66,6 +71,7 @@ type Screen =
| 'menu'
| 'select-preset'
| 'select-ollama-model'
| 'select-atomic-chat-model'
| 'codex-oauth'
| 'form'
| 'select-active'
@@ -86,6 +92,16 @@ type OllamaSelectionState =
}
| { state: 'unavailable'; message: string }
type AtomicChatSelectionState =
| { state: 'idle' }
| { state: 'loading' }
| {
state: 'ready'
options: OptionWithDescription<string>[]
defaultValue?: string
}
| { state: 'unavailable'; message: string }
const FORM_STEPS: Array<{
key: DraftField
label: string
@@ -108,8 +124,8 @@ const FORM_STEPS: Array<{
{
key: 'model',
label: 'Default model',
placeholder: 'e.g. llama3.1:8b',
helpText: 'Model name to use when this provider is active.',
placeholder: 'e.g. llama3.1:8b or glm-4.7, glm-4.7-flash',
helpText: 'Model name(s) to use. Separate multiple with commas; first is default.',
},
{
key: 'apiKey',
@@ -153,7 +169,12 @@ function profileSummary(profile: ProviderProfile, isActive: boolean): string {
const keyInfo = profile.apiKey ? 'key set' : 'no key'
const providerKind =
profile.provider === 'anthropic' ? 'anthropic' : 'openai-compatible'
return `${providerKind} · ${profile.baseUrl} · ${profile.model} · ${keyInfo}${activeSuffix}`
const models = parseModelList(profile.model)
const modelDisplay =
models.length <= 3
? models.join(', ')
: `${models[0]}, ${models[1]} + ${models.length - 2} more`
return `${providerKind} · ${profile.baseUrl} · ${modelDisplay} · ${keyInfo}${activeSuffix}`
}
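`parseModelList` and `getPrimaryModel` (imported from providerModels.js above) are not shown in this diff; a minimal sketch consistent with the comma-separated convention the form help text describes ("Separate multiple with commas; first is default"):

    // Sketch only, matching the documented convention.
    export function parseModelListSketch(model: string): string[] {
      return model.split(',').map(part => part.trim()).filter(Boolean)
    }

    export function getPrimaryModelSketch(model: string): string {
      return parseModelListSketch(model)[0] ?? model
    }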
function getGithubCredentialSourceFromEnv(
@@ -214,6 +235,44 @@ function getGithubProviderSummary(
return `github-models · ${GITHUB_PROVIDER_DEFAULT_BASE_URL} · ${getGithubProviderModel(processEnv)} · ${credentialSummary}${activeSuffix}`
}
function describeAtomicChatSelectionIssue(
readiness: AtomicChatReadiness,
baseUrl: string,
): string {
if (readiness.state === 'unreachable') {
return `Could not reach Atomic Chat at ${redactUrlForDisplay(baseUrl)}. Start the Atomic Chat app first, or enter the endpoint manually.`
}
if (readiness.state === 'no_models') {
return 'Atomic Chat is running, but no models are loaded. Download and load a model inside the Atomic Chat app first, or enter details manually.'
}
return ''
}
function describeOllamaSelectionIssue(
readiness: OllamaGenerationReadiness,
baseUrl: string,
): string {
if (readiness.state === 'unreachable') {
return `Could not reach Ollama at ${redactUrlForDisplay(baseUrl)}. Start Ollama first, or enter the endpoint manually.`
}
if (readiness.state === 'no_models') {
return 'Ollama is running, but no installed models were found. Pull a chat model such as qwen2.5-coder:7b or llama3.1:8b first, or enter details manually.'
}
if (readiness.state === 'generation_failed') {
const modelHint = readiness.probeModel ?? 'the selected model'
const detailSuffix = readiness.detail
? ` Details: ${readiness.detail}.`
: ''
return `Ollama is reachable and models are installed, but a generation probe failed for ${modelHint}.${detailSuffix} Run "ollama run ${modelHint}" once and retry, or enter details manually.`
}
return ''
}
function findCodexOAuthProfile(
profiles: ProviderProfile[],
profileId?: string,
@@ -320,14 +379,17 @@ function CodexOAuthSetup({
}
export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
const setAppState = useSetAppState()
const initialGithubCredentialSource = getGithubCredentialSourceFromEnv()
const initialIsGithubActive = isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB)
const initialHasGithubCredential = initialGithubCredentialSource !== 'none'
const [profiles, setProfiles] = React.useState(() => getProviderProfiles())
const [activeProfileId, setActiveProfileId] = React.useState(
() => getActiveProviderProfile()?.id,
)
// Deferred initialization: useState initializers run synchronously during
// render, so getProviderProfiles() and getActiveProviderProfile() would block
// the UI on first mount (sync file I/O). Use empty initial values and load
// asynchronously in useEffect with queueMicrotask to keep UI responsive.
const [profiles, setProfiles] = React.useState<ProviderProfile[]>([])
const [activeProfileId, setActiveProfileId] = React.useState<string | undefined>()
const [githubProviderAvailable, setGithubProviderAvailable] = React.useState(
() => isGithubProviderAvailable(initialGithubCredentialSource),
)
@@ -353,6 +415,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
const [cursorOffset, setCursorOffset] = React.useState(0)
const [statusMessage, setStatusMessage] = React.useState<string | undefined>()
const [errorMessage, setErrorMessage] = React.useState<string | undefined>()
const [menuFocusValue, setMenuFocusValue] = React.useState<string | undefined>()
const [hasStoredCodexOAuthCredentials, setHasStoredCodexOAuthCredentials] =
React.useState(false)
const [storedCodexOAuthProfileId, setStoredCodexOAuthProfileId] =
@@ -360,11 +423,88 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
const [ollamaSelection, setOllamaSelection] = React.useState<OllamaSelectionState>({
state: 'idle',
})
const [atomicChatSelection, setAtomicChatSelection] =
React.useState<AtomicChatSelectionState>({ state: 'idle' })
// Deferred initialization: useState initializers run synchronously during
// render, so getProviderProfiles() and getActiveProviderProfile() would block
// the UI (sync file I/O). Defer to queueMicrotask after first render.
// In test environment, skip defer to avoid timing issues with mocks.
const [isInitializing, setIsInitializing] = React.useState(
process.env.NODE_ENV !== 'test',
)
const [isActivating, setIsActivating] = React.useState(false)
const isRefreshingRef = React.useRef(false)
React.useEffect(() => {
// Skip deferred initialization in test environment (mocks are synchronous)
if (process.env.NODE_ENV === 'test') {
setProfiles(getProviderProfiles())
setActiveProfileId(getActiveProviderProfile()?.id)
setIsInitializing(false)
return
}
queueMicrotask(() => {
const profilesData = getProviderProfiles()
const activeId = getActiveProviderProfile()?.id
setProfiles(profilesData)
setActiveProfileId(activeId)
setIsInitializing(false)
})
}, [])
const currentStep = FORM_STEPS[formStepIndex] ?? FORM_STEPS[0]
const currentStepKey = currentStep.key
const currentValue = draft[currentStepKey]
// Memoize menu options to prevent unnecessary re-renders when navigating
// the select menu. Without this, each arrow key press creates a new options
// array reference, causing Select to re-render and feel sluggish.
const hasProfiles = profiles.length > 0
const hasSelectableProviders = hasProfiles || githubProviderAvailable
const menuOptions = React.useMemo(
() => [
{
value: 'add',
label: 'Add provider',
description: 'Create a new provider profile',
},
{
value: 'activate',
label: 'Set active provider',
description: 'Switch the active provider profile',
disabled: !hasSelectableProviders,
},
{
value: 'edit',
label: 'Edit provider',
description: 'Update URL, model, or key',
disabled: !hasProfiles,
},
{
value: 'delete',
label: 'Delete provider',
description: 'Remove a provider profile',
disabled: !hasSelectableProviders,
},
...(hasStoredCodexOAuthCredentials
? [
{
value: 'logout-codex-oauth',
label: 'Log out Codex OAuth',
description: 'Clear securely stored Codex OAuth credentials',
},
]
: []),
{
value: 'done',
label: 'Done',
description: 'Return to chat',
},
],
[hasSelectableProviders, hasProfiles, hasStoredCodexOAuthCredentials],
)
const refreshGithubProviderState = React.useCallback((): void => {
const envCredentialSource = getGithubCredentialSourceFromEnv()
const githubActive = isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB)
@@ -440,32 +580,21 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
setOllamaSelection({ state: 'loading' })
void (async () => {
const available = await hasLocalOllama(draft.baseUrl)
if (!available) {
const readiness = await probeOllamaGenerationReadiness({
baseUrl: draft.baseUrl,
})
if (readiness.state !== 'ready') {
if (!cancelled) {
setOllamaSelection({
state: 'unavailable',
message:
'Could not reach Ollama. Start Ollama first, or enter the endpoint manually.',
message: describeOllamaSelectionIssue(readiness, draft.baseUrl),
})
}
return
}
const models = await listOllamaModels(draft.baseUrl)
if (models.length === 0) {
if (!cancelled) {
setOllamaSelection({
state: 'unavailable',
message:
'Ollama is running, but no installed models were found. Pull a chat model such as qwen2.5-coder:7b or llama3.1:8b first, or enter details manually.',
})
}
return
}
const ranked = rankOllamaModels(models, 'balanced')
const recommended = recommendOllamaModel(models, 'balanced')
const ranked = rankOllamaModels(readiness.models, 'balanced')
const recommended = recommendOllamaModel(readiness.models, 'balanced')
if (!cancelled) {
setOllamaSelection({
state: 'ready',
@@ -484,12 +613,61 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
}
}, [draft.baseUrl, screen])
React.useEffect(() => {
if (screen !== 'select-atomic-chat-model') {
return
}
let cancelled = false
setAtomicChatSelection({ state: 'loading' })
void (async () => {
const readiness = await probeAtomicChatReadiness({
baseUrl: draft.baseUrl,
})
if (readiness.state !== 'ready') {
if (!cancelled) {
setAtomicChatSelection({
state: 'unavailable',
message: describeAtomicChatSelectionIssue(readiness, draft.baseUrl),
})
}
return
}
if (!cancelled) {
setAtomicChatSelection({
state: 'ready',
defaultValue: readiness.models[0],
options: readiness.models.map(model => ({
label: model,
value: model,
})),
})
}
})()
return () => {
cancelled = true
}
}, [draft.baseUrl, screen])
function refreshProfiles(): void {
// Defer sync I/O to next microtask to prevent UI freeze.
// getProviderProfiles() and getActiveProviderProfile() read config files
// synchronously, which can block the main thread on Windows (antivirus, disk cache).
// queueMicrotask ensures the current render completes first.
if (isRefreshingRef.current) return
isRefreshingRef.current = true
queueMicrotask(() => {
const nextProfiles = getProviderProfiles()
setProfiles(nextProfiles)
setActiveProfileId(getActiveProviderProfile()?.id)
refreshGithubProviderState()
refreshCodexOAuthCredentialState()
isRefreshingRef.current = false
})
}
function clearStartupProviderOverrideFromUserSettings(): string | null {
@@ -562,30 +740,68 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
async function activateSelectedProvider(profileId: string): Promise<void> {
let providerLabel = 'provider'
// Set loading state before sync I/O to keep UI responsive
setIsActivating(true)
setStatusMessage('Activating provider...')
try {
// Defer sync I/O to next microtask - UI renders loading state first.
// setActiveProviderProfile(), activateGithubProvider(), and
// clearStartupProviderOverrideFromUserSettings() all perform sync file writes
// (saveGlobalConfig, saveProfileFile, updateSettingsForSource) which can
// block the main thread on Windows (antivirus, disk cache, NTFS metadata).
await new Promise<void>(resolve => queueMicrotask(resolve))
if (profileId === GITHUB_PROVIDER_ID) {
providerLabel = GITHUB_PROVIDER_LABEL
const githubError = activateGithubProvider()
if (githubError) {
setErrorMessage(`Could not activate GitHub provider: ${githubError}`)
setScreen('menu')
setIsActivating(false)
returnToMenu()
return
}
setAppState(prev => ({
...prev,
mainLoopModel: GITHUB_PROVIDER_DEFAULT_MODEL,
mainLoopModelForSession: null,
}))
refreshProfiles()
setAppState(prev => ({
...prev,
mainLoopModel: GITHUB_PROVIDER_DEFAULT_MODEL,
}))
setStatusMessage(`Active provider: ${GITHUB_PROVIDER_LABEL}`)
setScreen('menu')
setIsActivating(false)
returnToMenu()
return
}
const active = setActiveProviderProfile(profileId)
if (!active) {
setErrorMessage('Could not change active provider.')
setScreen('menu')
setIsActivating(false)
returnToMenu()
return
}
// Update the session model to the new provider's first model.
// persistActiveProviderProfileModel (called by onChangeAppState) will
// not overwrite the multi-model list because it checks if the model
// is already in the profile's comma-separated model list.
const newModel = getPrimaryModel(active.model)
setAppState(prev => ({
...prev,
mainLoopModel: newModel,
}))
providerLabel = active.name
setAppState(prev => ({
...prev,
mainLoopModel: active.model,
mainLoopModelForSession: null,
}))
const settingsOverrideError =
clearStartupProviderOverrideFromUserSettings()
const isActiveCodexOAuth = isCodexOAuthProfile(
@@ -613,16 +829,23 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
? `Active provider: ${active.name}. Warning: could not clear startup provider override (${settingsOverrideError}).`
: `Active provider: ${active.name}`,
)
setScreen('menu')
setIsActivating(false)
returnToMenu()
} catch (error) {
refreshProfiles()
setStatusMessage(undefined)
setIsActivating(false)
const detail = error instanceof Error ? error.message : String(error)
setErrorMessage(`Could not finish activating ${providerLabel}: ${detail}`)
setScreen('menu')
returnToMenu()
}
}
function returnToMenu(): void {
setMenuFocusValue('done')
setScreen('menu')
}
function closeWithCancelled(message: string): void {
onDone({ action: 'cancelled', message })
}
@@ -735,6 +958,12 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
return
}
if (preset === 'atomic-chat') {
setAtomicChatSelection({ state: 'loading' })
setScreen('select-atomic-chat-model')
return
}
setScreen('form')
}
@@ -773,6 +1002,13 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
}
const isActiveSavedProfile = getActiveProviderProfile()?.id === saved.id
if (isActiveSavedProfile) {
setAppState(prev => ({
...prev,
mainLoopModel: saved.model,
mainLoopModelForSession: null,
}))
}
const settingsOverrideError = isActiveSavedProfile
? clearStartupProviderOverrideFromUserSettings()
: null
@@ -800,7 +1036,87 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
setEditingProfileId(null)
setFormStepIndex(0)
setErrorMessage(undefined)
setScreen('menu')
returnToMenu()
}
function renderAtomicChatSelection(): React.ReactNode {
if (
atomicChatSelection.state === 'loading' ||
atomicChatSelection.state === 'idle'
) {
return (
<Box flexDirection="column" gap={1}>
<Text color="remember" bold>
Checking Atomic Chat
</Text>
<Text dimColor>Looking for loaded Atomic Chat models...</Text>
</Box>
)
}
if (atomicChatSelection.state === 'unavailable') {
return (
<Box flexDirection="column" gap={1}>
<Text color="remember" bold>
Atomic Chat setup
</Text>
<Text dimColor>{atomicChatSelection.message}</Text>
<Select
options={[
{
value: 'manual',
label: 'Enter manually',
description: 'Fill in the base URL and model yourself',
},
{
value: 'back',
label: 'Back',
description: 'Choose another provider preset',
},
]}
onChange={(value: string) => {
if (value === 'manual') {
setFormStepIndex(0)
setCursorOffset(draft.name.length)
setScreen('form')
return
}
setScreen('select-preset')
}}
onCancel={() => setScreen('select-preset')}
visibleOptionCount={2}
/>
</Box>
)
}
return (
<Box flexDirection="column" gap={1}>
<Text color="remember" bold>
Choose an Atomic Chat model
</Text>
<Text dimColor>
Pick one of the models loaded in Atomic Chat to save into a local
provider profile.
</Text>
<Select
options={atomicChatSelection.options}
defaultValue={atomicChatSelection.defaultValue}
defaultFocusValue={atomicChatSelection.defaultValue}
inlineDescriptions
visibleOptionCount={Math.min(8, atomicChatSelection.options.length)}
onChange={(value: string) => {
const nextDraft = {
...draft,
model: value,
}
setDraft(nextDraft)
persistDraft(nextDraft)
}}
onCancel={() => setScreen('select-preset')}
/>
</Box>
)
}
function renderOllamaSelection(): React.ReactNode {
@@ -923,7 +1239,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
return
}
setScreen('menu')
returnToMenu()
}
useKeybinding('confirm:no', handleBackFromForm, {
@@ -933,21 +1249,35 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
function renderPresetSelection(): React.ReactNode {
const canUseCodexOAuth = !isBareMode()
// Providers sorted alphabetically by label. `Custom` is pinned to the end
// because it's the catch-all / escape hatch — users scanning the list
// should always find known providers first. `Skip for now` (first-run
// only) comes last, after Custom.
const options = [
{
value: 'dashscope-intl',
label: 'Alibaba Coding Plan',
description: 'Alibaba DashScope International endpoint',
},
{
value: 'dashscope-cn',
label: 'Alibaba Coding Plan (China)',
description: 'Alibaba DashScope China endpoint',
},
{
value: 'anthropic',
label: 'Anthropic',
description: 'Native Claude API (x-api-key auth)',
},
{
value: 'ollama',
label: 'Ollama',
description: 'Local or remote Ollama endpoint',
value: 'atomic-chat',
label: 'Atomic Chat',
description: 'Local Atomic Chat endpoint',
},
{
value: 'openai',
label: 'OpenAI',
description: 'OpenAI API with API key',
value: 'azure-openai',
label: 'Azure OpenAI',
description: 'Azure OpenAI endpoint (model=deployment name)',
},
...(canUseCodexOAuth
? [
@@ -959,11 +1289,6 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
},
]
: []),
{
value: 'moonshotai',
label: 'Moonshot AI',
description: 'Kimi OpenAI-compatible endpoint',
},
{
value: 'deepseek',
label: 'DeepSeek',
@@ -974,25 +1299,45 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
label: 'Google Gemini',
description: 'Gemini OpenAI-compatible endpoint',
},
{
value: 'together',
label: 'Together AI',
description: 'Together chat/completions endpoint',
},
{
value: 'groq',
label: 'Groq',
description: 'Groq OpenAI-compatible endpoint',
},
{
value: 'lmstudio',
label: 'LM Studio',
description: 'Local LM Studio endpoint',
},
{
value: 'minimax',
label: 'MiniMax',
description: 'MiniMax API endpoint',
},
{
value: 'mistral',
label: 'Mistral',
description: 'Mistral OpenAI-compatible endpoint',
},
{
value: 'azure-openai',
label: 'Azure OpenAI',
description: 'Azure OpenAI endpoint (model=deployment name)',
value: 'moonshotai',
label: 'Moonshot AI',
description: 'Kimi OpenAI-compatible endpoint',
},
{
value: 'nvidia-nim',
label: 'NVIDIA NIM',
description: 'NVIDIA NIM endpoint',
},
{
value: 'ollama',
label: 'Ollama',
description: 'Local or remote Ollama endpoint',
},
{
value: 'openai',
label: 'OpenAI',
description: 'OpenAI API with API key',
},
{
value: 'openrouter',
@@ -1000,9 +1345,9 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
description: 'OpenRouter OpenAI-compatible endpoint',
},
{
value: 'lmstudio',
label: 'LM Studio',
description: 'Local LM Studio endpoint',
value: 'together',
label: 'Together AI',
description: 'Together chat/completions endpoint',
},
{
value: 'custom',
@@ -1046,7 +1391,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
closeWithCancelled('Provider setup skipped')
return
}
setScreen('menu')
returnToMenu()
}}
visibleOptionCount={Math.min(13, options.length)}
/>
@@ -1084,6 +1429,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
focus={true}
showCursor={true}
placeholder={`${currentStep.placeholder}${figures.ellipsis}`}
mask={currentStepKey === 'apiKey' ? '*' : undefined}
columns={80}
cursorOffset={cursorOffset}
onChangeCursorOffset={setCursorOffset}
@@ -1098,49 +1444,10 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
}
function renderMenu(): React.ReactNode {
// Use memoized menuOptions from component scope
const hasProfiles = profiles.length > 0
const hasSelectableProviders = hasProfiles || githubProviderAvailable
const options = [
{
value: 'add',
label: 'Add provider',
description: 'Create a new provider profile',
},
{
value: 'activate',
label: 'Set active provider',
description: 'Switch the active provider profile',
disabled: !hasSelectableProviders,
},
{
value: 'edit',
label: 'Edit provider',
description: 'Update URL, model, or key',
disabled: !hasProfiles,
},
{
value: 'delete',
label: 'Delete provider',
description: 'Remove a provider profile',
disabled: !hasSelectableProviders,
},
...(hasStoredCodexOAuthCredentials
? [
{
value: 'logout-codex-oauth',
label: 'Log out Codex OAuth',
description: 'Clear securely stored Codex OAuth credentials',
},
]
: []),
{
value: 'done',
label: 'Done',
description: 'Return to chat',
},
]
return (
<Box flexDirection="column" gap={1}>
<Text color="remember" bold>
@@ -1177,7 +1484,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
)}
</Box>
<Select
options={options}
options={menuOptions}
onChange={(value: string) => {
setErrorMessage(undefined)
switch (value) {
@@ -1190,7 +1497,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
}
break
case 'edit':
if (profiles.length > 0) {
if (hasProfiles) {
setScreen('select-edit')
}
break
@@ -1246,7 +1553,8 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
}
}}
onCancel={() => closeWithCancelled('Provider manager closed')}
visibleOptionCount={options.length}
defaultFocusValue={menuFocusValue}
visibleOptionCount={menuOptions.length}
/>
</Box>
)
@@ -1293,8 +1601,8 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
description: 'Return to provider manager',
},
]}
onChange={() => setScreen('menu')}
onCancel={() => setScreen('menu')}
onChange={() => returnToMenu()}
onCancel={() => returnToMenu()}
visibleOptionCount={1}
/>
</Box>
@@ -1309,7 +1617,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
<Select
options={selectOptions}
onChange={onSelect}
onCancel={() => setScreen('menu')}
onCancel={() => returnToMenu()}
visibleOptionCount={Math.min(10, Math.max(2, selectOptions.length))}
/>
</Box>
@@ -1325,6 +1633,9 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
case 'select-ollama-model':
content = renderOllamaSelection()
break
case 'select-atomic-chat-model':
content = renderAtomicChatSelection()
break
case 'codex-oauth':
content = (
<CodexOAuthSetup
@@ -1350,7 +1661,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
setErrorMessage(
'Codex OAuth login finished, but the provider profile could not be saved.',
)
setScreen('menu')
returnToMenu()
return
}
@@ -1362,7 +1673,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
setErrorMessage(
'Codex OAuth login finished, but the provider could not be set as the startup provider.',
)
setScreen('menu')
returnToMenu()
return
}
@@ -1396,7 +1707,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
setStatusMessage(message)
setErrorMessage(undefined)
setScreen('menu')
returnToMenu()
}}
/>
)
@@ -1436,7 +1747,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
refreshProfiles()
setStatusMessage('GitHub provider deleted')
}
setScreen('menu')
returnToMenu()
return
}
@@ -1471,7 +1782,7 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
: 'Provider deleted',
)
}
setScreen('menu')
returnToMenu()
},
{ includeGithub: true },
)
@@ -1482,5 +1793,21 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
break
}
return <Pane color="permission">{content}</Pane>
return (
<Pane color="permission">
{isInitializing ? (
<Box flexDirection="column" gap={1}>
<Text color="remember" bold>Loading providers...</Text>
<Text dimColor>Reading provider profiles from disk.</Text>
</Box>
) : isActivating ? (
<Box flexDirection="column" gap={1}>
<Text color="remember" bold>Activating provider...</Text>
<Text dimColor>Please wait while the provider is being configured.</Text>
</Box>
) : (
content
)}
</Pane>
)
}


@@ -281,6 +281,24 @@ export function Config({
enabled: autoCompactEnabled
});
}
}, {
id: 'toolHistoryCompressionEnabled',
label: 'Tool history compression',
value: globalConfig.toolHistoryCompressionEnabled,
type: 'boolean' as const,
onChange(toolHistoryCompressionEnabled: boolean) {
saveGlobalConfig(current => ({
...current,
toolHistoryCompressionEnabled
}));
setGlobalConfig({
...getGlobalConfig(),
toolHistoryCompressionEnabled
});
logEvent('tengu_tool_history_compression_setting_changed', {
enabled: toolHistoryCompressionEnabled
});
}
}, {
id: 'spinnerTipsEnabled',
label: 'Show tips',
@@ -1158,6 +1176,9 @@ export function Config({
if (globalConfig.autoCompactEnabled !== initialConfig.current.autoCompactEnabled) {
formattedChanges.push(`${globalConfig.autoCompactEnabled ? 'Enabled' : 'Disabled'} auto-compact`);
}
if (globalConfig.toolHistoryCompressionEnabled !== initialConfig.current.toolHistoryCompressionEnabled) {
formattedChanges.push(`${globalConfig.toolHistoryCompressionEnabled ? 'Enabled' : 'Disabled'} tool history compression`);
}
if (globalConfig.respectGitignore !== initialConfig.current.respectGitignore) {
formattedChanges.push(`${globalConfig.respectGitignore ? 'Enabled' : 'Disabled'} respect .gitignore in file picker`);
}


@@ -0,0 +1,158 @@
import { afterEach, beforeEach, describe, expect, test } from 'bun:test'
import { detectProvider } from './StartupScreen.js'
const ENV_KEYS = [
'CLAUDE_CODE_USE_OPENAI',
'CLAUDE_CODE_USE_GEMINI',
'CLAUDE_CODE_USE_GITHUB',
'CLAUDE_CODE_USE_BEDROCK',
'CLAUDE_CODE_USE_VERTEX',
'CLAUDE_CODE_USE_MISTRAL',
'OPENAI_BASE_URL',
'OPENAI_API_KEY',
'OPENAI_MODEL',
'GEMINI_MODEL',
'MISTRAL_MODEL',
'ANTHROPIC_MODEL',
'NVIDIA_NIM',
'MINIMAX_API_KEY',
]
const originalEnv: Record<string, string | undefined> = {}
beforeEach(() => {
for (const key of ENV_KEYS) {
originalEnv[key] = process.env[key]
delete process.env[key]
}
})
afterEach(() => {
for (const key of ENV_KEYS) {
if (originalEnv[key] === undefined) {
delete process.env[key]
} else {
process.env[key] = originalEnv[key]
}
}
})
function setupOpenAIMode(baseUrl: string, model: string): void {
process.env.CLAUDE_CODE_USE_OPENAI = '1'
process.env.OPENAI_BASE_URL = baseUrl
process.env.OPENAI_MODEL = model
process.env.OPENAI_API_KEY = 'test-key'
}
// --- Issue #855: aggregator URL must win over vendor-prefixed model name ---
describe('detectProvider — aggregator URL authoritative over model-name substring (#855)', () => {
test('OpenRouter + deepseek/deepseek-chat labels as OpenRouter', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'deepseek/deepseek-chat')
expect(detectProvider().name).toBe('OpenRouter')
})
test('OpenRouter + moonshotai/kimi-k2 labels as OpenRouter', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'moonshotai/kimi-k2')
expect(detectProvider().name).toBe('OpenRouter')
})
test('OpenRouter + mistralai/mistral-large labels as OpenRouter', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'mistralai/mistral-large')
expect(detectProvider().name).toBe('OpenRouter')
})
test('OpenRouter + meta-llama/llama-3.3 labels as OpenRouter', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'meta-llama/llama-3.3-70b-instruct')
expect(detectProvider().name).toBe('OpenRouter')
})
test('Together + deepseek-ai/DeepSeek-V3 labels as Together AI', () => {
setupOpenAIMode('https://api.together.xyz/v1', 'deepseek-ai/DeepSeek-V3')
expect(detectProvider().name).toBe('Together AI')
})
test('Together + meta-llama/Llama-3.3 labels as Together AI', () => {
setupOpenAIMode('https://api.together.xyz/v1', 'meta-llama/Llama-3.3-70B-Instruct-Turbo')
expect(detectProvider().name).toBe('Together AI')
})
test('Groq + deepseek-r1-distill-llama-70b labels as Groq', () => {
setupOpenAIMode('https://api.groq.com/openai/v1', 'deepseek-r1-distill-llama-70b')
expect(detectProvider().name).toBe('Groq')
})
test('Groq + llama-3.3-70b-versatile labels as Groq', () => {
setupOpenAIMode('https://api.groq.com/openai/v1', 'llama-3.3-70b-versatile')
expect(detectProvider().name).toBe('Groq')
})
test('Azure + any deepseek deployment labels as Azure OpenAI', () => {
setupOpenAIMode('https://my-resource.openai.azure.com/', 'deepseek-chat')
expect(detectProvider().name).toBe('Azure OpenAI')
})
})
// --- Direct vendor endpoints still label correctly (regression) ---
describe('detectProvider — direct vendor endpoints', () => {
test('api.deepseek.com labels as DeepSeek', () => {
setupOpenAIMode('https://api.deepseek.com/v1', 'deepseek-chat')
expect(detectProvider().name).toBe('DeepSeek')
})
test('api.moonshot.cn labels as Moonshot (Kimi)', () => {
setupOpenAIMode('https://api.moonshot.cn/v1', 'moonshot-v1-8k')
expect(detectProvider().name).toBe('Moonshot (Kimi)')
})
test('api.mistral.ai labels as Mistral', () => {
setupOpenAIMode('https://api.mistral.ai/v1', 'mistral-large-latest')
expect(detectProvider().name).toBe('Mistral')
})
test('default OpenAI URL + gpt-4o labels as OpenAI', () => {
setupOpenAIMode('https://api.openai.com/v1', 'gpt-4o')
expect(detectProvider().name).toBe('OpenAI')
})
})
// --- rawModel fallback for generic/custom endpoints ---
describe('detectProvider — rawModel fallback when URL is generic', () => {
test('custom proxy + deepseek-chat falls back to DeepSeek', () => {
setupOpenAIMode('https://my-proxy.internal/v1', 'deepseek-chat')
expect(detectProvider().name).toBe('DeepSeek')
})
test('custom proxy + kimi-k2 falls back to Moonshot (Kimi)', () => {
setupOpenAIMode('https://my-proxy.internal/v1', 'kimi-k2-instruct')
expect(detectProvider().name).toBe('Moonshot (Kimi)')
})
test('custom proxy + llama-3.3 falls back to Meta Llama', () => {
setupOpenAIMode('https://my-proxy.internal/v1', 'llama-3.3-70b')
expect(detectProvider().name).toBe('Meta Llama')
})
test('custom proxy + mistral-large falls back to Mistral', () => {
setupOpenAIMode('https://my-proxy.internal/v1', 'mistral-large-latest')
expect(detectProvider().name).toBe('Mistral')
})
})
// --- Explicit env flags win over URL heuristics ---
describe('detectProvider — explicit dedicated-provider env flags', () => {
test('NVIDIA_NIM=1 overrides aggregator URL', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'some-nim-model')
process.env.NVIDIA_NIM = '1'
expect(detectProvider().name).toBe('NVIDIA NIM')
})
test('MINIMAX_API_KEY overrides aggregator URL', () => {
setupOpenAIMode('https://openrouter.ai/api/v1', 'any-model')
process.env.MINIMAX_API_KEY = 'test-key'
expect(detectProvider().name).toBe('MiniMax')
})
})


@@ -83,7 +83,7 @@ const LOGO_CLAUDE = [
// ─── Provider detection ───────────────────────────────────────────────────────
function detectProvider(): { name: string; model: string; baseUrl: string; isLocal: boolean } {
export function detectProvider(): { name: string; model: string; baseUrl: string; isLocal: boolean } {
const useGemini = process.env.CLAUDE_CODE_USE_GEMINI === '1' || process.env.CLAUDE_CODE_USE_GEMINI === 'true'
const useGithub = process.env.CLAUDE_CODE_USE_GITHUB === '1' || process.env.CLAUDE_CODE_USE_GITHUB === 'true'
const useOpenAI = process.env.CLAUDE_CODE_USE_OPENAI === '1' || process.env.CLAUDE_CODE_USE_OPENAI === 'true'
@@ -117,15 +117,32 @@ function detectProvider(): { name: string; model: string; baseUrl: string; isLoc
const baseUrl = resolvedRequest.baseUrl
const isLocal = isLocalProviderUrl(baseUrl)
let name = 'OpenAI'
// Override to Codex when resolved endpoint is Codex
if (resolvedRequest.transport === 'codex_responses' || baseUrl.includes('chatgpt.com/backend-api/codex')) {
// Explicit dedicated-provider env flags win.
if (process.env.NVIDIA_NIM) name = 'NVIDIA NIM'
else if (process.env.MINIMAX_API_KEY) name = 'MiniMax'
else if (
resolvedRequest.transport === 'codex_responses' ||
baseUrl.includes('chatgpt.com/backend-api/codex')
)
name = 'Codex'
} else if (/deepseek/i.test(baseUrl) || /deepseek/i.test(rawModel)) name = 'DeepSeek'
// Base URL is authoritative — must precede rawModel checks so aggregators
// (OpenRouter/Together/Groq) aren't mislabelled as DeepSeek/Kimi/etc.
// when routed to models whose IDs contain a vendor prefix. See issue #855.
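// Illustrative case (mirrored by the unit tests): baseUrl
// https://openrouter.ai/api/v1 with model deepseek/deepseek-chat must
// label as OpenRouter, not DeepSeek.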
else if (/openrouter/i.test(baseUrl)) name = 'OpenRouter'
else if (/together/i.test(baseUrl)) name = 'Together AI'
else if (/groq/i.test(baseUrl)) name = 'Groq'
else if (/mistral/i.test(baseUrl) || /mistral/i.test(rawModel)) name = 'Mistral'
else if (/azure/i.test(baseUrl)) name = 'Azure OpenAI'
else if (/nvidia/i.test(baseUrl)) name = 'NVIDIA NIM'
else if (/minimax/i.test(baseUrl)) name = 'MiniMax'
else if (/moonshot/i.test(baseUrl)) name = 'Moonshot (Kimi)'
else if (/deepseek/i.test(baseUrl)) name = 'DeepSeek'
else if (/mistral/i.test(baseUrl)) name = 'Mistral'
// rawModel fallback — fires only when base URL is generic/custom.
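// Illustrative case (mirrored by the unit tests): a generic
// https://my-proxy.internal/v1 base URL with model kimi-k2-instruct
// falls through to the Moonshot (Kimi) label below.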
else if (/nvidia/i.test(rawModel)) name = 'NVIDIA NIM'
else if (/minimax/i.test(rawModel)) name = 'MiniMax'
else if (/kimi/i.test(rawModel)) name = 'Moonshot (Kimi)'
else if (/deepseek/i.test(rawModel)) name = 'DeepSeek'
else if (/mistral/i.test(rawModel)) name = 'Mistral'
else if (/llama/i.test(rawModel)) name = 'Meta Llama'
else if (isLocal) name = getLocalOpenAICompatibleProviderLabel(baseUrl)
@@ -142,7 +159,9 @@ function detectProvider(): { name: string; model: string; baseUrl: string; isLoc
const settings = getSettings_DEPRECATED() || {}
const modelSetting = settings.model || process.env.ANTHROPIC_MODEL || process.env.CLAUDE_MODEL || 'claude-sonnet-4-6'
const resolvedModel = parseUserSpecifiedModel(modelSetting)
return { name: 'Anthropic', model: resolvedModel, baseUrl: 'https://api.anthropic.com', isLocal: false }
const baseUrl = process.env.ANTHROPIC_BASE_URL ?? 'https://api.anthropic.com'
const isLocal = isLocalProviderUrl(baseUrl)
return { name: 'Anthropic', model: resolvedModel, baseUrl, isLocal }
}
// ─── Box drawing ──────────────────────────────────────────────────────────────


@@ -6,6 +6,7 @@ import stripAnsi from 'strip-ansi'
import { createRoot } from '../ink.js'
import { AppStateProvider } from '../state/AppState.js'
import { maskTextWithVisibleEdges } from '../utils/Cursor.js'
import TextInput from './TextInput.js'
import VimTextInput from './VimTextInput.js'
@@ -199,6 +200,13 @@ test('TextInput renders typed characters before delayed parent value commits', a
expect(output).not.toContain('Type here...')
})
test('maskTextWithVisibleEdges preserves only the first and last three chars', () => {
expect(maskTextWithVisibleEdges('sk-secret-12345678', '*')).toBe(
'sk-************678',
)
expect(maskTextWithVisibleEdges('abcdef', '*')).toBe('******')
})
test('VimTextInput preserves rapid typed characters before delayed parent value commits', async () => {
const { stdout, stdin, getOutput } = createTestStreams()
const root = await createRoot({


@@ -53,17 +53,20 @@ describe('getProjectMemoryPathForSelector', () => {
})
test('defaults to a new AGENTS.md in the current cwd when no project file is loaded', () => {
expect(getProjectMemoryPathForSelector([], '/repo/packages/app')).toBe(
'/repo/packages/app/AGENTS.md',
const cwd = join('/repo', 'packages', 'app')
expect(getProjectMemoryPathForSelector([], cwd)).toBe(
join(cwd, 'AGENTS.md'),
)
})
test('ignores loaded project instruction files outside the current cwd ancestry', () => {
const outsideRepoPath = join('/other-worktree', 'AGENTS.md')
const cwd = join('/repo', 'packages', 'app')
expect(
getProjectMemoryPathForSelector(
[projectFile('/other-worktree/AGENTS.md')],
'/repo/packages/app',
[projectFile(outsideRepoPath)],
cwd,
),
).toBe('/repo/packages/app/AGENTS.md')
).toBe(join(cwd, 'AGENTS.md'))
})
})


@@ -1,5 +1,16 @@
import { afterEach, expect, test } from 'bun:test'
// MACRO is replaced at build time by Bun.define but not in test mode.
// Define it globally so tests that import modules using MACRO don't crash.
;(globalThis as Record<string, unknown>).MACRO = {
VERSION: '99.0.0',
DISPLAY_VERSION: '0.0.0-test',
BUILD_TIME: new Date().toISOString(),
ISSUES_EXPLAINER: 'report the issue at https://github.com/anthropics/claude-code/issues',
PACKAGE_URL: '@gitlawb/openclaude',
NATIVE_PACKAGE_URL: undefined,
}
import { getSystemPrompt, DEFAULT_AGENT_PROMPT } from './prompts.js'
import { CLI_SYSPROMPT_PREFIXES, getCLISyspromptPrefix } from './system.js'
import { CLAUDE_CODE_GUIDE_AGENT } from '../tools/AgentTool/built-in/claudeCodeGuideAgent.js'


@@ -823,6 +823,11 @@ function getFunctionResultClearingSection(model: string): string | null {
return null
}
const config = getCachedMCConfigForFRC()
if (!config) {
// External/stub builds return null from getCachedMCConfig — abort the
// section rather than trying to read .supportedModels off null.
return null
}
const isModelSupported = config.supportedModels?.some(pattern =>
model.includes(pattern),
)


@@ -1,64 +0,0 @@
import { afterEach, describe, expect, test } from 'bun:test'
afterEach(() => {
delete process.env.REPO_MAP
})
describe('getRepoMapContext', () => {
test('returns null when REPO_MAP env flag is off (default)', async () => {
const { getRepoMapContext } = await import('./context.js')
const result = await getRepoMapContext()
expect(result).toBeNull()
})
test('buildRepoMap produces valid output for context injection', async () => {
process.env.REPO_MAP = '1'
const { mkdtempSync, writeFileSync, rmSync } = await import('fs')
const { tmpdir } = await import('os')
const { join } = await import('path')
const { buildRepoMap } = await import('./context/repoMap/index.js')
const tempDir = mkdtempSync(join(tmpdir(), 'repomap-ctx-'))
try {
writeFileSync(
join(tempDir, 'main.ts'),
'export function main(): void { console.log("hello") }\n',
)
writeFileSync(
join(tempDir, 'utils.ts'),
'import { main } from "./main"\nexport function helper(): void { main() }\n',
)
const result = await buildRepoMap({
root: tempDir,
maxTokens: 1024,
})
// Valid map that could be injected
expect(result.map.length).toBeGreaterThan(0)
expect(result.tokenCount).toBeGreaterThan(0)
expect(result.tokenCount).toBeLessThanOrEqual(1024)
expect(typeof result.cacheHit).toBe('boolean')
} finally {
rmSync(tempDir, { recursive: true, force: true })
const { invalidateCache } = await import('./context/repoMap/index.js')
invalidateCache(tempDir)
}
})
test('getSystemContext does not include repoMap key when flag is off', async () => {
const { getSystemContext } = await import('./context.js')
const result = await getSystemContext()
expect('repoMap' in result).toBe(false)
})
test('getSystemContext includes repoMap key when REPO_MAP env flag is on', async () => {
process.env.REPO_MAP = '1'
const { getSystemContext, getRepoMapContext } = await import('./context.js')
getRepoMapContext.cache.clear?.()
getSystemContext.cache.clear?.()
const result = await getSystemContext()
expect(typeof result.repoMap).toBe('string')
expect(result.repoMap!.length).toBeGreaterThan(0)
})
})


@@ -31,7 +31,6 @@ export function setSystemPromptInjection(value: string | null): void {
// Clear context caches immediately when injection changes
getUserContext.cache.clear?.()
getSystemContext.cache.clear?.()
getRepoMapContext.cache.clear?.()
}
export const getGitStatus = memoize(async (): Promise<string | null> => {
@@ -111,35 +110,6 @@ export const getGitStatus = memoize(async (): Promise<string | null> => {
}
})
export const getRepoMapContext = memoize(
async (): Promise<string | null> => {
const runtimeEnabled = isEnvTruthy(process.env.REPO_MAP)
if (!runtimeEnabled) return null
if (isBareMode()) return null
if (isEnvTruthy(process.env.CLAUDE_CODE_REMOTE)) return null
try {
const startTime = Date.now()
logForDiagnosticsNoPII('info', 'repo_map_started')
const { buildRepoMap } = await import('./context/repoMap/index.js')
const result = await buildRepoMap({ maxTokens: 1024 })
logForDiagnosticsNoPII('info', 'repo_map_completed', {
duration_ms: Date.now() - startTime,
token_count: result.tokenCount,
file_count: result.fileCount,
cache_hit: result.cacheHit,
})
if (!result.map || result.map.length === 0) return null
return `This is a structural map of the repository, ranked by importance. Use it to understand the codebase architecture.\n\n${result.map}`
} catch (err) {
logForDiagnosticsNoPII('warn', 'repo_map_failed', {
error: String(err),
})
return null
}
},
)
/**
* This context is prepended to each conversation, and cached for the duration of the conversation.
*/
@@ -157,8 +127,6 @@ export const getSystemContext = memoize(
? null
: await getGitStatus()
const repoMap = await getRepoMapContext()
// Include system prompt injection if set (for cache breaking, internal-only)
const injection = feature('BREAK_CACHE_COMMAND')
? getSystemPromptInjection()
@@ -167,13 +135,11 @@ export const getSystemContext = memoize(
logForDiagnosticsNoPII('info', 'system_context_completed', {
duration_ms: Date.now() - startTime,
has_git_status: gitStatus !== null,
has_repo_map: repoMap !== null,
has_injection: injection !== null,
})
return {
...(gitStatus && { gitStatus }),
...(repoMap && { repoMap }),
...(feature('BREAK_CACHE_COMMAND') && injection
? {
cacheBreaker: `[CACHE_BREAKER: ${injection}]`,


@@ -1,29 +0,0 @@
// fileA — imports from fileB and fileC
import { CacheLayer, buildCache } from './fileB'
import { createStore, type StoreConfig } from './fileC'
export class AppController {
private cache: CacheLayer
private config: StoreConfig
constructor(config: StoreConfig) {
this.cache = buildCache()
this.config = config
}
initialize(): void {
const store = createStore()
this.cache.cacheSet('primary', store)
}
getFromCache(key: string): unknown {
return this.cache.cacheGet(key)
}
}
export function startApp(config: StoreConfig): AppController {
const app = new AppController(config)
app.initialize()
return app
}


@@ -1,23 +0,0 @@
// fileB — imports from fileC
import { DataStore, createStore } from './fileC'
export class CacheLayer {
private store: DataStore
constructor() {
this.store = createStore()
}
cacheGet(key: string): unknown | undefined {
return this.store.lookup(key)
}
cacheSet(key: string, value: unknown): void {
this.store.add(key, value)
}
}
export function buildCache(): CacheLayer {
return new CacheLayer()
}


@@ -1,22 +0,0 @@
// fileC — the most imported module (imported by fileA and fileB)
export class DataStore {
private items: Map<string, unknown> = new Map()
add(key: string, value: unknown): void {
this.items.set(key, value)
}
lookup(key: string): unknown | undefined {
return this.items.get(key)
}
}
export function createStore(): DataStore {
return new DataStore()
}
export interface StoreConfig {
maxSize: number
ttl: number
}


@@ -1,9 +0,0 @@
// fileD — imports from fileA
import { AppController, startApp } from './fileA'
export function runApp(): void {
const controller: AppController = startApp({ maxSize: 100, ttl: 3600 })
const result = controller.getFromCache('test')
console.log(result)
}


@@ -1,25 +0,0 @@
// fileE — isolated, no imports from other fixture files
export interface Logger {
log(message: string): void
warn(message: string): void
error(message: string): void
}
export class ConsoleLogger implements Logger {
log(message: string): void {
console.log(`[LOG] ${message}`)
}
warn(message: string): void {
console.warn(`[WARN] ${message}`)
}
error(message: string): void {
console.error(`[ERROR] ${message}`)
}
}
export function createLogger(): Logger {
return new ConsoleLogger()
}


@@ -1,139 +0,0 @@
import { createHash } from 'crypto'
import {
existsSync,
mkdirSync,
readFileSync,
statSync,
writeFileSync,
} from 'fs'
import { homedir } from 'os'
import { join } from 'path'
import type { CacheData, CacheEntry, CacheStats, Tag } from './types.js'
const CACHE_VERSION = 1
const CACHE_DIR = join(homedir(), '.openclaude', 'repomap-cache')
function getCacheFilePath(root: string): string {
const hash = createHash('sha1').update(root).digest('hex')
return join(CACHE_DIR, `${hash}.json`)
}
function ensureCacheDir(): void {
if (!existsSync(CACHE_DIR)) {
mkdirSync(CACHE_DIR, { recursive: true })
}
}
/** Load cache from disk. Returns empty cache if not found or invalid. */
export function loadCache(root: string): CacheData {
const path = getCacheFilePath(root)
try {
const raw = readFileSync(path, 'utf-8')
const data = JSON.parse(raw) as CacheData
if (data.version !== CACHE_VERSION) {
return { version: CACHE_VERSION, entries: {} }
}
return data
} catch {
return { version: CACHE_VERSION, entries: {} }
}
}
/** Save cache to disk. */
export function saveCache(root: string, cache: CacheData): void {
ensureCacheDir()
const path = getCacheFilePath(root)
writeFileSync(path, JSON.stringify(cache), 'utf-8')
}
/**
* Check if a file's cached entry is still valid based on mtime and size.
* Returns the cached tags if valid, null otherwise.
*/
export function getCachedTags(
cache: CacheData,
filePath: string,
root: string,
): Tag[] | null {
const entry = cache.entries[filePath]
if (!entry) return null
try {
const absolutePath = join(root, filePath)
const stat = statSync(absolutePath)
if (stat.mtimeMs === entry.mtimeMs && stat.size === entry.size) {
return entry.tags
}
} catch {
// File may have been deleted
}
return null
}
/** Update the cache entry for a file. */
export function setCachedTags(
cache: CacheData,
filePath: string,
root: string,
tags: Tag[],
): void {
try {
const absolutePath = join(root, filePath)
const stat = statSync(absolutePath)
cache.entries[filePath] = {
tags,
mtimeMs: stat.mtimeMs,
size: stat.size,
}
} catch {
// If we can't stat, don't cache
}
}
/**
* Compute a hash of the inputs that affect the rendered map.
* Used to cache the final rendered output.
*/
export function computeMapHash(
files: string[],
maxTokens: number,
focusFiles: string[],
): string {
const sorted = [...files].sort()
const input = JSON.stringify({ files: sorted, maxTokens, focusFiles: [...focusFiles].sort() })
return createHash('sha1').update(input).digest('hex')
}
/** Get cache statistics. */
export function getCacheStats(root: string): CacheStats {
const cacheFile = getCacheFilePath(root)
const exists = existsSync(cacheFile)
let entryCount = 0
if (exists) {
try {
const data = JSON.parse(readFileSync(cacheFile, 'utf-8')) as CacheData
entryCount = Object.keys(data.entries).length
} catch {
// corrupted cache
}
}
return {
cacheDir: CACHE_DIR,
cacheFile: exists ? cacheFile : null,
entryCount,
exists,
}
}
/** Delete the cache for a repo root. */
export function invalidateCache(root: string): void {
const path = getCacheFilePath(root)
try {
const { unlinkSync } = require('fs')
unlinkSync(path)
} catch {
// File may not exist
}
}


@@ -1,109 +0,0 @@
import { execFile } from 'child_process'
import { readdirSync } from 'fs'
import { join, relative } from 'path'
import type { SupportedLanguage } from './types.js'
const SUPPORTED_EXTENSIONS: Record<string, SupportedLanguage> = {
'.ts': 'typescript',
'.tsx': 'typescript',
'.js': 'javascript',
'.jsx': 'javascript',
'.mjs': 'javascript',
'.cjs': 'javascript',
'.py': 'python',
}
const EXCLUDED_DIRS = new Set([
'node_modules',
'dist',
'.git',
'.hg',
'.svn',
'build',
'out',
'coverage',
'__pycache__',
'.next',
'.nuxt',
'vendor',
'.worktrees',
])
const EXCLUDED_FILES = new Set([
'bun.lock',
'bun.lockb',
'package-lock.json',
'yarn.lock',
'pnpm-lock.yaml',
])
export function getLanguageForFile(filePath: string): SupportedLanguage | null {
const ext = filePath.substring(filePath.lastIndexOf('.'))
return SUPPORTED_EXTENSIONS[ext] ?? null
}
export function isSupportedFile(filePath: string): boolean {
return getLanguageForFile(filePath) !== null
}
/** List files using git ls-files. Returns relative paths. */
function gitLsFiles(root: string): Promise<string[]> {
return new Promise((resolve, reject) => {
execFile(
'git',
['ls-files', '--cached', '--others', '--exclude-standard'],
{ cwd: root, maxBuffer: 10 * 1024 * 1024 },
(error, stdout) => {
if (error) {
reject(error)
return
}
const files = stdout
.split('\n')
.map(f => f.trim())
.filter(f => f.length > 0)
resolve(files)
},
)
})
}
/** Walk directory tree manually as fallback when git is unavailable. */
function walkDirectory(root: string, currentDir: string = root): string[] {
const results: string[] = []
let entries: ReturnType<typeof readdirSync>
try {
entries = readdirSync(currentDir, { withFileTypes: true })
} catch {
return results
}
for (const entry of entries) {
const name = entry.name
if (entry.isDirectory()) {
if (!EXCLUDED_DIRS.has(name) && !name.startsWith('.')) {
results.push(...walkDirectory(root, join(currentDir, name)))
}
} else if (entry.isFile()) {
if (!EXCLUDED_FILES.has(name)) {
results.push(relative(root, join(currentDir, name)))
}
}
}
return results
}
/**
* Enumerate all supported source files in the repo.
* Tries git ls-files first, falls back to manual walk.
*/
export async function getRepoFiles(root: string): Promise<string[]> {
let files: string[]
try {
files = await gitLsFiles(root)
} catch {
files = walkDirectory(root)
}
return files.filter(isSupportedFile)
}


@@ -1,88 +0,0 @@
import Graph from 'graphology'
import type { FileTags } from './types.js'
// Common identifiers that should contribute less weight (high IDF penalty).
const COMMON_NAMES = new Set([
'map', 'get', 'set', 'value', 'key', 'data', 'result', 'error',
'name', 'type', 'id', 'index', 'item', 'items', 'list', 'options',
'config', 'args', 'params', 'props', 'state', 'event', 'callback',
'handler', 'fn', 'func', 'self', 'this', 'ctx', 'context', 'req',
'res', 'next', 'err', 'msg', 'obj', 'arr', 'str', 'num', 'val',
'init', 'start', 'stop', 'run', 'main', 'test', 'setup', 'teardown',
'constructor', 'toString', 'valueOf', 'length', 'size', 'count',
'push', 'pop', 'shift', 'filter', 'reduce', 'forEach', 'find',
'log', 'warn', 'info', 'debug', 'trace',
])
/**
* Build a directed graph from file tags.
*
* Nodes are file paths. An edge from A to B means file A references
* a symbol defined in file B. Edge weight = refCount * idf(symbolName).
*/
export function buildGraph(allFileTags: FileTags[]): Graph {
const graph = new Graph({ multi: false, type: 'directed' })
// Build a map from symbol name → files that define it
const defIndex = new Map<string, Set<string>>()
for (const ft of allFileTags) {
for (const tag of ft.tags) {
if (tag.kind === 'def') {
let files = defIndex.get(tag.name)
if (!files) {
files = new Set()
defIndex.set(tag.name, files)
}
files.add(ft.path)
}
}
}
// Compute IDF: log(totalFiles / filesDefiningSymbol)
// Common names get an extra penalty
const totalFiles = allFileTags.length
function idf(symbolName: string): number {
const defFiles = defIndex.get(symbolName)
const docFreq = defFiles ? defFiles.size : 1
const rawIdf = Math.log(totalFiles / docFreq)
return COMMON_NAMES.has(symbolName) ? rawIdf * 0.1 : rawIdf
}
// Add all files as nodes
for (const ft of allFileTags) {
if (!graph.hasNode(ft.path)) {
graph.addNode(ft.path)
}
}
// Build edges: for each ref in a file, find where it's defined
for (const ft of allFileTags) {
// Count refs per target file
const edgeWeights = new Map<string, number>()
for (const tag of ft.tags) {
if (tag.kind !== 'ref') continue
const defFiles = defIndex.get(tag.name)
if (!defFiles) continue
const weight = idf(tag.name)
for (const defFile of defFiles) {
if (defFile === ft.path) continue // skip self-references
const current = edgeWeights.get(defFile) ?? 0
edgeWeights.set(defFile, current + weight)
}
}
for (const [target, weight] of edgeWeights) {
if (graph.hasEdge(ft.path, target)) {
graph.setEdgeAttribute(ft.path, target, 'weight',
graph.getEdgeAttribute(ft.path, target, 'weight') + weight)
} else {
graph.addEdge(ft.path, target, { weight })
}
}
}
return graph
}


@@ -1,144 +0,0 @@
import {
computeMapHash,
getCachedTags,
getCacheStats as getCacheStatsImpl,
invalidateCache as invalidateCacheImpl,
loadCache,
saveCache,
setCachedTags,
} from './cache.js'
import { getRepoFiles } from './gitFiles.js'
import { buildGraph } from './graph.js'
import { rankFiles } from './pagerank.js'
import { initParser } from './parser.js'
import { renderMap } from './renderer.js'
import { extractTags } from './symbolExtractor.js'
import type { FileTags, RepoMapOptions, RepoMapResult, CacheStats } from './types.js'
const DEFAULT_MAX_TOKENS = 2048
/**
* Build a structural summary of a code repository.
*
* Walks the repo, extracts symbols via tree-sitter, builds an IDF-weighted
* reference graph, ranks files with PageRank, and renders a token-budgeted
* structural summary.
*/
export async function buildRepoMap(options: RepoMapOptions = {}): Promise<RepoMapResult> {
const startTime = Date.now()
const root = options.root ?? process.cwd()
const maxTokens = options.maxTokens ?? DEFAULT_MAX_TOKENS
const focusFiles = options.focusFiles ?? []
// Initialize tree-sitter
await initParser()
// Get files
const files = options.files ?? await getRepoFiles(root)
const totalFileCount = files.length
// Check if we have a cached rendered map
const mapHash = computeMapHash(files, maxTokens, focusFiles)
const cache = loadCache(root)
// Check if rendered map is cached (stored as a special entry)
const renderedCacheKey = `__rendered__${mapHash}`
const renderedEntry = cache.entries[renderedCacheKey]
if (renderedEntry && renderedEntry.tags.length === 1) {
const cachedResult = renderedEntry.tags[0]!
// The cached "tag" stores the rendered map in the signature field
// and metadata in name/line fields
try {
const meta = JSON.parse(cachedResult.name)
return {
map: cachedResult.signature,
cacheHit: true,
buildTimeMs: Date.now() - startTime,
fileCount: meta.fileCount ?? 0,
totalFileCount,
tokenCount: meta.tokenCount ?? 0,
}
} catch {
// Invalid cached data, continue with full build
}
}
// Extract tags for all files (using per-file cache).
// Separate cached hits from files needing extraction.
const allFileTags: FileTags[] = []
const uncachedFiles: string[] = []
for (const file of files) {
const cachedTags = getCachedTags(cache, file, root)
if (cachedTags) {
allFileTags.push({ path: file, tags: cachedTags })
} else {
uncachedFiles.push(file)
}
}
// Process uncached files in parallel batches
const BATCH_SIZE = 50
for (let i = 0; i < uncachedFiles.length; i += BATCH_SIZE) {
const batch = uncachedFiles.slice(i, i + BATCH_SIZE)
const results = await Promise.all(
batch.map(file => extractTags(file, root).catch(() => null))
)
for (let j = 0; j < results.length; j++) {
const fileTags = results[j]
if (fileTags) {
allFileTags.push(fileTags)
setCachedTags(cache, fileTags.path, root, fileTags.tags)
}
}
}
// Build graph and rank
const graph = buildGraph(allFileTags)
const ranked = rankFiles(graph, focusFiles)
// Build a lookup map
const fileTagsMap = new Map<string, FileTags>()
for (const ft of allFileTags) {
fileTagsMap.set(ft.path, ft)
}
// Render
const { map, tokenCount, fileCount } = renderMap(ranked, fileTagsMap, maxTokens)
// Cache the rendered result
cache.entries[renderedCacheKey] = {
tags: [{
kind: 'def',
name: JSON.stringify({ fileCount, tokenCount }),
line: 0,
signature: map,
}],
mtimeMs: Date.now(),
size: 0,
}
saveCache(root, cache)
return {
map,
cacheHit: false,
buildTimeMs: Date.now() - startTime,
fileCount,
totalFileCount,
tokenCount,
}
}
/** Invalidate the disk cache for a given repo root. */
export function invalidateCache(root?: string): void {
invalidateCacheImpl(root ?? process.cwd())
}
/** Get cache statistics for a given repo root. */
export function getCacheStats(root?: string): CacheStats {
return getCacheStatsImpl(root ?? process.cwd())
}
// Re-export types for convenience
export type { RepoMapOptions, RepoMapResult, CacheStats } from './types.js'


@@ -1,57 +0,0 @@
import type Graph from 'graphology'
import pagerank from 'graphology-pagerank'
export interface RankedFile {
path: string
score: number
}
/**
* Run PageRank on the file reference graph.
*
* focusFiles get a 100x boost in the personalization vector so they
* and their neighbors rank higher.
*
* Returns files sorted by score descending.
*/
export function rankFiles(
graph: Graph,
focusFiles: string[] = [],
): RankedFile[] {
if (graph.order === 0) return []
const hasPersonalization = focusFiles.length > 0
// graphology-pagerank accepts getEdgeWeight option
const scores: Record<string, number> = pagerank(graph, {
alpha: 0.85,
maxIterations: 100,
tolerance: 1e-6,
getEdgeWeight: 'weight',
})
// Apply focus boost post-hoc if focus files are specified
if (hasPersonalization) {
for (const file of focusFiles) {
if (scores[file] !== undefined) {
scores[file] *= 100
}
}
// Also boost direct neighbors of focus files
for (const file of focusFiles) {
if (!graph.hasNode(file)) continue
graph.forEachNeighbor(file, (neighbor) => {
if (scores[neighbor] !== undefined) {
scores[neighbor] *= 10
}
})
}
}
const ranked: RankedFile[] = Object.entries(scores)
.map(([path, score]) => ({ path, score }))
.sort((a, b) => b.score - a.score)
return ranked
}


@@ -1,166 +0,0 @@
import { existsSync, readFileSync } from 'fs'
import { join, resolve } from 'path'
import { fileURLToPath } from 'url'
import type { SupportedLanguage } from './types.js'
// Resolve project root in both source and bundled modes.
// In source (bun test/dev): import.meta.url is src/context/repoMap/parser.ts → go up 4 levels
// In bundle (node dist/cli.mjs): import.meta.url is dist/cli.mjs → go up 2 levels
const __filename = fileURLToPath(import.meta.url)
const __projectRoot = join(
__filename,
process.env.NODE_ENV === 'test' ? '../../../../' : '../../',
)
// web-tree-sitter types
type TreeSitterParser = {
parse(input: string): { rootNode: unknown }
setLanguage(lang: unknown): void
delete(): void
}
type TreeSitterLanguage = {
query(source: string): unknown
}
// The actual module exports { Parser, Language } as named exports
let ParserClass: (new () => TreeSitterParser) & {
init(opts?: { locateFile?: (file: string) => string }): Promise<void>
} | null = null
let LanguageLoader: {
load(path: string | Uint8Array): Promise<TreeSitterLanguage>
} | null = null
let initialized = false
const languageCache = new Map<SupportedLanguage, TreeSitterLanguage>()
const queryCache = new Map<SupportedLanguage, string>()
/** Resolve the path to the tree-sitter WASM file. */
function getTreeSitterWasmPath(): string {
// Try require.resolve first (works in source mode with node_modules)
try {
const webTsDir = resolve(
require.resolve('web-tree-sitter/package.json'),
'..',
)
return join(webTsDir, 'tree-sitter.wasm')
} catch {
// Fallback: relative to project root
return join(__projectRoot, 'node_modules', 'web-tree-sitter', 'tree-sitter.wasm')
}
}
/** Resolve the path to a language WASM grammar file. */
function getLanguageWasmPath(language: SupportedLanguage): string {
const wasmName = language === 'typescript' ? 'tree-sitter-typescript' :
language === 'javascript' ? 'tree-sitter-javascript' :
`tree-sitter-${language}`
try {
const wasmDir = resolve(
require.resolve('tree-sitter-wasms/package.json'),
'..',
'out',
)
return join(wasmDir, `${wasmName}.wasm`)
} catch {
return join(__projectRoot, 'node_modules', 'tree-sitter-wasms', 'out', `${wasmName}.wasm`)
}
}
/** Resolve the path to a tag query .scm file for the given language. */
function getQueryPath(language: SupportedLanguage): string {
// Try source location first (works in both source and when queries are alongside the bundle)
const sourcePath = join(__projectRoot, 'src', 'context', 'repoMap', 'queries', `${language}-tags.scm`)
if (existsSync(sourcePath)) {
return sourcePath
}
// Fallback: relative to this file (source mode)
return join(fileURLToPath(import.meta.url), '..', 'queries', `${language}-tags.scm`)
}
/** Initialize the tree-sitter WASM module. */
export async function initParser(): Promise<void> {
if (initialized) return
try {
const mod = await import('web-tree-sitter')
ParserClass = mod.Parser as typeof ParserClass
LanguageLoader = mod.Language as typeof LanguageLoader
const wasmPath = getTreeSitterWasmPath()
await ParserClass!.init({
locateFile: () => wasmPath,
})
initialized = true
} catch (err) {
// eslint-disable-next-line no-console
console.error('[repoMap] Failed to initialize tree-sitter:', err)
throw err
}
}
/** Load a language grammar. Cached after first load. */
export async function loadLanguage(language: SupportedLanguage): Promise<TreeSitterLanguage | null> {
if (languageCache.has(language)) {
return languageCache.get(language)!
}
if (!initialized) {
await initParser()
}
try {
const wasmPath = getLanguageWasmPath(language)
const lang = await LanguageLoader!.load(wasmPath)
languageCache.set(language, lang)
return lang
} catch (err) {
// eslint-disable-next-line no-console
console.error(`[repoMap] Failed to load ${language} grammar:`, err)
return null
}
}
/** Load the tag query for a language. Cached after first load. */
export function loadQuery(language: SupportedLanguage): string | null {
if (queryCache.has(language)) {
return queryCache.get(language)!
}
try {
const queryPath = getQueryPath(language)
const content = readFileSync(queryPath, 'utf-8')
queryCache.set(language, content)
return content
} catch {
return null
}
}
/** Create a new parser instance with the given language set. */
export async function createParser(language: SupportedLanguage): Promise<TreeSitterParser | null> {
if (!initialized) {
await initParser()
}
const lang = await loadLanguage(language)
if (!lang) return null
try {
const parser = new ParserClass!()
parser.setLanguage(lang)
return parser
} catch {
return null
}
}
/** Clear all caches (useful for testing). */
export function clearParserCaches(): void {
languageCache.clear()
queryCache.clear()
initialized = false
ParserClass = null
LanguageLoader = null
}


@@ -1,92 +0,0 @@
; Source: https://github.com/Aider-AI/aider/blob/main/aider/queries/tree-sitter-languages/javascript-tags.scm
; License: MIT (Apache-2.0 dual) — see https://github.com/Aider-AI/aider/blob/main/LICENSE
; Copied for use in openclaude's repo-map feature.
(
(comment)* @doc
.
(method_definition
name: (property_identifier) @name.definition.method) @definition.method
(#not-eq? @name.definition.method "constructor")
(#strip! @doc "^[\\s\\*/]+|^[\\s\\*/]$")
(#select-adjacent! @doc @definition.method)
)
(
(comment)* @doc
.
[
(class
name: (_) @name.definition.class)
(class_declaration
name: (_) @name.definition.class)
] @definition.class
(#strip! @doc "^[\\s\\*/]+|^[\\s\\*/]$")
(#select-adjacent! @doc @definition.class)
)
(
(comment)* @doc
.
[
(function
name: (identifier) @name.definition.function)
(function_declaration
name: (identifier) @name.definition.function)
(generator_function
name: (identifier) @name.definition.function)
(generator_function_declaration
name: (identifier) @name.definition.function)
] @definition.function
(#strip! @doc "^[\\s\\*/]+|^[\\s\\*/]$")
(#select-adjacent! @doc @definition.function)
)
(
(comment)* @doc
.
(lexical_declaration
(variable_declarator
name: (identifier) @name.definition.function
value: [(arrow_function) (function)]) @definition.function)
(#strip! @doc "^[\\s\\*/]+|^[\\s\\*/]$")
(#select-adjacent! @doc @definition.function)
)
(
(comment)* @doc
.
(variable_declaration
(variable_declarator
name: (identifier) @name.definition.function
value: [(arrow_function) (function)]) @definition.function)
(#strip! @doc "^[\\s\\*/]+|^[\\s\\*/]$")
(#select-adjacent! @doc @definition.function)
)
(assignment_expression
left: [
(identifier) @name.definition.function
(member_expression
property: (property_identifier) @name.definition.function)
]
right: [(arrow_function) (function)]
) @definition.function
(pair
key: (property_identifier) @name.definition.function
value: [(arrow_function) (function)]) @definition.function
(
(call_expression
function: (identifier) @name.reference.call) @reference.call
(#not-match? @name.reference.call "^(require)$")
)
(call_expression
function: (member_expression
property: (property_identifier) @name.reference.call)
arguments: (_) @reference.call)
(new_expression
constructor: (_) @name.reference.class) @reference.class


@@ -1,16 +0,0 @@
; Source: https://github.com/Aider-AI/aider/blob/main/aider/queries/tree-sitter-languages/python-tags.scm
; License: MIT (Apache-2.0 dual) — see https://github.com/Aider-AI/aider/blob/main/LICENSE
; Copied for use in openclaude's repo-map feature.
(class_definition
name: (identifier) @name.definition.class) @definition.class
(function_definition
name: (identifier) @name.definition.function) @definition.function
(call
function: [
(identifier) @name.reference.call
(attribute
attribute: (identifier) @name.reference.call)
]) @reference.call


@@ -1,45 +0,0 @@
; Source: https://github.com/Aider-AI/aider/blob/main/aider/queries/tree-sitter-languages/typescript-tags.scm
; License: MIT (Apache-2.0 dual) — see https://github.com/Aider-AI/aider/blob/main/LICENSE
; Copied for use in openclaude's repo-map feature.
(function_signature
name: (identifier) @name.definition.function) @definition.function
(method_signature
name: (property_identifier) @name.definition.method) @definition.method
(abstract_method_signature
name: (property_identifier) @name.definition.method) @definition.method
(abstract_class_declaration
name: (type_identifier) @name.definition.class) @definition.class
(module
name: (identifier) @name.definition.module) @definition.module
(interface_declaration
name: (type_identifier) @name.definition.interface) @definition.interface
(type_annotation
(type_identifier) @name.reference.type) @reference.type
(new_expression
constructor: (identifier) @name.reference.class) @reference.class
(function_declaration
name: (identifier) @name.definition.function) @definition.function
(method_definition
name: (property_identifier) @name.definition.method) @definition.method
(class_declaration
name: (type_identifier) @name.definition.class) @definition.class
(interface_declaration
name: (type_identifier) @name.definition.class) @definition.class
(type_alias_declaration
name: (type_identifier) @name.definition.type) @definition.type
(enum_declaration
name: (identifier) @name.definition.enum) @definition.enum


@@ -1,72 +0,0 @@
import type { FileTags, Tag } from './types.js'
import type { RankedFile } from './pagerank.js'
import { countTokens } from './tokenize.js'
/**
* Render a token-budgeted repo map from ranked files and their tags.
*
* Format per file:
* path/to/file.ts:
* ⋮
* signature line for def 1
* ⋮
* signature line for def 2
* ⋮
*
* Files that don't fit within the budget are dropped entirely.
*/
export function renderMap(
rankedFiles: RankedFile[],
fileTagsMap: Map<string, FileTags>,
maxTokens: number,
): { map: string; tokenCount: number; fileCount: number } {
const sections: string[] = []
let currentTokens = 0
let fileCount = 0
for (const { path } of rankedFiles) {
const ft = fileTagsMap.get(path)
if (!ft) continue
// Only include definitions in the rendered output
const defs = ft.tags
.filter(t => t.kind === 'def')
.sort((a, b) => a.line - b.line)
if (defs.length === 0) continue
const section = renderFileSection(path, defs)
const sectionTokens = countTokens(section)
// Would this section bust the budget?
if (currentTokens + sectionTokens > maxTokens) {
// Don't include partial files — drop entirely
break
}
sections.push(section)
currentTokens += sectionTokens
fileCount++
}
const map = sections.join('\n')
return { map, tokenCount: currentTokens, fileCount }
}
function renderFileSection(path: string, defs: Tag[]): string {
const lines: string[] = [`${path}:`]
let lastLine = 0
for (const def of defs) {
// Add elision marker if there's a gap
if (def.line > lastLine + 1) {
lines.push('⋮')
}
lines.push(` ${def.signature}`)
lastLine = def.line
}
// Trailing elision marker
lines.push('⋮')
return lines.join('\n')
}


@@ -1,275 +0,0 @@
import { afterEach, beforeAll, describe, expect, test } from 'bun:test'
import { cpSync, mkdtempSync, rmSync, utimesSync, writeFileSync } from 'fs'
import { tmpdir } from 'os'
import { join } from 'path'
import { invalidateCache, buildRepoMap } from './index.js'
import { extractTags } from './symbolExtractor.js'
import { buildGraph } from './graph.js'
import { initParser } from './parser.js'
import { countTokens } from './tokenize.js'
const FIXTURE_ROOT = join(import.meta.dir, '__fixtures__', 'mini-repo')
const FIXTURE_FILES = ['fileA.ts', 'fileB.ts', 'fileC.ts', 'fileD.ts', 'fileE.ts']
beforeAll(async () => {
await initParser()
})
// Clean up cache between tests to avoid cross-test interference
afterEach(() => {
invalidateCache(FIXTURE_ROOT)
})
describe('symbol extraction', () => {
test('extracts function and class defs from a TypeScript file', async () => {
const result = await extractTags('fileC.ts', FIXTURE_ROOT)
expect(result).not.toBeNull()
const defs = result!.tags.filter(t => t.kind === 'def')
const defNames = defs.map(t => t.name)
expect(defNames).toContain('DataStore')
expect(defNames).toContain('createStore')
expect(defNames).toContain('StoreConfig')
// All defs should have kind='def'
for (const d of defs) {
expect(d.kind).toBe('def')
}
})
test('extracts references to imported symbols', async () => {
const result = await extractTags('fileA.ts', FIXTURE_ROOT)
expect(result).not.toBeNull()
const refs = result!.tags.filter(t => t.kind === 'ref')
const refNames = refs.map(t => t.name)
// fileA imports CacheLayer from fileB and StoreConfig from fileC
expect(refNames).toContain('CacheLayer')
expect(refNames).toContain('StoreConfig')
})
})
describe('graph', () => {
test('builds edges between files that reference each other\'s symbols', async () => {
const allTags = []
for (const f of FIXTURE_FILES) {
const tags = await extractTags(f, FIXTURE_ROOT)
if (tags) allTags.push(tags)
}
const graph = buildGraph(allTags)
// fileA imports from fileB (references CacheLayer defined in fileB)
expect(graph.hasEdge('fileA.ts', 'fileB.ts')).toBe(true)
// fileA imports from fileC (references StoreConfig, DataStore defined in fileC)
expect(graph.hasEdge('fileA.ts', 'fileC.ts')).toBe(true)
// fileB imports from fileC (references DataStore defined in fileC)
expect(graph.hasEdge('fileB.ts', 'fileC.ts')).toBe(true)
// fileD imports from fileA
expect(graph.hasEdge('fileD.ts', 'fileA.ts')).toBe(true)
// fileE is isolated — no edges to/from it
expect(graph.degree('fileE.ts')).toBe(0)
})
})
describe('pagerank', () => {
test('ranks the most-imported file highest', async () => {
const result = await buildRepoMap({
root: FIXTURE_ROOT,
maxTokens: 2048,
files: FIXTURE_FILES,
})
// The map starts with the highest-ranked file
const firstFile = result.map.split('\n')[0]
expect(firstFile).toBe('fileC.ts:')
// fileE should be ranked lowest (or near last)
const lines = result.map.split('\n')
const filePositions = FIXTURE_FILES.map(f => {
const idx = lines.findIndex(l => l === `${f}:`)
return { file: f, position: idx }
}).filter(x => x.position >= 0)
.sort((a, b) => a.position - b.position)
// fileC should be first
expect(filePositions[0]!.file).toBe('fileC.ts')
// fileE should be last (or among the last)
const lastFile = filePositions[filePositions.length - 1]!.file
expect(['fileD.ts', 'fileE.ts']).toContain(lastFile)
})
})
describe('renderer', () => {
test('respects the token budget within 5%', async () => {
const maxTokens = 500
const result = await buildRepoMap({
root: FIXTURE_ROOT,
maxTokens,
files: FIXTURE_FILES,
})
const actualTokens = countTokens(result.map)
expect(actualTokens).toBeLessThanOrEqual(maxTokens * 1.05)
expect(result.tokenCount).toBeLessThanOrEqual(maxTokens * 1.05)
})
test('drops files that don\'t fit rather than listing their names', async () => {
// Very tight budget — should only fit 1-2 files
const result = await buildRepoMap({
root: FIXTURE_ROOT,
maxTokens: 100,
files: FIXTURE_FILES,
})
// Count how many files appear as headers in the output
const fileHeaders = result.map.split('\n').filter(l => l.endsWith(':') && !l.startsWith(' '))
// Every file header in the output should have its signatures listed
for (const header of fileHeaders) {
// The file must have at least one signature line after it
const headerIdx = result.map.indexOf(header)
const afterHeader = result.map.slice(headerIdx + header.length)
// Should have content (signatures), not just the filename
expect(afterHeader.trim().length).toBeGreaterThan(0)
}
// Should have fewer files than total
expect(fileHeaders.length).toBeLessThan(FIXTURE_FILES.length)
})
})
describe('cache', () => {
test('second build of unchanged fixture uses the cache', async () => {
// First build (cold)
const result1 = await buildRepoMap({
root: FIXTURE_ROOT,
maxTokens: 2048,
files: FIXTURE_FILES,
})
expect(result1.cacheHit).toBe(false)
// Second build (warm)
const result2 = await buildRepoMap({
root: FIXTURE_ROOT,
maxTokens: 2048,
files: FIXTURE_FILES,
})
expect(result2.cacheHit).toBe(true)
expect(result2.buildTimeMs).toBeLessThan(result1.buildTimeMs)
// Output should be identical
expect(result2.map).toBe(result1.map)
})
test('modifying a file invalidates only that file', async () => {
// Create a temp copy of the fixture
const tempDir = mkdtempSync(join(tmpdir(), 'repomap-test-'))
try {
for (const f of FIXTURE_FILES) {
cpSync(join(FIXTURE_ROOT, f), join(tempDir, f))
}
// First build
const result1 = await buildRepoMap({
root: tempDir,
maxTokens: 2048,
files: FIXTURE_FILES,
})
expect(result1.cacheHit).toBe(false)
// Touch one file to change its mtime
const targetFile = join(tempDir, 'fileE.ts')
const now = new Date()
utimesSync(targetFile, now, now)
// Second build — manually invalidate the rendered-map cache so the build
// re-runs. fileE's per-file entry should then miss (its mtime changed),
// while the other files still hit the per-file cache.
invalidateCache(tempDir)
const result2 = await buildRepoMap({
root: tempDir,
maxTokens: 2048,
files: FIXTURE_FILES,
})
// The per-file cache for fileE should miss (mtime changed),
// but other files should still hit the per-file cache
expect(result2.cacheHit).toBe(false)
// Output should still be valid
expect(result2.map.length).toBeGreaterThan(0)
expect(result2.fileCount).toBe(result1.fileCount)
} finally {
rmSync(tempDir, { recursive: true, force: true })
invalidateCache(tempDir)
}
})
})
describe('gitFiles', () => {
test('falls back gracefully when not in a git repo', async () => {
// Create a temp directory with source files but NO .git
const tempDir = mkdtempSync(join(tmpdir(), 'repomap-nogit-'))
try {
writeFileSync(
join(tempDir, 'hello.ts'),
'export function hello(): string { return "world" }\n',
)
writeFileSync(
join(tempDir, 'utils.ts'),
'export function add(a: number, b: number): number { return a + b }\n',
)
const result = await buildRepoMap({
root: tempDir,
maxTokens: 1024,
})
// Should succeed without throwing
expect(result.map.length).toBeGreaterThan(0)
expect(result.totalFileCount).toBeGreaterThan(0)
} finally {
rmSync(tempDir, { recursive: true, force: true })
invalidateCache(tempDir)
}
})
})
describe('error handling', () => {
test('no crash on malformed source file', async () => {
const tempDir = mkdtempSync(join(tmpdir(), 'repomap-malformed-'))
try {
// Valid file
writeFileSync(
join(tempDir, 'good.ts'),
'export function good(): number { return 1 }\n',
)
// Malformed file — severe syntax errors
writeFileSync(
join(tempDir, 'bad.ts'),
'}{}{}{export classclass [[[ function ,,, @@@ ###\n',
)
const result = await buildRepoMap({
root: tempDir,
maxTokens: 1024,
files: ['good.ts', 'bad.ts'],
})
// Should complete successfully
expect(result.map.length).toBeGreaterThan(0)
// The good file should be in the output
expect(result.map).toContain('good.ts')
} finally {
rmSync(tempDir, { recursive: true, force: true })
invalidateCache(tempDir)
}
})
})


@@ -1,108 +0,0 @@
import { readFileSync } from 'fs'
import { join } from 'path'
import { getLanguageForFile } from './gitFiles.js'
import { createParser, loadLanguage, loadQuery } from './parser.js'
import type { FileTags, Tag } from './types.js'
/**
* Extract definition and reference tags from a single source file.
* Returns null if the file can't be parsed (unsupported language, parse error, etc).
*/
export async function extractTags(
filePath: string,
root: string,
): Promise<FileTags | null> {
const language = getLanguageForFile(filePath)
if (!language) return null
const absolutePath = join(root, filePath)
let source: string
try {
source = readFileSync(absolutePath, 'utf-8')
} catch {
return null
}
const lines = source.split('\n')
const parser = await createParser(language)
if (!parser) return null
const querySource = loadQuery(language)
if (!querySource) {
parser.delete()
return null
}
try {
const tree = parser.parse(source) as {
rootNode: unknown
}
const lang = await loadLanguage(language)
if (!lang) {
parser.delete()
return null
}
// Use the non-deprecated Query constructor
const { Query } = await import('web-tree-sitter')
const query = new Query(lang, querySource) as {
matches(rootNode: unknown): Array<{
pattern: number
captures: Array<{
name: string
node: {
text: string
startPosition: { row: number; column: number }
endPosition: { row: number; column: number }
}
}>
}>
}
const matches = query.matches(tree.rootNode)
const tags: Tag[] = []
const seen = new Set<string>() // dedup by kind+name+line
for (const match of matches) {
let name: string | null = null
let kind: 'def' | 'ref' | null = null
let subKind: string | undefined
let lineRow = 0
for (const capture of match.captures) {
const captureName = capture.name
// Name captures: name.definition.X or name.reference.X
if (captureName.startsWith('name.definition.')) {
name = capture.node.text
kind = 'def'
subKind = captureName.slice('name.definition.'.length)
lineRow = capture.node.startPosition.row
} else if (captureName.startsWith('name.reference.')) {
name = capture.node.text
kind = 'ref'
subKind = captureName.slice('name.reference.'.length)
lineRow = capture.node.startPosition.row
}
}
if (name && kind) {
const key = `${kind}:${name}:${lineRow}`
if (!seen.has(key)) {
seen.add(key)
const line = lineRow + 1 // convert 0-based to 1-based
const signature = lines[lineRow]?.trimEnd() ?? ''
tags.push({ kind, name, line, signature, subKind })
}
}
}
parser.delete()
return { path: filePath, tags }
} catch {
parser.delete()
return null
}
}

View File

@@ -1,15 +0,0 @@
import { getEncoding, type Tiktoken } from 'js-tiktoken'
let encoder: Tiktoken | null = null
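// Lazy singleton: constructing a cl100k_base encoder loads the full BPE rank
// table and is relatively expensive, so build it once on first use and reuse
// it for every countTokens call.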
function getEncoder() {
if (!encoder) {
encoder = getEncoding('cl100k_base')
}
return encoder
}
/** Count the number of tokens in a string using cl100k_base encoding. */
export function countTokens(text: string): number {
return getEncoder().encode(text).length
}

View File

@@ -1,65 +0,0 @@
export interface Tag {
/** 'def' for definitions, 'ref' for references */
kind: 'def' | 'ref'
/** Symbol name (e.g. function name, class name) */
name: string
/** 1-based line number in the source file */
line: number
/** The full line of source code at this position (used as signature for defs) */
signature: string
/** Sub-kind from the query (e.g. 'function', 'class', 'method', 'type') */
subKind?: string
}
export interface FileTags {
/** Relative path from the repo root */
path: string
/** All tags extracted from this file */
tags: Tag[]
}
export interface RepoMapOptions {
/** Root directory of the repo (defaults to cwd) */
root?: string
/** Maximum token budget for the rendered map */
maxTokens?: number
/** Files to boost in PageRank (relative paths) */
focusFiles?: string[]
/** Override the list of files to process (relative paths) */
files?: string[]
}
export interface RepoMapResult {
/** The rendered repo map string */
map: string
/** Whether the result came from cache */
cacheHit: boolean
/** Time in milliseconds to build the map */
buildTimeMs: number
/** Number of files included in the rendered map */
fileCount: number
/** Total number of files processed */
totalFileCount: number
/** Actual token count of the rendered map */
tokenCount: number
}
export interface CacheEntry {
tags: Tag[]
mtimeMs: number
size: number
}
export interface CacheData {
version: number
entries: Record<string, CacheEntry>
}
export interface CacheStats {
cacheDir: string
cacheFile: string | null
entryCount: number
exists: boolean
}
export type SupportedLanguage = 'typescript' | 'javascript' | 'python'

View File

@@ -5,7 +5,7 @@ import {
} from '../utils/providerProfile.js'
import {
getProviderValidationError,
validateProviderEnvOrExit,
validateProviderEnvForStartupOrExit,
} from '../utils/providerValidation.js'
// OpenClaude: polyfill globalThis.File for Node < 20.
@@ -132,7 +132,7 @@ async function main(): Promise<void> {
hydrateGithubModelsTokenFromSecureStorage()
}
await validateProviderEnvOrExit()
await validateProviderEnvForStartupOrExit()
// Print the gradient startup screen before the Ink UI loads
const { printStartupScreen } = await import('../components/StartupScreen.js')

View File

@@ -0,0 +1,75 @@
import { describe, it, expect, mock } from 'bun:test'
import { getCombinedTools, loadReexposedMcpTools } from './mcp.js'
import type { Tool as InternalTool } from '../Tool.js'
import type { MCPServerConnection } from '../services/mcp/types.js'
import type { Tool } from '@modelcontextprotocol/sdk/types.js'
// Mock the MCP client service to control the tools and connections returned
const mockGetMcpToolsCommandsAndResources = mock(async (onConnectionAttempt: any) => {})
mock.module('../services/mcp/client.js', () => ({
getMcpToolsCommandsAndResources: mockGetMcpToolsCommandsAndResources
}))
describe('getCombinedTools', () => {
it('deduplicates builtins when mcpTools have the same name, prioritizing mcpTools', () => {
const builtinBash = { name: 'Bash', isMcp: false } as unknown as InternalTool
const builtinRead = { name: 'Read', isMcp: false } as unknown as InternalTool
const mcpBash = { name: 'Bash', isMcp: true } as unknown as InternalTool
const builtins = [builtinBash, builtinRead]
const mcpTools = [mcpBash]
const result = getCombinedTools(builtins, mcpTools)
expect(result).toHaveLength(2)
expect(result[0]).toBe(mcpBash)
expect(result[1]).toBe(builtinRead)
})
})
describe('loadReexposedMcpTools', () => {
it('loads tools and clients regardless of connection state (including needs-auth)', async () => {
// Setup the mock to simulate yielding a needs-auth server and a connected server
mockGetMcpToolsCommandsAndResources.mockImplementation(async (onConnectionAttempt) => {
const needsAuthClient = {
name: 'auth-server',
type: 'needs-auth',
config: {}
} as MCPServerConnection
const authTool = {
name: 'mcp__auth-server__authenticate',
isMcp: true
} as unknown as InternalTool
const connectedClient = {
name: 'connected-server',
type: 'connected',
config: {},
client: {}
} as MCPServerConnection
const connectedTool = {
name: 'mcp__connected-server__do_thing',
isMcp: true
} as unknown as InternalTool
// Simulate the callback behavior
onConnectionAttempt({ client: needsAuthClient, tools: [authTool], commands: [] })
onConnectionAttempt({ client: connectedClient, tools: [connectedTool], commands: [] })
})
const { mcpClients, mcpTools } = await loadReexposedMcpTools()
expect(mcpClients).toHaveLength(2)
expect(mcpClients[0].type).toBe('needs-auth')
expect(mcpClients[1].type).toBe('connected')
expect(mcpTools).toHaveLength(2)
expect(mcpTools[0].name).toBe('mcp__auth-server__authenticate')
expect(mcpTools[1].name).toBe('mcp__connected-server__do_thing')
// Reset mock for other tests
mockGetMcpToolsCommandsAndResources.mockReset()
})
})

View File

@@ -7,6 +7,7 @@ process.env.CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS ??= 'true'
import { Server } from '@modelcontextprotocol/sdk/server/index.js'
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js'
import { ZodError } from 'zod'
import {
CallToolRequestSchema,
type CallToolResult,
@@ -17,9 +18,12 @@ import {
import { getDefaultAppState } from 'src/state/AppStateStore.js'
import review from '../commands/review.js'
import type { Command } from '../commands.js'
import { getMcpToolsCommandsAndResources } from '../services/mcp/client.js'
import type { MCPServerConnection } from '../services/mcp/types.js'
import {
findToolByName,
getEmptyToolPermissionContext,
type Tool as InternalTool,
type ToolUseContext,
} from '../Tool.js'
import { getTools } from '../tools.js'
@@ -39,6 +43,32 @@ type ToolOutput = Tool['outputSchema']
const MCP_COMMANDS: Command[] = [review]
export function getCombinedTools(
builtins: InternalTool[],
mcpTools: InternalTool[],
): InternalTool[] {
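// Name collisions resolve in favour of the re-exposed MCP tools: the builtin
// is dropped and the MCP tool keeps its position at the front of the list.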
const mcpToolNames = new Set(mcpTools.map(t => t.name))
const deduplicatedBuiltins = builtins.filter(t => !mcpToolNames.has(t.name))
return [...mcpTools, ...deduplicatedBuiltins]
}
export async function loadReexposedMcpTools(): Promise<{
mcpClients: MCPServerConnection[]
mcpTools: InternalTool[]
}> {
const mcpClients: MCPServerConnection[] = []
const mcpTools: InternalTool[] = []
// Load configured MCP clients and their tools
await getMcpToolsCommandsAndResources(({ client, tools: clientTools }) => {
mcpClients.push(client)
mcpTools.push(...clientTools)
})
return { mcpClients, mcpTools }
}
export async function startMCPServer(
cwd: string,
debug: boolean,
@@ -63,12 +93,13 @@ export async function startMCPServer(
},
)
const { mcpClients, mcpTools } = await loadReexposedMcpTools()
server.setRequestHandler(
ListToolsRequestSchema,
async (): Promise<ListToolsResult> => {
// TODO: Also re-expose any MCP tools
const toolPermissionContext = getEmptyToolPermissionContext()
const tools = getTools(toolPermissionContext)
const tools = getCombinedTools(getTools(toolPermissionContext), mcpTools)
return {
tools: await Promise.all(
tools.map(async tool => {
@@ -94,7 +125,7 @@ export async function startMCPServer(
tools,
agents: [],
}),
inputSchema: zodToJsonSchema(tool.inputSchema) as ToolInput,
inputSchema: (tool.inputJSONSchema ?? zodToJsonSchema(tool.inputSchema)) as ToolInput,
outputSchema,
}
}),
@@ -107,8 +138,7 @@ export async function startMCPServer(
CallToolRequestSchema,
async ({ params: { name, arguments: args } }): Promise<CallToolResult> => {
const toolPermissionContext = getEmptyToolPermissionContext()
// TODO: Also re-expose any MCP tools
const tools = getTools(toolPermissionContext)
const tools = getCombinedTools(getTools(toolPermissionContext), mcpTools)
const tool = findToolByName(tools, name)
if (!tool) {
throw new Error(`Tool ${name} not found`)
@@ -123,7 +153,7 @@ export async function startMCPServer(
tools,
mainLoopModel: getMainLoopModel(),
thinkingConfig: { type: 'disabled' },
mcpClients: [],
mcpClients,
mcpResources: {},
isNonInteractiveSession: true,
debug,
@@ -140,13 +170,16 @@ export async function startMCPServer(
updateAttributionState: () => {},
}
// TODO: validate input types with zod
try {
if (!tool.isEnabled()) {
throw new Error(`Tool ${name} is not enabled`)
}
// Validate input types with zod
const parsedArgs = tool.inputSchema.parse(args ?? {})
const validationResult = await tool.validateInput?.(
(args as never) ?? {},
(parsedArgs as never) ?? {},
toolUseContext,
)
if (validationResult && !validationResult.result) {
@@ -155,7 +188,7 @@ export async function startMCPServer(
)
}
const finalResult = await tool.call(
(args ?? {}) as never,
(parsedArgs ?? {}) as never,
toolUseContext,
hasPermissionsToUseTool,
createAssistantMessage({
@@ -163,20 +196,50 @@ export async function startMCPServer(
}),
)
let content: CallToolResult['content']
const data = finalResult.data as string | { type: string; text?: string; source?: { type: string; media_type: string; data: string } }[] | unknown
if (typeof data === 'string') {
content = [{ type: 'text', text: data }]
} else if (Array.isArray(data)) {
content = data.map((block: any) => {
if (block.type === 'text') {
return { type: 'text', text: block.text || '' }
} else if (block.type === 'image' && block.source) {
return {
type: 'image',
data: block.source.data,
mimeType: block.source.media_type,
}
} else {
// eslint-disable-next-line custom-rules/no-top-level-side-effects, no-console
console.warn(`Unmapped content block type from tool ${name}: ${block.type || 'unknown'}`)
return { type: 'text', text: jsonStringify(block) }
}
}) as CallToolResult['content']
} else {
content = [{ type: 'text', text: jsonStringify(data) }]
}
return {
content: [
{
type: 'text' as const,
text:
typeof finalResult === 'string'
? finalResult
: jsonStringify(finalResult.data),
},
],
content,
isError: !!(finalResult as any).isError,
}
} catch (error) {
logError(error)
if (error instanceof ZodError) {
return {
isError: true,
content: [
{
type: 'text',
text: `Tool ${name} input is invalid:\n${error.errors.map(e => `- ${e.path.join('.')}: ${e.message}`).join('\n')}`,
},
],
}
}
const parts =
error instanceof Error ? getErrorParts(error) : [String(error)]
const errorText = parts.filter(Boolean).join('\n').trim() || 'Error'
@@ -201,3 +264,4 @@ export async function startMCPServer(
return await runServer()
}

View File

@@ -114,8 +114,8 @@ export const SandboxSettingsSchema = lazySchema(() =>
.boolean()
.optional()
.describe(
'Allow commands to run outside the sandbox via the dangerouslyDisableSandbox parameter. ' +
'When false, the dangerouslyDisableSandbox parameter is completely ignored and all commands must run sandboxed. ' +
'Allow trusted, user-initiated commands to run outside the sandbox. ' +
'When false, sandbox override requests are ignored and all commands must run sandboxed. ' +
'Default: true.',
),
network: SandboxNetworkConfigSchema(),

View File

@@ -19,7 +19,7 @@ async function _temp() {
logForDebugging("Showing marketplace config save failure notification");
notifs.push({
key: "marketplace-config-save-failed",
jsx: <Text color="error">Failed to save marketplace retry info · Check ~/.claude.json permissions</Text>,
jsx: <Text color="error">Failed to save marketplace retry info · Check ~/.openclaude.json permissions</Text>,
priority: "immediate",
timeoutMs: 10000
});

View File

@@ -1,5 +1,8 @@
import { expect, test } from 'bun:test'
import { supportsClipboardImageFallback } from './usePasteHandler.ts'
import {
shouldHandleInputAsPaste,
supportsClipboardImageFallback,
} from './usePasteHandler.ts'
test('supports clipboard image fallback on Windows', () => {
expect(supportsClipboardImageFallback('windows')).toBe(true)
@@ -20,3 +23,42 @@ test('does not support clipboard image fallback on WSL', () => {
test('does not support clipboard image fallback on unknown platforms', () => {
expect(supportsClipboardImageFallback('unknown')).toBe(false)
})
test('does not treat a bracketed paste as pending when no paste handlers are provided', () => {
expect(
shouldHandleInputAsPaste({
hasTextPasteHandler: false,
hasImagePasteHandler: false,
inputLength: 'kimi-k2.5'.length,
pastePending: false,
hasImageFilePath: false,
isFromPaste: true,
}),
).toBe(false)
})
test('treats bracketed text paste as pending when a text paste handler exists', () => {
expect(
shouldHandleInputAsPaste({
hasTextPasteHandler: true,
hasImagePasteHandler: false,
inputLength: 'kimi-k2.5'.length,
pastePending: false,
hasImageFilePath: false,
isFromPaste: true,
}),
).toBe(true)
})
test('treats image path paste as pending when only an image handler exists', () => {
expect(
shouldHandleInputAsPaste({
hasTextPasteHandler: false,
hasImagePasteHandler: true,
inputLength: 'C:\\Users\\jat\\image.png'.length,
pastePending: false,
hasImageFilePath: true,
isFromPaste: false,
}),
).toBe(true)
})

View File

@@ -35,6 +35,24 @@ type PasteHandlerProps = {
) => void
}
export function shouldHandleInputAsPaste(options: {
hasTextPasteHandler: boolean
hasImagePasteHandler: boolean
inputLength: number
pastePending: boolean
hasImageFilePath: boolean
isFromPaste: boolean
}): boolean {
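// With a text paste handler, any bracketed paste, over-threshold input,
// pending continuation, or image file path is claimed as a paste; with only
// an image handler, image file paths are still claimed.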
return (
(options.hasTextPasteHandler &&
(options.inputLength > PASTE_THRESHOLD ||
options.pastePending ||
options.hasImageFilePath ||
options.isFromPaste)) ||
(options.hasImagePasteHandler && options.hasImageFilePath)
)
}
export function usePasteHandler({
onPaste,
onInput,
@@ -236,11 +254,6 @@ export function usePasteHandler({
// The keypress parser sets isPasted=true for content within bracketed paste.
const isFromPaste = event.keypress.isPasted
// If this is pasted content, set isPasting state for UI feedback
if (isFromPaste) {
setIsPasting(true)
}
// Handle large pastes (>PASTE_THRESHOLD chars)
// Usually we get one or two input characters at a time. If we
// get more than the threshold, the user has probably pasted.
@@ -268,6 +281,7 @@ export function usePasteHandler({
canFallbackToClipboardImage &&
onImagePaste
) {
setIsPasting(true)
checkClipboardForImage()
// Reset isPasting since there's no text content to process
setIsPasting(false)
@@ -275,14 +289,17 @@ export function usePasteHandler({
}
// Check if we should handle as paste (from bracketed paste, large input, or continuation)
const shouldHandleAsPaste =
onPaste &&
(input.length > PASTE_THRESHOLD ||
pastePendingRef.current ||
hasImageFilePath ||
isFromPaste)
const shouldHandleAsPaste = shouldHandleInputAsPaste({
hasTextPasteHandler: Boolean(onPaste),
hasImagePasteHandler: Boolean(onImagePaste),
inputLength: input.length,
pastePending: pastePendingRef.current,
hasImageFilePath,
isFromPaste,
})
if (shouldHandleAsPaste) {
setIsPasting(true)
pastePendingRef.current = true
setPasteState(({ chunks, timeoutId }) => {
return {

View File

@@ -1,34 +1,23 @@
/**
* Swarm Permission Poller Hook
* Swarm Permission Callback Registry
*
* This hook polls for permission responses from the team leader when running
* as a worker agent in a swarm. When a response is received, it calls the
* appropriate callback (onAllow/onReject) to continue execution.
* Manages callback registrations for permission requests and responses
* in agent swarms. Responses are delivered exclusively via the mailbox
* system (useInboxPoller → processMailboxPermissionResponse).
*
* This hook should be used in conjunction with the worker-side integration
* in useCanUseTool.ts, which creates pending requests that this hook monitors.
* The legacy file-based polling (resolved/ directory) has been removed
* because it created an unauthenticated attack surface — any local process
* could forge approval files. The mailbox path is the sole active channel.
*/
import { useCallback, useEffect, useRef } from 'react'
import { useInterval } from 'usehooks-ts'
import { logForDebugging } from '../utils/debug.js'
import { errorMessage } from '../utils/errors.js'
import {
type PermissionUpdate,
permissionUpdateSchema,
} from '../utils/permissions/PermissionUpdateSchema.js'
import {
isSwarmWorker,
type PermissionResponse,
pollForResponse,
removeWorkerResponse,
} from '../utils/swarm/permissionSync.js'
import { getAgentName, getTeamName } from '../utils/teammate.js'
const POLL_INTERVAL_MS = 500
/**
* Validate permissionUpdates from external sources (mailbox IPC, disk polling).
* Validate permissionUpdates from external sources (mailbox IPC).
* Malformed entries from buggy/old teammate processes are filtered out rather
* than propagated unchecked into callback.onAllow().
*/
@@ -225,106 +214,9 @@ export function processSandboxPermissionResponse(params: {
return true
}
/**
* Process a permission response by invoking the registered callback
*/
function processResponse(response: PermissionResponse): boolean {
const callback = pendingCallbacks.get(response.requestId)
if (!callback) {
logForDebugging(
`[SwarmPermissionPoller] No callback registered for request ${response.requestId}`,
)
return false
}
logForDebugging(
`[SwarmPermissionPoller] Processing response for request ${response.requestId}: ${response.decision}`,
)
// Remove from registry before invoking callback
pendingCallbacks.delete(response.requestId)
if (response.decision === 'approved') {
const permissionUpdates = parsePermissionUpdates(response.permissionUpdates)
const updatedInput = response.updatedInput
callback.onAllow(updatedInput, permissionUpdates)
} else {
callback.onReject(response.feedback)
}
return true
}
/**
* Hook that polls for permission responses when running as a swarm worker.
*
* This hook:
* 1. Only activates when isSwarmWorker() returns true
* 2. Polls every 500ms for responses
* 3. When a response is found, invokes the registered callback
* 4. Cleans up the response file after processing
*/
export function useSwarmPermissionPoller(): void {
const isProcessingRef = useRef(false)
const poll = useCallback(async () => {
// Don't poll if not a swarm worker
if (!isSwarmWorker()) {
return
}
// Prevent concurrent polling
if (isProcessingRef.current) {
return
}
// Don't poll if no callbacks are registered
if (pendingCallbacks.size === 0) {
return
}
isProcessingRef.current = true
try {
const agentName = getAgentName()
const teamName = getTeamName()
if (!agentName || !teamName) {
return
}
// Check each pending request for a response
for (const [requestId, _callback] of pendingCallbacks) {
const response = await pollForResponse(requestId, agentName, teamName)
if (response) {
// Process the response
const processed = processResponse(response)
if (processed) {
// Clean up the response from the worker's inbox
await removeWorkerResponse(requestId, agentName, teamName)
}
}
}
} catch (error) {
logForDebugging(
`[SwarmPermissionPoller] Error during poll: ${errorMessage(error)}`,
)
} finally {
isProcessingRef.current = false
}
}, [])
// Only poll if we're a swarm worker
const shouldPoll = isSwarmWorker()
useInterval(() => void poll(), shouldPoll ? POLL_INTERVAL_MS : null)
// Initial poll on mount
useEffect(() => {
if (isSwarmWorker()) {
void poll()
}
}, [poll])
}
// Legacy file-based polling (useSwarmPermissionPoller, processResponse)
// has been removed. Permission responses are now delivered exclusively
// via the mailbox system:
// Leader: sendPermissionResponseViaMailbox() → writeToMailbox()
// Worker: useInboxPoller → processMailboxPermissionResponse()
// See: fix(security) — remove unauthenticated file-based permission channel
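The mailbox-side dispatcher named above, processMailboxPermissionResponse, sits outside this hunk. A minimal sketch of the dispatch it would perform against the callback registry, modeled on the deleted processResponse above rather than on the verified implementation:

// Sketch only: not part of this diff; the actual dispatcher in this file
// may differ in shape and naming. The response type is written structurally
// because the old PermissionResponse import was removed with the poller.
export function processMailboxPermissionResponse(response: {
  requestId: string
  decision: 'approved' | 'rejected'
  updatedInput?: unknown
  permissionUpdates?: unknown
  feedback?: string
}): boolean {
  const callback = pendingCallbacks.get(response.requestId)
  if (!callback) return false
  // Deregister before invoking so a re-delivered mailbox message is a no-op.
  pendingCallbacks.delete(response.requestId)
  if (response.decision === 'approved') {
    // Same external-input validation path as before: malformed updates are
    // filtered out rather than propagated into onAllow().
    callback.onAllow(
      response.updatedInput,
      parsePermissionUpdates(response.permissionUpdates),
    )
  } else {
    callback.onReject(response.feedback)
  }
  return true
}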

View File

@@ -11,14 +11,16 @@ const execFileNoThrowMock = mock(
async () => ({ code: 0, stdout: '', stderr: '' }),
)
mock.module('../../utils/execFileNoThrow.js', () => ({
function installOscMocks(): void {
mock.module('../../utils/execFileNoThrow.js', () => ({
execFileNoThrow: execFileNoThrowMock,
execFileNoThrowWithCwd: execFileNoThrowMock,
}))
}))
mock.module('../../utils/tempfile.js', () => ({
mock.module('../../utils/tempfile.js', () => ({
generateTempFilePath: generateTempFilePathMock,
}))
}))
}
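// The ?ts= query string busts the module cache so every test imports a fresh
// osc.ts instance that re-reads the mocks and env mutations installed above.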
async function importFreshOscModule() {
return import(`./osc.ts?ts=${Date.now()}-${Math.random()}`)
@@ -45,6 +47,7 @@ async function waitForExecCall(
describe('Windows clipboard fallback', () => {
beforeEach(() => {
installOscMocks()
execFileNoThrowMock.mockClear()
generateTempFilePathMock.mockClear()
process.env = { ...originalEnv }
@@ -62,14 +65,12 @@ describe('Windows clipboard fallback', () => {
const { setClipboard } = await importFreshOscModule()
await setClipboard('Привет мир')
await flushClipboardCopy()
const windowsCall = await waitForExecCall('powershell')
expect(execFileNoThrowMock.mock.calls.some(([cmd]) => cmd === 'clip')).toBe(
false,
)
expect(
execFileNoThrowMock.mock.calls.some(([cmd]) => cmd === 'powershell'),
).toBe(true)
expect(windowsCall).toBeDefined()
})
test('passes Windows clipboard text through a UTF-8 temp file instead of stdin', async () => {
@@ -97,6 +98,7 @@ describe('Windows clipboard fallback', () => {
describe('clipboard path behavior remains stable', () => {
beforeEach(() => {
installOscMocks()
execFileNoThrowMock.mockClear()
process.env = { ...originalEnv }
delete process.env['SSH_CONNECTION']

View File

@@ -12,7 +12,7 @@ import {
* One-shot migration: clear skipAutoPermissionPrompt for users who accepted
* the old 2-option AutoModeOptInDialog but don't have auto as their default.
* Re-surfaces the dialog so they see the new "make it my default mode" option.
* Guard lives in GlobalConfig (~/.claude.json), not settings.json, so it
* Guard lives in GlobalConfig (~/.openclaude.json), not settings.json, so it
* survives settings resets and doesn't re-arm itself.
*
* Only runs when tengu_auto_mode_config.enabled === 'enabled'. For 'opt-in'

View File

@@ -3873,7 +3873,7 @@ export function REPL({
// empty to non-empty, not on every length change -- otherwise a render loop
// (concurrent onQuery thrashing, etc.) spams saveGlobalConfig, which hits
// ELOCKED under concurrent sessions and falls back to unlocked writes.
// That write storm is the primary trigger for ~/.claude.json corruption
// That write storm is the primary trigger for ~/.openclaude.json corruption
// (GH #3117).
const hasCountedQueueUseRef = useRef(false);
useEffect(() => {

View File

@@ -334,7 +334,7 @@ async function processRemoteEvalPayload(
// Empty object is truthy — without the length check, `{features: {}}`
// (transient server bug, truncated response) would pass, clear the maps
// below, return true, and syncRemoteEvalToDisk would wholesale-write `{}`
// to disk: total flag blackout for every process sharing ~/.claude.json.
// to disk: total flag blackout for every process sharing ~/.openclaude.json.
if (!payload?.features || Object.keys(payload.features).length === 0) {
return false
}

View File

@@ -23,6 +23,7 @@ import { randomUUID } from 'crypto'
import {
getAPIProvider,
isFirstPartyAnthropicBaseUrl,
isGithubNativeAnthropicMode,
} from 'src/utils/model/providers.js'
import {
getAttributionHeader,
@@ -334,8 +335,13 @@ export function getPromptCachingEnabled(model: string): boolean {
// Prompt caching is an Anthropic-specific feature. Third-party providers
// do not understand cache_control blocks and strict backends (e.g. Azure
// Foundry) reject or flag requests that contain them.
//
// Exception: when the GitHub provider is configured in native Anthropic API
// mode (CLAUDE_CODE_GITHUB_ANTHROPIC_API=1), requests are sent in Anthropic
// format, so cache_control blocks are supported.
const provider = getAPIProvider()
if (provider !== 'firstParty' && provider !== 'bedrock' && provider !== 'vertex') {
const isNativeGithub = isGithubNativeAnthropicMode(model)
if (provider !== 'firstParty' && provider !== 'bedrock' && provider !== 'vertex' && !isNativeGithub) {
return false
}
@@ -1211,7 +1217,7 @@ async function* queryModel(
cachedMCEnabled = featureEnabled && modelSupported
const config = getCachedMCConfig()
logForDebugging(
`Cached MC gate: enabled=${featureEnabled} modelSupported=${modelSupported} model=${options.model} supportedModels=${jsonStringify(config.supportedModels)}`,
`Cached MC gate: enabled=${featureEnabled} modelSupported=${modelSupported} model=${options.model} supportedModels=${jsonStringify(config?.supportedModels)}`,
)
}
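isGithubNativeAnthropicMode is imported in both this file and the client below, but its body is not part of this diff. A plausible sketch of the gate, assuming only the CLAUDE_CODE_GITHUB_ANTHROPIC_API=1 opt-in described in the comments; the model parameter is kept to match the call sites, though how (or whether) the real helper uses it is not shown here:

// Sketch only: the real helper lives in src/utils/model/providers.ts.
export function isGithubNativeAnthropicMode(_model: string): boolean {
  return (
    isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB) &&
    isEnvTruthy(process.env.CLAUDE_CODE_GITHUB_ANTHROPIC_API)
  )
}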

View File

@@ -14,6 +14,7 @@ import { getSmallFastModel } from 'src/utils/model/model.js'
import {
getAPIProvider,
isFirstPartyAnthropicBaseUrl,
isGithubNativeAnthropicMode,
} from 'src/utils/model/providers.js'
import { getProxyFetchOptions } from 'src/utils/proxy.js'
import {
@@ -174,6 +175,25 @@ export async function getAnthropicClient({
providerOverride,
}) as unknown as Anthropic
}
// GitHub provider in native Anthropic API mode: send requests in Anthropic
// format so cache_control blocks are honoured and prompt caching works.
// Requires the GitHub endpoint (OPENAI_BASE_URL) to support Anthropic's
// messages API — set CLAUDE_CODE_GITHUB_ANTHROPIC_API=1 to opt in.
if (isGithubNativeAnthropicMode(model)) {
const githubBaseUrl =
process.env.OPENAI_BASE_URL?.replace(/\/$/, '') ??
'https://api.githubcopilot.com'
const githubToken =
process.env.GITHUB_TOKEN ?? process.env.GH_TOKEN ?? ''
const nativeArgs: ConstructorParameters<typeof Anthropic>[0] = {
...ARGS,
baseURL: githubBaseUrl,
authToken: githubToken,
// No apiKey — we authenticate via Bearer token (authToken)
apiKey: null,
}
return new Anthropic(nativeArgs)
}
if (
isEnvTruthy(process.env.CLAUDE_CODE_USE_OPENAI) ||
isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB) ||

View File

@@ -8,6 +8,7 @@ import {
convertCodexResponseToAnthropicMessage,
convertToolsToResponsesTools,
} from './codexShim.js'
import { __test as webSearchToolTest } from '../../tools/WebSearchTool/WebSearchTool.js'
const tempDirs: string[] = []
const originalEnv = {
@@ -547,7 +548,7 @@ describe('Codex request translation', () => {
])
})
test('strips leaked reasoning preamble from completed Codex text responses', () => {
test('strips <think> tag block from completed Codex text responses', () => {
const message = convertCodexResponseToAnthropicMessage(
{
id: 'resp_1',
@@ -560,7 +561,7 @@ describe('Codex request translation', () => {
{
type: 'output_text',
text:
'The user just said "hey" - a simple greeting. I should respond briefly and friendly.\n\nHey! How can I help you today?',
'<think>user wants a greeting, respond briefly</think>Hey! How can I help you today?',
},
],
},
@@ -578,6 +579,195 @@ describe('Codex request translation', () => {
])
})
test('strips unterminated <think> tag at block boundary in Codex completed response', () => {
const message = convertCodexResponseToAnthropicMessage(
{
id: 'resp_1',
model: 'gpt-5.4',
output: [
{
type: 'message',
role: 'assistant',
content: [
{
type: 'output_text',
text:
'Here is the answer.\n<think>wait, let me reconsider the user request',
},
],
},
],
usage: { input_tokens: 12, output_tokens: 4 },
},
'gpt-5.4',
)
expect(message.content).toEqual([
{
type: 'text',
text: 'Here is the answer.',
},
])
})
test('recovers Codex web search text and sources from sparse completed response', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
sources: [
{
title: 'OpenClaude repo',
url: 'https://github.com/example/openclaude',
},
],
},
{
type: 'message',
role: 'assistant',
content: [
{
type: 'text',
text: 'OpenClaude is available on GitHub.',
sources: [
{
title: 'Docs',
url: 'https://docs.example.com/openclaude',
},
],
},
],
},
],
},
'OpenClaude GitHub 2026',
0.42,
)
expect(output.results).toEqual([
'OpenClaude is available on GitHub.',
{
tool_use_id: 'codex-web-search',
content: [
{
title: 'OpenClaude repo',
url: 'https://github.com/example/openclaude',
},
{
title: 'Docs',
url: 'https://docs.example.com/openclaude',
},
],
},
])
})
test('falls back to a non-empty Codex web search result message', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{ output: [] },
'OpenClaude GitHub 2026',
0.11,
)
expect(output.results).toEqual(['No results found.'])
})
test('surfaces Codex web search failure reason with a message', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
status: 'failed',
error: { message: 'upstream search provider rate-limited' },
},
],
},
'OpenClaude GitHub 2026',
0.05,
)
expect(output.results).toEqual([
'Web search failed: upstream search provider rate-limited',
])
})
test('surfaces Codex web search failure reason nested under action.error', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
status: 'failed',
action: { error: { message: 'query blocked' } },
},
],
},
'OpenClaude GitHub 2026',
0.05,
)
expect(output.results).toEqual(['Web search failed: query blocked'])
})
test('handles Codex web search failure with no reason attached', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
status: 'failed',
},
],
},
'OpenClaude GitHub 2026',
0.05,
)
expect(output.results).toEqual(['Web search failed.'])
})
test('a failure item does not suppress sources from a later message item', () => {
const output = webSearchToolTest.makeOutputFromCodexWebSearchResponse(
{
output: [
{
type: 'web_search_call',
status: 'failed',
error: { message: 'partial outage' },
},
{
type: 'message',
role: 'assistant',
content: [
{
type: 'output_text',
text: 'Partial results below.',
sources: [
{ title: 'Docs', url: 'https://docs.example.com/openclaude' },
],
},
],
},
],
},
'OpenClaude GitHub 2026',
0.05,
)
expect(output.results).toEqual([
'Web search failed: partial outage',
'Partial results below.',
{
tool_use_id: 'codex-web-search',
content: [
{ title: 'Docs', url: 'https://docs.example.com/openclaude' },
],
},
])
})
test('translates Codex SSE text stream into Anthropic events', async () => {
const responseText = [
'event: response.output_item.added',
@@ -609,7 +799,7 @@ describe('Codex request translation', () => {
])
})
test('strips leaked reasoning preamble from Codex SSE text stream', async () => {
test('strips <think> tag block from Codex SSE text stream', async () => {
const responseText = [
'event: response.output_item.added',
'data: {"type":"response.output_item.added","item":{"id":"msg_1","type":"message","status":"in_progress","content":[],"role":"assistant"},"output_index":0,"sequence_number":0}',
@@ -618,13 +808,13 @@ describe('Codex request translation', () => {
'data: {"type":"response.content_part.added","content_index":0,"item_id":"msg_1","output_index":0,"part":{"type":"output_text","text":""},"sequence_number":1}',
'',
'event: response.output_text.delta',
'data: {"type":"response.output_text.delta","content_index":0,"delta":"The user just said \\"hey\\" - a simple greeting. I should respond briefly and friendly.\\n\\nHey! How can I help you today?","item_id":"msg_1","output_index":0,"sequence_number":2}',
'data: {"type":"response.output_text.delta","content_index":0,"delta":"<think>user wants a greeting, respond briefly</think>Hey! How can I help you today?","item_id":"msg_1","output_index":0,"sequence_number":2}',
'',
'event: response.output_item.done',
'data: {"type":"response.output_item.done","item":{"id":"msg_1","type":"message","status":"completed","content":[{"type":"output_text","text":"The user just said \\"hey\\" - a simple greeting. I should respond briefly and friendly.\\n\\nHey! How can I help you today?"}],"role":"assistant"},"output_index":0,"sequence_number":3}',
'data: {"type":"response.output_item.done","item":{"id":"msg_1","type":"message","status":"completed","content":[{"type":"output_text","text":"<think>user wants a greeting, respond briefly</think>Hey! How can I help you today?"}],"role":"assistant"},"output_index":0,"sequence_number":3}',
'',
'event: response.completed',
'data: {"type":"response.completed","response":{"id":"resp_1","status":"completed","model":"gpt-5.4","output":[{"type":"message","role":"assistant","content":[{"type":"output_text","text":"The user just said \\"hey\\" - a simple greeting. I should respond briefly and friendly.\\n\\nHey! How can I help you today?"}]}],"usage":{"input_tokens":2,"output_tokens":1}},"sequence_number":4}',
'data: {"type":"response.completed","response":{"id":"resp_1","status":"completed","model":"gpt-5.4","output":[{"type":"message","role":"assistant","content":[{"type":"output_text","text":"<think>user wants a greeting, respond briefly</think>Hey! How can I help you today?"}]}],"usage":{"input_tokens":2,"output_tokens":1}},"sequence_number":4}',
'',
].join('\n')
@@ -646,6 +836,50 @@ describe('Codex request translation', () => {
}
}
expect(textDeltas).toEqual(['Hey! How can I help you today?'])
expect(textDeltas.join('')).toBe('Hey! How can I help you today?')
})
test('preserves prose without tags (no phrase-based false positive)', async () => {
// Regression test: older phrase-based sanitizer would incorrectly strip text
// starting with "I should" or "The user". The tag-based approach leaves it alone.
const responseText = [
'event: response.output_item.added',
'data: {"type":"response.output_item.added","item":{"id":"msg_1","type":"message","status":"in_progress","content":[],"role":"assistant"},"output_index":0,"sequence_number":0}',
'',
'event: response.content_part.added',
'data: {"type":"response.content_part.added","content_index":0,"item_id":"msg_1","output_index":0,"part":{"type":"output_text","text":""},"sequence_number":1}',
'',
'event: response.output_text.delta',
'data: {"type":"response.output_text.delta","content_index":0,"delta":"I should note that the user role requires a briefly concise friendly response format.","item_id":"msg_1","output_index":0,"sequence_number":2}',
'',
'event: response.output_item.done',
'data: {"type":"response.output_item.done","item":{"id":"msg_1","type":"message","status":"completed","content":[{"type":"output_text","text":"I should note that the user role requires a briefly concise friendly response format."}],"role":"assistant"},"output_index":0,"sequence_number":3}',
'',
'event: response.completed',
'data: {"type":"response.completed","response":{"id":"resp_1","status":"completed","model":"gpt-5.4","output":[{"type":"message","role":"assistant","content":[{"type":"output_text","text":"I should note that the user role requires a briefly concise friendly response format."}]}],"usage":{"input_tokens":2,"output_tokens":1}},"sequence_number":4}',
'',
].join('\n')
const stream = new ReadableStream({
start(controller) {
controller.enqueue(new TextEncoder().encode(responseText))
controller.close()
},
})
const textDeltas: string[] = []
for await (const event of codexStreamToAnthropic(
new Response(stream),
'gpt-5.4',
)) {
const delta = (event as { delta?: { type?: string; text?: string } }).delta
if (delta?.type === 'text_delta' && typeof delta.text === 'string') {
textDeltas.push(delta.text)
}
}
expect(textDeltas.join('')).toBe(
'I should note that the user role requires a briefly concise friendly response format.',
)
})
})

View File

@@ -1,14 +1,15 @@
import { APIError } from '@anthropic-ai/sdk'
import { compressToolHistory } from './compressToolHistory.js'
import { fetchWithProxyRetry } from './fetchWithProxyRetry.js'
import type {
ResolvedCodexCredentials,
ResolvedProviderRequest,
} from './providerConfig.js'
import { sanitizeSchemaForOpenAICompat } from './openaiSchemaSanitizer.js'
import {
looksLikeLeakedReasoningPrefix,
shouldBufferPotentialReasoningPrefix,
stripLeakedReasoningPreamble,
} from './reasoningLeakSanitizer.js'
createThinkTagFilter,
stripThinkTags,
} from './thinkTagSanitizer.js'
export interface AnthropicUsage {
input_tokens: number
@@ -484,13 +485,15 @@ export async function performCodexRequest(options: {
defaultHeaders: Record<string, string>
signal?: AbortSignal
}): Promise<Response> {
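// The Codex path now shrinks stale tool_result bulk before converting to
// Responses input; the tier logic lives in compressToolHistory.ts, which is
// added later in this diff.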
const input = convertAnthropicMessagesToResponsesInput(
const compressedMessages = compressToolHistory(
options.params.messages as Array<{
role?: string
message?: { role?: string; content?: unknown }
content?: unknown
}>,
options.request.resolvedModel,
)
const input = convertAnthropicMessagesToResponsesInput(compressedMessages)
const body: Record<string, unknown> = {
model: options.request.resolvedModel,
input: input.length > 0
@@ -559,12 +562,15 @@ export async function performCodexRequest(options: {
}
headers.originator ??= 'openclaude'
const response = await fetch(`${options.request.baseUrl}/responses`, {
const response = await fetchWithProxyRetry(
`${options.request.baseUrl}/responses`,
{
method: 'POST',
headers,
body: JSON.stringify(body),
signal: options.signal,
})
},
)
if (!response.ok) {
const errorBody = await response.text().catch(() => 'unknown error')
@@ -730,34 +736,29 @@ export async function* codexStreamToAnthropic(
{ index: number; toolUseId: string }
>()
let activeTextBlockIndex: number | null = null
let activeTextBuffer = ''
let textBufferMode: 'none' | 'pending' | 'strip' = 'none'
const thinkFilter = createThinkTagFilter()
let nextContentBlockIndex = 0
let sawToolUse = false
let finalResponse: Record<string, any> | undefined
const closeActiveTextBlock = async function* () {
if (activeTextBlockIndex === null) return
if (textBufferMode !== 'none') {
const sanitized = stripLeakedReasoningPreamble(activeTextBuffer)
if (sanitized) {
const tail = thinkFilter.flush()
if (tail) {
yield {
type: 'content_block_delta',
index: activeTextBlockIndex,
delta: {
type: 'text_delta',
text: sanitized,
text: tail,
},
}
}
}
yield {
type: 'content_block_stop',
index: activeTextBlockIndex,
}
activeTextBlockIndex = null
activeTextBuffer = ''
textBufferMode = 'none'
}
const startTextBlockIfNeeded = async function* () {
@@ -833,43 +834,17 @@ export async function* codexStreamToAnthropic(
if (event.event === 'response.output_text.delta') {
yield* startTextBlockIfNeeded()
activeTextBuffer += payload.delta ?? ''
if (activeTextBlockIndex !== null) {
if (
textBufferMode === 'strip' ||
looksLikeLeakedReasoningPrefix(activeTextBuffer)
) {
textBufferMode = 'strip'
continue
}
if (textBufferMode === 'pending') {
if (shouldBufferPotentialReasoningPrefix(activeTextBuffer)) {
continue
}
const visible = thinkFilter.feed(payload.delta ?? '')
if (visible) {
yield {
type: 'content_block_delta',
index: activeTextBlockIndex,
delta: {
type: 'text_delta',
text: activeTextBuffer,
text: visible,
},
}
textBufferMode = 'none'
continue
}
if (shouldBufferPotentialReasoningPrefix(activeTextBuffer)) {
textBufferMode = 'pending'
continue
}
yield {
type: 'content_block_delta',
index: activeTextBlockIndex,
delta: {
type: 'text_delta',
text: payload.delta ?? '',
},
}
}
continue
@@ -965,7 +940,7 @@ export function convertCodexResponseToAnthropicMessage(
if (part?.type === 'output_text') {
content.push({
type: 'text',
text: stripLeakedReasoningPreamble(part.text ?? ''),
text: stripThinkTags(part.text ?? ''),
})
}
}
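The new thinkTagSanitizer module that replaces reasoningLeakSanitizer is not included in these hunks. A minimal sketch of the feed/flush contract the stream path above relies on, assuming only the createThinkTagFilter and stripThinkTags names from the import; the real implementation may differ:

// Sketch only: thinkTagSanitizer.ts itself is not part of this diff.
const OPEN = '<think>'
const CLOSE = '</think>'

// Longest suffix of s that is a proper prefix of tag, i.e. a possibly
// incomplete tag split across stream chunks.
function partialTagSuffix(s: string, tag: string): number {
  for (let k = Math.min(s.length, tag.length - 1); k > 0; k--) {
    if (tag.startsWith(s.slice(s.length - k))) return k
  }
  return 0
}

// Non-streaming variant: drop terminated <think>…</think> blocks and any
// unterminated <think>… tail (see the block-boundary test above).
export function stripThinkTags(text: string): string {
  return text
    .replace(/<think>[\s\S]*?<\/think>/g, '')
    .replace(/<think>[\s\S]*$/, '')
    .trim()
}

export function createThinkTagFilter(): {
  feed(delta: string): string
  flush(): string
} {
  let pending = ''
  let inThink = false
  return {
    // Returns only text provably outside a <think> block; holds back
    // partial tag prefixes until the next chunk resolves them.
    feed(delta: string): string {
      pending += delta
      let out = ''
      for (;;) {
        if (inThink) {
          const i = pending.indexOf(CLOSE)
          if (i === -1) {
            // Keep only a possible partial "</think" suffix; drop the rest.
            pending = pending.slice(pending.length - partialTagSuffix(pending, CLOSE))
            return out
          }
          pending = pending.slice(i + CLOSE.length)
          inThink = false
        } else {
          const i = pending.indexOf(OPEN)
          if (i === -1) {
            const hold = partialTagSuffix(pending, OPEN)
            out += pending.slice(0, pending.length - hold)
            pending = pending.slice(pending.length - hold)
            return out
          }
          out += pending.slice(0, i)
          pending = pending.slice(i + OPEN.length)
          inThink = true
        }
      }
    },
    // Emit any held-back text that turned out not to be a tag; an
    // unterminated <think> body is dropped.
    flush(): string {
      const tail = inThink ? '' : pending
      pending = ''
      inThink = false
      return tail
    },
  }
}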

View File

@@ -0,0 +1,572 @@
import { afterEach, beforeEach, expect, mock, test } from 'bun:test'
import { compressToolHistory, getTiers } from './compressToolHistory.js'
// Mock the two dependencies so tests are deterministic and don't read disk config.
const mockState = {
enabled: true,
effectiveWindow: 100_000,
}
mock.module('../../utils/config.js', () => ({
getGlobalConfig: () => ({
toolHistoryCompressionEnabled: mockState.enabled,
}),
}))
mock.module('../compact/autoCompact.js', () => ({
getEffectiveContextWindowSize: () => mockState.effectiveWindow,
}))
beforeEach(() => {
mockState.enabled = true
mockState.effectiveWindow = 100_000
})
afterEach(() => {
mockState.enabled = true
mockState.effectiveWindow = 100_000
})
type Block = Record<string, unknown>
type Msg = { role: string; content: Block[] | string }
function bigText(n: number): string {
return 'x'.repeat(n)
}
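// Each exchange is an assistant tool_use plus the paired user tool_result of
// the requested size.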
function buildToolExchange(id: number, resultLength: number): Msg[] {
return [
{
role: 'assistant',
content: [
{
type: 'tool_use',
id: `toolu_${id}`,
name: 'Read',
input: { file_path: `/path/to/file${id}.ts` },
},
],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: `toolu_${id}`,
content: bigText(resultLength),
},
],
},
]
}
function buildConversation(numToolExchanges: number, resultLength = 5_000): Msg[] {
const out: Msg[] = [{ role: 'user', content: 'Initial request' }]
for (let i = 0; i < numToolExchanges; i++) {
out.push(...buildToolExchange(i, resultLength))
}
return out
}
function getResultMessages(messages: Msg[]): Msg[] {
return messages.filter(
m => Array.isArray(m.content) && m.content.some((b: any) => b.type === 'tool_result'),
)
}
function getResultBlock(msg: Msg): Block {
return (msg.content as Block[]).find((b: any) => b.type === 'tool_result') as Block
}
function getResultText(msg: Msg): string {
const block = getResultBlock(msg)
const c = block.content
if (typeof c === 'string') return c
if (Array.isArray(c)) {
return c
.filter((b: any) => b.type === 'text')
.map((b: any) => b.text)
.join('\n')
}
return ''
}
// ---------- getTiers ----------
test('getTiers: < 16k window → recent=2, mid=3', () => {
expect(getTiers(8_000)).toEqual({ recent: 2, mid: 3 })
})
test('getTiers: 16k–32k → recent=3, mid=5', () => {
expect(getTiers(20_000)).toEqual({ recent: 3, mid: 5 })
})
test('getTiers: 32k–64k → recent=4, mid=8', () => {
expect(getTiers(48_000)).toEqual({ recent: 4, mid: 8 })
})
test('getTiers: 64k–128k (Copilot gpt-4o) → recent=5, mid=10', () => {
expect(getTiers(100_000)).toEqual({ recent: 5, mid: 10 })
})
test('getTiers: 128k–256k (Copilot Claude) → recent=8, mid=15', () => {
expect(getTiers(200_000)).toEqual({ recent: 8, mid: 15 })
})
test('getTiers: 256k–500k → recent=12, mid=25', () => {
expect(getTiers(400_000)).toEqual({ recent: 12, mid: 25 })
})
test('getTiers: ≥ 500k (gpt-4.1 1M) → recent=25, mid=50', () => {
expect(getTiers(1_000_000)).toEqual({ recent: 25, mid: 50 })
})
// ---------- master switch ----------
test('pass-through when toolHistoryCompressionEnabled is false', () => {
mockState.enabled = false
const messages = buildConversation(20)
const result = compressToolHistory(messages, 'gpt-4o')
expect(result).toBe(messages) // same reference (no transformation)
})
test('pass-through when total tool_results <= recent tier', () => {
// 100k effective → recent=5; only 4 exchanges → no compression
const messages = buildConversation(4)
const result = compressToolHistory(messages, 'gpt-4o')
expect(result).toBe(messages)
})
// ---------- per-tier behavior ----------
test('recent tier: tool_result content untouched', () => {
// 100k effective → recent=5, mid=10. With 6 exchanges, only the oldest is touched.
const messages = buildConversation(6, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// Last 5 should be untouched (full 5000 chars)
for (let i = resultMsgs.length - 5; i < resultMsgs.length; i++) {
expect(getResultText(resultMsgs[i]).length).toBe(5_000)
}
})
test('mid tier: long content truncated to MID_MAX_CHARS with marker', () => {
// 100k → recent=5, mid=10. 10 exchanges: 5 recent + 5 mid (none old).
const messages = buildConversation(10, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// First 5 are mid tier — should be truncated to ~2000 chars + marker
for (let i = 0; i < 5; i++) {
const text = getResultText(resultMsgs[i])
expect(text).toContain('[…truncated')
expect(text).toContain('chars from tool history]')
// Should be roughly 2000 chars + marker (under 2200)
expect(text.length).toBeLessThan(2_200)
expect(text.length).toBeGreaterThan(2_000)
}
})
test('mid tier: short content (< MID_MAX_CHARS) untouched', () => {
const messages = buildConversation(10, 500) // 500 < MID_MAX_CHARS
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
for (let i = 0; i < 5; i++) {
expect(getResultText(resultMsgs[i])).toBe(bigText(500))
}
})
test('old tier: content replaced with stub [name args={...} → N chars omitted]', () => {
// 100k → recent=5, mid=10, old=rest. 20 exchanges → 5 old + 10 mid + 5 recent.
const messages = buildConversation(20, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// First 5 are old tier — should be stubs
for (let i = 0; i < 5; i++) {
const text = getResultText(resultMsgs[i])
expect(text).toMatch(/^\[Read args=\{.*\} → 5000 chars omitted\]$/)
}
})
test('old tier: stub args truncated to 200 chars', () => {
const longArg = bigText(500)
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [
{
type: 'tool_use',
id: 'toolu_x',
name: 'Bash',
input: { command: longArg },
},
],
},
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'toolu_x', content: 'output' },
],
},
// Pad with enough recent exchanges to push the above into old tier
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
const text = getResultText(resultMsgs[0])
// Stub format: [Bash args=<json≤200chars> → N chars omitted]
// The args portion (between args= and →) must be ≤ 200 chars.
const argsMatch = text.match(/args=(.*?) →/)
expect(argsMatch).not.toBeNull()
expect(argsMatch![1].length).toBeLessThanOrEqual(200)
})
test('old tier: orphan tool_result (no matching tool_use) falls back to "tool"', () => {
const messages: Msg[] = [
{ role: 'user', content: 'start' },
// Orphan: tool_result without matching tool_use in history
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'orphan_id', content: 'data' },
],
},
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
const text = getResultText(resultMsgs[0])
expect(text).toMatch(/^\[tool args=\{\} → 4 chars omitted\]$/)
})
// ---------- structural preservation ----------
test('tool_use blocks always preserved', () => {
const messages = buildConversation(20, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const useCount = (msgs: Msg[]) =>
msgs.reduce((sum, m) => {
if (!Array.isArray(m.content)) return sum
return sum + m.content.filter((b: any) => b.type === 'tool_use').length
}, 0)
expect(useCount(result as Msg[])).toBe(useCount(messages))
})
test('text blocks always preserved', () => {
const messages: Msg[] = [
{ role: 'user', content: 'first' },
{
role: 'assistant',
content: [
{ type: 'text', text: 'reasoning before tool' },
{ type: 'tool_use', id: 'toolu_1', name: 'Read', input: {} },
],
},
{
role: 'user',
content: [{ type: 'tool_result', tool_use_id: 'toolu_1', content: bigText(5000) }],
},
...buildConversation(20, 5_000).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const assistantMsg = (result as Msg[])[1]
const textBlock = (assistantMsg.content as Block[]).find((b: any) => b.type === 'text')
expect(textBlock).toEqual({ type: 'text', text: 'reasoning before tool' })
})
test('thinking blocks always preserved', () => {
const messages: Msg[] = [
{ role: 'user', content: 'first' },
{
role: 'assistant',
content: [
{ type: 'thinking', thinking: 'internal reasoning', signature: 'sig' },
{ type: 'tool_use', id: 'toolu_1', name: 'Read', input: {} },
],
},
{
role: 'user',
content: [{ type: 'tool_result', tool_use_id: 'toolu_1', content: bigText(5000) }],
},
...buildConversation(20, 5_000).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const assistantMsg = (result as Msg[])[1]
const thinking = (assistantMsg.content as Block[]).find((b: any) => b.type === 'thinking')
expect(thinking).toEqual({
type: 'thinking',
thinking: 'internal reasoning',
signature: 'sig',
})
})
test('non-array content (string) handled gracefully', () => {
const messages: Msg[] = [
{ role: 'user', content: 'plain string content' },
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
expect((result as Msg[])[0].content).toBe('plain string content')
})
test('empty content array handled gracefully', () => {
const messages: Msg[] = [
{ role: 'user', content: [] },
...buildConversation(20, 100).slice(1),
]
expect(() => compressToolHistory(messages, 'gpt-4o')).not.toThrow()
})
// ---------- message shape compatibility ----------
test('wrapped shape ({ message: { role, content } }) handled', () => {
type WrappedMsg = { message: { role: string; content: Block[] | string } }
const wrap = (m: Msg): WrappedMsg => ({ message: { role: m.role, content: m.content } })
const messages = buildConversation(20, 5_000).map(wrap)
const result = compressToolHistory(messages as any, 'gpt-4o')
// First wrapped tool-result message should have stub content (old tier)
const firstResultMsg = (result as WrappedMsg[]).find(
m =>
Array.isArray(m.message.content) &&
m.message.content.some((b: any) => b.type === 'tool_result'),
)
const block = (firstResultMsg!.message.content as Block[]).find(
(b: any) => b.type === 'tool_result',
) as Block
const text = ((block.content as Block[])[0] as any).text
expect(text).toMatch(/^\[Read args=.*→ 5000 chars omitted\]$/)
})
test('flat shape ({ role, content }) handled', () => {
const messages = buildConversation(20, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
expect(getResultText(resultMsgs[0])).toMatch(/^\[Read args=.*→ 5000 chars omitted\]$/)
})
// ---------- tier boundary correctness ----------
test('tier boundaries: 6 exchanges → 1 mid + 5 recent (recent=5)', () => {
const messages = buildConversation(6, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// Oldest: mid (truncated)
expect(getResultText(resultMsgs[0])).toContain('[…truncated')
// Last 5: untouched
for (let i = 1; i < 6; i++) {
expect(getResultText(resultMsgs[i]).length).toBe(5_000)
}
})
test('tier boundaries: 16 exchanges → 1 old + 10 mid + 5 recent', () => {
const messages = buildConversation(16, 5_000)
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// Oldest 1: stub (old tier)
expect(getResultText(resultMsgs[0])).toMatch(/^\[Read .*chars omitted\]$/)
// Next 10: mid (truncated)
for (let i = 1; i < 11; i++) {
expect(getResultText(resultMsgs[i])).toContain('[…truncated')
}
// Last 5: untouched
for (let i = 11; i < 16; i++) {
expect(getResultText(resultMsgs[i]).length).toBe(5_000)
}
})
test('large window (1M) with 30 exchanges: recent 25 untouched, none old (recent=25)', () => {
// ≥500k → recent=25, mid=50. 30 exchanges → 5 mid + 25 recent. None old.
mockState.effectiveWindow = 1_000_000
const messages = buildConversation(30, 5_000)
const result = compressToolHistory(messages, 'gpt-4.1')
const resultMsgs = getResultMessages(result)
// Last 25: untouched
for (let i = 5; i < 30; i++) {
expect(getResultText(resultMsgs[i]).length).toBe(5_000)
}
})
// ---------- attribute preservation ----------
test('is_error flag preserved in mid tier', () => {
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [{ type: 'tool_use', id: 'toolu_err', name: 'Bash', input: {} }],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'toolu_err',
is_error: true,
content: bigText(5_000),
},
],
},
// Pad with enough recent exchanges to push the above into MID tier
...buildConversation(10, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
const block = getResultBlock(resultMsgs[0]) as { is_error?: boolean; content: unknown }
expect(block.is_error).toBe(true)
expect(getResultText(resultMsgs[0])).toContain('[…truncated')
})
test('is_error flag preserved in old tier (stub)', () => {
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [{ type: 'tool_use', id: 'toolu_err', name: 'Bash', input: {} }],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'toolu_err',
is_error: true,
content: bigText(5_000),
},
],
},
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
const block = getResultBlock(resultMsgs[0]) as { is_error?: boolean; content: unknown }
expect(block.is_error).toBe(true)
expect(getResultText(resultMsgs[0])).toMatch(/^\[Bash .*chars omitted\]$/)
})
// ---------- COMPACTABLE_TOOLS filter ----------
test('non-compactable tool (e.g. Task/Agent) is NEVER compressed', () => {
// Build conversation where the OLDEST exchange uses a non-compactable tool name
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [
{ type: 'tool_use', id: 'task_1', name: 'Task', input: { goal: 'plan' } },
],
},
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'task_1', content: bigText(5_000) },
],
},
// Pad with 20 compactable exchanges to push Task into old tier
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// First tool_result is for Task (non-compactable) → must remain full
expect(getResultText(resultMsgs[0]).length).toBe(5_000)
expect(getResultText(resultMsgs[0])).not.toContain('chars omitted')
expect(getResultText(resultMsgs[0])).not.toContain('[…truncated')
})
test('mcp__ prefixed tools ARE compactable (matches microCompact behavior)', () => {
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [
{ type: 'tool_use', id: 'mcp_1', name: 'mcp__github__get_issue', input: {} },
],
},
{
role: 'user',
content: [
{ type: 'tool_result', tool_use_id: 'mcp_1', content: bigText(5_000) },
],
},
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// MCP tool result is compressed (gets stub since it's in old tier)
expect(getResultText(resultMsgs[0])).toMatch(/^\[mcp__github__get_issue .*chars omitted\]$/)
})
// ---------- skip already-cleared blocks ----------
test('blocks already cleared by microCompact are NOT re-compressed', () => {
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [{ type: 'tool_use', id: 'cleared_1', name: 'Read', input: {} }],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'cleared_1',
content: '[Old tool result content cleared]', // microCompact's marker
},
],
},
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
// Already-cleared marker survives untouched (no double processing)
expect(getResultText(resultMsgs[0])).toBe('[Old tool result content cleared]')
})
test('extra block attributes (e.g. cache_control) preserved across rewrites', () => {
const cacheControl = { type: 'ephemeral' }
const messages: Msg[] = [
{ role: 'user', content: 'start' },
{
role: 'assistant',
content: [{ type: 'tool_use', id: 'toolu_cc', name: 'Read', input: {} }],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: 'toolu_cc',
cache_control: cacheControl,
content: bigText(5_000),
},
],
},
...buildConversation(20, 100).slice(1),
]
const result = compressToolHistory(messages, 'gpt-4o')
const resultMsgs = getResultMessages(result)
const block = getResultBlock(resultMsgs[0]) as { cache_control?: unknown }
// The custom attribute survived the stub rewrite via ...block spread
expect(block.cache_control).toEqual(cacheControl)
})

View File

@@ -0,0 +1,255 @@
/**
* Compresses old tool_result content for stateless OpenAI-compatible providers
* (Copilot, Mistral, Ollama). Preserves all conversation structure — tool_use,
* tool_result pairing, text, thinking, and is_error all survive intact. Only
* the BULK text of older tool_results is shrunk to delay context saturation.
*
* Tier sizes scale with the model's effective context window via
* getEffectiveContextWindowSize() — same calculation used by auto-compact, so
* the two systems stay aligned.
*
* Complements (does not replace) microCompact.ts:
* - microCompact: time/cache-based, runs from query.ts, binary clear/keep,
* limited to Claude (cache editing) or idle gaps (time-based).
* - compressToolHistory: size-based, runs at the shim layer, tiered
* compression, covers the gap for active sessions on non-Claude providers.
*
* Reuses isCompactableTool from microCompact to avoid touching tools the
* project already classifies as unsafe to compress (e.g. Task, Agent).
* Skips blocks already cleared by microCompact (TOOL_RESULT_CLEARED_MESSAGE).
*
* Anthropic native bypasses both shims, so it is unaffected by this module.
*/
import { getEffectiveContextWindowSize } from '../compact/autoCompact.js'
import { isCompactableTool } from '../compact/microCompact.js'
import { TOOL_RESULT_CLEARED_MESSAGE } from '../../utils/toolResultStorage.js'
import { getGlobalConfig } from '../../utils/config.js'
// Mid-tier truncation budget. 2k chars ≈ 500 tokens, enough to preserve the
// shape of most tool outputs (file headers, command stderr, top grep hits)
// without ballooning context. Bump too high and the tier loses its purpose.
const MID_MAX_CHARS = 2_000
// Stub args budget. JSON.stringify of a typical tool input fits in 200 chars
// (file paths, short commands, small queries). Long inputs are rare and clamping
// here keeps the stub size bounded even when callers pass oversized arguments.
const STUB_ARGS_MAX_CHARS = 200
type AnyMessage = {
role?: string
message?: { role?: string; content?: unknown }
content?: unknown
}
type ToolResultBlock = {
type: 'tool_result'
tool_use_id?: string
is_error?: boolean
content?: unknown
}
type ToolUseBlock = {
type: 'tool_use'
id?: string
name?: string
input?: unknown
}
type Tiers = { recent: number; mid: number }
// Tier sizes scale with effective window. Targets roughly:
// - recent tier stays under ~25% of available window (full fidelity kept)
// - recent + mid tier stays under ~50% of available window (bounded bulk)
// - everything older collapses to ~15-token stubs
// Values assume ~5KB avg tool_result, which matches the Copilot default case
// (parallel_tool_calls=true means multiple Read/Bash outputs per turn). For
// ≥ 500k models the tiers are so generous that compression is effectively
// inert for any realistic session — see compressToolHistory.test.ts.
export function getTiers(effectiveWindow: number): Tiers {
if (effectiveWindow < 16_000) return { recent: 2, mid: 3 }
if (effectiveWindow < 32_000) return { recent: 3, mid: 5 }
if (effectiveWindow < 64_000) return { recent: 4, mid: 8 }
if (effectiveWindow < 128_000) return { recent: 5, mid: 10 }
if (effectiveWindow < 256_000) return { recent: 8, mid: 15 }
if (effectiveWindow < 500_000) return { recent: 12, mid: 25 }
return { recent: 25, mid: 50 }
}
function extractText(content: unknown): string {
if (typeof content === 'string') return content
if (Array.isArray(content)) {
return content
.filter(
(b: { type?: string; text?: string }) =>
b?.type === 'text' && typeof b.text === 'string',
)
.map((b: { text?: string }) => b.text ?? '')
.join('\n')
}
return ''
}
// Old-tier compression strategy. Replaces content entirely with a one-line
// metadata marker ~10× more token-efficient than a 500-char truncation AND
// unambiguous — partial truncations can look authoritative to the model. The
// stub format encodes tool name + args so the model can re-invoke the same
// tool if it needs the omitted output back.
function buildStub(
block: ToolResultBlock,
toolUsesById: Map<string, ToolUseBlock>,
): ToolResultBlock {
const original = extractText(block.content)
const toolUse = toolUsesById.get(block.tool_use_id ?? '')
const name = toolUse?.name ?? 'tool'
const args = toolUse?.input
? JSON.stringify(toolUse.input).slice(0, STUB_ARGS_MAX_CHARS)
: '{}'
return {
...block,
content: [
{
type: 'text',
text: `[${name} args=${args} → ${original.length} chars omitted]`,
},
],
}
}
// Mid-tier compression. The trailing marker is load-bearing: without it, the
// model can't distinguish "tool returned 2000 chars" from "tool returned 20k
// chars that we cut to 2000". Distinguishing those matters for the model's
// decision to re-invoke the tool.
function truncateBlock(
block: ToolResultBlock,
maxChars: number,
): ToolResultBlock {
const text = extractText(block.content)
if (text.length <= maxChars) return block
const omitted = text.length - maxChars
return {
...block,
content: [
{
type: 'text',
text: `${text.slice(0, maxChars)}\n[…truncated ${omitted} chars from tool history]`,
},
],
}
}
function getInner(msg: AnyMessage): { role?: string; content?: unknown } {
return (msg.message ?? msg) as { role?: string; content?: unknown }
}
function indexToolUses(messages: AnyMessage[]): Map<string, ToolUseBlock> {
const map = new Map<string, ToolUseBlock>()
for (const msg of messages) {
const content = getInner(msg).content
if (!Array.isArray(content)) continue
for (const b of content as Array<{ type?: string; id?: string }>) {
if (b?.type === 'tool_use' && b.id) {
map.set(b.id, b as ToolUseBlock)
}
}
}
return map
}
function indexToolResultMessages(messages: AnyMessage[]): number[] {
const indices: number[] = []
for (let i = 0; i < messages.length; i++) {
const inner = getInner(messages[i])
const role = inner.role ?? messages[i].role
const content = inner.content
if (
role === 'user' &&
Array.isArray(content) &&
content.some((b: { type?: string }) => b?.type === 'tool_result')
) {
indices.push(i)
}
}
return indices
}
function rewriteMessage<T extends AnyMessage>(
msg: T,
newContent: unknown[],
): T {
if (msg.message) {
return { ...msg, message: { ...msg.message, content: newContent } }
}
return { ...msg, content: newContent }
}
// microCompact.maybeTimeBasedMicrocompact may have already replaced old
// tool_result content with TOOL_RESULT_CLEARED_MESSAGE before we see it.
// Re-compressing would produce a stub wrapped around a marker (e.g. `[Read
// args={} → 40 chars omitted]`), which is both wasteful and less informative
// than the canonical marker.
function isAlreadyCleared(block: ToolResultBlock): boolean {
const text = extractText(block.content)
return text === TOOL_RESULT_CLEARED_MESSAGE
}
function shouldCompressBlock(
block: ToolResultBlock,
toolUsesById: Map<string, ToolUseBlock>,
): boolean {
if (isAlreadyCleared(block)) return false
const toolUse = toolUsesById.get(block.tool_use_id ?? '')
// Unknown tool name (orphan tool_result with no matching tool_use) falls
// through to compression with a generic "tool" stub. Safer default: the
// original tool_use vanished so there's no downstream use for the output.
if (!toolUse?.name) return true
// Respect microCompact's curated safe-to-compress set (Read/Bash/Grep/…/
// mcp__*) so user-facing flow tools (Task, Agent, custom) stay intact.
return isCompactableTool(toolUse.name)
}
export function compressToolHistory<T extends AnyMessage>(
messages: T[],
model: string,
): T[] {
// Master kill-switch. Returns the original reference so callers skip a
// defensive copy when the feature is disabled.
if (!getGlobalConfig().toolHistoryCompressionEnabled) return messages
const tiers = getTiers(getEffectiveContextWindowSize(model))
const toolResultIndices = indexToolResultMessages(messages)
const total = toolResultIndices.length
// If every tool-result fits in the recent tier, no boundary crosses; return
// the same reference for the same copy-elision reason.
if (total <= tiers.recent) return messages
// O(1) lookup: messageIndex → tool-result position (0 = oldest). Replaces
// the naive Array.indexOf(i) that was O(n²) across the .map below.
const positionByIndex = new Map<number, number>()
for (let pos = 0; pos < toolResultIndices.length; pos++) {
positionByIndex.set(toolResultIndices[pos], pos)
}
const toolUsesById = indexToolUses(messages)
return messages.map((msg, i) => {
const pos = positionByIndex.get(i)
if (pos === undefined) return msg
const fromEnd = total - 1 - pos
if (fromEnd < tiers.recent) return msg
const inMidWindow = fromEnd < tiers.recent + tiers.mid
const content = getInner(msg).content as unknown[]
const newContent = content.map(block => {
const b = block as { type?: string }
if (b?.type !== 'tool_result') return block
const tr = block as ToolResultBlock
if (!shouldCompressBlock(tr, toolUsesById)) return block
return inMidWindow
? truncateBlock(tr, MID_MAX_CHARS)
: buildStub(tr, toolUsesById)
})
return rewriteMessage(msg, newContent)
})
}

View File
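
Taken together, getTiers and the position arithmetic in compressToolHistory pin down each tool result's treatment. A minimal standalone sketch of that arithmetic (the `classify` helper is hypothetical, written for illustration; it is not part of the diff):

type Treatment = 'full' | 'truncate' | 'stub'
// Hypothetical helper mirroring the fromEnd arithmetic above. pos counts
// tool-result messages from the oldest (0); total is how many exist.
function classify(
  pos: number,
  total: number,
  tiers: { recent: number; mid: number },
): Treatment {
  const fromEnd = total - 1 - pos // 0 = newest tool result
  if (fromEnd < tiers.recent) return 'full' // kept verbatim
  if (fromEnd < tiers.recent + tiers.mid) return 'truncate' // MID_MAX_CHARS cap
  return 'stub' // collapsed to a one-line marker
}
// 30 results on a 100k window → tiers { recent: 5, mid: 10 }:
// positions 0..14 → 'stub', 15..24 → 'truncate', 25..29 → 'full'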

@@ -0,0 +1,44 @@
import { APIError } from '@anthropic-ai/sdk'
import { expect, test } from 'bun:test'
import { getAssistantMessageFromError } from './errors.js'
function getFirstText(message: ReturnType<typeof getAssistantMessageFromError>): string {
const first = message.message.content[0]
if (!first || typeof first !== 'object' || !('text' in first)) {
return ''
}
return typeof first.text === 'string' ? first.text : ''
}
test('maps endpoint_not_found category markers to actionable setup guidance', () => {
const error = APIError.generate(
404,
undefined,
'OpenAI API error 404: Not Found [openai_category=endpoint_not_found] Hint: Confirm OPENAI_BASE_URL includes /v1.',
new Headers(),
)
const message = getAssistantMessageFromError(error, 'qwen2.5-coder:7b')
const text = getFirstText(message)
expect(message.isApiErrorMessage).toBe(true)
expect(text).toContain('Provider endpoint was not found')
expect(text).toContain('OPENAI_BASE_URL')
expect(text).toContain('/v1')
})
test('maps tool_call_incompatible category markers to model/tool guidance', () => {
const error = APIError.generate(
400,
undefined,
'OpenAI API error 400: tool_calls are not supported [openai_category=tool_call_incompatible]',
new Headers(),
)
const message = getAssistantMessageFromError(error, 'qwen2.5-coder:7b')
const text = getFirstText(message)
expect(text).toContain('rejected tool-calling payloads')
expect(text).toContain('/model')
})

View File

@@ -50,9 +50,110 @@ import {
} from '../claudeAiLimits.js'
import { shouldProcessRateLimits } from '../rateLimitMocking.js' // Used for /mock-limits command
import { extractConnectionErrorDetails, formatAPIError } from './errorUtils.js'
import {
extractOpenAICategoryMarker,
type OpenAICompatibilityFailureCategory,
} from './openaiErrorClassification.js'
export const API_ERROR_MESSAGE_PREFIX = 'API Error'
function stripOpenAICompatibilityMetadata(message: string): string {
return message
.replace(/\s*\[openai_category=[a-z_]+\]\s*/g, ' ')
.replace(/\s{2,}/g, ' ')
.trim()
}
function mapOpenAICompatibilityFailureToAssistantMessage(options: {
category: OpenAICompatibilityFailureCategory
model: string
rawMessage: string
}): AssistantMessage {
const switchCmd = getIsNonInteractiveSession() ? '--model' : '/model'
const compactHint = getIsNonInteractiveSession()
? 'Reduce prompt size or start a new session.'
: 'Run /compact or start a new session with /new.'
switch (options.category) {
case 'localhost_resolution_failed':
case 'connection_refused':
return createAssistantAPIErrorMessage({
content:
'Could not connect to the local OpenAI-compatible provider. Ensure the local server is running, then use OPENAI_BASE_URL=http://127.0.0.1:11434/v1 for Ollama.',
error: 'unknown',
})
case 'endpoint_not_found':
return createAssistantAPIErrorMessage({
content:
'Provider endpoint was not found. Confirm OPENAI_BASE_URL targets an OpenAI-compatible /v1 endpoint (for Ollama: http://127.0.0.1:11434/v1).',
error: 'invalid_request',
})
case 'model_not_found':
return createAssistantAPIErrorMessage({
content: `The selected model (${options.model}) is not available on this provider. Run ${switchCmd} to choose another model, or verify installed local models (for Ollama: ollama list).`,
error: 'invalid_request',
})
case 'auth_invalid':
return createAssistantAPIErrorMessage({
content: `${API_ERROR_MESSAGE_PREFIX}: Authentication failed for your OpenAI-compatible provider. Verify OPENAI_API_KEY and endpoint-specific auth requirements.`,
error: 'authentication_failed',
})
case 'rate_limited':
return createAssistantAPIErrorMessage({
content: `${API_ERROR_MESSAGE_PREFIX}: Provider rate limit reached. Retry in a few seconds.`,
error: 'rate_limit',
})
case 'request_timeout':
return createAssistantAPIErrorMessage({
content: `${API_ERROR_MESSAGE_PREFIX}: Provider request timed out. Local models may be loading or overloaded; retry shortly or increase API_TIMEOUT_MS.`,
error: 'unknown',
})
case 'context_overflow':
return createAssistantAPIErrorMessage({
content: `The conversation exceeded the provider context limit. ${compactHint}`,
error: 'invalid_request',
})
case 'tool_call_incompatible':
return createAssistantAPIErrorMessage({
content: `The selected provider/model rejected tool-calling payloads. Try ${switchCmd} to pick a tool-capable model or continue without tools.`,
error: 'invalid_request',
})
case 'malformed_provider_response':
return createAssistantAPIErrorMessage({
content: `${API_ERROR_MESSAGE_PREFIX}: Provider returned a malformed response. Confirm endpoint compatibility and check local proxy/network middleware.`,
error: 'unknown',
errorDetails: stripOpenAICompatibilityMetadata(options.rawMessage),
})
case 'provider_unavailable':
return createAssistantAPIErrorMessage({
content: `${API_ERROR_MESSAGE_PREFIX}: Provider is temporarily unavailable. Retry in a moment.`,
error: 'unknown',
})
case 'network_error':
case 'unknown':
return createAssistantAPIErrorMessage({
content: `${API_ERROR_MESSAGE_PREFIX}: ${stripOpenAICompatibilityMetadata(options.rawMessage)}`,
error: 'unknown',
})
default:
return createAssistantAPIErrorMessage({
content: `${API_ERROR_MESSAGE_PREFIX}: ${stripOpenAICompatibilityMetadata(options.rawMessage)}`,
error: 'unknown',
})
}
}
export function startsWithApiErrorPrefix(text: string): boolean {
return (
text.startsWith(API_ERROR_MESSAGE_PREFIX) ||
@@ -457,6 +558,19 @@ export function getAssistantMessageFromError(
})
}
// OpenAI-compatible transport and HTTP failures include structured category
// markers from openaiShim.ts for actionable end-user remediation.
if (error instanceof APIError) {
const openaiCategory = extractOpenAICategoryMarker(error.message)
if (openaiCategory) {
return mapOpenAICompatibilityFailureToAssistantMessage({
category: openaiCategory,
model,
rawMessage: error.message,
})
}
}
// Check for emergency capacity off switch for Opus PAYG users
if (
error instanceof Error &&

View File
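
A round-trip sketch of the category-marker contract between the classifier and the mapping above (standalone usage assumed; import path as in the diff):

import {
  buildOpenAICompatibilityErrorMessage,
  extractOpenAICategoryMarker,
} from './openaiErrorClassification.js'

const raw = buildOpenAICompatibilityErrorMessage('OpenAI API error 404: Not Found', {
  category: 'endpoint_not_found',
  hint: 'Confirm OPENAI_BASE_URL includes /v1.',
})
// raw: 'OpenAI API error 404: Not Found [openai_category=endpoint_not_found]
//       Hint: Confirm OPENAI_BASE_URL includes /v1.'
extractOpenAICategoryMarker(raw) // → 'endpoint_not_found'
// getAssistantMessageFromError maps that category to the guidance above, and
// stripOpenAICompatibilityMetadata removes the marker from any displayed text.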

@@ -0,0 +1,86 @@
import { afterEach, beforeEach, expect, test } from 'bun:test'
import { _resetKeepAliveForTesting } from '../../utils/proxy.js'
import {
fetchWithProxyRetry,
isRetryableFetchError,
} from './fetchWithProxyRetry.js'
type FetchType = typeof globalThis.fetch
const originalFetch = globalThis.fetch
const originalEnv = {
HTTP_PROXY: process.env.HTTP_PROXY,
HTTPS_PROXY: process.env.HTTPS_PROXY,
}
function restoreEnv(key: 'HTTP_PROXY' | 'HTTPS_PROXY', value: string | undefined): void {
if (value === undefined) {
delete process.env[key]
} else {
process.env[key] = value
}
}
beforeEach(() => {
process.env.HTTP_PROXY = 'http://127.0.0.1:15236'
delete process.env.HTTPS_PROXY
_resetKeepAliveForTesting()
})
afterEach(() => {
globalThis.fetch = originalFetch
restoreEnv('HTTP_PROXY', originalEnv.HTTP_PROXY)
restoreEnv('HTTPS_PROXY', originalEnv.HTTPS_PROXY)
_resetKeepAliveForTesting()
})
test('isRetryableFetchError matches Bun socket-closed failures', () => {
expect(
isRetryableFetchError(
new Error(
'The socket connection was closed unexpectedly. For more information, pass `verbose: true` in the second argument to fetch()',
),
),
).toBe(true)
})
test('fetchWithProxyRetry retries once with keepalive disabled after socket closure', async () => {
const calls: Array<RequestInit | undefined> = []
globalThis.fetch = (async (_input, init) => {
calls.push(init)
if (calls.length === 1) {
throw new Error(
'The socket connection was closed unexpectedly. For more information, pass `verbose: true` in the second argument to fetch()',
)
}
return new Response('ok')
}) as FetchType
const response = await fetchWithProxyRetry('https://example.com/search', {
method: 'POST',
})
expect(await response.text()).toBe('ok')
expect(calls).toHaveLength(2)
expect((calls[0] as RequestInit & { proxy?: string }).proxy).toBe(
'http://127.0.0.1:15236',
)
expect((calls[0] as RequestInit).keepalive).toBeUndefined()
expect((calls[1] as RequestInit).keepalive).toBe(false)
})
test('fetchWithProxyRetry does not retry non-network errors', async () => {
let attempts = 0
globalThis.fetch = (async () => {
attempts += 1
throw new Error('400 bad request')
}) as FetchType
await expect(fetchWithProxyRetry('https://example.com')).rejects.toThrow(
'400 bad request',
)
expect(attempts).toBe(1)
})

View File

@@ -0,0 +1,44 @@
import { disableKeepAlive, getProxyFetchOptions } from '../../utils/proxy.js'
const RETRYABLE_FETCH_ERROR_PATTERN =
/socket connection was closed unexpectedly|ECONNRESET|EPIPE|socket hang up|Connection reset by peer|fetch failed/i
export function isRetryableFetchError(error: unknown): boolean {
if (!(error instanceof Error)) {
return false
}
if (error.name === 'AbortError') {
return false
}
return RETRYABLE_FETCH_ERROR_PATTERN.test(error.message)
}
export async function fetchWithProxyRetry(
input: string | URL | Request,
init?: RequestInit,
options?: { forAnthropicAPI?: boolean; maxAttempts?: number },
): Promise<Response> {
const maxAttempts = Math.max(1, options?.maxAttempts ?? 2)
let lastError: unknown
for (let attempt = 1; attempt <= maxAttempts; attempt++) {
try {
return await fetch(input, {
...init,
...getProxyFetchOptions({
forAnthropicAPI: options?.forAnthropicAPI,
}),
})
} catch (error) {
lastError = error
if (attempt >= maxAttempts || !isRetryableFetchError(error)) {
throw error
}
disableKeepAlive()
}
}
throw lastError instanceof Error
? lastError
: new Error('Fetch failed without an error object')
}

View File
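
A usage sketch for the helper (the call site and endpoint are hypothetical): one transparent retry on transient socket closures, with keepalive disabled on the second attempt; non-network errors propagate immediately.

import { fetchWithProxyRetry } from './fetchWithProxyRetry.js'

async function searchWeb(query: string): Promise<string> {
  const response = await fetchWithProxyRetry(
    'https://example.com/search', // placeholder endpoint
    { method: 'POST', body: JSON.stringify({ query }) },
    { maxAttempts: 2 }, // the default; the retry runs after disableKeepAlive()
  )
  return response.text()
}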

@@ -0,0 +1,97 @@
import { expect, test } from 'bun:test'
import {
buildOpenAICompatibilityErrorMessage,
classifyOpenAIHttpFailure,
classifyOpenAINetworkFailure,
extractOpenAICategoryMarker,
formatOpenAICategoryMarker,
} from './openaiErrorClassification.js'
test('classifies localhost ECONNREFUSED as connection_refused', () => {
const error = Object.assign(new TypeError('fetch failed'), {
code: 'ECONNREFUSED',
})
const failure = classifyOpenAINetworkFailure(error, {
url: 'http://localhost:11434/v1/chat/completions',
})
expect(failure.category).toBe('connection_refused')
expect(failure.retryable).toBe(true)
expect(failure.code).toBe('ECONNREFUSED')
expect(failure.hint).toContain('local server is running')
})
test('classifies localhost ENOTFOUND as localhost_resolution_failed', () => {
const error = Object.assign(new TypeError('getaddrinfo ENOTFOUND localhost'), {
code: 'ENOTFOUND',
})
const failure = classifyOpenAINetworkFailure(error, {
url: 'http://localhost:11434/v1/chat/completions',
})
expect(failure.category).toBe('localhost_resolution_failed')
expect(failure.retryable).toBe(true)
expect(failure.code).toBe('ENOTFOUND')
expect(failure.hint).toContain('127.0.0.1')
})
test('classifies model-not-found 404 responses', () => {
const failure = classifyOpenAIHttpFailure({
status: 404,
body: 'The model qwen2.5-coder:7b was not found',
})
expect(failure.category).toBe('model_not_found')
expect(failure.retryable).toBe(false)
})
test('classifies generic 404 responses as endpoint_not_found', () => {
const failure = classifyOpenAIHttpFailure({
status: 404,
body: 'Not Found',
})
expect(failure.category).toBe('endpoint_not_found')
expect(failure.hint).toContain('/v1')
})
test('classifies context-overflow responses', () => {
const failure = classifyOpenAIHttpFailure({
status: 500,
body: 'request too large: maximum context length exceeded',
})
expect(failure.category).toBe('context_overflow')
expect(failure.retryable).toBe(false)
})
test('classifies tool compatibility failures', () => {
const failure = classifyOpenAIHttpFailure({
status: 400,
body: 'tool_calls are not supported by this model',
})
expect(failure.category).toBe('tool_call_incompatible')
})
test('embeds and extracts category markers in formatted messages', () => {
const marker = formatOpenAICategoryMarker('endpoint_not_found')
expect(marker).toBe('[openai_category=endpoint_not_found]')
const formatted = buildOpenAICompatibilityErrorMessage('OpenAI API error 404: Not Found', {
category: 'endpoint_not_found',
hint: 'Confirm OPENAI_BASE_URL includes /v1.',
})
expect(formatted).toContain('[openai_category=endpoint_not_found]')
expect(formatted).toContain('Hint: Confirm OPENAI_BASE_URL includes /v1.')
expect(extractOpenAICategoryMarker(formatted)).toBe('endpoint_not_found')
})
test('ignores unknown category markers during extraction', () => {
const malformed = 'OpenAI API error 500 [openai_category=totally_fake_category]'
expect(extractOpenAICategoryMarker(malformed)).toBeUndefined()
})

View File

@@ -0,0 +1,352 @@
export type OpenAICompatibilityFailureCategory =
| 'connection_refused'
| 'localhost_resolution_failed'
| 'request_timeout'
| 'network_error'
| 'auth_invalid'
| 'rate_limited'
| 'model_not_found'
| 'endpoint_not_found'
| 'context_overflow'
| 'tool_call_incompatible'
| 'malformed_provider_response'
| 'provider_unavailable'
| 'unknown'
export type OpenAICompatibilityFailure = {
source: 'network' | 'http'
category: OpenAICompatibilityFailureCategory
retryable: boolean
message: string
hint?: string
code?: string
status?: number
}
const OPENAI_CATEGORY_MARKER_PREFIX = '[openai_category='
const LOCALHOST_HOSTNAMES = new Set(['localhost', '127.0.0.1', '::1'])
const OPENAI_COMPATIBILITY_FAILURE_CATEGORIES: ReadonlySet<OpenAICompatibilityFailureCategory> =
new Set<OpenAICompatibilityFailureCategory>([
'connection_refused',
'localhost_resolution_failed',
'request_timeout',
'network_error',
'auth_invalid',
'rate_limited',
'model_not_found',
'endpoint_not_found',
'context_overflow',
'tool_call_incompatible',
'malformed_provider_response',
'provider_unavailable',
'unknown',
])
function isOpenAICompatibilityFailureCategory(
value: string,
): value is OpenAICompatibilityFailureCategory {
return OPENAI_COMPATIBILITY_FAILURE_CATEGORIES.has(
value as OpenAICompatibilityFailureCategory,
)
}
function getErrorCode(error: unknown): string | undefined {
let current: unknown = error
const maxDepth = 5
for (let depth = 0; depth < maxDepth; depth++) {
if (
current &&
typeof current === 'object' &&
'code' in current &&
typeof (current as { code?: unknown }).code === 'string'
) {
return (current as { code: string }).code
}
if (
current &&
typeof current === 'object' &&
'cause' in current &&
(current as { cause?: unknown }).cause !== current
) {
current = (current as { cause?: unknown }).cause
continue
}
break
}
return undefined
}
function getHostname(url: string): string | null {
try {
return new URL(url).hostname.toLowerCase()
} catch {
return null
}
}
function isLocalhostLikeHostname(hostname: string | null): boolean {
if (!hostname) return false
if (LOCALHOST_HOSTNAMES.has(hostname)) return true
return /^127\./.test(hostname)
}
function isContextOverflowMessage(body: string): boolean {
const lower = body.toLowerCase()
return (
lower.includes('too many tokens') ||
lower.includes('request too large') ||
lower.includes('context length') ||
lower.includes('maximum context') ||
lower.includes('input length') ||
lower.includes('payload too large') ||
lower.includes('prompt is too long')
)
}
function isToolCompatibilityMessage(body: string): boolean {
const lower = body.toLowerCase()
return (
lower.includes('tool_calls') ||
lower.includes('tool_call') ||
lower.includes('tool_use') ||
lower.includes('tool_result') ||
lower.includes('function calling') ||
lower.includes('function call')
)
}
function isMalformedProviderResponse(body: string): boolean {
const lower = body.toLowerCase()
return (
lower.includes('<!doctype html') ||
lower.includes('<html') ||
lower.includes('invalid json') ||
lower.includes('malformed') ||
lower.includes('unexpected token') ||
lower.includes('cannot parse') ||
lower.includes('not valid json')
)
}
function isModelNotFoundMessage(body: string): boolean {
const lower = body.toLowerCase()
return (
lower.includes('model') &&
(
lower.includes('not found') ||
lower.includes('does not exist') ||
lower.includes('unknown model') ||
lower.includes('unavailable model')
)
)
}
export function formatOpenAICategoryMarker(
category: OpenAICompatibilityFailureCategory,
): string {
return `${OPENAI_CATEGORY_MARKER_PREFIX}${category}]`
}
export function extractOpenAICategoryMarker(
message: string,
): OpenAICompatibilityFailureCategory | undefined {
const match = message.match(/\[openai_category=([a-z_]+)]/)
const category = match?.[1]
if (!category || !isOpenAICompatibilityFailureCategory(category)) {
return undefined
}
return category
}
export function buildOpenAICompatibilityErrorMessage(
baseMessage: string,
failure: Pick<OpenAICompatibilityFailure, 'category' | 'hint'>,
): string {
const marker = formatOpenAICategoryMarker(failure.category)
const hint = failure.hint ? ` Hint: ${failure.hint}` : ''
return `${baseMessage} ${marker}${hint}`
}
export function classifyOpenAINetworkFailure(
error: unknown,
options: { url: string },
): OpenAICompatibilityFailure {
const message = error instanceof Error ? error.message : String(error)
const lowerMessage = message.toLowerCase()
const code = getErrorCode(error)
const hostname = getHostname(options.url)
const isLocalHost = isLocalhostLikeHostname(hostname)
if (
code === 'ETIMEDOUT' ||
code === 'UND_ERR_CONNECT_TIMEOUT' ||
lowerMessage.includes('timeout') ||
lowerMessage.includes('timed out') ||
lowerMessage.includes('aborterror')
) {
return {
source: 'network',
category: 'request_timeout',
retryable: true,
message,
code,
hint: 'The provider took too long to respond. Check local model load time or increase API timeout.',
}
}
if (
isLocalHost &&
(
code === 'ENOTFOUND' ||
code === 'EAI_AGAIN' ||
lowerMessage.includes('getaddrinfo') ||
(code === undefined && lowerMessage.includes('fetch failed'))
)
) {
return {
source: 'network',
category: 'localhost_resolution_failed',
retryable: true,
message,
code,
hint: 'Localhost failed for this request. Retry with 127.0.0.1 and confirm Ollama is serving on the configured port.',
}
}
if (code === 'ECONNREFUSED') {
return {
source: 'network',
category: 'connection_refused',
retryable: true,
message,
code,
hint: isLocalHost
? 'Connection to the local provider was refused. Ensure the local server is running and listening on the configured port.'
: 'Connection was refused by the provider endpoint. Ensure the server is running and the port is correct.',
}
}
return {
source: 'network',
category: 'network_error',
retryable: true,
message,
code,
hint: 'Network transport failed before a provider response was received.',
}
}
export function classifyOpenAIHttpFailure(options: {
status: number
body: string
}): OpenAICompatibilityFailure {
const body = options.body ?? ''
if (options.status === 401 || options.status === 403) {
return {
source: 'http',
category: 'auth_invalid',
retryable: false,
status: options.status,
message: body,
hint: 'Authentication failed. Verify API key, token source, and endpoint-specific auth headers.',
}
}
if (options.status === 429) {
return {
source: 'http',
category: 'rate_limited',
retryable: true,
status: options.status,
message: body,
hint: 'Provider rate-limited the request. Retry after backoff.',
}
}
if (options.status === 404 && isModelNotFoundMessage(body)) {
return {
source: 'http',
category: 'model_not_found',
retryable: false,
status: options.status,
message: body,
hint: 'The selected model is not installed or not available on this endpoint.',
}
}
if (options.status === 404) {
return {
source: 'http',
category: 'endpoint_not_found',
retryable: false,
status: options.status,
message: body,
hint: 'Endpoint was not found. Confirm OPENAI_BASE_URL includes /v1 for OpenAI-compatible local providers.',
}
}
if (
options.status === 413 ||
((options.status === 400 || options.status >= 500) &&
isContextOverflowMessage(body))
) {
return {
source: 'http',
category: 'context_overflow',
retryable: false,
status: options.status,
message: body,
hint: 'Prompt context exceeded model/server limits. Reduce context or increase provider context length.',
}
}
if (options.status === 400 && isToolCompatibilityMessage(body)) {
return {
source: 'http',
category: 'tool_call_incompatible',
retryable: false,
status: options.status,
message: body,
hint: 'Provider/model rejected tool-calling payload. Retry without tools or use a tool-capable model.',
}
}
if (options.status >= 400 && isMalformedProviderResponse(body)) {
return {
source: 'http',
category: 'malformed_provider_response',
retryable: false,
status: options.status,
message: body,
hint: 'Provider returned malformed or non-JSON response where JSON was expected.',
}
}
if (options.status >= 500) {
return {
source: 'http',
category: 'provider_unavailable',
retryable: true,
status: options.status,
message: body,
hint: 'Provider reported a server-side failure. Retry after a short delay.',
}
}
return {
source: 'http',
category: 'unknown',
retryable: false,
status: options.status,
message: body,
}
}

View File
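
A short sketch of how the shim side is expected to classify a transport failure before formatting it (the caller shape is assumed; only exports from this file are used):

import { classifyOpenAINetworkFailure } from './openaiErrorClassification.js'

const err = Object.assign(new TypeError('fetch failed'), { code: 'ECONNREFUSED' })
const failure = classifyOpenAINetworkFailure(err, {
  url: 'http://localhost:11434/v1/chat/completions',
})
// failure.category === 'connection_refused', failure.retryable === true,
// failure.code === 'ECONNREFUSED'; failure.hint carries the localhost wording.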

@@ -0,0 +1,317 @@
import { afterEach, beforeEach, expect, mock, test } from 'bun:test'
import { createOpenAIShimClient } from './openaiShim.js'
type FetchType = typeof globalThis.fetch
const originalFetch = globalThis.fetch
const originalEnv = {
OPENAI_BASE_URL: process.env.OPENAI_BASE_URL,
OPENAI_API_KEY: process.env.OPENAI_API_KEY,
OPENAI_MODEL: process.env.OPENAI_MODEL,
}
// Mock config + autoCompact so the shim sees deterministic state.
const mockState = {
enabled: true,
effectiveWindow: 100_000, // Copilot gpt-4o tier
}
mock.module('../../utils/config.js', () => ({
getGlobalConfig: () => ({
toolHistoryCompressionEnabled: mockState.enabled,
autoCompactEnabled: false,
}),
}))
mock.module('../compact/autoCompact.js', () => ({
getEffectiveContextWindowSize: () => mockState.effectiveWindow,
}))
type OpenAIShimClient = {
beta: {
messages: {
create: (
params: Record<string, unknown>,
options?: Record<string, unknown>,
) => Promise<unknown>
}
}
}
function bigText(n: number): string {
return 'A'.repeat(n)
}
function buildToolExchange(id: number, resultLength: number) {
return [
{
role: 'assistant',
content: [
{
type: 'tool_use',
id: `toolu_${id}`,
name: 'Read',
input: { file_path: `/path/to/file${id}.ts` },
},
],
},
{
role: 'user',
content: [
{
type: 'tool_result',
tool_use_id: `toolu_${id}`,
content: bigText(resultLength),
},
],
},
]
}
function buildLongConversation(numExchanges: number, resultLength = 5_000) {
const out: Array<{ role: string; content: unknown }> = [
{ role: 'user', content: 'start the work' },
]
for (let i = 0; i < numExchanges; i++) {
out.push(...buildToolExchange(i, resultLength))
}
return out
}
function makeFakeResponse(): Response {
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'gpt-4o',
choices: [
{
message: { role: 'assistant', content: 'done' },
finish_reason: 'stop',
},
],
usage: { prompt_tokens: 8, completion_tokens: 2, total_tokens: 10 },
}),
{ headers: { 'Content-Type': 'application/json' } },
)
}
beforeEach(() => {
process.env.OPENAI_BASE_URL = 'http://example.test/v1'
process.env.OPENAI_API_KEY = 'test-key'
delete process.env.OPENAI_MODEL
mockState.enabled = true
mockState.effectiveWindow = 100_000
})
afterEach(() => {
if (originalEnv.OPENAI_BASE_URL === undefined) delete process.env.OPENAI_BASE_URL
else process.env.OPENAI_BASE_URL = originalEnv.OPENAI_BASE_URL
if (originalEnv.OPENAI_API_KEY === undefined) delete process.env.OPENAI_API_KEY
else process.env.OPENAI_API_KEY = originalEnv.OPENAI_API_KEY
if (originalEnv.OPENAI_MODEL === undefined) delete process.env.OPENAI_MODEL
else process.env.OPENAI_MODEL = originalEnv.OPENAI_MODEL
globalThis.fetch = originalFetch
})
async function captureRequestBody(
messages: Array<{ role: string; content: unknown }>,
model: string,
): Promise<Record<string, unknown>> {
let captured: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
captured = JSON.parse(String(init?.body))
return makeFakeResponse()
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model,
system: 'system prompt',
messages,
})
if (!captured) throw new Error('request not captured')
return captured
}
function getToolMessages(body: Record<string, unknown>): Array<{ content: string }> {
const messages = body.messages as Array<{ role: string; content: string }>
return messages.filter(m => m.role === 'tool')
}
function getAssistantToolCalls(body: Record<string, unknown>): unknown[] {
const messages = body.messages as Array<{
role: string
tool_calls?: unknown[]
}>
return messages
.filter(m => m.role === 'assistant' && Array.isArray(m.tool_calls))
.flatMap(m => m.tool_calls ?? [])
}
// ============================================================================
// BUG REPRO: without compression, full tool history is resent every turn
// ============================================================================
test('BUG REPRO: without compression, all 30 tool results are sent at full size', async () => {
mockState.enabled = false
const messages = buildLongConversation(30, 5_000)
const body = await captureRequestBody(messages, 'gpt-4o')
const toolMessages = getToolMessages(body)
const payloadSize = JSON.stringify(body).length
// All 30 tool results present, none truncated
expect(toolMessages.length).toBe(30)
for (const m of toolMessages) {
expect(m.content.length).toBeGreaterThanOrEqual(5_000)
expect(m.content).not.toContain('[…truncated')
expect(m.content).not.toContain('chars omitted')
}
// Total payload is large (~150KB raw) — this is the cost being paid every turn
expect(payloadSize).toBeGreaterThan(150_000)
})
// ============================================================================
// FIX: with compression, recent kept full, mid truncated, old stubbed
// ============================================================================
test('FIX: with compression on Copilot gpt-4o (tier 5/10/rest), 30 turns shrinks dramatically', async () => {
mockState.enabled = true
mockState.effectiveWindow = 100_000 // 64k-128k tier → recent=5, mid=10
const messages = buildLongConversation(30, 5_000)
const body = await captureRequestBody(messages, 'gpt-4o')
const toolMessages = getToolMessages(body)
const payloadSize = JSON.stringify(body).length
// Structure preserved: still 30 tool messages, no orphan tool_calls
expect(toolMessages.length).toBe(30)
expect(getAssistantToolCalls(body).length).toBe(30)
// Tier breakdown (oldest → newest):
// indices 0..14 → old tier (stubs)
// indices 15..24 → mid tier (truncated)
// indices 25..29 → recent (full)
for (let i = 0; i <= 14; i++) {
expect(toolMessages[i].content).toMatch(/^\[Read args=.*chars omitted\]$/)
}
for (let i = 15; i <= 24; i++) {
expect(toolMessages[i].content).toContain('[…truncated')
}
for (let i = 25; i <= 29; i++) {
expect(toolMessages[i].content.length).toBe(5_000)
expect(toolMessages[i].content).not.toContain('[…truncated')
expect(toolMessages[i].content).not.toContain('chars omitted')
}
// Significant reduction: from ~150KB to <60KB (10 mid×2KB + structure overhead)
expect(payloadSize).toBeLessThan(60_000)
})
// ============================================================================
// FIX: large-context model gets generous tiers — compression effectively inert
// ============================================================================
test('FIX: gpt-4.1 (1M context) with 25 exchanges keeps all full (recent tier=25)', async () => {
mockState.enabled = true
mockState.effectiveWindow = 1_000_000 // ≥500k → recent=25, mid=50
const messages = buildLongConversation(25, 5_000)
const body = await captureRequestBody(messages, 'gpt-4.1')
const toolMessages = getToolMessages(body)
expect(toolMessages.length).toBe(25)
for (const m of toolMessages) {
expect(m.content.length).toBe(5_000)
expect(m.content).not.toContain('[…truncated')
expect(m.content).not.toContain('chars omitted')
}
})
test('FIX: gpt-4.1 (1M context) with 30 exchanges → only first 5 mid-truncated', async () => {
mockState.enabled = true
mockState.effectiveWindow = 1_000_000 // recent=25, mid=50
const messages = buildLongConversation(30, 5_000)
const body = await captureRequestBody(messages, 'gpt-4.1')
const toolMessages = getToolMessages(body)
// 30 total: indices 0..4 mid, indices 5..29 recent
for (let i = 0; i < 5; i++) {
expect(toolMessages[i].content).toContain('[…truncated')
}
for (let i = 5; i < 30; i++) {
expect(toolMessages[i].content.length).toBe(5_000)
}
})
// ============================================================================
// FIX: stub preserves tool name and args — model can re-invoke if needed
// ============================================================================
test('FIX: stub format includes original tool name and arguments', async () => {
mockState.enabled = true
mockState.effectiveWindow = 100_000
const messages = buildLongConversation(30, 5_000)
const body = await captureRequestBody(messages, 'gpt-4o')
const toolMessages = getToolMessages(body)
const oldestStub = toolMessages[0].content
// Format: [<tool_name> args=<json> → <N> chars omitted]
expect(oldestStub).toMatch(/^\[Read /)
expect(oldestStub).toMatch(/file_path/)
expect(oldestStub).toMatch(/→ 5000 chars omitted\]$/)
})
// ============================================================================
// FIX: tool_use blocks (assistant tool_calls) are never modified
// ============================================================================
test('FIX: every tool_call retains its full id, name, and arguments', async () => {
mockState.enabled = true
mockState.effectiveWindow = 100_000
const messages = buildLongConversation(30, 5_000)
const body = await captureRequestBody(messages, 'gpt-4o')
const toolCalls = getAssistantToolCalls(body) as Array<{
id: string
function: { name: string; arguments: string }
}>
expect(toolCalls.length).toBe(30)
for (let i = 0; i < toolCalls.length; i++) {
expect(toolCalls[i].id).toBe(`toolu_${i}`)
expect(toolCalls[i].function.name).toBe('Read')
expect(JSON.parse(toolCalls[i].function.arguments)).toEqual({
file_path: `/path/to/file${i}.ts`,
})
}
})
// ============================================================================
// FIX: small-context provider (Mistral 32k) gets aggressive compression
// ============================================================================
test('FIX: 32k window (Mistral tier) → recent=3 keeps last 3 only', async () => {
mockState.enabled = true
mockState.effectiveWindow = 24_000 // 16k-32k tier → recent=3, mid=5
const messages = buildLongConversation(15, 3_000)
const body = await captureRequestBody(messages, 'mistral-large-latest')
const toolMessages = getToolMessages(body)
// 15 total: indices 0..6 old, 7..11 mid, 12..14 recent
for (let i = 0; i <= 6; i++) {
expect(toolMessages[i].content).toContain('chars omitted')
}
for (let i = 7; i <= 11; i++) {
expect(toolMessages[i].content).toContain('[…truncated')
}
for (let i = 12; i <= 14; i++) {
expect(toolMessages[i].content.length).toBe(3_000)
}
})

View File
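
A quick back-of-envelope confirms the size assertions above (a rough sketch; it ignores JSON framing and the assistant tool_call messages, which add a few more KB):

const raw = 30 * 5_000 // 150_000 chars of tool output, uncompressed
const compressed =
  5 * 5_000 +          // recent tier: kept verbatim
  10 * (2_000 + 42) +  // mid tier: capped text plus the "[…truncated]" marker
  15 * 66              // old tier: one-line stubs, ~66 chars each
console.log({ raw, compressed }) // { raw: 150000, compressed: 46410 }
// 46_410 sits comfortably under the 60_000 bound asserted in the test.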

@@ -0,0 +1,286 @@
import { afterEach, expect, mock, test } from 'bun:test'
const originalFetch = globalThis.fetch
const originalEnv = {
OPENAI_BASE_URL: process.env.OPENAI_BASE_URL,
OPENAI_API_KEY: process.env.OPENAI_API_KEY,
OPENAI_MODEL: process.env.OPENAI_MODEL,
}
function restoreEnv(key: string, value: string | undefined): void {
if (value === undefined) {
delete process.env[key]
} else {
process.env[key] = value
}
}
afterEach(() => {
globalThis.fetch = originalFetch
restoreEnv('OPENAI_BASE_URL', originalEnv.OPENAI_BASE_URL)
restoreEnv('OPENAI_API_KEY', originalEnv.OPENAI_API_KEY)
restoreEnv('OPENAI_MODEL', originalEnv.OPENAI_MODEL)
mock.restore()
})
test('logs classified transport diagnostics with category and code', async () => {
const debugSpy = mock(() => {})
mock.module('../../utils/debug.js', () => ({
logForDebugging: debugSpy,
}))
const nonce = `${Date.now()}-${Math.random()}`
const { createOpenAIShimClient } = await import(`./openaiShim.ts?ts=${nonce}`)
process.env.OPENAI_BASE_URL = 'http://localhost:11434/v1'
process.env.OPENAI_API_KEY = 'ollama'
const transportError = Object.assign(new TypeError('fetch failed'), {
code: 'ECONNREFUSED',
})
globalThis.fetch = mock(async () => {
throw transportError
}) as typeof globalThis.fetch
const client = createOpenAIShimClient({}) as {
beta: {
messages: {
create: (params: Record<string, unknown>) => Promise<unknown>
}
}
}
await expect(
client.beta.messages.create({
model: 'qwen2.5-coder:7b',
messages: [{ role: 'user', content: 'hello' }],
max_tokens: 64,
stream: false,
}),
).rejects.toThrow('openai_category=connection_refused')
const transportLog = debugSpy.mock.calls.find(call =>
typeof call?.[0] === 'string' && call[0].includes('transport failure'),
)
expect(transportLog).toBeDefined()
expect(String(transportLog?.[0])).toContain('category=connection_refused')
expect(String(transportLog?.[0])).toContain('code=ECONNREFUSED')
expect(transportLog?.[1]).toEqual({ level: 'warn' })
})
test('redacts credentials in transport diagnostic URL logs', async () => {
const debugSpy = mock(() => {})
mock.module('../../utils/debug.js', () => ({
logForDebugging: debugSpy,
}))
const nonce = `${Date.now()}-${Math.random()}`
const { createOpenAIShimClient } = await import(`./openaiShim.ts?ts=${nonce}`)
process.env.OPENAI_BASE_URL = 'http://user:supersecret@localhost:11434/v1'
process.env.OPENAI_API_KEY = 'supersecret'
const transportError = Object.assign(new TypeError('fetch failed'), {
code: 'ECONNREFUSED',
})
globalThis.fetch = mock(async () => {
throw transportError
}) as typeof globalThis.fetch
const client = createOpenAIShimClient({}) as {
beta: {
messages: {
create: (params: Record<string, unknown>) => Promise<unknown>
}
}
}
await expect(
client.beta.messages.create({
model: 'qwen2.5-coder:7b',
messages: [{ role: 'user', content: 'hello' }],
max_tokens: 64,
stream: false,
}),
).rejects.toThrow('openai_category=connection_refused')
const transportLog = debugSpy.mock.calls.find(call =>
typeof call?.[0] === 'string' && call[0].includes('transport failure'),
)
expect(transportLog).toBeDefined()
const logLine = String(transportLog?.[0])
expect(logLine).toContain('url=http://redacted:redacted@localhost:11434/v1/chat/completions')
expect(logLine).not.toContain('user:supersecret')
expect(logLine).not.toContain('supersecret@')
})
test('logs self-heal localhost fallback with redacted from/to URLs', async () => {
const debugSpy = mock(() => {})
mock.module('../../utils/debug.js', () => ({
logForDebugging: debugSpy,
}))
const nonce = `${Date.now()}-${Math.random()}`
const { createOpenAIShimClient } = await import(`./openaiShim.ts?ts=${nonce}`)
process.env.OPENAI_BASE_URL = 'http://user:supersecret@localhost:11434/v1'
process.env.OPENAI_API_KEY = 'supersecret'
globalThis.fetch = mock(async (input: string | Request) => {
const url = typeof input === 'string' ? input : input.url
if (url.includes('localhost')) {
throw Object.assign(new TypeError('fetch failed'), {
code: 'ENOTFOUND',
})
}
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'qwen2.5-coder:7b',
choices: [
{
message: {
role: 'assistant',
content: 'ok',
},
finish_reason: 'stop',
},
],
usage: {
prompt_tokens: 5,
completion_tokens: 2,
total_tokens: 7,
},
}),
{
status: 200,
headers: {
'Content-Type': 'application/json',
},
},
)
}) as typeof globalThis.fetch
const client = createOpenAIShimClient({}) as {
beta: {
messages: {
create: (params: Record<string, unknown>) => Promise<unknown>
}
}
}
await expect(
client.beta.messages.create({
model: 'qwen2.5-coder:7b',
messages: [{ role: 'user', content: 'hello' }],
max_tokens: 64,
stream: false,
}),
).resolves.toBeDefined()
const fallbackLog = debugSpy.mock.calls.find(call =>
typeof call?.[0] === 'string' &&
call[0].includes('self-heal retry reason=localhost_resolution_failed'),
)
expect(fallbackLog).toBeDefined()
const logLine = String(fallbackLog?.[0])
expect(logLine).toContain('from=http://redacted:redacted@localhost:11434/v1/chat/completions')
expect(logLine).toContain('to=http://redacted:redacted@127.0.0.1:11434/v1/chat/completions')
expect(logLine).not.toContain('supersecret')
})
test('logs self-heal toolless retry for local tool-call incompatibility', async () => {
const debugSpy = mock(() => {})
mock.module('../../utils/debug.js', () => ({
logForDebugging: debugSpy,
}))
const nonce = `${Date.now()}-${Math.random()}`
const { createOpenAIShimClient } = await import(`./openaiShim.ts?ts=${nonce}`)
process.env.OPENAI_BASE_URL = 'http://localhost:11434/v1'
process.env.OPENAI_API_KEY = 'ollama'
let callCount = 0
globalThis.fetch = mock(async () => {
callCount += 1
if (callCount === 1) {
return new Response('tool_calls are not supported', {
status: 400,
headers: {
'Content-Type': 'text/plain',
},
})
}
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'qwen2.5-coder:7b',
choices: [
{
message: {
role: 'assistant',
content: 'ok',
},
finish_reason: 'stop',
},
],
usage: {
prompt_tokens: 7,
completion_tokens: 3,
total_tokens: 10,
},
}),
{
status: 200,
headers: {
'Content-Type': 'application/json',
},
},
)
}) as typeof globalThis.fetch
const client = createOpenAIShimClient({}) as {
beta: {
messages: {
create: (params: Record<string, unknown>) => Promise<unknown>
}
}
}
await expect(
client.beta.messages.create({
model: 'qwen2.5-coder:7b',
messages: [{ role: 'user', content: 'hello' }],
tools: [
{
name: 'Read',
description: 'Read file',
input_schema: {
type: 'object',
properties: {
filePath: { type: 'string' },
},
required: ['filePath'],
},
},
],
max_tokens: 64,
stream: false,
}),
).resolves.toBeDefined()
const fallbackLog = debugSpy.mock.calls.find(call =>
typeof call?.[0] === 'string' &&
call[0].includes('self-heal retry reason=tool_call_incompatible mode=toolless'),
)
expect(fallbackLog).toBeDefined()
expect(fallbackLog?.[1]).toEqual({ level: 'warn' })
})

File diff suppressed because it is too large

View File
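
The redaction behavior those tests pin down can be reproduced with the WHATWG URL API alone. A minimal sketch (the helper name is hypothetical; the real redactUrlForDiagnostics in the diff below additionally scrubs sensitive query params and env-derived secrets):

function redactBasicAuth(url: string): string {
  try {
    const parsed = new URL(url)
    if (parsed.username) parsed.username = 'redacted'
    if (parsed.password) parsed.password = 'redacted'
    return parsed.toString()
  } catch {
    return url // leave unparseable inputs untouched
  }
}

redactBasicAuth('http://user:supersecret@localhost:11434/v1/chat/completions')
// → 'http://redacted:redacted@localhost:11434/v1/chat/completions'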

@@ -32,10 +32,9 @@ import { resolveGeminiCredential } from '../../utils/geminiAuth.js'
import { hydrateGeminiAccessTokenFromSecureStorage } from '../../utils/geminiCredentials.js'
import { hydrateGithubModelsTokenFromSecureStorage } from '../../utils/githubModelsCredentials.js'
import {
looksLikeLeakedReasoningPrefix,
shouldBufferPotentialReasoningPrefix,
stripLeakedReasoningPreamble,
} from './reasoningLeakSanitizer.js'
createThinkTagFilter,
stripThinkTags,
} from './thinkTagSanitizer.js'
import {
codexStreamToAnthropic,
collectCodexCompletedResponse,
@@ -47,18 +46,29 @@ import {
type AnthropicUsage,
type ShimCreateParams,
} from './codexShim.js'
import { compressToolHistory } from './compressToolHistory.js'
import { fetchWithProxyRetry } from './fetchWithProxyRetry.js'
import {
getLocalProviderRetryBaseUrls,
getGithubEndpointType,
isLocalProviderUrl,
resolveRuntimeCodexCredentials,
resolveProviderRequest,
getGithubEndpointType,
shouldAttemptLocalToollessRetry,
} from './providerConfig.js'
import {
buildOpenAICompatibilityErrorMessage,
classifyOpenAIHttpFailure,
classifyOpenAINetworkFailure,
} from './openaiErrorClassification.js'
import { sanitizeSchemaForOpenAICompat } from '../../utils/schemaSanitizer.js'
import { redactSecretValueForDisplay } from '../../utils/providerProfile.js'
import {
normalizeToolArguments,
hasToolFieldMapping,
} from './toolArgumentNormalization.js'
import { logApiCallStart, logApiCallEnd } from '../../utils/requestLogging.js'
import { createStreamState, processStreamChunk, getStreamStats } from '../../utils/streamingOptimizer.js'
type SecretValueSource = Partial<{
OPENAI_API_KEY: string
@@ -74,6 +84,10 @@ const GITHUB_429_MAX_RETRIES = 3
const GITHUB_429_BASE_DELAY_SEC = 1
const GITHUB_429_MAX_DELAY_SEC = 32
const GEMINI_API_HOST = 'generativelanguage.googleapis.com'
const MOONSHOT_API_HOSTS = new Set([
'api.moonshot.ai',
'api.moonshot.cn',
])
const COPILOT_HEADERS: Record<string, string> = {
'User-Agent': 'GitHubCopilotChat/0.26.7',
@@ -82,6 +96,19 @@ const COPILOT_HEADERS: Record<string, string> = {
'Copilot-Integration-Id': 'vscode-chat',
}
const SENSITIVE_URL_QUERY_PARAM_NAMES = [
'api_key',
'key',
'token',
'access_token',
'refresh_token',
'signature',
'sig',
'secret',
'password',
'authorization',
]
function isGithubModelsMode(): boolean {
return isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB)
}
@@ -126,11 +153,48 @@ function hasGeminiApiHost(baseUrl: string | undefined): boolean {
}
}
function isMoonshotBaseUrl(baseUrl: string | undefined): boolean {
if (!baseUrl) return false
try {
return MOONSHOT_API_HOSTS.has(new URL(baseUrl).hostname.toLowerCase())
} catch {
return false
}
}
function formatRetryAfterHint(response: Response): string {
const ra = response.headers.get('retry-after')
return ra ? ` (Retry-After: ${ra})` : ''
}
function shouldRedactUrlQueryParam(name: string): boolean {
const lower = name.toLowerCase()
return SENSITIVE_URL_QUERY_PARAM_NAMES.some(token => lower.includes(token))
}
function redactUrlForDiagnostics(url: string): string {
try {
const parsed = new URL(url)
if (parsed.username) {
parsed.username = 'redacted'
}
if (parsed.password) {
parsed.password = 'redacted'
}
for (const key of parsed.searchParams.keys()) {
if (shouldRedactUrlQueryParam(key)) {
parsed.searchParams.set(key, 'redacted')
}
}
const serialized = parsed.toString()
return redactSecretValueForDisplay(serialized, process.env as SecretValueSource) ?? serialized
} catch {
return redactSecretValueForDisplay(url, process.env as SecretValueSource) ?? url
}
}
function sleepMs(ms: number): Promise<void> {
return new Promise(resolve => setTimeout(resolve, ms))
}
@@ -154,6 +218,14 @@ interface OpenAIMessage {
}>
tool_call_id?: string
name?: string
/**
* Per-assistant-message chain-of-thought, attached when echoing an
* assistant message back to providers that require it (notably Moonshot:
* "thinking is enabled but reasoning_content is missing in assistant
* tool call message at index N" 400). Derived from the Anthropic thinking
* block captured when the original response was translated.
*/
reasoning_content?: string
}
interface OpenAITool {
@@ -229,6 +301,15 @@ function convertToolResultContent(
const text = parts[0].text ?? ''
return isError ? `Error: ${text}` : text
}
// Collapse arrays of only text blocks into a single string for DeepSeek
// compatibility (issue #774). DeepSeek rejects arrays in role: "tool" messages.
const allText = parts.every(p => p.type === 'text')
if (allText) {
const text = parts.map(p => p.text ?? '').join('\n\n')
return isError ? `Error: ${text}` : text
}
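// Illustrative input/output for the collapse (values made up):
// [{ type: 'text', text: 'line 1' }, { type: 'text', text: 'line 2' }]
// → 'line 1\n\nline 2'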
if (isError && parts[0]?.type === 'text') {
parts[0] = { ...parts[0], text: `Error: ${parts[0].text ?? ''}` }
} else if (isError) {
@@ -287,6 +368,14 @@ function convertContentBlocks(
if (parts.length === 0) return ''
if (parts.length === 1 && parts[0].type === 'text') return parts[0].text ?? ''
// Collapse arrays of only text blocks into a single string for DeepSeek
// compatibility (issue #774).
const allText = parts.every(p => p.type === 'text')
if (allText) {
return parts.map(p => p.text ?? '').join('\n\n')
}
return parts
}
@@ -298,10 +387,34 @@ function isGeminiMode(): boolean {
}
function convertMessages(
messages: Array<{ role: string; message?: { role?: string; content?: unknown }; content?: unknown }>,
messages: Array<{
role: string
message?: { role?: string; content?: unknown }
content?: unknown
}>,
system: unknown,
options?: { preserveReasoningContent?: boolean },
): OpenAIMessage[] {
const preserveReasoningContent = options?.preserveReasoningContent === true
const result: OpenAIMessage[] = []
const knownToolCallIds = new Set<string>()
// Pre-scan for all tool results in the history to identify valid tool calls
const toolResultIds = new Set<string>()
for (const msg of messages) {
const inner = msg.message ?? msg
const content = (inner as { content?: unknown }).content
if (Array.isArray(content)) {
for (const block of content) {
if (
(block as { type?: string }).type === 'tool_result' &&
(block as { tool_use_id?: string }).tool_use_id
) {
toolResultIds.add((block as { tool_use_id: string }).tool_use_id)
}
}
}
}
// System message first
const sysText = convertSystemPrompt(system)
@@ -309,7 +422,10 @@ function convertMessages(
result.push({ role: 'system', content: sysText })
}
for (const msg of messages) {
for (let i = 0; i < messages.length; i++) {
const msg = messages[i]
const isLastInHistory = i === messages.length - 1
// Claude Code wraps messages in { role, message: { role, content } }
const inner = msg.message ?? msg
const role = (inner as { role?: string }).role ?? msg.role
@@ -318,16 +434,30 @@ function convertMessages(
if (role === 'user') {
// Check for tool_result blocks in user messages
if (Array.isArray(content)) {
const toolResults = content.filter((b: { type?: string }) => b.type === 'tool_result')
const otherContent = content.filter((b: { type?: string }) => b.type !== 'tool_result')
const toolResults = content.filter(
(b: { type?: string }) => b.type === 'tool_result',
)
const otherContent = content.filter(
(b: { type?: string }) => b.type !== 'tool_result',
)
// Emit tool results as tool messages
// Emit tool results as tool messages, but ONLY if we have a matching tool_use ID.
// Mistral/OpenAI strictly require tool messages to follow an assistant message with tool_calls.
// If the user interrupted (ESC) and a synthetic tool_result was generated without a recorded tool_use,
// emitting it here would cause a "role must alternate" or "unexpected role" error.
for (const tr of toolResults) {
const id = tr.tool_use_id ?? 'unknown'
if (knownToolCallIds.has(id)) {
result.push({
role: 'tool',
tool_call_id: tr.tool_use_id ?? 'unknown',
tool_call_id: id,
content: convertToolResultContent(tr.content, tr.is_error),
})
} else {
logForDebugging(
`Dropping orphan tool_result for ID: ${id} to prevent API error`,
)
}
}
// Emit remaining user content
@@ -346,8 +476,12 @@ function convertMessages(
} else if (role === 'assistant') {
// Check for tool_use blocks
if (Array.isArray(content)) {
const toolUses = content.filter((b: { type?: string }) => b.type === 'tool_use')
const thinkingBlock = content.find((b: { type?: string }) => b.type === 'thinking')
const toolUses = content.filter(
(b: { type?: string }) => b.type === 'tool_use',
)
const thinkingBlock = content.find(
(b: { type?: string }) => b.type === 'thinking',
)
const textContent = content.filter(
(b: { type?: string }) => b.type !== 'tool_use' && b.type !== 'thinking',
)
@@ -356,21 +490,53 @@ function convertMessages(
role: 'assistant',
content: (() => {
const c = convertContentBlocks(textContent)
return typeof c === 'string' ? c : Array.isArray(c) ? c.map((p: { text?: string }) => p.text ?? '').join('') : ''
return typeof c === 'string'
? c
: Array.isArray(c)
? c.map((p: { text?: string }) => p.text ?? '').join('')
: ''
})(),
}
// Providers that validate reasoning continuity (Moonshot: "thinking
// is enabled but reasoning_content is missing in assistant tool call
// message at index N" 400) need the original chain-of-thought echoed
// back on each assistant message that carries a tool_call. We kept
// the thinking block on the Anthropic side; re-attach it here as the
// `reasoning_content` field on the outgoing OpenAI-shaped message.
// Gated per-provider because other endpoints either ignore the field
// (harmless) or strict-reject unknown fields (harmful).
if (preserveReasoningContent) {
const thinkingText = (thinkingBlock as { thinking?: string } | undefined)?.thinking
if (typeof thinkingText === 'string' && thinkingText.trim().length > 0) {
assistantMsg.reasoning_content = thinkingText
}
}
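// Illustrative wire shape with preserveReasoningContent on (values are
// made up; the actual push happens below once tool_calls are attached):
//   {
//     role: 'assistant',
//     content: '',
//     reasoning_content: 'Need to read the file before editing.',
//     tool_calls: [{ id: 'toolu_1', type: 'function',
//                    function: { name: 'Read', arguments: '{"file_path":"a.ts"}' } }],
//   }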
if (toolUses.length > 0) {
assistantMsg.tool_calls = toolUses.map(
const mappedToolCalls = toolUses
.map(
(tu: {
id?: string
name?: string
input?: unknown
extra_content?: Record<string, unknown>
signature?: string
}, index) => {
const toolCall: NonNullable<OpenAIMessage['tool_calls']>[number] = {
id: tu.id ?? `call_${crypto.randomUUID().replace(/-/g, '')}`,
}) => {
const id = tu.id ?? `call_${crypto.randomUUID().replace(/-/g, '')}`
// Only keep tool calls that have a corresponding result in the history,
// unless this is the last message in the history (prefill scenario).
// Orphaned tool calls (e.g. from user interruption) cause 400 errors.
if (!toolResultIds.has(id) && !isLastInHistory) {
return null
}
knownToolCallIds.add(id)
const toolCall: NonNullable<
OpenAIMessage['tool_calls']
>[number] = {
id,
type: 'function' as const,
function: {
name: tu.name ?? 'unknown',
@@ -391,34 +557,56 @@ function convertMessages(
// If the model provided a signature in the tool_use block itself (e.g. from a previous Turn/Step)
// Use thinkingBlock.signature for ALL tool calls in the same assistant turn if available.
// The API requires the same signature on every replayed function call part in a parallel set.
const signature = tu.signature ?? (thinkingBlock as any)?.signature
const signature =
tu.signature ?? (thinkingBlock as any)?.signature
// Merge into existing google-specific metadata if present
const existingGoogle =
(toolCall.extra_content?.google as Record<
string,
unknown
>) ?? {}
toolCall.extra_content = {
...toolCall.extra_content,
google: {
...existingGoogle,
thought_signature:
signature ?? 'skip_thought_signature_validator',
},
}
}
return toolCall
},
)
.filter((tc): tc is NonNullable<typeof tc> => tc !== null)
if (mappedToolCalls.length > 0) {
assistantMsg.tool_calls = mappedToolCalls
}
}
// Only push assistant message if it has content or tool calls.
// Stripped thinking-only blocks from user interruptions are empty and cause 400s.
if (assistantMsg.content || assistantMsg.tool_calls?.length) {
result.push(assistantMsg)
}
} else {
const assistantMsg: OpenAIMessage = {
role: 'assistant',
content: (() => {
const c = convertContentBlocks(content)
return typeof c === 'string'
? c
: Array.isArray(c)
? c.map((p: { text?: string }) => p.text ?? '').join('')
: ''
})(),
}
if (assistantMsg.content) {
result.push(assistantMsg)
}
}
}
}
@@ -432,25 +620,56 @@ function convertMessages(
for (const msg of result) {
const prev = coalesced[coalesced.length - 1]
// Mistral/Devstral: 'tool' message must be followed by an 'assistant' message.
// If a 'tool' result is followed by a 'user' message, we must inject a semantic
// assistant response to satisfy the strict role sequence:
// ... -> assistant (calls) -> tool (results) -> assistant (semantic) -> user (next)
if (prev && prev.role === 'tool' && msg.role === 'user') {
coalesced.push({
role: 'assistant',
content: '[Tool execution interrupted by user]',
})
}
const lastAfterPossibleInjection = coalesced[coalesced.length - 1]
if (
lastAfterPossibleInjection &&
lastAfterPossibleInjection.role === msg.role &&
msg.role !== 'tool' &&
msg.role !== 'system'
) {
const prevContent = lastAfterPossibleInjection.content
const curContent = msg.content
if (typeof prevContent === 'string' && typeof curContent === 'string') {
lastAfterPossibleInjection.content =
prevContent + (prevContent && curContent ? '\n' : '') + curContent
} else {
const toArray = (
c:
| string
| Array<{ type: string; text?: string; image_url?: { url: string } }>
| undefined,
): Array<{
type: string
text?: string
image_url?: { url: string }
}> => {
if (!c) return []
if (typeof c === 'string') return c ? [{ type: 'text', text: c }] : []
return c
}
lastAfterPossibleInjection.content = [
...toArray(prevContent),
...toArray(curContent),
]
}
if (msg.tool_calls?.length) {
lastAfterPossibleInjection.tool_calls = [
...(lastAfterPossibleInjection.tool_calls ?? []),
...msg.tool_calls,
]
}
} else {
coalesced.push(msg)
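// A hedged sketch of the repair above (roles only, content elided): strict
// providers such as Mistral/Devstral reject a 'tool' message followed
// directly by a 'user' message, so the injected assistant turn restores the
// required alternation:
//
//   before: ... assistant(tool_calls) -> tool -> user            // 400 on strict providers
//   after:  ... assistant(tool_calls) -> tool ->
//           assistant('[Tool execution interrupted by user]') -> user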
@@ -550,7 +769,10 @@ function convertTools(
function: {
name: t.name,
description: t.description ?? '',
parameters: normalizeSchemaForOpenAI(
schema,
!isGemini && !isEnvTruthy(process.env.OPENCLAUDE_DISABLE_STRICT_TOOLS),
),
},
}
})
@@ -658,11 +880,11 @@ async function* openaiStreamToAnthropic(
let hasEmittedContentStart = false
let hasEmittedThinkingStart = false
let hasClosedThinking = false
const thinkFilter = createThinkTagFilter()
let lastStopReason: 'tool_use' | 'max_tokens' | 'end_turn' | null = null
let hasEmittedFinalUsage = false
let hasProcessedFinishReason = false
const streamState = createStreamState()
// Emit message_start
yield {
@@ -738,14 +960,12 @@ async function* openaiStreamToAnthropic(
const closeActiveContentBlock = async function* () {
if (!hasEmittedContentStart) return
const tail = thinkFilter.flush()
if (tail) {
yield {
type: 'content_block_delta',
index: contentBlockIndex,
delta: { type: 'text_delta', text: tail },
}
}
@@ -755,8 +975,6 @@ async function* openaiStreamToAnthropic(
}
contentBlockIndex++
hasEmittedContentStart = false
}
try {
@@ -813,7 +1031,6 @@ async function* openaiStreamToAnthropic(
contentBlockIndex++
hasClosedThinking = true
}
if (!hasEmittedContentStart) {
yield {
type: 'content_block_start',
@@ -823,39 +1040,15 @@ async function* openaiStreamToAnthropic(
hasEmittedContentStart = true
}
const visible = thinkFilter.feed(delta.content)
if (visible) {
yield {
type: 'content_block_delta',
index: contentBlockIndex,
delta: { type: 'text_delta', text: visible },
}
}
processStreamChunk(streamState, delta.content)
}
// Tool calls
@@ -875,6 +1068,7 @@ async function* openaiStreamToAnthropic(
const toolBlockIndex = contentBlockIndex
const initialArguments = tc.function.arguments ?? ''
const normalizeAtStop = hasToolFieldMapping(tc.function.name)
processStreamChunk(streamState, tc.function.arguments ?? '')
activeToolCalls.set(tc.index, {
id: tc.id,
name: tc.function.name,
@@ -1072,6 +1266,20 @@ async function* openaiStreamToAnthropic(
reader.releaseLock()
}
const stats = getStreamStats(streamState)
if (stats.totalChunks > 0) {
logForDebugging(
JSON.stringify({
type: 'stream_stats',
model,
total_chunks: stats.totalChunks,
first_token_ms: stats.firstTokenMs,
duration_ms: stats.durationMs,
}),
{ level: 'debug' },
)
}
yield { type: 'message_stop' }
}
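// createStreamState/processStreamChunk/getStreamStats are not part of this
// diff; a minimal sketch consistent with the call sites above (the field
// names totalChunks/firstTokenMs/durationMs come from the stream_stats log
// line, everything else is assumption):
//
//   function createStreamState() {
//     return { startedAt: Date.now(), firstTokenAt: null as number | null, totalChunks: 0 }
//   }
//   function processStreamChunk(s: ReturnType<typeof createStreamState>, chunk: string): void {
//     if (!chunk) return
//     s.totalChunks++
//     s.firstTokenAt ??= Date.now()
//   }
//   function getStreamStats(s: ReturnType<typeof createStreamState>) {
//     return {
//       totalChunks: s.totalChunks,
//       firstTokenMs: s.firstTokenAt === null ? null : s.firstTokenAt - s.startedAt,
//       durationMs: Date.now() - s.startedAt,
//     }
//   }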
@@ -1269,14 +1477,20 @@ class OpenAIShimMessages {
params: ShimCreateParams,
options?: { signal?: AbortSignal; headers?: Record<string, string> },
): Promise<Response> {
const compressedMessages = compressToolHistory(
params.messages as Array<{
role: string
message?: { role?: string; content?: unknown }
content?: unknown
}>,
params.system,
request.resolvedModel,
)
const openaiMessages = convertMessages(compressedMessages, params.system, {
// Moonshot requires every assistant tool-call message to carry
// reasoning_content when its thinking feature is active. Echo it back
// from the thinking block we captured on the inbound response.
preserveReasoningContent: isMoonshotBaseUrl(request.baseUrl),
})
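// A hedged sketch of the outgoing shape when preserveReasoningContent is true
// (values illustrative): the chain-of-thought captured on the inbound
// response is echoed back as reasoning_content next to the tool call.
//
//   {
//     role: 'assistant',
//     content: 'Let me read that file.',
//     reasoning_content: '<original chain-of-thought text>',
//     tool_calls: [{ id: 'call_1', type: 'function',
//       function: { name: 'Read', arguments: '{"path":"a.ts"}' } }],
//   }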
const body: Record<string, unknown> = {
model: request.resolvedModel,
@@ -1312,14 +1526,19 @@ class OpenAIShimMessages {
const isGithubCopilot = isGithub && githubEndpointType === 'copilot'
const isGithubModels = isGithub && (githubEndpointType === 'models' || githubEndpointType === 'custom')
const isMoonshot = isMoonshotBaseUrl(request.baseUrl)
if ((isGithub || isMistral || isLocal || isMoonshot) && body.max_completion_tokens !== undefined) {
body.max_tokens = body.max_completion_tokens
delete body.max_completion_tokens
}
// Mistral and Gemini don't recognize body.store — Gemini returns 400
// "Invalid JSON payload received. Unknown name 'store': Cannot find field."
// Moonshot (api.moonshot.ai/.cn) has not published support for the
// parameter either; strip it preemptively to avoid the same class of
// error on strict-parse providers.
if (isMistral || isGeminiMode() || isMoonshot) {
delete body.store
}
@@ -1360,8 +1579,12 @@ class OpenAIShimMessages {
...filterAnthropicHeaders(options?.headers),
}
const isGemini = isGeminiMode()
const isMiniMax = !!process.env.MINIMAX_API_KEY
const apiKey =
this.providerOverride?.apiKey ??
process.env.OPENAI_API_KEY ??
(isMiniMax ? process.env.MINIMAX_API_KEY : '')
// Detect Azure endpoints by hostname (not raw URL) to prevent bypass via
// path segments like https://evil.com/cognitiveservices.azure.com/
let isAzure = false
@@ -1395,42 +1618,212 @@ class OpenAIShimMessages {
headers['X-GitHub-Api-Version'] = '2022-11-28'
}
// Build the chat completions URL.
// Standard format: {base}/openai/deployments/{model}/chat/completions?api-version={version}
// Non-Azure: {base}/chat/completions
const buildChatCompletionsUrl = (baseUrl: string): string => {
// Azure Cognitive Services / Azure OpenAI require a deployment-specific
// path and an api-version query parameter.
if (isAzure) {
const apiVersion = process.env.AZURE_OPENAI_API_VERSION ?? '2024-12-01-preview'
const deployment = request.resolvedModel ?? process.env.OPENAI_MODEL ?? 'gpt-4o'
// If base URL already contains /deployments/, use it as-is with api-version.
if (/\/deployments\//i.test(baseUrl)) {
const normalizedBase = baseUrl.replace(/\/+$/, '')
return `${normalizedBase}/chat/completions?api-version=${apiVersion}`
}
// Strip trailing /v1 or /openai/v1 if present, then build Azure path.
const normalizedBase = baseUrl
.replace(/\/(openai\/)?v1\/?$/, '')
.replace(/\/+$/, '')
return `${normalizedBase}/openai/deployments/${deployment}/chat/completions?api-version=${apiVersion}`
}
return `${baseUrl}/chat/completions`
}
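// Worked examples for the builder above (hostnames made up; the deployment
// segment comes from request.resolvedModel, here assumed to be 'gpt-4o'):
//   buildChatCompletionsUrl('https://myres.openai.azure.com')   // isAzure
//     -> 'https://myres.openai.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2024-12-01-preview'
//   buildChatCompletionsUrl('http://localhost:11434/v1')        // non-Azure
//     -> 'http://localhost:11434/v1/chat/completions'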
const localRetryBaseUrls = isLocal
? getLocalProviderRetryBaseUrls(request.baseUrl)
: []
let activeBaseUrl = request.baseUrl
let chatCompletionsUrl = buildChatCompletionsUrl(activeBaseUrl)
const attemptedLocalBaseUrls = new Set<string>([activeBaseUrl])
let didRetryWithoutTools = false
const promoteNextLocalBaseUrl = (
reason: 'endpoint_not_found' | 'localhost_resolution_failed',
): boolean => {
for (const candidateBaseUrl of localRetryBaseUrls) {
if (attemptedLocalBaseUrls.has(candidateBaseUrl)) {
continue
}
const previousUrl = chatCompletionsUrl
attemptedLocalBaseUrls.add(candidateBaseUrl)
activeBaseUrl = candidateBaseUrl
chatCompletionsUrl = buildChatCompletionsUrl(activeBaseUrl)
logForDebugging(
`[OpenAIShim] self-heal retry reason=${reason} method=POST from=${redactUrlForDiagnostics(previousUrl)} to=${redactUrlForDiagnostics(chatCompletionsUrl)} model=${request.resolvedModel}`,
{ level: 'warn' },
)
return true
}
return false
}
let serializedBody = JSON.stringify(body)
const refreshSerializedBody = (): void => {
serializedBody = JSON.stringify(body)
}
const buildFetchInit = () => ({
method: 'POST' as const,
headers,
body: serializedBody,
signal: options?.signal,
})
const maxSelfHealAttempts = isLocal
? localRetryBaseUrls.length + 1
: 0
const maxAttempts = (isGithub ? GITHUB_429_MAX_RETRIES : 1) + maxSelfHealAttempts
const throwClassifiedTransportError = (
error: unknown,
requestUrl: string,
preclassifiedFailure?: ReturnType<typeof classifyOpenAINetworkFailure>,
): never => {
if (options?.signal?.aborted) {
throw error
}
const failure =
preclassifiedFailure ??
classifyOpenAINetworkFailure(error, {
url: requestUrl,
})
const redactedUrl = redactUrlForDiagnostics(requestUrl)
const safeMessage =
redactSecretValueForDisplay(
failure.message,
process.env as SecretValueSource,
) || 'Request failed'
logForDebugging(
`[OpenAIShim] transport failure category=${failure.category} retryable=${failure.retryable} code=${failure.code ?? 'unknown'} method=POST url=${redactedUrl} model=${request.resolvedModel} message=${safeMessage}`,
{ level: 'warn' },
)
throw APIError.generate(
503,
undefined,
buildOpenAICompatibilityErrorMessage(
`OpenAI API transport error: ${safeMessage}${failure.code ? ` (code=${failure.code})` : ''}`,
failure,
),
new Headers(),
)
}
const throwClassifiedHttpError = (
status: number,
errorBody: string,
parsedBody: object | undefined,
responseHeaders: Headers,
requestUrl: string,
rateHint = '',
preclassifiedFailure?: ReturnType<typeof classifyOpenAIHttpFailure>,
): never => {
const failure =
preclassifiedFailure ??
classifyOpenAIHttpFailure({
status,
body: errorBody,
})
const redactedUrl = redactUrlForDiagnostics(requestUrl)
logForDebugging(
`[OpenAIShim] request failed category=${failure.category} retryable=${failure.retryable} status=${status} method=POST url=${redactedUrl} model=${request.resolvedModel}`,
{ level: 'warn' },
)
throw APIError.generate(
status,
parsedBody,
buildOpenAICompatibilityErrorMessage(
`OpenAI API error ${status}: ${errorBody}${rateHint}`,
failure,
),
responseHeaders,
)
}
let response: Response | undefined
const provider = request.baseUrl.includes('nvidia') ? 'nvidia-nim'
: request.baseUrl.includes('minimax') ? 'minimax'
: request.baseUrl.includes('localhost:11434') || request.baseUrl.includes('localhost:11435') ? 'ollama'
: request.baseUrl.includes('anthropic') ? 'anthropic'
: 'openai'
const { correlationId, startTime } = logApiCallStart(provider, request.resolvedModel)
for (let attempt = 0; attempt < maxAttempts; attempt++) {
try {
response = await fetchWithProxyRetry(
chatCompletionsUrl,
buildFetchInit(),
)
} catch (error) {
const isAbortError =
options?.signal?.aborted === true ||
(typeof DOMException !== 'undefined' &&
error instanceof DOMException &&
error.name === 'AbortError') ||
(typeof error === 'object' &&
error !== null &&
'name' in error &&
error.name === 'AbortError')
if (isAbortError) {
throw error
}
const failure = classifyOpenAINetworkFailure(error, {
url: chatCompletionsUrl,
})
if (
isLocal &&
failure.category === 'localhost_resolution_failed' &&
promoteNextLocalBaseUrl('localhost_resolution_failed')
) {
continue
}
throwClassifiedTransportError(error, chatCompletionsUrl, failure)
}
if (response.ok) {
let tokensIn = 0
let tokensOut = 0
// Skip clone() for streaming responses - it blocks until full body is received,
// defeating the purpose of streaming. Usage data is already sent via
// stream_options: { include_usage: true } and can be extracted from the stream.
if (!params.stream) {
try {
const clone = response.clone()
const data = await clone.json()
tokensIn = data.usage?.prompt_tokens ?? 0
tokensOut = data.usage?.completion_tokens ?? 0
} catch { /* ignore */ }
}
logApiCallEnd(correlationId, startTime, request.resolvedModel, 'success', tokensIn, tokensOut, false)
return response
}
if (
isGithub &&
response.status === 429 &&
@@ -1500,34 +1893,87 @@ class OpenAIShimMessages {
}
}
let responsesResponse: Response
try {
responsesResponse = await fetchWithProxyRetry(responsesUrl, {
method: 'POST',
headers,
body: JSON.stringify(responsesBody),
signal: options?.signal,
})
} catch (error) {
throwClassifiedTransportError(error, responsesUrl)
}
if (responsesResponse.ok) {
return responsesResponse
}
const responsesErrorBody = await responsesResponse.text().catch(() => 'unknown error')
const responsesFailure = classifyOpenAIHttpFailure({
status: responsesResponse.status,
body: responsesErrorBody,
})
let responsesErrorResponse: object | undefined
try { responsesErrorResponse = JSON.parse(responsesErrorBody) } catch { /* raw text */ }
throwClassifiedHttpError(
responsesResponse.status,
responsesErrorBody,
responsesErrorResponse,
responsesResponse.headers,
responsesUrl,
'',
responsesFailure,
)
}
}
const failure = classifyOpenAIHttpFailure({
status: response.status,
body: errorBody,
})
if (
isLocal &&
failure.category === 'endpoint_not_found' &&
promoteNextLocalBaseUrl('endpoint_not_found')
) {
continue
}
const hasToolsPayload =
Array.isArray(body.tools) &&
body.tools.length > 0
if (
!didRetryWithoutTools &&
failure.category === 'tool_call_incompatible' &&
shouldAttemptLocalToollessRetry({
baseUrl: activeBaseUrl,
hasTools: hasToolsPayload,
})
) {
didRetryWithoutTools = true
delete body.tools
delete body.tool_choice
refreshSerializedBody()
logForDebugging(
`[OpenAIShim] self-heal retry reason=tool_call_incompatible mode=toolless method=POST url=${redactUrlForDiagnostics(chatCompletionsUrl)} model=${request.resolvedModel}`,
{ level: 'warn' },
)
continue
}
let errorResponse: object | undefined
try { errorResponse = JSON.parse(errorBody) } catch { /* raw text */ }
throwClassifiedHttpError(
response.status,
errorBody,
errorResponse,
response.headers as unknown as Headers,
chatCompletionsUrl,
rateHint,
failure,
)
}
@@ -1584,7 +2030,7 @@ class OpenAIShimMessages {
if (typeof rawContent === 'string' && rawContent) {
content.push({
type: 'text',
text: stripThinkTags(rawContent),
})
} else if (Array.isArray(rawContent) && rawContent.length > 0) {
const parts: string[] = []
@@ -1602,7 +2048,7 @@ class OpenAIShimMessages {
if (joined) {
content.push({
type: 'text',
text: stripThinkTags(joined),
})
}
}

View File

@@ -0,0 +1,107 @@
import { afterEach, expect, mock, test } from 'bun:test'
const originalEnv = {
CLAUDE_CODE_USE_OPENAI: process.env.CLAUDE_CODE_USE_OPENAI,
CLAUDE_CODE_USE_MISTRAL: process.env.CLAUDE_CODE_USE_MISTRAL,
OPENAI_BASE_URL: process.env.OPENAI_BASE_URL,
OPENAI_MODEL: process.env.OPENAI_MODEL,
OPENAI_API_BASE: process.env.OPENAI_API_BASE,
MISTRAL_BASE_URL: process.env.MISTRAL_BASE_URL,
MISTRAL_MODEL: process.env.MISTRAL_MODEL,
}
function restoreEnv(key: string, value: string | undefined): void {
if (value === undefined) {
delete process.env[key]
} else {
process.env[key] = value
}
}
afterEach(() => {
restoreEnv('CLAUDE_CODE_USE_OPENAI', originalEnv.CLAUDE_CODE_USE_OPENAI)
restoreEnv('CLAUDE_CODE_USE_MISTRAL', originalEnv.CLAUDE_CODE_USE_MISTRAL)
restoreEnv('OPENAI_BASE_URL', originalEnv.OPENAI_BASE_URL)
restoreEnv('OPENAI_MODEL', originalEnv.OPENAI_MODEL)
restoreEnv('OPENAI_API_BASE', originalEnv.OPENAI_API_BASE)
restoreEnv('MISTRAL_BASE_URL', originalEnv.MISTRAL_BASE_URL)
restoreEnv('MISTRAL_MODEL', originalEnv.MISTRAL_MODEL)
mock.restore()
})
test('logs a warning when OPENAI_BASE_URL is literal undefined', async () => {
const debugSpy = mock(() => {})
mock.module('../../utils/debug.js', () => ({
logForDebugging: debugSpy,
}))
process.env.CLAUDE_CODE_USE_OPENAI = '1'
process.env.OPENAI_BASE_URL = 'undefined'
process.env.OPENAI_MODEL = 'gpt-4o'
delete process.env.OPENAI_API_BASE
const nonce = `${Date.now()}-${Math.random()}`
const { resolveProviderRequest } = await import(`./providerConfig.ts?ts=${nonce}`)
const resolved = resolveProviderRequest()
expect(resolved.baseUrl).toBe('https://api.openai.com/v1')
const warningCall = debugSpy.mock.calls.find(call =>
typeof call?.[0] === 'string' &&
call[0].includes('OPENAI_BASE_URL') &&
call[0].includes('"undefined"'),
)
expect(warningCall).toBeDefined()
expect(warningCall?.[1]).toEqual({ level: 'warn' })
})
test('does not warn for OPENAI_API_BASE when OPENAI_BASE_URL is active', async () => {
const debugSpy = mock(() => {})
mock.module('../../utils/debug.js', () => ({
logForDebugging: debugSpy,
}))
process.env.CLAUDE_CODE_USE_OPENAI = '1'
delete process.env.CLAUDE_CODE_USE_MISTRAL
process.env.OPENAI_BASE_URL = 'http://127.0.0.1:11434/v1'
process.env.OPENAI_MODEL = 'qwen2.5-coder:7b'
process.env.OPENAI_API_BASE = 'undefined'
const nonce = `${Date.now()}-${Math.random()}`
const { resolveProviderRequest } = await import(`./providerConfig.ts?ts=${nonce}`)
const resolved = resolveProviderRequest()
expect(resolved.baseUrl).toBe('http://127.0.0.1:11434/v1')
const aliasWarning = debugSpy.mock.calls.find(call =>
typeof call?.[0] === 'string' &&
call[0].includes('OPENAI_API_BASE') &&
call[0].includes('"undefined"'),
)
expect(aliasWarning).toBeUndefined()
})
test('uses OPENAI_API_BASE as fallback in mistral mode when MISTRAL_BASE_URL is unset', async () => {
const debugSpy = mock(() => {})
mock.module('../../utils/debug.js', () => ({
logForDebugging: debugSpy,
}))
delete process.env.CLAUDE_CODE_USE_OPENAI
process.env.CLAUDE_CODE_USE_MISTRAL = '1'
delete process.env.MISTRAL_BASE_URL
process.env.MISTRAL_MODEL = 'mistral-medium-latest'
process.env.OPENAI_API_BASE = 'http://127.0.0.1:11434/v1'
const nonce = `${Date.now()}-${Math.random()}`
const { resolveProviderRequest } = await import(`./providerConfig.ts?ts=${nonce}`)
const resolved = resolveProviderRequest()
expect(resolved.baseUrl).toBe('http://127.0.0.1:11434/v1')
expect(debugSpy.mock.calls).toHaveLength(0)
})

View File

@@ -2,8 +2,10 @@ import { afterEach, expect, test } from 'bun:test'
import {
getAdditionalModelOptionsCacheScope,
getLocalProviderRetryBaseUrls,
isLocalProviderUrl,
resolveProviderRequest,
shouldAttemptLocalToollessRetry,
} from './providerConfig.js'
const originalEnv = {
@@ -83,3 +85,42 @@ test('skips local model cache scope for remote openai-compatible providers', ()
expect(getAdditionalModelOptionsCacheScope()).toBeNull()
})
test('derives local retry base URLs with /v1 and loopback fallback candidates', () => {
expect(getLocalProviderRetryBaseUrls('http://localhost:11434')).toEqual([
'http://localhost:11434/v1',
'http://127.0.0.1:11434',
'http://127.0.0.1:11434/v1',
])
})
test('does not derive local retry base URLs for remote providers', () => {
expect(getLocalProviderRetryBaseUrls('https://api.openai.com/v1')).toEqual([])
})
test('enables local toolless retry for likely Ollama endpoints with tools', () => {
expect(
shouldAttemptLocalToollessRetry({
baseUrl: 'http://localhost:11434/v1',
hasTools: true,
}),
).toBe(true)
})
test('disables local toolless retry when no tools are present', () => {
expect(
shouldAttemptLocalToollessRetry({
baseUrl: 'http://localhost:11434/v1',
hasTools: false,
}),
).toBe(false)
})
test('disables local toolless retry for non-Ollama local endpoints', () => {
expect(
shouldAttemptLocalToollessRetry({
baseUrl: 'http://localhost:1234/v1',
hasTools: true,
}),
).toBe(false)
})

View File

@@ -8,17 +8,20 @@ import {
readCodexCredentials,
type CodexCredentialBlob,
} from '../../utils/codexCredentials.js'
import { logForDebugging } from '../../utils/debug.js'
import { isEnvTruthy } from '../../utils/envUtils.js'
import {
asTrimmedString,
parseChatgptAccountId,
} from './codexOAuthShared.js'
import { DEFAULT_GEMINI_BASE_URL } from 'src/utils/providerProfile.js'
export const DEFAULT_OPENAI_BASE_URL = 'https://api.openai.com/v1'
export const DEFAULT_CODEX_BASE_URL = 'https://chatgpt.com/backend-api/codex'
export const DEFAULT_MISTRAL_BASE_URL = 'https://api.mistral.ai/v1'
/** Default GitHub Copilot API model when user selects copilot / github:copilot */
export const DEFAULT_GITHUB_MODELS_API_MODEL = 'gpt-4o'
const warnedUndefinedEnvNames = new Set<string>()
const CODEX_ALIAS_MODELS: Record<
string,
@@ -129,7 +132,33 @@ function isPrivateIpv6Address(hostname: string): boolean {
function asEnvUrl(value: string | undefined): string | undefined {
if (!value) return undefined
const trimmed = value.trim()
if (!trimmed) return undefined
if (trimmed === 'undefined') {
return undefined
}
return trimmed
}
function asNamedEnvUrl(
value: string | undefined,
envName: string,
): string | undefined {
if (!value) return undefined
const trimmed = value.trim()
if (!trimmed) return undefined
if (trimmed === 'undefined') {
if (!warnedUndefinedEnvNames.has(envName)) {
warnedUndefinedEnvNames.add(envName)
logForDebugging(
`[provider-config] Environment variable ${envName} is the literal string "undefined"; ignoring it.`,
{ level: 'warn' },
)
}
return undefined
}
return trimmed
}
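// Behavior sketch, per the implementation above:
//   asNamedEnvUrl('undefined', 'OPENAI_BASE_URL')                   // -> undefined, warns once per env name
//   asNamedEnvUrl(' https://api.openai.com/v1 ', 'OPENAI_BASE_URL') // -> 'https://api.openai.com/v1'
//   asNamedEnvUrl('', 'OPENAI_BASE_URL')                            // -> undefined, silently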
@@ -276,6 +305,101 @@ export function isLocalProviderUrl(baseUrl: string | undefined): boolean {
}
}
function trimTrailingSlash(value: string): string {
return value.replace(/\/+$/, '')
}
function normalizePathWithV1(pathname: string): string {
const trimmed = trimTrailingSlash(pathname)
if (!trimmed || trimmed === '/') {
return '/v1'
}
if (trimmed.toLowerCase().endsWith('/v1')) {
return trimmed
}
return `${trimmed}/v1`
}
function isLikelyOllamaEndpoint(baseUrl: string): boolean {
try {
const parsed = new URL(baseUrl)
const hostname = parsed.hostname.toLowerCase()
const pathname = parsed.pathname.toLowerCase()
if (parsed.port === '11434') {
return true
}
return (
hostname.includes('ollama') ||
pathname.includes('ollama')
)
} catch {
return false
}
}
export function getLocalProviderRetryBaseUrls(baseUrl: string): string[] {
if (!isLocalProviderUrl(baseUrl)) {
return []
}
try {
const parsed = new URL(baseUrl)
const original = trimTrailingSlash(parsed.toString())
const seen = new Set<string>([original])
const candidates: string[] = []
const addCandidate = (hostname: string, pathname: string): void => {
const next = new URL(parsed.toString())
next.hostname = hostname
next.pathname = pathname
next.search = ''
next.hash = ''
const normalized = trimTrailingSlash(next.toString())
if (seen.has(normalized)) {
return
}
seen.add(normalized)
candidates.push(normalized)
}
const v1Pathname = normalizePathWithV1(parsed.pathname)
if (v1Pathname !== trimTrailingSlash(parsed.pathname)) {
addCandidate(parsed.hostname, v1Pathname)
}
const hostname = parsed.hostname.toLowerCase().replace(/^\[|\]$/g, '')
if (hostname === 'localhost' || hostname === '::1') {
addCandidate('127.0.0.1', parsed.pathname || '/')
addCandidate('127.0.0.1', v1Pathname)
}
return candidates
} catch {
return []
}
}
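// Worked example (mirrors the unit tests for this helper):
//   getLocalProviderRetryBaseUrls('http://localhost:11434')
//     -> ['http://localhost:11434/v1', 'http://127.0.0.1:11434', 'http://127.0.0.1:11434/v1']
//   getLocalProviderRetryBaseUrls('https://api.openai.com/v1')
//     -> []  (remote providers get no self-heal candidates)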
export function shouldAttemptLocalToollessRetry(options: {
baseUrl: string
hasTools: boolean
}): boolean {
if (!options.hasTools) {
return false
}
if (!isLocalProviderUrl(options.baseUrl)) {
return false
}
return isLikelyOllamaEndpoint(options.baseUrl)
}
export function isCodexBaseUrl(baseUrl: string | undefined): boolean {
if (!baseUrl) return false
try {
@@ -353,23 +477,55 @@ export function resolveProviderRequest(options?: {
}): ResolvedProviderRequest {
const isGithubMode = isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB)
const isMistralMode = isEnvTruthy(process.env.CLAUDE_CODE_USE_MISTRAL)
const isGeminiMode = isEnvTruthy(process.env.CLAUDE_CODE_USE_GEMINI)
const requestedModel =
options?.model?.trim() ||
(isMistralMode
? process.env.MISTRAL_MODEL?.trim()
: isGeminiMode
? process.env.GEMINI_MODEL?.trim()
: process.env.OPENAI_MODEL?.trim()) ||
options?.fallbackModel?.trim() ||
(isGithubMode ? 'github:copilot' : 'gpt-4o')
const descriptor = parseModelDescriptor(requestedModel)
const explicitBaseUrl = asEnvUrl(options?.baseUrl)
const normalizedMistralEnvBaseUrl = asNamedEnvUrl(
process.env.MISTRAL_BASE_URL,
'MISTRAL_BASE_URL',
)
const normalizedGeminiEnvBaseUrl = asNamedEnvUrl(
process.env.GEMINI_BASE_URL,
'GEMINI_BASE_URL',
)
const primaryEnvBaseUrl = isMistralMode
? normalizedMistralEnvBaseUrl
: isGeminiMode
? normalizedGeminiEnvBaseUrl
: asNamedEnvUrl(process.env.OPENAI_BASE_URL, 'OPENAI_BASE_URL')
// In Mistral mode, a literal "undefined" MISTRAL_BASE_URL is treated as
// misconfiguration and falls back to OPENAI_API_BASE, then
// DEFAULT_MISTRAL_BASE_URL for a safe default endpoint.
const fallbackEnvBaseUrl = isMistralMode
? (primaryEnvBaseUrl === undefined
? asNamedEnvUrl(process.env.OPENAI_API_BASE, 'OPENAI_API_BASE') ?? DEFAULT_MISTRAL_BASE_URL
: undefined)
: isGeminiMode
? (primaryEnvBaseUrl === undefined
? asNamedEnvUrl(process.env.OPENAI_API_BASE, 'OPENAI_API_BASE') ?? DEFAULT_GEMINI_BASE_URL
: undefined)
: (primaryEnvBaseUrl === undefined
? asNamedEnvUrl(process.env.OPENAI_API_BASE, 'OPENAI_API_BASE')
: undefined)
const envBaseUrlRaw =
explicitBaseUrl ??
primaryEnvBaseUrl ??
fallbackEnvBaseUrl
const isCodexModelForGithub = isGithubMode && isCodexAlias(requestedModel)
const envBaseUrl =

View File

@@ -1,46 +0,0 @@
import { describe, expect, test } from 'bun:test'
import {
looksLikeLeakedReasoningPrefix,
shouldBufferPotentialReasoningPrefix,
stripLeakedReasoningPreamble,
} from './reasoningLeakSanitizer.ts'
describe('reasoning leak sanitizer', () => {
test('strips explicit internal reasoning preambles', () => {
const text =
'The user just said "hey" - a simple greeting. I should respond briefly and friendly.\n\nHey! How can I help you today?'
expect(looksLikeLeakedReasoningPrefix(text)).toBe(true)
expect(stripLeakedReasoningPreamble(text)).toBe(
'Hey! How can I help you today?',
)
})
test('does not strip normal user-facing advice that mentions "the user should"', () => {
const text =
'The user should reset their password immediately.\n\nHere are the steps...'
expect(looksLikeLeakedReasoningPrefix(text)).toBe(false)
expect(shouldBufferPotentialReasoningPrefix(text)).toBe(false)
expect(stripLeakedReasoningPreamble(text)).toBe(text)
})
test('does not strip legitimate first-person advice about responding to an incident', () => {
const text =
'I need to respond to this security incident immediately. The system is compromised.\n\nHere are the remediation steps...'
expect(looksLikeLeakedReasoningPrefix(text)).toBe(false)
expect(shouldBufferPotentialReasoningPrefix(text)).toBe(false)
expect(stripLeakedReasoningPreamble(text)).toBe(text)
})
test('does not strip legitimate first-person advice about answering a support ticket', () => {
const text =
'I need to answer the support ticket before end of day. The customer is waiting.\n\nHere is the response I drafted...'
expect(looksLikeLeakedReasoningPrefix(text)).toBe(false)
expect(shouldBufferPotentialReasoningPrefix(text)).toBe(false)
expect(stripLeakedReasoningPreamble(text)).toBe(text)
})
})

View File

@@ -1,54 +0,0 @@
const EXPLICIT_REASONING_START_RE =
/^\s*(i should\b|i need to\b|let me think\b|the task\b|the request\b)/i
const EXPLICIT_REASONING_META_RE =
/\b(user|request|question|prompt|message|task|greeting|small talk|briefly|friendly|concise)\b/i
const USER_META_START_RE =
/^\s*the user\s+(just\s+)?(said|asked|is asking|wants|wanted|mentioned|seems|appears)\b/i
const USER_REASONING_RE =
/^\s*the user\s+(just\s+)?(said|asked|is asking|wants|wanted|mentioned|seems|appears)\b[\s\S]*\b(i should|i need to|let me think|respond|reply|answer|greeting|small talk|briefly|friendly|concise)\b/i
export function shouldBufferPotentialReasoningPrefix(text: string): boolean {
const normalized = text.trim()
if (!normalized) return false
if (looksLikeLeakedReasoningPrefix(normalized)) {
return true
}
const hasParagraphBoundary = /\n\s*\n/.test(normalized)
if (hasParagraphBoundary) {
return false
}
return (
EXPLICIT_REASONING_START_RE.test(normalized) ||
USER_META_START_RE.test(normalized)
)
}
export function looksLikeLeakedReasoningPrefix(text: string): boolean {
const normalized = text.trim()
if (!normalized) return false
return (
(EXPLICIT_REASONING_START_RE.test(normalized) &&
EXPLICIT_REASONING_META_RE.test(normalized)) ||
USER_REASONING_RE.test(normalized)
)
}
export function stripLeakedReasoningPreamble(text: string): string {
const normalized = text.replace(/\r\n/g, '\n')
const parts = normalized.split(/\n\s*\n/)
if (parts.length < 2) return text
const first = parts[0]?.trim() ?? ''
if (!looksLikeLeakedReasoningPrefix(first)) {
return text
}
const remainder = parts.slice(1).join('\n\n').trim()
return remainder || text
}

View File

@@ -0,0 +1,191 @@
import { describe, expect, test } from 'bun:test'
import {
routeModel,
type SmartRoutingConfig,
} from './smartModelRouting.ts'
const ENABLED: SmartRoutingConfig = {
enabled: true,
simpleModel: 'claude-haiku-4-5',
strongModel: 'claude-opus-4-7',
}
describe('routeModel — disabled / misconfigured', () => {
test('disabled config routes to strong', () => {
const decision = routeModel(
{ userText: 'hi' },
{ ...ENABLED, enabled: false },
)
expect(decision.model).toBe('claude-opus-4-7')
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('disabled')
})
test('missing simpleModel falls back to strong', () => {
const decision = routeModel(
{ userText: 'hi' },
{ ...ENABLED, simpleModel: '' },
)
expect(decision.model).toBe('claude-opus-4-7')
expect(decision.complexity).toBe('strong')
})
test('simpleModel === strongModel routes to strong (no-op)', () => {
const decision = routeModel(
{ userText: 'hi' },
{ ...ENABLED, simpleModel: 'claude-opus-4-7' },
)
expect(decision.model).toBe('claude-opus-4-7')
expect(decision.complexity).toBe('strong')
})
})
describe('routeModel — simple path', () => {
test('short greeting routes to simple', () => {
const decision = routeModel({ userText: 'thanks!', turnNumber: 5 }, ENABLED)
expect(decision.model).toBe('claude-haiku-4-5')
expect(decision.complexity).toBe('simple')
})
test('empty input routes to simple', () => {
const decision = routeModel({ userText: ' ' }, ENABLED)
expect(decision.model).toBe('claude-haiku-4-5')
expect(decision.complexity).toBe('simple')
})
test('mid-length chatter routes to simple', () => {
const decision = routeModel(
{ userText: 'yep looks good, go ahead', turnNumber: 10 },
ENABLED,
)
expect(decision.complexity).toBe('simple')
})
})
describe('routeModel — strong path', () => {
test('first turn always routes to strong, even when short', () => {
const decision = routeModel(
{ userText: 'fix the bug', turnNumber: 1 },
ENABLED,
)
expect(decision.model).toBe('claude-opus-4-7')
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('first turn')
})
test('code fence routes to strong', () => {
const decision = routeModel(
{
userText: 'change this:\n```\nfoo()\n```',
turnNumber: 5,
},
ENABLED,
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('code')
})
test('inline code span routes to strong', () => {
const decision = routeModel(
{ userText: 'rename `foo` to `bar`', turnNumber: 5 },
ENABLED,
)
expect(decision.complexity).toBe('strong')
})
test('reasoning keyword "plan" routes to strong even when short', () => {
const decision = routeModel(
{ userText: 'plan the refactor', turnNumber: 5 },
ENABLED,
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('keyword')
})
test('reasoning keyword "debug" routes to strong', () => {
const decision = routeModel(
{ userText: 'debug the test', turnNumber: 5 },
ENABLED,
)
expect(decision.complexity).toBe('strong')
})
test('"root cause" multi-word keyword routes to strong', () => {
const decision = routeModel(
{ userText: 'find the root cause', turnNumber: 5 },
ENABLED,
)
expect(decision.complexity).toBe('strong')
})
test('multi-paragraph input routes to strong', () => {
const decision = routeModel(
{
userText: 'first thought.\n\nsecond thought.',
turnNumber: 5,
},
ENABLED,
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('multi-paragraph')
})
test('over-long input routes to strong', () => {
const long = 'ok '.repeat(100) // ~300 chars, 100 words
const decision = routeModel(
{ userText: long, turnNumber: 5 },
ENABLED,
)
expect(decision.complexity).toBe('strong')
})
test('exactly at the boundary stays simple', () => {
const text = 'a'.repeat(160)
const decision = routeModel(
{ userText: text, turnNumber: 5 },
{ ...ENABLED, simpleMaxChars: 160, simpleMaxWords: 28 },
)
expect(decision.complexity).toBe('simple')
})
test('one char over the boundary routes to strong', () => {
const text = 'a'.repeat(161)
const decision = routeModel(
{ userText: text, turnNumber: 5 },
{ ...ENABLED, simpleMaxChars: 160, simpleMaxWords: 28 },
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('160 chars')
})
})
describe('routeModel — config overrides', () => {
test('custom simpleMaxChars is honored', () => {
const decision = routeModel(
{ userText: 'abcdefghijklmnop', turnNumber: 5 },
{ ...ENABLED, simpleMaxChars: 10 },
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('10 chars')
})
test('custom simpleMaxWords is honored', () => {
const decision = routeModel(
{ userText: 'one two three four five', turnNumber: 5 },
{ ...ENABLED, simpleMaxWords: 3 },
)
expect(decision.complexity).toBe('strong')
expect(decision.reason).toContain('3 words')
})
})
describe('routeModel — reason strings', () => {
test('simple decisions include char + word counts', () => {
const decision = routeModel(
{ userText: 'sounds good', turnNumber: 5 },
ENABLED,
)
expect(decision.reason).toMatch(/\d+ chars, \d+ words/)
})
})

View File

@@ -0,0 +1,215 @@
/**
* Smart model routing — cheap-for-simple, strong-for-hard.
*
* For everyday short chatter ("ok", "thanks", "what does this do?") the
* incremental quality of Opus/GPT-5 over Haiku/Mini is negligible while the
* cost and latency are an order of magnitude worse. Smart routing lets a
* user opt into sending such "obviously simple" turns to a cheaper model
* while keeping the strong model for anything non-trivial.
*
* This module is a pure primitive: it takes a turn description (the user's
* text + light context) and returns which model to use, based on config.
* It never reads env vars or state directly — caller supplies everything.
*
* Off by default. Users opt in via settings.smartRouting.enabled. The intent
* is a small, copy-pasteable config block rather than a hidden heuristic, so
* the tradeoff is visible and the user controls it.
*/
export type SmartRoutingConfig = {
enabled: boolean
/** Model to use for turns classified as "simple". */
simpleModel: string
/** Model to use for turns classified as "strong" (or when unsure). */
strongModel: string
/** Max characters in user input to qualify as "simple". Default 160. */
simpleMaxChars?: number
/** Max whitespace-separated words to qualify as "simple". Default 28. */
simpleMaxWords?: number
}
export type RoutingDecision = {
model: string
complexity: 'simple' | 'strong'
/** Human-readable reason — useful for the UI indicator and debug logs. */
reason: string
}
export type RoutingInput = {
/** The user's message text for this turn. */
userText: string
/**
* Optional: how many tool-use blocks the assistant has emitted in the
* recent conversation. High values correlate with "continue this work"
* follow-ups that can still be cheap, UNLESS the user also typed code
* or strong-keyword text.
*/
recentToolUses?: number
/**
* Optional: turn number within the current session (1-indexed). The first
* turn is often task-setup and benefits from the strong model even if
* short — a bare "build X" opens the whole task.
*/
turnNumber?: number
}
const DEFAULT_SIMPLE_MAX_CHARS = 160
const DEFAULT_SIMPLE_MAX_WORDS = 28
// Keywords that strongly suggest reasoning/planning/design work.
// Matching is word-boundary / case-insensitive. Must include enough anchors
// that short prompts like "plan the refactor" route to strong even under
// the char/word cutoff.
const STRONG_KEYWORDS = [
'plan',
'design',
'architect',
'architecture',
'refactor',
'debug',
'investigate',
'analyze',
'analyse',
'implement',
'optimize',
'optimise',
'review',
'audit',
'diagnose',
'root cause',
'root-cause',
'why does',
'why is',
'how should',
'why did',
'propose',
'trace',
'reproduce',
]
const STRONG_KEYWORD_RE = new RegExp(
`\\b(?:${STRONG_KEYWORDS.map(k => k.replace(/[-]/g, '[-\\s]')).join('|')})\\b`,
'i',
)
const CODE_FENCE_RE = /```[\s\S]*?```|`[^`\n]+`/
function countWords(text: string): number {
const trimmed = text.trim()
if (!trimmed) return 0
return trimmed.split(/\s+/).length
}
function hasMultiParagraph(text: string): boolean {
return /\n\s*\n/.test(text)
}
function hasCode(text: string): boolean {
return CODE_FENCE_RE.test(text)
}
function hasStrongKeyword(text: string): boolean {
return STRONG_KEYWORD_RE.test(text)
}
/**
* Decide whether to route to the simple or strong model based on heuristics.
* Returns the chosen model + a reason. When routing is disabled or both
* models match, the strong model is used (safe default).
*/
export function routeModel(
input: RoutingInput,
config: SmartRoutingConfig,
): RoutingDecision {
if (!config.enabled) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'smart-routing disabled',
}
}
if (!config.simpleModel || !config.strongModel) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'simpleModel or strongModel missing from config',
}
}
if (config.simpleModel === config.strongModel) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'simpleModel equals strongModel',
}
}
const text = input.userText ?? ''
const trimmed = text.trim()
if (!trimmed) {
// Empty input (e.g. resuming a tool-use chain) — cheap by default.
return {
model: config.simpleModel,
complexity: 'simple',
reason: 'empty user text',
}
}
// First turn of a session is task-setup — always use strong.
if (input.turnNumber === 1) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'first turn of session',
}
}
const maxChars = config.simpleMaxChars ?? DEFAULT_SIMPLE_MAX_CHARS
const maxWords = config.simpleMaxWords ?? DEFAULT_SIMPLE_MAX_WORDS
if (hasCode(trimmed)) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'contains code block or inline code',
}
}
if (hasStrongKeyword(trimmed)) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'contains reasoning/planning keyword',
}
}
if (hasMultiParagraph(trimmed)) {
return {
model: config.strongModel,
complexity: 'strong',
reason: 'multi-paragraph input',
}
}
if (trimmed.length > maxChars) {
return {
model: config.strongModel,
complexity: 'strong',
reason: `input > ${maxChars} chars`,
}
}
if (countWords(trimmed) > maxWords) {
return {
model: config.strongModel,
complexity: 'strong',
reason: `input > ${maxWords} words`,
}
}
return {
model: config.simpleModel,
complexity: 'simple',
reason: `short (${trimmed.length} chars, ${countWords(trimmed)} words)`,
}
}
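A small usage sketch, mirroring the unit tests above (the model names are the test fixtures, not shipped defaults):

const config: SmartRoutingConfig = {
  enabled: true,
  simpleModel: 'claude-haiku-4-5',
  strongModel: 'claude-opus-4-7',
}
routeModel({ userText: 'thanks!', turnNumber: 5 }, config)
// -> { model: 'claude-haiku-4-5', complexity: 'simple', reason: 'short (7 chars, 1 words)' }
routeModel({ userText: 'plan the refactor', turnNumber: 5 }, config)
// -> { model: 'claude-opus-4-7', complexity: 'strong', reason: 'contains reasoning/planning keyword' }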

View File

@@ -0,0 +1,183 @@
import { describe, expect, test } from 'bun:test'
import {
createThinkTagFilter,
stripThinkTags,
} from './thinkTagSanitizer.ts'
describe('stripThinkTags — whole-text cleanup', () => {
test('strips closed think pair', () => {
expect(stripThinkTags('<think>reasoning</think>Hello')).toBe('Hello')
})
test('strips closed thinking pair', () => {
expect(stripThinkTags('<thinking>x</thinking>Out')).toBe('Out')
})
test('strips closed reasoning pair', () => {
expect(stripThinkTags('<reasoning>x</reasoning>Out')).toBe('Out')
})
test('strips REASONING_SCRATCHPAD pair', () => {
expect(stripThinkTags('<REASONING_SCRATCHPAD>plan</REASONING_SCRATCHPAD>Answer'))
.toBe('Answer')
})
test('is case-insensitive', () => {
expect(stripThinkTags('<THINKING>x</THINKING>out')).toBe('out')
expect(stripThinkTags('<Think>x</Think>out')).toBe('out')
})
test('handles attributes on open tag', () => {
expect(stripThinkTags('<think id="plan-1">reason</think>ok')).toBe('ok')
})
test('strips unterminated open tag at block boundary', () => {
expect(stripThinkTags('<think>reasoning that never closes')).toBe('')
})
test('strips unterminated open tag after newline', () => {
// Block-boundary match consumes the leading newline, same as hermes.
expect(stripThinkTags('Answer: 42\n<think>second-guess myself'))
.toBe('Answer: 42')
})
test('strips orphan close tag', () => {
expect(stripThinkTags('trailing </think>done')).toBe('trailing done')
})
test('strips multiple blocks', () => {
expect(stripThinkTags('<think>a</think>B<think>c</think>D')).toBe('BD')
})
test('handles reasoning mid-response after content', () => {
expect(stripThinkTags('Answer: 42\n<think>double-check</think>\nDone'))
.toBe('Answer: 42\n\nDone')
})
test('handles nested-looking tags (lazy match + orphan cleanup)', () => {
expect(stripThinkTags('<think><think>x</think></think>y')).toBe('y')
})
test('preserves legitimate non-think tags', () => {
expect(stripThinkTags('use <div> and <span>')).toBe('use <div> and <span>')
})
test('preserves text without any tags', () => {
expect(stripThinkTags('Hello, world. I should respond briefly.')).toBe(
'Hello, world. I should respond briefly.',
)
})
test('handles empty input', () => {
expect(stripThinkTags('')).toBe('')
})
})
describe('createThinkTagFilter — streaming state machine', () => {
test('passes through plain text', () => {
const f = createThinkTagFilter()
expect(f.feed('Hello, ')).toBe('Hello, ')
expect(f.feed('world!')).toBe('world!')
expect(f.flush()).toBe('')
})
test('strips a complete think block in one chunk', () => {
const f = createThinkTagFilter()
expect(f.feed('pre<think>reason</think>post')).toBe('prepost')
expect(f.flush()).toBe('')
})
test('handles open tag split across deltas', () => {
const f = createThinkTagFilter()
expect(f.feed('before<th')).toBe('before')
expect(f.feed('ink>reason</think>after')).toBe('after')
expect(f.flush()).toBe('')
})
test('handles close tag split across deltas', () => {
const f = createThinkTagFilter()
expect(f.feed('<think>reason</th')).toBe('')
expect(f.feed('ink>keep')).toBe('keep')
expect(f.flush()).toBe('')
})
test('handles tag split on bare < boundary', () => {
const f = createThinkTagFilter()
expect(f.feed('leading <')).toBe('leading ')
expect(f.feed('think>inner</think>tail')).toBe('tail')
expect(f.flush()).toBe('')
})
test('preserves partial non-tag < at boundary when next char rules it out', () => {
const f = createThinkTagFilter()
// "<d" — 'd' cannot start any of our tag names, so emit immediately
expect(f.feed('pre<d')).toBe('pre<d')
expect(f.feed('iv>rest')).toBe('iv>rest')
expect(f.flush()).toBe('')
})
test('case-insensitive streaming', () => {
const f = createThinkTagFilter()
expect(f.feed('<THINKING>x</THINKING>out')).toBe('out')
expect(f.flush()).toBe('')
})
test('unterminated open tag — flush drops remainder', () => {
const f = createThinkTagFilter()
expect(f.feed('<think>reasoning with no close ')).toBe('')
expect(f.feed('and more reasoning')).toBe('')
expect(f.flush()).toBe('')
expect(f.isInsideBlock()).toBe(false)
})
test('multiple blocks in single feed', () => {
const f = createThinkTagFilter()
expect(f.feed('<think>a</think>B<think>c</think>D')).toBe('BD')
expect(f.flush()).toBe('')
})
test('flush after clean stream emits nothing extra', () => {
const f = createThinkTagFilter()
expect(f.feed('complete message')).toBe('complete message')
expect(f.flush()).toBe('')
})
test('flush of bare < at end emits it (not a tag prefix)', () => {
const f = createThinkTagFilter()
// bare '<' held back; flush emits it since it has no tag-name chars
expect(f.feed('x <')).toBe('x ')
expect(f.flush()).toBe('<')
})
test('flush of partial tag-name prefix at end drops it', () => {
const f = createThinkTagFilter()
expect(f.feed('x <thi')).toBe('x ')
expect(f.flush()).toBe('')
})
test('handles attributes on streaming open tag', () => {
const f = createThinkTagFilter()
expect(f.feed('<think type="plan">reason</think>ok')).toBe('ok')
expect(f.flush()).toBe('')
})
test('mid-delta transition: content, reasoning, content', () => {
const f = createThinkTagFilter()
expect(f.feed('Answer: 42\n<think>')).toBe('Answer: 42\n')
expect(f.feed('double-check')).toBe('')
expect(f.feed('</think>\nDone')).toBe('\nDone')
expect(f.flush()).toBe('')
})
test('orphan close tag mid-stream is stripped on flush via safety-net behavior', () => {
// Filter alone treats orphan close as "we're not inside", so it emits as-is.
// Safety net (stripThinkTags on final text) removes orphans.
const f = createThinkTagFilter()
const chunk1 = f.feed('trailing ')
const chunk2 = f.feed('</think>done')
const final = chunk1 + chunk2 + f.flush()
// Orphan close appears in stream output; safety net cleans it
expect(stripThinkTags(final)).toBe('trailing done')
})
})

View File

@@ -0,0 +1,162 @@
/**
* Think-tag sanitizer for reasoning content leaks.
*
* Some OpenAI-compatible reasoning models (MiniMax M2.7, GLM-4.5/5, DeepSeek, Kimi K2,
* self-hosted vLLM builds) emit chain-of-thought inline inside the `content` field using
* XML-like tags instead of the separate `reasoning_content` channel. Example:
*
* <think>the user wants foo, let me check bar</think>Here is the answer: ...
*
* This module strips those blocks structurally (tag-based), independent of English
* phrasings. Three layers:
*
* 1. `createThinkTagFilter()` — streaming state machine. Feeds deltas, emits only
* the visible (non-reasoning) portion, and buffers partial tags across chunk
* boundaries so `</th` + `ink>` still parses correctly.
*
* 2. `stripThinkTags()` — whole-text cleanup. Removes closed pairs, unterminated
* opens at block boundaries, and orphan open/close tags. Used for non-streaming
* responses and as a safety net after stream close.
*
* 3. Flush discards buffered partial tags at stream end (leak-averse bias —
*    prefer losing a partial reasoning fragment over leaking it).
*/
const TAG_NAMES = [
'think',
'thinking',
'reasoning',
'thought',
'reasoning_scratchpad',
] as const
const TAG_ALT = TAG_NAMES.join('|')
const OPEN_TAG_RE = new RegExp(`<\\s*(?:${TAG_ALT})\\b[^>]*>`, 'i')
const CLOSE_TAG_RE = new RegExp(`<\\s*/\\s*(?:${TAG_ALT})\\s*>`, 'i')
const CLOSED_PAIR_RE_G = new RegExp(
`<\\s*(${TAG_ALT})\\b[^>]*>[\\s\\S]*?<\\s*/\\s*\\1\\s*>`,
'gi',
)
const UNTERMINATED_OPEN_RE = new RegExp(
`(?:^|\\n)[ \\t]*<\\s*(?:${TAG_ALT})\\b[^>]*>[\\s\\S]*$`,
'i',
)
const ORPHAN_TAG_RE_G = new RegExp(
`<\\s*/?\\s*(?:${TAG_ALT})\\b[^>]*>\\s*`,
'gi',
)
const MAX_PARTIAL_TAG = 64
/**
* Remove reasoning/thinking blocks from a complete text body.
*
* Handles:
* - Closed pairs: <think>...</think> (lazy match, anywhere in text)
* - Unterminated open tags at a block boundary: strips from the tag to end of string
* - Orphan open or close tags (no matching partner)
*
* False-negative bias: prefers leaving a few tag characters in rare edge cases over
* stripping legitimate content.
*/
export function stripThinkTags(text: string): string {
if (!text) return text
let out = text
out = out.replace(CLOSED_PAIR_RE_G, '')
out = out.replace(UNTERMINATED_OPEN_RE, '')
out = out.replace(ORPHAN_TAG_RE_G, '')
return out
}
export interface ThinkTagFilter {
feed(chunk: string): string
flush(): string
isInsideBlock(): boolean
}
/**
* Streaming state machine. Feed deltas, emits visible (non-reasoning) text.
* Handles tags split across chunk boundaries by holding back a short tail buffer
* whenever the current buffer ends with what looks like a partial tag.
*/
export function createThinkTagFilter(): ThinkTagFilter {
let inside = false
let buffer = ''
function findPartialTagStart(s: string): number {
const lastLt = s.lastIndexOf('<')
if (lastLt === -1) return -1
if (s.indexOf('>', lastLt) !== -1) return -1
const tail = s.slice(lastLt)
if (tail.length > MAX_PARTIAL_TAG) return -1
const m = /^<\s*\/?\s*([a-zA-Z_]\w*)?\s*$/.exec(tail)
if (!m) return -1
const partialName = (m[1] ?? '').toLowerCase()
if (!partialName) return lastLt
if (TAG_NAMES.some(name => name.startsWith(partialName))) return lastLt
return -1
}
function feed(chunk: string): string {
if (!chunk) return ''
buffer += chunk
let out = ''
while (buffer.length > 0) {
if (!inside) {
const open = OPEN_TAG_RE.exec(buffer)
if (open) {
out += buffer.slice(0, open.index)
buffer = buffer.slice(open.index + open[0].length)
inside = true
continue
}
const partialStart = findPartialTagStart(buffer)
if (partialStart === -1) {
out += buffer
buffer = ''
} else {
out += buffer.slice(0, partialStart)
buffer = buffer.slice(partialStart)
}
return out
}
const close = CLOSE_TAG_RE.exec(buffer)
if (close) {
buffer = buffer.slice(close.index + close[0].length)
inside = false
continue
}
const partialStart = findPartialTagStart(buffer)
if (partialStart === -1) {
buffer = ''
} else {
buffer = buffer.slice(partialStart)
}
return out
}
return out
}
function flush(): string {
const held = buffer
const wasInside = inside
buffer = ''
inside = false
if (wasInside) return ''
if (!held) return ''
if (/^<\s*\/?\s*[a-zA-Z_]/.test(held)) return ''
return held
}
return { feed, flush, isInsideBlock: () => inside }
}
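A short usage sketch of the two layers together, matching the streaming tests above:

const filter = createThinkTagFilter()
let visible = ''
visible += filter.feed('before<th')               // -> 'before' ('<th' held back as a possible tag prefix)
visible += filter.feed('ink>reason</think>after') // -> 'after'
visible += filter.flush()                         // -> '' (nothing buffered)
// visible === 'beforeafter'; stripThinkTags(visible) is the non-streaming
// safety net and would also remove any orphan tags the filter let through.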

View File

@@ -70,7 +70,7 @@ describe('runAutoFixCheck', () => {
test('handles timeout gracefully', async () => {
const result = await runAutoFixCheck({
lint: 'node -e "setTimeout(() => {}, 10000)"',
timeout: 100,
cwd: '/tmp',

View File

@@ -46,14 +46,31 @@ async function runCommand(
const killTree = () => {
try {
if (isWindows && proc.pid) {
// shell=true on Windows can leave child commands running unless we
// terminate the full process tree.
const killer = spawn('taskkill', ['/pid', String(proc.pid), '/T', '/F'], {
windowsHide: true,
stdio: 'ignore',
})
killer.unref()
return
}
if (proc.pid) {
// Kill the entire process group
process.kill(-proc.pid, 'SIGTERM')
return
}
proc.kill('SIGTERM')
} catch {
// Process may have already exited; fallback to direct child kill.
try {
proc.kill('SIGTERM')
} catch {
// Ignore final fallback errors.
}
}
}
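The process-group kill above (process.kill(-proc.pid, ...)) only reaches the whole tree when the child was spawned into its own group; a hedged sketch of the presumed spawn options (the spawn call itself is not shown in this diff, and `command` stands for the caller's shell command):

import { spawn } from 'node:child_process'

const proc = spawn(command, {
  shell: true,
  // POSIX: give the child its own process group so kill(-pid) hits the tree.
  detached: process.platform !== 'win32',
  stdio: 'pipe',
})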

View File

@@ -16,12 +16,21 @@ describe('getEffectiveContextWindowSize', () => {
// 8k minus 20k summary reservation = -12k, causing infinite auto-compact.
// Now the fallback is 128k and there's a floor, so effective is always
// at least reservedTokensForSummary + buffer.
//
// The exact floor depends on the max-output-tokens slot-reservation cap
// (tengu_otk_slot_v1 GrowthBook flag). With cap enabled, the model's
// default output cap drops to CAPPED_DEFAULT_MAX_TOKENS (8k), so the
// summary reservation is 8k and the floor is 8k + 13k = 21k. With cap
// disabled it's 20k + 13k = 33k. Assert the worst case so the test is
// stable regardless of flag state in CI vs local.
process.env.CLAUDE_CODE_USE_OPENAI = '1'
try {
const effective = getEffectiveContextWindowSize('some-unknown-3p-model')
expect(effective).toBeGreaterThan(0)
// 21k = CAPPED_DEFAULT_MAX_TOKENS (8k) + AUTOCOMPACT_BUFFER_TOKENS (13k).
// Covers the anti-regression intent of issue #635 without assuming
// the GrowthBook flag state.
expect(effective).toBeGreaterThanOrEqual(21_000)
} finally {
delete process.env.CLAUDE_CODE_USE_OPENAI
}

Some files were not shown because too many files have changed in this diff.