Feat/kimi moonshot support (#805)

* feat(provider): first-class Moonshot (Kimi) direct-API support

Moonshot's direct API (api.moonshot.ai/v1) is OpenAI-compatible and works today via the generic OpenAI shim, including the reasoning_content channel that Kimi returns alongside the user-visible content. But the UX was rough: unknown context window triggered the conservative 128k fallback + a warning, and the provider displayed as "Local OpenAI-compatible".

Makes Moonshot a recognized provider:

- src/utils/model/openaiContextWindows.ts: add the Kimi K2 family and moonshot-v1-* variants to both the context-window and max-output tables. Values from Moonshot's model card: K2.6 and K2-thinking are 256K, K2/K2-instruct are 128K, moonshot-v1 sizes are embedded in the model id.
- src/utils/providerDiscovery.ts: recognize the api.moonshot.ai hostname and label it "Moonshot (Kimi)" in the startup banner and provider UI.

Users can now launch with:

    CLAUDE_CODE_USE_OPENAI=1 \
    OPENAI_BASE_URL=https://api.moonshot.ai/v1 \
    OPENAI_API_KEY=sk-... \
    OPENAI_MODEL=kimi-k2.6 \
    openclaude

and get accurate compaction + correct labeling + correct max_tokens out of the box.

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>

* fix(openai-shim): Moonshot API compatibility: max_tokens + strip store

Moonshot's direct API (api.moonshot.ai and api.moonshot.cn) uses the classic OpenAI `max_tokens` parameter, not the newer `max_completion_tokens` that the shim defaults to. It also hasn't published support for `store` and may reject it on strict-parse, the same class of error as Gemini's "Unknown name 'store': Cannot find field" 400.

- Adds isMoonshotBaseUrl() that recognizes both .ai and .cn hosts.
- Converts max_completion_tokens → max_tokens for Moonshot requests (alongside GitHub / Mistral / local providers).
- Strips body.store for Moonshot requests (alongside Mistral / Gemini).

Two shim tests cover both the .ai and .cn hostnames.

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>

* fix: null-safe access on getCachedMCConfig() in external builds

External builds stub src/services/compact/cachedMicrocompact.ts so getCachedMCConfig() returns null, but two call sites still dereferenced config.supportedModels directly. The ?. operator was in the wrong place (config.supportedModels? instead of config?.supportedModels), so the null config threw "Cannot read properties of null (reading 'supportedModels')" on every request.

Reproduces with any external-build provider (notably Kimi/Moonshot just enabled in the sibling commits, but equally DeepSeek, Mistral, Groq, Ollama, etc.):

    ❯ hey
    ⏺ Cannot read properties of null (reading 'supportedModels')

- prompts.ts: early-return from getFunctionResultClearingSection() when config is null, before touching .supportedModels.
- claude.ts: guard the debug-log jsonStringify with ?. so the log line never throws.

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>

* fix(startup): show "Moonshot (Kimi)" on the startup banner

The startup-screen provider detector had regex branches for OpenRouter, DeepSeek, Groq, Together, Azure, etc., but nothing for Moonshot. Remote Moonshot sessions fell through to the generic "OpenAI" label: getLocalOpenAICompatibleProviderLabel() only runs for local URLs, and api.moonshot.ai / api.moonshot.cn are not local.

Adds a Moonshot branch matching /moonshot/ in the base URL OR /kimi/ in the model id. Now launches with:

    OPENAI_BASE_URL=https://api.moonshot.ai/v1
    OPENAI_MODEL=kimi-k2.6

display the Provider row as "Moonshot (Kimi)" instead of "OpenAI".

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>

* refactor(provider): sort preset picker alphabetically; Custom at end

The /provider preset picker was in ad-hoc order (Anthropic, Ollama, OpenAI, then a jumble of third-party / local / codex / Alibaba / custom / nvidia / minimax). Hard to scan when you know the provider name you want.

Sorts the list alphabetically by label A→Z. Pins "Custom" to the end: it's the catch-all / escape hatch, so it's scanned last, not shuffled into the alphabetical run where a user looking for a named provider might grab it by mistake. First-run-only "Skip for now" stays at the very bottom, after Custom.

Test churn:

- ProviderManager.test.tsx: four tests hardcoded press counts (1 or 3 'j' presses) that broke when targets moved. Replaces them with a navigateToPreset(stdin, label) helper driven from a declared PRESET_ORDER array, so future list edits only update the array.
- ConsoleOAuthFlow.test.tsx: the 13-row test frame only renders the first ~13 providers. "Ollama", "OpenAI", "LM Studio" sentinels moved below the fold; swap them for alphabetically-early providers still visible in-frame ("Azure OpenAI", "DeepSeek", "Google Gemini"). Test intent (picker opened with providers listed) is preserved.

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>

---------

Co-authored-by: OpenClaude <openclaude@gitlawb.com>
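
The null-safety fix described in the commit message turns on where the `?.` sits. A minimal TypeScript sketch of the difference follows. It assumes a hypothetical `McConfig` type and a `getConfigStub()` stand-in that returns null the way the external-build stub is described as doing; neither name comes from the repository, and only the `supportedModels` field is taken from the message.

```typescript
// Hypothetical shape; only `supportedModels` is named in the commit message.
type McConfig = { supportedModels?: string[] }

// Stand-in (assumption) for the stubbed getter in external builds.
function getConfigStub(): McConfig | null {
  return null
}

// Guards the property, not the object: throws a TypeError when config is null.
function unsafeIsSupported(config: McConfig | null, model: string): boolean {
  // The cast mimics a call site that assumed config was always present.
  return (config as McConfig).supportedModels?.some(p => model.includes(p)) ?? false
}

// Guards the object itself: a null config simply yields false.
function safeIsSupported(config: McConfig | null, model: string): boolean {
  return config?.supportedModels?.some(p => model.includes(p)) ?? false
}
```

Calling `unsafeIsSupported(getConfigStub(), 'kimi-k2.6')` throws, while `safeIsSupported` with the same arguments returns `false`, which is roughly the behavior the two patched call sites aim for.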
2026-04-21 21:20:54 +08:00
parent 2b15e16421
commit b95d2221df
11 changed files with 226 additions and 72 deletions
--- a/src/components/ConsoleOAuthFlow.test.tsx
+++ b/src/components/ConsoleOAuthFlow.test.tsx
@@ -112,8 +112,10 @@ test('third-party provider branch opens the first-run provider manager', async (
  )

  expect(output).toContain('Set up provider')
+  // Use alphabetically-early sentinels so they remain visible in the
+  // 13-row test frame after the provider list was sorted A→Z.
  expect(output).toContain('Anthropic')
-  expect(output).toContain('OpenAI')
-  expect(output).toContain('Ollama')
-  expect(output).toContain('LM Studio')
+  expect(output).toContain('Azure OpenAI')
+  expect(output).toContain('DeepSeek')
+  expect(output).toContain('Google Gemini')
 })
--- a/src/components/ProviderManager.test.tsx
+++ b/src/components/ProviderManager.test.tsx
@@ -97,6 +97,46 @@ async function waitForCondition(
  throw new Error('Timed out waiting for ProviderManager test condition')
 }

+// Provider list is sorted alphabetically by label in the preset picker, so
+// reaching a given provider takes more keypresses than it used to. Keep the
+// target-by-label indirection here so these tests survive future list edits
+// without further churn.
+//
+// Order matches ProviderManager.renderPresetSelection() when
+// canUseCodexOAuth === true (default in mocked tests).
+const PRESET_ORDER = [
+  'Alibaba Coding Plan',
+  'Alibaba Coding Plan (China)',
+  'Anthropic',
+  'Azure OpenAI',
+  'Codex OAuth',
+  'DeepSeek',
+  'Google Gemini',
+  'Groq',
+  'LM Studio',
+  'MiniMax',
+  'Mistral',
+  'Moonshot AI',
+  'NVIDIA NIM',
+  'Ollama',
+  'OpenAI',
+  'OpenRouter',
+  'Together AI',
+  'Custom',
+] as const
+
+async function navigateToPreset(
+  stdin: { write: (data: string) => void },
+  label: (typeof PRESET_ORDER)[number],
+): Promise<void> {
+  const index = PRESET_ORDER.indexOf(label)
+  if (index < 0) throw new Error(`Unknown preset label: ${label}`)
+  for (let i = 0; i < index; i++) {
+    stdin.write('j')
+    await Bun.sleep(25)
+  }
+}
+
 function createDeferred<T>(): {
  promise: Promise<T>
  resolve: (value: T) => void
@@ -491,11 +531,10 @@ test('ProviderManager first-run Ollama preset auto-detects installed models', as

  await waitForFrameOutput(
    mounted.getOutput,
-    frame => frame.includes('Set up provider') && frame.includes('Ollama'),
+    frame => frame.includes('Set up provider'),
  )

-  mounted.stdin.write('j')
-  await Bun.sleep(50)
+  await navigateToPreset(mounted.stdin, 'Ollama')
  mounted.stdin.write('\r')

  const modelFrame = await waitForFrameOutput(
@@ -590,12 +629,7 @@ test('ProviderManager first-run Codex OAuth switches the current session after l
    frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
  )

-  mounted.stdin.write('j')
-  await Bun.sleep(25)
-  mounted.stdin.write('j')
-  await Bun.sleep(25)
-  mounted.stdin.write('j')
-  await Bun.sleep(25)
+  await navigateToPreset(mounted.stdin, 'Codex OAuth')
  mounted.stdin.write('\r')

  await waitForCondition(() => onDone.mock.calls.length > 0)
@@ -687,12 +721,7 @@ test('ProviderManager first-run Codex OAuth reports next-startup fallback when s
    frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
  )

-  mounted.stdin.write('j')
-  await Bun.sleep(25)
-  mounted.stdin.write('j')
-  await Bun.sleep(25)
-  mounted.stdin.write('j')
-  await Bun.sleep(25)
+  await navigateToPreset(mounted.stdin, 'Codex OAuth')
  mounted.stdin.write('\r')

  await waitForCondition(() => onDone.mock.calls.length > 0)
@@ -786,12 +815,7 @@ test('ProviderManager does not hijack a manual Codex profile when OAuth credenti
    frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
  )

-  mounted.stdin.write('j')
-  await Bun.sleep(25)
-  mounted.stdin.write('j')
-  await Bun.sleep(25)
-  mounted.stdin.write('j')
-  await Bun.sleep(25)
+  await navigateToPreset(mounted.stdin, 'Codex OAuth')
  mounted.stdin.write('\r')

  await waitForCondition(() => onDone.mock.calls.length > 0)
--- a/src/components/ProviderManager.tsx
+++ b/src/components/ProviderManager.tsx
@@ -1094,21 +1094,30 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {

  function renderPresetSelection(): React.ReactNode {
    const canUseCodexOAuth = !isBareMode()
+    // Providers sorted alphabetically by label. `Custom` is pinned to the end
+    // because it's the catch-all / escape hatch — users scanning the list
+    // should always find known providers first. `Skip for now` (first-run
+    // only) comes last, after Custom.
    const options = [
+      {
+        value: 'dashscope-intl',
+        label: 'Alibaba Coding Plan',
+        description: 'Alibaba DashScope International endpoint',
+      },
+      {
+        value: 'dashscope-cn',
+        label: 'Alibaba Coding Plan (China)',
+        description: 'Alibaba DashScope China endpoint',
+      },
      {
        value: 'anthropic',
        label: 'Anthropic',
        description: 'Native Claude API (x-api-key auth)',
      },
      {
-        value: 'ollama',
-        label: 'Ollama',
-        description: 'Local or remote Ollama endpoint',
-      },
-      {
-        value: 'openai',
-        label: 'OpenAI',
-        description: 'OpenAI API with API key',
+        value: 'azure-openai',
+        label: 'Azure OpenAI',
+        description: 'Azure OpenAI endpoint (model=deployment name)',
      },
      ...(canUseCodexOAuth
        ? [
@@ -1120,11 +1129,6 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
            },
          ]
        : []),
-      {
-        value: 'moonshotai',
-        label: 'Moonshot AI',
-        description: 'Kimi OpenAI-compatible endpoint',
-      },
      {
        value: 'deepseek',
        label: 'DeepSeek',
@@ -1135,50 +1139,30 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
        label: 'Google Gemini',
        description: 'Gemini OpenAI-compatible endpoint',
      },
-      {
-        value: 'together',
-        label: 'Together AI',
-        description: 'Together chat/completions endpoint',
-      },
      {
        value: 'groq',
        label: 'Groq',
        description: 'Groq OpenAI-compatible endpoint',
      },
-      {
-        value: 'mistral',
-        label: 'Mistral',
-        description: 'Mistral OpenAI-compatible endpoint',
-      },
-      {
-        value: 'azure-openai',
-        label: 'Azure OpenAI',
-        description: 'Azure OpenAI endpoint (model=deployment name)',
-      },
-      {
-        value: 'openrouter',
-        label: 'OpenRouter',
-        description: 'OpenRouter OpenAI-compatible endpoint',
-      },
      {
        value: 'lmstudio',
        label: 'LM Studio',
        description: 'Local LM Studio endpoint',
      },
      {
-        value: 'dashscope-cn',
-        label: 'Alibaba Coding Plan (China)',
-        description: 'Alibaba DashScope China endpoint',
+        value: 'minimax',
+        label: 'MiniMax',
+        description: 'MiniMax API endpoint',
      },
      {
-        value: 'dashscope-intl',
-        label: 'Alibaba Coding Plan',
-        description: 'Alibaba DashScope International endpoint',
+        value: 'mistral',
+        label: 'Mistral',
+        description: 'Mistral OpenAI-compatible endpoint',
      },
      {
-        value: 'custom',
-        label: 'Custom',
-        description: 'Any OpenAI-compatible provider',
+        value: 'moonshotai',
+        label: 'Moonshot AI',
+        description: 'Kimi OpenAI-compatible endpoint',
      },
      {
        value: 'nvidia-nim',
@@ -1186,9 +1170,29 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
        description: 'NVIDIA NIM endpoint',
      },
      {
-        value: 'minimax',
-        label: 'MiniMax',
-        description: 'MiniMax API endpoint',
+        value: 'ollama',
+        label: 'Ollama',
+        description: 'Local or remote Ollama endpoint',
+      },
+      {
+        value: 'openai',
+        label: 'OpenAI',
+        description: 'OpenAI API with API key',
+      },
+      {
+        value: 'openrouter',
+        label: 'OpenRouter',
+        description: 'OpenRouter OpenAI-compatible endpoint',
+      },
+      {
+        value: 'together',
+        label: 'Together AI',
+        description: 'Together chat/completions endpoint',
+      },
+      {
+        value: 'custom',
+        label: 'Custom',
+        description: 'Any OpenAI-compatible provider',
      },
      ...(mode === 'first-run'
        ? [
--- a/src/components/StartupScreen.ts
+++ b/src/components/StartupScreen.ts
@@ -123,6 +123,8 @@ function detectProvider(): { name: string; model: string; baseUrl: string; isLoc
      name = 'MiniMax'
    else if (resolvedRequest.transport === 'codex_responses' || baseUrl.includes('chatgpt.com/backend-api/codex'))
      name = 'Codex'
+    else if (/moonshot/i.test(baseUrl) || /kimi/i.test(rawModel))
+      name = 'Moonshot (Kimi)'
    else if (/deepseek/i.test(baseUrl) || /deepseek/i.test(rawModel))
      name = 'DeepSeek'
    else if (/openrouter/i.test(baseUrl))
--- a/src/constants/prompts.ts
+++ b/src/constants/prompts.ts
@@ -823,6 +823,11 @@ function getFunctionResultClearingSection(model: string): string | null {
    return null
  }
  const config = getCachedMCConfigForFRC()
+  if (!config) {
+    // External/stub builds return null from getCachedMCConfig — abort the
+    // section rather than trying to read .supportedModels off null.
+    return null
+  }
  const isModelSupported = config.supportedModels?.some(pattern =>
    model.includes(pattern),
  )
--- a/src/services/api/claude.ts
+++ b/src/services/api/claude.ts
@@ -1217,7 +1217,7 @@ async function* queryModel(
    cachedMCEnabled = featureEnabled && modelSupported
    const config = getCachedMCConfig()
    logForDebugging(
-      `Cached MC gate: enabled=${featureEnabled} modelSupported=${modelSupported} model=${options.model} supportedModels=${jsonStringify(config.supportedModels)}`,
+      `Cached MC gate: enabled=${featureEnabled} modelSupported=${modelSupported} model=${options.model} supportedModels=${jsonStringify(config?.supportedModels)}`,
    )
  }

--- a/src/services/api/openaiShim.test.ts
+++ b/src/services/api/openaiShim.test.ts
@@ -3308,3 +3308,69 @@ test('injects semantic assistant message when tool result is followed by user me
  expect(semanticMsg.role).toBe('assistant')
  expect(semanticMsg.content).toBe('[Tool execution interrupted by user]')
 })
+
+test('Moonshot: uses max_tokens (not max_completion_tokens) and strips store', async () => {
+  process.env.OPENAI_BASE_URL = 'https://api.moonshot.ai/v1'
+  process.env.OPENAI_API_KEY = 'sk-moonshot-test'
+
+  let requestBody: Record<string, unknown> | undefined
+  globalThis.fetch = (async (_input, init) => {
+    requestBody = JSON.parse(String(init?.body))
+    return new Response(
+      JSON.stringify({
+        id: 'chatcmpl-1',
+        model: 'kimi-k2.6',
+        choices: [
+          { message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
+        ],
+        usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
+      }),
+      { headers: { 'Content-Type': 'application/json' } },
+    )
+  }) as FetchType
+
+  const client = createOpenAIShimClient({}) as OpenAIShimClient
+  await client.beta.messages.create({
+    model: 'kimi-k2.6',
+    system: 'you are kimi',
+    messages: [{ role: 'user', content: 'hi' }],
+    max_tokens: 256,
+    stream: false,
+  })
+
+  expect(requestBody?.max_tokens).toBe(256)
+  expect(requestBody?.max_completion_tokens).toBeUndefined()
+  expect(requestBody?.store).toBeUndefined()
+})
+
+test('Moonshot: cn host is also detected', async () => {
+  process.env.OPENAI_BASE_URL = 'https://api.moonshot.cn/v1'
+  process.env.OPENAI_API_KEY = 'sk-moonshot-test'
+
+  let requestBody: Record<string, unknown> | undefined
+  globalThis.fetch = (async (_input, init) => {
+    requestBody = JSON.parse(String(init?.body))
+    return new Response(
+      JSON.stringify({
+        id: 'chatcmpl-1',
+        model: 'kimi-k2.6',
+        choices: [
+          { message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
+        ],
+        usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
+      }),
+      { headers: { 'Content-Type': 'application/json' } },
+    )
+  }) as FetchType
+
+  const client = createOpenAIShimClient({}) as OpenAIShimClient
+  await client.beta.messages.create({
+    model: 'kimi-k2.6',
+    system: 'you are kimi',
+    messages: [{ role: 'user', content: 'hi' }],
+    max_tokens: 256,
+    stream: false,
+  })
+
+  expect(requestBody?.store).toBeUndefined()
+})
--- a/src/services/api/openaiShim.ts
+++ b/src/services/api/openaiShim.ts
@@ -82,6 +82,10 @@ const GITHUB_429_MAX_RETRIES = 3
 const GITHUB_429_BASE_DELAY_SEC = 1
 const GITHUB_429_MAX_DELAY_SEC = 32
 const GEMINI_API_HOST = 'generativelanguage.googleapis.com'
+const MOONSHOT_API_HOSTS = new Set([
+  'api.moonshot.ai',
+  'api.moonshot.cn',
+])

 const COPILOT_HEADERS: Record<string, string> = {
  'User-Agent': 'GitHubCopilotChat/0.26.7',
@@ -147,6 +151,15 @@ function hasGeminiApiHost(baseUrl: string | undefined): boolean {
  }
 }

+function isMoonshotBaseUrl(baseUrl: string | undefined): boolean {
+  if (!baseUrl) return false
+  try {
+    return MOONSHOT_API_HOSTS.has(new URL(baseUrl).hostname.toLowerCase())
+  } catch {
+    return false
+  }
+}
+
 function formatRetryAfterHint(response: Response): string {
  const ra = response.headers.get('retry-after')
  return ra ? ` (Retry-After: ${ra})` : ''
@@ -1447,14 +1460,19 @@ class OpenAIShimMessages {
    const isGithubCopilot = isGithub && githubEndpointType === 'copilot'
    const isGithubModels = isGithub && (githubEndpointType === 'models' || githubEndpointType === 'custom')

-    if ((isGithub || isMistral || isLocal) && body.max_completion_tokens !== undefined) {
+    const isMoonshot = isMoonshotBaseUrl(request.baseUrl)
+
+    if ((isGithub || isMistral || isLocal || isMoonshot) && body.max_completion_tokens !== undefined) {
      body.max_tokens = body.max_completion_tokens
      delete body.max_completion_tokens
    }

    // mistral and gemini don't recognize body.store — Gemini returns 400
    // "Invalid JSON payload received. Unknown name 'store': Cannot find field."
-    if (isMistral || isGeminiMode()) {
+    // Moonshot (api.moonshot.ai/.cn) has not published support for the
+    // parameter either; strip it preemptively to avoid the same class of
+    // error on strict-parse providers.
+    if (isMistral || isGeminiMode() || isMoonshot) {
      delete body.store
    }
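
The Moonshot handling in this file can be read as a small, pure transformation of the request body. The sketch below restates it in isolation, as an illustration only: `isMoonshotHost` and `normalizeForMoonshot` are hypothetical names, the function returns a copy rather than mutating `body` as the shim does, and only the two hostnames are taken from the diff.

```typescript
// Hostnames taken from the diff; everything else here is illustrative.
const MOONSHOT_HOSTS = new Set(['api.moonshot.ai', 'api.moonshot.cn'])

function isMoonshotHost(baseUrl: string | undefined): boolean {
  if (!baseUrl) return false
  try {
    // Exact hostname match, so unrelated hosts containing "moonshot" are ignored.
    return MOONSHOT_HOSTS.has(new URL(baseUrl).hostname.toLowerCase())
  } catch {
    return false // malformed URL: treat as not Moonshot
  }
}

// Hypothetical helper: returns an adjusted copy of the body for Moonshot hosts.
function normalizeForMoonshot(
  baseUrl: string | undefined,
  body: Record<string, unknown>,
): Record<string, unknown> {
  if (!isMoonshotHost(baseUrl)) return body
  const out: Record<string, unknown> = { ...body }
  if (out.max_completion_tokens !== undefined) {
    // Moonshot expects the classic parameter name.
    out.max_tokens = out.max_completion_tokens
    delete out.max_completion_tokens
  }
  // `store` is stripped preemptively, as in the diff.
  delete out.store
  return out
}
```

With these assumptions, a body of `{ max_completion_tokens: 256, store: true }` sent to `https://api.moonshot.cn/v1` becomes `{ max_tokens: 256 }`, and bodies for any other host pass through unchanged.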

--- a/src/utils/model/openaiContextWindows.ts
+++ b/src/utils/model/openaiContextWindows.ts
@@ -219,6 +219,17 @@ const OPENAI_CONTEXT_WINDOWS: Record<string, number> = {
  'kimi-k2.5':                262_144,
  'glm-5':                    202_752,
  'glm-4.7':                  202_752,
+
+  // Moonshot AI direct API (api.moonshot.ai/v1). Values from Moonshot's model
+  // card: K2.6 and K2-thinking are 256K, K2/K2-instruct are 128K. Prefix
+  // matching in lookupByKey catches variants like "kimi-k2.6-preview".
+  'kimi-k2.6':                262_144,
+  'kimi-k2':                  131_072,
+  'kimi-k2-instruct':         131_072,
+  'kimi-k2-thinking':         262_144,
+  'moonshot-v1-8k':             8_192,
+  'moonshot-v1-32k':           32_768,
+  'moonshot-v1-128k':         131_072,
 }

 /**
@@ -391,6 +402,15 @@ const OPENAI_MAX_OUTPUT_TOKENS: Record<string, number> = {
  'kimi-k2.5':                 32_768,
  'glm-5':                     16_384,
  'glm-4.7':                   16_384,
+
+  // Moonshot AI direct API
+  'kimi-k2.6':                 32_768,
+  'kimi-k2':                   32_768,
+  'kimi-k2-instruct':          32_768,
+  'kimi-k2-thinking':          32_768,
+  'moonshot-v1-8k':             4_096,
+  'moonshot-v1-32k':           16_384,
+  'moonshot-v1-128k':          32_768,
 }

 function lookupByModel<T>(table: Record<string, T>, model: string): T | undefined {
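
The table comment above relies on prefix matching to resolve variants such as "kimi-k2.6-preview". The body of the real lookup is not shown in this diff, so the following is only a sketch of one way such a lookup could behave, assuming a longest-prefix rule; `lookupLongestPrefix` is a hypothetical name. The longest-prefix rule matters here because `kimi-k2` is itself a prefix of `kimi-k2.6` and `kimi-k2-thinking`, which carry different values.

```typescript
// Subset of the context-window entries added in this diff.
const CONTEXT_WINDOWS: Record<string, number> = {
  'kimi-k2.6': 262_144,
  'kimi-k2': 131_072,
  'kimi-k2-thinking': 262_144,
  'moonshot-v1-8k': 8_192,
}

// Hypothetical lookup: picks the longest key that is a prefix of the model id.
// An exact match is covered too, since a string is a prefix of itself.
function lookupLongestPrefix<T>(table: Record<string, T>, model: string): T | undefined {
  let best: string | undefined
  for (const key of Object.keys(table)) {
    if (model.startsWith(key) && (best === undefined || key.length > best.length)) {
      best = key
    }
  }
  return best === undefined ? undefined : table[best]
}
```

Under this rule, `kimi-k2.6-preview` resolves through the `kimi-k2.6` entry rather than the shorter `kimi-k2`, and an id with no matching prefix returns `undefined` so the caller can fall back to its default.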
--- a/src/utils/providerDiscovery.test.ts
+++ b/src/utils/providerDiscovery.test.ts
@@ -81,6 +81,15 @@ test('detects common local openai-compatible providers by hostname', async () =>
  ).toBe('vLLM')
 })

+test('detects Moonshot (Kimi) from api.moonshot.ai hostname', async () => {
+  const { getLocalOpenAICompatibleProviderLabel } =
+    await loadProviderDiscoveryModule()
+
+  expect(
+    getLocalOpenAICompatibleProviderLabel('https://api.moonshot.ai/v1'),
+  ).toBe('Moonshot (Kimi)')
+})
+
 test('falls back to a generic local openai-compatible label', async () => {
  const { getLocalOpenAICompatibleProviderLabel } =
    await loadProviderDiscoveryModule()
--- a/src/utils/providerDiscovery.ts
+++ b/src/utils/providerDiscovery.ts
@@ -197,6 +197,10 @@ export function getLocalOpenAICompatibleProviderLabel(baseUrl?: string): string
    if (host.includes('minimax') || haystack.includes('minimax')) {
      return 'MiniMax'
    }
+    // Moonshot AI (Kimi) direct API
+    if (host.includes('moonshot') || haystack.includes('moonshot') || haystack.includes('kimi')) {
+      return 'Moonshot (Kimi)'
+    }
  } catch {
    // Fall back to the generic label when the base URL is malformed.
  }