Feat/kimi moonshot support (#805)

* feat(provider): first-class Moonshot (Kimi) direct-API support

Moonshot's direct API (api.moonshot.ai/v1) is OpenAI-compatible and works
today via the generic OpenAI shim, including the reasoning_content channel
that Kimi returns alongside the user-visible content. But the UX was rough:
unknown context window triggered the conservative 128k fallback + a warning,
and the provider displayed as "Local OpenAI-compatible".

Makes Moonshot a recognized provider:

- src/utils/model/openaiContextWindows.ts: add the Kimi K2 family and
  moonshot-v1-* variants to both the context-window and max-output tables.
  Values from Moonshot's model card — K2.6 and K2-thinking are 256K,
  K2/K2-instruct are 128K, moonshot-v1 sizes are embedded in the model id.
- src/utils/providerDiscovery.ts: recognize the api.moonshot.ai hostname
  and label it "Moonshot (Kimi)" in the startup banner and provider UI.

Users can now launch with:

  CLAUDE_CODE_USE_OPENAI=1 \
  OPENAI_BASE_URL=https://api.moonshot.ai/v1 \
  OPENAI_API_KEY=sk-... \
  OPENAI_MODEL=kimi-k2.6 \
  openclaude

and get accurate compaction + correct labeling + correct max_tokens out
of the box.

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>

* fix(openai-shim): Moonshot API compatibility — max_tokens + strip store

Moonshot's direct API (api.moonshot.ai and api.moonshot.cn) uses the
classic OpenAI `max_tokens` parameter, not the newer `max_completion_tokens`
that the shim defaults to. It also hasn't published support for `store`
and may reject it on strict-parse — same class of error as Gemini's
"Unknown name 'store': Cannot find field" 400.

- Adds isMoonshotBaseUrl() that recognizes both .ai and .cn hosts.
- Converts max_completion_tokens → max_tokens for Moonshot requests
  (alongside GitHub / Mistral / local providers).
- Strips body.store for Moonshot requests (alongside Mistral / Gemini).

Two shim tests cover both the .ai and .cn hostnames.

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>

* fix: null-safe access on getCachedMCConfig() in external builds

External builds stub src/services/compact/cachedMicrocompact.ts so
getCachedMCConfig() returns null, but two call sites still dereferenced
config.supportedModels directly. The ?. operator was in the wrong place
(config.supportedModels? instead of config?.supportedModels), so the null
config threw "Cannot read properties of null (reading 'supportedModels')"
on every request.

Reproduces with any external-build provider (notably Kimi/Moonshot just
enabled in the sibling commits, but equally DeepSeek, Mistral, Groq,
Ollama, etc.):

  ❯ hey
  ⏺ Cannot read properties of null (reading 'supportedModels')

- prompts.ts: early-return from getFunctionResultClearingSection() when
  config is null, before touching .supportedModels.
- claude.ts: guard the debug-log jsonStringify with ?. so the log line
  never throws.

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>

* fix(startup): show "Moonshot (Kimi)" on the startup banner

The startup-screen provider detector had regex branches for OpenRouter,
DeepSeek, Groq, Together, Azure, etc., but nothing for Moonshot. Remote
Moonshot sessions fell through to the generic "OpenAI" label —
getLocalOpenAICompatibleProviderLabel() only runs for local URLs, and
api.moonshot.ai / api.moonshot.cn are not local.

Adds a Moonshot branch matching /moonshot/ in the base URL OR /kimi/ in
the model id. Now launches with:

  OPENAI_BASE_URL=https://api.moonshot.ai/v1 OPENAI_MODEL=kimi-k2.6

display the Provider row as "Moonshot (Kimi)" instead of "OpenAI".

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>

* refactor(provider): sort preset picker alphabetically; Custom at end

The /provider preset picker was in ad-hoc order (Anthropic, Ollama,
OpenAI, then a jumble of third-party / local / codex / Alibaba / custom /
nvidia / minimax). Hard to scan when you know the provider name you want.

Sorts the list alphabetically by label A→Z. Pins "Custom" to the end —
it's the catch-all / escape hatch so it's scanned last, not shuffled into
the alphabetical run where a user looking for a named provider might
grab it by mistake. First-run-only "Skip for now" stays at the very
bottom, after Custom.

Test churn:
- ProviderManager.test.tsx: four tests hardcoded press counts (1 or 3 'j'
  presses) that broke when targets moved. Replaces them with a
  navigateToPreset(stdin, label) helper driven from a declared
  PRESET_ORDER array, so future list edits only update the array.
- ConsoleOAuthFlow.test.tsx: the 13-row test frame only renders the first
  ~13 providers. "Ollama", "OpenAI", "LM Studio" sentinels moved below
  the fold; swap them for alphabetically-early providers still visible
  in-frame ("Azure OpenAI", "DeepSeek", "Google Gemini"). Test intent
  (picker opened with providers listed) is preserved.

Co-Authored-By: OpenClaude <openclaude@gitlawb.com>

---------

Co-authored-by: OpenClaude <openclaude@gitlawb.com>
This commit is contained in:
Kevin Codex
2026-04-21 21:20:54 +08:00
committed by GitHub
parent 2b15e16421
commit b95d2221df
11 changed files with 226 additions and 72 deletions

View File

@@ -112,8 +112,10 @@ test('third-party provider branch opens the first-run provider manager', async (
)
expect(output).toContain('Set up provider')
// Use alphabetically-early sentinels so they remain visible in the
// 13-row test frame after the provider list was sorted A→Z.
expect(output).toContain('Anthropic')
expect(output).toContain('OpenAI')
expect(output).toContain('Ollama')
expect(output).toContain('LM Studio')
expect(output).toContain('Azure OpenAI')
expect(output).toContain('DeepSeek')
expect(output).toContain('Google Gemini')
})

View File

@@ -97,6 +97,46 @@ async function waitForCondition(
throw new Error('Timed out waiting for ProviderManager test condition')
}
// Provider list is sorted alphabetically by label in the preset picker, so
// reaching a given provider takes more keypresses than it used to. Keep the
// target-by-label indirection here so these tests survive future list edits
// without further churn.
//
// Order matches ProviderManager.renderPresetSelection() when
// canUseCodexOAuth === true (default in mocked tests).
const PRESET_ORDER = [
'Alibaba Coding Plan',
'Alibaba Coding Plan (China)',
'Anthropic',
'Azure OpenAI',
'Codex OAuth',
'DeepSeek',
'Google Gemini',
'Groq',
'LM Studio',
'MiniMax',
'Mistral',
'Moonshot AI',
'NVIDIA NIM',
'Ollama',
'OpenAI',
'OpenRouter',
'Together AI',
'Custom',
] as const
async function navigateToPreset(
stdin: { write: (data: string) => void },
label: (typeof PRESET_ORDER)[number],
): Promise<void> {
const index = PRESET_ORDER.indexOf(label)
if (index < 0) throw new Error(`Unknown preset label: ${label}`)
for (let i = 0; i < index; i++) {
stdin.write('j')
await Bun.sleep(25)
}
}
function createDeferred<T>(): {
promise: Promise<T>
resolve: (value: T) => void
@@ -491,11 +531,10 @@ test('ProviderManager first-run Ollama preset auto-detects installed models', as
await waitForFrameOutput(
mounted.getOutput,
frame => frame.includes('Set up provider') && frame.includes('Ollama'),
frame => frame.includes('Set up provider'),
)
mounted.stdin.write('j')
await Bun.sleep(50)
await navigateToPreset(mounted.stdin, 'Ollama')
mounted.stdin.write('\r')
const modelFrame = await waitForFrameOutput(
@@ -590,12 +629,7 @@ test('ProviderManager first-run Codex OAuth switches the current session after l
frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
await navigateToPreset(mounted.stdin, 'Codex OAuth')
mounted.stdin.write('\r')
await waitForCondition(() => onDone.mock.calls.length > 0)
@@ -687,12 +721,7 @@ test('ProviderManager first-run Codex OAuth reports next-startup fallback when s
frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
await navigateToPreset(mounted.stdin, 'Codex OAuth')
mounted.stdin.write('\r')
await waitForCondition(() => onDone.mock.calls.length > 0)
@@ -786,12 +815,7 @@ test('ProviderManager does not hijack a manual Codex profile when OAuth credenti
frame => frame.includes('Set up provider') && frame.includes('Codex OAuth'),
)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
mounted.stdin.write('j')
await Bun.sleep(25)
await navigateToPreset(mounted.stdin, 'Codex OAuth')
mounted.stdin.write('\r')
await waitForCondition(() => onDone.mock.calls.length > 0)

View File

@@ -1094,21 +1094,30 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
function renderPresetSelection(): React.ReactNode {
const canUseCodexOAuth = !isBareMode()
// Providers sorted alphabetically by label. `Custom` is pinned to the end
// because it's the catch-all / escape hatch — users scanning the list
// should always find known providers first. `Skip for now` (first-run
// only) comes last, after Custom.
const options = [
{
value: 'dashscope-intl',
label: 'Alibaba Coding Plan',
description: 'Alibaba DashScope International endpoint',
},
{
value: 'dashscope-cn',
label: 'Alibaba Coding Plan (China)',
description: 'Alibaba DashScope China endpoint',
},
{
value: 'anthropic',
label: 'Anthropic',
description: 'Native Claude API (x-api-key auth)',
},
{
value: 'ollama',
label: 'Ollama',
description: 'Local or remote Ollama endpoint',
},
{
value: 'openai',
label: 'OpenAI',
description: 'OpenAI API with API key',
value: 'azure-openai',
label: 'Azure OpenAI',
description: 'Azure OpenAI endpoint (model=deployment name)',
},
...(canUseCodexOAuth
? [
@@ -1120,11 +1129,6 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
},
]
: []),
{
value: 'moonshotai',
label: 'Moonshot AI',
description: 'Kimi OpenAI-compatible endpoint',
},
{
value: 'deepseek',
label: 'DeepSeek',
@@ -1135,50 +1139,30 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
label: 'Google Gemini',
description: 'Gemini OpenAI-compatible endpoint',
},
{
value: 'together',
label: 'Together AI',
description: 'Together chat/completions endpoint',
},
{
value: 'groq',
label: 'Groq',
description: 'Groq OpenAI-compatible endpoint',
},
{
value: 'mistral',
label: 'Mistral',
description: 'Mistral OpenAI-compatible endpoint',
},
{
value: 'azure-openai',
label: 'Azure OpenAI',
description: 'Azure OpenAI endpoint (model=deployment name)',
},
{
value: 'openrouter',
label: 'OpenRouter',
description: 'OpenRouter OpenAI-compatible endpoint',
},
{
value: 'lmstudio',
label: 'LM Studio',
description: 'Local LM Studio endpoint',
},
{
value: 'dashscope-cn',
label: 'Alibaba Coding Plan (China)',
description: 'Alibaba DashScope China endpoint',
value: 'minimax',
label: 'MiniMax',
description: 'MiniMax API endpoint',
},
{
value: 'dashscope-intl',
label: 'Alibaba Coding Plan',
description: 'Alibaba DashScope International endpoint',
value: 'mistral',
label: 'Mistral',
description: 'Mistral OpenAI-compatible endpoint',
},
{
value: 'custom',
label: 'Custom',
description: 'Any OpenAI-compatible provider',
value: 'moonshotai',
label: 'Moonshot AI',
description: 'Kimi OpenAI-compatible endpoint',
},
{
value: 'nvidia-nim',
@@ -1186,9 +1170,29 @@ export function ProviderManager({ mode, onDone }: Props): React.ReactNode {
description: 'NVIDIA NIM endpoint',
},
{
value: 'minimax',
label: 'MiniMax',
description: 'MiniMax API endpoint',
value: 'ollama',
label: 'Ollama',
description: 'Local or remote Ollama endpoint',
},
{
value: 'openai',
label: 'OpenAI',
description: 'OpenAI API with API key',
},
{
value: 'openrouter',
label: 'OpenRouter',
description: 'OpenRouter OpenAI-compatible endpoint',
},
{
value: 'together',
label: 'Together AI',
description: 'Together chat/completions endpoint',
},
{
value: 'custom',
label: 'Custom',
description: 'Any OpenAI-compatible provider',
},
...(mode === 'first-run'
? [

View File

@@ -123,6 +123,8 @@ function detectProvider(): { name: string; model: string; baseUrl: string; isLoc
name = 'MiniMax'
else if (resolvedRequest.transport === 'codex_responses' || baseUrl.includes('chatgpt.com/backend-api/codex'))
name = 'Codex'
else if (/moonshot/i.test(baseUrl) || /kimi/i.test(rawModel))
name = 'Moonshot (Kimi)'
else if (/deepseek/i.test(baseUrl) || /deepseek/i.test(rawModel))
name = 'DeepSeek'
else if (/openrouter/i.test(baseUrl))

View File

@@ -823,6 +823,11 @@ function getFunctionResultClearingSection(model: string): string | null {
return null
}
const config = getCachedMCConfigForFRC()
if (!config) {
// External/stub builds return null from getCachedMCConfig — abort the
// section rather than trying to read .supportedModels off null.
return null
}
const isModelSupported = config.supportedModels?.some(pattern =>
model.includes(pattern),
)

View File

@@ -1217,7 +1217,7 @@ async function* queryModel(
cachedMCEnabled = featureEnabled && modelSupported
const config = getCachedMCConfig()
logForDebugging(
`Cached MC gate: enabled=${featureEnabled} modelSupported=${modelSupported} model=${options.model} supportedModels=${jsonStringify(config.supportedModels)}`,
`Cached MC gate: enabled=${featureEnabled} modelSupported=${modelSupported} model=${options.model} supportedModels=${jsonStringify(config?.supportedModels)}`,
)
}

View File

@@ -3308,3 +3308,69 @@ test('injects semantic assistant message when tool result is followed by user me
expect(semanticMsg.role).toBe('assistant')
expect(semanticMsg.content).toBe('[Tool execution interrupted by user]')
})
test('Moonshot: uses max_tokens (not max_completion_tokens) and strips store', async () => {
process.env.OPENAI_BASE_URL = 'https://api.moonshot.ai/v1'
process.env.OPENAI_API_KEY = 'sk-moonshot-test'
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'kimi-k2.6',
choices: [
{ message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
],
usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
}),
{ headers: { 'Content-Type': 'application/json' } },
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'kimi-k2.6',
system: 'you are kimi',
messages: [{ role: 'user', content: 'hi' }],
max_tokens: 256,
stream: false,
})
expect(requestBody?.max_tokens).toBe(256)
expect(requestBody?.max_completion_tokens).toBeUndefined()
expect(requestBody?.store).toBeUndefined()
})
test('Moonshot: cn host is also detected', async () => {
process.env.OPENAI_BASE_URL = 'https://api.moonshot.cn/v1'
process.env.OPENAI_API_KEY = 'sk-moonshot-test'
let requestBody: Record<string, unknown> | undefined
globalThis.fetch = (async (_input, init) => {
requestBody = JSON.parse(String(init?.body))
return new Response(
JSON.stringify({
id: 'chatcmpl-1',
model: 'kimi-k2.6',
choices: [
{ message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' },
],
usage: { prompt_tokens: 3, completion_tokens: 1, total_tokens: 4 },
}),
{ headers: { 'Content-Type': 'application/json' } },
)
}) as FetchType
const client = createOpenAIShimClient({}) as OpenAIShimClient
await client.beta.messages.create({
model: 'kimi-k2.6',
system: 'you are kimi',
messages: [{ role: 'user', content: 'hi' }],
max_tokens: 256,
stream: false,
})
expect(requestBody?.store).toBeUndefined()
})

View File

@@ -82,6 +82,10 @@ const GITHUB_429_MAX_RETRIES = 3
const GITHUB_429_BASE_DELAY_SEC = 1
const GITHUB_429_MAX_DELAY_SEC = 32
const GEMINI_API_HOST = 'generativelanguage.googleapis.com'
const MOONSHOT_API_HOSTS = new Set([
'api.moonshot.ai',
'api.moonshot.cn',
])
const COPILOT_HEADERS: Record<string, string> = {
'User-Agent': 'GitHubCopilotChat/0.26.7',
@@ -147,6 +151,15 @@ function hasGeminiApiHost(baseUrl: string | undefined): boolean {
}
}
function isMoonshotBaseUrl(baseUrl: string | undefined): boolean {
if (!baseUrl) return false
try {
return MOONSHOT_API_HOSTS.has(new URL(baseUrl).hostname.toLowerCase())
} catch {
return false
}
}
function formatRetryAfterHint(response: Response): string {
const ra = response.headers.get('retry-after')
return ra ? ` (Retry-After: ${ra})` : ''
@@ -1447,14 +1460,19 @@ class OpenAIShimMessages {
const isGithubCopilot = isGithub && githubEndpointType === 'copilot'
const isGithubModels = isGithub && (githubEndpointType === 'models' || githubEndpointType === 'custom')
if ((isGithub || isMistral || isLocal) && body.max_completion_tokens !== undefined) {
const isMoonshot = isMoonshotBaseUrl(request.baseUrl)
if ((isGithub || isMistral || isLocal || isMoonshot) && body.max_completion_tokens !== undefined) {
body.max_tokens = body.max_completion_tokens
delete body.max_completion_tokens
}
// mistral and gemini don't recognize body.store — Gemini returns 400
// "Invalid JSON payload received. Unknown name 'store': Cannot find field."
if (isMistral || isGeminiMode()) {
// Moonshot (api.moonshot.ai/.cn) has not published support for the
// parameter either; strip it preemptively to avoid the same class of
// error on strict-parse providers.
if (isMistral || isGeminiMode() || isMoonshot) {
delete body.store
}

View File

@@ -219,6 +219,17 @@ const OPENAI_CONTEXT_WINDOWS: Record<string, number> = {
'kimi-k2.5': 262_144,
'glm-5': 202_752,
'glm-4.7': 202_752,
// Moonshot AI direct API (api.moonshot.ai/v1). Values from Moonshot's
// published model card — all K2 tier share 256K context. Prefix matching
// in lookupByKey catches variants like "kimi-k2.6-preview".
'kimi-k2.6': 262_144,
'kimi-k2': 131_072,
'kimi-k2-instruct': 131_072,
'kimi-k2-thinking': 262_144,
'moonshot-v1-8k': 8_192,
'moonshot-v1-32k': 32_768,
'moonshot-v1-128k': 131_072,
}
/**
@@ -391,6 +402,15 @@ const OPENAI_MAX_OUTPUT_TOKENS: Record<string, number> = {
'kimi-k2.5': 32_768,
'glm-5': 16_384,
'glm-4.7': 16_384,
// Moonshot AI direct API
'kimi-k2.6': 32_768,
'kimi-k2': 32_768,
'kimi-k2-instruct': 32_768,
'kimi-k2-thinking': 32_768,
'moonshot-v1-8k': 4_096,
'moonshot-v1-32k': 16_384,
'moonshot-v1-128k': 32_768,
}
function lookupByModel<T>(table: Record<string, T>, model: string): T | undefined {

View File

@@ -81,6 +81,15 @@ test('detects common local openai-compatible providers by hostname', async () =>
).toBe('vLLM')
})
test('detects Moonshot (Kimi) from api.moonshot.ai hostname', async () => {
const { getLocalOpenAICompatibleProviderLabel } =
await loadProviderDiscoveryModule()
expect(
getLocalOpenAICompatibleProviderLabel('https://api.moonshot.ai/v1'),
).toBe('Moonshot (Kimi)')
})
test('falls back to a generic local openai-compatible label', async () => {
const { getLocalOpenAICompatibleProviderLabel } =
await loadProviderDiscoveryModule()

View File

@@ -197,6 +197,10 @@ export function getLocalOpenAICompatibleProviderLabel(baseUrl?: string): string
if (host.includes('minimax') || haystack.includes('minimax')) {
return 'MiniMax'
}
// Moonshot AI (Kimi) direct API
if (host.includes('moonshot') || haystack.includes('moonshot') || haystack.includes('kimi')) {
return 'Moonshot (Kimi)'
}
} catch {
// Fall back to the generic label when the base URL is malformed.
}