Files
orcs-code/src/services/api/client.ts
Zartris fdef4a1b4c feat: native Anthropic API mode for Claude models on GitHub Copilot (#579)
* feat: native Anthropic API mode for Claude models on GitHub Copilot

When using Claude models through GitHub Copilot, automatically switch from
the OpenAI-compatible shim to Anthropic's native messages API format.

The Copilot proxy (api.githubcopilot.com) supports Anthropic's native API
for Claude models. This enables cache_control blocks to be sent and
honoured, allowing explicit prompt caching control (as opposed to relying
solely on server-side auto-caching).

Changes:
- Add isGithubNativeAnthropicMode() in providers.ts that auto-enables when
  the resolved model starts with "claude-" and the GitHub provider is active
- Create a native Anthropic client in client.ts using the GitHub base URL
  and Bearer token authentication when native mode is detected
- Enable prompt caching in claude.ts for native GitHub mode so cache_control
  blocks are sent (previously only allowed for firstParty/bedrock/vertex)
- CLAUDE_CODE_GITHUB_ANTHROPIC_API=1 env var to force native mode for any
  model

Benefits:
- Proper Anthropic message format (no lossy OpenAI translation)
- Explicit cache_control blocks for fine-grained caching control
- Potentially better Claude model behaviour with native format

Related: #515

* fix: scope force flag to Claude models and add isGithubNativeAnthropicMode tests

- CLAUDE_CODE_GITHUB_ANTHROPIC_API=1 now returns false for non-Claude models
  (force flag still useful for aliases like 'github:copilot' with no model
  resolved yet, where it returns true when model is empty)
- Add 7 focused tests covering mode detection: off without GitHub provider,
  auto-detect via OPENAI_MODEL and resolvedModel, non-Claude model rejection,
  and force-flag behaviour for claude/non-claude/no-model cases

* fix: detect github:copilot:claude- compound format, remove force flag

OPENAI_MODEL for GitHub Copilot uses the format 'github:copilot:MODEL'
(e.g. 'github:copilot:claude-sonnet-4'), which does not start with 'claude-'.
Auto-detection now handles both bare model names and the compound format.

The CLAUDE_CODE_GITHUB_ANTHROPIC_API force flag is removed: with proper
compound-format detection there is no remaining gap it could fill, and
keeping a broad override flag without a concrete use case invites misuse.

Tests updated to cover the compound format, generic alias (false), and
non-Claude compound model (github:copilot:gpt-4o → false).

* fix: use includes('claude-') for model detection, remove force flag

Detection was broken for the standard GitHub Copilot compound format
'github:copilot:claude-sonnet-4' which does not start with 'claude-'.
Using includes('claude-') handles bare names, compound names, and any
future variants without needing updates.

The CLAUDE_CODE_GITHUB_ANTHROPIC_API force flag is removed as it was
a workaround for the broken detection, not a genuine use case.

---------

Co-authored-by: Zartris <14197299+Zartris@users.noreply.github.com>
2026-04-20 16:34:58 +08:00

453 lines
18 KiB
TypeScript

import Anthropic, { type ClientOptions } from '@anthropic-ai/sdk'
import { randomUUID } from 'crypto'
import {
checkAndRefreshOAuthTokenIfNeeded,
getAnthropicApiKey,
getApiKeyFromApiKeyHelper,
getClaudeAIOAuthTokens,
isClaudeAISubscriber,
refreshAndGetAwsCredentials,
refreshGcpCredentialsIfNeeded,
} from 'src/utils/auth.js'
import { getUserAgent } from 'src/utils/http.js'
import { getSmallFastModel } from 'src/utils/model/model.js'
import {
getAPIProvider,
isFirstPartyAnthropicBaseUrl,
isGithubNativeAnthropicMode,
} from 'src/utils/model/providers.js'
import { getProxyFetchOptions } from 'src/utils/proxy.js'
import {
getIsNonInteractiveSession,
getSessionId,
} from '../../bootstrap/state.js'
import { getOauthConfig } from '../../constants/oauth.js'
import { isDebugToStdErr, logForDebugging } from '../../utils/debug.js'
import {
getAWSRegion,
getVertexRegionForModel,
isEnvTruthy,
} from '../../utils/envUtils.js'
const importRuntimeModule = new Function(
'specifier',
'return import(specifier)',
) as (specifier: string) => Promise<any>
/**
* Environment variables for different client types:
*
* Direct API:
* - ANTHROPIC_API_KEY: Required for direct API access
*
* AWS Bedrock:
* - AWS credentials configured via aws-sdk defaults
* - AWS_REGION or AWS_DEFAULT_REGION: Sets the AWS region for all models (default: us-east-1)
* - ANTHROPIC_SMALL_FAST_MODEL_AWS_REGION: Optional. Override AWS region specifically for the small fast model (Haiku)
*
* Foundry (Azure):
* - ANTHROPIC_FOUNDRY_RESOURCE: Your Azure resource name (e.g., 'my-resource')
* For the full endpoint: https://{resource}.services.ai.azure.com/anthropic/v1/messages
* - ANTHROPIC_FOUNDRY_BASE_URL: Optional. Alternative to resource - provide full base URL directly
* (e.g., 'https://my-resource.services.ai.azure.com')
*
* Authentication (one of the following):
* - ANTHROPIC_FOUNDRY_API_KEY: Your Microsoft Foundry API key (if using API key auth)
* - Azure AD authentication: If no API key is provided, uses DefaultAzureCredential
* which supports multiple auth methods (environment variables, managed identity,
* Azure CLI, etc.). See: https://docs.microsoft.com/en-us/javascript/api/@azure/identity
*
* Vertex AI:
* - Model-specific region variables (highest priority):
* - VERTEX_REGION_CLAUDE_3_5_HAIKU: Region for Claude 3.5 Haiku model
* - VERTEX_REGION_CLAUDE_HAIKU_4_5: Region for Claude Haiku 4.5 model
* - VERTEX_REGION_CLAUDE_3_5_SONNET: Region for Claude 3.5 Sonnet model
* - VERTEX_REGION_CLAUDE_3_7_SONNET: Region for Claude 3.7 Sonnet model
* - CLOUD_ML_REGION: Optional. The default GCP region to use for all models
* If specific model region not specified above
* - ANTHROPIC_VERTEX_PROJECT_ID: Required. Your GCP project ID
* - Standard GCP credentials configured via google-auth-library
*
* Priority for determining region:
* 1. Hardcoded model-specific environment variables
* 2. Global CLOUD_ML_REGION variable
* 3. Default region from config
* 4. Fallback region (us-east5)
*/
function createStderrLogger(): ClientOptions['logger'] {
return {
error: (msg, ...args) =>
// biome-ignore lint/suspicious/noConsole:: intentional console output -- SDK logger must use console
console.error('[Anthropic SDK ERROR]', msg, ...args),
// biome-ignore lint/suspicious/noConsole:: intentional console output -- SDK logger must use console
warn: (msg, ...args) => console.error('[Anthropic SDK WARN]', msg, ...args),
// biome-ignore lint/suspicious/noConsole:: intentional console output -- SDK logger must use console
info: (msg, ...args) => console.error('[Anthropic SDK INFO]', msg, ...args),
debug: (msg, ...args) =>
// biome-ignore lint/suspicious/noConsole:: intentional console output -- SDK logger must use console
console.error('[Anthropic SDK DEBUG]', msg, ...args),
}
}
export async function getAnthropicClient({
apiKey,
maxRetries,
model,
fetchOverride,
source,
providerOverride,
}: {
apiKey?: string
maxRetries: number
model?: string
fetchOverride?: ClientOptions['fetch']
source?: string
providerOverride?: { model: string; baseURL: string; apiKey: string }
}): Promise<Anthropic> {
const containerId = process.env.CLAUDE_CODE_CONTAINER_ID
const remoteSessionId = process.env.CLAUDE_CODE_REMOTE_SESSION_ID
const clientApp = process.env.CLAUDE_AGENT_SDK_CLIENT_APP
const customHeaders = getCustomHeaders()
const defaultHeaders: { [key: string]: string } = {
'x-app': 'cli',
'User-Agent': getUserAgent(),
'X-Claude-Code-Session-Id': getSessionId(),
...customHeaders,
...(containerId ? { 'x-claude-remote-container-id': containerId } : {}),
...(remoteSessionId
? { 'x-claude-remote-session-id': remoteSessionId }
: {}),
// SDK consumers can identify their app/library for backend analytics
...(clientApp ? { 'x-client-app': clientApp } : {}),
}
// Log API client configuration for HFI debugging
logForDebugging(
`[API:request] Creating client, ANTHROPIC_CUSTOM_HEADERS present: ${!!process.env.ANTHROPIC_CUSTOM_HEADERS}, has Authorization header: ${!!customHeaders['Authorization']}`,
)
// Add additional protection header if enabled via env var
const additionalProtectionEnabled = isEnvTruthy(
process.env.CLAUDE_CODE_ADDITIONAL_PROTECTION,
)
if (additionalProtectionEnabled) {
defaultHeaders['x-anthropic-additional-protection'] = 'true'
}
logForDebugging('[API:auth] OAuth token check starting')
await checkAndRefreshOAuthTokenIfNeeded()
logForDebugging('[API:auth] OAuth token check complete')
if (!isClaudeAISubscriber()) {
await configureApiKeyHeaders(defaultHeaders, getIsNonInteractiveSession())
}
const resolvedFetch = buildFetch(fetchOverride, source)
const ARGS = {
defaultHeaders,
maxRetries,
timeout: parseInt(process.env.API_TIMEOUT_MS || String(600 * 1000), 10),
dangerouslyAllowBrowser: true,
fetchOptions: getProxyFetchOptions({
forAnthropicAPI: true,
}) as ClientOptions['fetchOptions'],
...(resolvedFetch && {
fetch: resolvedFetch,
}),
}
// Agent routing override: use per-agent provider when configured.
// Strip auth-related headers to prevent leaking Anthropic credentials
// to third-party endpoints (SSRF / credential forwarding mitigation).
if (providerOverride) {
const { createOpenAIShimClient } = await import('./openaiShim.js')
const safeHeaders: Record<string, string> = {}
for (const [k, v] of Object.entries(defaultHeaders)) {
const lower = k.toLowerCase()
if (lower === 'authorization' || lower === 'x-api-key' || lower === 'api-key') continue
safeHeaders[k] = v
}
return createOpenAIShimClient({
defaultHeaders: safeHeaders,
maxRetries,
timeout: parseInt(process.env.API_TIMEOUT_MS || String(600 * 1000), 10),
providerOverride,
}) as unknown as Anthropic
}
// GitHub provider in native Anthropic API mode: send requests in Anthropic
// format so cache_control blocks are honoured and prompt caching works.
// Requires the GitHub endpoint (OPENAI_BASE_URL) to support Anthropic's
// messages API — set CLAUDE_CODE_GITHUB_ANTHROPIC_API=1 to opt in.
if (isGithubNativeAnthropicMode(model)) {
const githubBaseUrl =
process.env.OPENAI_BASE_URL?.replace(/\/$/, '') ??
'https://api.githubcopilot.com'
const githubToken =
process.env.GITHUB_TOKEN ?? process.env.GH_TOKEN ?? ''
const nativeArgs: ConstructorParameters<typeof Anthropic>[0] = {
...ARGS,
baseURL: githubBaseUrl,
authToken: githubToken,
// No apiKey — we authenticate via Bearer token (authToken)
apiKey: null,
}
return new Anthropic(nativeArgs)
}
if (
isEnvTruthy(process.env.CLAUDE_CODE_USE_OPENAI) ||
isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB) ||
isEnvTruthy(process.env.CLAUDE_CODE_USE_GEMINI) ||
isEnvTruthy(process.env.CLAUDE_CODE_USE_MISTRAL)
) {
const { createOpenAIShimClient } = await import('./openaiShim.js')
return createOpenAIShimClient({
defaultHeaders,
maxRetries,
timeout: parseInt(process.env.API_TIMEOUT_MS || String(600 * 1000), 10),
}) as unknown as Anthropic
}
if (isEnvTruthy(process.env.CLAUDE_CODE_USE_BEDROCK)) {
const { AnthropicBedrock } = await import('@anthropic-ai/bedrock-sdk')
// Use region override for small fast model if specified
const awsRegion =
model === getSmallFastModel() &&
process.env.ANTHROPIC_SMALL_FAST_MODEL_AWS_REGION
? process.env.ANTHROPIC_SMALL_FAST_MODEL_AWS_REGION
: getAWSRegion()
const bedrockArgs: ConstructorParameters<typeof AnthropicBedrock>[0] = {
...ARGS,
awsRegion,
...(isEnvTruthy(process.env.CLAUDE_CODE_SKIP_BEDROCK_AUTH) && {
skipAuth: true,
}),
...(isDebugToStdErr() && { logger: createStderrLogger() }),
}
// Add API key authentication if available
if (process.env.AWS_BEARER_TOKEN_BEDROCK) {
bedrockArgs.skipAuth = true
// Add the Bearer token for Bedrock API key authentication
bedrockArgs.defaultHeaders = {
...bedrockArgs.defaultHeaders,
Authorization: `Bearer ${process.env.AWS_BEARER_TOKEN_BEDROCK}`,
}
} else if (!isEnvTruthy(process.env.CLAUDE_CODE_SKIP_BEDROCK_AUTH)) {
// Refresh auth and get credentials with cache clearing
const cachedCredentials = await refreshAndGetAwsCredentials()
if (cachedCredentials) {
bedrockArgs.awsAccessKey = cachedCredentials.accessKeyId
bedrockArgs.awsSecretKey = cachedCredentials.secretAccessKey
bedrockArgs.awsSessionToken = cachedCredentials.sessionToken
}
}
// we have always been lying about the return type - this doesn't support batching or models
return new AnthropicBedrock(bedrockArgs) as unknown as Anthropic
}
if (isEnvTruthy(process.env.CLAUDE_CODE_USE_FOUNDRY)) {
const { AnthropicFoundry } = await importRuntimeModule(
'@anthropic-ai/foundry-sdk',
)
// Determine Azure AD token provider based on configuration
// SDK reads ANTHROPIC_FOUNDRY_API_KEY by default
let azureADTokenProvider: (() => Promise<string>) | undefined
if (!process.env.ANTHROPIC_FOUNDRY_API_KEY) {
if (isEnvTruthy(process.env.CLAUDE_CODE_SKIP_FOUNDRY_AUTH)) {
// Mock token provider for testing/proxy scenarios (similar to Vertex mock GoogleAuth)
azureADTokenProvider = () => Promise.resolve('')
} else {
// Use real Azure AD authentication with DefaultAzureCredential
const {
DefaultAzureCredential: AzureCredential,
getBearerTokenProvider,
} = await importRuntimeModule('@azure/identity')
azureADTokenProvider = getBearerTokenProvider(
new AzureCredential(),
'https://cognitiveservices.azure.com/.default',
)
}
}
const foundryArgs = {
...ARGS,
...(azureADTokenProvider && { azureADTokenProvider }),
...(isDebugToStdErr() && { logger: createStderrLogger() }),
}
// we have always been lying about the return type - this doesn't support batching or models
return new AnthropicFoundry(foundryArgs) as unknown as Anthropic
}
if (isEnvTruthy(process.env.CLAUDE_CODE_USE_VERTEX)) {
// Refresh GCP credentials if gcpAuthRefresh is configured and credentials are expired
// This is similar to how we handle AWS credential refresh for Bedrock
if (!isEnvTruthy(process.env.CLAUDE_CODE_SKIP_VERTEX_AUTH)) {
await refreshGcpCredentialsIfNeeded()
}
const [{ AnthropicVertex }, { GoogleAuth }] = await Promise.all([
importRuntimeModule('@anthropic-ai/vertex-sdk'),
importRuntimeModule('google-auth-library'),
])
// TODO: Cache either GoogleAuth instance or AuthClient to improve performance
// Currently we create a new GoogleAuth instance for every getAnthropicClient() call
// This could cause repeated authentication flows and metadata server checks
// However, caching needs careful handling of:
// - Credential refresh/expiration
// - Environment variable changes (GOOGLE_APPLICATION_CREDENTIALS, project vars)
// - Cross-request auth state management
// See: https://github.com/googleapis/google-auth-library-nodejs/issues/390 for caching challenges
// Prevent metadata server timeout by providing projectId as fallback
// google-auth-library checks project ID in this order:
// 1. Environment variables (GCLOUD_PROJECT, GOOGLE_CLOUD_PROJECT, etc.)
// 2. Credential files (service account JSON, ADC file)
// 3. gcloud config
// 4. GCE metadata server (causes 12s timeout outside GCP)
//
// We only set projectId if user hasn't configured other discovery methods
// to avoid interfering with their existing auth setup
// Check project environment variables in same order as google-auth-library
// See: https://github.com/googleapis/google-auth-library-nodejs/blob/main/src/auth/googleauth.ts
const hasProjectEnvVar =
process.env['GCLOUD_PROJECT'] ||
process.env['GOOGLE_CLOUD_PROJECT'] ||
process.env['gcloud_project'] ||
process.env['google_cloud_project']
// Check for credential file paths (service account or ADC)
// Note: We're checking both standard and lowercase variants to be safe,
// though we should verify what google-auth-library actually checks
const hasKeyFile =
process.env['GOOGLE_APPLICATION_CREDENTIALS'] ||
process.env['google_application_credentials']
const googleAuth = isEnvTruthy(process.env.CLAUDE_CODE_SKIP_VERTEX_AUTH)
? ({
// Mock GoogleAuth for testing/proxy scenarios
getClient: () => ({
getRequestHeaders: () => ({}),
}),
} as {
getClient: () => {
getRequestHeaders: () => Record<string, string>
}
})
: new GoogleAuth({
scopes: ['https://www.googleapis.com/auth/cloud-platform'],
// Only use ANTHROPIC_VERTEX_PROJECT_ID as last resort fallback
// This prevents the 12-second metadata server timeout when:
// - No project env vars are set AND
// - No credential keyfile is specified AND
// - ADC file exists but lacks project_id field
//
// Risk: If auth project != API target project, this could cause billing/audit issues
// Mitigation: Users can set GOOGLE_CLOUD_PROJECT to override
...(hasProjectEnvVar || hasKeyFile
? {}
: {
projectId: process.env.ANTHROPIC_VERTEX_PROJECT_ID,
}),
})
const vertexArgs = {
...ARGS,
region: getVertexRegionForModel(model),
googleAuth,
...(isDebugToStdErr() && { logger: createStderrLogger() }),
}
// we have always been lying about the return type - this doesn't support batching or models
return new AnthropicVertex(vertexArgs) as unknown as Anthropic
}
// Determine authentication method based on available tokens
const clientConfig: ConstructorParameters<typeof Anthropic>[0] = {
apiKey: isClaudeAISubscriber() ? null : apiKey || getAnthropicApiKey(),
authToken: isClaudeAISubscriber()
? getClaudeAIOAuthTokens()?.accessToken
: undefined,
// Set baseURL from OAuth config when using staging OAuth
...(process.env.USER_TYPE === 'ant' &&
isEnvTruthy(process.env.USE_STAGING_OAUTH)
? { baseURL: getOauthConfig().BASE_API_URL }
: {}),
...ARGS,
...(isDebugToStdErr() && { logger: createStderrLogger() }),
}
return new Anthropic(clientConfig)
}
async function configureApiKeyHeaders(
headers: Record<string, string>,
isNonInteractiveSession: boolean,
): Promise<void> {
const token =
process.env.ANTHROPIC_AUTH_TOKEN ||
(await getApiKeyFromApiKeyHelper(isNonInteractiveSession))
if (token) {
headers['Authorization'] = `Bearer ${token}`
}
}
function getCustomHeaders(): Record<string, string> {
const customHeaders: Record<string, string> = {}
const customHeadersEnv = process.env.ANTHROPIC_CUSTOM_HEADERS
if (!customHeadersEnv) return customHeaders
// Split by newlines to support multiple headers
const headerStrings = customHeadersEnv.split(/\n|\r\n/)
for (const headerString of headerStrings) {
if (!headerString.trim()) continue
// Parse header in format "Name: Value" (curl style). Split on first `:`
// then trim — avoids regex backtracking on malformed long header lines.
const colonIdx = headerString.indexOf(':')
if (colonIdx === -1) continue
const name = headerString.slice(0, colonIdx).trim()
const value = headerString.slice(colonIdx + 1).trim()
if (name) {
customHeaders[name] = value
}
}
return customHeaders
}
export const CLIENT_REQUEST_ID_HEADER = 'x-client-request-id'
function buildFetch(
fetchOverride: ClientOptions['fetch'],
source: string | undefined,
): ClientOptions['fetch'] {
// eslint-disable-next-line eslint-plugin-n/no-unsupported-features/node-builtins
const inner = fetchOverride ?? globalThis.fetch
// Only send to the first-party API — Bedrock/Vertex/Foundry don't log it
// and unknown headers risk rejection by strict proxies (inc-4029 class).
const injectClientRequestId =
getAPIProvider() === 'firstParty' && isFirstPartyAnthropicBaseUrl()
return (input, init) => {
// eslint-disable-next-line eslint-plugin-n/no-unsupported-features/node-builtins
const headers = new Headers(init?.headers)
// Generate a client-side request ID so timeouts (which return no server
// request ID) can still be correlated with server logs by the API team.
// Callers that want to track the ID themselves can pre-set the header.
if (injectClientRequestId && !headers.has(CLIENT_REQUEST_ID_HEADER)) {
headers.set(CLIENT_REQUEST_ID_HEADER, randomUUID())
}
try {
// eslint-disable-next-line eslint-plugin-n/no-unsupported-features/node-builtins
const url = input instanceof Request ? input.url : String(input)
const id = headers.get(CLIENT_REQUEST_ID_HEADER)
logForDebugging(
`[API REQUEST] ${new URL(url).pathname}${id ? ` ${CLIENT_REQUEST_ID_HEADER}=${id}` : ''} source=${source ?? 'unknown'}`,
)
} catch {
// never let logging crash the fetch
}
return inner(input, { ...init, headers })
}
}