Commit Graph

10 Commits

Author SHA1 Message Date
hika, maeng
b750e9e97d fix: make OpenAI fallback context window configurable + support external model lookup (#861)
* fix: make OpenAI fallback context window configurable and support external lookup table

Unknown OpenAI-compatible models fell back to a hardcoded 128k constant,
causing auto-compact to fire prematurely on models with larger windows
(issue #635 follow-up). Two escape hatches are added without touching the
built-in table:

- CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW (number): overrides the 128k
  default for all unknown models.
- CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS (JSON object): per-model overrides that
  take precedence over the built-in OPENAI_CONTEXT_WINDOWS table; supports
  the same provider-qualified and prefix-matching lookup as the built-in path.
- CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS (JSON object): same pattern for output
  token limits.

This lets operators deploy new or private models without patching
openaiContextWindows.ts on every model release.

* docs: add new OpenAI context window env vars to .env.example

Document CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW,
CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS, and
CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS with usage examples.

Addresses reviewer feedback on PR #861.

---------

Co-authored-by: opencode <dev@example.com>
2026-04-24 00:34:08 +08:00
Vasanth T
aeaa658f77 fix: prevent infinite auto-compact loop for unknown 3P models (#635) (#636)
- Raise context window fallback from 8k to 128k for unknown OpenAI-compat models.
  The 8k fallback caused effective context (8k minus output reservation) to go
  negative, making auto-compact fire on every single message.
- Add safety floor in getEffectiveContextWindowSize(): effective context is
  always at least reservedTokensForSummary + 13k buffer, ensuring the
  auto-compact threshold stays positive.
- Add missing MiniMax model entries (M2.5, M2.5-highspeed, M2.1, M2.1-highspeed)
  all at 204,800 context / 131,072 max output per MiniMax docs.
- Add tests for MiniMax variants, 128k fallback, and autoCompact floor.

Fixes #635

Co-authored-by: root <root@vm7508.lumadock.com>
2026-04-13 02:03:02 +08:00
lunamonke
4c50977f3c Decouple and fix mistral (#595)
* decouple and fix mistral

* fix wrong variable for currentBaseUrl and buildAPIProviderProperties
2026-04-12 15:26:14 +08:00
Kevin Codex
69ea1f1e4a fix: restore default context window for unknown 3p models (#494)
* fix: restore default context window for unknown 3p models

* fix: add MiniMax context metadata
2026-04-08 02:45:49 +08:00
Juan Camilo Auriti
4975cfc2e0 fix: strip Anthropic params from 3P resume paths (#479)
* fix: strip Anthropic-specific params from 3P provider paths

Three silent failure modes affecting all third-party provider users:

1. Thinking blocks serialized as <thinking> text corrupt multi-turn
   context — strip them instead of converting to raw text tags.

2. Unknown models fall through to 200k context window default, so
   auto-compact never triggers — use conservative 8k for unknown
   3P models with a warning log.

3. Session resume with thinking blocks causes 400 or context corruption
   on 3P providers — strip thinking/redacted_thinking content blocks
   from deserialized messages when resuming against a non-Anthropic
   provider.

Addresses findings 2, 3, and 5 from #248.

* test: align resume stripping expectation with orphan-thinking filter

* test: isolate provider env in conversation recovery tests

* test: move provider-sensitive resume coverage behind module mocks

* test: trim extra blank lines in conversation recovery test

Keep the focused provider-resume test diff clean so the regression branch stays easy to review.

Co-Authored-By: Claude Opus 4.6 <noreply@openclaude.dev>

---------

Co-authored-by: Claude Opus 4.6 <noreply@openclaude.dev>
2026-04-07 23:24:10 +08:00
Anandan
2f162af60c Reduce internal-only labeling noise in source comments (#355)
This pass rewrites comment-only ANT-ONLY markers to neutral internal-only
language across the source tree without changing runtime strings, flags,
commands, or protocol identifiers. The goal is to lower obvious internal
prose leakage while keeping the diff mechanically safe and easy to review.

Constraint: Phase B is limited to comments/prose only; runtime strings and user-facing labels remain deferred
Rejected: Broad search-and-replace across strings and command descriptions | too risky for a prose-only pass
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Remaining ANT-ONLY hits are mostly runtime/user-facing strings and should be handled separately from comment cleanup
Tested: bun run build
Tested: bun run smoke
Tested: bun run verify:privacy
Tested: bun run test:provider
Tested: bun run test:provider-recommendation
Not-tested: Full repo typecheck (upstream baseline remains noisy)

Co-authored-by: anandh8x <test@example.com>
2026-04-04 23:26:14 +05:30
Juan Camilo
f385740bd6 fix: use isEnvTruthy() for provider detection in context window lookup
Replace raw === '1' || === 'true' comparisons with isEnvTruthy() in
context.ts for consistency with getAPIProvider() in providers.ts.
This also covers the newly added CLAUDE_CODE_USE_GITHUB provider.

Add native Gemini model entries (without google/ prefix) to both
context window and max output token tables. Corrects gemini-2.5-pro
and gemini-2.5-flash max output tokens to 65,536 (was 8,192/32,768).
2026-04-02 14:43:03 +02:00
Rithul Kamesh
25c5987276 feat: add support for GitHub Models provider
- Introduced environment variable CLAUDE_CODE_USE_GITHUB to enable GitHub Models.
- Added checks for GITHUB_TOKEN or GH_TOKEN for authentication.
- Updated base URL handling to include GitHub Models default.
- Enhanced provider detection and error handling for GitHub Models.
- Updated relevant functions and components to accommodate the new provider.
2026-04-02 11:25:28 +05:30
gnanam1990
4ca94b2454 feat: add context window guard for OpenAI-compatible models
Without this fix, getContextWindowForModel() returns 200k for all OpenAI
models (the Claude default), causing two problems:
  1. Auto-compact/warnings trigger at wrong thresholds (200k instead of 128k)
  2. getModelMaxOutputTokens() returns 32k causing 400 errors from APIs that
     cap output tokens lower (gpt-4o supports max 16384)

Fix:
- Add openaiContextWindows.ts with known context window sizes and max output
  token limits for 30+ OpenAI-compatible models (OpenAI, DeepSeek, Groq,
  Mistral, Ollama, LM Studio)
- Hook into getContextWindowForModel() so correct input limits are used
- Hook into getModelMaxOutputTokens() so correct output limits are sent,
  preventing 400 "max_tokens is too large" errors

All existing warning, blocking, and auto-compact infrastructure works
automatically once the correct limits are returned.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 17:42:04 +05:30
did:key:z6MkqDnb7Siv3Cwj7pGJq4T5EsUisECqR8KpnDLwcaZq5TPr
d2542c9a62 asdf
Squash the current repository state back into one baseline commit while
preserving the README reframing and repository contents.

Constraint: User explicitly requested a single squashed commit with subject "asdf"
Confidence: high
Scope-risk: broad
Reversibility: clean
Directive: This commit intentionally rewrites published history; coordinate before future force-pushes
Tested: git status clean; local history rewritten to one commit; force-pushed main to origin and instructkr
Not-tested: Fresh clone verification after push
2026-03-31 03:34:03 -07:00