* fix: make OpenAI fallback context window configurable and support external lookup table
Unknown OpenAI-compatible models fell back to a hardcoded 128k constant,
causing auto-compact to fire prematurely on models with larger windows
(issue #635 follow-up). Three escape hatches are added without touching the
built-in table:
- CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW (number): overrides the 128k
default for all unknown models.
- CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS (JSON object): per-model overrides that
take precedence over the built-in OPENAI_CONTEXT_WINDOWS table; supports
the same provider-qualified and prefix-matching lookup as the built-in path.
- CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS (JSON object): same pattern for output
token limits.
This lets operators deploy new or private models without patching
openaiContextWindows.ts on every model release.
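For reviewers, a minimal sketch of the intended resolution order. Helper
names here (parseJsonEnv, lookup) are illustrative, not the actual
identifiers in openaiContextWindows.ts:

    // Per-model env override -> built-in table -> configurable fallback -> 128k.
    function parseJsonEnv(name: string): Record<string, number> {
      try { return JSON.parse(process.env[name] ?? '{}'); } catch { return {}; }
    }

    // Exact match first (including provider-qualified ids like "openai/gpt-4o"),
    // then the bare model name, then a prefix match.
    function lookup(table: Record<string, number>, model: string): number | undefined {
      if (table[model] !== undefined) return table[model];
      const bare = model.split('/').pop() ?? model;
      if (table[bare] !== undefined) return table[bare];
      const key = Object.keys(table).find((k) => bare.startsWith(k));
      return key !== undefined ? table[key] : undefined;
    }

    function resolveContextWindow(model: string, builtIn: Record<string, number>): number {
      const override = lookup(parseJsonEnv('CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS'), model);
      if (override !== undefined) return override;
      const known = lookup(builtIn, model);
      if (known !== undefined) return known;
      const fallback = Number(process.env.CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW);
      return Number.isFinite(fallback) && fallback > 0 ? fallback : 128_000;
    }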
* docs: add new OpenAI context window env vars to .env.example
Document CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW,
CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS, and
CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS with usage examples.
Addresses reviewer feedback on PR #861.
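The documented block looks roughly like this (values and model ids are
illustrative):

    # Fallback for unknown OpenAI-compatible models (default: 128000)
    CLAUDE_CODE_OPENAI_FALLBACK_CONTEXT_WINDOW=200000
    # Per-model overrides; these take precedence over the built-in table
    CLAUDE_CODE_OPENAI_CONTEXT_WINDOWS='{"my-provider/my-model": 1000000}'
    CLAUDE_CODE_OPENAI_MAX_OUTPUT_TOKENS='{"my-provider/my-model": 65536}'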
---------
Co-authored-by: opencode <dev@example.com>
- Raise context window fallback from 8k to 128k for unknown OpenAI-compat models.
The 8k fallback caused effective context (8k minus output reservation) to go
negative, making auto-compact fire on every single message.
- Add a safety floor in getEffectiveContextWindowSize(): effective context is
  always at least reservedTokensForSummary plus a 13k buffer, ensuring the
  auto-compact threshold stays positive (see the sketch after this list).
- Add missing MiniMax model entries (M2.5, M2.5-highspeed, M2.1, M2.1-highspeed)
all at 204,800 context / 131,072 max output per MiniMax docs.
- Add tests for MiniMax variants, 128k fallback, and autoCompact floor.
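Sketch of the floor; the 13k buffer is from this change, while the
parameter names are illustrative:

    // Effective context = window minus output reservation, but never below
    // reservedTokensForSummary + 13k, so the auto-compact threshold stays positive.
    const AUTO_COMPACT_BUFFER = 13_000;

    function getEffectiveContextWindowSize(
      contextWindow: number,
      outputReservation: number,
      reservedTokensForSummary: number,
    ): number {
      const effective = contextWindow - outputReservation;
      return Math.max(effective, reservedTokensForSummary + AUTO_COMPACT_BUFFER);
    }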
Fixes #635
Co-authored-by: root <root@vm7508.lumadock.com>
* fix: strip Anthropic-specific params from 3P provider paths
Three silent failure modes affecting all third-party provider users:
1. Thinking blocks serialized as <thinking> text corrupt multi-turn
   context; strip them instead of converting them to raw text tags.
2. Unknown models fall through to the 200k context window default, so
   auto-compact never triggers; use a conservative 8k for unknown
   3P models with a warning log.
3. Session resume with thinking blocks causes 400 errors or context
   corruption on 3P providers; strip thinking/redacted_thinking content
   blocks from deserialized messages when resuming against a
   non-Anthropic provider (see the sketch below).
Addresses findings 2, 3, and 5 from #248.
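Sketch of the stripping pass over deserialized messages; type names are
illustrative, the real message shape lives in the session code:

    type ContentBlock = { type: string; [key: string]: unknown };
    type Message = { role: 'user' | 'assistant'; content: ContentBlock[] };

    // On resume against a non-Anthropic provider, drop thinking blocks entirely
    // rather than serializing them as <thinking> text.
    function stripThinkingBlocks(messages: Message[]): Message[] {
      return messages.map((message) => ({
        ...message,
        content: message.content.filter(
          (block) => block.type !== 'thinking' && block.type !== 'redacted_thinking',
        ),
      }));
    }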
* test: align resume stripping expectation with orphan-thinking filter
* test: isolate provider env in conversation recovery tests
* test: move provider-sensitive resume coverage behind module mocks
* test: trim extra blank lines in conversation recovery test
Keep the focused provider-resume test diff clean so the regression branch stays easy to review.
Co-Authored-By: Claude Opus 4.6 <noreply@openclaude.dev>
---------
Co-authored-by: Claude Opus 4.6 <noreply@openclaude.dev>
This pass rewrites comment-only ANT-ONLY markers to neutral internal-only
language across the source tree without changing runtime strings, flags,
commands, or protocol identifiers. The goal is to reduce obvious internal
prose leakage while keeping the diff mechanically safe and easy to review.
Constraint: Phase B is limited to comments/prose only; runtime strings and user-facing labels remain deferred
Rejected: Broad search-and-replace across strings and command descriptions (too risky for a prose-only pass)
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Remaining ANT-ONLY hits are mostly runtime/user-facing strings and should be handled separately from comment cleanup
Tested: bun run build
Tested: bun run smoke
Tested: bun run verify:privacy
Tested: bun run test:provider
Tested: bun run test:provider-recommendation
Not-tested: Full repo typecheck (upstream baseline remains noisy)
Co-authored-by: anandh8x <test@example.com>
Replace raw === '1' || === 'true' comparisons with isEnvTruthy() in
context.ts for consistency with getAPIProvider() in providers.ts.
This also covers the newly added CLAUDE_CODE_USE_GITHUB provider.
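The helper pattern being standardized on, roughly; whether the real
isEnvTruthy trims or lowercases, and its exact signature, are assumptions:

    // Accepts '1' or 'true', matching the raw comparisons this change replaces.
    function isEnvTruthy(value: string | undefined): boolean {
      const v = value?.trim().toLowerCase();
      return v === '1' || v === 'true';
    }

    // Before: env.CLAUDE_CODE_USE_GITHUB === '1' || env.CLAUDE_CODE_USE_GITHUB === 'true'
    // After:  isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB)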
Add native Gemini model entries (without the google/ prefix) to both the
context window and max output token tables. Correct gemini-2.5-pro and
gemini-2.5-flash max output tokens to 65,536 (previously 8,192 and 32,768
respectively).
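The resulting entries, sketched. The 65,536 corrections are from this
change; the 1M context windows reflect Google's published Gemini 2.5
limits and should be checked against the actual table:

    // Native ids, without the google/ prefix.
    const GEMINI_CONTEXT_WINDOWS: Record<string, number> = {
      'gemini-2.5-pro': 1_048_576,
      'gemini-2.5-flash': 1_048_576,
    };
    const GEMINI_MAX_OUTPUT_TOKENS: Record<string, number> = {
      'gemini-2.5-pro': 65_536,   // previously 8,192
      'gemini-2.5-flash': 65_536, // previously 32,768
    };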
- Introduced environment variable CLAUDE_CODE_USE_GITHUB to enable GitHub Models.
- Added checks for GITHUB_TOKEN or GH_TOKEN for authentication.
- Updated base URL handling to include GitHub Models default.
- Enhanced provider detection and error handling for GitHub Models.
- Updated relevant functions and components to accommodate the new provider.
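A sketch of the detection path. The default base URL and the function
name are assumptions; confirm against the GitHub Models docs:

    const GITHUB_MODELS_BASE_URL = 'https://models.github.ai/inference'; // assumed default

    function getGitHubModelsConfig(): { baseURL: string; apiKey: string } | null {
      const flag = process.env.CLAUDE_CODE_USE_GITHUB?.trim().toLowerCase();
      if (flag !== '1' && flag !== 'true') return null;
      const apiKey = process.env.GITHUB_TOKEN ?? process.env.GH_TOKEN;
      if (!apiKey) {
        throw new Error('CLAUDE_CODE_USE_GITHUB is set but neither GITHUB_TOKEN nor GH_TOKEN is available');
      }
      return { baseURL: GITHUB_MODELS_BASE_URL, apiKey };
    }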
Without this fix, getContextWindowForModel() returns 200k for all OpenAI
models (the Claude default), causing two problems:
1. Auto-compact and warnings trigger at the wrong thresholds (200k instead of 128k)
2. getModelMaxOutputTokens() returns 32k, causing 400 errors from APIs that
   cap output tokens lower (gpt-4o supports at most 16,384)
Fix:
- Add openaiContextWindows.ts with known context window sizes and max output
token limits for 30+ OpenAI-compatible models (OpenAI, DeepSeek, Groq,
Mistral, Ollama, LM Studio)
- Hook into getContextWindowForModel() so correct input limits are used
- Hook into getModelMaxOutputTokens() so correct output limits are sent,
preventing 400 "max_tokens is too large" errors
All existing warning, blocking, and auto-compact infrastructure works
automatically once the correct limits are returned.
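The shape of the new module, sketched. Only the gpt-4o output cap
(16,384) and the 128k window are stated in this message; the rest of the
30+ entries are elided:

    // openaiContextWindows.ts (sketch)
    export const OPENAI_CONTEXT_WINDOWS: Record<string, number> = {
      'gpt-4o': 128_000,
      // ...entries for OpenAI, DeepSeek, Groq, Mistral, Ollama, LM Studio
    };

    export const OPENAI_MAX_OUTPUT_TOKENS: Record<string, number> = {
      'gpt-4o': 16_384, // sending more triggers 400 "max_tokens is too large"
      // ...
    };

    // getContextWindowForModel() and getModelMaxOutputTokens() consult these
    // tables before falling back to the Claude defaults (200k / 32k).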
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Squash the current repository state back into one baseline commit while
preserving the README reframing and repository contents.
Constraint: User explicitly requested a single squashed commit with subject "asdf"
Confidence: high
Scope-risk: broad
Reversibility: clean
Directive: This commit intentionally rewrites published history; coordinate before future force-pushes
Tested: git status clean; local history rewritten to one commit; force-pushed main to origin and instructkr
Not-tested: Fresh clone verification after push