- Raise context window fallback from 8k to 128k for unknown OpenAI-compat models.
The 8k fallback caused effective context (8k minus output reservation) to go
negative, making auto-compact fire on every single message.
- Add safety floor in getEffectiveContextWindowSize(): effective context is
always at least reservedTokensForSummary plus a 13k buffer (sketched after
this list), ensuring the auto-compact threshold stays positive.
- Add missing MiniMax model entries (M2.5, M2.5-highspeed, M2.1, M2.1-highspeed),
all at 204,800 context / 131,072 max output per MiniMax docs.
- Add tests for MiniMax variants, 128k fallback, and autoCompact floor.
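A minimal sketch of the floor, assuming the function takes the raw window and
reservation as inputs (only getEffectiveContextWindowSize and
reservedTokensForSummary come from this change; the rest is illustrative):

```typescript
function getEffectiveContextWindowSize(
  contextWindow: number,
  reservedTokensForSummary: number,
): number {
  // With the old 8k fallback, contextWindow - reservedTokensForSummary
  // could go negative, firing auto-compact on every message.
  const FLOOR_BUFFER = 13_000; // the 13k buffer; the constant name is hypothetical
  const effective = contextWindow - reservedTokensForSummary;
  return Math.max(effective, reservedTokensForSummary + FLOOR_BUFFER);
}

// e.g. getEffectiveContextWindowSize(8_000, 20_000) === 33_000, not -12_000
```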
Fixes #635
Co-authored-by: root <root@vm7508.lumadock.com>
The OPENAI_CONTEXT_WINDOWS/OPENAI_MAX_OUTPUT_TOKENS tables only contained
the `github:copilot:<model>` namespaced form used when talking directly to
Copilot via /onboard-github. When OpenClaude is pointed at a LiteLLM proxy
(which routes Copilot using the standard `github_copilot/<model>` convention),
the lookup missed and fell back to the conservative 8k default. This made
the compaction loop fire on every tick, blocking requests before they left
the client and emitting repeated "not in context window table" warnings
on stderr.
Mirror the 11 active Copilot models with LiteLLM-style keys in both tables.
No behavior change for users of /onboard-github since namespaced entries
remain untouched and `lookupByKey` picks exact matches first.
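Illustrative shape of the mirrored entries (one example model instead of the
full list of 11; limits are examples, and lookupByKey is reduced to its
exact-match step):

```typescript
const OPENAI_CONTEXT_WINDOWS: Record<string, number> = {
  // namespaced form used by /onboard-github (unchanged)
  'github:copilot:gpt-4o': 128_000,
  // LiteLLM-style mirror, so proxy-routed lookups stop missing
  'github_copilot/gpt-4o': 128_000,
};

// Exact match first means the two key families never shadow each other.
function lookupByKey(key: string): number | undefined {
  return OPENAI_CONTEXT_WINDOWS[key];
}
```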
Add context_window and max_output_tokens entries for all models available
through the GitHub Copilot proxy (Claude, GPT, Gemini, Grok), sourced from
https://api.githubcopilot.com/models.
Models are namespaced as "github:copilot:<model>" to avoid collisions with
the same model names served by other providers (which may have different
limits). A new lookupByKey() helper and a qualified-key lookup in
lookupByModel() ensure the correct limits are selected when
OPENAI_MODEL=github:copilot.
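A sketch of that qualified lookup; the helper names come from this change,
but the signature and the way the provider prefix is threaded through are
assumptions:

```typescript
type LimitTable = Record<string, number>;

function lookupByModel(
  table: LimitTable,
  model: string,
  provider?: string, // e.g. 'github:copilot' when OPENAI_MODEL=github:copilot
): number | undefined {
  if (provider !== undefined) {
    // Prefer the collision-free namespaced key...
    const qualified = table[`${provider}:${model}`];
    if (qualified !== undefined) return qualified;
  }
  // ...then fall back to the bare model name other providers serve.
  return table[model];
}
```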
Without this, Claude models on Copilot would use default context/output
limits that may not match the proxy's actual constraints, causing 400 errors
like "max_tokens is too large".
Related: #515
Co-authored-by: Zartris <14197299+Zartris@users.noreply.github.com>
Models not in the lookup table fall through to a 200k default, causing
auto-compact to never trigger for models with smaller actual context
windows. Users hit hard context_window_exceeded errors instead.
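A worked example of the failure mode (the 90% trigger fraction is an
assumption for illustration):

```typescript
const assumedWindow = 200_000;         // fall-through default
const actualWindow = 128_000;          // e.g. an o1-family model
const compactAt = assumedWindow * 0.9; // hypothetical auto-compact trigger

const conversationTokens = 130_000;
// Past the real window but far below the assumed trigger, so compaction
// never runs and the API returns a hard context_window_exceeded error.
console.log(conversationTokens > actualWindow); // true
console.log(conversationTokens > compactAt);    // false
```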
Added to both context window and max output token tables:
- o1, o1-mini, o1-preview, o1-pro (OpenAI reasoning models)
- llama3.2:1b, qwen3:8b, codestral (common Ollama models)
Relates to #248
Two fixes in openaiContextWindows.ts:
1. Sort lookup keys by length descending in lookupByModel() so the most
   specific prefix always wins (see the sketch after this list). Without
   this, 'gpt-4-turbo-preview' could match 'gpt-4' (8k) instead of
   'gpt-4-turbo' (128k) depending on V8's object key iteration order.
2. Update Llama 3.1/3.2/3.3 context windows from 8,192 to 128,000.
These models support 128k context natively (Meta official specs).
The previous 8k value was Ollama's default num_ctx, not the model's
actual capability, causing premature auto-compact warnings.
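A sketch of fix 1, with the table trimmed to the two keys from the example
in item 1:

```typescript
const CONTEXT_WINDOWS: Record<string, number> = {
  'gpt-4': 8_192,
  'gpt-4-turbo': 128_000,
};

function lookupByModel(model: string): number | undefined {
  // Longest key first: 'gpt-4-turbo-preview' hits 'gpt-4-turbo' before the
  // shorter 'gpt-4' prefix can shadow it, regardless of insertion order.
  const keys = Object.keys(CONTEXT_WINDOWS).sort((a, b) => b.length - a.length);
  const match = keys.find((key) => model.startsWith(key));
  return match === undefined ? undefined : CONTEXT_WINDOWS[match];
}

lookupByModel('gpt-4-turbo-preview'); // 128_000, deterministically
```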
Replace raw === '1' || === 'true' comparisons with isEnvTruthy() in
context.ts for consistency with getAPIProvider() in providers.ts.
This also covers the newly added CLAUDE_CODE_USE_GITHUB provider.
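Assumed shape of the helper (the real isEnvTruthy() in providers.ts may
accept more spellings):

```typescript
function isEnvTruthy(value: string | undefined): boolean {
  return value === '1' || value === 'true';
}

// Before: process.env.CLAUDE_CODE_USE_GITHUB === '1' ||
//         process.env.CLAUDE_CODE_USE_GITHUB === 'true'
// After:  isEnvTruthy(process.env.CLAUDE_CODE_USE_GITHUB)
```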
Add native Gemini model entries (without google/ prefix) to both
context window and max output token tables. Corrects gemini-2.5-pro
and gemini-2.5-flash max output tokens to 65,536 (was 8,192/32,768).
DeepSeek V3 documentation specifies a 128k context window for both
deepseek-chat and deepseek-reasoner. The previous 64k value caused
premature compaction and underutilization of available context.
Relates to #39
Co-Authored-By: Juan Camilo <juancamilo.auriti@gmail.com>
Without this fix, getContextWindowForModel() returns 200k for all OpenAI
models (the Claude default), causing two problems:
1. Auto-compact/warnings trigger at wrong thresholds (200k instead of 128k)
2. getModelMaxOutputTokens() returns 32k, causing 400 errors from APIs that
   cap output tokens lower (gpt-4o supports at most 16,384)
Fix:
- Add openaiContextWindows.ts with known context window sizes and max output
token limits for 30+ OpenAI-compatible models (OpenAI, DeepSeek, Groq,
Mistral, Ollama, LM Studio)
- Hook into getContextWindowForModel() so correct input limits are used
- Hook into getModelMaxOutputTokens() so correct output limits are sent,
preventing 400 "max_tokens is too large" errors
All existing warning, blocking, and auto-compact infrastructure works
automatically once the correct limits are returned.
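Illustrative wiring of the two hooks, with each table trimmed to one entry;
the 200k/32k Claude defaults and the gpt-4o numbers match the text above,
the rest is a sketch:

```typescript
const OPENAI_CONTEXT_WINDOWS: Record<string, number> = { 'gpt-4o': 128_000 };
const OPENAI_MAX_OUTPUT_TOKENS: Record<string, number> = { 'gpt-4o': 16_384 };

function getContextWindowForModel(model: string): number {
  // Consult the OpenAI-compatible table before the 200k Claude default.
  return OPENAI_CONTEXT_WINDOWS[model] ?? 200_000;
}

function getModelMaxOutputTokens(model: string): number {
  // Sending the 32k Claude default to gpt-4o (capped at 16,384) is what
  // produced the 400 "max_tokens is too large" errors.
  return OPENAI_MAX_OUTPUT_TOKENS[model] ?? 32_000;
}
```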
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>