orcs-code

Author	SHA1	Message	Date
Juan Camilo Auriti	60d3d8961a	fix: add missing o1-series and Ollama models to context window table (#250) Models not in the lookup table fall through to a 200k default, causing auto-compact to never trigger for models with smaller actual context windows. Users hit hard context_window_exceeded errors instead. Added to both context window and max output token tables: - o1, o1-mini, o1-preview, o1-pro (OpenAI reasoning models) - llama3.2:1b, qwen3:8b, codestral (common Ollama models) Relates to #248	2026-04-06 06:39:24 +08:00
Juan Camilo	b65921e8c3	fix: deterministic prefix matching and correct Llama 3.x context windows Two fixes in openaiContextWindows.ts: 1. Sort lookup keys by length descending in lookupByModel() so the most specific prefix always wins. Without this, 'gpt-4-turbo-preview' could match 'gpt-4' (8k) instead of 'gpt-4-turbo' (128k) depending on V8's object key iteration order. 2. Update Llama 3.1/3.2/3.3 context windows from 8,192 to 128,000. These models support 128k context natively (Meta official specs). The previous 8k value was Ollama's default num_ctx, not the model's actual capability, causing premature auto-compact warnings.	2026-04-02 15:50:52 +02:00
Juan Camilo	f385740bd6	fix: use isEnvTruthy() for provider detection in context window lookup Replace raw === '1' || === 'true' comparisons with isEnvTruthy() in context.ts for consistency with getAPIProvider() in providers.ts. This also covers the newly added CLAUDE_CODE_USE_GITHUB provider. Add native Gemini model entries (without google/ prefix) to both context window and max output token tables. Corrects gemini-2.5-pro and gemini-2.5-flash max output tokens to 65,536 (was 8,192/32,768).	2026-04-02 14:43:03 +02:00
Kevin Codex	1ce19b9a39	Merge pull request #59 from Vasanthdev2004/gpt4o-max-tokens-test test: cover OpenAI max token caps for gpt-4o and GPT-5.4	2026-04-02 08:24:25 +08:00
Vasanthdev2004	f0f6f1b285	test: add GPT-5.4 token coverage	2026-04-01 22:07:56 +05:30
Juan Camilo	39d9616ed7	fix: update DeepSeek context window from 64k to 128k DeepSeek V3 documentation specifies 128k context window for both deepseek-chat and deepseek-reasoner. The previous 64k value caused premature compaction and underutilization of available context. Relates to #39 Co-Authored-By: Juan Camilo <juancamilo.auriti@gmail.com>	2026-04-01 17:03:57 +02:00
gnanam1990	4ca94b2454	feat: add context window guard for OpenAI-compatible models Without this fix, getContextWindowForModel() returns 200k for all OpenAI models (the Claude default), causing two problems: 1. Auto-compact/warnings trigger at wrong thresholds (200k instead of 128k) 2. getModelMaxOutputTokens() returns 32k causing 400 errors from APIs that cap output tokens lower (gpt-4o supports max 16384) Fix: - Add openaiContextWindows.ts with known context window sizes and max output token limits for 30+ OpenAI-compatible models (OpenAI, DeepSeek, Groq, Mistral, Ollama, LM Studio) - Hook into getContextWindowForModel() so correct input limits are used - Hook into getModelMaxOutputTokens() so correct output limits are sent, preventing 400 "max_tokens is too large" errors All existing warning, blocking, and auto-compact infrastructure works automatically once the correct limits are returned. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-01 17:42:04 +05:30

7 Commits
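
The lookup behaviour these commits describe (longest matching prefix wins, unknown models fall through to the 200k default) can be sketched roughly as follows. This is an illustration only, not the code in openaiContextWindows.ts: the table contents, the constant names, and the functions `lookupContextWindow` and `contextWindowFor` are hypothetical, and the numeric values are the ones quoted in the commit messages above.

```typescript
// Hypothetical subset of a context window table; the real table in
// openaiContextWindows.ts is not shown in this log.
const CONTEXT_WINDOWS: Record<string, number> = {
  'gpt-4': 8192,
  'gpt-4-turbo': 128000,
  'deepseek-chat': 128000,
  'llama3.2': 128000,
};

// The fallback mentioned in the commit messages (the Claude default).
const DEFAULT_CONTEXT_WINDOW = 200000;

// Sort keys longest-first once, so the most specific prefix is always
// tested before any shorter prefix it contains.
const KEYS_BY_LENGTH: string[] = Object.keys(CONTEXT_WINDOWS).sort(
  (a, b) => b.length - a.length,
);

// Returns the window for the longest key that prefixes the model name,
// or undefined when no key matches.
function lookupContextWindow(model: string): number | undefined {
  const name = model.toLowerCase();
  const key = KEYS_BY_LENGTH.find((k) => name.startsWith(k));
  return key === undefined ? undefined : CONTEXT_WINDOWS[key];
}

// Applies the default only when the table has no matching entry.
function contextWindowFor(model: string): number {
  return lookupContextWindow(model) ?? DEFAULT_CONTEXT_WINDOW;
}
```

With this ordering, `contextWindowFor('gpt-4-turbo-preview')` resolves through the `gpt-4-turbo` entry rather than `gpt-4`, regardless of the order in which the keys were declared, while a model with no entry still receives the 200k default.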