orcs-code

Author	SHA1	Message	Date
lunamonke	b0d9fe7112	Provider loading fix (#623) * add mistral and gemini provider type for profile provider field * load latest locally selected * env variables take precedence over json save * add gemini context windows and fix gemini defaulting for env * load on startup fix * fix failing tests * clarify test message * fix variable mismatches * fix failing test * delete keys and set profile.apiKey for mistral and gemini * switch model as well when switching provider * set model when adding a new model	2026-04-18 01:46:20 +08:00
regisksc	43ac6dba75	feat: add Alibaba Coding Plan (DashScope) provider support (#509) * feat: add Alibaba Coding Plan provider presets * fix: add DashScope presets to ProviderManager UI selection list * feat: read DASHSCOPE_API_KEY env var for DashScope provider presets * adds regression testing for alibaba models * docs: add time descriptive comment * feat(dashscope): add qwen3.6-plus model support * fix(dashscope): remove MiniMax-M2.5 entries to prevent future key conflicts	2026-04-17 19:06:21 +08:00
ArkhAngelLifeJiggy	51191d6132	feat: add NVIDIA NIM and MiniMax provider support (#552) * feat: add NVIDIA NIM and MiniMax provider support - Add nvidia-nim and minimax to --provider CLI flag - Add model discovery for NVIDIA NIM (160+ models) and MiniMax - Update /model picker to show provider-specific models - Fix provider detection in startup banner - Update .env.example with new provider options Supported providers: - NVIDIA NIM: https://integrate.api.nvidia.com/v1 - MiniMax: https://api.minimax.io/v1 * fix: resolve conflict in StartupScreen (keep NVIDIA/MiniMax + add Codex detection) * fix: resolve providerProfile conflict (add imports from main, keep NVIDIA/MiniMax) * fix: revert providerSecrets to match main (NVIDIA/MiniMax handled elsewhere) * fix: add context window entries for NVIDIA NIM and new MiniMax models * fix: use GLM-5 as NVIDIA NIM default and MiniMax-M2.5 for consistency * fix: address remaining review items - add GLM/Kimi context entries, max output tokens, fix .env.example, revert to Nemotron default * fix: filter NVIDIA NIM picker to chat/instruct models only, set provider-specific API keys from saved profiles * chore: add more NVIDIA NIM context window entries for popular models * fix: address remaining non-blocking items - fix base model, clear provider API keys on profile switch	2026-04-15 20:26:13 +08:00
Vasanth T	aeaa658f77	fix: prevent infinite auto-compact loop for unknown 3P models (#635) (#636) - Raise context window fallback from 8k to 128k for unknown OpenAI-compat models. The 8k fallback caused effective context (8k minus output reservation) to go negative, making auto-compact fire on every single message. - Add safety floor in getEffectiveContextWindowSize(): effective context is always at least reservedTokensForSummary + 13k buffer, ensuring the auto-compact threshold stays positive. - Add missing MiniMax model entries (M2.5, M2.5-highspeed, M2.1, M2.1-highspeed) all at 204,800 context / 131,072 max output per MiniMax docs. - Add tests for MiniMax variants, 128k fallback, and autoCompact floor. Fixes #635 Co-authored-by: root <root@vm7508.lumadock.com>	2026-04-13 02:03:02 +08:00
Nourrisse Florian	2e0e14d713	fix: add LiteLLM-style aliases for GitHub Copilot context windows (#606) The OPENAI_CONTEXT_WINDOWS/OPENAI_MAX_OUTPUT_TOKENS tables only contained the `github:copilot:<model>` namespaced form used when talking directly to Copilot via /onboard-github. When OpenClaude is pointed at a LiteLLM proxy (which routes Copilot using the standard `github_copilot/<model>` convention), the lookup missed and fell back to the conservative 8k default, causing the compaction loop to fire repeatedly on every tick and blocking requests before they left the client with repeated "not in context window table" warnings on stderr. Mirror the 11 active Copilot models with LiteLLM-style keys in both tables. No behavior change for users of /onboard-github since namespaced entries remain untouched and `lookupByKey` picks exact matches first.	2026-04-12 21:10:17 +08:00
lunamonke	4c50977f3c	Decouple and fix mistral (#595) * decouple and fix mistral * fix wrong variable for currentBaseUrl and buildAPIProviderProperties	2026-04-12 15:26:14 +08:00
Zartris	a7f5982f64	fix: add GitHub Copilot model context windows and output limits (#576) Add context_window and max_output_tokens entries for all models available through the GitHub Copilot proxy (Claude, GPT, Gemini, Grok), sourced from https://api.githubcopilot.com/models. Models are namespaced as "github:copilot:<model>" to avoid collisions with the same model names served by other providers (which may have different limits). A new lookupByKey() helper and qualified-key lookup in lookupByModel() ensures the correct limits are selected when OPENAI_MODEL=github:copilot. Without this, Claude models on Copilot would use default context/output limits that may not match the proxy's actual constraints, causing 400 errors like "max_tokens is too large". Related: #515 Co-authored-by: Zartris <14197299+Zartris@users.noreply.github.com>	2026-04-10 22:00:26 +08:00
Kevin Codex	69ea1f1e4a	fix: restore default context window for unknown 3p models (#494) * fix: restore default context window for unknown 3p models * fix: add MiniMax context metadata	2026-04-08 02:45:49 +08:00
Juan Camilo Auriti	60d3d8961a	fix: add missing o1-series and Ollama models to context window table (#250) Models not in the lookup table fall through to a 200k default, causing auto-compact to never trigger for models with smaller actual context windows. Users hit hard context_window_exceeded errors instead. Added to both context window and max output token tables: - o1, o1-mini, o1-preview, o1-pro (OpenAI reasoning models) - llama3.2:1b, qwen3:8b, codestral (common Ollama models) Relates to #248	2026-04-06 06:39:24 +08:00
Juan Camilo	b65921e8c3	fix: deterministic prefix matching and correct Llama 3.x context windows Two fixes in openaiContextWindows.ts: 1. Sort lookup keys by length descending in lookupByModel() so the most specific prefix always wins. Without this, 'gpt-4-turbo-preview' could match 'gpt-4' (8k) instead of 'gpt-4-turbo' (128k) depending on V8's object key iteration order. 2. Update Llama 3.1/3.2/3.3 context windows from 8,192 to 128,000. These models support 128k context natively (Meta official specs). The previous 8k value was Ollama's default num_ctx, not the model's actual capability, causing premature auto-compact warnings.	2026-04-02 15:50:52 +02:00
Juan Camilo	f385740bd6	fix: use isEnvTruthy() for provider detection in context window lookup Replace raw === '1' || === 'true' comparisons with isEnvTruthy() in context.ts for consistency with getAPIProvider() in providers.ts. This also covers the newly added CLAUDE_CODE_USE_GITHUB provider. Add native Gemini model entries (without google/ prefix) to both context window and max output token tables. Corrects gemini-2.5-pro and gemini-2.5-flash max output tokens to 65,536 (was 8,192/32,768).	2026-04-02 14:43:03 +02:00
Kevin Codex	1ce19b9a39	Merge pull request #59 from Vasanthdev2004/gpt4o-max-tokens-test test: cover OpenAI max token caps for gpt-4o and GPT-5.4	2026-04-02 08:24:25 +08:00
Vasanthdev2004	f0f6f1b285	test: add GPT-5.4 token coverage	2026-04-01 22:07:56 +05:30
Juan Camilo	39d9616ed7	fix: update DeepSeek context window from 64k to 128k DeepSeek V3 documentation specifies 128k context window for both deepseek-chat and deepseek-reasoner. The previous 64k value caused premature compaction and underutilization of available context. Relates to #39 Co-Authored-By: Juan Camilo <juancamilo.auriti@gmail.com>	2026-04-01 17:03:57 +02:00
gnanam1990	4ca94b2454	feat: add context window guard for OpenAI-compatible models Without this fix, getContextWindowForModel() returns 200k for all OpenAI models (the Claude default), causing two problems: 1. Auto-compact/warnings trigger at wrong thresholds (200k instead of 128k) 2. getModelMaxOutputTokens() returns 32k causing 400 errors from APIs that cap output tokens lower (gpt-4o supports max 16384) Fix: - Add openaiContextWindows.ts with known context window sizes and max output token limits for 30+ OpenAI-compatible models (OpenAI, DeepSeek, Groq, Mistral, Ollama, LM Studio) - Hook into getContextWindowForModel() so correct input limits are used - Hook into getModelMaxOutputTokens() so correct output limits are sent, preventing 400 "max_tokens is too large" errors All existing warning, blocking, and auto-compact infrastructure works automatically once the correct limits are returned. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-01 17:42:04 +05:30
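
Several of the commits above (b65921e8c3, a7f5982f64, 2e0e14d713) describe the lookup strategy: try an exact key first, then fall back to prefix matching with keys sorted longest-first so the most specific entry wins. The following is a minimal sketch of that idea, not the code in openaiContextWindows.ts. The table name, the function name, and the exact numeric values are assumptions for illustration; only the gpt-4 (8k) versus gpt-4-turbo (128k) relationship comes from the commit message.

```typescript
// Hypothetical table for illustration; the real tables hold many more entries.
const SAMPLE_CONTEXT_WINDOWS: Record<string, number> = {
  "gpt-4": 8192, // "8k" per commit b65921e8c3 (exact figure assumed)
  "gpt-4-turbo": 128000, // "128k" per commit b65921e8c3 (exact figure assumed)
};

// Hypothetical helper: exact match first, then longest matching prefix.
function lookupContextWindow(
  model: string,
  table: Record<string, number> = SAMPLE_CONTEXT_WINDOWS,
): number | undefined {
  // Exact matches take priority over any prefix match.
  if (Object.prototype.hasOwnProperty.call(table, model)) {
    return table[model];
  }
  // Sort keys by length descending so "gpt-4-turbo" is tested before "gpt-4",
  // independent of object key iteration order.
  const keys = Object.keys(table).sort((a, b) => b.length - a.length);
  for (const key of keys) {
    if (model.startsWith(key)) {
      return table[key];
    }
  }
  // Unknown model: the caller decides which fallback to apply.
  return undefined;
}
```

With this ordering, "gpt-4-turbo-preview" resolves to the gpt-4-turbo entry rather than the shorter gpt-4 entry, which is the failure mode commit b65921e8c3 describes.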

15 Commits
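
Commit aeaa658f77 describes a safety floor: the effective context window must never drop below reservedTokensForSummary plus a 13k buffer, so the auto-compact threshold stays positive. A rough sketch of that arithmetic follows. It is not the body of getEffectiveContextWindowSize(); the function name, the constant name, the value 13000 for "13k", and the assumption that the output reservation equals reservedTokensForSummary are all illustrative.

```typescript
// Assumed value for the "13k buffer" named in the commit message.
const ASSUMED_BUFFER_TOKENS = 13000;

// Hypothetical helper: subtract the reservation, then clamp to the floor.
function effectiveContextWindowSketch(
  contextWindow: number,
  reservedTokensForSummary: number,
): number {
  // Without a floor, a small window minus a large reservation goes negative.
  const raw = contextWindow - reservedTokensForSummary;
  // Floor described in the commit: reservation plus buffer.
  const floor = reservedTokensForSummary + ASSUMED_BUFFER_TOKENS;
  return Math.max(raw, floor);
}
```

For example, with an 8,192-token window and a hypothetical 20,000-token reservation, the unclamped value would be negative, and the floor returns 33,000 instead. A 128,000-token window with the same reservation is unaffected and yields 108,000.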