Two fixes in `openaiContextWindows.ts`:

1. Sort lookup keys by length descending in `lookupByModel()` so the most specific prefix always wins. Without this, `gpt-4-turbo-preview` could match `gpt-4` (8k) instead of `gpt-4-turbo` (128k), depending on V8's object key iteration order.
2. Update the Llama 3.1/3.2/3.3 context windows from 8,192 to 128,000. These models support 128k context natively (per Meta's official specs). The previous 8k value was Ollama's default `num_ctx`, not the models' actual capability, and it caused premature auto-compact warnings.
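A minimal sketch of the longest-prefix-wins lookup described in fix 1. The table entries and the `lookupByModel` signature here are illustrative assumptions, not the actual contents of `openaiContextWindows.ts`:

```typescript
// Hypothetical subset of the context-window table; the real file has
// many more entries, including the updated 128k Llama 3.x values.
const CONTEXT_WINDOWS: Record<string, number> = {
  "gpt-4": 8_192,
  "gpt-4-turbo": 128_000,
  "llama3.1": 128_000,
};

function lookupByModel(model: string): number | undefined {
  // Sort keys longest-first so the most specific prefix wins,
  // independent of object key insertion/iteration order.
  const keys = Object.keys(CONTEXT_WINDOWS).sort(
    (a, b) => b.length - a.length,
  );
  const match = keys.find((key) => model.startsWith(key));
  return match !== undefined ? CONTEXT_WINDOWS[match] : undefined;
}

console.log(lookupByModel("gpt-4-turbo-preview")); // 128000, not 8192
console.log(lookupByModel("llama3.1:8b")); // 128000
```

Without the explicit sort, `Object.keys` iteration order depends on insertion order, so whichever of `gpt-4` and `gpt-4-turbo` was inserted first would win the prefix match.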