orcs-code

orcs-oss/orcs-code

Fork 0

Commit Graph

Author	SHA1	Message	Date
ArkhAngelLifeJiggy	92d297e50e	feat: context preloading and hybrid context strategy (#860 ) * feat: context preloading and hybrid context strategy PR 2D - Section 2.7, 2.8: - Add contextPreload.ts with pattern-based prediction - Add hybridContextStrategy.ts with cache/fresh balancing - Optimize for cost vs accuracy - Add comprehensive tests (13 passing) * feat: wire hybrid context strategy into API path - Apply hybrid strategy after normalizeMessagesForAPI - Feature-flag controlled (HYBRID_CONTEXT_STRATEGY) - Optimizes cache/fresh balance for API requests * fix: resolve PR 2D blocking issues - Fix predictContextNeeds self-assign bug (matchedCategory = category) - Add test for non-empty predictedNeed - Preserve conversation tail in hybridStrategy (never drop last 3 messages) - Add comment for hardcoded 200k cap in claude.ts Fixes reviewer feedback from gnanam1990 and Vasanthdev2004 * fix: preserve tool_use/tool_result chains in hybridStrategy - Increase MIN_TAIL to 5 (tool_use -> tool_result -> assistant -> user -> next) - Add getMessageChain() to preserve paired messages - Chains kept together in final selection * fix: PR 860 - tool_use/tool_result pairing and safe token counting Blocking: - getMessageChain() now pairs by tool_use.id (block ID) not msg.message.id - Find tool_use blocks by id, pair with tool_result having matching tool_use_id - Fixes tool_result surviving while paired tool_use dropped - Token counting now includes array content (tool_use, tool_result, thinking) - Not just string content, prevents undercounting prompt size - Deduplicate messages by UUID when combining chains + split + tail - Prevents duplicate messages in final request Non-blocking: - Add regression test for tool_use/tool_result pairing * fix: PR 860 - account for actual structured payload size in token counting Blocking: - getMessageTokenCount now calculates actual token count for structured blocks - tool_use: uses JSON.stringify(input).length / 4 + base - tool_result: counts actual content (string or array of text blocks) - thinking: counts actual thinking text length / 4 - is_error flag adds small overhead Non-blocking: - Add tests for large tool_use input and large thinking blocks	2026-04-29 15:49:46 +08:00

Author

SHA1

Message

Date

ArkhAngelLifeJiggy

92d297e50e

feat: context preloading and hybrid context strategy (#860 )

* feat: context preloading and hybrid context strategy

PR 2D - Section 2.7, 2.8:
- Add contextPreload.ts with pattern-based prediction
- Add hybridContextStrategy.ts with cache/fresh balancing
- Optimize for cost vs accuracy
- Add comprehensive tests (13 passing)

* feat: wire hybrid context strategy into API path

- Apply hybrid strategy after normalizeMessagesForAPI
- Feature-flag controlled (HYBRID_CONTEXT_STRATEGY)
- Optimizes cache/fresh balance for API requests

* fix: resolve PR 2D blocking issues

- Fix predictContextNeeds self-assign bug (matchedCategory = category)
- Add test for non-empty predictedNeed
- Preserve conversation tail in hybridStrategy (never drop last 3 messages)
- Add comment for hardcoded 200k cap in claude.ts

Fixes reviewer feedback from gnanam1990 and Vasanthdev2004

* fix: preserve tool_use/tool_result chains in hybridStrategy

- Increase MIN_TAIL to 5 (tool_use -> tool_result -> assistant -> user -> next)
- Add getMessageChain() to preserve paired messages
- Chains kept together in final selection

* fix: PR 860 - tool_use/tool_result pairing and safe token counting

Blocking:
- getMessageChain() now pairs by tool_use.id (block ID) not msg.message.id
- Find tool_use blocks by id, pair with tool_result having matching tool_use_id
- Fixes tool_result surviving while paired tool_use dropped

- Token counting now includes array content (tool_use, tool_result, thinking)
- Not just string content, prevents undercounting prompt size

- Deduplicate messages by UUID when combining chains + split + tail
- Prevents duplicate messages in final request

Non-blocking:
- Add regression test for tool_use/tool_result pairing

* fix: PR 860 - account for actual structured payload size in token counting

Blocking:
- getMessageTokenCount now calculates actual token count for structured blocks
- tool_use: uses JSON.stringify(input).length / 4 + base
- tool_result: counts actual content (string or array of text blocks)
- thinking: counts actual thinking text length / 4
- is_error flag adds small overhead

Non-blocking:
- Add tests for large tool_use input and large thinking blocks

2026-04-29 15:49:46 +08:00

1 Commits