fix: prevent duplicate responses in OpenAI streaming
When certain OpenAI-compatible APIs (LM Studio, some proxies) send multiple stream chunks with finish_reason set, the finish block ran multiple times, emitting content_block_stop and message_delta for each one. Each content_block_stop caused claude.ts to create and yield a new assistant message, making every response appear twice in the UI.

Fix: add a hasProcessedFinishReason flag (same pattern as the existing hasEmittedFinalUsage flag) so the finish block only executes once per response, regardless of how many chunks contain finish_reason.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -292,6 +292,7 @@ async function* openaiStreamToAnthropic(
   let hasEmittedContentStart = false
   let lastStopReason: 'tool_use' | 'max_tokens' | 'end_turn' | null = null
   let hasEmittedFinalUsage = false
+  let hasProcessedFinishReason = false

   // Emit message_start
   yield {
@@ -422,8 +423,11 @@ async function* openaiStreamToAnthropic(
       }
     }

-    // Finish
-    if (choice.finish_reason) {
+    // Finish — guard ensures we only process finish_reason once even if
+    // multiple chunks arrive with finish_reason set (some providers do this)
+    if (choice.finish_reason && !hasProcessedFinishReason) {
+      hasProcessedFinishReason = true
+
       // Close any open content blocks
       if (hasEmittedContentStart) {
         yield {
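The once-guard pattern this commit applies can be sketched in isolation. The snippet below is illustrative only, not the actual openaiStreamToAnthropic code: the Chunk shape, event strings, and fakeStream are simplified stand-ins, but the guard logic mirrors the patch — the finish branch runs at most once even when several chunks carry finish_reason.

```typescript
// Simplified stand-in for an OpenAI-style stream chunk (hypothetical shape).
type Chunk = { content?: string; finish_reason?: string }

// Translate chunks into events; without the guard, every chunk carrying
// finish_reason would emit a second content_block_stop / message_delta pair.
async function* toEvents(chunks: AsyncIterable<Chunk>): AsyncGenerator<string> {
  let hasProcessedFinishReason = false
  for await (const chunk of chunks) {
    if (chunk.content) yield `delta:${chunk.content}`
    if (chunk.finish_reason && !hasProcessedFinishReason) {
      hasProcessedFinishReason = true // finish block runs exactly once
      yield 'content_block_stop'
      yield 'message_delta'
    }
  }
}

// Simulates a provider (e.g. LM Studio) that sets finish_reason on
// more than one chunk.
async function* fakeStream(): AsyncGenerator<Chunk> {
  yield { content: 'hi' }
  yield { finish_reason: 'stop' }
  yield { finish_reason: 'stop' }
}
```

Consuming fakeStream through toEvents yields one delta event and a single stop/delta pair, where the unguarded version would have emitted the stop pair twice.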