fix: report cached tokens from OpenAI prompt_tokens_details
OpenAI returns cached token counts in usage.prompt_tokens_details.cached_tokens, but the shim hardcoded cache_read_input_tokens to 0. This made prompt caching invisible to the cost tracker and session summary even when OpenAI's automatic caching was actively reducing costs.

Changes:
- Extend the OpenAIStreamChunk usage interface with prompt_tokens_details
- Map cached_tokens to cache_read_input_tokens in convertChunkUsage()
- Apply the same fix in _convertNonStreamingResponse() for the non-streaming path
- Leave cache_creation_input_tokens at 0 (OpenAI auto-caching is free and automatic, so there is no cache-creation cost)
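The mapping described above can be sketched as a standalone helper. This is a simplified illustration, not the shim's actual code: the interface names OpenAIUsage and AnthropicStyleUsage and the function convertUsage are hypothetical, though the field names follow the OpenAI usage payload and the Anthropic-style usage shape the shim emits.

```typescript
// Simplified shapes for illustration only.
interface OpenAIUsage {
  prompt_tokens?: number
  completion_tokens?: number
  prompt_tokens_details?: { cached_tokens?: number }
}

interface AnthropicStyleUsage {
  input_tokens: number
  output_tokens: number
  cache_creation_input_tokens: number
  cache_read_input_tokens: number
}

function convertUsage(usage: OpenAIUsage): AnthropicStyleUsage {
  return {
    input_tokens: usage.prompt_tokens ?? 0,
    output_tokens: usage.completion_tokens ?? 0,
    // OpenAI auto-caching has no creation cost, so this stays 0.
    cache_creation_input_tokens: 0,
    // Previously hardcoded to 0; now read from prompt_tokens_details,
    // falling back to 0 when the field is absent.
    cache_read_input_tokens: usage.prompt_tokens_details?.cached_tokens ?? 0,
  }
}

const u = convertUsage({
  prompt_tokens: 1200,
  completion_tokens: 50,
  prompt_tokens_details: { cached_tokens: 1024 },
})
console.log(u.cache_read_input_tokens) // 1024
```

The optional-chaining-plus-nullish-coalescing fallback matters: older or non-OpenAI backends served through the same shim may omit prompt_tokens_details entirely, and the conversion must still report 0 rather than throw.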
@@ -376,6 +376,9 @@ interface OpenAIStreamChunk {
     prompt_tokens?: number
     completion_tokens?: number
     total_tokens?: number
+    prompt_tokens_details?: {
+      cached_tokens?: number
+    }
   }
 }
 
@@ -392,7 +395,7 @@ function convertChunkUsage(
     input_tokens: usage.prompt_tokens ?? 0,
     output_tokens: usage.completion_tokens ?? 0,
     cache_creation_input_tokens: 0,
-    cache_read_input_tokens: 0,
+    cache_read_input_tokens: usage.prompt_tokens_details?.cached_tokens ?? 0,
   }
 }
 
@@ -920,6 +923,9 @@ class OpenAIShimMessages {
     usage?: {
       prompt_tokens?: number
       completion_tokens?: number
+      prompt_tokens_details?: {
+        cached_tokens?: number
+      }
     }
   },
   model: string,
@@ -985,7 +991,7 @@ class OpenAIShimMessages {
       input_tokens: data.usage?.prompt_tokens ?? 0,
       output_tokens: data.usage?.completion_tokens ?? 0,
       cache_creation_input_tokens: 0,
-      cache_read_input_tokens: 0,
+      cache_read_input_tokens: data.usage?.prompt_tokens_details?.cached_tokens ?? 0,
     },
   }
 }