79105470

Date: 2024-10-19 16:59:29
Score: 4
Natty:
Report link

I'm facing exactly the same issue. The number of tokens consumed doesn't make sense, even following the logic of RAG, where the total should be the tokens of the system prompt + user prompt + retrieved chunks. In my case, I'm retrieving 3 chunks of 256 tokens each and sending a very short prompt, yet it results in ~3k tokens. More details here: https://learn.microsoft.com/en-us/answers/questions/2103832/high-token-consumption-in-azure-openai-with-your-d.

I'm wondering whether additional steps are running under the hood, as the rough estimate below suggests the reported usage is several times higher than expected.
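
For reference, here is a minimal sketch of how I'd estimate the expected prompt token count, assuming tiktoken with the cl100k_base encoding; the prompt and chunk texts are placeholders, not the actual content:

```python
# Rough estimate of expected prompt tokens for a RAG call.
# Assumptions: tiktoken with the cl100k_base encoding; placeholder texts.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    return len(enc.encode(text))

system_prompt = "You are an assistant that answers using the provided context."
user_prompt = "Short question about the indexed documents."
chunks = ["<retrieved chunk 1>", "<retrieved chunk 2>", "<retrieved chunk 3>"]

expected = (
    count_tokens(system_prompt)
    + count_tokens(user_prompt)
    + sum(count_tokens(c) for c in chunks)
)
print(f"Expected prompt tokens: ~{expected}")
# With 3 chunks of ~256 tokens each plus short prompts, this should stay
# well under 1k tokens, far below the ~3k reported by the API.
```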

Reasons:
  • No code block (0.5):
  • Me too answer (2.5): I'm facing exactly the same issue
  • Low reputation (1):
Posted by: Filipa Castro