The problem was in my decoding function. I'm not sure exactly what was wrong, but I wrote my own tokenizer instead of using tiktoken, and that fixed it.
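
For anyone hitting something similar, here is a minimal sketch of what a hand-rolled tokenizer can look like. This is not my exact code: the character-level scheme, the `CharTokenizer` name, and the sample corpus are all assumptions for illustration. The useful part is the round-trip check at the end, which is a quick way to see whether a decode function is the culprit.

```python
class CharTokenizer:
    """Hypothetical character-level tokenizer (illustrative only)."""

    def __init__(self, text):
        # Build the vocabulary from the unique characters in the training text.
        chars = sorted(set(text))
        self.stoi = {ch: i for i, ch in enumerate(chars)}
        self.itos = {i: ch for ch, i in self.stoi.items()}

    def encode(self, s):
        # Map each character to its integer id; raises KeyError on unseen characters.
        return [self.stoi[ch] for ch in s]

    def decode(self, ids):
        # Map ids back to characters and join them into a string.
        return "".join(self.itos[i] for i in ids)


corpus = "hello world"  # placeholder corpus, assumed for this example
tok = CharTokenizer(corpus)

# Round-trip check: decoding the encoded text should return the original string.
ids = tok.encode("hello")
assert tok.decode(ids) == "hello"
```

If the round trip fails on your own text, the bug is in encode/decode (or the vocabulary mapping) rather than in the model.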