I would suggest using QLoRA for fine-tuning, together with a well-defined format for the fine-tuning data, such as:
{"messages": [{"role": "system", "content": "..."}, {"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}
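To make that format concrete, here is a minimal sketch that builds one record, checks it, and serializes it as a JSONL line. It uses only the standard library. The helper names (`make_record`, `validate_record`, `to_jsonl`) are made up for illustration, and it assumes the reply role is called `assistant`, as in most chat templates; adjust `ALLOWED_ROLES` if your template uses a different name.

```python
import json

# Assumption: the usual chat roles; change this set if your template differs.
ALLOWED_ROLES = {"system", "user", "assistant"}


def make_record(system, user, assistant):
    """Build one training example in the chat-style messages format."""
    return {
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
            {"role": "assistant", "content": assistant},
        ]
    }


def validate_record(record):
    """Raise ValueError if the record does not follow the expected layout."""
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        raise ValueError("record needs a non-empty 'messages' list")
    for msg in messages:
        if not isinstance(msg, dict) or set(msg) != {"role", "content"}:
            raise ValueError("each message needs exactly 'role' and 'content'")
        if msg["role"] not in ALLOWED_ROLES:
            raise ValueError(f"unknown role: {msg['role']!r}")
        if not isinstance(msg["content"], str):
            raise ValueError("content must be a string")
    if messages[-1]["role"] != "assistant":
        raise ValueError("last message should be the assistant reply")


def to_jsonl(records):
    """Validate every record and return one JSON object per line."""
    for record in records:
        validate_record(record)
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

Running every example through a check like this before training tends to catch malformed records (missing keys, misspelled roles) early, which is usually cheaper than debugging a bad fine-tuning run.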
Also use a suitable optimizer during fine-tuning, such as AdamW.
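As a rough sketch of how QLoRA and AdamW might be configured with Hugging Face `transformers` and `peft` (assuming those libraries, plus `bitsandbytes` and `accelerate`, are installed): every value below, including the rank, learning rate, `target_modules` names, and output directory, is an illustrative assumption rather than a setting tuned for your model.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig, TrainingArguments

# QLoRA part 1: load the frozen base model in 4-bit NF4 precision.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumes bf16-capable hardware
)

# QLoRA part 2: train small low-rank adapters on top of the quantized weights.
lora_config = LoraConfig(
    r=16,                                  # assumed adapter rank
    lora_alpha=32,                         # assumed scaling factor
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],   # assumed names; depends on the model
)

# Optimizer: AdamW is selected through the `optim` argument.
training_args = TrainingArguments(
    output_dir="qlora-out",                # hypothetical output directory
    optim="adamw_torch",                   # "paged_adamw_8bit" is a memory-saving variant
    learning_rate=2e-4,                    # assumed starting point
    weight_decay=0.01,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    num_train_epochs=1,
)
```

In a typical setup, `bnb_config` would be passed as `quantization_config` to `AutoModelForCausalLM.from_pretrained`, `lora_config` to `peft.get_peft_model`, and `training_args` to the trainer you use.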
I could provide a more detailed solution if you share your fine-tuning approach.