Could you provide a minimal reproducible example of the code you're running?
On top of that, I would suggest the following:
- Run the training from a Python script instead of a Jupyter notebook, writing the output to a file instead of printing to standard output, and check whether the issue persists.
- Run the training on a reduced version of the dataset: this way you should reach the same number of epochs in less time. If the problem shows up at the same epoch as before (that is, sooner in wall-clock time), it is probably related to the number of epochs; if it shows up after the same elapsed time as before, it is probably related to the running time.
- Run the training on a different machine (for example your laptop) to check whether the problem is related to the remote machine you're running on. If the training is too computationally heavy for your laptop, you can use a reduced version of the dataset here as well.
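
To make the first two suggestions concrete, here is a rough sketch of a standalone script that trains on a random subset and logs each epoch, with elapsed time, to a file. Since I haven't seen your code, everything here is an assumption: `train_one_epoch`, `make_subset`, `run`, and the `training.log` path are hypothetical names, and the placeholder training step should be swapped for your real one.

```python
import logging
import random
import time


def make_subset(dataset, fraction, seed=0):
    """Return a reproducible random subset holding `fraction` of the items."""
    items = list(dataset)
    rng = random.Random(seed)
    size = min(len(items), max(1, int(len(items) * fraction)))
    return rng.sample(items, size)


def train_one_epoch(data):
    """Hypothetical placeholder: replace with your real training step."""
    return sum(data) / len(data)


def run(dataset, epochs, fraction=0.1, log_path="training.log"):
    """Train on a subset and write one log line per epoch to `log_path`."""
    logger = logging.getLogger("training")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep output in the file only
    handler = logging.FileHandler(log_path, mode="w")
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)

    subset = make_subset(dataset, fraction)
    start = time.monotonic()
    try:
        for epoch in range(1, epochs + 1):
            metric = train_one_epoch(subset)
            # Log both the epoch and the elapsed time, so a failure can be
            # matched against either the epoch count or the running time.
            logger.info(
                "epoch=%d elapsed=%.1fs metric=%.4f",
                epoch, time.monotonic() - start, metric,
            )
    except Exception:
        # The full traceback ends up in the log file.
        logger.exception("training failed at epoch %d", epoch)
        raise
    finally:
        logger.removeHandler(handler)
        handler.close()
    return len(subset)


if __name__ == "__main__":
    # Toy data standing in for the real dataset.
    run(list(range(1000)), epochs=5, fraction=0.1)
```

You could save this as a `.py` file and run it from a terminal. If the run dies, the last lines of the log file tell you both the epoch and the elapsed time at which it happened, which you can compare with the full-dataset run.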