79670503

Date: 2025-06-18 11:04:27
Score: 2
Natty:
Report link

If the model hasn't been trained enough, it may collapse to repeating a single token. During inference, if the decoder input isn't updated with the newly predicted token at each time step, the model will keep predicting the same token. Without attention, it is also harder for the model to learn long-range dependencies, especially in vanilla encoder-decoder setups.
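The second point can be illustrated with a minimal sketch. The `toy_decoder` below is a hypothetical stand-in for a trained decoder (a real seq2seq model would consume the whole prefix through learned weights); the point is only to show that if the decoder input is not extended with each prediction, the loop emits the same token forever.

```python
def toy_decoder(prefix):
    # Hypothetical toy decoder: next token = last token + 1, capped at 3
    # (3 stands in for an <eos> token).
    return min(prefix[-1] + 1, 3)

def greedy_decode(steps=5, update_input=True):
    prefix = [0]  # start-of-sequence token
    out = []
    for _ in range(steps):
        tok = toy_decoder(prefix)
        out.append(tok)
        if update_input:
            # Correct autoregressive loop: feed the prediction back in.
            prefix.append(tok)
        # If the prefix is never updated, the decoder sees the same
        # input every step and repeats the same token.
    return out

print(greedy_decode(update_input=True))   # [1, 2, 3, 3, 3]
print(greedy_decode(update_input=False))  # [1, 1, 1, 1, 1]
```

The buggy variant (`update_input=False`) reproduces the repeated-token symptom described above, which is why checking the inference loop is usually the first debugging step.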

Reasons:
  • No code block (0.5):
  • Single line (0.5):
  • Low reputation (1):
Posted by: Adegbite Ibukunoluwa Mary