79302389

Date: 2024-12-23 07:05:12
Score: 1.5
Natty:
Report link

A batch size of 1 looks like the problem to me. You are feeding the model one data point at a time, so each weight update has high variance, which makes convergence difficult and unstable.
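
To make the variance point concrete, here is a minimal sketch in plain Python. It is not taken from the question: the toy linear model, the synthetic data, and the helper names (`make_data`, `grad_estimate`, `gradient_variance`) are all assumptions chosen for illustration. It estimates the same gradient many times from random mini-batches and compares how much the estimates spread for batch size 1 versus 32.

```python
import random
import statistics

def make_data(n=512, seed=1):
    # Hypothetical toy dataset: y = 3x + Gaussian noise.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 3.0 * x + rng.gauss(0.0, 0.5)
        data.append((x, y))
    return data

def grad_estimate(w, batch):
    # Gradient of the loss 0.5 * (w*x - y)^2 with respect to w,
    # averaged over the examples in the mini-batch.
    return sum((w * x - y) * x for x, y in batch) / len(batch)

def gradient_variance(data, w, batch_size, trials=2000, seed=0):
    # Draw many random mini-batches and measure the sample variance
    # of the resulting gradient estimates.
    rng = random.Random(seed)
    estimates = [
        grad_estimate(w, rng.sample(data, batch_size))
        for _ in range(trials)
    ]
    return statistics.variance(estimates)

data = make_data()
var_bs1 = gradient_variance(data, w=0.0, batch_size=1)
var_bs32 = gradient_variance(data, w=0.0, batch_size=32)
print(f"gradient variance, batch_size=1:  {var_bs1:.4f}")
print(f"gradient variance, batch_size=32: {var_bs32:.4f}")
```

With independent samples the variance of the averaged gradient shrinks roughly in proportion to 1/batch_size, so the batch-size-1 estimate should come out far noisier than the batch-size-32 one.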

Also, try scaling the gradients before you clip them.
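
The answer does not say which kind of scaling it means, so the following is only a sketch under assumptions: the scale factor could stand for something like dividing by the number of accumulation steps, and `clip_by_global_norm`, `scale`, and the numbers used are hypothetical. It shows why the order matters: clipping after scaling bounds the norm of the gradient that is actually applied, while clipping first and scaling afterwards changes the effective threshold.

```python
import math

def global_norm(grads):
    # L2 norm over all gradient components.
    return math.sqrt(sum(g * g for g in grads))

def clip_by_global_norm(grads, max_norm):
    # Rescale the whole gradient vector if its norm exceeds max_norm.
    norm = global_norm(grads)
    if norm <= max_norm:
        return list(grads)
    factor = max_norm / norm
    return [g * factor for g in grads]

def scale(grads, factor):
    # Multiply every gradient component by the same factor.
    return [g * factor for g in grads]

raw_grads = [30.0, -40.0]  # hypothetical gradient with norm 50
scale_factor = 0.1         # assumed scaling, e.g. 1 / accumulation steps
max_norm = 1.0             # assumed clipping threshold

# Scale first, then clip: the applied gradient has norm at most max_norm.
scaled_then_clipped = clip_by_global_norm(scale(raw_grads, scale_factor), max_norm)

# Clip first, then scale: the applied gradient ends up well below max_norm.
clipped_then_scaled = scale(clip_by_global_norm(raw_grads, max_norm), scale_factor)

print(global_norm(scaled_then_clipped))
print(global_norm(clipped_then_scaled))
```

In this example the two orders differ by exactly the scale factor, so the clipping threshold only means what you intend if it is applied to the gradient in its final scale.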

Reasons:
  • Low length (0.5):
  • No code block (0.5):
  • Low reputation (0.5):
Posted by: XGB