Reports

79829735

Date: 2025-11-25 13:08:48

Score: 5

Natty: 5

Before applying the accumulated gradients, should we first divide them by the size of the effective mini-batch (the target accumulation_count)?
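
For context on what the post is asking, here is a minimal sketch, not taken from the post or the thread it belongs to. It assumes a toy one-parameter model y = w * x with a mean-squared-error loss, equal-sized micro-batches, and per-micro-batch gradients that are already means; the helper names `mse_grad` and `accumulated_grad` are hypothetical. Under those assumptions, dividing the summed gradients by `accumulation_count` reproduces the gradient of the full effective batch.

```python
def mse_grad(w, xs, ys):
    """Gradient of the mean squared error of y = w * x with respect to w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, accumulation_count):
    """Sum the per-micro-batch mean gradients, then divide by their count."""
    # Assumption: len(xs) is a multiple of accumulation_count,
    # so every micro-batch has the same size.
    micro = len(xs) // accumulation_count
    total = 0.0
    for i in range(accumulation_count):
        lo, hi = i * micro, (i + 1) * micro
        total += mse_grad(w, xs[lo:hi], ys[lo:hi])
    return total / accumulation_count


# Made-up toy data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1, 14.0, 15.9]
w = 0.5

full = mse_grad(w, xs, ys)  # gradient over the whole effective batch
averaged = accumulated_grad(w, xs, ys, accumulation_count=4)

print(full, averaged)  # the two values agree up to floating-point rounding
```

Without the final division, the accumulated value in this sketch would be `accumulation_count` times larger than the full-batch gradient, which acts like an unintended increase in the learning rate. If the micro-batches differ in size, or the per-micro-batch loss is a sum rather than a mean, the scaling factor would have to be adjusted accordingly.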

Reasons:

Low length (1):
No code block (0.5):
Ends in question mark (2):
Single line (0.5):
Low reputation (1):

Posted by: shaurya1negi