79248151

Date: 2024-12-03 15:16:31
Score: 1
Natty:
Report link

Choose batch_size based on how much memory you have and how noisy you can tolerate the gradient estimates being. A smaller value, for example batch_size=32, needs less memory per training step, but each gradient is computed from fewer samples and is therefore noisier. A larger value, for example batch_size=4096, needs more memory per step, but each gradient is averaged over more samples, so it is smoother, and training with a large batch_size is, as a rule, more stable.

Conclusion: pick a middle value, for example batch_size=512, and do not worry too much. This is not the most important hyperparameter in training :)
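To make the "noisy vs. smooth gradients" point concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for illustration and not part of the question: a synthetic one-parameter linear regression, a mean squared error loss, and the hypothetical helper names make_data, minibatch_gradient, and gradient_noise. It repeatedly draws a random mini-batch, computes the gradient at a fixed weight, and reports how much that gradient varies from batch to batch.

```python
import random
import statistics


def make_data(n, seed=0):
    """Synthetic data for illustration: y = 3*x + Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [3.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def minibatch_gradient(xs, ys, w, batch_size, rng):
    """Gradient of mean((w*x - y)^2) with respect to w on one random mini-batch."""
    idx = rng.sample(range(len(xs)), batch_size)
    return sum(2.0 * (w * xs[i] - ys[i]) * xs[i] for i in idx) / batch_size


def gradient_noise(xs, ys, batch_size, trials=200, seed=1):
    """Standard deviation of the mini-batch gradient at w=0 over many random batches."""
    rng = random.Random(seed)
    grads = [minibatch_gradient(xs, ys, 0.0, batch_size, rng) for _ in range(trials)]
    return statistics.stdev(grads)


xs, ys = make_data(8192)
noise = {b: gradient_noise(xs, ys, b) for b in (32, 512, 4096)}
for b, s in noise.items():
    print(f"batch_size={b:5d}  gradient std={s:.4f}")
```

On this toy problem the spread of the gradient should shrink as batch_size grows, which is the effect described above. It says nothing about the memory cost or about final model quality on a real data set, so treat it as an illustration rather than a benchmark.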

Reasons:
  • Long answer (-0.5):
  • No code block (0.5):
  • Low reputation (1):
Posted by: Aiden