for i in range(100):
    optimizer.zero_grad()                          # reset gradients from the previous step
    output = model(**model_inputs, labels=labels)  # forward pass
    loss = output.loss
    loss.backward()                                # backpropagate
    optimizer.step()                               # update weights
print('Fine-tuning ended')
I can run the above loop only when the range is 1; the Google Colab notebook crashes if I increase the range even to 2 or 3. I hit the same problem on a Databricks ML instance as well. What alternatives would you recommend for running this notebook with the AdamW optimizer (as suggested above) in the cloud?
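
For reference, here is a minimal sketch of the same loop rewritten with mixed precision and gradient accumulation, assuming a PyTorch model on a CUDA GPU and that the crash is an out-of-memory error (model, optimizer, model_inputs, and labels are the same objects defined earlier in this post; accum_steps is a hypothetical setting). Is something like this the recommended direction, or is a different instance type the better fix?

import torch

accum_steps = 4                       # hypothetical number of micro-batches per weight update
scaler = torch.cuda.amp.GradScaler()  # scales the loss so fp16 gradients stay numerically stable

model.train()
optimizer.zero_grad()
for i in range(100):
    with torch.cuda.amp.autocast():   # run the forward pass in mixed precision to cut memory use
        output = model(**model_inputs, labels=labels)
        loss = output.loss / accum_steps  # normalize so accumulated gradients match a full batch
    scaler.scale(loss).backward()     # backpropagate the scaled loss
    if (i + 1) % accum_steps == 0:    # step the optimizer only every accum_steps iterations
        scaler.step(optimizer)
        scaler.update()
        optimizer.zero_grad()
print('Fine-tuning ended')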