I did it as posted above.
# Using a callback on the Trainer because Huggingface does not explicitly log the train accuracy.
# The custom callback calls evaluate() with train_dataset at the end of every epoch.
from copy import deepcopy
from transformers import TrainerCallback

class CustomCallback(TrainerCallback):
    def __init__(self, trainer) -> None:
        super().__init__()
        self._trainer = trainer

    def on_epoch_end(self, args, state, control, **kwargs):
        if control.should_evaluate:
            # Without a deep copy of control, the trainer would not evaluate
            # the evaluation dataset afterwards
            control_copy = deepcopy(control)
            self._trainer.evaluate(eval_dataset=self._trainer.train_dataset, metric_key_prefix="train")
            return control_copy
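Why the `deepcopy` matters can be sketched with a small stand-alone mock (the `Control` class and `fake_evaluate` below are hypothetical stand-ins, not the real `TrainerControl` or `Trainer.evaluate`), assuming the real evaluate call mutates the control flags:

```python
from copy import deepcopy

class Control:
    # Hypothetical stand-in for transformers' TrainerControl
    def __init__(self):
        self.should_evaluate = True

def fake_evaluate(control):
    # Mimics the evaluation run resetting flags on the control object
    control.should_evaluate = False

control = Control()
control_copy = deepcopy(control)  # snapshot taken before evaluate() mutates flags
fake_evaluate(control)

print(control.should_evaluate)       # False: mutated by the evaluate call
print(control_copy.should_evaluate)  # True: the copy preserved the flag
```

Returning the preserved copy is what lets the regular evaluation on the eval dataset still run after the extra pass over the train dataset.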
import numpy as np
import evaluate

def my_compute_metrics2(eval_pred):
    metrics = ["accuracy", "bleu"]
    metric = {}
    for name in metrics:
        metric[name] = evaluate.load(name)
    preds, labels = eval_pred
    predictions = np.argmax(preds, axis=1)
    metric_results = {}  # Dictionary to store the accuracy and BLEU results
    for name in metrics:
        metric_results[name] = metric[name].compute(predictions=predictions, references=labels)[name]
    return metric_results
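For reference, the `np.argmax(preds, axis=1)` step maps each row of logits to a predicted class id; the accuracy part of the metric can be sketched without the `evaluate` dependency (the logits and labels below are made-up example data):

```python
import numpy as np

# Hypothetical logits for 4 examples over 3 classes
preds = np.array([[2.0, 0.1, 0.3],
                  [0.2, 1.5, 0.1],
                  [0.1, 0.2, 3.0],
                  [1.0, 0.9, 0.8]])
labels = np.array([0, 1, 2, 1])

predictions = np.argmax(preds, axis=1)  # same step as in my_compute_metrics2
accuracy = float((predictions == labels).mean())
print(accuracy)  # 0.75 (3 of 4 predictions match)
```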
However, when running this on Google Colab, I got the following "out of GPU memory" error:
OutOfMemoryError: CUDA out of memory. Tried to allocate 4.10 GiB. GPU 0 has a total capacity of 14.74 GiB of which 732.12 MiB is free. Process 444503 has 14.02 GiB memory in use. Of the allocated memory 8.97 GiB is allocated by PyTorch, and 4.92 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.
How can I fix this OutOfMemoryError? I look forward to hearing from you.
Thanks in advance! It is urgent for me to fix this.