I have some suggestions to improve your training results.
Avoid augmentations that change the original label of the image. For example, if the circle sits in one of the image corners and you apply a center crop, the circle is cut out, but the image is still labelled as containing a circle.
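To make the cropping problem concrete, here is a minimal sketch in plain Python. The image is a toy 8x8 grid and the helper names (`center_crop`, `horizontal_flip`, `has_object`) are hypothetical, written only for this illustration; they are not taken from your code or from any library.

```python
def center_crop(img, size):
    """Return the central size x size window of a 2D list-of-rows image."""
    h, w = len(img), len(img[0])
    top, left = (h - size) // 2, (w - size) // 2
    return [row[left:left + size] for row in img[top:top + size]]


def horizontal_flip(img):
    """Mirror the image left to right; no pixels are discarded."""
    return [row[::-1] for row in img]


def has_object(img):
    """True if any pixel is non-zero, i.e. the labelled object is still visible."""
    return any(any(row) for row in img)


# Toy 8x8 image with a 2x2 "circle" stand-in in the top-left corner.
img = [[0] * 8 for _ in range(8)]
for r in range(2):
    for c in range(2):
        img[r][c] = 1

cropped = center_crop(img, 4)   # keeps rows/cols 2..5 only
flipped = horizontal_flip(img)  # object moves to the top-right corner

print(has_object(cropped))  # False: the label "circle" is now wrong
print(has_object(flipped))  # True: the label is still valid
```

A check like this on a handful of augmented samples (or simply plotting them) is a quick way to confirm that each augmentation keeps the label valid.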
I see that you are loading all the images into a NumPy array. This is not memory efficient unless your dataset is small. It is better to use a data loader that reads the images from disk in batches instead.
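As a rough, framework-agnostic sketch of the idea, the generator below loads only one batch at a time. The names `iter_batches` and `load_fn` are assumptions made for this example; `load_fn` stands for whatever function you use to read and preprocess a single image. Framework loaders such as PyTorch's `torch.utils.data.DataLoader` follow the same principle and add features like parallel workers.

```python
import random


def iter_batches(paths, labels, load_fn, batch_size=32, shuffle=True, seed=None):
    """Yield (images, labels) batches, loading images lazily via load_fn."""
    indices = list(range(len(paths)))
    if shuffle:
        # Shuffle indices, not the data itself, so nothing is loaded yet.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        chunk = indices[start:start + batch_size]
        # Only the images of the current batch are held in memory.
        images = [load_fn(paths[i]) for i in chunk]
        yield images, [labels[i] for i in chunk]


# Usage with a stand-in loader; replace the lambda with real image reading.
demo_paths = ["a.png", "b.png", "c.png", "d.png", "e.png"]
demo_labels = [0, 1, 0, 1, 0]
for images, batch_labels in iter_batches(
    demo_paths, demo_labels, load_fn=lambda p: p.upper(), batch_size=2, shuffle=False
):
    print(images, batch_labels)
```

Because the generator yields one batch at a time, peak memory depends on the batch size rather than on the size of the whole dataset.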
These are general tips; it would help to have more information about your use case: