I think your issue is related to using TensorFlow's GradientTape inside the custom loss function. Issues in your code:
1. TensorFlow's GradientTape is meant to be used within a training loop. Your code uses it inside the loss function, which Keras does not support for a loss passed to compile().
What changed & why it works:
1. Ensured y_true and y_pred are only used inside the loss function.
2. Used tf.function for better performance.
3. Fixed the incorrect use of training code inside the loss function.
4. Added a working training example for testing.
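
For reference, here is a minimal sketch of that pattern, not your exact code: the toy data (y = 2x + 1), the one-layer model, the SGD settings, and the names custom_loss and train_step are all assumptions made for illustration. The loss function only touches y_true and y_pred, and the GradientTape lives in the training step, which is wrapped in tf.function.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data (an assumption for this sketch): learn y = 2x + 1.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 1)).astype("float32")
y = 2.0 * x + 1.0

# Hypothetical one-layer model, just enough to exercise the training step.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(1),
])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)


def custom_loss(y_true, y_pred):
    # The loss only uses y_true and y_pred: no tape and no training logic here.
    return tf.reduce_mean(tf.square(y_true - y_pred))


@tf.function  # compiles the step into a graph for better performance
def train_step(x_batch, y_batch):
    # The tape belongs in the training step, around the forward pass.
    with tf.GradientTape() as tape:
        y_pred = model(x_batch, training=True)
        loss = custom_loss(y_batch, y_pred)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss


dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(32)

first_loss = None
for epoch in range(20):
    for x_batch, y_batch in dataset:
        loss = train_step(x_batch, y_batch)
        if first_loss is None:
            first_loss = float(loss)
last_loss = float(loss)

print(f"first loss: {first_loss:.4f}, last loss: {last_loss:.4f}")
```

If you would rather keep model.fit(), the same custom_loss can be passed to model.compile(optimizer=..., loss=custom_loss), since it no longer contains a tape.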