
I think your issue is related to the use of TensorFlow's GradientTape inside the custom loss function. Issues in your code:

1. TensorFlow's GradientTape is meant to be used within a training loop. In your code you are using it inside the loss function, which Keras does not support during compilation.

2. A Keras loss function only accepts y_true and y_pred, but your function references train, which is not provided during model training.

Here is the fixed code:

1st part of code

2nd part of code

What changed and why it works:

1. Moved GradientTape to a separate function (compute_gradient_norm).

2. Ensured that only y_true and y_pred are used inside the loss function.

3. Used tf.function for better performance.

4. Fixed the incorrect usage of train inside the loss function.

5. Added a working training example for testing.
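
For reference, the changes listed above can be sketched roughly as follows. This is only a minimal illustration, not the exact code from the fix: the function name compute_gradient_norm comes from the list above, but the mean-squared-error loss body, the toy data (x_train, y_train), the small model, and the choice to take the norm over the gradients of the trainable weights are all assumptions made for the example.

```python
import numpy as np
import tensorflow as tf


def custom_loss(y_true, y_pred):
    # Keras calls the loss with exactly these two tensors, so nothing else
    # (such as the training set) is referenced here.
    # Assumption: plain mean squared error stands in for the real loss.
    return tf.reduce_mean(tf.square(y_true - y_pred))


@tf.function  # optional, traces the function into a graph for speed
def compute_gradient_norm(model, x, y):
    # The tape lives outside the loss function, as it would in a training step.
    with tf.GradientTape() as tape:
        y_pred = model(x, training=True)
        loss = custom_loss(y, y_pred)
    # Assumption: the norm is taken over gradients w.r.t. the trainable weights.
    grads = tape.gradient(loss, model.trainable_variables)
    return tf.linalg.global_norm(grads)


# Hypothetical toy data, used only to show that the pieces fit together.
rng = np.random.default_rng(0)
x_train = rng.normal(size=(64, 4)).astype("float32")
y_train = rng.normal(size=(64, 1)).astype("float32")

# Hypothetical small model.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])

# The loss is passed to compile() as an ordinary (y_true, y_pred) function.
model.compile(optimizer="adam", loss=custom_loss)
model.fit(x_train, y_train, epochs=2, batch_size=16, verbose=0)

# The gradient norm is computed separately, outside of compile()/fit().
grad_norm = compute_gradient_norm(model, x_train, y_train)
print("gradient norm:", float(grad_norm))
```

If the gradient norm has to be part of the objective itself, a custom training step built around the same tape would probably be the place for it, rather than the loss function passed to compile().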
