I also asked on GitHub (https://github.com/keras-team/keras/issues/20369). It seems there is no simple solution to this problem, only workarounds. XLA compilation would require too much coding in my case, since it has to be implemented at a very low level of abstraction. Bucketing is something I had already tried with reasonably good results, and it will be my go-to approach: I pad the flexible input dimension up to the next multiple of 10. This reduces the number of distinct input shapes, and therefore the retracing, by roughly a factor of 10, which makes running on the GPU somewhat feasible.
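For anyone wanting to try the same bucketing idea, here is a minimal sketch of the padding arithmetic in plain Python. The helper names `padded_length` and `pad_to_bucket` are made up for illustration and are not part of Keras or my actual code; the bucket size of 10 is the one I mentioned above, and padding with zeros on the right is an assumption.

```python
def padded_length(n, bucket=10):
    """Round n up to the next multiple of `bucket` (hypothetical helper)."""
    return -(-n // bucket) * bucket  # ceiling division without floats


def pad_to_bucket(seq, bucket=10, pad_value=0.0):
    """Return a copy of `seq` right-padded with `pad_value` to the bucket size.

    Zero padding on the right is an assumption; pick whatever your model can ignore.
    """
    target = padded_length(len(seq), bucket)
    return list(seq) + [pad_value] * (target - len(seq))


# Inputs of length 1..100 collapse onto 10 distinct padded lengths,
# so a shape-dependent trace would be triggered at most 10 times instead of 100.
distinct_shapes = {padded_length(n) for n in range(1, 101)}
print(sorted(distinct_shapes))
```

On real tensors the same rounding would be applied to the flexible dimension before calling the model. Depending on the model, the padded positions may also need to be masked so they do not affect the result.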