Well, it turns out I forgot that inputs are treated as datasets, and my masks had the wrong shape.
x_train
has 60 data points, each with a sequence length of 577 and a dimension of 1.
dummy_mask
has a shape of 577 by 577 with a dimension of 1, which is obviously wrong.
The right shape for dummy_mask
is (60, 577, 577)
or, more generally, (x_train.shape[0], sequence_size, sequence_size)
when fitting the model.
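
For illustration, here is a minimal NumPy sketch of building a mask with that batched shape. It assumes the same mask applies to every sample and uses an all-ones mask purely as a placeholder; the names num_samples, feature_dim and single_mask are hypothetical and not from my original code.

```python
import numpy as np

# Sizes taken from the description above.
num_samples, sequence_size, feature_dim = 60, 577, 1

# Stand-in for the training data: (samples, sequence length, features).
x_train = np.zeros((num_samples, sequence_size, feature_dim), dtype=np.float32)

# One (sequence_size, sequence_size) mask; all ones is only a placeholder.
single_mask = np.ones((sequence_size, sequence_size), dtype=np.float32)

# Add a leading axis and repeat it so there is one mask per data point.
dummy_mask = np.repeat(single_mask[np.newaxis, :, :], x_train.shape[0], axis=0)

print(dummy_mask.shape)  # (60, 577, 577)
```

The point is only that the first axis of the mask has to match the first axis of x_train, since the mask is passed alongside the inputs and gets split into batches the same way.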