In your code, check the shape of the target you pass to fit() against the shape of your model's output (why is it 49?). How is your train_dataset defined? And why not use a single dense layer as the final layer of your model?
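To make that comparison concrete, here is a minimal sketch of the check. It assumes you are using Keras, where `model.output_shape` gives a tuple such as `(None, 49)`. It also assumes train_dataset is a `tf.data.Dataset` yielding `(features, target)` pairs, so the target shape can be read from `train_dataset.element_spec[1].shape.as_list()`. The helper `shapes_match` and the example shapes are hypothetical, written in plain Python so you can plug in your own values.

```python
def shapes_match(model_output_shape, target_shape):
    """Return True if two shapes agree, treating None as a wildcard (e.g. batch size)."""
    if len(model_output_shape) != len(target_shape):
        return False
    return all(
        m is None or t is None or m == t
        for m, t in zip(model_output_shape, target_shape)
    )


# Hypothetical shapes: a model emitting 49 values per sample,
# compared with a dataset that provides one target value per sample.
print(shapes_match((None, 49), (None, 1)))  # False: 49 != 1
print(shapes_match((None, 1), (None, 1)))   # True
```

If the check returns False, either the targets need reshaping or the last layer needs a different number of units, which is why a single dense final layer sized to the target may be the simpler fix.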