79822944

Date: 2025-11-18 02:23:00
Score: 0.5
Natty:
Report link

> I'm using the network with a batch size of 1 in eval mode (although I think that shouldn't make a difference in eval mode).

What is your batch size in train mode? Unfortunately, you can't just adjust the weights/biases once to simulate the effect of BatchNorm in training mode.
> The mathematical part of my brain thinks it must be possible to adjust the weights/biases to simulate being in train mode while in eval mode, as it's (more or less) two different linear functions of the input (X).

The biggest problem is that these functions depend on the batch size and on the data in each batch, so you would have to adjust the weight and bias differently for every input. In eval mode, BatchNorm uses the running averages of the mean and variance collected during training; these are constants, so they can be absorbed into the weight and bias of the adjacent linear/convolution layer. That is not possible for the training case.
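To make the eval-mode folding concrete, here is a minimal sketch for a `BatchNorm1d` followed by an `nn.Linear` (the helper name `fold_bn_into_next_linear` is mine, not a PyTorch API):

```python
import torch
import torch.nn as nn

def fold_bn_into_next_linear(bn: nn.BatchNorm1d, linear: nn.Linear) -> nn.Linear:
    # Eval-mode BN is y = a * x + c per feature, with
    # a = gamma / sqrt(running_var + eps) and c = beta - a * running_mean.
    # The next layer computes z = W @ y + b, so the fused layer is
    # z = (W * a) @ x + (W @ c + b).
    a = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    c = bn.bias - a * bn.running_mean

    fused = nn.Linear(linear.in_features, linear.out_features)
    with torch.no_grad():
        fused.weight.copy_(linear.weight * a)  # scale each input column by a
        bias = linear.bias if linear.bias is not None else 0.0
        fused.bias.copy_(linear.weight @ c + bias)
    return fused

# Sanity check with non-trivial running stats:
bn, lin = nn.BatchNorm1d(8), nn.Linear(8, 4)
bn.running_mean = torch.randn(8)
bn.running_var = torch.rand(8) + 0.5
bn.eval()
x = torch.randn(3, 8)
assert torch.allclose(lin(bn(x)), fold_bn_into_next_linear(bn, lin)(x), atol=1e-5)
```

For the more common conv-then-BN ordering, recent PyTorch versions ship a built-in helper, `torch.nn.utils.fusion.fuse_conv_bn_eval`, which does the same kind of fusion.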

If you have the PyTorch model, I would suggest re-estimating the running mean and variance using your data (since you said the result in train mode is quite good, the statistics of your dataset are probably good enough for the model). You need to estimate each layer sequentially: estimate the mean and variance of a layer over the whole dataset, then use the features normalized by those statistics to do the same for the later layers. A sketch of this follows below.
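One common way to do this in PyTorch is to reset the BN buffers and set `momentum=None`, which makes each layer accumulate an equally weighted average of the per-batch statistics over a sweep in train mode. Because each layer normalizes with current-batch statistics during the pass, later layers see features normalized by the earlier layers, which approximates the sequential estimation described above. A minimal sketch, assuming a `data_loader` that yields input batches:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def reestimate_bn_stats(model: nn.Module, data_loader) -> None:
    # Reset running stats; momentum=None switches each BN layer to a
    # cumulative (equally weighted) average over all batches it sees.
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.reset_running_stats()
            m.momentum = None
    model.train()          # BN updates its buffers only in train mode
    for batch in data_loader:
        model(batch)       # forward pass only; no loss, no backward
    model.eval()           # back to eval mode with the fresh statistics
```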

Reasons:
  • Long answer (-1):
  • No code block (0.5):
  • Contains question mark (0.5):
  • Low reputation (0.5):
Posted by: MinhNH