When MultiscaleCNN is created, the embedding dimension is divided into 3 parts, but 4096 is not divisible by 3. Each subnetwork's dimension is therefore floored to 4096 // 3 = 1365, and the three parts together give only 1365 * 3 = 4095 instead of 4096. As a quick fix, when initializing DeepCNN you can pass out_dim - (out_dim // 3) * 2 as the residual dimension (1366 when out_dim is 4096), so that the three parts sum to out_dim again.
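
The arithmetic can be checked with a small sketch. The helper name split_dims is hypothetical and not part of the repository; it only illustrates giving the floored size to all but one branch and the remainder to the last one, under the assumption that the branch outputs are concatenated to form the final embedding.

```python
from typing import List


def split_dims(out_dim: int, parts: int = 3) -> List[int]:
    """Split out_dim into `parts` integer sizes that sum exactly to out_dim.

    Hypothetical helper for illustration only.
    """
    base = out_dim // parts  # floored size, e.g. 4096 // 3 = 1365
    # All but the last part get the floored size; the last part takes
    # whatever is left, i.e. out_dim - (out_dim // parts) * (parts - 1).
    residual = out_dim - base * (parts - 1)
    return [base] * (parts - 1) + [residual]


dims = split_dims(4096)
print(dims, sum(dims))  # [1365, 1365, 1366] 4096
```

With parts = 3 the last entry is exactly out_dim - (out_dim // 3) * 2, the value suggested for DeepCNN.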