I'm also doing similar work with BERT. However, I want true probabilities rather than just a softmax over the logits. Does anyone know how to get true probabilities from BERT?