2025 answer, not directly answering the question but can be a good solution for many.
TL;DR
BERT’s 512-token limit has historically meant you either had to truncate long text or split it into multiple 512-token chunks. However, ModernBERT (released in December 2024) now supports sequences up to 8,192 tokens, making it a drop-in replacement for long-form text without chunking.
That’s it — ModernBERT is basically BERT with a bigger window and better performance. Even for shorter text it's just a better model (backed by many benchmarks).