What I had in mind when we talked on Wednesday was item 3 in this answer to the Stack Overflow question "How to use Bert for long text classification?"
It refers to https://github.com/google-research/bert/issues/27, which has a tip from Jacob Devlin (first author of the BERT paper).
See also Section 5.3.1 of the paper "How to Fine-Tune BERT for Text Classification?"