"Large Batch Optimization for Deep Learning: Training BERT in 76 minutes."

Yang You et al. (2020)