"Efficient Large-Scale Language Model Training on GPU Clusters."

Deepak Narayanan et al. (2021)