"Automatic Thread-Block Size Adjustment for Memory-Bound BLAS Kernels on GPUs."

Daichi Mukunoki, Toshiyuki Imamura, Daisuke Takahashi (2016)