Huanxin Lin, Cho-Li Wang: On-GPU thread-data remapping for nested branch divergence. J. Parallel Distributed Comput. 139: 75-86 (2020)