"Distilling Task-Specific Knowledge from BERT into Simple Neural Networks."

Raphael Tang et al. (2019)