"Average-Hard Attention Transformers are Constant-Depth Uniform Threshold ..."

Lena Strobl (2023)

Details

DOI: 10.48550/ARXIV.2308.03212

access: open

type: Informal or Other Publication

metadata version: 2023-08-21