"Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU ..."

Vasudev Shyam et al. (2024)

DOI: 10.48550/ARXIV.2408.04093

access: open

type: Informal or Other Publication

metadata version: 2024-09-13