Abstract: Vision transformers have significantly advanced the field of computer vision in recent years. The cornerstone of these transformers is the multi-head attention mechanism, which models ...
To speed up computation, deep neural networks (DNNs) usually rely on highly optimized tensor operators. Despite the effectiveness, tensor operators are often defined empirically with ad hoc semantics.