ET: Re-Thinking Self-Attention for Transformer Models on GPUs

To appear in SC '21, 2021