Flash-Style Causal Attention

by sophiam.96 • Apr 30, 2026 • 3468 views

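In PyTorch 2.x, scaled_dot_product_attention automatically dispatches to a FlashAttention kernel on supported GPUs. This snippet runs it on fp16 tensors of shape 4x16x1024x64 (batch x heads x sequence length x head dimension) with a causal mask and reports the measured TFLOPS.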
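A minimal sketch of such a benchmark is given here; it is not the snippet's original code. The helper names attention_flops and benchmark_sdpa, the warm-up and iteration counts, and the FLOP-counting convention are assumptions made for illustration. The convention counts 4 * batch * heads * seq_len^2 * head_dim FLOPs for the two matmuls and halves that under a causal mask. Backend selection is left to PyTorch, so the call may fall back to a non-Flash kernel on unsupported hardware.

```python
def attention_flops(batch, heads, seq_len, head_dim, causal=True):
    """Approximate forward FLOPs for attention (assumed convention).

    Q @ K^T and P @ V each cost 2 * seq_len^2 * head_dim FLOPs per head.
    A causal mask skips roughly half of that work, so the total is halved.
    """
    flops = 4 * batch * heads * seq_len * seq_len * head_dim
    return flops // 2 if causal else flops


def benchmark_sdpa(batch=4, heads=16, seq_len=1024, head_dim=64,
                   warmup=10, iters=50):
    """Time causal scaled_dot_product_attention in fp16 and return TFLOPS."""
    # Imported lazily so the FLOP helper above works without PyTorch.
    import torch
    import torch.nn.functional as F

    if not torch.cuda.is_available():
        raise RuntimeError("This benchmark needs a CUDA GPU.")

    # Layout expected by SDPA: (batch, heads, seq_len, head_dim).
    q, k, v = (
        torch.randn(batch, heads, seq_len, head_dim,
                    device="cuda", dtype=torch.float16)
        for _ in range(3)
    )

    # Warm up so kernel selection and caching do not skew the timing.
    for _ in range(warmup):
        F.scaled_dot_product_attention(q, k, v, is_causal=True)
    torch.cuda.synchronize()

    # CUDA events measure elapsed GPU time in milliseconds.
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        F.scaled_dot_product_attention(q, k, v, is_causal=True)
    end.record()
    torch.cuda.synchronize()

    seconds_per_call = start.elapsed_time(end) / iters / 1e3
    flops = attention_flops(batch, heads, seq_len, head_dim, causal=True)
    return flops / seconds_per_call / 1e12


if __name__ == "__main__":
    print(f"causal SDPA, 4x16x1024x64 fp16: {benchmark_sdpa():.1f} TFLOPS")
```

Under this convention the 4x16x1024x64 causal case comes to 2^33 (about 8.6 billion) FLOPs per forward call. The reported TFLOPS figure depends on the GPU and on which backend PyTorch selects.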

#pytorch #attention #transformers #fp16

