GPU Challenges — Test Your CUDA & Python GPU Skills

Solve GPU programming challenges in CUDA C++ or Python (PyTorch).

Challenge Details

SAXPY: Single Precision A*X Plus Y

Difficulty: easy

Implement the BLAS SAXPY operation y = a*x + y with one thread per element, using a 1D launch and minimal kernel logic.

Your Goal
  • Compute the global index once and reuse it.
  • Apply the SAXPY formula only when the index is valid.
  • Leave the host code's launch configuration and data flow intact.
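The three goals above can be sketched as follows. This is a minimal illustration, not the challenge's reference solution: the kernel signature, the `run_saxpy` host wrapper, and the block size of 256 are assumptions made for the example, and the challenge's own host code (which is not shown here) should be left as it is.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// One thread per element: y[i] = a * x[i] + y[i].
// The parameter order (n, a, x, y) is an assumption for this sketch.
__global__ void saxpy(int n, float a, const float* x, float* y) {
    // Compute the global index once and reuse it below.
    const int i = blockIdx.x * blockDim.x + threadIdx.x;
    // Threads whose index falls past the end of the arrays do nothing.
    if (i < n) {
        y[i] = a * x[i] + y[i];
    }
}

// Hypothetical host wrapper, included only so the kernel can be exercised.
// Updates y in place and returns the first CUDA error encountered.
cudaError_t run_saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    if (x.size() != y.size()) return cudaErrorInvalidValue;
    const int n = static_cast<int>(x.size());
    if (n == 0) return cudaSuccess;

    const std::size_t bytes = static_cast<std::size_t>(n) * sizeof(float);
    float* d_x = nullptr;
    float* d_y = nullptr;
    cudaError_t err = cudaMalloc(&d_x, bytes);
    if (err != cudaSuccess) return err;
    err = cudaMalloc(&d_y, bytes);
    if (err != cudaSuccess) { cudaFree(d_x); return err; }

    err = cudaMemcpy(d_x, x.data(), bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(d_y, y.data(), bytes, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        const int threads = 256;                         // assumed block size
        const int blocks = (n + threads - 1) / threads;  // round up to cover the tail
        saxpy<<<blocks, threads>>>(n, a, d_x, d_y);
        err = cudaGetLastError();                        // catches launch errors
    }
    // This copy also waits for the kernel to finish.
    if (err == cudaSuccess) err = cudaMemcpy(y.data(), d_y, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_x);
    cudaFree(d_y);
    return err;
}
```

For example, calling `run_saxpy(2.0f, x, y)` with every element of `x` set to 1.0 and every element of `y` set to 2.0 should leave every element of `y` equal to 4.0.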
Focus Areas
  • Embarrassingly parallel kernels
  • Scalar-plus-vector operations
  • Simple correctness validation from printed results
What Success Looks Like
  • The first printed values should equal a*x[i] + y[i], computed from the original inputs.
  • The kernel should update all N elements, including the tail when N is not a multiple of the block size.
  • No extra temporaries are needed beyond the index and formula.
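One way to check these criteria by hand is sketched below in host-only C++ (it also compiles as part of a .cu file). The helper names `blocks_for`, `saxpy_reference`, and `all_close` are hypothetical and are not part of the challenge's host code; the tolerance is an assumption, chosen because a GPU may fuse the multiply and add into a single instruction and differ from the CPU result in the last bit.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Number of blocks such that blocks * threads_per_block >= n (ceiling division).
int blocks_for(int n, int threads_per_block) {
    return (n + threads_per_block - 1) / threads_per_block;
}

// CPU reference: y[i] = a * x[i] + y[i] for every index, including the tail.
void saxpy_reference(float a, const std::vector<float>& x, std::vector<float>& y) {
    for (std::size_t i = 0; i < x.size() && i < y.size(); ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// True if both vectors have the same length and every pair differs by at most tol.
bool all_close(const std::vector<float>& got, const std::vector<float>& expected,
               float tol = 1e-5f) {
    if (got.size() != expected.size()) return false;
    for (std::size_t i = 0; i < got.size(); ++i) {
        if (std::fabs(got[i] - expected[i]) > tol) return false;
    }
    return true;
}
```

With N = 1000 and 256 threads per block, `blocks_for` returns 4, so 1024 threads are launched and the last 24 are turned away by the bounds check. Plain integer division would give 3 blocks (768 threads) and leave the final 232 elements untouched, which is the "skipped tail" the criteria warn about.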