№ 02 / SUMMARIES

#gpu-optimization


DAY 01 · May 6, 2026 · 1 summary
Level Up Coding · Software Engineering

CUDA Matrix Transpose: Naive to Swizzled Optimization

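A naive GPU matrix transpose can coalesce its global-memory reads or its writes, but not both. Staging tiles through shared memory lets both be coalesced; padding or XOR swizzling then removes the shared-memory bank conflicts that tiling introduces, and float4 vectorization pushes throughput toward peak memory bandwidth.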
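To see why padding and XOR swizzling help, here is a minimal host-side sketch that counts bank conflicts when one warp reads a column of a 32×32 tile. It is not code from the article and not a CUDA kernel: it only models the address-to-bank mapping, assuming 32 banks of 4-byte words and a 32-thread warp. The names `max_conflict_degree`, `naive_degree`, `padded_degree`, and `swizzled_degree` are hypothetical, chosen for illustration.

```cpp
#include <algorithm>
#include <array>

constexpr int kTile = 32;   // tile edge, equal to the warp size
constexpr int kBanks = 32;  // assumption: 32 banks, one 4-byte word each

// Worst-case number of threads in one warp that land in the same bank
// when thread t reads logical element (row = t, col = c) of the tile.
// word_address(row, col) returns the element's offset in 4-byte words.
template <typename AddressFn>
int max_conflict_degree(AddressFn word_address) {
    int worst = 0;
    for (int c = 0; c < kTile; ++c) {
        std::array<int, kBanks> hits{};  // zero-initialised per-bank counters
        for (int t = 0; t < kTile; ++t) {
            ++hits[word_address(t, c) % kBanks];
        }
        worst = std::max(worst, *std::max_element(hits.begin(), hits.end()));
    }
    return worst;
}

// Plain tile[32][32]: a whole column maps to a single bank.
int naive_degree() {
    return max_conflict_degree([](int r, int c) { return r * kTile + c; });
}

// Padded tile[32][33]: the extra word per row shifts each row by one bank.
int padded_degree() {
    return max_conflict_degree([](int r, int c) { return r * (kTile + 1) + c; });
}

// XOR swizzle: element (r, c) is stored at column (c ^ r), with no padding.
int swizzled_degree() {
    return max_conflict_degree([](int r, int c) { return r * kTile + (c ^ r); });
}
```

Under these assumptions, `naive_degree()` returns 32 (a 32-way conflict), while `padded_degree()` and `swizzled_degree()` both return 1 (conflict-free). Padding costs one extra word per row of shared memory; swizzling keeps the tile at exactly 32×32 at the price of an XOR in the index computation.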

