Note on CUDA programming

###### tags: `CUDA` Note on CUDA programming === CUDA execution model --- - [Programming Model](/3xOWlRxkQ8iWea_bqXhHww) - [Compilation Workflow](/smfOoZv1QKKagZ3oXSauaQ) Global memory --- - [Memory architecture](/1Rpaxd0BQFWRkRnhZZ3x6g) - [Memory management](/MSxX1LiAQJSGz3I7HaeIVQ) - [Memory access pattern](/TrwAccEwTiq3pwFrwAi0Cw) Shared memory and constant memory --- - [Configurations of Shared Memory](/tT7xl4vcSU-LFUda0Ev71Q) - [Bank conflict](/wET73P1BQhCXYPDqnHbwUw) - [Synchronization of Shared Memory](/pSuPuzNUQeuepkEpRYvFCA) - [Warp Shuffles](/Lr3-m6jiReydZjZPlQjsaQ) Texture memory --- - [Texture Memory](/bpKNNQcoTXCsnyyO-iU-Sw) Streams and concurrency --- - [Stream and Concurrency](/-VsFMt9dS2OS5UoLZHU2mA) Worked Examples --- - [Parallel Reduction](/vo-4fPtRTqugpJYLu-T_6A) - [Matrix Addition](/k1w1yiRFQvqtJpLmFZaMiQ) - [Matrix Transpose](/FVL8I6CCQeqOTmSvB6qLTg) - [Matrix Multiplcation](/kzsnoeYKS8qvgh8N6yfsdQ) Tensor core --- - [Tensor core](/CnHZ8KhCQvGxdS4lUEarnA) Profiling --- - [Calculation of Warp Occupancy](/nxg0xlLqQlyeJ5TE_b_YBQ) - [Performance tuning](/sXD0tx8DTuyEeWJF-YGEbA) - [Nsight compute](/Paa9DzzGTGOzcd9QuoxnTA) - [Nsight system](/zOywJzjYQhabhP2ezJ9-9w) Debugging --- - [cuda-gdb](/-KNLPssAQNO5xpR6baZj8Q) Mannual --- [nvcc](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/) [CUDA C++ Programming Guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html) [CUDA C++ Best Practices Guide](https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html) [Parallel Thread Execution ISA Version 8.5](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html) [CUDA runtime API](https://docs.nvidia.com/cuda/cuda-runtime-api/index.html) [CUDA driver API](https://docs.nvidia.com/cuda/cuda-driver-api/index.html) [Thrust](https://developer.nvidia.com/thrust) [CUDA-GDB](https://docs.nvidia.com/cuda/cuda-gdb/index.html) [Compute Sanitizer](https://docs.nvidia.com/compute-sanitizer/index.html) [Nsight System](https://docs.nvidia.com/nsight-systems/index.html) [Nsight compute](https://docs.nvidia.com/nsight-compute/index.html) [Profiler](https://docs.nvidia.com/cuda/profiler-users-guide/index.html) [CUDA binary utilities](https://docs.nvidia.com/cuda/cuda-binary-utilities/index.html) Miscellaneous --- - [Nvidia Technical Blog](/TImNuvCaS_ubrl5eFJW2Jw) - [CUDA FAQ](/1TwlHGzqRoGDxmM_HfCtkA) - [Syntax](/n0uzCKQ1SJ-_thzKY4Kf-Q) - [Thread indexing cheatsheet](/zLhfeBSZRSGCmojhXOjLdw) - [Trouble shooting](/PuZLlMCXRiWLJYpc85qjEg) - [NVIDIA A10](/wn4rhyicTa2YiV8-M-2E5g) - [kriegalex](https://github.com/kriegalex/wrox-pro-cuda-c/tree/master) - [HangJie720 ](https://github.com/HangJie720/Professional-CUDA-C-Programming/tree/master) - [deeplearning](https://github.com/deeperlearning/professional-cuda-c-programming) - [Good articles](https://siboehm.com/)