---
tags: teaching,n8cir
---
# N8 GPU Python Workshop 2023-04-18
Hackpad for N8 CIR GPU Python workshop based on [Carpentries Incubator lesson](https://carpentries-incubator.github.io/lesson-gpu-programming/).
## Usage of this hackpad
You can use this hackpad by clicking the [edit button](https://hackmd.io/@research-computing-leeds/n8-gpu-py/edit?both).
This will present the hackpad in a split screen view with editable [markdown](https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax) on the left hand side and rendered markdown on the right hand side.
## Useful links
- [Course notes](https://arc.leeds.ac.uk/lesson-gpu-programming/index.html)
- [Google Colaboratory](https://colab.research.google.com/)
- Join the slack channel for this workshop in the N8 CIR slack workspace
- [Code of Conduct](https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html)
- [Session 1 hackpad](https://hackmd.io/@research-computing-leeds/n8-gpu-py)
- [Session slides](https://hackmd.io/@research-computing-leeds/n8cir-gpu-py#/)
- [Lesson GitHub repository](https://github.com/ARCLeeds/lesson-gpu-programming)
## Agenda
| Time | Agenda |
| --------- | ----------------------------------------------------------------- |
| 10:00 | Welcome and recap |
| 10:20 | [Your First GPU Kernel](https://carpentries-incubator.github.io/lesson-gpu-programming/05-first_program/index.html) |
| 11:00 | Break |
| 11:10 | [Registers, Global, and Local Memory](https://carpentries-incubator.github.io/lesson-gpu-programming/06-global_local_memory/index.html) |
| 12:00 | Lunch |
| 13:00 | [Shared Memory and Synchronization](https://carpentries-incubator.github.io/lesson-gpu-programming/07-shared_memory_and_synchronization/index.html) |
| 14:30 | Break |
| 14:40 | [Constant Memory](https://carpentries-incubator.github.io/lesson-gpu-programming/08-constant_memory/index.html) |
| 15:00 | [Concurrent access to the GPU](https://carpentries-incubator.github.io/lesson-gpu-programming/09-streams/index.html) |
| 16:00 | Close |
## Google Colab
We'll be working through today's examples using Google Colaboratory.
You'll need to register a Google account to use the service, but once registered you can create a new Colab notebook via [colab.research.google.com](https://colab.research.google.com/).
## Misc
Code to fetch radio astronomy data:
```bash!
wget https://github.com/ARCLeeds/lesson-gpu-programming/raw/gh-pages/data/GMRT_image_of_Galactic_Center.fits
```
## Who are you and where do you come from?
1. Alex Coleman, research software engineer from University of Leeds. Likes Python, R, Rust🦀
2. Samantha Finnigan, Research Software Engineer at Durham University. Maker of sensors, platforms, and 3D printed things 🏺
3. Chenzi Xu, Postdoc at University of York, working on person-specific automatic speaker recognition
4. Poppy Welch, Research Assistant at University of York, working on person-specific automatic speaker recognition
5. Andrew Broad, Postgraduate Researcher, CDT for medical diagnosis and care, Leeds. Final year of PhD. Using ARC and Bede in developing attention-based neural networks.
6. Josh Reukauf, PhD-Student in Behaviour-Informatics, Newcastle University, interested in automation of welfare measures in primates.
7. Michael McCorkindale - Bioinformatics Support Unit at Newcastle University. I specialise in HPC pipelining.
8. Clelia Middleton - PhD student with Penfold Group Newcastle University. My PhD is in machine learning for spectroscopy and the code makes use of tensorflow.
9. Cong Zhang, Lecturer in Phonetics and Phonology, Newcastle University.
10. Jordan J. Hood, PhD Student, STOR-i @ Lancaster University, Spatio-Temporal Bayesian modelling (TensorFlow)
11. Adrienne Unsworth - Bioinformatics support unit at Newcastle University. Mostly analyse single cell & spatial transcriptomics data.
12. Daniel Kluvanec - PhD student at Durham University. I'm researching Deep Learning for 3D image processing applied to geological seismic data
13. Xiaoyue Wu - PhD student in Computational Chemistry from University of Leeds
14. Yuzheng Zhang - PhD student in Engineering from Durham University.
15. Jess Bridgen, Postdoc, Lancaster University
# Notes
**GPU Memory allocation:**
#hopefully no mistakes here? CM
**registers** = small, fast memory local to each thread. Highest performance; can hold small arrays.
With **register spillover**, data that doesn't fit in registers is allocated **automatically to local memory** (a per-thread slice of global memory), at a performance penalty.
**NVIDIA GPUs are not "cache-coherent"**: updates to global memory made by one thread are not guaranteed to be visible to other threads until work has completed on all threads.
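As a sketch of the registers/spillover point above (an illustrative kernel, not from the lesson): a small fixed-size per-thread array can be kept in registers, but a much larger array, or indexing the compiler cannot resolve at compile time, is spilled to the slower per-thread local memory.

```cuda
// Illustrative CUDA kernel: `tmp` is small and indexed with compile-time
// constants, so the compiler can keep it in registers. A large array
// here (or dynamic indexing) would spill to per-thread local memory.
extern "C" __global__ void scale3(const float *in, float *out, int n)
{
    int i = (blockIdx.x * blockDim.x + threadIdx.x) * 3;
    if (i + 2 < n) {
        float tmp[3];                 // small enough to live in registers
        for (int j = 0; j < 3; j++)
            tmp[j] = in[i + j] * 2.0f;
        for (int j = 0; j < 3; j++)
            out[i + j] = tmp[j];
    }
}
```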
**static allocation** of local memory of a specific size from the host can be achieved by **passing the memory size as an argument** when building the kernel.
**shared memory** = readable and writable by every thread in a block. Declare a variable with `__shared__` in the kernel to place it there.
Adding the keyword `extern` to the declaration allows a **parameter supplied from the host at launch time to determine the size of the shared-memory array.**
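As an illustrative sketch (not taken verbatim from the lesson) of an `extern __shared__` declaration whose size is supplied at launch time:

```cuda
// Illustrative kernel: `extern __shared__` declares a dynamically sized
// shared-memory array; its size in bytes is set by the host at launch
// (e.g. the `shared_mem` argument of a CuPy RawKernel call, or the third
// <<<grid, block, bytes>>> parameter in plain CUDA).
extern "C" __global__ void partial_sums(const float *in, float *out)
{
    extern __shared__ float buf[];    // size chosen by the host at launch

    // each thread stages one element of its block's slice into shared memory
    buf[threadIdx.x] = in[blockIdx.x * blockDim.x + threadIdx.x];
    __syncthreads();                  // wait until the whole block has written

    if (threadIdx.x == 0) {
        float s = 0.0f;
        for (int j = 0; j < blockDim.x; j++)
            s += buf[j];
        out[blockIdx.x] = s;          // one partial sum per block
    }
}
```

From CuPy the launch would look something like `kernel((grid,), (block,), (d_in, d_out), shared_mem=block * 4)`, with `shared_mem` sizing `buf` to one `float` per thread.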
if using shared allocation techniques, take care to ensure that different threads don't overwrite the same locations in shared memory; this can be accounted for with, e.g., an appropriate offset when indexing.
`atomicAdd` can be used in CUDA code to prevent the same output location being overwritten when multiple threads update it concurrently.
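A minimal sketch of the `atomicAdd` point, along the lines of the lesson's histogram example (kernel written here for illustration):

```cuda
// Illustrative histogram kernel: many threads may read the same input
// value, so a plain `hist[v] += 1` would be a race condition;
// atomicAdd serialises the concurrent updates to each bin.
extern "C" __global__ void histogram(const int *in, int *hist, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        atomicAdd(&hist[in[i]], 1);   // safe concurrent increment
}
```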
# Further Reading
- Blog post on [CUDA memory hierarchy](http://thebeardsage.com/cuda-memory-hierarchy/)