
High-Level GPU Programming with Julia - 鄭景文


slides

What is a GPU

How a CPU Draws

How a GPU Draws


GPU Programming Concept

  • Utilize the parallel structure
  • Lots of threads running at the same time

Workflow

  1. Allocate GPU memory
  2. Copy data to the device
  3. Call the kernel <- this is the part that runs in parallel (see the sketch below)
  4. Copy data back to the host
  5. Free GPU memory
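As a rough sketch of these five steps in Julia (using the CUDAnative.jl and CuArrays.jl packages introduced later in the talk; add_kernel and the sizes are illustrative, not the speaker's code):

using CUDAnative, CuArrays

# a toy kernel: element-wise addition, one element per thread
function add_kernel(c, a, b)
    i = threadIdx().x + (blockIdx().x - 1) * blockDim().x
    if i <= length(c)
        c[i] = a[i] + b[i]
    end
    return nothing
end

a, b = rand(Float32, 1024), rand(Float32, 1024)
d_a, d_b = CuArray(a), CuArray(b)     # steps 1-2: allocate GPU memory and copy to device
d_c = similar(d_a)
@cuda blocks=4 threads=256 add_kernel(d_c, d_a, d_b)   # step 3: call the kernel (the parallel part)
c = Array(d_c)                        # step 4: copy the result back to the host
# step 5: the GPU memory is reclaimed by Julia's GC once d_a, d_b, d_c are unused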

How GPU Programming Looks in Other Languages

In almost every other language you still need to write the kernels themselves in CUDA C.

Why Julia

  • High-level programming language with low-level performance
    • No need to worry that the non-GPU part is too slow
  • Provides a first-class array implementation
    • No more reimplementing array / tensor types
  • Good compiler design
    • Allows compiling native Julia code for the GPU

JULIA OVER GPU

Kernel

  • Code that runs on the GPU
  • Executed on multiple GPU threads

Writing a Kernel

  • Pick a function you want to parallelize
  • Split the workload into several subsets

THREADS & BLOCKS

A GPU launch runs many blocks, and each block contains many threads.
threadIdx gives a thread's position within its block.

  • The available numbers of threads and blocks are hardware dependent
  • Threads and blocks can be organized as a 1-, 2-, or 3-dimensional grid (see the sketch after this list)
    • e.g. 32 threads can be laid out as (4, 8) or (4, 4, 2)
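A small sketch of those two layouts of 32 threads (assuming CUDAnative.jl; layout_kernel is an illustrative name, not from the talk):

using CUDAnative

# each thread reports its position inside the block and the block's dimensions
function layout_kernel()
    @cuprintf("thread (%ld, %ld, %ld) in a (%ld, %ld, %ld) block\n",
              Int64(threadIdx().x), Int64(threadIdx().y), Int64(threadIdx().z),
              Int64(blockDim().x), Int64(blockDim().y), Int64(blockDim().z))
    return nothing
end

@cuda threads=(4, 8)    layout_kernel()   # 32 threads as a 2D block
@cuda threads=(4, 4, 2) layout_kernel()   # the same 32 threads as a 3D block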

GPU PROGRAMMING IN JULIA

Packages

  • CUDAdrv.jl: wraps the CUDA driver API (used below for profiling)
  • CUDAnative.jl: write and compile GPU kernels in plain Julia (@cuda, @cuprintf)
  • CuArrays.jl: a high-level GPU array type (cu, CuArray)

GPU Hello World

using CUDAnative

# hello world kernel: every thread prints its block and thread index
function greeting()
    @cuprintf("This is block %ld, thread %d speaking, Hello\n",
              Int64(blockIdx().x), Int32(threadIdx().x))
    return nothing
end

# 2D variant: also report the y indices
function greeting2d()
    @cuprintf("This is block (%ld,%ld), thread (%d,%d) speaking, Hello\n", Int64(blockIdx().x),
              Int64(blockIdx().y), Int32(threadIdx().x), Int32(threadIdx().y))
    return nothing
end

helloGPU(b, t) = @cuda blocks=b threads=t greeting()   # launch helper
helloGPU(3, 5)                                         # 3 blocks of 5 threads each

HIGH LEVEL GPU PROGRAMMING

  • With CuArrays.jl
    • cu() moves a host array onto the GPU (see the sketch below)
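A minimal sketch of this high-level style, assuming CuArrays.jl: once an array lives on the GPU, ordinary broadcasts and reductions run as GPU kernels.

using CuArrays

x = cu(rand(Float32, 1024))   # copy a host array to the GPU
y = cu(rand(Float32, 1024))

z = x .* y .+ 2f0             # broadcasting compiles to a fused GPU kernel
sum(z)                        # reductions also run on the GPU
Array(z)                      # copy the result back to the host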

GPU UNAWARE CODE ON GPU
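The point of this slide, sketched here under the same CuArrays.jl assumption (the predict example is illustrative, not the speaker's code): generic Julia code written with no GPU in mind runs on the GPU as soon as it is handed GPU arrays.

using CuArrays

# plain Julia functions, written with no GPU in mind
sigmoid(x) = 1 / (1 + exp(-x))
predict(W, b, x) = sigmoid.(W * x .+ b)

W, b, x = rand(Float32, 10, 100), rand(Float32, 10), rand(Float32, 100)
predict(W, b, x)                 # runs on the CPU
predict(cu(W), cu(b), cu(x))     # the exact same code runs on the GPU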

GPU KERNEL EXAMPLE

MATMUL

using CuArrays, CUDAnative

# each thread computes one entry C[i, j]; the launch grid must exactly cover C
function matmul_kernel(C, A, B)
    i = threadIdx().x + (blockIdx().x - 1) * blockDim().x
    j = threadIdx().y + (blockIdx().y - 1) * blockDim().y
    acc = zero(eltype(C))
    for k in 1:size(A, 2)
        acc += A[i, k] * B[k, j]
    end
    C[i, j] = acc
    return nothing
end
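A hedged launch sketch for the kernel above (sizes are illustrative; the (16, 16) × (4, 4) launch exactly tiles a 64×64 result):

A = cu(rand(Float32, 64, 64))
B = cu(rand(Float32, 64, 64))
C = similar(A)

# a 2D launch: 16×16 threads per block, 4×4 blocks to cover all of C
@cuda threads=(16, 16) blocks=(4, 4) matmul_kernel(C, A, B)

Array(C) ≈ Array(A) * Array(B)   # check against the CPU result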

CONV2D
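The CONV2D code from the slides was not captured in the note; as a hedged stand-in in the same style, a naive direct 2D convolution kernel (one thread per output pixel; conv2d_kernel and the valid-padding choice are assumptions, not the speaker's code) might look like:

using CUDAnative

# naive "valid" 2D convolution / cross-correlation: each thread computes one output pixel
function conv2d_kernel(out, img, w)
    i = threadIdx().x + (blockIdx().x - 1) * blockDim().x
    j = threadIdx().y + (blockIdx().y - 1) * blockDim().y
    if i <= size(out, 1) && j <= size(out, 2)
        acc = zero(eltype(out))
        for di in 1:size(w, 1), dj in 1:size(w, 2)
            acc += img[i + di - 1, j + dj - 1] * w[di, dj]
        end
        out[i, j] = acc
    end
    return nothing
end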

GATHER
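The GATHER code was likewise not transcribed; a minimal sketch of a gather kernel (out[i] = src[idx[i]]; the names are assumptions) would be:

using CUDAnative, CuArrays

# gather: each thread copies the element selected by the index array
function gather_kernel(out, src, idx)
    i = threadIdx().x + (blockIdx().x - 1) * blockDim().x
    if i <= length(out)
        out[i] = src[idx[i]]
    end
    return nothing
end

src, idx = cu(rand(Float32, 100)), CuArray([1, 10, 20, 30, 40])
out = similar(src, 5)
@cuda threads=5 gather_kernel(out, src, idx)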

TOOLS

PROFILING WITH NVPROF

CUDAdrv.@profile func()

Info reported by nvprof for the profiled region:

  • API calls
  • timing statistics: max / min / avg
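A sketch of typical usage (assuming CUDAdrv.jl's @profile, which only delimits the region to profile; the script itself is then run under nvprof):

# prof.jl -- run e.g. as: nvprof --profile-from-start off julia prof.jl
using CUDAdrv, CuArrays

a = cu(rand(Float32, 1024, 1024))
b = cu(rand(Float32, 1024, 1024))

a * b                      # warm up so compilation is not profiled
CUDAdrv.@profile a * b     # only this region appears in nvprof's report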

WHAT GETS COMPILED

CUDAnative provides reflection macros that show each stage of what @cuda compiles; each one is used as a prefix to an @cuda call:

  • @device_code_typed: typed Julia IR
  • @device_code_llvm: LLVM IR
  • @device_code_ptx: PTX assembly
  • @device_code_sass: native SASS assembly
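For instance (a sketch; dummy is a trivial placeholder kernel):

using CUDAnative

dummy() = nothing   # a trivial kernel, just something to inspect

@device_code_typed @cuda dummy()   # typed Julia IR
@device_code_llvm  @cuda dummy()   # LLVM IR
@device_code_ptx   @cuda dummy()   # PTX assembly
@device_code_sass  @cuda dummy()   # native SASS (requires the CUDA toolkit's disassembler)
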
tags: COSCUP2019 Julia language IB501