
High-Level GPU Programming with Julia - 鄭景文


slides

What is a GPU

How a CPU Draws

How a GPU Draws


GPU Programming Concept

  • Utilize the parallel structure
  • Lots of threads running at the same time

Workflow

  1. Allocate GPU memory
  2. Copy data to the device
  3. Call the kernel <- this is the part that runs in parallel (see the sketch below)
  4. Copy data back to the host
  5. Free GPU memory
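As a rough sketch of these five steps in Julia (using the CUDAnative.jl and CuArrays.jl packages introduced later in the talk; add_kernel and the sizes are illustrative, not the speaker's code):

using CUDAnative, CuArrays

# a toy kernel: element-wise addition, one element per thread
function add_kernel(c, a, b)
    i = threadIdx().x + (blockIdx().x - 1) * blockDim().x
    if i <= length(c)
        c[i] = a[i] + b[i]
    end
    return nothing
end

a, b = rand(Float32, 1024), rand(Float32, 1024)
d_a, d_b = CuArray(a), CuArray(b)     # steps 1-2: allocate GPU memory and copy to device
d_c = similar(d_a)
@cuda blocks=4 threads=256 add_kernel(d_c, d_a, d_b)   # step 3: call the kernel (the parallel part)
c = Array(d_c)                        # step 4: copy the result back to the host
# step 5: the GPU memory is reclaimed by Julia's GC once d_a, d_b, d_c are unused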

How GPU Programming Looks in Other Languages

In almost every other language you still need to write the kernels themselves in CUDA C.

Why Julia

  • High-level programming language with low-level performance
    • No need to worry that the non-GPU part is too slow
  • Provides a first-class array implementation
    • No more reimplementing array / tensor types
  • Good compiler design
    • Allows compiling native Julia code for the GPU

JULIA OVER GPU

Kernel

  • Code that runs on the GPU
  • Executed on multiple GPU threads

Writing a Kernel

  • Pick a function you want to parallelize
  • Split the workload into several subsets

THREADS & BLOCKS

A GPU launch runs many blocks, and each block contains many threads.
threadIdx gives a thread's position within its block.

  • The available numbers of threads and blocks are hardware dependent
  • Threads and blocks can be organized as a 1-, 2-, or 3-dimensional grid (see the sketch after this list)
    • e.g. 32 threads can be laid out as (4, 8) or (4, 4, 2)
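A small sketch of those two layouts of 32 threads (assuming CUDAnative.jl; layout_kernel is an illustrative name, not from the talk):

using CUDAnative

# each thread reports its position inside the block and the block's dimensions
function layout_kernel()
    @cuprintf("thread (%ld, %ld, %ld) in a (%ld, %ld, %ld) block\n",
              Int64(threadIdx().x), Int64(threadIdx().y), Int64(threadIdx().z),
              Int64(blockDim().x), Int64(blockDim().y), Int64(blockDim().z))
    return nothing
end

@cuda threads=(4, 8)    layout_kernel()   # 32 threads as a 2D block
@cuda threads=(4, 4, 2) layout_kernel()   # the same 32 threads as a 3D block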

GPU PROGRAMMING IN JULIA

Packages

  • CUDAdrv.jl: wraps the CUDA driver API (used below for profiling)
  • CUDAnative.jl: write and compile GPU kernels in plain Julia (@cuda, @cuprintf)
  • CuArrays.jl: a high-level GPU array type (cu, CuArray)

GPU Hello World

using CUDAnative

# hello world kernel: every thread prints its block and thread index
function greeting()
    @cuprintf("This is block %ld, thread %d speaking, Hello\n",
              Int64(blockIdx().x), Int32(threadIdx().x))
    return nothing
end

# 2D variant: also report the y indices
function greeting2d()
    @cuprintf("This is block (%ld,%ld), thread (%d,%d) speaking, Hello\n", Int64(blockIdx().x),
              Int64(blockIdx().y), Int32(threadIdx().x), Int32(threadIdx().y))
    return nothing
end

helloGPU(b, t) = @cuda blocks=b threads=t greeting()   # launch helper
helloGPU(3, 5)                                         # 3 blocks of 5 threads each

HIGH LEVEL GPU PROGRAMMING

  • With CuArrays.jl
    • cu() moves a host array onto the GPU (see the sketch below)
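A minimal sketch of this high-level style, assuming CuArrays.jl: once an array lives on the GPU, ordinary broadcasts and reductions run as GPU kernels.

using CuArrays

x = cu(rand(Float32, 1024))   # copy a host array to the GPU
y = cu(rand(Float32, 1024))

z = x .* y .+ 2f0             # broadcasting compiles to a fused GPU kernel
sum(z)                        # reductions also run on the GPU
Array(z)                      # copy the result back to the host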

GPU UNAWARE CODE ON GPU
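The point of this slide, sketched here under the same CuArrays.jl assumption (the predict example is illustrative, not the speaker's code): generic Julia code written with no GPU in mind runs on the GPU as soon as it is handed GPU arrays.

using CuArrays

# plain Julia functions, written with no GPU in mind
sigmoid(x) = 1 / (1 + exp(-x))
predict(W, b, x) = sigmoid.(W * x .+ b)

W, b, x = rand(Float32, 10, 100), rand(Float32, 10), rand(Float32, 100)
predict(W, b, x)                 # runs on the CPU
predict(cu(W), cu(b), cu(x))     # the exact same code runs on the GPU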

GPU KERNEL EXAMPLE

MATMUL

using CuArrays, CUDAnative

# each thread computes one entry C[i, j]; the launch grid must exactly cover C
function matmul_kernel(C, A, B)
    i = threadIdx().x + (blockIdx().x - 1) * blockDim().x
    j = threadIdx().y + (blockIdx().y - 1) * blockDim().y
    acc = zero(eltype(C))
    for k in 1:size(A, 2)
        acc += A[i, k] * B[k, j]
    end
    C[i, j] = acc
    return nothing
end
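A hedged launch sketch for the kernel above (sizes are illustrative; the (16, 16) × (4, 4) launch exactly tiles a 64×64 result):

A = cu(rand(Float32, 64, 64))
B = cu(rand(Float32, 64, 64))
C = similar(A)

# a 2D launch: 16×16 threads per block, 4×4 blocks to cover all of C
@cuda threads=(16, 16) blocks=(4, 4) matmul_kernel(C, A, B)

Array(C) ≈ Array(A) * Array(B)   # check against the CPU result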

CONV2D
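The CONV2D code from the slides was not captured in the note; as a hedged stand-in in the same style, a naive direct 2D convolution kernel (one thread per output pixel; conv2d_kernel and the valid-padding choice are assumptions, not the speaker's code) might look like:

using CUDAnative

# naive "valid" 2D convolution / cross-correlation: each thread computes one output pixel
function conv2d_kernel(out, img, w)
    i = threadIdx().x + (blockIdx().x - 1) * blockDim().x
    j = threadIdx().y + (blockIdx().y - 1) * blockDim().y
    if i <= size(out, 1) && j <= size(out, 2)
        acc = zero(eltype(out))
        for di in 1:size(w, 1), dj in 1:size(w, 2)
            acc += img[i + di - 1, j + dj - 1] * w[di, dj]
        end
        out[i, j] = acc
    end
    return nothing
end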

GATHER
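The GATHER code was likewise not transcribed; a minimal sketch of a gather kernel (out[i] = src[idx[i]]; the names are assumptions) would be:

using CUDAnative, CuArrays

# gather: each thread copies the element selected by the index array
function gather_kernel(out, src, idx)
    i = threadIdx().x + (blockIdx().x - 1) * blockDim().x
    if i <= length(out)
        out[i] = src[idx[i]]
    end
    return nothing
end

src, idx = cu(rand(Float32, 100)), CuArray([1, 10, 20, 30, 40])
out = similar(src, 5)
@cuda threads=5 gather_kernel(out, src, idx)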

TOOLS

PROFILING WITH NVPROF

CUDAdrv.@profile func()

Info reported by nvprof for the profiled region:

  • API calls
  • timing statistics: max / min / avg
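A sketch of typical usage (assuming CUDAdrv.jl's @profile, which only delimits the region to profile; the script itself is then run under nvprof):

# prof.jl -- run e.g. as: nvprof --profile-from-start off julia prof.jl
using CUDAdrv, CuArrays

a = cu(rand(Float32, 1024, 1024))
b = cu(rand(Float32, 1024, 1024))

a * b                      # warm up so compilation is not profiled
CUDAdrv.@profile a * b     # only this region appears in nvprof's report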

WHAT GETS COMPILED

CUDAnative provides reflection macros that show each stage of what @cuda compiles; each one is used as a prefix to an @cuda call:

  • @device_code_typed: typed Julia IR
  • @device_code_llvm: LLVM IR
  • @device_code_ptx: PTX assembly
  • @device_code_sass: native SASS assembly
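For instance (a sketch; dummy is a trivial placeholder kernel):

using CUDAnative

dummy() = nothing   # a trivial kernel, just something to inspect

@device_code_typed @cuda dummy()   # typed Julia IR
@device_code_llvm  @cuda dummy()   # LLVM IR
@device_code_ptx   @cuda dummy()   # PTX assembly
@device_code_sass  @cuda dummy()   # native SASS (requires the CUDA toolkit's disassembler)
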
tags: COSCUP2019 Julia language IB501