Due: Tue, 2023/4/11 23:59
Slides: https://docs.google.com/presentation/d/1Ux1caoQtJmzLOwkE-C1PGS40WoyhO7u_lB94PWq5Fws/edit?usp=sharing
This homework helps you understand the basic concepts in CUDA.
The Sobel operator is used in image processing and computer vision, particularly within edge detection algorithms, where it creates an image emphasising edges.
In this homework, you are given the sequential (CPU) code of a 5x5 variant of the Sobel operator and asked to parallelize it with CUDA. Refer to the appendix for information about the CPU version.
Your code should contain only a single GPU kernel function, named sobel().
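As a rough illustration of why this problem parallelizes well, the sketch below computes one output value for one pixel in plain C++. Each output pixel depends only on its 5x5 neighbourhood in the input, so a CUDA kernel could plausibly assign one thread per pixel and run the same body. The function name sobel_pixel, the caller-supplied masks, the skip-out-of-bounds border handling, and the clamp to 255 are all assumptions made for this sketch; the reference sobel.cc may use different mask values, scaling, or border rules.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

constexpr int kMaskSize = 5;            // 5x5 mask, as stated in the assignment
constexpr int kRadius = kMaskSize / 2;  // 2 pixels on each side of the centre

// Hypothetical helper: computes channel `c` of output pixel (x, y).
// `src` is a tightly packed image with `channels` bytes per pixel.
// `mask_x` / `mask_y` are supplied by the caller; their values are NOT taken
// from the reference code. Neighbours outside the image are skipped
// (an assumption about border handling).
std::uint8_t sobel_pixel(const std::uint8_t* src, int width, int height,
                         int channels, int x, int y, int c,
                         const int mask_x[kMaskSize][kMaskSize],
                         const int mask_y[kMaskSize][kMaskSize]) {
    assert(x >= 0 && x < width && y >= 0 && y < height);
    double gx = 0.0;  // horizontal gradient accumulator
    double gy = 0.0;  // vertical gradient accumulator
    for (int dy = -kRadius; dy <= kRadius; ++dy) {
        for (int dx = -kRadius; dx <= kRadius; ++dx) {
            const int sx = x + dx;
            const int sy = y + dy;
            if (sx < 0 || sx >= width || sy < 0 || sy >= height) continue;
            const double v = src[(sy * width + sx) * channels + c];
            gx += v * mask_x[dy + kRadius][dx + kRadius];
            gy += v * mask_y[dy + kRadius][dx + kRadius];
        }
    }
    // Gradient magnitude, clamped to the 8-bit range.
    const double magnitude = std::sqrt(gx * gx + gy * gy);
    return static_cast<std::uint8_t>(std::min(magnitude, 255.0));
}
```

In a GPU version, x and y would typically be derived from the block and thread indices rather than from loops over the image.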
The input file is a PNG image with 3 color channels: RGB.
The output file is a PNG image with 3 color channels: RGB.
Your output is considered correct if at least 99.8% of its pixels are identical to those produced by the provided sequential version.
Your output is considered incorrect if the dimensions of the output image are incorrect.
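The two rules above can be sketched as a small C++ check. This is not the actual comparison tool; the function names, the tightly packed RGB layout, and the choice to treat a pixel as identical only when all three channel bytes match are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: counts pixels whose R, G and B bytes all match.
// Both buffers are assumed to be tightly packed with 3 bytes per pixel.
std::size_t count_matching_pixels(const std::vector<std::uint8_t>& expected,
                                  const std::vector<std::uint8_t>& actual) {
    assert(expected.size() == actual.size());
    assert(expected.size() % 3 == 0);
    std::size_t same = 0;
    for (std::size_t i = 0; i < expected.size(); i += 3) {
        if (expected[i] == actual[i] && expected[i + 1] == actual[i + 1] &&
            expected[i + 2] == actual[i + 2]) {
            ++same;
        }
    }
    return same;
}

// Applies both rules: the dimensions must match, and at least 99.8% of the
// pixels must be identical. Integer arithmetic avoids rounding at the threshold.
bool output_is_correct(const std::vector<std::uint8_t>& expected,
                       unsigned expected_w, unsigned expected_h,
                       const std::vector<std::uint8_t>& actual,
                       unsigned actual_w, unsigned actual_h) {
    if (expected_w != actual_w || expected_h != actual_h) return false;
    const std::size_t pixels =
        static_cast<std::size_t>(expected_w) * expected_h;
    assert(expected.size() == pixels * 3 && actual.size() == pixels * 3);
    const std::size_t same = count_matching_pixels(expected, actual);
    return same * 1000 >= pixels * 998;
}
```

With a 1000-pixel image, for instance, this check tolerates at most 2 differing pixels.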
We use the NCHC container for this homework.
We use a Makefile to build your code. The default Makefile for this homework is provided at /tmp/dataset-nthu-ipc23/share/hw3/Makefile.
If you wish to change the compilation flags, include a Makefile in your submission.
To use the Makefile to build your code, make sure Makefile and hw3.cu are in the working directory, then run make on the command line and it will build hw3 for you. To remove the built files, run make clean.
We will compile your code with the following command:
Your code will be executed with a command equivalent to:
The time limit for each test case is 30 seconds.
Answer the following questions, in either English or Traditional Chinese.
cudaMalloc and cudaMallocManaged? When will you pick one over the other?
nvprof. Show the difference with and without shared memory. In addition, measure the global memory load throughput (gld_throughput) and instructions per cycle (ipc), and explain your observation.
Upload these files to EEClass:
hw3.cu – the source code of your implementation.
Makefile – optional. Submit this file if you want to change the build command.
report.pdf – your report.
Please follow the naming listed above carefully. Failing to adhere to the names above will result in a point deduction. Here are a few bad examples: hw3.CU, HW3.cu, report.docx, report.pages, Makefile.mak.
Please note that this spec, the sample test cases and programs might contain bugs.
If you spot one and are unsure about it, please ask on EEClass.
The reference C++ implementation is at /tmp/dataset-nthu-ipc23/share/hw3/sobel.cc.
The reference code follows the same input/output format as your homework, and you can start implementing your version by copying it to hw3.cu.
The sample test cases are located at /tmp/dataset-nthu-ipc23/share/hw3/samples.
/tmp/dataset-nthu-ipc23/share/hw3/hw3-diff can be used to compare two images.
For example, to compare your output with the answer, you may use:
The hw3-judge and hw3-kernel-judge commands can be used to automatically judge your code against all sample test cases. They also submit your execution time to the scoreboard so you can compare your performance with others.
Scoreboard: https://apollo.cs.nthu.edu.tw/ipc23/scoreboard/hw3/
https://apollo.cs.nthu.edu.tw/ipc23/scoreboard/hw3-kernel/
To use it, run hw3-judge in the directory that contains your code hw3.cu.
It will automatically search for a Makefile and use it to compile your code, or fall back to the TA-provided /tmp/dataset-nthu-ipc23/share/hw3/Makefile otherwise.
If code compilation is successful, it will then run all the sample test cases, show you the results, and update the scoreboard.
Note: hw3-judge and the scoreboard have nothing to do with grading.
Only the code submitted to EEClass is considered for grading purposes.
Type hw3-judge --help to see a list of supported options.
Verdict | Explanation |
---|---|
internal error | there is a bug in the judge |
time limit exceeded+ | execution time > time limit + 10 seconds |
time limit exceeded | execution time > time limit |
runtime error | your program did not return 0 or was terminated by a signal |
no output | your program did not produce an output file |
wrong answer | your output is incorrect |
accepted | you passed the test case |
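
One way to read the table is as a sequence of checks applied in order. The C++ sketch below encodes that reading; the RunResult struct, its field names, and the order of precedence (taken from the row order of the table) are assumptions for illustration and do not describe how the judge is actually implemented.

```cpp
#include <cassert>
#include <string>

// Hypothetical record of a single test-case run.
struct RunResult {
    bool judge_failed;      // the judge itself hit a bug
    double seconds;         // measured execution time
    int exit_code;          // value returned by the program
    bool killed_by_signal;  // program was terminated by a signal
    bool produced_output;   // an output file exists
    bool output_correct;    // output passed the comparison
};

// Maps a run to one of the verdict strings listed in the table.
// Checks are applied top to bottom, in the same order as the table rows.
std::string verdict(const RunResult& r, double time_limit) {
    assert(time_limit >= 0.0);
    if (r.judge_failed) return "internal error";
    if (r.seconds > time_limit + 10.0) return "time limit exceeded+";
    if (r.seconds > time_limit) return "time limit exceeded";
    if (r.killed_by_signal || r.exit_code != 0) return "runtime error";
    if (!r.produced_output) return "no output";
    if (!r.output_correct) return "wrong answer";
    return "accepted";
}
```

Under this reading and a 30-second limit, a run that takes 31 seconds would be reported as time limit exceeded even if its output is correct.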