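# PP assignment 6
###### tags: `PP Assignment`

### Q1
**Explain your implementation. How do you optimize the performance of convolution?**

* In ```hostFE.c```, I set the work-group size to 25 x 25. This ensures that every pixel is processed by a work-item, and the size is chosen to suit the device.
* The kernel function lives in ```kernel.cl```. It starts by getting the global ID of the current work-item, which identifies the pixel that work-item is responsible for. The pixel value is then computed using zero-padded convolution, based on the function in ```serialConv.c```.

```clike=1
// Note: the parameter list and address-space qualifiers are assumed here;
// the names match the variables used in the kernel body.
__kernel void convolution(__global const float *inputImage,
                          __global float *output,
                          __constant float *filter,
                          int imageWidth, int imageHeight, int filterWidth)
{
    int i = get_global_id(0); // column of the output pixel
    int j = get_global_id(1); // row of the output pixel
    int halffilterSize = filterWidth / 2;
    float conv = 0.0f;

    for (int k = -halffilterSize; k <= halffilterSize; k++) {
        for (int l = -halffilterSize; l <= halffilterSize; l++) {
            // Zero-padding: neighbours outside the image contribute nothing.
            if (j + k >= 0 && j + k < imageHeight &&
                i + l >= 0 && i + l < imageWidth) {
                conv += inputImage[(j + k) * imageWidth + i + l] *
                        filter[(k + halffilterSize) * filterWidth + l + halffilterSize];
            }
        }
    }
    output[j * imageWidth + i] = conv;
}
```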
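
To sanity-check the kernel logic without a GPU, the same zero-padded convolution can be sketched in plain C. This is only an illustrative sketch, not the actual ```serialConv.c```: the function names `conv_pixel` and `conv_reference` are hypothetical, and the row-major `float` layout is assumed from the indexing in the kernel. `conv_pixel` does what a single work-item with global ID `(i, j)` would do, and `conv_reference` plays the role of the NDRange by visiting every pixel once.

```c
#include <assert.h>

/* Hypothetical helper: computes one output pixel the way a single
 * work-item with global ID (i, j) would. i is the column, j is the row. */
float conv_pixel(int i, int j, int filterWidth, const float *filter,
                 int imageHeight, int imageWidth, const float *inputImage)
{
    int half = filterWidth / 2;
    float sum = 0.0f;

    for (int k = -half; k <= half; k++) {
        for (int l = -half; l <= half; l++) {
            int row = j + k;
            int col = i + l;
            /* Zero-padding: skip neighbours that fall outside the image. */
            if (row >= 0 && row < imageHeight && col >= 0 && col < imageWidth) {
                sum += inputImage[row * imageWidth + col] *
                       filter[(k + half) * filterWidth + (l + half)];
            }
        }
    }
    return sum;
}

/* Hypothetical host-side reference: the two loops stand in for the 2D
 * global work size, so each (i, j) pair corresponds to one work-item. */
void conv_reference(int filterWidth, const float *filter,
                    int imageHeight, int imageWidth,
                    const float *inputImage, float *output)
{
    assert(filterWidth % 2 == 1); /* the indexing assumes an odd filter width */

    for (int j = 0; j < imageHeight; j++) {
        for (int i = 0; i < imageWidth; i++) {
            output[j * imageWidth + i] =
                conv_pixel(i, j, filterWidth, filter,
                           imageHeight, imageWidth, inputImage);
        }
    }
}
```

One possible use is to run `conv_reference` on the same input as the OpenCL version and compare the two output buffers element by element, allowing a small tolerance for floating-point differences.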