# PP assignment 6
###### tags: `PP Assignment`
### Q1
**Explain your implementation. How do you optimize the performance of convolution?**
* In ```hostFE.c```, I set the work-group size to 25 x 25. This ensures that every pixel is processed by a work-item, and the size is tuned for the device.
* The kernel function is in ```kernel.cl```. It starts by getting the global ID of the current work-item, which identifies the pixel that work-item is responsible for. It then computes that pixel using zero-padded convolution, following the function in ```serialConv.c```:
```clike=1
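// Parameter list and float types are assumed; the original listing omitted them.
__kernel void convolution(__global const float *inputImage,
                          __global float *output,
                          __constant float *filter,
                          const int imageHeight,
                          const int imageWidth,
                          const int filterWidth)
{
    // One work-item per output pixel: i is the column, j is the row.
    int i = get_global_id(0);
    int j = get_global_id(1);
    int halffilterSize = filterWidth / 2;
    float conv = 0.0f;

    for (int k = -halffilterSize; k <= halffilterSize; k++) {
        for (int l = -halffilterSize; l <= halffilterSize; l++) {
            // Zero padding: taps that fall outside the image contribute nothing.
            if (j + k >= 0 && j + k < imageHeight &&
                i + l >= 0 && i + l < imageWidth) {
                conv += inputImage[(j + k) * imageWidth + i + l]
                        * filter[(k + halffilterSize) *
                                 filterWidth + l + halffilterSize];
            }
        }
    }
    output[j * imageWidth + i] = conv;
}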
```
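The 25 x 25 work-group size only covers every pixel if the global work size is a multiple of 25 in each dimension. In OpenCL 1.x, `clEnqueueNDRangeKernel` rejects an explicit local size that does not evenly divide the global size. The following is a minimal host-side sketch of how the global size could be derived. The helper names `round_up` and `compute_global_size` are hypothetical and do not come from ```hostFE.c```.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the nearest multiple of `multiple`. */
static size_t round_up(size_t n, size_t multiple)
{
    assert(multiple > 0);
    return (n + multiple - 1) / multiple * multiple;
}

/* Hypothetical helper: fill global[2] with an NDRange that covers the
 * whole image and is evenly divisible by the work-group size local[2]. */
static void compute_global_size(size_t imageWidth, size_t imageHeight,
                                const size_t local[2], size_t global[2])
{
    global[0] = round_up(imageWidth, local[0]);   /* dimension 0: columns (i) */
    global[1] = round_up(imageHeight, local[1]);  /* dimension 1: rows (j) */
}
```

For an image whose width and height are already multiples of 25, the global size equals the image size and no work-item is wasted. If the size has to be rounded up, the kernel would also need an early return for work-items with `i >= imageWidth` or `j >= imageHeight`, since those would otherwise write outside `output`.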