# Parallel Programming 2022 fall hw5
## 1. Counting Sort Questions
```c
#include <stdlib.h>
#include <string.h>

void Count_sort(int a[], int n) {
    int i, j, count;
    int *temp = malloc(n * sizeof(int));
    for (i = 0; i < n; i++) {
        count = 0;
        for (j = 0; j < n; j++)
            if (a[j] < a[i])
                count++;
            else if (a[j] == a[i] && j < i)
                count++;
        temp[count] = a[i];
    }
    memcpy(a, temp, n * sizeof(int));
    free(temp);
}   /* Count_sort */
```
1. >If we try to parallelize the for i loop (the outer loop), which variables should be private and which should be shared?
Private: `i` (the loop index, private by default), `j`, and `count`.
Shared: `a[]`, `temp[]`, and `n`.
2. >If we parallelize the for i loop using the scoping you specified in the previous part, are there any loop-carried dependencies? Explain your answer.
No. Each iteration only reads `a[]` and writes to `temp[count]`, and `count` is a unique rank for each `i`, so no iteration reads or overwrites data produced by another iteration (there is nothing like `a[i-1]` or `a[i-2]`). OpenMP also requires that a parallelized for loop be free of loop-carried dependencies, so this scoping is valid.
3. > Can we parallelize the call to memcpy? Can we modify the code so that this part of the function will be parallelizable?
Not easily, and parallelizing the single `memcpy` call by itself would not help much, since every thread must first finish reading all of `a[]` before any of it may be overwritten.
We can modify the code so the copy is parallelizable, though: have each thread remember the `count` it computed for its local `a[i]`, wait at a barrier until all threads are done reading `a[]`, and then let each thread copy its own element back with `a[count] = temp[count]`. Equivalently, the `memcpy` can be replaced by a second parallel for loop that does `a[i] = temp[i]`, as in the sketch after this list.
4. >Write a C program that includes a parallel implementation of Count sort.
A sketch of the approach follows this list.
5. >How does the performance of your parallelization of Count sort compare to serial Count sort? How does it compare to the serial qsort library function?
- qsort performs best because of its better time complexity: O(n log n) comparisons versus the O(n^2) comparisons of count sort.
- Parallel count sort is much faster than serial count sort, but it still does not catch up with qsort for large inputs.
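
Below is a minimal sketch of the parallel approach for questions 3 and 4 (names such as `Count_sort_parallel` and `thread_count` are illustrative, not necessarily the ones in the submitted program): the ranking loop uses the scoping from question 1, and the `memcpy` is replaced by a second parallel loop so the copy-back also runs in parallel.
```c
#include <stdlib.h>
#include <omp.h>

/* Sketch only: a parallel Count_sort with the memcpy replaced by a loop. */
void Count_sort_parallel(int a[], int n, int thread_count) {
    int *temp = malloc(n * sizeof(int));

    #pragma omp parallel num_threads(thread_count) shared(a, temp, n)
    {
        /* Ranking phase: i, j, count are private (declared inside the loop),
           a, temp, n are shared, matching the scoping from question 1. */
        #pragma omp for
        for (int i = 0; i < n; i++) {
            int count = 0;
            for (int j = 0; j < n; j++)
                if (a[j] < a[i] || (a[j] == a[i] && j < i))
                    count++;
            temp[count] = a[i];   /* each i gets a unique rank, so writes never collide */
        }

        /* Implicit barrier at the end of the loop above: all threads are done
           reading a[] before any element is overwritten. */
        #pragma omp for
        for (int i = 0; i < n; i++)
            a[i] = temp[i];       /* replaces memcpy with a parallelizable loop */
    }
    free(temp);
}
```
Keeping both loops inside one parallel region avoids forking threads twice, and the implicit barrier between the two `omp for` loops plays the role of the barrier described in question 3.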

## 2. Producer-Consumer
>Given **==a keyword file== that contains many keywords**. Use OpenMP to implement a producer-consumer program in which some of the threads are producers and others are consumers. The **producers read text from ==a collection of files==, one per producer**. They insert lines of text into a single shared queue (please implement your own thread-safe queue). The consumers take the lines of text and tokenize them. Tokens are “words” separated by white space. When a consumer finds a token that is a keyword, the count for that keyword is increased by one. Please print each keyword and its count.
```mermaid
graph LR
file0.txt --> producer0 --> tail
file1.txt --> producer1 --> tail
file2.txt --> producer2 --> tail
file3.txt --> producer3 --> tail
head --> consumer0 --> key_count
head --> consumer1 --> key_count
head --> consumer2 --> key_count
head --> consumer3 --> key_count
tail --thread_safe_queue--> head
keywords.txt--> consumer0
keywords.txt--> consumer1
keywords.txt--> consumer2
keywords.txt--> consumer3
```
- Each producer gets its own filename, reads the file line by line, and pushes each line into the queue.
```c=218
#pragma omp for schedule(auto) nowait
for (int i = 0; i < file_count; i++)
{
    producer(&threadSafeQueue, file_name[i]);
    // printf("openfilename: %s\n", file_name[i]);
}
```
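  The assignment asks for a self-written thread-safe queue. Below is a minimal sketch of what `threadSafeQueue` and `producer` could look like, matching the call above; the struct layout and names (`queue_t`, `queue_push`, `done_producers`, ...) are assumptions for this sketch, using a linked list guarded by an `omp_lock_t`.
```c
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Sketch: singly linked list protected by an OpenMP lock (names are illustrative). */
typedef struct node { char *line; struct node *next; } node_t;

typedef struct {
    node_t *head, *tail;
    omp_lock_t lock;
    int done_producers;            /* number of producers that have finished their file */
} queue_t;

void queue_init(queue_t *q) {
    q->head = q->tail = NULL;
    q->done_producers = 0;
    omp_init_lock(&q->lock);
}

void queue_push(queue_t *q, const char *line) {
    node_t *n = malloc(sizeof *n);
    n->line = strdup(line);
    n->next = NULL;
    omp_set_lock(&q->lock);        /* only one thread touches the list at a time */
    if (q->tail) q->tail->next = n; else q->head = n;
    q->tail = n;
    omp_unset_lock(&q->lock);
}

/* Returns a line (caller frees it) or NULL if the queue is currently empty. */
char *queue_pop(queue_t *q) {
    char *line = NULL;
    omp_set_lock(&q->lock);
    if (q->head) {
        node_t *n = q->head;
        q->head = n->next;
        if (!q->head) q->tail = NULL;
        line = n->line;
        free(n);
    }
    omp_unset_lock(&q->lock);
    return line;
}

/* One producer: read its file line by line and enqueue every line. */
void producer(queue_t *q, const char *filename) {
    FILE *fp = fopen(filename, "r");
    if (!fp) { perror(filename); return; }
    char buf[4096];
    while (fgets(buf, sizeof buf, fp))
        queue_push(q, buf);
    fclose(fp);
    #pragma omp atomic
    q->done_producers++;           /* tell consumers this producer is done */
}
```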
- I create the same number of consumers as producers; each consumer pops a line from the queue and tokenizes it (see the sketch below).
- I generated most of the test text files with ChatGPT.
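  A consumer loop in the same spirit might look like the sketch below. It continues the queue sketch above (reusing `queue_t`, `queue_pop`, and `done_producers`); `keywords`, `key_count`, and `num_keywords` are assumed to be loaded from keywords.txt before the parallel region starts, and the keyword table is searched linearly for simplicity.
```c
/* Sketch: pop lines, split them on white space, and count keyword hits. */
void consumer(queue_t *q, char **keywords, int *key_count,
              int num_keywords, int num_producers) {
    for (;;) {
        int done;
        #pragma omp atomic read
        done = q->done_producers;  /* read before popping so we never miss a late push */

        char *line = queue_pop(q);
        if (!line) {
            if (done == num_producers) break;   /* queue empty and no producer left */
            continue;                           /* queue empty but producers still running */
        }

        char *save = NULL;
        for (char *tok = strtok_r(line, " \t\r\n", &save); tok;
             tok = strtok_r(NULL, " \t\r\n", &save)) {
            for (int k = 0; k < num_keywords; k++)
                if (strcmp(tok, keywords[k]) == 0) {
                    #pragma omp atomic
                    key_count[k]++;             /* shared counters need an atomic update */
                }
        }
        free(line);
    }
}
```
`strtok_r` (POSIX) is used instead of `strtok` because every consumer tokenizes its own line concurrently and `strtok` keeps hidden shared state.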

## Difficulties
- OpenMP makes writing parallel code much easier, but parallel programs are still very hard for me to debug.
## References
- [OpenMP: Data-Sharing Rules](http://jakascorner.com/blog/2016/06/omp-data-sharing-attributes.html#:~:text=A%20variable%20in%20an%20OpenMP,copy%20of%20the%20private%20variable.)
- [OpenMP: For & Scheduling](http://jakascorner.com/blog/2016/06/omp-for-scheduling.html)