# Assignment 8: Parallelism & Concurrency
This is a **pair or solo Gradescope assignment**—you will submit a README and the following OCaml files:
- `sort.ml`
- `sortfromfile.ml`
- `sorttest.ml`
- `bank.ml`
:::info
Recall that assignments, if completed as a pair, must be completed **together** rather than dividing work. See the syllabus for more.
:::
### Change log
I will avoid making changes to the assignment after it is posted, but any necessary changes will be added here:
- Added note about `time` ("`real`").
# Setup
:::success
Install Domainslib with:
```
opam install domainslib
```
:::
The [stencil code](https://drive.google.com/file/d/1Wt_ZIbDcDFjJsq6OS1xiva5LUJ9zOQsF/view?usp=sharing) is set up as a `dune` OCaml project.
## Part 1: Parallel sort
:::success
Your task in this part is to implement and evaluate a parallel sorting implementation on OCaml records.
:::
:::warning
For this problem, you should not use `List.sort`: you must implement a sorting algorithm yourself.
You may otherwise use any standard OCaml library functions, and may look up external sources as long as you cite them.
:::
You are provided with a file of records, with 1 record per line. A record includes a name (a string of lowercase letters) and an ID (an integer) separated by a comma. Use `String.compare` on record names as the primary key for comparing records, with IDs as the secondary key (for tie-breaking).
Choose one of the following 2 sorting algorithms to implement (or message me on Zulip to propose an alternative sorting algorithm):
1. [Mergesort](https://en.wikipedia.org/wiki/Merge_sort): you have already implemented in Racket in Assignment 1.
2. [Quicksort](https://en.wikipedia.org/wiki/Quicksort): more difficult to parallelize, but interesting to try if you have never implemented it. In practice, it's typically the fastest sorting algorithm for randomized data, and is the basis for most library sorting implementations.
:::info
Run the stencil code from the command line with, for example:
```
$ dune exec ./sortfromfile.exe -- 10_000_records.txt 1 1
```
Note that with the initial stencil code, this is expected to fail with:
```
Fatal error: exception Failure("line_to_record: not implemented")
```
:::
:::success
First, complete the code to read in records from a file (and print results to standard out) in `sortfromfile.ml`. Raise `InvalidInput` when needed.
Then, implement a sequential version of your chosen algorithm in `sort.ml`. The sequential version should ignores the later parameters.
Add several additional smaller, manual tests of `sort` in `sorttest.ml`. Test with `dune test`.
Once you are passing smaller tests, run your sequential code on the 4 provided `_records.txt` files.
:::
Next, parallelize your code!
:::success
In `sort.ml`:
- Use Domainslib tasks and `async`/`await` to parallelize across the provided number of threads, `k`.
- As with the `fib` example from class, you should not attempt to parallelize subproblems if the size is under some limit, here called`min_par_size`. For initial testing, I suggest setting `min_par_size` to just 2. For actual performance measurement, raise it (start with 100, experiment to adjust this).
:::
Finally, do some informal measurements of performance. You can use either the `time` function from the class notes, the Unix [`time` utility](https://en.wikipedia.org/wiki/Time_(Unix)) on MacOS/Linux, or [Measure-Command](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/measure-command?view=powershell-7.5) on Windows.
For the `time` Mac/Linux utility, you'll want to use the "real" entry, which corresponds to the "wall clock" time of how much time passed. The `time` function from class and `Measure-Command` both report real/wall clock time already.
:::info
Note: for performance testing, do not include the time to print records in your measurements (you can just temporarily comment it out).
:::
:::success
**Written Q1**: Experiment with different options for `k` for the largest file. On your computer, which option for `k` provides the fastest runtime?
**Written Q2**: Fix `k` at the time you found in **Q1**. Measure the runtime for each of the provided test files. Do you expect the trend between runtime and input size to be linear? Why or why not?
**Written Q3**: Explain how you decided where to insert calls to `await`. Justify your choices.
:::
---
## Part 2: Concurrent bank
The file `bank.ml` contains a simple model for a bank with accounts. A `bank` record contains a mutable list of accounts. Each `account` contains a `cell`, which implements a mutable balance for that account.
The provided code is currently **unsafe**: it does not correctly coordinate access to shared resources across multiple spawned domains.
Initially, the bank only contains a single account. The function `stress_test` runs multiple `deposit` and `withdrawl` calls on the single account.
Your first task is to demonstrate that the current implementation is broken.
The key issue is that the `deposit` and `withdraw` operations are not *atomic*. Clients expect that a `deposit` occurs as a single atomic event that---*all at once*---reads the current balance, adds to it, and updates it. However, the `UnsafeAccount` implements `deposit`/`withdraw` using distinct `cell_get` and `cell_put` operations on the underlying `cell`. If multiple client threads are interacting with the same account, the underlying `cell` operations by one client threads might happen in between the underlying `cell` of another client thread.
Modify the arguments to the `stress_test` function to show this issue resulting in an incorrect balance calculation.
:::info
Run the code from the command line by running:
```
dune exec ./bank.exe
```
:::
:::success
**Written Q4**: In your README, describe the arguments you changed and why the current implementation produces an incorrect balance calculation.
Provide a diagram of which lines of code could execute over time to cause a problem (with time flowing downward). For example, a start might be:
```
Domain 1 Domain 2
line 31: cell_get
line 31: cell_get
line 32: +
```
:::
Next, add logic to **safely** allow multiple domains/threads to withdraw or deposit from an account. You may edit the record definitions.
:::success
Edit the `account` record and the code that uses it such that there is still concurrency, but no incorrect balance risk.
:::
Finally, adapt both the `stress_test` function and the bank record to have multiple domains/threads safely add and remove account.
:::success
Edit the `bank` record and the code that uses it to allow multiple threads to add and delete accounts. Be sure to allow as much logic as possible to run in parallel.
**Written Q5**: Using any online resource(s) of your choosing, look up the difference between "course-grained" and "fine-grained" locking. Which is your code using here?
:::
You do not need to write tests for this part beyond editing the `stress_test` function.
---
## Part 3: Language constructs
:::info
Provide at least a few sentences for each question. Cite every source you referenced below each answer in your README.
:::
Using any online resource(s) of your choosing, read about the `async`/`await` constructs in Javascript.
:::success
**Written Q6**: How are `async`/`await` in Javascript similar to the constructs with the same name in OCaml?
**Written Q7**: How are `async`/`await` in Javascript different than the constructs with the same name in OCaml?
:::
Using any online resource(s) of your choosing, read about the `synchronized` keyword in Java.
:::success
**Written Q8**: Explain the Java `synchronized` keyword in your own words. Relate it to the constructs we learned in class. Do you consider it higher-level or lower-level than the OCaml constructs we used?
:::
# Submission
Submit the following **5 files** on **Gradescope**:
:::success
- `sort.ml`
- `sortfromfile.ml`
- `sorttest.ml`
- `bank.ml`
- `README` as either a `txt`, `md`, or `pdf` file, with **Written Questions** and
- Roughly how long did this assignment take you (in hours)?
- Which was the most challenging part of this assignment?
- Which resources did you find to be the most useful?
:::