# Report: Project 2 - Exactly-once Key-Value Store implementation
[TOC]
## Intro
In this project we implement a simple client-server system that provides a key-value store service with exactly-once RPC semantics. A client sends commands (`Get`, `Put`, `Append`) to the server, which executes them on the key-value store and returns the corresponding results. Exactly-once semantics means that each unique command is executed exactly once, regardless of message loss or duplication in the network. We divided the project into three parts: building the `KVStore` application on the server, guaranteeing at-least-once delivery from the client, and enforcing at-most-once execution on the server. Parts 2 and 3 together give exactly-once semantics for the client-server setup.
## Flow of Control & Code Design
### Part 1: Key Value Store Implementation
Here we implement the `KVStore` (Key-Value Store) application that runs on the server. The `KVStore` class uses a `HashMap` from the `java.util` package that maps a `String` key to a `String` value.
```java
private HashMap<String, String> kvstore;
```
When the server invokes the `execute` method, the application accepts a command, performs the corresponding operation on the `HashMap`, and returns an appropriate result. All commands accepted by the application must implement the `SingleKeyCommand` interface.
```java=
public KVStoreResult execute(Command command) {...}
```
All results returned by the application implement the `KVStoreResult` interface. The application accepts the following commands and responds with the listed results.
* `Get` -> `GetResult`, `KeyNotFound`
```java=
// Inside execute
if (command instanceof Get) {
    Get g = (Get) command;
    // Return the stored value, or KeyNotFound if the key is absent
    if (kvstore.containsKey(g.key())) {
        return new GetResult(kvstore.get(g.key()));
    }
    return new KeyNotFound();
}
```
* `Put` -> `PutOk`
```java=
// Inside execute
if (command instanceof Put) {
    Put p = (Put) command;
    // Store the value, overwriting any previous value for the key
    kvstore.put(p.key(), p.value());
    return new PutOk();
}
```
* `Append` -> `AppendResult`
```java=
// Inside execute
if (command instanceof Append) {
    Append a = (Append) command;
    // Create the key if it is absent, otherwise append to the previous value
    if (!kvstore.containsKey(a.key())) {
        kvstore.put(a.key(), a.value());
    } else {
        kvstore.put(a.key(), kvstore.get(a.key()).concat(a.value()));
    }
    return new AppendResult(kvstore.get(a.key()));
}
```
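The three commands above can be tried out without the dslabs classes. The following is only a minimal sketch: `KVStoreSketch` and its methods are hypothetical names, commands are modeled as plain method calls, and `null` stands in for `KeyNotFound`. It uses `Map.merge` as a compact alternative to the `containsKey` check in `Append`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the report's KVStore; not the dslabs implementation.
final class KVStoreSketch {
    private final Map<String, String> kvstore = new HashMap<>();

    // Get: returns the stored value, or null to stand in for KeyNotFound.
    String get(String key) {
        return kvstore.get(key);
    }

    // Put: overwrites any previous value (stands in for PutOk).
    void put(String key, String value) {
        kvstore.put(key, value);
    }

    // Append: concatenates onto the existing value, or creates the key if it is
    // absent, and returns the new value (stands in for AppendResult).
    String append(String key, String value) {
        return kvstore.merge(key, value, String::concat);
    }

    public static void main(String[] args) {
        KVStoreSketch store = new KVStoreSketch();
        store.put("foo", "bar");
        System.out.println(store.append("foo", "baz")); // existing key: values are concatenated
        System.out.println(store.append("new", "x"));   // absent key: behaves like Put
        System.out.println(store.get("missing"));       // absent key: null (KeyNotFound)
    }
}
```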
#### Deep Copy
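To support transferring the application between server groups in the future, a deep copy constructor is also implemented. This constructor calls a custom deep copy method for the `HashMap`. Because Java `String` objects are immutable, copying each entry into a new map is enough to make the copy independent of the original: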
```java=
// copy constructor
public KVStore(KVStore application) {
    kvstore = application.deepCopyReturn();
}

// custom deep copy method
public HashMap<String, String> deepCopyReturn() {
    HashMap<String, String> kvstoreCopy = new HashMap<>();
    for (Map.Entry<String, String> entry : kvstore.entrySet()) {
        kvstoreCopy.put(entry.getKey(), entry.getValue());
    }
    return kvstoreCopy;
}
```
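As a quick check of what the copy guarantees, the sketch below (with the hypothetical helper `DeepCopySketch.copyOf`) copies a map and then modifies the original. It uses the `HashMap` copy constructor, which for immutable `String` keys and values gives the same result as the entry-by-entry loop above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper illustrating the copy semantics; not part of the project code.
final class DeepCopySketch {
    // Copies every entry into a fresh map. Since String is immutable,
    // sharing the String objects between both maps is safe.
    static Map<String, String> copyOf(Map<String, String> original) {
        return new HashMap<>(original);
    }

    public static void main(String[] args) {
        Map<String, String> original = new HashMap<>();
        original.put("k", "v1");
        Map<String, String> copy = copyOf(original);
        original.put("k", "v2");               // later write to the original
        System.out.println(copy.get("k"));     // the copy still holds the earlier value
    }
}
```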
### Part 2: At-Least-Once Messaging from the Client
With a working `KVStore` on the server, we next add at-least-once semantics on the client. At-least-once semantics means that the client must tolerate dropped messages and communication failures between itself and the server. When a request or its reply is lost, the client resends the command so that it is executed on the `KVStore` at least once.
Our implementation is inspired by how the Transmission Control Protocol (TCP) achieves reliable delivery. TCP carries an integer in the TCP header, known as the *sequence number*. If a message with sequence number `A` is lost in the network, the sender retransmits the same packet with sequence number `A` until the receiver acknowledges it (in TCP, the acknowledgment number names the next byte the receiver expects). The acknowledgment tells the sender that the receiver has processed everything up to sequence number `A`. The next packet then carries `A+1` or any other unused sequence number. Senders normally use monotonically increasing sequence numbers so that the order of the packets already sent is unambiguous.
We implemented the same flow in our project. The client communicates with the server as follows:
- The `sendCommand` method is called from the client by the user once the previous result has been returned to the user. The method checks to see if the command object passed implements `SingleKeyCommand`. It then sends out the command to the server with a new sequence number and starts a timeout timer.
```java=
public synchronized void sendCommand(Command command) {
    // Only single-key commands are supported
    if (!(command instanceof SingleKeyCommand)) {
        throw new IllegalArgumentException();
    }
    SingleKeyCommand s = (SingleKeyCommand) command;
    this.command = s;
    reply = null;
    // Send with a fresh sequence number and start the retry timer
    this.send(new Request(s, ++clientSequenceNum), serverAddress);
    this.set(new ClientTimer(s, clientSequenceNum), CLIENT_RETRY_MILLIS);
}
```
- The `onClientTimer` method is called back when the timer set by the client expires. On timeout, the client checks whether it is still waiting for the result of that command; if so, it retransmits the command and sets a new timer.
```java=
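private synchronized void onClientTimer(ClientTimer t) {
    // Retransmit only if this timer belongs to the outstanding request.
    // Comparing sequence numbers (instead of command equality) keeps a stale
    // timer from firing for a later, equal command.
    if (command != null && clientSequenceNum == t.sequenceNum()) {
        this.send(new Request(t.command(), t.sequenceNum()), serverAddress);
        this.set(t, CLIENT_RETRY_MILLIS);
    }
}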
```
- When a `Reply` message is received from the server, the `handleReply` method is called back to process it. It checks whether the client is waiting for an answer from the server and whether the received message carries the sequence number the client expects. If both hold, it stores the result and notifies the thread waiting on it.
```java=
private synchronized void handleReply(Reply m, Address sender) {
    // Accept the reply only if it answers the outstanding request
    if (command != null && clientSequenceNum == m.sequenceNum()) {
        command = null;
        reply = m.result();
        notify();
    }
}
```
- The `getResult` method returns the result; if the result is not yet available, it first waits for a notification from the message handler.
```java=
public synchronized Result getResult() throws InterruptedException {
    // Block until handleReply has stored a result
    while (reply == null) {
        wait();
    }
    return reply;
}
```
In summary, the client keeps retrying until it receives a reply from the server that carries the same sequence number as its request.
The server, for its part, simply copies the sequence number from the client's request into its response.
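The retry loop can be illustrated outside dslabs with a deterministic simulation. This is only a sketch under stated assumptions: `RetrySketch`, `LossyServer`, and `sendUntilAcked` are hypothetical names, the "network" drops a fixed number of requests, and a loop iteration stands in for the `ClientTimer` firing.

```java
import java.util.Optional;

// Hypothetical simulation of at-least-once delivery; not the dslabs client.
final class RetrySketch {
    // Lossy server: drops the first `drops` requests, then echoes the sequence number.
    static final class LossyServer {
        private int remainingDrops;
        int received = 0; // number of requests that actually reached the server

        LossyServer(int drops) {
            this.remainingDrops = drops;
        }

        Optional<Integer> deliver(int sequenceNum) {
            if (remainingDrops > 0) {
                remainingDrops--;
                return Optional.empty(); // request lost in the network
            }
            received++;
            return Optional.of(sequenceNum); // reply carries the request's sequence number
        }
    }

    // Resends the same sequence number until a reply carrying it arrives.
    // Returns the number of send attempts that were needed.
    static int sendUntilAcked(LossyServer server, int sequenceNum) {
        int attempts = 0;
        while (true) {
            attempts++;
            Optional<Integer> reply = server.deliver(sequenceNum);
            if (reply.isPresent() && reply.get() == sequenceNum) {
                return attempts;
            }
            // No matching reply: in the real client the retry timer fires here.
        }
    }

    public static void main(String[] args) {
        LossyServer server = new LossyServer(2);
        System.out.println("attempts: " + sendUntilAcked(server, 1));
    }
}
```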
### Part 3: At-Most-Once Execution on the Server
At this point, we can be assured that a command sent from the client will reach the server at least once. However, we still need to ensure that a given command is executed at most once. Together, the two properties guarantee exactly-once semantics.
For this purpose, we implement an `AMOApplication` wrapper around an object that implements `Application`; the wrapper handles at-most-once execution for any given application. It is designed as a parameterized class:
```java=
public final class AMOApplication<T extends Application>
        implements Application {...}
```
In the case of our system, `T` will be `KVStore`.
It holds two attributes:
1. The application it is wrapping:
```java=
private final T application;
```
2. A `HashMap` from each client address to the most recent result sent to that client:
```java=
private HashMap<Address, AMOResult> recentReplies = new HashMap<>();
```
When the `execute` method of the `AMOApplication` is called, it uses the sequence number of the received command to check whether the command has already been executed. If the command is the most recently executed one for that client, the stored result is returned again; if it is older, `execute` returns `null` and the server sends no reply (the command is dropped). For a new command, it extracts the `Command` from the `AMOCommand`, calls the `execute` method of the wrapped application (`KVStore`), and updates the `recentReplies` `HashMap`. The `KVStore` result is wrapped in an `AMOResult` together with the sequence number before it is returned.
```java=
@Override
public AMOResult execute(Command command) {
    if (!(command instanceof AMOCommand)) {
        throw new IllegalArgumentException();
    }
    AMOCommand amoCommand = (AMOCommand) command;
    // Duplicate or stale command: replay the cached result or drop it
    if (alreadyExecuted(amoCommand)) {
        AMOResult execResult = recentReplies.get(amoCommand.sender());
        if (amoCommand.sequenceNum() == execResult.sequenceNum()) {
            return execResult;
        } else {
            return null; // older command: return null so that no reply is sent
        }
    }
    // New command: execute it once and remember the result for this client
    AMOResult newResult = new AMOResult(
            (KVStoreResult) application.execute(amoCommand.command()),
            amoCommand.sequenceNum());
    recentReplies.put(amoCommand.sender(), newResult);
    return newResult;
}

// Checks whether the command has already been executed for its sender
public boolean alreadyExecuted(Command command) {
    if (!(command instanceof AMOCommand)) {
        throw new IllegalArgumentException();
    }
    AMOCommand amoCommand = (AMOCommand) command;
    // Executed if the last recorded sequence number is at least this one
    if (recentReplies.containsKey(amoCommand.sender())) {
        if (recentReplies.get(amoCommand.sender()).sequenceNum() >= amoCommand.sequenceNum()) {
            return true;
        }
    }
    return false;
}
```
On the server, the `handleRequest` method is called when a request is received from a client. It extracts the command and sequence number from the message and encapsulates them, together with the sender's address, in an `AMOCommand`. It then calls the `execute` method of the `AMOApplication`, which returns an `AMOResult` object. If the result is not `null`, the `Result` and `sequenceNum` are extracted from it and wrapped in a `Reply` that is sent back to the client.
```java=
private void handleRequest(Request m, Address sender) {
    // Tag the command with its sequence number and sender, then execute it
    AMOCommand command = new AMOCommand(m.command(), m.sequenceNum(), sender);
    AMOResult r = app.execute(command);
    // A null result marks a stale request, which gets no reply
    if (r != null) {
        this.send(new Reply(r.result(), r.sequenceNum()), sender);
    }
}
```
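The interplay of duplicate detection and result caching can be exercised in isolation. The sketch below makes several simplifying assumptions: `AmoSketch` and `Cached` are hypothetical names, clients and commands are plain strings, and the wrapped application is any `Function<String, String>`. It follows the same three cases as `execute` above (new, most recent, stale).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, simplified at-most-once wrapper; not the dslabs AMOApplication.
final class AmoSketch {
    // Cached reply: the sequence number and the result produced for it.
    static final class Cached {
        final int sequenceNum;
        final String result;

        Cached(int sequenceNum, String result) {
            this.sequenceNum = sequenceNum;
            this.result = result;
        }
    }

    private final Map<String, Cached> recentReplies = new HashMap<>();
    private final Function<String, String> application;

    AmoSketch(Function<String, String> application) {
        this.application = application;
    }

    // Returns the result for a new or most-recent command, or null for a stale one.
    String execute(String client, int sequenceNum, String command) {
        Cached last = recentReplies.get(client);
        if (last != null && sequenceNum <= last.sequenceNum) {
            // Duplicate of the latest command: replay. Older command: drop.
            return sequenceNum == last.sequenceNum ? last.result : null;
        }
        String result = application.apply(command); // executed exactly once
        recentReplies.put(client, new Cached(sequenceNum, result));
        return result;
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        AmoSketch amo = new AmoSketch(cmd -> {
            log.append(cmd);
            return "done:" + cmd;
        });
        amo.execute("clientA", 1, "x");
        amo.execute("clientA", 1, "x"); // retransmission, not re-executed
        System.out.println("executions recorded: " + log.length());
    }
}
```

Keeping only the latest reply per client is sufficient here because each client has at most one outstanding command at a time.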
## Design Decisions
1. If the client sends the server any command that is not of the type `Get`, `Put`, `Append`, the server will throw an `IllegalArgumentException`.
2. We chose the Java `HashMap` for our `KVStore` implementation because it maps keys to values directly and offers constant-time `get` and `put` operations on average.
3. The deep copy of the `KVStore` is implemented by going through each key and value present in the `HashMap` and copying them into another instance of `HashMap`. This makes sure that we are doing a "deep" copy of the existing `KVStore`. The same approach is applied in the deep-copy implementation of `AMOApplication`, where the `KVStore` is deep-copied by the `KVStore` copy constructor, and the `recentReplies` `HashMap` of `AMOApplication` is copied key by key and value by value into a new `HashMap` instance.
4. The reason why we maintain the client address in the `recentReplies` `HashMap` of `AMOApplication` is to support multi-client connections on the server. The client address lets us store the last sequence number per-client, and then we can send back tailored responses to each client.
5. When the network is unreliable, the client simply resends the message with the same sequence number. If the original request already reached the server, the server has the response ready and does not need to recompute it. This saves processing time on the server and also ensures at-most-once execution of the command on the `KVStore`.
6. We also store the response for read-only `KVStore` commands, such as `Get`. We made this decision after considering the following scenario. Client A requests the value of key `K`. The server reads the value of `K` from the `KVStore` and sends the response, but the response is lost in the network. While client A is resending the command, client B updates the value of key `K`. If we had not stored the response that was initially sent to client A, we would return the updated value when client A's retransmitted request arrives. Many distributed computing algorithms would treat this as a wrong response. Hence, we store the recent replies of read-only commands as well.
7. If the server receives a request with a sequence number lower than the last sequence number received from that client, it discards the request in order to maintain exactly-once semantics. If the server receives the same sequence number again, it resends the reply that it computed when the request first arrived, as described in Part 3 of the Flow of Control section.
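The scenario in design decision 6 can be replayed in a few lines. This is a sketch only: `CachedReadSketch` and its methods are hypothetical names, writes are not deduplicated, and stale sequence numbers are not handled, so that the caching of `Get` replies stays in focus.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of reply caching for read-only commands.
final class CachedReadSketch {
    private final Map<String, String> kvstore = new HashMap<>();
    private final Map<String, Integer> lastSeq = new HashMap<>();   // client -> last sequence number
    private final Map<String, String> lastResult = new HashMap<>(); // client -> last reply

    // Simplified write path: no sequence numbers or duplicate detection.
    void put(String key, String value) {
        kvstore.put(key, value);
    }

    // Get with reply caching: a retransmission (same sequence number) receives
    // the reply computed the first time, even if the key has changed since.
    String get(String client, int sequenceNum, String key) {
        Integer last = lastSeq.get(client);
        if (last != null && last == sequenceNum) {
            return lastResult.get(client);
        }
        String value = kvstore.get(key);
        lastSeq.put(client, sequenceNum);
        lastResult.put(client, value);
        return value;
    }

    public static void main(String[] args) {
        CachedReadSketch server = new CachedReadSketch();
        server.put("K", "old");
        server.get("A", 1, "K");                      // reply to A is lost in the network
        server.put("K", "new");                       // client B updates K in the meantime
        System.out.println(server.get("A", 1, "K"));  // A's retransmission gets the cached reply
        System.out.println(server.get("A", 2, "K"));  // a new request from A sees B's update
    }
}
```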
## Missing Components
As far as we can tell, no components are missing for the intended exactly-once key-value store client-server setup. The project could be enhanced with extra features, e.g., fault tolerance (what happens when the server dies while the client is waiting, or vice versa) or scalability (building replicas). The other projects already outlined in dslabs cover much of this ground, and we will build on the framework in future projects.
## References
1. TCP concepts from [Computer Networking: A Top-Down Approach, 6th Edition, by Kurose and Ross](https://www.pearson.com/us/higher-education/product/Kurose-Computer-Networking-A-Top-Down-Approach-6th-Edition/9780132856201.html)
2. Java Docs - https://docs.oracle.com/javase/8/docs/api/
3. For the knowledge of internals of the dslabs framework - https://github.gatech.edu/CS7210-Fall21/project1-intro/tree/main/handout