# Design Doc - Project 2 CS 161
###### Authors: Pulkit Bhasin, Harsh Gupta
Submissions
> A draft design document, and;
> A draft test proposal, consisting of six proposed test cases you’d like to implement
## Data Structures
### struct User
Description: One `User` struct will be created per individual using the file storage system. The `username`, `password`, `private_key`, and `sign_key` attributes are self-explanatory. The `file_info` attribute contains an entry for every file that the user has access to; each entry holds the username of the author/creator of the file and the original filename. `sym_keys` holds the symmetric keys used to encrypt and compute HMACs for files that the user created.
```
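// User holds per-user state; it is serialized and stored encrypted on the Datastore.
type User struct {
	username    string
	password    string
	private_key userlib.PKEDecKey   // decrypts invitations addressed to this user
	sign_key    userlib.DSSignKey   // signs invitations this user creates
	file_info   map[string][][]byte // Hash(filename) -> author and original filename
	sym_keys    map[string][][]byte // Hash(filename) -> keys for files this user created
}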
```
### struct File
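Description: This struct will be used for representing a file on the file storage system. `contentsHead` points to the first node of the file's contents list, and `contents` points to the last node, where new data is appended.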
```
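// File represents one stored file.
type File struct {
	contentsHead        *ContentsLinkedList // first chunk, where LoadFile starts reading
	contents            *ContentsLinkedList // last chunk, where AppendFile attaches new data
	authenticated_users TreeNode            // sharing tree rooted at the file's author
}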
```
### struct ContentsLinkedList
Description: This struct will be used for representing file contents
```
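// ContentsLinkedList is one node of a file's contents.
type ContentsLinkedList struct {
	content []byte
	next    *ContentsLinkedList // pointer: a struct cannot contain itself by value
}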
```
To manage user authentication and hierarchy, we can create a TreeNode struct.
### struct TreeNode
Description: This tree will represent the users that can access a file. The root of the tree will be the author of the file. Each Node of the Tree will act like a tuple of the User and the filename stored in that user's file namespace, and the children of the Node will be the users a user gives access to.
```
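// TreeNode is one user in a file's sharing tree.
type TreeNode struct {
	username string     // user who has access
	filename string     // name of the file in that user's namespace
	children []TreeNode // users this user has shared the file with
}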
```
### struct Invitation
Description: This struct represents one invitation to a file. The attribute names follow the descriptions given above.
```
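// Invitation carries what a recipient needs to access a shared file.
type Invitation struct {
	filename          string
	to                string
	from              string
	original_filename string
	author            string
	sym_keys          [][]byte // symmetric keys protecting the file
}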
```
### struct UserFileSystem
Description: This struct contains a map that will have key as the encryption of filename and the corresponding value as the encryption of the file object.
```
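// UserFileSystem tracks the files a user has created.
type UserFileSystem struct {
	// Maps the encrypted filename to the encrypted file object. Go map keys
	// cannot be slices, so the encrypted filename is stored as string(bytes).
	files_created map[string][]byte
}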
```
## User Authentication
When `InitUser` is called, we create a `User` struct object (defined above) and store *username* and *password* as strings in the respective attributes of the User struct. The *private_key* attribute is initially left unset, and the *file_info* and *sym_keys* attributes are set to empty maps. On every call to a User method, we will have access to that `User` object and its attributes.
We can now generate the (public key, private key) pair by calling `PKEKeyGen()`. If the err is nil, we obtain a public key and private key for the user. Let the public key be defined as `public_k` and the private key be defined as `private_k`. The public key will be stored in the Keystore by calling
`userlib.KeystoreSet(username + "-public", public_k)`
Next, we store `private_k` in the `private_key` attribute of the user struct we created initially.
We can now generate the (Sign Key, Verify Key) pair by calling `DSKeyGen()`. If the err is nil, we obtain a sign key and verify key for the user. Let the verify key be defined as `verify_k` and the sign key be defined as `sign_k`. The verify key will be stored in the Keystore by calling
`userlib.KeystoreSet(username + "-verify", verify_k)`
Next, we store `sign_k` in the `sign_key` attribute of the user struct we created initially.
Since the username has to be unique for every user by definition and the password is a good source of entropy, we can use the provided password-based key derivation function, `Argon2Key`. We convert the user's **password** to a byte array by calling `[]byte(password)` and pass it as the first argument. We convert the *username* to a byte array by calling `[]byte(username)` and pass it as the second argument, the salt. We pass 16 as the `keyLen` argument so that we get a 128-bit symmetric key from the function. Let this be **SK1**.
SK1 = `Argon2Key([]byte(password), []byte(username), 16)`
Now, we will use `HashKDF` to generate another symmetric key. We use **SK1** as the sourceKey parameter. For the `purpose` argument, we create the string "Encrypt-{username}", where {username} is the username of the current user. This outputs a 64-byte key. We truncate it to 16 bytes, since all cryptographic functions used in this project require 16-byte symmetric keys. Let this be **SK2**.
If err is nil,
SK2 = `truncate(HashKDF(SK1, "Encrypt-{username}"), 16)`
Similarly, we generate another symmetric key using `HashKDF`. We use **SK1** as the sourceKey parameter. For the `purpose` argument, we use the string "HMAC-{username}", where {username} is the username of the current user. We again get a 64-byte key and truncate it to 16 bytes. Let this be **SK3**.
If err is nil,
SK3 = `truncate(HashKDF(SK1, "HMAC-{username}"), 16)`
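The derivation of SK2 and SK3 can be sketched with the standard library alone. This is a minimal sketch, not the project's implementation: `hashKDF` is a hypothetical stand-in that assumes `userlib.HashKDF` behaves like HMAC-SHA512 keyed with a 16-byte source key, and the SK1 value is a placeholder for the `Argon2Key` output.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha512"
	"errors"
	"fmt"
)

// hashKDF is a hypothetical stand-in for userlib.HashKDF (assumption:
// HMAC-SHA512 of the purpose string under a 16-byte source key, 64 bytes out).
func hashKDF(sourceKey []byte, purpose string) ([]byte, error) {
	if len(sourceKey) != 16 {
		return nil, errors.New("source key must be 16 bytes")
	}
	mac := hmac.New(sha512.New, sourceKey)
	mac.Write([]byte(purpose))
	return mac.Sum(nil), nil
}

// deriveUserKeys returns SK2 (encryption) and SK3 (HMAC), each truncated
// to 16 bytes, from the password-derived key SK1.
func deriveUserKeys(sk1 []byte, username string) (sk2, sk3 []byte, err error) {
	enc, err := hashKDF(sk1, "Encrypt-"+username)
	if err != nil {
		return nil, nil, err
	}
	mac, err := hashKDF(sk1, "HMAC-"+username)
	if err != nil {
		return nil, nil, err
	}
	return enc[:16], mac[:16], nil
}

func main() {
	sk1 := []byte("0123456789abcdef") // placeholder for the Argon2Key output
	sk2, sk3, err := deriveUserKeys(sk1, "alice")
	if err != nil {
		panic(err)
	}
	fmt.Printf("SK2=%x\nSK3=%x\n", sk2, sk3)
}
```

Because the two purpose strings differ, SK2 and SK3 are independent of each other even though both come from SK1, and the same password always reproduces the same pair.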
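Now we can create a (key, value) pair to store the user object on the DataStore for persistence and authentication. The Datastore key will be a UUID derived from the user's username. `UUID.FromBytes` (from the `uuid` package, not a Go built-in) accepts exactly 16 bytes, while `Hash` returns 64 bytes, so we pass in the first 16 bytes of the hash of the username. The same truncation applies wherever `UUID.FromBytes(Hash(...))` appears in this document.
key: `UUID.FromBytes(Hash([]byte(username))[:16])`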
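As a rough illustration of how a 16-byte Datastore key can be derived from a username, the sketch below hashes a label with SHA-512 and keeps the first 16 bytes. It uses only the standard library; `datastoreKey` is a hypothetical helper, and the `[16]byte` result stands in for the UUID type.

```go
package main

import (
	"crypto/sha512"
	"fmt"
)

// datastoreKey hashes a label with SHA-512 and keeps the first 16 bytes,
// the length that a UUID requires. The [16]byte stands in for a UUID.
func datastoreKey(label string) [16]byte {
	sum := sha512.Sum512([]byte(label))
	var id [16]byte
	copy(id[:], sum[:16])
	return id
}

func main() {
	fmt.Printf("%x\n", datastoreKey("alice"))
	fmt.Printf("%x\n", datastoreKey("alice-filesystem"))
}
```

Distinct suffixes such as "-filesystem" give each kind of record its own key while staying recomputable from the username alone.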
We generate one random 16-byte value by calling `RandomBytes(16)`. Let this be **IV1**. For the `value` stored on the Datastore, we first convert the `User` struct into a list of bytes using the `encoding/json` package. Let this be defined as `user_object`. Now, we can define the value for the Datastore.
value: `SymEnc(SK2, IV1, user_object)||HMACEval(SK3, SymEnc(SK2, IV1, user_object))`
and store it by calling `userlib.DatastoreSet(key, value)`
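One detail of the serialization step is worth a sketch: `encoding/json` only encodes exported (capitalized) struct fields, so a struct with lowercase field names would serialize to an empty object. The example below is an illustration under that constraint; `userRecord` and its field names are hypothetical and mirror only part of the User struct.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// userRecord mirrors a few fields of the User struct. The field names are
// exported (capitalized) because encoding/json skips unexported fields.
type userRecord struct {
	Username string              `json:"username"`
	FileInfo map[string][][]byte `json:"file_info"`
	SymKeys  map[string][][]byte `json:"sym_keys"`
}

// encodeUser produces the bytes referred to as user_object.
func encodeUser(u userRecord) ([]byte, error) {
	return json.Marshal(u)
}

// decodeUser reverses encodeUser after decryption.
func decodeUser(data []byte) (userRecord, error) {
	var u userRecord
	err := json.Unmarshal(data, &u)
	return u, err
}

func main() {
	u := userRecord{
		Username: "alice",
		FileInfo: map[string][][]byte{},
		SymKeys:  map[string][][]byte{"f1": {[]byte("0123456789abcdef")}},
	}
	userObject, err := encodeUser(u)
	if err != nil {
		panic(err)
	}
	back, err := decodeUser(userObject)
	if err != nil {
		panic(err)
	}
	fmt.Println(back.Username, len(back.SymKeys["f1"][0]))
}
```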
We must also create a `UserFileSystem` struct corresponding to the user. This represents the files this user has created. Since the user has not created any file yet, the struct is empty and holds no sensitive information. We create a struct with an empty `files_created` map, serialize it (let the result be **filesystem_object**), and store it on the Datastore with the following function call:
`userlib.DatastoreSet(UUID.FromBytes(Hash(username + "-filesystem")), filesystem_object)`
When a user logs in and calls `GetUser`, they enter their username and password. First, we compute `UUID.FromBytes(Hash(username))` and call `userlib.DatastoreGet(UUID.FromBytes(Hash(username)))`. If **ok** is returned as false, return an **error**. Otherwise, we have obtained our **value**. Recompute SK1 and then SK2 and SK3 using the process defined earlier. If the password is correct, the same symmetric keys are generated, since the methods are deterministic.
Extract `SymEnc(SK2, IV1, user_object)` from **value**. Let this be `C1`. Extract the stored tag `HMACEval(SK3, SymEnc(SK2, IV1, user_object))` from **value** as well. Compute `HMACEval(SK3, C1)` and compare the two values by passing them to the `HMACEqual` function. If the function returns `False`, either the password is wrong (so SK2 and SK3 differ) or the entry was tampered with; return an **error**. Otherwise, compute `SymDec(SK2, C1)` and decode the result into a `User` struct. If its stored password is not equal to the password entered by the user, return an **error**. Otherwise, authentication has been completed successfully.
Note: We can split **value** easily because the HMAC tag has a fixed, known length (64 bytes) for all users, so the tag is always the last 64 bytes of **value**.
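The store-and-verify flow above is an encrypt-then-MAC construction. The following is a minimal sketch using the standard library, assuming AES-CTR as a stand-in for `SymEnc`/`SymDec` and HMAC-SHA512 as a stand-in for `HMACEval`; `seal` and `open` are hypothetical names, and the random IV is stored in front of the ciphertext.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha512"
	"errors"
	"fmt"
)

const tagLen = sha512.Size // 64-byte HMAC-SHA512 tag

// seal returns IV || AES-CTR ciphertext || HMAC tag (encrypt-then-MAC).
func seal(encKey, macKey, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(encKey)
	if err != nil {
		return nil, err
	}
	out := make([]byte, aes.BlockSize+len(plaintext))
	if _, err := rand.Read(out[:aes.BlockSize]); err != nil {
		return nil, err
	}
	cipher.NewCTR(block, out[:aes.BlockSize]).XORKeyStream(out[aes.BlockSize:], plaintext)
	mac := hmac.New(sha512.New, macKey)
	mac.Write(out)
	return mac.Sum(out), nil
}

// open checks the tag first and decrypts only if it matches.
func open(encKey, macKey, value []byte) ([]byte, error) {
	if len(value) < aes.BlockSize+tagLen {
		return nil, errors.New("value too short")
	}
	body, tag := value[:len(value)-tagLen], value[len(value)-tagLen:]
	mac := hmac.New(sha512.New, macKey)
	mac.Write(body)
	if !hmac.Equal(mac.Sum(nil), tag) {
		return nil, errors.New("integrity check failed")
	}
	block, err := aes.NewCipher(encKey)
	if err != nil {
		return nil, err
	}
	plaintext := make([]byte, len(body)-aes.BlockSize)
	cipher.NewCTR(block, body[:aes.BlockSize]).XORKeyStream(plaintext, body[aes.BlockSize:])
	return plaintext, nil
}

func main() {
	sk2, sk3 := []byte("0123456789abcdef"), []byte("fedcba9876543210")
	value, err := seal(sk2, sk3, []byte(`{"username":"alice"}`))
	if err != nil {
		panic(err)
	}
	value[20] ^= 1 // simulate tampering on the Datastore
	_, err = open(sk2, sk3, value)
	fmt.Println(err)
}
```

Checking the tag before decrypting means a wrong password and a tampered entry fail in the same way, and no unauthenticated bytes are ever decrypted.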
This should ensure that a user can have multiple client instances (e.g. laptop, phone, etc.) running simultaneously, since we ensure persistence. All data relevant to a user is stored on the DataStore, and nothing is cached locally.
## File Storage and Retrieval
Let's first consider what happens when a user calls `StoreFile`. First, we check whether a file of this name exists within the user's namespace. We retrieve the User struct corresponding to this user from the DataStore and see whether an entry for Hash(filename) exists in the `file_info` attribute.
If it does not, we create a new ContentsLinkedList struct with its content attribute equal to the content parameter passed into StoreFile and next = nil. We then create a file struct with **contents** and **contentsHead** both set to this ContentsLinkedList struct, and **authenticated_users** set to a new TreeNode struct with the current user's username and an empty list of children.
Our system then generates four new symmetric keys **FileSK1**, **FileSK2**, **FileSK3**, **FileSK4**, and two IVs **FileIV1** and **FileIV2** by calling `RandomBytes(16)` six times. Then, a new entry in the **files_created** attribute of this user's `UserFileSystem` struct is created with
key = `SymEnc(FileSK1, FileIV1, Hash(filename))||HMACEval(FileSK2, SymEnc(FileSK1, FileIV1, Hash(filename)))`
value = `SymEnc(FileSK3, FileIV2, file_struct)||HMACEval(FileSK4, SymEnc(FileSK3, FileIV2, file_struct))`
The **sym_keys** attribute within the User struct will also be updated to include an entry with:
key = `Hash(filename)`
value = `[FileSK1, FileSK2, FileSK3, FileSK4]`
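A sketch of the key-generation and bookkeeping step, using `crypto/rand` as a stand-in for `RandomBytes`. The helper names are hypothetical, and the map is keyed by the hex encoding of the filename hash because a Go map key cannot be a byte slice.

```go
package main

import (
	"crypto/rand"
	"crypto/sha512"
	"encoding/hex"
	"fmt"
)

// newFileKeys draws four independent 16-byte keys: FileSK1 through FileSK4.
func newFileKeys() ([][]byte, error) {
	keys := make([][]byte, 4)
	for i := range keys {
		keys[i] = make([]byte, 16)
		if _, err := rand.Read(keys[i]); err != nil {
			return nil, err
		}
	}
	return keys, nil
}

// recordKeys stores the keys in symKeys under hex(SHA-512(filename)) and
// returns that map key.
func recordKeys(symKeys map[string][][]byte, filename string, keys [][]byte) string {
	sum := sha512.Sum512([]byte(filename))
	k := hex.EncodeToString(sum[:])
	symKeys[k] = keys
	return k
}

func main() {
	symKeys := map[string][][]byte{}
	keys, err := newFileKeys()
	if err != nil {
		panic(err)
	}
	k := recordKeys(symKeys, "notes.txt", keys)
	fmt.Println(len(symKeys[k]), len(symKeys[k][0]))
}
```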
If an entry for Hash(filename) exists in the `file_info` attribute, we first check whether the user is the creator of the file by checking whether an entry for Hash(filename) exists within the `sym_keys` attribute. If it does, we use those keys to recompute `SymEnc(FileSK1, FileIV1, Hash(filename))||HMACEval(FileSK2, SymEnc(FileSK1, FileIV1, Hash(filename)))` and see whether an entry with this key exists in the `files_created` attribute within the UserFileSystem struct corresponding to the user. If this user is the owner, they can overwrite the file contents with their keys using the method described above.
If they're not the owner, they need to retrieve the value corresponding to key `UUID.FromBytes(PKEEnc(PK1, Hash(original_filename+author+username)))`, where PK1 is the user's public key, which can be retrieved from the KeyStore. The value consists of an encrypted part followed by a digital signature. We check the signature against the encrypted part using `DSVerify`. If it verifies, we decrypt the encrypted part. If the result is empty, the user has no access. If it is not empty, we have received all the keys needed to overwrite the file using the method above.
Next, let's consider what happens when a user calls AppendFile.
First, we check whether a file of this name exists within the user's namespace. We retrieve the User struct corresponding to this user from the DataStore and see whether an entry for Hash(filename) exists in the `file_info` attribute. If it doesn't, we return an error. If it does, we first check whether the user is the creator of the file by checking whether an entry for Hash(filename) exists within the `sym_keys` attribute. If it does, we use those keys to recompute `SymEnc(FileSK1, FileIV1, Hash(filename))||HMACEval(FileSK2, SymEnc(FileSK1, FileIV1, Hash(filename)))` and see whether an entry with this key exists in the `files_created` attribute within the UserFileSystem struct corresponding to the user. If this user is the owner, they can use these keys to append to the file. We retrieve the file struct from the UserFileSystem struct (fetched from the DataStore, then verified, decrypted, and decoded), create a new ContentsLinkedList struct (with content equal to the content parameter and next equal to nil), set `contents.next` to this new struct, and then set the `contents` attribute to the new struct.
If they're not the owner, they need to retrieve the value corresponding to key `UUID.FromBytes(PKEEnc(PK1, Hash(original_filename+author+username)))`, where PK1 is the user's public key, which can be retrieved from the KeyStore. We check the digital signature against the encrypted part of the value using `DSVerify`. If it verifies, we decrypt the encrypted part. If the result is empty, the user has no access. If it is not empty, we have received all the keys needed to decrypt the file, and we can then append using the method defined above.
This is efficient because the cost of linking in a new node depends only on the size of the new contents, not on the size of the existing file.
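The append step can be sketched as a linked list that keeps both a head and a tail pointer. This is an illustration only: `contentsNode` and `fileList` are hypothetical names that mirror ContentsLinkedList and the `contentsHead`/`contents` attributes, and encryption is left out.

```go
package main

import "fmt"

// contentsNode mirrors ContentsLinkedList: one chunk of file contents.
type contentsNode struct {
	content []byte
	next    *contentsNode
}

// fileList mirrors the contentsHead (head) and contents (tail) attributes.
type fileList struct {
	head *contentsNode
	tail *contentsNode
}

// appendContent links a new node after the tail. Existing nodes are never
// read or copied, so the work does not grow with the size of the file.
func (f *fileList) appendContent(content []byte) {
	node := &contentsNode{content: content}
	if f.tail == nil {
		f.head, f.tail = node, node
		return
	}
	f.tail.next = node
	f.tail = node
}

// nodeCount walks the list and counts its nodes.
func (f *fileList) nodeCount() int {
	n := 0
	for cur := f.head; cur != nil; cur = cur.next {
		n++
	}
	return n
}

func main() {
	var f fileList
	f.appendContent([]byte("hello "))
	f.appendContent([]byte("world"))
	fmt.Println(f.nodeCount(), string(f.tail.content))
}
```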
Now, let's consider `LoadFile`.
First, we check whether a file of this name exists within the user's namespace. We retrieve the User struct corresponding to this user from the DataStore and see whether an entry for Hash(filename) exists in the `file_info` attribute. If it doesn't, we return an error. If it does, we first check whether the user is the creator of the file by checking whether an entry for Hash(filename) exists within the `sym_keys` attribute. If it does, we use those keys to recompute `SymEnc(FileSK1, FileIV1, Hash(filename))||HMACEval(FileSK2, SymEnc(FileSK1, FileIV1, Hash(filename)))` and see whether an entry with this key exists in the `files_created` attribute within the UserFileSystem struct corresponding to the user. If this user is the owner, they can decrypt the file struct with their own keys. If they're not the owner, they need to retrieve the value corresponding to key `UUID.FromBytes(PKEEnc(PK1, Hash(original_filename+author+username)))`, where PK1 is the user's public key, which can be retrieved from the KeyStore. We check the digital signature against the encrypted part of the value using `DSVerify`. If it verifies, we decrypt the encrypted part. If the result is empty, the user has no access. If it is not empty, we have received all the keys needed to decrypt the file. In both cases, we decrypt the file struct in the author's file system, combine the `content` attributes of the linked list nodes starting from the `contentsHead` attribute, and return the result.
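The final step of loading, combining the chunks in order, can be sketched as a simple traversal. `contentsNode` is a hypothetical name mirroring ContentsLinkedList, and the list is assumed to be already decrypted.

```go
package main

import (
	"bytes"
	"fmt"
)

// contentsNode mirrors ContentsLinkedList: one chunk of file contents.
type contentsNode struct {
	content []byte
	next    *contentsNode
}

// loadAll walks the list from head and concatenates every chunk in order.
func loadAll(head *contentsNode) []byte {
	var buf bytes.Buffer
	for n := head; n != nil; n = n.next {
		buf.Write(n.content)
	}
	return buf.Bytes()
}

func main() {
	third := &contentsNode{content: []byte("!")}
	second := &contentsNode{content: []byte("world"), next: third}
	head := &contentsNode{content: []byte("hello "), next: second}
	fmt.Println(string(loadAll(head)))
}
```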
## File Sharing and Revocation
If one user wants to share their file with another user, the user calls the `CreateInvitation` method. First, our system checks whether the user has access to the file using the same method described in the previous section. If the user does have access, our system creates an Invitation struct. The first three attributes (filename, to, from) can be filled in directly. For author, original_filename, and sym_keys, the system retrieves the User struct for this user from the DataStore, decodes it, extracts the values from file_info[Hash(filename)], and populates them within the Invitation struct. We then retrieve the recipient's public key from the KeyStore. Let this be defined as PK1. We also extract the user's DS sign key from its User struct. Let this be PK2. After this, a DataStore entry will be created with
key = `UUID.FromBytes(PKEEnc(PK1, Hash(original_filename+author+receiver)))`
value = `PKEEnc(PK1, invitation_struct)||DSSign(PK2, PKEEnc(PK1, invitation_struct))`
The method returns the key.
Within `AcceptInvitation`, we first receive the invitation pointer. Then we try to retrieve the DataStore entry for that key. If none exists, tampering has occurred. If one exists, we retrieve the value and split it into `PKEEnc(PK1, invitation_struct)` and the signature. We retrieve the sender's DS verification key from the KeyStore (let this be PK3) and then call `DSVerify(PK3, PKEEnc(PK1, invitation_struct), signature)`. If it returns no error, no tampering has occurred. Then, we decrypt `PKEEnc(PK1, invitation_struct)` using the recipient's private key (which we can retrieve from the User struct from the DataStore), update the `file_info` attribute of the User struct with the information regarding this file, and store the updated User struct in the DataStore. We must also update `authenticated_users` within the file struct, which we can do because we have all the necessary information.
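The sign-then-verify handling of the invitation value can be sketched as follows. This is a sketch under stated assumptions: Ed25519 from the standard library stands in for `DSSign`/`DSVerify`, the encrypted invitation is treated as opaque bytes, and all function names are hypothetical. The signature has a fixed length, so the value can be split from the end.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"errors"
	"fmt"
)

// newSigningKeys stands in for DSKeyGen (assumption: Ed25519 here).
func newSigningKeys() (ed25519.PublicKey, ed25519.PrivateKey, error) {
	return ed25519.GenerateKey(rand.Reader)
}

// packInvitation returns ciphertext || signature(ciphertext).
func packInvitation(signKey ed25519.PrivateKey, ciphertext []byte) []byte {
	sig := ed25519.Sign(signKey, ciphertext)
	return append(append([]byte{}, ciphertext...), sig...)
}

// unpackInvitation verifies the signature before returning the ciphertext.
func unpackInvitation(verifyKey ed25519.PublicKey, value []byte) ([]byte, error) {
	if len(value) < ed25519.SignatureSize {
		return nil, errors.New("value too short")
	}
	split := len(value) - ed25519.SignatureSize
	ciphertext, sig := value[:split], value[split:]
	if !ed25519.Verify(verifyKey, ciphertext, sig) {
		return nil, errors.New("signature check failed")
	}
	return ciphertext, nil
}

func main() {
	verifyKey, signKey, err := newSigningKeys()
	if err != nil {
		panic(err)
	}
	value := packInvitation(signKey, []byte("encrypted invitation bytes"))
	ct, err := unpackInvitation(verifyKey, value)
	fmt.Println(string(ct), err)
}
```

An empty ciphertext still carries a valid signature, which is what lets the recipient distinguish a deliberate revocation by the owner from tampering.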
After revocation, the owner sets the DataStore entry for key `UUID.FromBytes(PKEEnc(PK1, Hash(original_filename+author+receiver)))` to `PKEEnc(PK1, [])||DSSign(PK2, PKEEnc(PK1, []))`, where PK1 is the public key of the revoked user and PK2 is the DS sign key of the owner. It then does the same for every descendant of the revoked user, generates new keys, and updates the entries of all users who should still have access as well as the sym_keys in its own User struct. This is done using the `authenticated_users` tree in the file struct in the filesystem struct. The tree is also updated to prune the branches that no longer exist. Since revoked users won't have access to the new keys, they won't be able to read or modify the file.
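Pruning the sharing tree and collecting everyone who loses access can be sketched as below. The type and method names are hypothetical; the sketch uses pointers for children so that a subtree can be detached in place, and it leaves out the key rotation and Datastore updates.

```go
package main

import "fmt"

// treeNode mirrors TreeNode: one user in a file's sharing tree.
type treeNode struct {
	username string
	filename string
	children []*treeNode
}

// collect appends n's username and those of all its descendants to out.
func (n *treeNode) collect(out []string) []string {
	out = append(out, n.username)
	for _, c := range n.children {
		out = c.collect(out)
	}
	return out
}

// revoke detaches the subtree rooted at username and returns every user in
// it, that is, the revoked user and all of their descendants. It returns
// nil if username is not found below n.
func (n *treeNode) revoke(username string) []string {
	for i, c := range n.children {
		if c.username == username {
			n.children = append(n.children[:i], n.children[i+1:]...)
			return c.collect(nil)
		}
		if lost := c.revoke(username); lost != nil {
			return lost
		}
	}
	return nil
}

func main() {
	carol := &treeNode{username: "carol"}
	bob := &treeNode{username: "bob", children: []*treeNode{carol}}
	dave := &treeNode{username: "dave"}
	owner := &treeNode{username: "alice", children: []*treeNode{bob, dave}}
	fmt.Println(owner.revoke("bob"), owner.collect(nil))
}
```

The returned list identifies whose invitation entries must be overwritten with the empty payload, and the users still reachable from the root are the ones who need the new keys.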
## Helper Methods
### `truncate(input []byte, len int)`
This function truncates the given list of bytes to its first `len` bytes. It is used to truncate the 64-byte keys returned by `HashKDF` to the 16-byte keys used throughout the project implementation.
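A minimal sketch of the helper follows. The parameter is named `n` rather than `len` to avoid shadowing Go's built-in `len`, and the error return for an out-of-range length is an assumption that is not part of the signature above.

```go
package main

import (
	"errors"
	"fmt"
)

// truncate returns a copy of the first n bytes of input. Returning a copy
// keeps the result independent of the longer source slice.
func truncate(input []byte, n int) ([]byte, error) {
	if n < 0 || n > len(input) {
		return nil, errors.New("truncate: length out of range")
	}
	out := make([]byte, n)
	copy(out, input[:n])
	return out, nil
}

func main() {
	key64 := make([]byte, 64) // placeholder for a 64-byte HashKDF output
	key16, err := truncate(key64, 16)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(key16))
}
```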