# OMR Aztec update 1
An update on the different components of OMR, performance numbers, and suggestions for integration.
**Keys:**
All size values are without any compression/optimisation.
1. Detection key size is around 19.1 MB. The user needs to upload this to the server only once. The user can update their detection key later without having to change their clue keys or rendering existing clues useless. Detection keys only need to be re-uploaded when the OMR parameters are modified.
2. Clue size is 971 bytes. A clue needs to be included with every transaction sent to the server. I think the clue size can be reduced further to 956 bytes, as mentioned in the paper. A back-of-the-envelope calculation only requires 964 bytes, which can be improved upon further by not spending a full 17 bits on values that lie in the range [0, 2^16 + 1).
3. Clue key size is 136 KB. The clue key is used by others to generate clues for transactions pertaining to the user. Users can have multiple unlinkable clue keys without having to update their detection key.
4. Clues and clue keys remain valid as long as we don't need to modify the false positive rate. Changing other OMR parameters won't render clues/clue keys invalid.
5. Message digest size is 114.9 KB. The user retrieves the digest from the server to find their pertaining transactions.
**Performance:**
For a transaction set containing 2^14 transactions of 256 bytes each, it takes 115s on `apple-m1` and 220s on `x86 with AVX512` (both using 8 cores) to generate the message digest for a user.
220s on x86 is more than I expected. If I restrict to just 1 core, it takes around 820s on apple-m1 and 1020s on x86. Increasing to 8 cores should decrease the time by a factor of 8, so ideally the time on x86 should be around 127.5s, but for some reason this isn't the case. I am not sure what I am missing, any suggestions? I am using a `c5n.2xlarge` EC2 instance for the tests, if that helps.
Edit 1: With the caching from Miscellaneous point (1), it now takes 170s on x86.
Edit 2: Time is down to 143s on x86 without caching (i.e. without Miscellaneous point (1)) and 130s with caching. This improvement comes from integrating the HEXL library for fast operations using AVX512. However, note that a single core takes 636s (without caching) versus 143s on 8 cores. With 8 cores performance should improve by a factor of 8, which would put the time at roughly 80s instead of 143s.
I measured the performance of HEXL in a multi-threaded environment and noticed that the improvement isn't proportional to the number of cores in use. I have asked the folks behind the library why that is the case. I suspect there's something missing in the way I am handling the Rust bindings for HEXL (HEXL is only available in C++). I will look into this further.
Edit 3: Related to Edit 2. I had been testing on the wrong machine, which has only 4 physical cores (i.e. m6i.2xlarge). After switching to a machine with 8 physical cores (i.e. m6i.4xlarge), the time on x86 is down to 80s, as expected.
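As a side note on Edit 3: a quick sanity check of physical versus logical core counts before benchmarking would have caught the m6i.2xlarge mix-up earlier. A minimal sketch using the `num_cpus` crate (not part of the OMR code, just a convenient way to query core counts):

```rust
// Minimal sketch: distinguish logical from physical cores before benchmarking.
// Assumes the `num_cpus` crate is added to Cargo.toml (`num_cpus = "1"`).

fn main() {
    let logical = num_cpus::get();           // vCPUs / hyper-threads
    let physical = num_cpus::get_physical(); // real cores, which bound scaling

    println!("logical cores:  {logical}");
    println!("physical cores: {physical}");

    // An m6i.2xlarge reports 8 logical but only 4 physical cores, which is why
    // an "8-thread" run on it scales like a 4-core run.
}
```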
**Performance and false positive rate:**
With the current parameters, if I understand the calculation correctly, the false positive rate is 2^-14. However, for the same parameters, the paper states a false positive rate of 2^-21. I might be missing some necessary term in the calculation. Maybe someone more mathematically inclined can help me with it?
Nonetheless, relaxing the false positive rate (i.e. allowing a higher one) will improve performance and decrease the clue, clue key, and detection key sizes. So, depending on the situation, if we are comfortable with a higher false positive rate we can opt for it.
**User side:**
The WASM bindings for the functions necessary for client-side processing work. Clients should be able to generate secret keys, detection keys, and public keys. However, we still need to test the performance of the bindings on the client side. What performance tests would be useful in the context of Aztec's SDK?
Users need to store 2 things locally: the BFV secret key and the PVW secret key. The BFV secret key is an array of 2048 i64 values. We can compress it further since each value is sampled from a Gaussian distribution with variance 10, but I am not sure of the proper way to achieve that compression. The PVW secret key is an array of 1800 unsigned values in the field mod 65537, so its maximum size is 1800 * 17 bits (see the packing sketch below).
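For the PVW secret key, plain bit-packing already reaches the 1800 * 17-bit bound. A minimal sketch, assuming the values are held as `u32` and packed 17 bits apiece (the function names are mine, not from the codebase):

```rust
// Minimal sketch: pack values of Z_65537 (each <= 2^16, so 17 bits) densely.
// 1800 values * 17 bits = 30600 bits = 3825 bytes, versus 7200 bytes as u32s.

const BITS: usize = 17;

fn pack_pvw_secret(values: &[u32]) -> Vec<u8> {
    let mut out = vec![0u8; (values.len() * BITS + 7) / 8];
    for (i, &v) in values.iter().enumerate() {
        debug_assert!(v <= 65536, "value must lie in Z_65537");
        for b in 0..BITS {
            if (v >> b) & 1 == 1 {
                let bit_pos = i * BITS + b;
                out[bit_pos / 8] |= 1 << (bit_pos % 8);
            }
        }
    }
    out
}

fn unpack_pvw_secret(bytes: &[u8], count: usize) -> Vec<u32> {
    (0..count)
        .map(|i| {
            (0..BITS).fold(0u32, |v, b| {
                let bit_pos = i * BITS + b;
                let bit = (bytes[bit_pos / 8] >> (bit_pos % 8)) & 1;
                v | ((bit as u32) << b)
            })
        })
        .collect()
}
```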
Users generate the detection key and clue key locally. It would be better if users upload the detection key to the server through some mixer to hide their IP. However, I am not sure of the user flow for this. Do you have any suggestions? Or shall we skip this for the prototype and not enforce it on the user?
Another concern related to detection keys: how does the server differentiate between valid and fake users?
One way, as discussed before, is for the user to prove that they belong to the set of subscribed users, and to use the hash of the detection key to identify unique users. However, I am unsure what the best way to generate the membership proof is in this case. What do you think?
Another option is normal authentication, but I don't find it as good as the one above since it requires the server to associate the user with a unique identity. I should note that detection keys, clues, and clue keys are unlinkable: one cannot lead to another. A server that knows whom a clue key belongs to cannot link that identity with the corresponding detection key, and anyone who knows the identity of the user a detection key belongs to cannot link it with their clue key or clues.
**Server side:**
The server runs phase 1 over all received clue+transaction combinations (I refer to a clue+transaction combination as a message) in sets at regular intervals, for every detection key (i.e. every user), and stores the result. After phase 1 we have several ways to generate the message digest and handle user queries.
1. Generate the message digest right after phase 1 and store it: the digests in this case correspond to the sets for which they were generated, and will likely be timestamped. For example, the digest `detection_key_hash/10am-11am.bin` is for the user that owns the detection key corresponding to `detection_key_hash`. The digest stored in `10am-11am.bin` contains all messages pertaining to the user within the set of all messages received between 10am and 11am.
The user can request the digest using the detection key hash and a time interval. This is low latency, since the server only needs to send the stored message digest back to the user upon request.
However, since we are pre-processing the digest, we have to fix `k`, the maximum number of messages pertaining to a user within a set. If more than `k` messages pertain to a user within any given set, the user will not be able to extract messages from the corresponding digest. An adversary can take advantage of this by sending several small transactions to a specific user so that the `k` limit is exceeded. The other approaches below overcome this by having the user specify `k` within the request.
One way to counter spam is to have a zk proof accompanying every clue+transaction combination that proves the transaction is above some certain value. This way a spammer cannot spam forever without losing a good amount of value to the victim. But I am not sure how this would stop a highly motivated adversary that wants to block their victim from being notified of any of their transactions.
2. Generate the message digest right after phase 1, but with a generously high value of `k`, and store it: this is like (1) but generates the digest with a high value of `k`. The size of the message digest, and the client processing time, are roughly proportional to `k` (growing in steps of the ciphertext polynomial degree). This means sending a digest generated for a high `k` to the user isn't optimal. Hence, the user requests a digest for some `k'` that is `<= k`, and the server performs a bit of computation on the digest generated for `k` to reduce it to a digest for `k'`. This computation is very cheap, so it shouldn't increase request latency by much. However, this still limits `k` to a maximum value, and since `k` is publicly known, any motivated adversary can block a user from receiving their pertaining transactions.
3. Don't generate the message digest after phase 1; generate it upon user request. This allows the user to specify `k` within the request. Upon receiving the request, the server uses the pre-processed result from phase 1 and generates the message digest with the received `k` for the user. Generating the message digest is cheap and should not increase request latency by much.
Moreover, this allows the user to be flexible about the interval and still receive a single digest. For example, the user can make a request like: `send me pertaining messages from the last 24 hours AND I expect at most around 100 of them`. This is better for users compared to (1) and (2): in (1) and (2), upon requesting pertaining messages from the last 24 hours, the user receives multiple digests, one per interval within those 24 hours, not a single digest.
This approach leaks to the server the number of messages a user expects to receive in a given interval. The user can obscure this information by sending random requests to the server with random `k` values.
Another caveat of this approach is that the user cannot tell in advance whether there has just been a spike in messages sent to them. The user will first request a digest for their expected `k` value. Only after receiving the digest for `k` and failing to extract messages from it will the user learn that they have received more messages than expected. To retrieve all messages they will have to send another request, maybe followed by a few more, with a higher value of `k`. By observing the increasing `k` in subsequent requests from a user, the server can learn that the user has received at least `k` messages, or that the user has suddenly become popular. Learning this can be dangerous. For example, Ukraine donation addresses would have seen a sudden spike in transactions right after the war broke out. To figure out the right value of `k`, they would be forced to make repeated requests with a higher `k` each time, making it easier for the server to single out their detection keys as belonging to Ukraine donation addresses.
Users can hide behind a mixer when sending subsequent requests to prevent queries being linked via IP address, but the server can still link the requests using the detection key hash. Approach (4) overcomes this by using two servers.
4. Use two servers: the first to determine the correct value of `k` and the second to retrieve the digest. The first server runs phase 1 (i.e. only detection) and generates a light digest for the user. The user requests the light digest from it to determine the correct value of `k`. The second server, like (3), generates the message digest upon receiving `k` within the user's request. For example, if a user wants to retrieve all pertaining messages from the last 24 hours, they first send a request to server 1, asking for the total count of pertaining messages as a light digest. After learning the total message count, they request the message digest for a suitable `k` from server 2.
The light digest from server 1 should be quite a bit smaller than the message digest from server 2 (with current parameters it is around 14.1 KB).
The two servers do not need to be trusted, since the user uploads two different detection keys. Detection keys are unlinkable, which means the requests to the two servers cannot be correlated even if the servers collude.
We could stick to a single server for both requests, which would save us from computing phase 1 twice, but that single server would then be able to correlate the two requests.
I categorise (3) and (4) as stream-based approaches, since they generate the message digest upon user request, as opposed to (1) and (2). (3) and (4) have higher storage requirements than (1) and (2), since the former have to store the pertinency ciphertext for each message while the latter only need to store the message digests. However, I am more inclined towards (4) due to its stronger privacy guarantees for users. Assuming storage is cheap, generating the message digest on the fly from the pre-processed phase 1 result shouldn't add a lot of request latency (see the request/response sketch below). What do you think?
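To make the stream-based flow concrete, here is a rough sketch of the request/response shapes for (3) and (4). All type and field names are hypothetical, not an actual API:

```rust
// Hypothetical request/response shapes for the stream-based approaches.
// None of these names come from the OMR codebase.

/// Request to the digest server (server 2 in approach (4), or the single
/// server in approach (3)).
struct DigestRequest {
    /// Hash identifying the detection key uploaded earlier.
    detection_key_hash: [u8; 32],
    /// Interval of interest, as unix timestamps.
    from: u64,
    to: u64,
    /// Maximum number of pertaining messages the user expects (`k`).
    k: u32,
}

/// Request to server 1 in approach (4): only asks for the pertaining
/// message count over the interval.
struct LightDigestRequest {
    detection_key_hash: [u8; 32],
    from: u64,
    to: u64,
}

/// Responses are opaque ciphertext blobs that the client decrypts locally.
struct LightDigest(Vec<u8>);   // ~14.1 KB with current parameters
struct MessageDigest(Vec<u8>); // size given by the formula under "Message digest"
```

In (4) the client would first send a `LightDigestRequest` to server 1, decrypt the count, pick `k`, and then send a `DigestRequest` to server 2.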
What other performance numbers or data should I collect to help us decide what's right for Aztec?
On integration with the rollup server: I think users should send the clue+transaction to the rollup. If this is the case, the OMR server will need some way to read the received clue+transaction.
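One possibility, purely as a sketch and not an agreed-upon format, is a simple length-prefixed framing that the rollup can forward as-is and the OMR server can split back into clue and transaction:

```rust
// Hypothetical length-prefixed framing for clue+transaction; not an agreed format.

fn encode(clue: &[u8], tx: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + clue.len() + tx.len());
    out.extend_from_slice(&(clue.len() as u32).to_le_bytes()); // clue length prefix
    out.extend_from_slice(clue);                               // clue (~971 bytes)
    out.extend_from_slice(tx);                                 // transaction payload
    out
}

fn decode(bytes: &[u8]) -> Option<(&[u8], &[u8])> {
    let len = u32::from_le_bytes(bytes.get(..4)?.try_into().ok()?) as usize;
    let clue = bytes.get(4..4 + len)?;
    let tx = bytes.get(4 + len..)?;
    Some((clue, tx))
}
```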
**Clue key distribution**
Clue key size is 136 KB. The sender must obtain the recipient's clue key before sending the transaction. Is it possible to include clue keys as an extension to the existing address format? Or are individuals responsible for making their clue keys public, for example via a public endpoint from which anyone can retrieve their clue key?
**Message digest**
| Max. msg count \ Tx set size | <= 2^15 | 2^16 | 2^17 |
| -------- | -------- | -------- | -------- |
| 10 | 43 | 57.4 | 86.8 |
| 50 | 114.9 | 129.2 | 158 |
| 100 | 201 | 215.4 | 244.2 |
| 200 | 373 | 387.8 | 416.5 |
1. All digest sizes are in KB.
2. The first column is the max. number of messages within a digest (increasing from top to bottom).
3. The first row is the transaction set size (increasing from left to right).
Formula:
size_in_bytes = `(ceil((128 * m * 2) / ct_degree) + ceil(set_size / 2^15)) * 14364`
where,
m: max. no. of messages in digest
set_size: transactions set size
ct_degree: 2048 (for parameters in use)
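For convenience, the same formula as a small Rust helper. The constant 14364 is taken directly from the formula above (presumably the per-ciphertext size in bytes for the current parameters):

```rust
// Minimal sketch of the digest-size formula above; reproduces the table values.

/// Digest size in bytes for `m` max messages and `set_size` transactions,
/// with ciphertext degree 2048 (current parameters).
fn digest_size_bytes(m: u64, set_size: u64) -> u64 {
    const CT_DEGREE: u64 = 2048;
    const CT_BYTES: u64 = 14364; // per-ciphertext constant from the formula above
    let a = (128 * m * 2 + CT_DEGREE - 1) / CT_DEGREE; // ceil((128 * m * 2) / ct_degree)
    let b = (set_size + (1 << 15) - 1) >> 15;          // ceil(set_size / 2^15)
    (a + b) * CT_BYTES
}

fn main() {
    // m = 50, set_size = 2^15  =>  114912 bytes ≈ 114.9 KB, matching the table.
    println!("{}", digest_size_bytes(50, 1 << 15));
}
```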
**Miscellaneous**
1. Caching rotations of the PVW secret key ciphertexts, to avoid performing rotations inside the `decrypt_pvw` fn, improves the performance of `decrypt_pvw` by roughly 20x (20s -> 950ms).
Since we are trading memory for performance, this might not be practical for bigger ciphertext sizes. In that case, we can use a combination of caching and rotations to minimise computation at runtime, as in the sketch below.
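A minimal sketch of that caching + rotations combination, with `Ct` and `rotate` standing in for the real ciphertext type and rotation routine (so this shows the shape of the idea, not the actual `decrypt_pvw` code): cache every `stride`-th rotation up front and derive the rest at runtime.

```rust
// Minimal sketch: cache every `stride`-th rotation of the secret-key ciphertext
// up front, then derive any requested rotation from the nearest cached one.
// `Ct` and `rotate` are placeholders for the real ciphertext type and rotation.

use std::collections::HashMap;

fn build_rotation_cache<Ct: Clone>(
    base: &Ct,
    max_rotation: usize,
    stride: usize,
    rotate: impl Fn(&Ct, usize) -> Ct,
) -> HashMap<usize, Ct> {
    (0..=max_rotation)
        .step_by(stride)
        .map(|r| (r, rotate(base, r)))
        .collect()
}

fn get_rotation<Ct: Clone>(
    cache: &HashMap<usize, Ct>,
    stride: usize,
    wanted: usize,
    rotate: impl Fn(&Ct, usize) -> Ct,
) -> Ct {
    let nearest = (wanted / stride) * stride; // nearest cached rotation <= wanted
    let cached = &cache[&nearest];
    if wanted == nearest {
        cached.clone()                        // cache hit: no runtime rotation
    } else {
        rotate(cached, wanted - nearest)      // rotate by the remaining offset only
    }
}
```

`stride = 1` recovers full caching (Miscellaneous point (1)); larger strides trade runtime rotations for memory.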