# Research Proposal for Oblivious Message Retrieval ## Applicant background The main proposer is [Zeyu (Thomas) Liu](https://zeyuthomasliu.github.io/), a masters' student and Graduate Research Assistant at Columbia University, who is applying for a scholarship to support his ongoing academic research on this problem and his collaborations for bringing it to practical integration. Thomas’ research interests broadly lie in the field of cryptography, including lattice-based cryptography and blockchain-based protocols, as exemplified by his recent works on fully homomorphic encryption schemes and their applications. The proposal is submitted jointly with [Eran Tromer](https://tromer.org/), who will collaborate with Thomas on the academic research and will advise on execution and integration, but is not requesting support. Eran is an Associate Research Scientist at Columbia University, a coauthor of the [Zerocash paper](http://zerocash-project.org/), a founding scientist of the Electric Coin Company (he has [donated](https://electriccoin.co/blog/ecc-owners-approve-donation-to-bootstrap-project/) all of his interest in it), and formerly an advisor to the Zcash Foundation. ## Description of Problem or Opportunity - Efficient private node detection/retrieval has long been a problem on Zcash light clients. - The recipient wants to know and retrieve the transactions that are addressed to them. - If the recipient has only limited computation power (e.g., on mobile chips), the recipient may want to outsource this detection/retrieval process to some server, while maintaining privacy (i.e., the server should not know which transactions are for the recipient) - When the server is sending back the result of detection/retrieval as a digest, ideally, we want the digest to contain only relevant information (e.g., for retrieval, only all the transactions addressed to that recipient) - Currently, Lightwalletd is sending part of each node back to the recipient and letting the recipient use trial decryption to identify which transactions are addressed to them [ZIP-307](https://zips.z.cash/zip-0307) - Cost is 116 bytes per node - For the recipient, the computation cost is hundreds of microseconds per transaction (on a Compute-optimized GCP instance) - It is for detection only, and, therefore, the retrieval process is still not done privately - Importance evidence: - Zcash developers [[Hor20]](https://defuse.ca/downloads/Fixing%20Privacy%20Problems%20in%20the%20Zcash%20Light%20Wallet%20Protocol.pdf) deem it an “action item” that the “lightwalletd server” learns which transactions belong to the wallet” ## Proposed Solution - Oblivious Message Retrieval (OMR) - A protocol, developed by Eran and Thomas, that can allow the recipient to outsource all the work to the server, while remaining fully private - [Detailed paper](https://eprint.iacr.org/2021/1256.pdf) - [Preliminary demo implementation](https://github.com/ZeyuThomasLiu/ObliviousMessageRetrieval) - Potentially, all the Zcash users can use this protocol to privately retrieve their transactions - Integrating OMR with Nighthawk Wallet - Providing support for building a Zcash SDK integration with OMR library - Providing server side libraries for integration with lightwalletd - All the users of the Nighthawk wallet will be able to use OMR to privately retrieve their transactions ## Solution Format - C++ based libraries (published under an open-source license, tentatively MIT) - OMR library - Library of improved OMR schemes (e.g., group OMR) - Demos of integration with Zcash transactions - Academic papers ## Technical approach ![image info](https://raw.githubusercontent.com/ZeyuThomasLiu/ObliviousMessageRetrieval/master/systemModel.jpg) ### OMR research: - Current Version: - The figure above shows the high-level picture of our protocols. - The recipient holds a detection key and a clue key, both public. The sender generates a clue using the clue key and attaches the clue with the transaction. - The server (detector) uses the detection key to compress the transactions into a small digest and send them to the recipient. - Full protocol in: https://eprint.iacr.org/2021/1256.pdf - Implement the open-sourced OMR library using C++, and make the library easy to use for developers - Group OMR - Group version of OMR, serving for an ad-hoc group of recipients instead of processing each recipient’s clue separately, thereby improving efficiency - We also use the group OMR to improve the efficiency of regular OMR (Expecting ~3-5x faster on the server’s side, but 2~3x larger for the digest size) - Implement the group OMR schemes - Other improved versions of OMR - E.g., hardware accelerated - Implement the improved version OMR based on OMR library - Stronger privacy guarantee - E.g., proving strong snake-eye conjecture generalization to PVW encryption system (see details in section 8.3 in [our paper](https://eprint.iacr.org/2021/1256.pdf) for the proofs we have now) - E.g., stronger security parameters with practical costs (e.g., 192-bit security and 256-bit security) ### Prototype integration with Nighthawk wallet (on ongoing basis): - Working with the Nighthawk team to integrate OMR with their Android codebase, first as a simple stand-alone proof-of-concept, and building towards end-to-end integration with the Nighthawk Wallet & lightwalletd. - Note that the scope of this grant does not include the Nighthawk side of the changes, but rather, supporting that integration by adapting the library API, software implementation, and cryptographic scheme to the requirements identified by Nighthawk. **Additional integration considerations are discussed in detail in Section 11 of [our paper](https://eprint.iacr.org/2021/1256.pdf)** ## How big of a problem would it be to not solve this problem? The node detection/retrieval will remain non-private, susceptible to surveillance, and computation/bandwidth-heavy for the light clients. ## Execution risks - Since the current OMR scheme is already developed and demoed (paper and experimental results available), the risk for the first several steps (including OMR library and stand-alone offline demo of integration) is relatively small. - The group OMR is also researched and discussed by Eran and Thomas from a theoretical perspective, so the risk of having a workable group OMR scheme and implementing the prototype would also be relatively small. - The risk comes from the next several steps including testing the group OMR library and further improving OMR and group OMR (which is an open-ended research question). - Further experiments and analysis may reveal new performance bottlenecks - Integration into the Zcash blockchain transaction format will require suitable encoding conventions, and perhaps incompatible transaction-format changes, that ultimately (for main-net deployment) need to be specified and accepted as ZIPs. This grant includes collaborating with the Nighthawk team, and any other interested parties, on designing and implementing these changes on the OMR library side. ## Unintended Consequences - The privacy guarantees of our OMR construction rely on a computational assumption (hardness of the lattice-based Ring Learning With Error problem) which, while well-studied and widely believed, is different from the elliptic-curve computational assumptions currently used in Zcash. If the RingLWE assumption is falsified, then metadata leak may occur from transactions that utilize the OMR mechanism. - If the transaction format is modified, via a suitable ZIP, to include clues in new designated fields (rather than reusing existing ones), then a network upgrade will be required and may be disruptive. ## Evaluation plan Delivery, for each milestone, of the described code (published under an open-source license, tentatively MIT), or a detailed cryptographic scheme description (presented as an academic paper or preprint). ## Schedule and Milestones **This predicted schedule is informed by the timeline of the OMR project to date, which started on Feb 2021 and took ~9 month for Eran and Thomas to develop and refine the schemes, write a paper, and complete a demo implementation.** - Milestone 1 **[Jan 2022 - Mar 2022, 3 month]**: develop usable OMR library Convert the current benchmark/demo into a developer-friendly C++ library, with API guided by requirements from Nighthawk and other wallet developers - Milestone 2 **[April 2022 - June 2022, 3 month] or [April 2022 - Sept 2022, 6 month, *in parallel* with Milestone 3]**: Stand-alone offline demo of using OMR to retrieve Zcash transactions - Milestone 3 **[July 2022 - Sept 2022, 3 month]**: design “group OMR” constructions that serve for an ad-hoc group of recipients instead of processing each recipient’s clue separately, thereby improving efficiency - Milestone 4 **[Oct 2022, 1 month]**: design an improved scheme that reduces detector computation cost using a “group OMR” construction that combines the clues for multiple recipients - Milestone 5 **[Nov 2022 - Dec 2022, 2 month]**: implement and test the “group OMR” scheme - Milestone 6 **[Jan 2023 - Feb 2023, 1.5 month]**: implement and test the improved OMR based on “group OMR” schemes - Milestone 7 **[Feb 2023 - Mar 2023, 1.5 month]**: stand-alone offline demo of using improved OMR to retrieve Zcash transactions - Milestone 8 **[April 2023 - May 2023, 2 month]**: further improvements to the performance, functionality or security of the scheme and its implementation, based on currently-preliminary ideas and on progress in FHE implementations and techniques. Details TBD as this is focused but open-ended research. Additional goal on an ongoing basis, but not mandatory milestone as a major amount of the work is out of the scope of this fellowship **[July 2022 - Mar 2023, 9 month *in parallel* with milestone 3-7]**: real integration of OMR with Zcash walletAdditional goal on an ongoing basis, but not mandatory milestone as a major amount of the work is out of the scope of this fellowship **[July 2022 - Mar 2023, 9 month *in parallel* with milestone 3-7]**: real integration of OMR with Zcash wallet The deliverable of each stage will be the corresponding code (published under an open-source license, tentatively MIT), or a detailed scheme description (presented as an academic paper or preprint). ## Budget Totally requesting $98k, divided into $90k for scholarship and $8k for initial costs; details are follows: Requesting $90k for academic scholarship for Thomas, to support this activity as part of his graduate studies for one academic year. It is to be paid as a research grant to the university where he will conduct this work — tentatively Columbia University, but depends on Ph.D. admission, which will be ascertained in the forthcoming months. Work will commence immediately, but payment may be deferred until admissions are finalized. This amount is a typical cost of one academic year of Ph.D. at research-focused U.S. universities, including the universities’ mandatory indirect costs to support research facilities and services. Requesting $8k for the initial cost (environmental setups), including GCP instance for 1.5 years + hardware including computer, monitors, and other necessary accessories + necessary overhead (e.g., taxes including income tax and sales tax). Please note that a z-address is included only for the enviromental setup phase (and potentially public donations), but the main intent of this application is for the $90k ZOMG funding to be made directly to the university as a grant to cover Thomas' graduate student position.