## Deliverables
Report on the current compiler stack considered as a holistic product (Juvix-VampIR-Taiga), considering which parts are strong, which parts are weak, where the open questions are, and what we need to prioritise in the immediate future across the whole stack.
## People
Joshua (lead), Jan, Lukasz
## Links
Matching Experiment doc: https://hackmd.io/@heliax/ByxPTlQOh
## Report Outline
1. Strengths
2. Weaknesses
3. Open questions
4. Priorities
## Notes 2023-06-26
- Disconnection from Taiga product
- How to evaluate interface with Taiga
- Coordinate with Taiga
- Coordinate with Team 5
- What is the interface between Taiga and the vamp-ir/the compiler stack?
- How does GEB come into the picture?
- Current implementation of Juvix blows up superlinearly with recursion
- GEB should address this to some extent
- Not sure when GEB will be ready
- Virtual machine
- Formalize compiler stack as virtual machine that encodes functions as field elements
- Interpreter understands field elements as functions
- Is it possible to write an interpreter using linear recursion? Yes
- (Is this relevant for the near future?)
- Limitations
- GEB has missing features
- Not using GEB allows us to escape limitations from missing features
- However we have a blow-up problem
- Next meeting should be with Taiga people
## Notes Taiga <-> CSaaP
- Interfaces with:
- VampIR
- From Taiga POV there is nothing but VampIR
- Taiga uses VampIR as a library
- Taiga produces proofs by calling VampIR
- Ongoing work on serialization
- Taiga is instantiated in multiple roles, e.g. wallet, solver, executor node
- Proving happens by the "wallet" instantiation of Taiga
- Fields from notes will be represented in a vampir circuit
- Not totally specced but finalizing now
- L: how to encode more complex datatypes?
- V: Juvix can handle encoding; those not using Juvix must implement their own method
- V: most fields do not need special encoding
- V: those that do need an encoding:
- update static, update dynamic
- additional inputs
- strings, decimals/floats, etc
- L: how independently of Juvix can encoding be done?
- V: depends if we push for Taiga or for Juvix
- V: If for Taiga, we would make things easier to use without Juvix, and vice versa
- A: Can Taiga interface with Juvix as well as VampIR?
- V: Nothing stops this
- V: VampIR is a way of expressing VPs for Taiga
- L: VampIR is close to the circuit model, so perhaps not the best for representing transparent VPs?
- V: for transparent exec, the VP doesn't need to be in VampIR
- V: not sure how it will look right now
- A: transparent transactions are just like private but everyone has the viewing key for them
- V: transparent transactions executed in the final stage with additional data
- Jan: In order for people to try Taiga, when will we be able to interact with it?
- A: There are tests available now, e.g. read a file written in VampIR and create a proof
- L: we don't know how to run Taiga. There is no simple binary to download and run, as there is for VampIR and GEB
- V: Taiga can run on Anoma testnet
- V: Soon we will be able to try VPs in Taiga
- P: working on a server instantiation of Taiga that accepts JSON
- P: end-to-end Taiga tests are forthcoming
- Jan: why not a CLI?
- P: the server was a requirement of the concern description
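The encoding question raised in these notes (strings, decimals, and similar values that must become circuit fields) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the modulus `P`, the function names, the byte-packing scheme, and the fixed-point scale are hypothetical and are not taken from Taiga, Juvix, or VampIR.

```python
from decimal import Decimal

# Illustrative prime modulus only; NOT the field used by VampIR or Taiga.
P = 2**255 - 19

def encode_string(s: str) -> int:
    """Pack a short UTF-8 string into one field element (hypothetical scheme)."""
    data = s.encode("utf-8")
    # The leading 0x01 byte keeps leading zero bytes and the length recoverable.
    n = int.from_bytes(b"\x01" + data, "big")
    if n >= P:
        raise ValueError("string too long for a single field element")
    return n

def decode_string(n: int) -> str:
    """Inverse of encode_string."""
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return raw[1:].decode("utf-8")

def encode_decimal(text: str, scale: int = 6) -> int:
    """Fixed-point encoding: value * 10**scale, with negatives wrapped mod P."""
    scaled = Decimal(text).scaleb(scale)
    if scaled != scaled.to_integral_value():
        raise ValueError("more fractional digits than the chosen scale allows")
    return int(scaled) % P

print(decode_string(encode_string("hello")))
print(encode_decimal("1.5"))
```

A scheme like this would have to be agreed on by every party that reads the field, which is the point of the question about how independently of Juvix the encoding can be done.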
# Juvix -> Geb
The Juvix to Geb pipeline consists of the following steps:
* [Lifting](https://github.com/anoma/juvix/pull/1579) `let-rec` expressions into top-level functions
* GEB optimization phase:
- [let-folding](https://github.com/anoma/juvix/issues/1899),
- [argument specialization](https://github.com/anoma/juvix/pull/2164),
- [inlining](https://github.com/anoma/juvix/pull/2036),
- [case folding](https://github.com/anoma/juvix/pull/2229)
* [Recursion unrolling](https://github.com/anoma/juvix/pull/1912)
* Type inference
* Direct translation to GEB's STLC frontend
Currently, only Juvix programs which don't use unbounded data types or arithmetic can be compiled to VampIR through GEB. The required features are still missing in GEB.
GEB 0.3.2 introduces the following changes.
* The STLC frontend no longer requires full type information in terms. The syntax of the terms changed.
* An error node has been introduced, which allows Juvix `fail` nodes to be compiled.
The following features required for compilation from Juvix are still missing in GEB.
* Modular arithmetic types ([GEB issue #61](https://github.com/anoma/geb/issues/61)).
* Functor/algebra iteration to implement bounded inductive types ([GEB issue #62](https://github.com/anoma/geb/issues/62)).
# Juvix -> Vamp-ir
The current direct Juvix to VampIR pipeline (without going through GEB) consists of the following steps:
* [Lifting](https://github.com/anoma/juvix/pull/1579) `let-rec` expressions into top-level functions
* VampIR optimization phase:
- [let-folding](https://github.com/anoma/juvix/issues/1899),
- [lifting calls out of cases](https://github.com/anoma/juvix/issues/2200),
- simplification of if-expressions
* [Recursion unrolling](https://github.com/anoma/juvix/pull/1912)
* [Normalization](https://github.com/anoma/juvix/pull/2038)
* [Let hoisting](https://github.com/anoma/juvix/issues/2033)
* Direct translation to VampIR
The compilation-by-normalization approach suffers from the branching problem, where recursive functions with at least two recursive calls in the body cause an exponential blow-up and thus fail to compile. This is mitigated to some extent by the optimization of lifting calls out of cases, but the problem remains in general.
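The size of the blow-up can be estimated with a simplified model. The sketch below is not the Juvix compiler's algorithm; it only counts how many copies of a function body appear once recursion is unrolled to a fixed depth and every recursive call is inlined. The function name and the chosen depths are illustrative.

```python
def unrolled_body_copies(recursive_calls: int, depth: int) -> int:
    """Copies of the body after unrolling `depth` levels of a function
    whose body contains `recursive_calls` recursive calls."""
    if depth == 0:
        return 0  # unrolling budget exhausted: the call becomes a failure leaf
    # One copy of the body plus one fully unrolled subterm per recursive call.
    return 1 + recursive_calls * unrolled_body_copies(recursive_calls, depth - 1)

for depth in (5, 10, 20):
    linear = unrolled_body_copies(1, depth)     # one recursive call in the body
    branching = unrolled_body_copies(2, depth)  # two recursive calls in the body
    print(depth, linear, branching)
```

With one recursive call the count grows as `depth`; with two it grows as `2**depth - 1`. Because a circuit has to contain every branch of a case expression, this holds even when only one of the two calls would run at execution time.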
An [experimental VM-based compilation to VampIR](https://github.com/anoma/juvix/pull/2241) might solve the branching problem, but in general might not be efficient enough in practice.
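One way to picture why a VM-style encoding could sidestep the branching problem is a fixed transition function applied a bounded number of times. The sketch below is a hypothetical illustration, not the design of the linked PR: the modulus `P`, the `State` layout, and the example program (exponentiation by squaring, whose recursive definition has a recursive call in each of two branches) are all assumptions.

```python
from typing import NamedTuple

P = 2**61 - 1  # illustrative prime modulus, not the field used by VampIR

class State(NamedTuple):
    base: int
    exp: int
    acc: int

def step(s: State) -> State:
    """One VM transition. Each branch is cheap, so evaluating all branches
    and selecting one (as a circuit must) costs a constant per step."""
    if s.exp == 0:
        return s  # halted: further steps are no-ops
    if s.exp % 2 == 0:
        return State(s.base * s.base % P, s.exp // 2, s.acc)
    return State(s.base, s.exp - 1, s.acc * s.base % P)

def run(base: int, exp: int, max_steps: int) -> int:
    """Apply `step` exactly `max_steps` times, like a fully unrolled circuit."""
    s = State(base % P, exp, 1)
    for _ in range(max_steps):
        s = step(s)
    if s.exp != 0:
        raise ValueError("step bound too small for this input")
    return s.acc

print(run(3, 13, 16))
```

In this model the total size is `max_steps` times the cost of one `step`, so it grows linearly with the step bound. The trade-off is the constant overhead of every step, which matches the concern that the approach might not be efficient enough in practice.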
More details about the direct Juvix to VampIR pipeline can be found in [Lukasz's report](https://heliax.slack.com/files/U03MUDTCR3L/F055JR1B0D6/vampir-pipeline-report.pdf).
## Report Draft
Our compiler stack consists of Juvix, GEB, Vamp-IR, and Taiga. Juvix can compile to GEB and then to Vamp-IR or directly to Vamp-IR. Taiga can then call Vamp-IR functions to create or verify proofs as a part of its execution model.
### Status
Currently, we can take a Juvix program and compile it directly into Vamp-IR (bypassing GEB). Then Taiga can call Vamp-IR functions to `compile`, `prove`, or `verify` using the Vamp-IR circuit generated by Juvix. GEB is bypassed currently due to some missing features that are necessary to integrate GEB with Juvix.
### Strengths
This stack's primary strength is the uniformity in approach between the elements of the stack. The functional/category theoretic approach is present in each compiler, making the interface between elements simpler to maintain and modify. As a concrete example, the Vamp-IR compiler allows the inclusion of higher-order "intrinsic functions" like `map` and `fold` which correspond to those functorial operations in GEB and Juvix. We believe this approach is the right one for meeting Anoma's needs.
Another strength of our stack is that the overall design sticks close to the mathematical objects we are trying to use. Juvix describes a simply-typed lambda calculus, Vamp-IR describes arithmetic circuits as systems of polynomials, and GEB describes transformations between STLC and polynomials. While perhaps not the most friendly to developer onboarding, this approach allows quick integration of other tools (like Z3 for analyzing arithmetic circuits in Vamp-IR) and of known PLT and compiler design techniques.
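To make "arithmetic circuits as systems of polynomials" concrete, here is a minimal sketch in plain Python rather than Vamp-IR syntax. The modulus `P`, the wire names, and the example relation `out = x^3 + x + 5` are assumptions chosen for illustration.

```python
P = 2**61 - 1  # illustrative prime modulus, not the field used by Vamp-IR

def constraints(w: dict) -> list:
    """Each entry is a polynomial in the wires that must evaluate to 0 mod P."""
    x, t1, t2, out = w["x"], w["t1"], w["t2"], w["out"]
    return [
        (x * x - t1) % P,         # t1 = x^2
        (t1 * x - t2) % P,        # t2 = x^3
        (t2 + x + 5 - out) % P,   # out = x^3 + x + 5
    ]

def satisfied(w: dict) -> bool:
    """A witness is valid when every polynomial vanishes."""
    return all(c == 0 for c in constraints(w))

print(satisfied({"x": 3, "t1": 9, "t2": 27, "out": 35}))
print(satisfied({"x": 3, "t1": 9, "t2": 27, "out": 36}))
```

Because a circuit in this view is just a list of polynomial equations, handing it to an external solver or analyzer mostly means translating those equations into the tool's input format.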
### Weaknesses
There are a few weaknesses in this compiler stack as it stands today.
1. GEB is not integrated into the stack because of some missing features, namely *modular arithmetic types* and *functor iteration*.
2. There are (mostly minor) changes that need to be made to the interfaces in the Vamp-IR library to accommodate Taiga's needs.
3. The *branching problem* (BP). The BP comes from a "normalization" step in the Juvix -> Vamp-IR compilation sequence. When functions with two or more recursive calls in the body are normalized into a form that Vamp-IR accepts, the resulting circuit contains an *enormous* number of constraints. Even very simple recursive functions of this kind compile to extremely large circuits that fail to compile due to memory constraints.
4. The circuits generated in Vamp-IR are far from optimal.
5. There is no end-to-end test of the entire stack, i.e. from Juvix to Taiga.
### Approaches to Addressing Weaknesses
Most of these weaknesses were already recognized before this report and are actively being addressed. The GEB team continues to iterate rapidly and add features. Taiga team members are communicating with the Vamp-IR team about needed features and changes to the Vamp-IR library interface; these are typically minor changes that are completed quickly.

The branching problem is more challenging to solve. Integrating GEB may have an effect on the BP, but it is unclear at this point how much it will help. Compilation passes performing optimization steps (in the Juvix compiler, the Vamp-IR compiler, or both) may reduce the extent of the blow-up. Additionally, new compilation strategies such as VM-based compilation may avoid the BP. (See Lukasz's experimental VM-based compilation to Vamp-IR here: https://github.com/anoma/juvix/pull/2241.)

The lack of end-to-end testing was identified during the preparation of this report. We are told the Taiga team is already working on VP testing in Taiga. With an appropriate interface we should soon be able to perform an end-to-end test with a VP written in Juvix, compiled through Vamp-IR, and executed by Taiga.
### Next Steps and Priorities
The weaknesses above suggest our next steps and how to prioritize them. The issues can be divided into numerous small ones, which can be solved relatively quickly, and a few larger ones, which require time and care. The issues with interfaces are small, numerous, concrete, straightforward, and not particularly time-consuming to dispatch. They give an immediate pay-off by removing friction in the compiler stack and allowing our teams to test and iterate more quickly. The larger issues (the branching problem, GEB completion) represent major barriers to having a usable compiler stack and are very important, but they require more time and care.

Therefore we recommend that the appropriate teams put the majority of their focus on these larger issues so that continual progress is made on them. However, when smaller interface issues arise, the larger problems can be set aside briefly to dispatch them, since doing so reduces the friction of development in the stack as a whole. Alternatively, a single team member or small subteam can focus on interface issues while the larger team remains focused on the bigger issues. Because the larger weaknesses in the compiler stack are also less concrete, teams working on them should take steps to concretize the problem as much as possible so that progress can be seen.
## Final Report