###### tags: `S-Juvix` `Juvix` `juvix-meeting-notes`
The Why of S-Juvix
==================
Juvix as a language is type theoretic in nature. In particular, it
wishes to leverage dependent type theory to make safe and fast
validity-predicates. In order to properly achieve this goal, it is
important to make a high level language that is fit for the domain. Thus
the following properties become of the utmost importance for Juvix.
1. Correctness
2. Speed
3. Ergonomic Contract Code
Points `1.` and `2.` are mostly covered by the dependent type theory,
with parts of `2.` being an implementation detail of the `QTT` logic
with practical compromises in the IR to allow impure data structures.
[Even point `3.` is somewhat satisfied if we ignore the boilerplate that comes
in trying to use Haskell/Idris style abstractions when it comes to
validity predicates. (See the link as to where this falls apart quite a
bit)](https://linear.app/anoma/issue/JUVIX-689/juvix-macro-general)
However, as we begin to write an implementation with compiling to a type
theoretic `IR`, we have noticed a lot of trouble in development
1. The lack of Core being able to extend itself, leaves the requirement
for a more complex front pass
- There are only a couple of ways we can do this in Haskell
1. Write out concrete data structure between each pass, and
write a small pass between each structure
2. Use `Tagless Final` to make changing a big AST less
cumbersome
3. Use `Trees that Grow` to simulate small changes within an
AST
4. Give up on small changes, and bundle together `n` changes in
one pass.
5. Use Generics to automate the structural changes.
6. Use S-expressions to change the structures
- The issue with `1)` is that it incurs a code cost of the length
of the structure you are working over. This means that for any
unchanged part of the syntax, we must map it to itself recursing
down.
- The issue with `2)` is that it becomes `O(n)` (find link to
oleg\'s FTP site) to do any transformation which requires
matching on arguments. Along with complications of forming the
`observation` type in Haskell, make it a less compelling case
than other options.
- The issue with `3)` is that it only automates forming the
data-type itself, not any pass. Thus to make a transformation,
it takes as much code as `1)`. With `5)` the situation may
improve but we will discuss it when we discuss `5)`.
- The issue with `4)` is that we now have to bundle `n` individual
passes into one macro pass. This greatly simplifies the issues
with the other passes, as we have very fixed layers of data
types that we can scrutinize, however this leaves any proving or
reasoning about the passes themselves much harder.
- The issue with `5)` is that we are leveraging on a system that
splits information evenly at the type level and the term level.
We can demonstrate this phenomenon by using `to` and `from`
directly to see the representation this would take even for some
very basic terms.
```haskell
-- Getting the term from a basic structure
λ> Juvix.Library.from (Other 3)
M1 {unM1 = L1 (L1 (L1 (M1 {unM1 = M1 {unM1 = K1 {unK1 = 3}}})))}
-- Checking the type information which allows us to get back to a
-- concrete type
λ> :t Juvix.Library.from (Other 3)
Juvix.Library.from (Other 3)
:: Num a =>
D1
('MetaData "BinderPlus" "Juvix.Contextify.Binders" "main" 'False)
(((C1
…
λ> :t Juvix.Library.to
Juvix.Library.to :: Generic a => Rep a x -> a
λ> :t Juvix.Library.from
Juvix.Library.from :: Generic a => a -> Rep a x
```
Here we can see that for the constructor `Other`, we have 3 `L1`
structures and two `M1` structures wee have to get rid of before
we can get to the data. Worse if we look at the type level, we
can see that the entire ADT is reserved through this. So if we
wanted to create a pass that got rid of the boilerplate,
translating from one ML data type to another, we can not just
call `to` and `from` as the types are incompatible, even if we
changed the subset of cases we wish to change. Thus we\'d have
to reconstruct the `M1's` and `L1's` to get back a proper typed
representation. This means that we have to compute on the `Rep`
representation of the type during the pass, calling `to` once
all cases have been run, effectively ending up with the same
amount of boilerplate at 1.
One approach that might work is combining this with `3)` to
properly generate these transformations. However this would take
more template Haskell to properly experiment with.
- The issue with `6)` is that we\'ve expressed the passes as a
bona fide macro-expansion, therefore proving that they can
indeed be done within a language. Thus instead of working on
getting the language to a state where it can support such
expressions and centralizing work upon this effort, we put extra
complexity on getting a large frontend language to meetup with
the small core. Thus, infrastructure becomes an issue as we need
proper tools to properly conduct these transformations.
2. There is a large disconnect between Core and an ergonomic frontend
puts extra cost upon the transformation.
- An illustrative point of this is by seeing how many different
kinds of ideas each can expressed. As by the calculation in
`Appendix A` we can see that Core expresses a total of 36
different types of ideas that can be fulfilled, whereas as the
Frontend language exhibits a total of 65 ideas that can be
expressed. Further, the forms of the Frontend language is more
complex, thus forcing us to map them down into a much smaller
core.
3. What Core outputs is not an effective compilation target.
- We can see this in that many high level ideas go through core
such as
1. Lambdas
2. Pattern Matching in global function definitions
3. List style records (not put together in an array
representation for compilation)
4. Deconstructors expressed in terms of higher order functions.
- What this means is that we have a core language that then
requires yet another series of passes to properly compile to a
suitable backend.
In order to tackle these complexities to development and to uphold the
ideals of Juvix Ι propose the language that will become Juvix in time,
S-Juvix.
S-Juvix is an un-typed Lisp with the following properties:
1. It\'s minimal
2. It\'s a fast compile target
3. It can extend itself
4. It is easy to apply type theory onto
point `2.` combined with `1.` covers the awkwardness of making a
compiler for it, while also satisfying fast output code.
Point `3.` is a direct answer to the complexity issues we\'ve faced in
development, while also pushing for the ability to make ergonomic
contract code.
Lastly, point `4.` is of special importance as this is what will allow
the language to be typed by any direction what our type theory takes us.
It further lets our type theory try out various ideas without throwing a
wrench in the rest of the compiler.
With the preface to the language lay out, let us introduce the full
language standard
The S-Juvix Language
====================
S-Juvix is a rather standard low level lisp, offering very little
builtin definitions. Below we will briefly describe all components of
the base language. For a more complete specification, view `Appendix
B`.
- The builtin data-structures of the language are:
- Arrays
- Lists
- Hash-tables
- Fixed length products
- The Image
- Modules (derived from Hash-tables)
- Symbols
- Numbers
- The builtin evaluation functions are:
- (eval \<form\>)
- (macroexpand \<form\> \<recursive-or-not\>)
- (apply \<form\> \<list\>)
- The main definition functions are:
- (def \<name\> \<body\>)
- (lambda \<binder\> \<body\>)
- (def-local \<name\> \<body\>)
- The usual suspects with lisps are:
- (gensym \<name\>)
- (read \<value\>)
- (eq \<item-1\> \<item-2\>)
- (quote \<form\>)
- (do \<body\>+)
- Branching primitives:
- (case \<number-or-symbol\> \<body\>+)
There are of course more functions (mainly in regards to manipulating
builtin data-structures), however we can notice that the language itself
is quite small. Even the given base forms are quite low level, with the
`def` forms lacking even optional arguments to take!
Thus to make a function that takes any amount of arguments, we\'d have
to write.
``` {.commonlisp org-language="lisp"}
(def factorial
(lambda (x)
(case (>= 0 x)
:true 1
:false (+ x (factorial (- x 1))))))
```
Which is not very ergonomic, as our main way to define a function is not
able to take any arguments, we can derive it with the functions derived
in `Appendix D`.
``` {.commonlisp org-language="lisp"}
(defmacro defun (name arguments body)
`(def ,name (lambda ,arguemnts ,body)))
(defmacro if (pred then else)
`(case ,pred
:true ,then
:else ,else))
(defun factorial (x)
(if (>= 0 x)
1
(+ x (factorial (- x 1)))))
```
Now, we can see that the factorial function is defined like we\'d expect
in a high level language.
Even the function/macro `defmacro` is defined from `macro` which is
defined from `def`. Due to how this language can grow, we can from a
small core, derive a full blown ergonomic high level language.
Deriving Types in S-Juvix
-------------------------
Typing is traditionally done by making sequent calculus rules and
deriving the syntax one wants from there. Likewise we can apply typing
rules upon the primitives of this language. Thus if we can do this, then
all that remains to be seen, is how would we extend the language with
enough information to support explicit declarations and places for the
typechecker to insert type information.
Thus I present one example strategy for including type-signatures with
global definitions
``` {.commonlisp org-language="lisp"}
(defmacro sig (name type)
(let ((name-generated
(gensym (symbol-append name `signature))))
`(do
(def ,name-generated ,type)
(set-meta-data ,name
(insert (meta-data ,name)
:type
,name-generated))
(set-meta-data ,name-generated
(insert (meta-data ,name-generated)
:siganture
:true))
,name-generated)))
```
In this example, we abuse the fact that each global definition has a
meta-data slot, we use this to refer to the new signature. We also
define the signature as a normal declaration. We can see an example
usage here
``` {.commonlisp org-language="lisp"}
(sig factorial (-> int int))
(defun factorial (x)
(if (>= 0 x)
1
(+ x (factorial (- x 1)))))
```
Thus, the syntax of the language can be extended properly with a typing
system that Juvix Core will manipulate.
More examples of this kind of extension can be found in Appendix C.
The S-Juvix Abstract Machine
----------------------------
### Compilation Considerations
Appendix A Size Calculations
============================
``` {.haskell}
-- Core
λ> import qualified Generics.Deriving.ConNames as ConNames
λ> length (ConNames.conNames (undefined :: Term Int Int Int))
20
λ> length (ConNames.conNames (undefined :: RawGlobal Int Int Int))
4
λ> length (ConNames.conNames (undefined :: Elim Int Int Int))
5
λ> length (ConNames.conNames (undefined :: Pattern Int Int Int))
7
λ> 20 + 4 + 5 + 7
36
-- Frontend
λ> length (ConNames.conNames (undefined :: Expression))
25
λ> length (ConNames.conNames (undefined :: TopLevel))
10
λ> length (ConNames.conNames (undefined :: MatchLogicStart))
4
λ> length (ConNames.conNames (undefined :: NameSet Int))
2
λ> length (ConNames.conNames (undefined :: Header Int))
2
λ> length (ConNames.conNames (undefined :: Computation))
2
λ> length (ConNames.conNames (undefined :: Constant))
2
λ> length (ConNames.conNames (undefined :: Numb))
2
λ> length (ConNames.conNames (undefined :: Arg))
2
λ> length (ConNames.conNames (undefined :: GuardBody Int))
2
λ> length (ConNames.conNames (undefined :: Product))
3
λ> length (ConNames.conNames (undefined :: Adt))
2
λ> length (ConNames.conNames (undefined :: Name))
2
λ> length (ConNames.conNames (undefined :: Data))
2
λ> length (ConNames.conNames (undefined :: InfixDeclar))
3
λ> 3 + 2 + 2 + 2 + 3 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 4 + 10 + 25
65
```
Appendix B Base Language
========================
``` {.commonlisp org-language="lisp"}
;; Definition Facilities
(def <symbol> <body>)
;; we may want to unify all "marking" functions under some kind of set
;; system
(mark-def-as-macro <symbol>)
;; we may want a separate local def form. That or rely on inlining
;; being quite good.
(lambda <name> <body>)
;; May not be needed, but is handy
(def-local <name> <body>)
;; Builtin Types
list (immutable)
array (mutable)
hash-table (mutable)
;; Reader extensions are omitted
;; Image commands
(save-juvix-and-die <location>)
;; evaluation
(eval <form>)
(macroexpand <form> <recursive-or-not>)
(apply <form> <list>)
;; Module Manipulators
(in-module <quoted-name>)
(new-module <quoted-name>)
(meta-data <symbol>)
(set-meta-data <symbol> <value>) ; data should be a hashtable
(open <name>)
(include <name>)
;; Miscellanies Functions
(eq <item-1> <item-2>)
(garbage-collect)
(de-allocate <ref>)
(read <value>)
(gensym <name>)
(<application> <arg-1> … <arg-n>)
(do <body>+) ; traditionally called progn (CL) or begin (Scheme), or do (Haskell, Clojure)
(quote <form>) ; Literal constructors
;; choice
(case <thing> <body>+)
;; products co-products
(product <list-of-values>)
;; List (immutable)
(car <list>)
(cdr <list>)
(cons <ele> <list>)
(pair? <list>)
(nil? <list>)
nil
;; Array (mutable)
(elt <array> <ele>)
(set <array> <ele> <value>)
(allocate <array> <length> <initial-value>)
;; Hashtable (mutable)
(lookup <hashtable> <key>)
(insert <hashtable> <key> <element>)
(remove <hashtable> <key>)
```
Appendix C Type Theoretic Macros
================================
- We will be relying on `Appendix D` to give us enough definitions to
write the following code with ease
``` {.commonlisp org-language="lisp"}
;; Signatures
;; In reality we'd want to check if the sig already exists, and work
;; off that gensym, instead of defining a new symbol every time we try
;; to work this.
(defmacro sig (name type)
(let ((name-generated
(gensym (symbol-append name `signature))))
`(do
(def ,name-generated ,type)
(set-meta-data ,name
(insert (meta-data ,name)
:type
,name-generated))
(set-meta-data ,name-generated
(insert (meta-data ,name-generated)
:siganture
:true))
,name-generated)))
;; Need to derive this more explicitly, rather than just stating it
;; Examples of Sum types without explicit sum types
;; note we use cons explicitly here to demonstrate how we can
;; construct it, we wouldn't do that in practice due to safety
;; concerns. This is the exact same as categorical co-product is it
;; not?
(defadt foo ()
(foo bart)
(baz baz)
(bar bar))
(def my-bar
(make-bar <bar-record>))
---> (def my-bar (cons :bar <bar-record>))
(def my-foo
(make-foo <foo-record>))
---> (def my-foo (cons :foo <bart-record>))
;; type checker assisted, so will give a warning about missing cases
(match my-bar
(bar <bar-record>) <body-with-bar-record>
(foo <bart-record>) <body-with-bart-record>
(baz <baz-record>) <body-with-baz-record>)
--->
(at-compile-time-thus-not-in-generated-code
(if (all-arguments-checked (constructors of-case))
t
(warning "cases missed......")))
;; this code shows up
(case (car my-bar)
;; symbols are O(1) compare, so basically integers when interned
:bar (let <bind-bar-record> <body-with-bar-record>)
:foo (let <bind-bart-record> <body-with-bart-record>)
:baz (let <bind-baz-record> <body-with-baz-record>))
```
Appendix D Derivations
======================
- All the code is valid in this specification of Juvix, except for
values that are related to the Juvix reader itself (thus, reader
extensions and some literals).
- All these exceptions will be noted, otherwise all code is
compiling and extend Juvix itself
``` {.commonlisp org-language="lisp"}
(def list
(lambda (&rest arguments)
arguments))
;; First macro facility
(def macro (lambda (name args body)
(list (quote do)
(list (quote def) name args body)
(list (quote mark-def-as-macro) name)
name)))
(mark-def-as-macro (quote macro))
;; defmacro next level of macro lever
(macro defmacro (lambda (name args body)
(list (quote macro)
(list (quote lambda) args body))))
(defmacro if (pred then else)
(list (quote case) pred
:true then
:false else))
(defmacro and (x y)
(list (quote if)
x
(list (quote if)
y
:true
:false)
:false))
;; getting List function
(def foldr
(lambda (f list acc)
(if (nil? list)
acc
(f (car x) (foldr f (cdr xs) acc)))))
;; need this for quasi-quote
(def conc
(lambda (xs ys)
(foldr cons xs ys)))
;; And now the tricky bit, we want to implement `
;; We will have literals we will deal with, like
;; #(1 2 3 4) Array
;; Quasiquote formally works like this.
;;
;; Note 'x = (quote x)
;; `x = (qquote x)
;; ,x = (unquote x)
;; ,@x = (splice x)
;;
;; `x = 'x
;; `,x = x
;; `(x y) = (cons `x `(y))
;; `(,x y) = (cons x `(y))
;; `(,@x y) = (conc x `(y))
;;
;; The code will look ugly as I say (quote x) instead 'x, and Ι lack
;; the proper tools, which I'll make with this macro qquote! All of
;; this is the pains of bootstrapping yourself, and nothing more.
(defmacro qquote (x)
;; if we had cond, this would be a lot prettier
(if (array? x)
;; array-literal and array-list are left as an exercise to the
;; reader
(list (quote array-literal)
(list (quote qquote)
(array-list x)))
;; Check if it's a hashtable literal
(if (hashtable-literal? x)
;; hashtable-literal, and hash-list left as an exercise to
;; the reader.
(list (quote qquote)
(hash-list x))
;; next check if it's not a list... if so we're golden.
(if (not (pair? x))
(list (quote quote) x)
;; Check if we have the following case
;; `,x --> x
(if (eq (quote unquote) (car x))
(car (cdr x))
;; Check if we the following case
;; `(,x y) --> (cons x `(y))
(if (and (pair? (car x))
(eq (quote unquote) (car (car x))))
(list (quote cons)
(car (cdr (car x)))
(list (quote qquote)
(cdr x)))
;; Check if we the following case
;; `(,@x y) --> (conc x `(y))
(if (and (pair? (car x))
(eq (quote splice) (car (car x))))
(list (quote conc)
(car (cdr (car x)))
(list (quote (cdr x))))
;; no other case thus it's just a list
;; `(x y) --> (cons `x `(y))
(list (quote cons)
(list (quote qquote) (car x))
(list (quote qquote) (cdr x))))))))))
;; register the reader extension somehow. This is the one part that is not specified
(reader-extension qquote ...)
;; We'll likely make this along with and CPU instructions
(defmacro or (x y)
`(if ,x
:true
(if ,y
:true
:false)))
(def group-by-2
(lambda (xs)
(def-local recursive
(lambda (xs acc)
;; drop an element if list is not empty
;; an assert would be better here
(if (or (nil? xs) (nil? (cdr xs)))
acc
(group-by-2 (cdr (cdr xs))
(cons (list (car xs) (car (cdr xs)))
acc)))))
(recursive xs nil)))
(defmacro cond (&body lists)
(foldr (lambda (x ys)
`(if ,(car x)
,(car (cdr x))
,ys))
(group-by-2 lists)
nil))
```