Juvix
juvix-project
The Juvix module system is one that wishes to leverage the benefit of a nested module system (rust, ocaml) while maintaining a workflow that can support working interactively or through a more static approach. In order to achieve this goal, we analyze different approaches of module management, and distill the qualities that are positive to a project and user behavior.
The Haskell view of modules is an interesting one in that the language's modules look nested, while continuing to be flat.
Consider the following example
-- Compilation.hs
module Juvix.Backends.LLVM.Compilation
import qualified Juvix.Backends.LLVM.Codegen.Block as Block
import qualified Juvix.Backends.LLVM.Codegen.Closure as Closure
import qualified Juvix.Backends.LLVM.Codegen.Record as Record
import qualified Juvix.Backends.LLVM.Codegen.Sum as Sum
import qualified Juvix.Backends.LLVM.Codegen.Types as Types
import qualified Juvix.Backends.LLVM.Codegen.Types.CString as CString
...
preProcess ::
ErasedAnn.AnnTerm PrimTy RawPrimVal -> Types.Annotated Types.TermLLVM
preProcess = Conversion.op
compileProgram ::
Monad m =>
-- | Term to compile.
ErasedAnn.AnnTerm PrimTy RawPrimVal ->
Feedback.FeedbackT [] P.String m Text
compileProgram t =
case Block.runEnvState (register t) mempty of
...
Here we import the path Juvix.Backends.LLVM.Codegen
over and over in a truly nested world we would be able to say
import Juvix.Backends.LLVM.Codegen
preProcess ::
ErasedAnn.AnnTerm PrimTy RawPrimVal -> Types.Annotated Types.TermLLVM
preProcess = Conversion.op
compileProgram ::
Monad m =>
-- | Term to compile.
ErasedAnn.AnnTerm PrimTy RawPrimVal ->
Feedback.FeedbackT [] P.String m Text
compileProgram t =
case Block.runEnvState (register t) mempty of
...
as what exists within Juvix.Backends.LLVM.Codegen
is the Block
, Closure
, Sum
, Types
, CString
module. We could even take this a step further and qualify Juvix.Backends.LLVM.Codegen
as C
, and be able to say C.Block.runEnvState
.
This is very much so what the java programming language does, which in some ways has similar syntatical convention (.
) for it's "module" lookups as Haskell
public class test {
public static void main (String args[]) {
System.out.print("Test");
}
}
// we can also write
import java.lang.System.*
// for our purposes we could have also wrote
// import static java.lang.System.out;
public class test2 {
public static void main (String args[]) {
out.print("Test");
}
}
Not only would a nested module system allow Juvix to save on imports, it will also allow more effective organization of libraries and business code.
Take for example the organization of a standard library.
-- psuedo Haskell
-- Protolude.hs
module Protolude where
type Int = Int.T
type Word = Word.T
------------------------------
-- Protolude/List.hs
------------------------------
module ProtoLude.List where
filter :: (a -> Bool) -> [a] -> [a]
filter = ...
zip :: [a] -> [b] -> [(a,b)]
-- Currently named nonEmpty
toNonEmpty :: [a] -> Maybe (NonEmpty a)
toNonEmpty = ...
unfoldr :: (b -> Maybe (a, b)) -> b -> [a]
sort :: Ord a => [a] -> [a]
--------------------------------
-- Protolude/Int.hs
--------------------------------
module Protolude.Int where
type T = Prelude.Int
-- this is gcdInt' in Protolude
gcd :: T -> T -> T
gcd = ...
-------------------------------
-- Protolude/Word.hs
-------------------------------
module Protolude.Word where
type T = Prelude.Word
-- this is gcdWord' in Protolude
gcd :: T -> T -> T
gcd = ...
The above example shows a few positive points over the current protolude
that we use today. The first being is that functions which have multiple variants like gcdInt'
and gcdWord'
can now be used as the following
import Protolude
myLogic :: Word
myLogic = Word.gcd 3 5
Where the disambiguation can happen on the module level. However this leads into another important point. For functions like filter
we could see this function as working on many different data structures, and maybe even a type class! To figure out what the version in protolude is, the user would have to type in the prompt :t filter
to confirm the version. However in the case of writing List.filter
, it becomes readily apparent that the filter function will be on the list data type.
Thus the standard library is setup in such a way that encourages operations pertaining to a particular type to be in the same module as that type, combined with no extra imports from the user means that users are more readily available to be using qualified names (which in Ocaml, Clojure, … is considered good practice) and have lower resistance for trying other data structures in the standard that is not directly in prelude (every inconvenience discourages users from using [1] [2]).
Another advantage comes in the large scale packaging of code and the flexibility of later renaming.
Let us argue this point from the perspective of two different library authors working on a similar problem. They don't know each other, and decide to focus on different parts of the problem. Let's say the domain pertains to proving different properties of stream fusion. Thus they both define
However maybe one defines Data.Stream.Proof.SingularPass
, while the other defines Data.Stream.Proof.ConstantSpace
.
This is all fine, as long as no one requires both of these packages at once. However, say we are a startup who wants to prove properties about stream fusion as we need hard real time guarantees. Thus we'd wnat a proof about the passes all being fused, and a proof that the code is taking a constnat amount of space as any GC would make our application not hard real time.
Thus we wish to use both of these. However in the flat representation, we could only use both packages if we rename each shared module in one of the packages (that is if the language even allows us!). Thus I have to rename one package as follows
For each module given. However in a nested setting our choices get a lot more intersting. If the package manager allows us we can simpily isolate this issue at the package level and state, and import both depencies as qualified
-- build file
qualified [Singular, Constant]
-- File Level in our project
module Data.Stream.Proofs where
module Singular = Singular.Data.Stream.Proofs
module Constant = Constant.Data.Stream.Proofs
This ability can further be automated away, if we treat modules as values, as we can request particular handlers to say concatenate the modules if no conflicting symbols are found etc.
Another interseting point is that Haskell enforces a system of one module per file and that if one does not import a module, then it can't be accessed at all. This has the upshot of Haskell's build system not requiring explicit ordering, however in times where a particular task demands multiple functions for one "kind" of item in the system, then we have little recourse.
For example:
Let us say we are making LLVM data types, we need to
Below is the code that does precisely that.
portTypeNameRef :: Type
portTypeNameRef = Type.NamedTypeReference portTypeName
portTypeName :: IsString p => p
portTypeName = "graph_port"
portType :: Type -> Type
portType nodePtr =
StructureType
{ isPacked = True, ... }
portPointer :: Type
portPointer = pointerOf portTypeNameRef
portTypeSize :: Num p => p
portTypeSize = numPortsSize + pointerSizeInt
-- | Construct a 32 bit port space so we can put many inside a node cheaply
-- The pointer points to the beginning of a node and an offset
nodePointer :: Type
nodePointer = pointerOf nodeTypeNameRef
nodeTypeNameRef :: Type
nodeTypeNameRef = Type.NamedTypeReference nodeTypeName
nodeTypeName :: IsString p => p
nodeTypeName = "graph_node"
nodeType :: Type -> [Type] -> Type
nodeType tag extraData
| bitSizeEncodingPoint =
abstracted
{ elementTypes = ... }
| otherwise =
abstracted
{ elementTypes = ... }
where
abstracted =
StructureType
{ isPacked = True,
elementTypes = []
}
We make *type
, *typeName
, *typeNameRef
, etc etc etc. If Haskell had a system where the user could make an adhoc module on the spot then the code would have been put in
module Port where
nameRef = ...
type' = ...
typename = ...
pointer = ...
size = ...
module Node where
nameRef = ...
type' = ...
typename = ...
pointer = ...
size = ...
as it would be easy to do so. When tasks become laborious to do the right way users will often take lazy short cuts. Some examples of the right way to do it in haskell include:
Some other benefits of the more Module
focused way, I can safely factor Node
and Port
to their own files with no differnece in the caller interface. This very explicit structure, encoruages me to write a module signautre for them. And lastly I've increased the regularity of the codebase as this easy to form idiom has a concrete meaning.
Thus to sum up the implications of Haskell's Module View
Although Unix is not a programming language, it would be remis to neglect the correspondence between the file system of unix and programming lanugages.
The Unix file system has a few notions.
Let us demonstrate each point
cd ~/holder/example
# We are now in ~/holder/example module/directory
2 katya@Babylon-4:~/holder/example %
# go to the module above via home dir
cd ~/holder
# get to example again by the module in ~/holder
cd example
2 katya@Babylon-4:~/holder/example % echo "" > bin
2 katya@Babylon-4:~/holder/example % LANG=en_US.UTF-8 mkdir bin
mkdir: cannot create directory ‘bin’: File exists
# Modules can't be the same name as files!
# can go to the top directory just fine
cd /
# Can go to work just fine as well!!!
cd ~/Documents/Work
# lets get weird about this
3 katya@Babylon-4:~/holder/example % mkdir Foo
3 katya@Babylon-4:~/holder/example % mkdir Foo/Bin
3 katya@Babylon-4:~/holder/example % cd Foo/Bin
2 katya@Babylon-4:~/holder/example/Foo/Bin % ln -s -d ~/holder/example/Foo .
2 katya@Babylon-4:~/holder/example/Foo/Bin % cd Foo
2 katya@Babylon-4:~/holder/example/Foo/Bin/Foo % cd Bin/Foo/Bin
2 katya@Babylon-4:~/holder/example/Foo/Bin/Foo/Bin/Foo/Bin %
From an interactive point of view the Unix file system is quite nice, in that I can access the entire unix world from any directory, there is no arbitrary cut off of access to other modules depending on what module I am in.
Further, since the Unix filesystem allows symlinks (which can be viewed as module aliases) and directories are in the same namespace as files, we can form a graph from what was a tree. This is an important implication that falls out of the rules:
For a programming language this too would be a natural implication, although checks can be had if one really insits on the matter
Agda/MiniJuvix is interesting to analyze here, as it's explictily looking for a static development model.
A good way to describe the MiniJuvix's Module view is as follows.
Let us give an example of interesting behavior that is entailed by the above rules.
Say we are a business, and we have some very vital business functions. it's so important in fact that we have named a module BusinessLogic.Important
to handle all the important requests. We can think of it as a common API.
Now BusinessLogic.Important
is a bit… verbose to say, so, we have another Global module, BusinessLogic
which rexports Important
, and perhaps adds a few other functions that are handy with the less important functions of BusinessLogic.
.
Now say we browse the codebase and see the pattern of Application.Platform
all over the codebase, people import BusinessLogic
, and carry on with important and non important functions.
Now for our task, let's make an app, but we are still in the experimentation phase, so we just import the Important
functions to begin.
-- BusinessLogic/Importnat.mju
----------------------------------
module BusinessLogic.Important
.... key-change does not exist in here ...
-- BusinessLogic.mju
----------------------------------
module BusinessLogic
import BusinessLogic.Important
module Important =
open public BusinessLogic.Important
def key-change = ....
-- Application/Platform.mju
----------------------------------
module Application.Platform
import BusinessLogic
def fi = BusinessLogic.Important.key-change
module BL = BusinessLogic
def vital = ... BL.Important.key-change ...
-- MyFirstApp.Ju
----------------------------------
module MyFirstApp
import BusinessLogic.Important
-- lets use our key-change!
def lets-business = ... BusinessLogic.Important.key-change ....
Please take a second and think about what you expect from module MyFirstApp
.
Does this code work? The answer is no. the Global
module BusinessLogic.Important
does not have key-change
, the only module which does is
BusinessLogic.Important
, sorry let me rephrase, BusinessLogic
's local module, Important
.
Thus Although the pathing looks somewhat nested, it simply isn't. Thus the following code would lead to an ambiguity error being shown to the user
-- MyFirstApp.Ju
----------------------------------
module MyFirstApp
import BusinessLogic.Important
import BusinessLogic
-- lets use our key-change!
def lets-business = ... BusinessLogic.Important.key-change ....
as these two modules are in fact not the same! Thus the modules both live in the same namesapce as everything else, but also don't on a global level.
Futher issues arise when we want user contributions in the STDlib. For example, the user can not extend the base standard library.
It is nice when a user can extend the language in the same way as the language developer can, thus user extensions feel built in as they look no different than what the language authors have wrote down.
In the case of the standard library this might occur like in the example in the section An argument over the Flat World Model
. Namely in the Data.Stream.Proofs
example.
In the MiniJuvix world, this would not so easily be gotten, and thus the Prelude may export Stream
module Prelude
import Prelude.Stream
module Stream =
open public Prelude.Stream
and in the normal module of a truly nested system we can get
import Prelude
open Prelude
fi = Stream.Proofs.occurence
however in the MiniJuvix case, this would not be possible, and we'd have to explicitly import the extension
import Prelude
import Prelude.Stream.Proofs
-- I'm unsure of how minijvuix handles the Prelude here, as they both have it...
open Prelude
fi = Prelude.Stream.Proofs.occurence
Thus user extensions to the language do not feel built in, and getting it to be so, would require users to write their own prelude for each project, which tries to hookup whatever extensions they want.
This import
behavior of global modules to be recognized in the current scope, has another wide ranging implication. Namely that:
For example, let us play in the MyFirstApp
.
JU-User> import MyFirstApp
JU-User> in-module MyFirstApp
MyFirstApp> lets-business
... Gave wrong result ...
It seems lets-business
gave the wrong result. Let us check out our tests to see what is up
MyFirstApp> Test.MyFirstApp
here I can auto complete Test.My
as the system knows about the code in the system, but it is not accessible, thus I should get an error like
error Test is not a module in MyFirstApp
Thus we have to import it
MyFirstApp> import Test.MyFirstApp
MyFirstApp> Test.MyFirstApp.run-the-tests
... 5 tests out of 10 failing ...
and now we can run it. If we were to investigate the issue and deduce it comes from Business.LessImportant
instead then we can maybe play with the data like
MyFirstApp> import Business.LessImportant
MyFirstApp> in-module Business.LessImportant
Business.LessImportant> ... play around for a while and play with behavior ...
Business.LessImportant> import MyFirstApp -- time to go back, but first import where we are
Business.LessImportant> in-module MyFirstApp
Note here we had to remember to rebring MyFirstApp
into scope, as by the very act of switching what module our current environment is, has made us forget about MyFirstApp
even though it exists perfectly well in the environment, we simply can't talk about that namespace in MyFirstApp
[3].
import BusinessLogic.Importat
, import BusinessLogic
. Multiple BusinessLogic.Important
!).Point 5. leads to
And point 3. leads to
This is mainly due to requiring an extra import, that builtins do not need.
Out of all these models. The closest world view on modules to the Desirable values
table above is the Unix
view with Ocaml characteristics.
Thus the design for Juvix's module system is quite simple.
Most of the properties of the Juvix Module system fall out of fact 1.
, 2.
and 3.
.
For example
Graphs
may form because if one rexports their ancestory, they can then by some path trace back to themselves. This is a rare occurrence but fall out of the rules. Some arbitrary rules can be applied to stop this, as the system knows enough to fully stop this behavior if need be.
From the Interactive Sesssion, the user can hop to any module without recompilation or stating explicitly what they need to bring into scope as the user has the full access to the system.
The Juvix Module system will have special code to handle
Where an implicit open is something opened in the environment automatically.
Other feautres such as Aliases
, are derived from these two features above.
Thus the lookup rule for the Current module of Juvix can now be explained
For Global lookups the rule changes slightly
TODO :: how it works just falls out of the implications, so just have fun with design - mariari
asdf
over your code[1] https://www.youtube.com/watch?v=-J8YyfrSwTk I believe it is this talk where Yaron Minsky explains that if tests aren't easy to make people won't do it, even if it is important.
I don't have the exact time stamp, and it may be a different talk. Update this reference when it is precisely found.
[2] Hence +
for string concatenation in java is horrible and ought to be taxed. Even making it be ++
to make it slightly more taxing to write may indeed lower some usage.
[3] Note it has to be like this, as many shortcuts we can make, can run into an ambiguity and frustrating of wanting to go to BusinessLogic
's Important
rather than BusinessLogic.Important
.