In this document we outline the Berlin Pipeline. The document is broken up into three core chunks, along with a miscellaneous section that outlines helper structures integral to the smooth operation of the pipeline. These four sections are laid out below. The fourth section covers extensions that have made it into the codebase and that may alter the shape of the first three sections, with further supporting sections to prop up the understanding of the model.
The full code that backs the proposal can be found here.
The last section of the document covers the expected time and how the tasks should be distributed.
One may ask why we should even begin the Berlin Pipeline; after all, we can currently get LLVM, Michelson, and Plonk to compile a term. However, what this does not take into account are the following properties:
Out of these issues, only issue 2 is solvable with the current architecture. This is because the pipeline has the following structure:
type Pipeline = Feedback.FeedbackT [] [Char] IO

class HasBackend b where
  type Ty b = ty | ty -> b
  type Val b = val | val -> b
  type Err b = e | e -> b

  stdlibs :: b -> [FilePath]
  stdlibs _ = []

  -- | Parse Juvix source code, passing a set of libraries explicitly to have them in scope
  toML' :: [FilePath] -> b -> Text -> Pipeline [(NameSymbol.T, [Types.TopLevel])]

  toSexp :: b -> [(NameSymbol.T, [Types.TopLevel])] -> Pipeline (Context.T Sexp.T Sexp.T Sexp.T)

  toHR ::
    (Show (Ty b), Show (Val b)) =>
    Param.Parameterisation (Ty b) (Val b) ->
    Context.T Sexp.T Sexp.T Sexp.T ->
    Pipeline (Core.RawGlobals HR.T (Ty b) (Val b))

  toIR ::
    Core.RawGlobals HR.T (Ty b) (Val b) ->
    Pipeline (Core.PatternMap Core.GlobalName, Core.RawGlobals IR.T (Ty b) (Val b))

  toErased ::
    Constraints b =>
    Param.Parameterisation (Ty b) (Val b) ->
    (Core.PatternMap Core.GlobalName, Core.RawGlobals IR.T (Ty b) (Val b)) ->
    Pipeline (ErasedAnn.AnnTermT (Ty b) (Val b))

  typecheck :: Context.T Sexp.T Sexp.T Sexp.T -> Pipeline (ErasedAnn.AnnTermT (Ty b) (Val b))

  compile ::
    FilePath ->
    ErasedAnn.AnnTermT (Ty b) (Val b) ->
    Pipeline ()
Starting from toSexp, we get the Pipeline type, which is just an alias for Feedback; thus no environment can be shared. However, even if we extend the Pipeline type with context information, which context do we share? All the passes up to toHR are compatible with the Context.T type; however, at that point Core uses its own hash table structure. Thus, to properly fix this situation, we need a consistent type to thread between the backends.
Further, we can see that with this structure the details of each pass are not expressed. In fact, toSexp hides over 7 passes that we have no way of stopping between unless we reconstruct the pipeline; hence the Easy module explicitly repeats this code. Thus we need some kind of structure that allows us to talk about passes and sub-passes.
The Berlin Pipeline seeks to unify both the type of a step and the type of the pipeline built up from those steps. Through this structure we can get all 5 of the missing properties and have a principled way of talking about what a pass is. This unification also allows tools to be built around a pass, allowing the easy creation of a new pass and its placement in the pipeline.
The model of the new pipeline is laid out in the following way:
- The Pipeline.Step module, whose type Pipeline.Step.t will serve as the type any compiler step inhabits.
- The Pipeline.Environment module, with Pipeline.Environment.t being the constructed compiler pipeline itself.
- The Pipeline.Automation module, which has functions to make inhabiting Pipeline.Step easier.
In this model, Pipeline.Environment gives the pipeline organizer the tools to clearly lay out a straightforward pipeline. It also gives the caller of the pipeline the tools needed to introspect the pipeline, end it early, etc.
The pipeline pass writer will utilize the Pipeline.Automation module to create passes for the pipeline, inhabiting Pipeline.Step.t and finally naming the pass, giving the pipeline organizer the Pipeline.Step.named type.
The current Juvix pipeline already has a notion of meta information, through its use of the Feedback monad. However, the current Feedback monad conflates feedback with the result of computation, and it is not easily modifiable to add extra meta information. Thus we extend and break apart the Feedback monad as follows:
module Meta : sig
  type t = { feedback : Feedback.t; trace : Trace.t }
end

module ComputationResult : sig
  type 'a t =
    | Success of { meta : Meta.t; result : 'a }
    | Failure of { meta : Meta.t; partial_result : 'a option }
end
The Meta.t type is filled with the meta information from computation, namely Feedback.t and Trace.t. The Feedback.t is the actual feedback part of the Feedback monad, and the Trace.t is any traces we could have gotten from a piece of computation. The Meta.t type could be extended if further information should be captured.
Further, we have the new concept of ComputationResult.t, stemming from the Failure and Success cases of the old Feedback monad. However, there is a twist: if computation fails, we may still get back a partial result.
Another important structure in the Berlin Pipeline is the concept of the circular list. We give this type stub to represent the idea:
(** A recursive list represents the concept of a list with a chance to nest.
  * This allows us to get a tree-like structure,
  * where a pipeline step can contain more steps.
  *)
module RecursiveList : sig
  type 'a t
end

(** A circular list represents the idea of steps that can be recursive
  * on each other.
  *)
module CircularList : sig
  (* TODO : add a recursion function to the recursive bit, that tells
   * it how it's recursive. Perhaps it uses a predicate function
   * on the 'a type to determine what function to go to next.
   *)
  type 'a recursive_schema =
    | Recursive of 'a list
    | NonRecursive of 'a

  type 'a t = 'a recursive_schema RecursiveList.t
end
The CircularList.t is broken down into two parts. The first is RecursiveList.t, which represents a list that can nest. This is very similar to the Sexp.t concept, except that instead of holding s-expressions it holds arbitrary data.
Using the recursive list, we are able to construct a circular list, which has a notion of cycles. The cycle behavior is determined by the recursive_schema type, which states how the contents are recursive on each other. The recursive_schema type should be refined in a future specification.
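To make the shape concrete, here is a minimal Haskell sketch of one possible rendering; the constructors and the pass names are hypothetical, chosen only to mirror the OCaml stubs above:

-- Hypothetical concrete rendering of the two list types above.
-- RecursiveList is a rose-tree-like list; CircularList layers the
-- recursion schema over it.
data RecursiveList a = Elem a | Nested [RecursiveList a]

data RecursiveSchema a = Recursive [a] | NonRecursive a

type CircularList a = RecursiveList (RecursiveSchema a)

-- Example: a grouping where two desugaring passes may recurse on
-- each other, followed by a non-recursive pass.
examplePipeline :: CircularList String
examplePipeline =
  Nested
    [ Elem (Recursive ["Desugar.cond-to-if", "Desugar.if-to-cond"])
    , Elem (NonRecursive "Desugar.deconstruct-type")
    ]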
In this proposal we will be using OCaml code for signatures. OCaml was chosen over Haskell because it has a notion of modules in the language, so the specification can be laid out more straightforwardly. This does mean that for many examples we will have Sexp.t list instead of List.t Sexp.t or even [Sexp.t].
Further, when we convert the OCaml functions to Haskell, we will turn snake_case into camelCase.
Since OCaml lacks a notion of effects, modules may be given that represent the Haskell effects, and Haskell may be given to supplement the OCaml where it may be confusing.
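For instance, under these conventions the Meta and ComputationResult modules above could be supplemented with the following Haskell rendering; this is a sketch of the convention, not final code:

-- Sketch of the OCaml signatures above, rendered with the naming
-- conventions described: snake_case becomes camelCase, and the
-- OCaml [t] becomes the Juvix-style [T].
data Meta = Meta
  { feedback :: Feedback.T
  , trace :: Trace.T
  }

data ComputationResult a
  = Success { meta :: Meta, result :: a }
  | Failure { meta :: Meta, partialResult :: Maybe a }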
The Step module serves as the base interface for any compiler step. The module is quite minimal, with only a few types and a small set of operations.
The main type of the module, t, is simply a function type: it takes computational_input, a structure that has more than the pass needs, and gives back the result of the computation plus Meta-data.
module Step : sig
  ...
  type t = computational_input -> working_environment ComputationResult.t
  ...
end
We say that the pass takes more than it needs because computational_input is not only composed of the context and the current_expression we are trying to compile against:
(** [env_or_sexp] expresses data that may already be added to the environment
  * or may still be in a base sexp form that requires no lookup.
  *)
type env_or_sexp =
  | InEnv of NameSymbol.t
  | Sexp of Sexp.t

type computational_input = {
  language_data : working_environment;
  ...
}

type working_environment = {
  current_expression : env_or_sexp list;
  context : Sexp.t Context.t
}
but also of a surrounding environment, which contains information that should be put into the pass's working environment:
type computational_input = {
  language_data : working_environment;
  surrounding_data : surrounding_environment;
}

type surrounding_environment = {
  current_step_name : NameSymbol.t option;
  meta_information : Meta.t
}
This surrounding_environment is composed of the Meta information discussed in the Computational Meta Data section, along with the name of the current step. The usefulness of the name will be discussed in a later section, but this environment should be able to grow without affecting passes which don't touch the extra information.
The Step module also has a notion of a named pass:
module Step : sig
  type t = computational_input -> working_environment ComputationResult.t

  type named = {
    name : NameSymbol.t;
    func : t
  }

  val name_pass : t -> NameSymbol.t -> named
end
The named type simply allows us to attach a name to any given pipeline step function. It will serve as the type that Pipeline.Environment.register_step deals with, and it should be exported opaquely.
Lastly, the name_pass function is the public constructor of this type.
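As a quick hypothetical illustration on the Haskell side (the record accessors here are assumptions, mirroring the field names above):

-- A do-nothing step: hand back the language data unchanged, with the
-- meta information threaded through. Accessor names are hypothetical.
identityStep :: Pipeline.ComputationalInput -> ComputationResult.T Pipeline.WorkingEnvironment
identityStep input =
  ComputationResult.Success
    { meta = Pipeline.metaInformation (Pipeline.surroundingData input)
    , result = Pipeline.languageData input
    }

-- Naming the step gives the organizer something it can register.
namedIdentity :: Pipeline.Step.Named
namedIdentity = Step.namePass identityStep "Example.identity"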
Overly unconstraining types like the input type of Step.t should normally be met with criticism; however, in the case of the Step module it is necessary, so as to avoid enforcing the Pipeline.Environment's effect type upon the passes.
Since this separation is needed, the Step module should not be programmed against directly, but rather through the Pipeline.Automation module, which makes a pass work over a single env_or_sexp and a Context.t Sexp.t, along with the pass's custom environment.
Now that we have Pipeline.Step properly specified for the passes, we can get to the infrastructure it allows us to create.
Before we get to the Environment in which the majority of the compiler pipeline lives, we must take a closer look at the surrounding_environment the pass gets.
(** [surrounding_environment] serves as the minimum surrounding information
  * in the environment of the Pipeline Steps.
  *)
type surrounding_environment = {
  (* This will be taken as a read value *)
  (** The [current_step_name] represents the current running step name *)
  current_step_name : NameSymbol.t option;
  meta_information : Meta.t
}

(** [computational_input] is the actual data passed into the [Step.t] type.
  * The [language_data] field serves as the direct data a pass wishes to
  * take, while [surrounding_data] serves as extra environment
  * constraints that one wants to include.
  *)
type computational_input = {
  language_data : working_environment;
  surrounding_data : surrounding_environment;
}
We can see the input type of the pass along with the surrounding_environment it carries. For this proposal the surrounding_environment is quite simple: it has meta_information, which contains the Trace.t and Feedback.t inside of it. This inclusion just lets trace and feedback information be threaded through the passes, as the output type directly contains new versions of these values. It also has current_step_name, which reflects the name of the currently running pass; we will see some application of this in the Automation section.
The important part of this surrounding_environment is that, through the output type of Pipeline.Step.t, we have enough data to fill the dynamic parts of the Pipeline.Environment itself.
The Pipeline type is defined as follows:
module Environment : sig
  type t = {
    (** [information] serves as the public facing data to the environment
      * that passes can access.
      *)
    information : computational_input;
    (* All other information is made to be exclusive to the Pipeline as
     * a whole, which the Steps do not bother with.
     * This consists of information ranging from how to deal with Traces
     * between passes to the entire pass infrastructure itself.
     *)
    (** [registered_pipeline] is the pipeline itself.
      * It features all the steps named and registered with the system. *)
    registered_pipeline : Step.named CircularList.t;
    stopping_step : NameSymbol.t option
  }
  ...
end
It contains a few pieces of information.
The first piece is information, which contains the same type that Pipeline.Step.t takes. This is considered the "dynamic" data of the pass, and it serves a few purposes:
- It serves as the direct input to each pass, via the computation_input.
- It allows the pipeline to be resumed from any point in the Environment.t directly, by feeding it directly as input.
- It lets values handled by the Automation become part of the running environment of the pass directly, while avoiding the issue of the passes needing to use anything in Environment.t that would stop a pass being run in its own environment. This is crucial, as we can't run partial effects with Capabilities.
The next few pieces of data relate to the pipeline itself, and will be constructed fully from functions that make up the pipeline or wish to manipulate the execution of the pipeline.
The first piece of data in this list is the registered_pipeline, which is what records the passes.
The type is a CircularList.t over the Step.named structure we analyzed previously. The CircularList.t type was chosen because its recursive nature gives us a few advantages over a normal list of Step.named:
- We can group named steps together, much like Tasty does with its T.TestCase and testGroup :: TestName -> [TestTree] -> TestTree.
- The circular structure lets us express steps that are recursive on each other.
We will state more on how registered_pipeline works when we get to the API function portion.
Lastly, for now, we also have the value stopping_step, which simply states at which step we should stop running the compiler and return whatever working data is left.
Although we only include two fields exclusive to the pipeline environment, the design is such that more fields and behaviors should be easily added without much hassle.
With the environment out of the way, let us consider how one is supposed to use this environment and the functions on it.
The pipeline is intended to grow a single compile function, which we shall call eval. We do this by registering Pipeline.Step.named values into the environment. After we register all the steps, we can choose whether the pipeline stops early, introspect if some pass fails, etc. When it comes time to execute the pipeline, a run function will be given that takes the Pipeline.Environment.t value and starts executing the registered passes, stopping either when a pass fails, when the stopping_step says we should stop at that step, or when we have successfully compiled the given form.
(** [register_step] registers the pipeline function to the environment *)
val register_step : Step.named CircularList.t -> unit
register_step has the simple job of taking a named pass and adding it to the registered_pipeline. Note that we take a CircularList.t over this type simply to make the operation monoidal. This will be a core way one interacts with the pipeline; an example implementation could be as simple as:
registerStep pass =
  modify @"registeredPipeline" (<> pass)
While the above function is useful, it has no way of helping us build groups of steps. For that we have:
(** [def_pipeline_group] creates a named group of pipeline steps or a
  * nested grouping of pipeline steps *)
val def_pipeline_group :
  NameSymbol.t
  -> Step.named CircularList.t list
  -> Step.named CircularList.t
This function should remind one of the testGroup function used in Tasty tests. It has the same behavior as testGroup; we can then register the resulting group with register_step.
The next set of functions deals with stopping the pipeline early.
val stop_at : NameSymbol.t -> unit
val stop_at_nothing : unit
stop_at tells the environment to simply stop when we reach the pass with the given name, while stop_at_nothing resets the stopping behavior to none.
The next function in the environment is the actual eval function, which is responsible for running the code.
(** [eval] is responsible for taking the environment, running it
  * to the desired point, and giving back what data is left.
  * The output type should be more carefully considered
  * and have many operations on it.
  *)
val eval : t -> computational_input
The eval function has a tough job, in that it is responsible for taking the information in the t type and properly threading the calls to the passes.
As mentioned in the doc comment above the function, the output type is not fully considered; for this proposal we chose the computational_input type.
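To make eval's job concrete, here is a hedged sketch of the threading it implies. CircularList.toList, the record accessors, and the stopping check are all assumptions, and a real driver must also handle the recursion schema of the circular list:

-- Hypothetical driver loop: run each registered step in order,
-- stopping on failure or when the stopping_step is reached.
runSteps :: Environment -> ComputationalInput
runSteps env = go (CircularList.toList (registeredPipeline env)) (information env)
  where
    go [] input = input
    go (step : rest) input
      | stoppingStep env == Just (Step.nameOfPass step) = input
      | otherwise =
          case Step.func step input of
            ComputationResult.Success {result} ->
              -- thread the new working environment into the next step
              go rest (input {languageData = result})
            ComputationResult.Failure {} ->
              -- a failed pass ends the pipeline, returning what we have
              input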
This ends the core API of the Pipeline. More functions will be needed for introspecting the output type, but these are the core functions needed to properly construct and change the core environment. The full signature of the Pipeline module follows:
module Pipeline : sig
  (** [env_or_sexp] expresses data that may already be added to the environment
    * or may still be in a base sexp form that requires no lookup.
    *)
  type env_or_sexp =
    | InEnv of NameSymbol.t
    | Sexp of Sexp.t

  (** [working_environment] serves as the input data relevant to any
    * pipeline step function. This is what is actively being processed
    * in the pipeline itself.
    *)
  type working_environment = {
    current_expression : env_or_sexp list;
    context : Sexp.t Context.t
  }

  (** [surrounding_environment] serves as the minimum surrounding information
    * in the environment of the Pipeline Steps.
    *)
  type surrounding_environment = {
    (* This will be taken as a read value *)
    (** The [current_step_name] represents the current running step name *)
    current_step_name : NameSymbol.t option;
    meta_information : Meta.t
  }

  (** [computational_input] is the actual data passed into the [Step.t] type.
    * The [language_data] field serves as the direct data a pass wishes to
    * take, while [surrounding_data] serves as extra environment
    * constraints that one wants to include.
    *)
  type computational_input = {
    language_data : working_environment;
    surrounding_data : surrounding_environment;
  }

  module Step : sig
    (** The pipeline step takes the entire [computational_input] as input.
      * This is quite a wide type, in that it has data that a pass does not
      * directly want to handle.
      * Thus it is expected one works with the [Automation] module, to make
      * fulfilling both the [ComputationResult.t] and the input form bearable.
      *)
    (* May promote [working_environment] to [information_record] *)
    type t = computational_input -> working_environment ComputationResult.t

    type named = {
      name : NameSymbol.t;
      func : t
    }

    val name_pass : t -> NameSymbol.t -> named

    val name_of_pass : named -> NameSymbol.t
  end

  module Environment : sig
    type t = {
      (** [information] serves as the public facing data to the environment
        * that passes can access.
        *)
      information : computational_input;
      (* All other information is made to be exclusive to the Pipeline as
       * a whole, which the Steps do not bother with.
       * This consists of information ranging from how to deal with Traces
       * between passes to the entire pass infrastructure itself.
       *)
      (** [registered_pipeline] is the pipeline itself.
        * It features all the steps named and registered with the system. *)
      registered_pipeline : Step.named CircularList.t;
      stopping_step : NameSymbol.t option
    }

    (** [register_step] registers the pipeline function to the environment *)
    val register_step : Step.named CircularList.t -> unit

    (** [def_pipeline_group] creates a named group of pipeline steps or a
      * nested grouping of pipeline steps.
      *)
    val def_pipeline_group :
      NameSymbol.t
      -> Step.named CircularList.t list
      -> Step.named CircularList.t

    (** [stop_at] tells the environment to stop at a particular step when
      * running the environment.
      *)
    val stop_at : NameSymbol.t -> unit

    (** [stop_at_nothing] tells the environment to run the compiler fully. *)
    val stop_at_nothing : unit

    (** [eval] is responsible for taking the environment, running it
      * to the desired point, and giving back what data is left.
      * The output type should be more carefully considered
      * and have many operations on it.
      *)
    val eval : t -> computational_input

    (** [run] serves as the running point; the unit argument is a placeholder
      * for the monadic functions used to build up the environment,
      * and the [t] is the given pre-built environment.
      *)
    val run : unit -> t -> computational_input

    val extract : unit -> t
  end
end
This structure was chosen to maximize the following:
- ease of tooling and introspection around the pipeline, recreating the EasyPipeline notions.
With that said, there may be some questions one may have in regards to the API. One such question: why not a more functional approach to composition? Indeed, the following does not look functional!
eval = Pipeline.extract $ do
  Pipeline.registerStep
    (Pipeline.defPipelineGroup "Desugar"
      [ Desugar.condToIf
      , Desugar.ifToCond
      , ...])
  Pipeline.registerStep
    (Pipeline.defPipelineGroup "Contextify"
      [ Contextify.resolveInfix
      , ...])
  ...
This was chosen over the more functional approach given in the idealized pipeline proposal because it allows us to name each step and stop in the middle of any of them, whereas the more functional approach would leave that structure ambiguous and behave much like a tagless final system, where the constructors fold onto themselves, leaving no room for interpretation of the bare structure.
Further, interest was expressed in programmatically moving passes around to properly see what arrangement makes the most sense. The given registration system, along with an extension to talk about dependencies between passes, would allow this experimentation.
module EasyPipeline where

passNames :: CircularList.T NameSymbol.T
passNames =
  fmap Step.nameOfPass (Pipeline.registeredPipeline Compiler.eval)

prettyPassNames :: IO ()
prettyPassNames = CircularList.prettyPrintList passNames

runToCondWithTrace = do
  Pipeline.stopAt "Desugar.if-to-cond"
  -- Extra functions to trace what we care about
  Trace.enableRecursively ["Pass.cond-to-if"]

λ> Pipeline.run runToCondWithTrace Compiler.eval >>| Pipeline.runTrace
We have seen in the Step section that we have a general, consistent data type for a pipeline step. However, we also discovered that the type is quite wide: it hands each pass data the pass does not directly want to handle, and it requires each pass to fulfill the ComputationResult.t output form by hand.
Thus we propose the Automation module to alleviate these issues.
To simplify matters down to what a pass cares about, we transform the computational_input form below
type env_or_sexp =
  | InEnv of NameSymbol.t
  | Sexp of Sexp.t

type working_environment = {
  current_expression : env_or_sexp list;
  context : Sexp.t Context.t
}

type computational_input = {
  language_data : working_environment;
  surrounding_data : surrounding_environment;
}
into
type pass_argument = {
  current : Pipeline.env_or_sexp;
  context : Sexp.t Context.t
}

type simplified_pass_argument = {
  current : Sexp.t;
  context : Sexp.t Context.t
}
Notice how, instead of working over a Pipeline.env_or_sexp list for the current expressions, we instead worry about a single current expression. We even have a more simplified version, simplified_pass_argument, that removes the InEnv potential of the Pipeline.env_or_sexp. This allows a pass to not worry about whether a Sexp.t is referenced in the environment, and allows all the attention to be on the transformation if the pass simply does not care.
Further, the entire surrounding_environment is dropped from the signature. This information is not completely gone; instead we relegate it to the environment surrounding the pass, rather than making it a direct input to the pass.
The output of the pass also moves away from returning a working_environment ComputationResult.t:
type working_environment = {
  current_expression : env_or_sexp list;
  context : Sexp.t Context.t
}

module ComputationResult : sig
  type 'a t =
    | Success of { meta : Meta.t; result : 'a }
    | Failure of { meta : Meta.t; partial_result : 'a option }
end
into a more streamlined type:
type t =
  | ProcessJob of process_job_no_env
  | UpdateJob of { new_context : Sexp.t Context.t; process : process_job }

type job = t

type stage =
  | Current
  | FromTopToCurrent
  | Eval

type process_job = {
  current : Pipeline.env_or_sexp;
  new_forms : (stage * Pipeline.env_or_sexp) list
}

type process_job_no_env = {
  current : Sexp.t;
  new_forms : (stage * Sexp.t) list
}
What changes is that, instead of giving back a list of current_expressions, we have the idea of a process_job to go along with the new input type of taking a single expression.
This means that any pass which defines new top-level definitions to be consumed adds them to the new_forms rather than to the current_expression list, letting the Automation handle how this commits back to current_expressions. The current expression stays the current expression; thus the pass takes a Pipeline.env_or_sexp and gives back a potentially new Sexp.t, with any new definitions as new_forms.
Further, we introduce the idea of an UpdateJob versus a ProcessJob. Instead of every pass that does not touch the context having to give back the current context, we can denote with a ProcessJob that our pass makes no changes to the context itself. The UpdateJob indicates that we have changed the context, so the changed context is committed back.
This also explains the difference between process_job_no_env and process_job: since a ProcessJob cannot touch the Context, it cannot change any reference to a name, which is what the InEnv case of Pipeline.env_or_sexp may point to.
Lastly, we introduce the concept of staging the newly defined expressions through the stage type. This can take the form of Current, FromTopToCurrent, or Eval.
- Current acts exactly like adding any new expression to the current_expression list.
- FromTopToCurrent acts almost like Current, but first runs all the passes that have previously run, up to the current position, on the newly defined Sexp.t.
- Eval runs the entire compiler on the newly defined Sexp.t, updating the Context in the process before we proceed with the current form.
This is best seen in action in the Examples section; a smaller sketch of both job shapes follows.
With the input and output types out of the way (pass_argument -> job), we can now talk about the surrounding environment that was alluded to earlier.
(* This represents the monad type that we require for a pass *)
module type MonadEff = sig
  type 'a t
  type 'a output_eff

  (* Trace state effect *)
  val get_trace : unit -> Trace.t t
  val set_trace : Trace.t -> unit t

  (* Feedback state effect *)
  val get_feedback : unit -> Feedback.t t
  val set_feedback : Feedback.t -> unit t

  (* This is in all likelihood what is needed to set up the environment,
   * usurping the above effects for actually setting up the pass.
   * We leave the above effects for demonstrative purposes.
   *)
  val has_env : unit -> Pipeline.surrounding_environment t
  val set_env : Pipeline.surrounding_environment -> unit t

  (* Comonad effect *)
  val run : 'a t -> 'a output_eff
end
This OCaml module refers to the following Haskell signatures:

class Functor m => Comonad m where
  run :: m a -> a

type HasTrace m = HasState @"trace" Trace.T m
type HasFeedback m = HasState @"feedback" Feedback.T m
type HasMeta m = (HasFeedback m, HasTrace m)
type HasEnv m =
  (HasMeta m, HasReader @"currentStepName" (Maybe NameSymbol.T) m)
What is important here is that the has_env / HasEnv effect entails all the data that is passed through the surrounding_environment, plus any pass-specific data that may be wanted.
This allows the automation tooling to automatically hook into the environment of the pass, to convert the job into a ComputationResult.t job for any given environment that a pass may care about.
We can see this signature in action with the automation functions of this module.
(* Type class in Haskell that dictates being able to extract to a step *)
module Runable :
  functor (M : MonadEff) -> sig
    type 'a t = 'a M.t

    (** [run] runs the given environment, extracting out a result with meta
      * data over the resulting value.
      *)
    val run : 'a t -> 'a ComputationResult.t M.output_eff

    (** [apply_simplified_pass] serves as a HOF that allows passes to have
      * a more simplified type, namely a function that takes a [pass_argument]
      * to an effectual result over [job] that determines how the pass should
      * be brought together.
      * Note that [Pipeline.computational_input -> Pipeline.working_environment t]
      * is an approximation of the [Step.t] type without the Meta information attached.
      *)
    val apply_simplified_pass :
      (pass_argument -> job t)
      -> Pipeline.computational_input
      -> Pipeline.working_environment t

    (** [run_simplified_pass] simply combines the [run] function with
      * [apply_simplified_pass], to get the output effect, which corresponds to a
      * [Pipeline.Step.t] with an effect attached to it.
      *)
    val run_simplified_pass :
      (pass_argument -> job M.output_eff)
      -> Pipeline.computational_input
      -> Pipeline.working_environment ComputationResult.t M.output_eff

    (** [simplify] allows a pass to ignore the fact that the expression coming
      * in may already be added to the [Context.t], and lets us act as if it
      * were just a normal [Sexp.t] being passed in.
      *)
    val simplify :
      (simplified_pass_argument -> job M.output_eff)
      -> (pass_argument -> job M.output_eff)
  end
This roughly corresponds to the following Haskell signatures:

class HasExtract a m | a -> m where
  extract :: a x -> m (ComputationResult.T x)

data Automation.PassArgument
  = PassArg { context :: Context.T Sexp.T, current :: Sexp.T }

transformOutputType :: Automation.T -> Pipeline.WorkingEnvironment

-- we share the one m monad here for the function
-- because we will extract it after we run the m effect
applySimplifiedPass ::
  (HasTrace m, HasMeta m) =>
  (Automation.PassArgument -> m Automation.T) ->
  -- These form Pipeline.Step.t for some m over the
  -- output, modulo the ComputationResult.t over it
  Pipeline.ComputationalInput ->
  m Pipeline.WorkingEnvironment

runSimplifiedPass ::
  (HasExtract _a m) =>
  (Automation.PassArgument -> m Automation.T) ->
  Pipeline.ComputationalInput ->
  m (ComputationResult.T Pipeline.WorkingEnvironment)
runSimplifiedPass f =
  extract . applySimplifiedPass f
We can see that extract, or run as the OCaml version calls it, takes the environment needed to run the pass, runs it, and gives us back a ComputationResult.t over the output type, inside the output effect.
The HasMeta and HasTrace effects are needed in practice to support this. Note that we also get an output monad type; this is needed because the pass may invoke IO actions, and thus we really get back an output_eff from the pass itself.
The applySimplifiedPass function reflects the underlying purpose of the Automation module. Namely, a pass has become as simple as (Automation.PassArgument -> m Automation.t): we take only what we care about, and return an effect over the job type. The goal of this function then becomes to consume the job / Automation.t values and give back a Pipeline.WorkingEnvironment. Since the module has enough information via the constraints, this is fairly easy to achieve.
This leads into runSimplifiedPass, which simply composes applySimplifiedPass and extract / run. We can see here how these two functions combine to fulfill the signature of Pipeline.Step.t, which we repaste here:

module Step : sig
  type t = computational_input -> working_environment ComputationResult.t
The simplify function allows a pass to not worry about whether its argument has come from the context or not. This is useful for passes which do simple transformations but run after the point at which functions show up in the environment, or for passes before that point, where we know the terms don't show up in the environment.
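A hypothetical sketch of lifting such a pass (constructor and accessor names are assumed, as above):

-- A no-op pass written against the simplified argument: it only ever
-- sees a bare Sexp.T, never an InEnv reference.
noOpSimplified :: Monad m => Automation.SimplifiedPassArgument -> m Automation.T
noOpSimplified simpArg =
  pure $
    Automation.ProcessJob
      Automation.ProcessJobNoEnv
        { current = Automation.current simpArg -- the untouched form
        , newForms = []                        -- no new definitions
        }

-- simplify handles any context lookup, giving back a full pass
noOpFull :: Monad m => Automation.PassArgument -> m Automation.T
noOpFull = Automation.simplify noOpSimplified

For reference, the full signature of the Automation module is as follows.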
(** The job of [Automation] is to make the step function more amenable to
  * writing passes. It is not ergonomic to take extra information one may
  * not care about. *)
module Automation : sig
  type stage =
    | Current
    | FromTopToCurrent
    | Eval

  type process_job = {
    current : Pipeline.env_or_sexp;
    new_forms : (stage * Pipeline.env_or_sexp) list
  }

  type process_job_no_env = {
    current : Sexp.t;
    new_forms : (stage * Sexp.t) list
  }

  type t =
    | ProcessJob of process_job_no_env
    | UpdateJob of { new_context : Sexp.t Context.t; process : process_job }

  type job = t

  (** [pass_argument] wraps the current expression and the context that a
    * pipeline processing function works over into a single record *)
  type pass_argument = {
    current : Pipeline.env_or_sexp;
    context : Sexp.t Context.t
  }

  type simplified_pass_argument = {
    current : Sexp.t;
    context : Sexp.t Context.t
  }

  (* This represents the monad type that we require for a pass *)
  module type MonadEff = sig
    type 'a t
    type 'a output_eff

    (* Trace state effect *)
    val get_trace : unit -> Trace.t t
    val set_trace : Trace.t -> unit t

    (* Feedback state effect *)
    val get_feedback : unit -> Feedback.t t
    val set_feedback : Feedback.t -> unit t

    (* Combining the two effects above *)
    val get_meta : unit -> Meta.t t
    val set_meta : Meta.t -> unit t

    (* This is in all likelihood what is needed to set up the environment,
     * usurping the above effects for actually setting up the pass.
     * We leave the above effects for demonstrative purposes.
     *)
    val has_env : unit -> Pipeline.surrounding_environment t
    val set_env : Pipeline.surrounding_environment -> unit t

    (* Comonad effect *)
    val run : 'a t -> 'a output_eff
  end

  val transform_output_type : job list -> Pipeline.working_environment

  (* Type class in Haskell that dictates being able to extract to a step *)
  module Runable :
    functor (M : MonadEff) -> sig
      type 'a t = 'a M.t

      (** [run] runs the given environment, extracting out a result with meta
        * data over the resulting value.
        *)
      val run : 'a t -> 'a ComputationResult.t M.output_eff

      (** [apply_simplified_pass] serves as a HOF that allows passes to have
        * a more simplified type, namely a function that takes a [pass_argument]
        * to an effectual result over [job] that determines how the pass should
        * be brought together.
        * Note that [Pipeline.computational_input -> Pipeline.working_environment t]
        * is an approximation of the [Step.t] type without the Meta information attached.
        *)
      val apply_simplified_pass :
        (pass_argument -> job t)
        -> Pipeline.computational_input
        -> Pipeline.working_environment t

      (** [run_simplified_pass] simply combines the [run] function with
        * [apply_simplified_pass], to get the output effect, which corresponds
        * to a [Pipeline.Step.t] with an effect attached to it.
        *)
      val run_simplified_pass :
        (pass_argument -> job M.output_eff)
        -> Pipeline.computational_input
        -> Pipeline.working_environment ComputationResult.t M.output_eff

      (** [simplify] allows a pass to ignore the fact that the expression
        * coming in may already be added to the [Context.t], and lets us act
        * as if it were just a normal [Sexp.t] being passed in.
        *)
      val simplify :
        (simplified_pass_argument -> job M.output_eff)
        -> (pass_argument -> job M.output_eff)
    end
end
It would be remiss of us not to include any examples in this demonstration, so let us model a simple pass that separates an ADT sum type declaration from the constructors it may have.
Let our running example be:
;; type foo = Bar int int | Baz int int
(:type foo ()
  (Bar int int)
  (Baz int int))
Let us first try to compile this down to something like:
(:declare-type foo ())
(:constructor Bar foo int int)
(:constructor Baz foo int int)
We can use the new pipeline like so:
deconstructPass :: Pipeline.Step.Named
deconstructPass =
  Step.namePass
    (Automation.runSimplifiedPass deconstructType)
    "Desugar.deconstruct-type"

deconstructType Automation.PassArg {current} =
  Trace.with "Desugar.deconstruct-type" [show current] $
    Sexp.foldSearchPredWithExtra
      (Automation.simplify f)
      (== Structure.typeName)
      current
      |> \Sexp.Extra {data, extra} ->
           Automation.ProcessJob {current = data, newForms = extra}
  where
    f car cdr =
      Trace.with "Desugar.deconstruct-type-pass" [show car, show cdr] $
        case Structure.toType (car Sexp.:> cdr) of
          Just typ ->
            let extraData =
                  (typ ^. body)
                    >>= (\case
                           Structure.Sum {name, arguments} ->
                             Structure.Constructor name (typ ^. name) arguments
                               |> pure
                           _ -> [])
                    >>| \x -> (Automation.Current, Structure.fromConstructor x)
                currentForm =
                  Structure.DeclareType (typ ^. name) (typ ^. arguments)
                    |> Structure.fromDeclareType
             in Sexp.Extra {data = currentForm, extra = extraData}
          Nothing ->
            -- we get back a Failure on the ComputationResult
            throw @"failure" (ComputationResult.InvalidForm (car Sexp.:> cdr))
deconstructPass is the pass itself, which we can directly register. The implementation is held in the function deconstructType. We show in gory detail how precisely this pass works: we search for a certain structure using the normal Sexp functions; once it's found, we run our pass, which we call f.
f now constructs two pieces of data: the current form to return, along with any new forms which are defined from it. We construct the extra forms by checking whether the cases of the body are sum types; if they are, each becomes a constructor. We then take the constructors and say that we will compile them Currently along with the declare-type. Note that if we wanted to compile these ahead of time, we could simply replace Current with Eval; in this case it makes more sense to compile them Currently instead of ahead of time. As for the current form, this gets transformed into a generic declaration with the arguments intact, and given back as the current expression. Finally, FromTopToCurrent uses the current_step_name discussed in the Pipeline section to run the compiler up until the current point and then work with the forms Currently from there.
In 25 lines we have an entire new compiler pass! That may seem like a lot of lines; however, most of them are simply conversions into type-safe structures on which to do the processing, or shifts from one data type to another.
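Registering the pass is then one more entry in the eval definition sketched in the Pipeline section:

-- slotting the new pass into the Desugar group from earlier
eval = Pipeline.extract $ do
  Pipeline.registerStep
    (Pipeline.defPipelineGroup "Desugar"
      [ Desugar.condToIf
      , Desugar.ifToCond
      , deconstructPass
      ])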
In short, the injection passes are a way to register pipeline passes that run on every step: both passes that take a singular env_or_sexp through Automation, and passes that ought to run before and after the Step.t type.
To view the issue of inspiration and the state of this extension, please read
https://linear.app/anoma/issue/JUVIX-939/injection-passes
The signatures laid out to do this work are as follows:
open Pipeline

val Env.registerBeforePass : Automation.pass_argument -> m Automation.job
val Env.registerAfterPass : Automation.pass_argument -> m Automation.job
val Env.registerBeforePassEntireList : computational_input -> m computational_output
val Env.registerAfterPassEntireList : computational_input -> m computational_output
The Env.register*Pass variants will run on individual s-expressions, before or after the pass itself. When one runs Automation.applySimplifiedPass, that function will automatically apply the registered functions.
The Env.register*PassEntireList variants will run before or after each step at the eval level; thus eval will handle running these registered steps before and after each step is called.
The implementation of this requires tweaking where functions are stored, since we want to change surrounding_environment to be the following:
type around = Before | After

type surrounding_environment = {
  (* This will be taken as a read value *)
  (** The [current_step_name] represents the current running step name *)
  current_step_name : NameSymbol.t option;
  meta_information : Meta.t;
  on_single_pass :
    (around * (Automation.pass_argument -> Automation.pass_argument aroundEnvIO)) list
}

type around_env = {
  trace : Trace.t;
  feedback : Feedback.t;
}

type 'a aroundEnvIO = MinIO { run : Sexp.t (IO.t around_env State.t) Except.t }
Since this requires values in Pipeline to have information about Pipeline.Automation, we must remove all data declarations from Pipeline.Automation and put them in Pipeline, re-exporting them in Pipeline.Automation.