# MiniZinc 3.0 Planning
## Optimisation Toolkit
- How to integrate diversity/conflict resolution into new compiler given its current form (Python driver using current MiniZinc)
- Avoid having to write Python drivers
- Fit black box optimisation methods into this framework?
- Could put black box into CBLS?
## High-level planning
### Goals
- Produce larger models faster (large FlatZinc, e.g. linear)
- Modularise the compiler core so that analysis/transformation phases can be plugged in
- Model level
- Diversity
- Online
- Fairness
- Stochastic
- Conflict resolution/softening
- Language of learning
- Instance level
- Globalizer (but transforms model)
- FindMUS
- Incrementality of the interpreter (entrypoints in bytecode rely on global state)
- Maintainability of the compiler
- Good traceability of compilation (paths) and error handling
- Incremental interfacing with solvers
- Consider other kinds of solvers and what information they need
- Documentation
- Of the compiler and interpreter
- And user documentation
- Extensibility so that we can add new language features
TODO: Discuss recompilation, paths, certified translation, trace giving explanation of analysis
How to modify the objective for the incremental interpreter? One option is a function that posts a constraint on a free objective variable.
```
var 1..3: a;
var 1..3: b;
% Introduce a predicate which posts a constraint to set the objective
predicate make_objective() = x = a + b;
var int: x;
solve minimize x;
```
Separation of objective from the model?
How to disable phases of the compiler (e.g. so that FindMUS doesn't just get a `false` model)? Turn off propagation in the interpreter/replace with constraints?
### Architecture
- MiniZinc is the input model format, without data
- DataZinc only includes data for instances
- MicroZinc is a lower level subset of MiniZinc
- The bytecode is an imperative representation of the MicroZinc
- NanoZinc binds constraints to variables
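As a rough illustration of the dialects (the concrete syntax below is made up for illustration, not a spec):
```
% MiniZinc (data-independent model):
var 1..10: x;
constraint abs(x - 3) > 2;

% NanoZinc (illustrative): auxiliary variables carry the constraints that
% define them, so a variable and its definition can be retracted together
var 1..10: x;
var int: y := abs(x - 3);   % y's defining constraint is bound to y
constraint int_gt(y, 2);
```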
#### Overall architecture
```mermaid
flowchart TD;
mzn>MiniZinc model files]--Parsing-->cst["Parse tree (CST)"]
cst-->ast["Abstract syntax tree (AST)"]
ast--"Lowering (syntactic desugaring)"-->hir["High-level intermediate representation"]
subgraph parser [Parsing]
cst
ast
end
hir-->validation[/"Validation (e.g. type checking)"/]
validation-->thir["Typed high-level intermediate representation"]
validation-->ls["Code tools (e.g. language server)"]
thir-->transform[/"Basic transformations"/]
transform-->mir["Middle-level intermediate representation"]
mir-->analysis[/"Code analysis (e.g. mode analysis)"/]
analysis-->rewrite[/"Rewriting"/]
rewrite-->uzn[MicroZinc]
uzn-->codegen[/"Code generation"/]
codegen-->bc[Bytecode]
subgraph compiler [Data independent compilation]
hir
thir
mir
validation
transform
analysis
rewrite
uzn
codegen
bc
end
dzn>Data files]
bc--->interpret
dzn-->interpret
interpret[/Interpretation/]
interpret-->nzn[NanoZinc]
subgraph interpreter [Interpreter]
interpret
nzn
end
nzn-->solver[FlatZinc solver]
solver-->result[Result]
result-->heuristic[Search heuristic]
heuristic-->solution(Solution)
heuristic--->interpret
subgraph solving [Solving]
solver
result
heuristic
solution
end
```
##### Key features
- Compiler frontend is designed with tooling in mind
- Compilation is data independent
- Modular - some parts can be added to the pipeline
- Incremental
- Main compilation step targets MicroZinc
- Code generation step targets a bytecode
- Interpretation targets NanoZinc/FlatZinc
#### Detailed architecture
```mermaid
flowchart TD;
mzn>MiniZinc input files]--"Parsing"-->cst[CST]
cst--"Removal of non-semantic elements"-->ast["AST"]
ast--Include resolution-->includeres{{Are includes files found?}}
includeres--Yes-->includes[Included MiniZinc files]
includeres--No-->error1(Stop)
includes--Parsing-->cst
subgraph syntaxmod [Syntax module]
cst
includeres
includes
ast
end
ast--"Lowering (syntactic desugaring)"---->hir[HIR]
hir--"Scope collection"-->scope[Scope for expressions]
scope-->typecheck
hir-->typecheck
subgraph typecheck [Type checker]
typing[/Type inference/]
nameres[/Name resolution/]
fnres[/Overloading resolution/]
end
typecheck-->typed[Resolved and typed HIR]
typed--Validation-->validate{{Is program valid?}}
validate--Yes-->validHir[Valid HIR]
validate--No-->error2(Stop)
subgraph hirmod [HIR module]
hir
scope
typecheck
typed
validate
validHir
end
validHir--"Lowering"-->thir[THIR]
thir-->rewrite[/Tree rewriting/]
rewrite-->thir
subgraph thirmod [THIR module]
thir
rewrite
end
thir--Pretty printing--->mzno[MiniZinc output for testing]
thir--"Lowering"--->mir[MIR]
mir--"Totality analysis"-->totalMir[Totalised MIR]
totalMir--"Type specialisation"-->spec[Type-specialised MIR]
spec--"Context analysis"-->ctxMir[Context-analysed MIR]
ctxMir--"Optionality analysis"-->optMir["Optionality-analysed MIR"]
optMir--"Inlining, hoisting"-->inlined["Inlined MIR"]
inlined--"CSE analysis"-->cse["CSE analysed MIR"]
cse--Compilation-->uzn
uzn[MicroZinc]
subgraph mirmod [MIR module]
mir
totalMir
ctxMir
optMir
inlined
cse
spec
uzn
end
uzn--Pretty printing-->uzno[MicroZinc output for testing]
uzn--"Type erasure for enums/records"-->erased[Partially type erased MIR]
erased--Code generation-->bc[Bytecode]
bc--Constant folding/propagation-->obc[Optimised bytecode]
subgraph codegenmod [Code generation module]
bc
obc
erased
end
obc-->interpret[/Interpreter/]
dzn>Data files]-->interpret
interpret--Interpretation-->nzn(NanoZinc)
subgraph interpreter [Interpreter]
interpret
end
```
##### Components
- Syntax module (parsing)
- Could allow swapping in other parsers to support other languages
- High-level intermediate representation (HIR) module
- Syntactic desugaring
- Scope/type checking
- Program validation
- Avoids recomputation on modification of source files
- Final stage which tries to continue in the presence of errors
- Typed high-level intermediate representation (THIR) module
- Combines the type-checking results with the HIR
- Provides an API for performing transformations
- Middle-level intermediate representation (MIR) module
- Generates code for e.g. enum constructors/deconstructors
- Involves the bulk of the compilation into MicroZinc
- Used to generate the interpreter bytecode
    - Could also analyse what could be done in parallel in the interpreter
- Interpreter
- Takes the data files and the bytecode to produce NanoZinc
### Milestones
The components needed are
- [X] Parser
- [X] Typechecker
- [X] Pretty printing of MiniZinc (entire program including compiled stdlib)
- [ ] Rewriting into SSA-like form
- [ ] Totalisation
- [ ] Mode analysis (non essential, other than trivial)
- [ ] Inlining, hoisting (non essential)
- [ ] CSE analysis (non essential)
- [ ] Type specialisation
- [X] Specialisation of generics
- [ ] Specialisation of reification
  - [ ] Also need to remove overloading, by adding a preamble to functions which checks whether the arguments are fixed or occur, and dispatches to the more specific function
- [ ] Passing globals as arguments when referred to in a function (could be extended to closures)
- [ ] Type erasure
- [ ] Remove optionality
- [ ] Remove enums
- [ ] Remove records
- [ ] Pretty printing to MicroZinc
- [ ] Generation of bytecode
The essential milestones are
- [X] Spec for MicroZinc (Jip, 2022-08-26)
- [X] Pretty printing of THIR as MiniZinc, running the test suite, etc. (Jason, 2022-09-02)
- [X] Monomorphisation
- [x] Removal of overloading (adding dispatch to par/non-opt versions) (Jason, 2023-03-03)
- [x] Desugaring of var if-then-else, var comprehensions (Jason, 2023-03-03)
- [ ] Decomposing optional types and records (Jason, 2023-03-17)
- [ ] Rewriting into SSA
- [ ] Totalisation
- [ ] Reification specialisation (minimal mode analysis)
- [ ] Pretty printing of MIR
Compiler working 2023-06-30
- [x] API for inserting custom transformations
Next phase is MicroZinc interpreter
### Interpreter
- Define the input format
- Textual, with annotations for expression types
- Read DZN input data
- Enum handling
- Command line handling
- Solver interfaces
- Output processing
## Technical planning
### Language freeze
Possible language features
- [ ] Modules (instead of includes)
- [ ] Nested problems (requires changes to solver interface)
- [ ] Unit types
- [X] Product enum constructors
- [X] Tuples/records
- [X] Type aliases
- [X] Case/pattern matching
- Only allowed for enums?
- [ ] Closures/anonymous functions?
### Analysis phases
#### Type specialisation analysis
- If a function with an enum arg doesn't call any special enum functions it can just be specialised to `int`
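For example (a sketch; the names are made up for illustration):
```
enum SHIFT = {Day, Evening, Night};
% `same_shift` only uses equality and never any enum-specific operation
% (e.g. enum_next, to_enum), so it can be specialised to plain ints:
predicate same_shift(var SHIFT: a, var SHIFT: b) = a = b;
% after specialisation (illustrative):
predicate same_shift_int(var int: a, var int: b) = a = b;
```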
#### Context analysis
- Determines what context is enforced on the arguments to a function with respect to the context of the call
- Does this benefit from doing type specialisation?
- Likely not since you can only call generic functions on generic arguments
- Generates predicates depending on the mode that it's being called in
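A sketch of the idea (names are illustrative): the context of a call determines which variant of the function's internal constraints must be generated.
```
function var int: double_pos(var int: x) =
  let { constraint x >= 0 } in 2 * x;

constraint y = double_pos(x);         % root context: post x >= 0 directly
constraint b <-> (y = double_pos(x)); % mixed context: the internal
                                      % constraint must be reified as well
```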
#### Optionality analysis
- Transforms optional types which are known to occur into non-optional types
- How useful is this in a data-independent setting?
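For instance, when the compiler can prove `occurs(x)`, the optional type can be dropped (sketch):
```
var opt 1..5: x;
constraint occurs(x);
constraint deopt(x) > 2;
% x is known to occur, so this can become:
% var 1..5: x;
% constraint x > 2;
```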
#### Binding analysis?
#### Representation of analysis results
Currently we have several disjoint ways of representing analysis results:
- Annotations
- Attached to nodes in the AST
- Side hashtables/sets
- Transforming calls
Could have a global database with these results?
### Modular compiler phases
- Where can these slot in?
- Can you use completely external phases (de/serialise IR?)
- Some kind of plugin manager?
- Build in an iterative solving loop into the main driver
- Controlled by predicates in the model
- Allow plugins for the compiler phases
- Plugins can be generalised to work with the interpreter and compiler at different entrypoints
### Desugaring / type erasure
Syntactic desugaring (before type-checking)
- Operators into functions
- Indexed array literals into calls
- String interpolation into calls
After type-checking
- Destructuring declarations into individual declarations
- Case into element constraint or if-then-else
- No capturing in functions
- Only returning identifiers from functions
- Rewriting into something more like SSA?
After type-specialisation
- Enums into integers
- Records into tuples
- Optional types into a tuple of an occurs boolean and a deopt value
- No overloading
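A sketch of the optional-type erasure (illustrative; the field order is an assumption):
```
var opt 1..3: x;
constraint occurs(x) -> deopt(x) > 1;
% after erasure: x becomes a tuple of the occurs boolean and the value
tuple(var bool, var 1..3): x;
constraint x.1 -> x.2 > 1;   % absent values leave the value unconstrained
```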
### MicroZinc
TODO: Create proper spec for MicroZinc
### Output
### Data format
- How would data input work with the interpreter? E.g. we need to handle enums/unit types/records even though the types will be erased in the bytecode
- Need to maintain some kind of mapping with enough to do this
- Data format should support JSON for all types
- What should actually be allowed in DZN?
- Magic constants?
- Arbitrary function calls?
- Annotations?
- No assignment to variables, no `_` (use `<>` instead)
### Annotations
- Need to specify how annotations should actually work
### Incrementality
- Can incrementally call predicates
- Ability to add/retract annotations?
- E.g. search annotations
- Where are annotations handled?
### Compiler API
### Testing
### Documentation / RFCs
## Meeting 2023-03-17
### Totalisation
- Introduce a `let` at boolean contexts
- Create tuple of boolean and value
- Use boolean for the `in` of the boolean context `let`
Totalising
```
function int: f(int: x, int: y) = ...;
function int: g(int: x, int: y) = ...;
function int: h(int: x, int: y) = ...;
function int: h2(int: x, int: y) = ...;
function int: g2(int: x, int: y) = ...;
function int: foo(int: x, int: y, int: z) =
f(g(h(x),h2(y)), g2(z));
```
Would result in
```
function tuple(bool, int): f(int: x, int: y) = ...;
function tuple(bool, int): g(int: x, int: y) = ...;
function tuple(bool, int): h(int: x, int: y) = ...;
function tuple(bool, int): h2(int: x, int: y) = ...;
function tuple(bool, int): g2(int: x, int: y) = ...;
function tuple(bool, int): foo(int: x, int: y, int: z) :: promise_total =
let {
    tuple(bool, int): hx = h(x);
tuple(bool, int): h2y = if hx.1 then h2(y) else (false, 0) endif;
tuple(bool, int): gv = if h2y.1 then g(hx.2, h2y.2) else (false, 0) endif;
tuple(bool, int): g2z = if gv.1 then g2(z) else (false, 0) endif;
tuple(bool, int): fv = if g2z.1 then f(gv.2, g2z.2) else (false, 0) endif;
} in fv;
```
For `var`, the resulting boolean would be a `forall` of the partiality booleans, and the if-then-elses work on `fix(b)`
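Applied to `foo` above, the `var` version might look like this (a sketch following the note above, not a fixed scheme):
```
function tuple(var bool, var int): foo(var int: x, var int: y, var int: z)
    :: promise_total =
  let {
    tuple(var bool, var int): hx = h(x);
    tuple(var bool, var int): h2y = if fix(hx.1) then h2(y) else (false, 0) endif;
    tuple(var bool, var int): gv = if fix(h2y.1) then g(hx.2, h2y.2) else (false, 0) endif;
    tuple(var bool, var int): g2z = if fix(gv.1) then g2(z) else (false, 0) endif;
    tuple(var bool, var int): fv = if fix(g2z.1) then f(gv.2, g2z.2) else (false, 0) endif;
    % the overall partiality boolean is the conjunction of all parts
  } in (forall([hx.1, h2y.1, gv.1, g2z.1, fv.1]), fv.2);
```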
#### Partiality due to domains
Domain constraints become part of the partiality boolean
```
enum E = {A, B, C, D};
set of E: PartOfE = {A, C};
function int: foo(PartOfE: x) = 1;
```
Would become
```
function tuple(bool, int): foo(E: x) :: promise_total =
let {
    bool: b = x in PartOfE;
constraint assert_warn(b, "Argument x is \(x), which is outside of set \(PartOfE)");
} in (b, 1);
```
### Mode analysis
```
function var int: weird(var int: x, var int: y) =
let { constraint y >= 0 } in x - y;
```
Totalised:
```
function tuple(var bool, var int): weird(var int: x, var int: y) =
let {
var bool: b = y >= 0;
var int: v = x - y;
} in (b, v);
```
Mode analysis needs to give monotone/antitone/no effect/mixed annotations to each variable with respect to each returned value.
```
function tuple(var bool, var int): weird(
var int: x :: mode_analysis((no_effect, monotone)),
var int: y :: mode_analysis((monotone, antitone))) =
let {
var bool: b = y >= 0;
var int: v = x - y;
} in (b, v);
```
Manually annotate builtins.
## Meeting 2024-01-24
- [ ] Merge totalisation (Jason)
- [ ] Reification specialisation (Jason)
- Design interpreter