# MiniZinc 3.0 Planning
## Optimisation Toolkit
- How to integrate diversity/conflict resolution into new compiler given its current form (Python driver using current MiniZinc)
- Avoid having to write Python drivers
- Fit black box optimisation methods into this framework?
- Could put black box into CBLS?
## High-level planning
### Goals
- Produce larger models faster (large FlatZinc, e.g. linear)
- Modularise the compiler core so that analysis/transformation phases can be plugged in
- Model level
- Diversity
- Online
- Fairness
- Stochastic
- Conflict resolution/softening
- Language of learning
- Instance level
- Globalizer (but transforms model)
- FindMUS
- Incrementality of the interpreter (entrypoints in bytecode rely on global state)
- Maintainability of the compiler
- Good traceability of compilation (paths) and error handling
- Incremental interfacing with solvers
- Consider other kinds of solvers and what information they need
- Documentation
- Of the compiler and interpreter
- And user documentation
- Extensibility so that we can add new language features
TODO: Discuss recompilation, paths, certified translation, trace giving explanation of analysis
How to modify the objective for the incremental interpreter? One option is a function that posts a constraint on a free objective variable.
```
var 1..3: a;
var 1..3: b;
% Introduce a predicate which posts a constraint to set the objective
predicate make_objective() = x = a + b;
var int: x;
solve minimize x;
```
Separation of objective from the model?
How to disable phases of the compiler (e.g. so that FindMUS doesn't just get a `false` model)? Turn off propagation in the interpreter/replace with constraints?
### Architecture
- MiniZinc is the input model format, without data
- DataZinc only includes data for instances
- MicroZinc is a lower level subset of MiniZinc
- The bytecode is an imperative representation of the MicroZinc
- NanoZinc binds constraints to variables
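As a rough illustration of the dialects (the concrete syntax below is made up for illustration, not a spec):
```
% MiniZinc (data-independent model):
var 1..10: x;
constraint abs(x - 3) > 2;

% NanoZinc (illustrative): auxiliary variables carry the constraints that
% define them, so a variable and its definition can be retracted together
var 1..10: x;
var int: y := abs(x - 3);   % y's defining constraint is bound to y
constraint int_gt(y, 2);
```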
#### Overall architecture
```mermaid
flowchart TD;
mzn>MiniZinc model files]--Parsing-->cst["Parse tree (CST)"]
cst-->ast["Abstract syntax tree (AST)"]
ast--"Lowering (syntactic desugaring)"-->hir["High-level intermediate representation"]
subgraph parser [Parsing]
cst
ast
end
hir-->validation[/"Validation (e.g. type checking)"/]
validation-->thir["Typed high-level intermediate representation"]
validation-->ls["Code tools (e.g. language server)"]
thir-->transform[/"Basic transformations"/]
transform-->mir["Middle-level intermediate representation"]
mir-->analysis[/"Code analysis (e.g. mode analysis)"/]
analysis-->rewrite[/"Rewriting"/]
rewrite-->uzn[MicroZinc]
uzn-->codegen[/"Code generation"/]
codegen-->bc[Bytecode]
subgraph compiler [Data independent compilation]
hir
thir
mir
validation
transform
analysis
rewrite
uzn
codegen
bc
end
dzn>Data files]
bc--->interpret
dzn-->interpret
interpret[/Interpretation/]
interpret-->nzn[NanoZinc]
subgraph interpreter [Interpreter]
interpret
nzn
end
nzn-->solver[FlatZinc solver]
solver-->result[Result]
result-->heuristic[Search heuristic]
heuristic-->solution(Solution)
heuristic--->interpret
subgraph solving [Solving]
solver
result
heuristic
solution
end
```
##### Key features
- Compiler frontend is designed with tooling in mind
- Compilation is data independent
- Modular - some parts can be added to the pipeline
- Incremental
- Main compilation step targets MicroZinc
- Code generation step targets a bytecode
- Interpretation targets NanoZinc/FlatZinc
#### Detailed architecture
```mermaid
flowchart TD;
mzn>MiniZinc input files]--"Parsing"-->cst[CST]
cst--"Removal of non-semantic elements"-->ast["AST"]
ast--Include resolution-->includeres{{Are includes files found?}}
includeres--Yes-->includes[Included MiniZinc files]
includeres--No-->error1(Stop)
includes--Parsing-->cst
subgraph syntaxmod [Syntax module]
cst
includeres
includes
ast
end
ast--"Lowering (syntactic desugaring)"---->hir[HIR]
hir--"Scope collection"-->scope[Scope for expressions]
scope-->typecheck
hir-->typecheck
subgraph typecheck [Type checker]
typing[/Type inference/]
nameres[/Name resolution/]
fnres[/Overloading resolution/]
end
typecheck-->typed[Resolved and typed HIR]
typed--Validation-->validate{{Is program valid?}}
validate--Yes-->validHir[Valid HIR]
validate--No-->error2(Stop)
subgraph hirmod [HIR module]
hir
scope
typecheck
typed
validate
validHir
end
validHir--"Lowering"-->thir[THIR]
thir-->rewrite[/Tree rewriting/]
rewrite-->thir
subgraph thirmod [THIR module]
thir
rewrite
end
thir--Pretty printing--->mzno[MiniZinc output for testing]
thir--"Lowering"--->mir[MIR]
mir--"Totality analysis"-->totalMir[Totalised MIR]
totalMir--"Type specialisation"-->spec[Type-specialised MIR]
spec--"Context analysis"-->ctxMir[Context-analysed MIR]
ctxMir--"Optionality analysis"-->optMir["Optionality-analysed MIR"]
optMir--"Inlining, hoisting"-->inlined["Inlined MIR"]
inlined--"CSE analysis"-->cse["CSE analysed MIR"]
cse--Compilation-->uzn
uzn[MicroZinc]
subgraph mirmod [MIR module]
mir
totalMir
ctxMir
optMir
inlined
cse
spec
uzn
end
uzn--Pretty printing-->uzno[MicroZinc output for testing]
uzn--"Type erasure for enums/records"-->erased[Partially type erased MIR]
erased--Code generation-->bc[Bytecode]
bc--Constant folding/propagation-->obc[Optimised bytecode]
subgraph codegenmod [Code generation module]
bc
obc
erased
end
obc-->interpret[/Interpreter/]
dzn>Data files]-->interpret
interpret--Interpretation-->nzn(NanoZinc)
subgraph interpreter [Interpreter]
interpret
end
```
##### Components
- Syntax module (parsing)
- Could allow swapping in other parsers to support other languages
- High-level intermediate representation (HIR) module
- Syntactic desugaring
- Scope/type checking
- Program validation
- Avoids recomputation on modification of source files
- Final stage which tries to continue in the presence of errors
- Typed high-level intermediate representation (THIR) module
- Combines the type-checking results with the HIR
- Provides an API for performing transformations
- Middle-level intermediate representation (MIR) module
- Generates code for e.g. enum constructors/deconstructors
- Involves the bulk of the compilation into MicroZinc
- Used to generate the interpreter bytecode
    - Could also analyse what could be done in parallel in the interpreter
- Interpreter
- Takes the data files and the bytecode to produce NanoZinc
### Milestones
The components needed are
- [X] Parser
- [X] Typechecker
- [X] Pretty printing of MiniZinc (entire program including compiled stdlib)
- [ ] Rewriting into SSA-like form
- [ ] Totalisation
- [ ] Mode analysis (non essential, other than trivial)
- [ ] Inlining, hoisting (non essential)
- [ ] CSE analysis (non essential)
- [ ] Type specialisation
- [X] Specialisation of generics
- [ ] Specialisation of reification
  - [ ] Also need to remove overloading, by adding a preamble to functions which checks whether the arguments are fixed or occur, and dispatches to the more specific function
- [ ] Passing globals as arguments when referred to in a function (could be extended to closures)
- [ ] Type erasure
- [ ] Remove optionality
- [ ] Remove enums
- [ ] Remove records
- [ ] Pretty printing to MicroZinc
- [ ] Generation of bytecode
The essential milestones are
- [X] Spec for MicroZinc (Jip, 2022-08-26)
- [X] Pretty printing of THIR as MiniZinc, running the test suite, etc. (Jason, 2022-09-02)
- [X] Monomorphisation
- [x] Removal of overloading (adding dispatch to par/non-opt versions) (Jason, 2023-03-03)
- [x] Desugaring of var if-then-else, var comprehensions (Jason, 2023-03-03)
- [ ] Decomposing optional types and records (Jason, 2023-03-17)
- [ ] Rewriting into SSA
- [ ] Totalisation
- [ ] Reification specialisation (minimal mode analysis)
- [ ] Pretty printing of MIR
Compiler working 2023-06-30
- [x] API for inserting custom transformations
Next phase is MicroZinc interpreter
### Interpreter
- Define the input format
- Textual, with annotations for expression types
- Read DZN input data
- Enum handling
- Command line handling
- Solver interfaces
- Output processing
## Technical planning
### Language freeze
Possible language features
- [ ] Modules (instead of includes)
- [ ] Nested problems (requires changes to solver interface)
- [ ] Unit types
- [X] Product enum constructors
- [X] Tuples/records
- [X] Type aliases
- [X] Case/pattern matching
- Only allowed for enums?
- [ ] Closures/anonymous functions?
### Analysis phases
#### Type specialisation analysis
- If a function with an enum arg doesn't call any special enum functions it can just be specialised to `int`
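For example (a sketch; the names are made up for illustration):
```
enum SHIFT = {Day, Evening, Night};
% `same_shift` only uses equality and never any enum-specific operation
% (e.g. enum_next, to_enum), so it can be specialised to plain ints:
predicate same_shift(var SHIFT: a, var SHIFT: b) = a = b;
% after specialisation (illustrative):
predicate same_shift_int(var int: a, var int: b) = a = b;
```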
#### Context analysis
- Determines what context is enforced on the arguments to a function with respect to the context of the call
- Does this benefit from doing type specialisation?
- Likely not since you can only call generic functions on generic arguments
- Generates predicates depending on the mode that it's being called in
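A sketch of the idea (names are illustrative): the context of a call determines which variant of the function's internal constraints must be generated.
```
function var int: double_pos(var int: x) =
  let { constraint x >= 0 } in 2 * x;

constraint y = double_pos(x);         % root context: post x >= 0 directly
constraint b <-> (y = double_pos(x)); % mixed context: the internal
                                      % constraint must be reified as well
```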
#### Optionality analysis
- Transforms optional types which are known to occur into non-optional types
- How useful is this in a data-independent setting?
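For instance, when the compiler can prove `occurs(x)`, the optional type can be dropped (sketch):
```
var opt 1..5: x;
constraint occurs(x);
constraint deopt(x) > 2;
% x is known to occur, so this can become:
% var 1..5: x;
% constraint x > 2;
```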
#### Binding analysis?
#### Representation of analysis results
Currently we have several disjoint ways of representing analysis results:
- Annotations
- Attached to nodes in the AST
- Side hashtables/sets
- Transforming calls
Could have a global database with these results?
### Modular compiler phases
- Where can these slot in?
- Can you use completely external phases (de/serialise IR?)
- Some kind of plugin manager?
- Build in an iterative solving loop into the main driver
- Controlled by predicates in the model
- Allow plugins for the compiler phases
- Plugins can be generalised to work with the interpreter and compiler at different entrypoints
### Desugaring / type erasure
Syntactic desugaring (before type-checking)
- Operators into functions
- Indexed array literals into calls
- String interpolation into calls
After type-checking
- Destructuring declarations into individual declarations
- Case into element constraint or if-then-else
- No capturing in functions
- Only returning identifiers from functions
- Rewriting into something more like SSA?
After type-specialisation
- Enums into integers
- Records into tuples
- Optional types into a tuple of an occurs boolean and a deopt value
- No overloading
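A sketch of the optional-type erasure (illustrative; the field order is an assumption):
```
var opt 1..3: x;
constraint occurs(x) -> deopt(x) > 1;
% after erasure: x becomes a tuple of the occurs boolean and the value
tuple(var bool, var 1..3): x;
constraint x.1 -> x.2 > 1;   % absent values leave the value unconstrained
```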
### MicroZinc
TODO: Create proper spec for MicroZinc
### Output
### Data format
- How would data input work with the interpreter? E.g. we need to handle enums/unit types/records even though the types will be erased in the bytecode
- Need to maintain some kind of mapping with enough to do this
- Data format should support JSON for all types
- What should actually be allowed in DZN?
- Magic constants?
- Arbitrary function calls?
- Annotations?
- No assignment to variables, no `_` (use `<>` instead)
### Annotations
- Need to specify how annotations should actually work
### Incrementality
- Can incrementally call predicates
- Ability to add/retract annotations?
- E.g. search annotations
- Where are annotations handled?
### Compiler API
### Testing
### Documentation / RFCs
## Meeting 2023-03-17
### Totalisation
- Introduce a `let` at boolean contexts
- Create tuple of boolean and value
- Use boolean for the `in` of the boolean context `let`
Totalising
```
function int: f(int: x, int: y) = ...;
function int: g(int: x, int: y) = ...;
function int: h(int: x, int: y) = ...;
function int: h2(int: x, int: y) = ...;
function int: g2(int: x, int: y) = ...;
function int: foo(int: x, int: y, int: z) =
f(g(h(x),h2(y)), g2(z));
```
Would result in
```
function tuple(bool, int): f(int: x, int: y) = ...;
function tuple(bool, int): g(int: x, int: y) = ...;
function tuple(bool, int): h(int: x, int: y) = ...;
function tuple(bool, int): h2(int: x, int: y) = ...;
function tuple(bool, int): g2(int: x, int: y) = ...;
function tuple(bool, int): foo(int: x, int: y, int: z) :: promise_total =
let {
    tuple(bool, int): hx = h(x);
tuple(bool, int): h2y = if hx.1 then h2(y) else (false, 0) endif;
tuple(bool, int): gv = if h2y.1 then g(hx.2, h2y.2) else (false, 0) endif;
tuple(bool, int): g2z = if gv.1 then g2(z) else (false, 0) endif;
tuple(bool, int): fv = if g2z.1 then f(gv.2, g2z.2) else (false, 0) endif;
} in fv;
```
For `var`, the resulting boolean would be a `forall` of the partiality booleans, and the if-then-elses work on `fix(b)`
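Applied to `foo` above, the `var` version might look like this (a sketch following the note above, not a fixed scheme):
```
function tuple(var bool, var int): foo(var int: x, var int: y, var int: z)
    :: promise_total =
  let {
    tuple(var bool, var int): hx = h(x);
    tuple(var bool, var int): h2y = if fix(hx.1) then h2(y) else (false, 0) endif;
    tuple(var bool, var int): gv = if fix(h2y.1) then g(hx.2, h2y.2) else (false, 0) endif;
    tuple(var bool, var int): g2z = if fix(gv.1) then g2(z) else (false, 0) endif;
    tuple(var bool, var int): fv = if fix(g2z.1) then f(gv.2, g2z.2) else (false, 0) endif;
    % the overall partiality boolean is the conjunction of all parts
  } in (forall([hx.1, h2y.1, gv.1, g2z.1, fv.1]), fv.2);
```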
#### Partiality due to domains
Domain constraints become part of the partiality boolean
```
enum E = {A, B, C, D};
set of E: PartOfE = {A, C};
function int: foo(PartOfE: x) = 1;
```
Would become
```
function tuple(bool, int): foo(E: x) :: promise_total =
let {
    bool: b = x in PartOfE;
constraint assert_warn(b, "Argument x is \(x), which is outside of set \(PartOfE)");
} in (b, 1);
```
### Mode analysis
```
function var int: weird(var int: x, var int: y) =
let { constraint y >= 0 } in x - y;
```
Totalised:
```
function tuple(var bool, var int): weird(var int: x, var int: y) =
let {
var bool: b = y >= 0;
var int: v = x - y;
} in (b, v);
```
Mode analysis needs to give monotone/antitone/no effect/mixed annotations to each variable with respect to each returned value.
```
function tuple(var bool, var int): weird(
var int: x :: mode_analysis((no_effect, monotone)),
var int: y :: mode_analysis((monotone, antitone))) =
let {
var bool: b = y >= 0;
var int: v = x - y;
} in (b, v);
```
Manually annotate builtins.
## Meeting 2024-01-24
- [ ] Merge totalisation (Jason)
- [ ] Reification specialisation (Jason)
- Design interpreter