# Constraint evaluation AST
## Final format
Below is a description of the format for describing the constraint evaluation AST. The format is JSON-based. All field elements are represented in their canonical values encoded as strings.
At the root level, the AST is defined by the five components described below:
```json
{
metadata: "<field, trace, and variable definitions>",
zerofiers: "<an array denominators for constraint expressions>",
periodic: "<an array of periodic column definitions>",
expressions: "<an array of expressions defined by numerators/denominators>",
nodes: "<an array of all nodes in the AST>",
}
```
Each of these components is described below.
### Metadata
The metadata component describes the parameters of the field, the number of variables and trace segments to be provided as inputs for constraint evaluation.
Variables and trace segment values are specified as base field elements, but they can be interpreted as extension field elements during evaluation. Trace segments are expected to be 2-dimensional matrices in the **row-major** form. All trace segments are expected to have the same number of rows.
Here is an example of the metadata where:
- We use the Goldilocks field (with degree 2 extension).
- There are 2 variable groups with 5, 10, and 16 variables respectively.
- There are 2 trace segments with 64 and 14 columns respectively.
```json
{
field: {
name: "Goldilocks",
modulus: "18446744069414584321",
root_of_unity: "7277203076849721926",
coset_offset: "7",
extension: {
degree: 2,
polynom: "x^2 - x + 2",
}
},
num_variables: [5, 10, 16],
trace_widths: [64, 14],
}
```
For the `M31` field, the field parameters are specified analogously as follows:
```json
{
field: {
name: "M31",
modulus: "2147483647",
root_of_unity: "(311014874, 1584694829)",
extension: {
degree: 4,
polynom: "x^2 - 2 - i",
}
},
num_variables: [5, 10, 16],
trace_widths: [64, 14],
}
```
The main differences from the `Goldilocks` field format are:
- `root_of_unit` is specified by a tuple of base field elements.
- The `coset_offset` is implied by the `root_of_unity` and therefore is omitted.
- The irreducible polynomial of the degree 4 extension field is specified via the the irreducible polynomial of degree 2 extension field which is $x^2 + 1$. Thus, $i$ is the root of the $x^2 + 1$.
### Zerofiers
Zerofiers define the denominators used for each expression. Each zerofier is specified by an algebraic expression in a string format. For example, below is an array of 5 zerofiers:
```json
[
"x - 1", // first row
"x - g^(n - 1)", // last row
"x^n - 1", // every row
"(x^n - 1) / (x - g^(n - 1))", // every row except for the last row
"x^(n/2) - 1", // every odd row
]
```
Algebraic expressions can involve the following:
- Basic arithmetic operations: addition, multiplication, subtraction, division, and exponentiation.
- Ability to group expression using parenthesis.
- Ability to use constants, which are always in the base field.
- Ability to use 3 special variables:
- $n$ - the size of the multiplicative subgroup (i.e., the length of the trace).
- $g$ - the generator of the multiplicative subgroup of size $n$.
- $x$ - computed as $g^i$ where $i$ is the index of the row in the trace.
Note that $g$ and $x$ cannot be used in the exponent position.
For the `Goldilocks` field, these expressions can be evaluated directly to compute zerofier values at a given trace row. For the `M31` field, they will need to be translated into their equivalent for Circle STARK.
### Periodic columns
A list of periodic columns defines trace columns which have succinct descriptions. For example, below we describe two periodic columns: one with a period of 4 values, and the other one with a period of 8 values.
```json
[
["1", "0", "0", "0"],
["0", "0", "0", "0", "0", "0", "0", "0", "1"],
]
```
Values for periodic columns are always base field elements, and the period length is always a power of $2$ (usually a small power - e.g., 4, 8, 16).
### Expressions
An expression consists of a numerator and an optional denominator:
```json
{
node_id: 10, // node ID of the numerator entrypoint
zerofier_id: 1, // ID of the zerofier; can be omitted if no zerofier
}
```
The list of expressions defines the output of the evaluation. For example, if the list contains 2 expressions, evaluation results will be a matrix with 2 column and $n$ rows where $n$ is the size of the extended domain (i.e., the number of rows in a trace segment).
### Nodes
Each node in the `nodes` array has the following general form:
```json
{
name: "<optional name>",
type: "const | mul | add | sub | trace | var | periodic",
args: "dpends on type",
value: "base | ext",
}
```
Each node type is described in more detail in the sections below.
#### const
The `const` node specifies a constant in the base field.
```json
{
type: "const",
args: {
value: "123456"
},
value: "base", // constants are always in the base field
}
```
#### add, mul, sub
The `mul`, `add`, and `sub` nodes specify binary operations.
```json
{
type: "mul",
args: {
lhs: 10, // node ID of the left-hand-side operand
rhs: 11, // node ID of the right-hand-side operand
},
value: "ext",
}
```
The `value` field must be derived deterministically from the values of the node's children as follows: if the value of either of the children is `ext`, the node's value should also be `ext`. Otherwise, the node's value is `base`.
#### trace
The `trace` node is a relative reference to a value in a trace matrix. It can be specified using a `segment`, `col_offset`, and `row_offset`:
```json
{
type: "trace",
args: {
segment: 0,
col_offset: 1,
row_offset: 0,
},
value: "base",
}
```
The above refers to a base field element located at `traces[0][i][1]`, where $i$ is the current row index.
Trace values can also refer to extension field elements as follows:
```json
{
type: "trace",
args: {
segment: 0,
col_offset: 1,
row_offset: 0,
},
value: "ext",
}
```
The above refers to an extension field element located at `trace[0][i][1..3]`, where $i$ is the current row index.
The `offset` parameter is a positive integer defining the row offset in relation to the current row. For example:
```json
{
type: "trace",
args: {
segment: 0,
col_offset: 1,
row_offset: 1,
},
value: "base",
}
```
Refers to a base field element located at `traces[0][i+1][1]`.
#### var
The `var` node is reference to a value in the `variables` vectors. It can be specified using a `group` and an `offset`:
```json
{
type: "var",
args: {
group: 0,
offset: 1,
},
value: "base",
}
```
The above refers to a single base field element located in `variables[0][1]`.
Variables can also refer to extension field elements as follows:
```json
{
type: "var",
args: {
group: 0,
offset: 1,
},
value: "ext",
}
```
The above refers to an extension field element (defined as 2 base field elements) located in `variables[0][1..3]`.
#### periodic
The `periodic` node is a reference to a value of a periodic column.
```json
{
type: "periodic",
args: {
column: 0,
},
value: "base", // periodic column values are always in the base field
}
```
## Ideation
```rust
pub fn evaluate(
expressions: Expressions,
base_trace: Vec<Matrix<Felt>>,
base_variables: Vec<Vec<Felt>>,
extension_trace: Vec<Matrix<ExtFelt>>,
extension_variables: Vec<Vec<ExtFelt>>,
) -> Matrix
```
```rust
let evaluations = fabric::evaluate(
keccak_kernel_id,
keccak_trace,
variables
);
```
### Potential data structure
```rust
pub struct Expressions {
nodes: Vec<Node>,
roots: Vec<ExpressionRoot>,
};
pub struct Node {
operation: Operation,
result_type: ResultType,
}
pub enum ResultType {
BaseField,
ExtensionField,
}
pub enum Operation {
Value(Value),
Add(usize, usize),
Sub(usize, usize),
Mul(usize, usize),
}
pub enum Value {
Constant(u64),
BaseTrace(TraceRef),
ExtensionTrace(TraceRef)
BaseVariable(VarRef),
ExtensionVariable(VarRef),
}
pub struct TraceRef {
pub stage_idx: usize,
pub column_idx: usize,
pub row_offset: usize,
}
pub struct VarRef {
pub group_idx: usize,
pub var_idx: usize,
}
pub struct ExpressionRoot {
node_idx: usize,
divisor: ExpressionDivisor,
}
pub enum ExpressionDivisor {
None,
FirstRow,
LastRow,
EveryRow,
EveryFrame(usize),
}
```
### Open questions
- How to handle extension field operations?
- All extension field operations are expanded in the AST.
- Extension field operations are inferred from the operands.
- AST explicilty defineds extension field operations.
- ~~Do we support `copy` expressions?~~
- Do we include expression divisors into the calculator?
- Do we want to handle some "stages" separately from others? For example, pre-processed polynomials, periodic columns etc.
- Do we treat trace as several matrixes or as a single "merged" matrix?
### Examples
```
(a + b) + c * (a + b)
nodes = [
0: a,
1: b,
2: add(0, 1),
3: c,
4: mul(3, 2),
5: add(2, 4),
]
```
## Format v1
Below is a description of the draft format for describing the AST. The format is JSON-based, but this can be changed if needed.
At the root level, the AST is defined by the five components described below:
```json
{
metadata: "<field, trace, and variable definitions>",
zerofiers: "<an array denominators for constraint expressions>",
periodic: "<an array of periodic column definitions>",
expressions: "<an array of expressions defined by numerators/denominators>",
nodes: "<an array of all nodes in the AST>",
}
```
Each of these components is described below.
### Metadata
The metadata component describes the parameters of the field used, the number of variables to be provided by field type, and the number of trace segments to be provided by field type.
Here is an example of the metadata where:
- We use the Goldilocks field (with degree 2 extension).
- There are 2 variable groups in the base field. The first group will have 5 variables, and the second group will have 10 variables.
- There are 2 trace segments: one segment in the base field with 64 columns, and one segment in the extension field with 7 columns.
```json
{
field: {
name: "Goldilocks",
modulus: "18446744069414584321",
root_of_unity: "7277203076849721926",
coset_offset: "7",
extension: {
degree: 2,
polynom: ["1", "-1", "2"], // encodes "x^2 - x + 2",
}
},
num_variables: {
base: [5, 10],
extension: [8],
},
trace_widths: {
base: [64],
extension: [7],
}
}
```
### Zerofiers
Zerofiers define the denominators used for each expression. Each zerofier is specified by an algebraic expression in a string format. For example, below is an array of 5 zerofiers:
```json
[
"x - 1", // first row
"x - g^(n - 1)", // last row
"x^n - 1", // every row
"(x^n - 1) / (x - g^(n - 1))", // every row except for the last row
"x^(n/2) - 1", // every odd row
]
```
Algebraic expressions can involve the following:
- Basic arithmetic operations: addition, multiplication, subtraction, division, and exponentiation.
- Ability to group expression using parenthesis.
- Ability to use constants, which are always in the base field.
- Ability to use 3 special variables:
- $n$ - the size of the multiplicative subgroup (i.e., the length of the trace).
- $g$ - the generator of the multiplicative subgroup of size $n$.
- $x$ - computed as $g^i$ where $i$ is the index of the row in the trace.
Note that $g$ and $x$ cannot be used in the exponent position.
### Periodic columns
A list of periodic columns defines trace columns which have succinct descriptions. For example, below we describe two periodic columns: one with a period of 4 values, and the other one with a period of 8 values.
```json
[
["1", "0", "0", "0"],
["0", "0", "0", "0", "0", "0", "0", "0", "1"],
]
```
Values for periodic columns are always base field elements, and the period length is always a power of $2$ (usually a small power - e.g., 4, 8, 16).
### Expressions
An expression consists of a numerator and an optional denominator:
```json
{
node_id: 10, // node ID of the numerator entrypoint
zerofier_id: 1, // ID of the zerofier; can be omitted if no zerofier
}
```
### Nodes
Each node in the `nodes` array has the following general form:
```json
{
name: "<optional name>",
type: "const | mul | add | sub | trace | var | periodic",
args: "dpends on type",
value: "base | ext",
}
```
Each node type is described in more detail in the sections below.
#### const
The `const` node specifies a constant in the base field.
```json
{
type: "const",
args: {
value: "123456"
},
value: "base", // constants are always in the base field
}
```
#### add, mul, sub
The `mul`, `add`, and `sub` nodes specify binary operations.
```json
{
type: "mul",
args: {
lhs: 10, // node ID of the left-hand-side operand
rhs: 11, // node ID of the right-hand-side operand
},
value: "ext",
}
```
The `value` field must be derived deterministically from the values of the node's children as follows: if the value of either of the children is `ext`, the node's value should also be `ext`. Otherwise, the node's value is `base`.
#### trace
The `trace` node is a relative reference to a cell in a trace matrix (either base or extension trace based on the `value` field).
```json
{
type: "trace",
args: {
segment: 0,
column: 1,
row_offset: 0,
},
value: "base",
}
```
#### var
The `var` node is reference to a cell in a variables vector (either base or extension vector based on the `value` field).
```json
{
type: "var",
args: {
group: 0,
variable: 1,
},
value: "ext",
}
```
#### periodic
The `periodic` node is a reference to a value of a periodic column.
```json
{
type: "periodic",
args: {
column: 0,
},
value: "base", // periodic column values are always in the base field
}
```
### Future format enhancements
The format currently does not include:
- Ability to specify constant trace segments.
## Format v2
This format is similar to the v1 format except trace matrices and variables are defined in terms of base field elements.
For example, a `metadata` section equivalent to the previous example would look as follows:
```json
{
field: {
name: "Goldilocks",
modulus: "18446744069414584321",
root_of_unity: "7277203076849721926",
coset_offset: "7",
extension: {
degree: 2,
polynom: ["1", "-1", "2"], // encodes "x^2 - x + 2",
}
},
num_variables: [5, 10, 16],
trace_widths: [64, 14],
}
```
Notice that $8$ extension field variables are now expressed as $16$ base field variables. Similarly, extension trace of $7$ columns is now expressed as a segment of $14$ base field columns. This is because the degree of the extension field is $2$.
`trace` and `var` nodes can still refer to extension field elements, but the references are based on base element offsets. For example:
```json
{
type: "var",
args: {
group: 2,
offset: 1,
},
value: "base",
}
```
Refers to a single base field element located in `variables[2][1]`. However,
```json
{
type: "var",
args: {
group: 2,
offset: 1,
},
value: "ext",
}
```
Refers to an extension field element (defined as 2 base field elements) located in `variables[2][1..3]`.
Similarly,
```json
{
type: "trace",
args: {
segment: 1,
col_offset: 2,
row_offset: 0,
},
value: "base",
}
```
Refers to a base field element located at `traces[1][i][2]`, where $i$ is the current row index. while:
```json
{
type: "trace",
args: {
segment: 1,
col_offset: 2,
row_offset: 0,
},
value: "ext",
}
```
Refers to an extension field element located at `trace[1][i][2..4]`, where $i$ is the current row index.
> [!NOTE]
> This format assumes that trace matrices are always in the row-major form.