# JAM Fuzzer
The JAM Fuzzer enhances the authoring test harness by integrating fuzzing
capabilities for comprehensive testing.
- Generates pseudo-random, valid blocks using the PolkaJam authoring engine.
- Provides the option to inject protocol errors, including both semantic and syntactic errors.
- Ensures fully reproducible execution by utilizing a known startup *seed*.
To avoid repetitive use of the phrase "pseudo-randomly generated data," terms
such as "some," "a," or similar expressions may be used as synonyms of "random"
throughout this document.
---
## Configuration
### Main Settings
- **`config`** *(default: `none`)*
  Specifies a configuration file to initialize the fuzzer.
  - Fully supports all CLI commands.
  - CLI options take precedence over options defined in the configuration file.
- **`verbose`** *(default: `0`)*
  Sets the verbosity level of logs:
  - **0**: Basic fuzzer logs.
  - **1**: Detailed fuzzer logs.
  - **2**: Backend logs from PolkaJam components (e.g. PVM, Safrole), dimmed.
  - **3**: On-chain statistics and other auxiliary information.
- **`seed`** *(default: `random`)*
  Specifies a 32-octet string, or a passphrase from which the 32-octet seed is derived.
  This seed initializes a reproducible, standard seedable RNG instance at
  startup. The same RNG is used throughout the application's lifetime to
  generate all pseudo-random data. The seed also contributes to genesis entropy,
  immediately influencing the genesis state.
- **`source`** *(default: `engine`)*
  Defines the source used for execution logic:
  - **`engine`**: Uses the internal block authoring engine via PolkaJam author.
  - **`trace`**: Imports pre-constructed execution traces (e.g., third-party traces).
    *(Note: `trace` mode disables fuzzing and only validates the post-execution state.)*
- **`trace-dir`** *(default: `none`)*
  Specifies a file system folder for persistent execution traces, useful for
  producing or processing test vectors or for offline investigation.
  This folder is used as:
  - output when `source=engine`
  - input when `source=trace`
- **`fault-injection`**
  TBD
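
The `seed` behavior described above can be sketched as follows. This is only an illustration: hashing a passphrase with SHA-256, accepting raw seeds as 64 hex characters, and the helper names `derive_seed` and `make_rng` are assumptions, not the fuzzer's actual derivation.

```python
import hashlib
import random


def derive_seed(value: str) -> bytes:
    """Return 32 octets from either a 64-char hex string or a passphrase."""
    try:
        raw = bytes.fromhex(value)
        if len(raw) == 32:
            return raw  # already a 32-octet seed
    except ValueError:
        pass  # not hex: treat the value as a passphrase
    # Assumption: passphrases are hashed with SHA-256 to obtain 32 octets.
    return hashlib.sha256(value.encode("utf-8")).digest()


def make_rng(seed: bytes) -> random.Random:
    """Create the single RNG instance used for all pseudo-random data."""
    return random.Random(seed)


seed = derive_seed("correct horse battery staple")
rng_a, rng_b = make_rng(seed), make_rng(seed)
# Same seed, same sequence: this is what makes a run reproducible.
draws_a = [rng_a.randrange(1000) for _ in range(5)]
draws_b = [rng_b.randrange(1000) for _ in range(5)]
```

Two RNG instances built from the same seed yield identical draws, so a failing run can be replayed from its seed alone.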
### Additional Configuration Options
- **`slot-period`** *(default: `1000`)*
  Defines the floor value for the desired slot period.
  Note that extended processing time for certain slow operations may result in
  longer periods.
- **`slots`** *(default: `u32::MAX`)*
  The number of blocks to produce or import before stopping execution.
  Execution may stop earlier on error.
- **`safrole`** *(default: `true`)*
  Enables Safrole authoring. Setting it to `false` disables Safrole authoring and improves fuzzing speed.
- **`max-work-items`** *(default: `16`)*
  Specifies the maximum number of work items per package.
- **`max-service-keys`** *(default: `100`)*
  Maximum number of keys stored for a service.
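
The precedence rule stated for `config` (CLI options override the configuration file, which overrides the defaults) can be sketched as below. The JSON file format and the `resolve_options` helper are assumptions made for illustration; only the option names and default values come from this document.

```python
import json
from typing import Optional

# Defaults as listed above; `seed` defaults to a random value, modelled as None.
DEFAULTS = {
    "verbose": 0,
    "seed": None,
    "source": "engine",
    "trace-dir": None,
    "slot-period": 1000,
    "slots": 2**32 - 1,  # u32::MAX
    "safrole": True,
    "max-work-items": 16,
    "max-service-keys": 100,
}


def resolve_options(file_text: Optional[str], cli: dict) -> dict:
    """Merge defaults, configuration file, and CLI options (CLI wins)."""
    options = dict(DEFAULTS)
    if file_text is not None:
        # Assumption: the configuration file is a flat JSON object.
        options.update(json.loads(file_text))
    # Only CLI options that were actually given override the file.
    options.update({k: v for k, v in cli.items() if v is not None})
    unknown = set(options) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    return options


file_text = '{"verbose": 1, "slots": 500, "safrole": false}'
opts = resolve_options(file_text, {"verbose": 3, "slots": None})
```

Here `verbose` ends up as `3` (from the CLI), `slots` as `500` (from the file, since the CLI left it unset), and every other option keeps its default.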
---
## Extrinsics
At a high level, the extrinsic content of blocks is randomized to simulate
various scenarios.
Extrinsics are constructed by retrieving information from the authoring
engine's tables, which are normally populated with messages received off-chain.
Therefore, within the fuzzer the randomization primarily applies to how
these tables are populated and how block authors select items during block
construction.
Since blocks are constructed through a (presumably 🤔) correct built-in
authoring engine, faults are thus injected after the block construction process
(see the [Faults Injection](#faults-injection) section for more details).
### Tickets
During each epoch, validators' tickets are generated and retained within the
fuzzer. These tickets are then fetched and supplied with pseudo-randomized
characteristics, such as:
- Delay in delivery,
- Quantity, and
- Picking order.
**Fault Injection:**
- Introduce bad or malformed tickets.
- More TBD.
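
As a rough sketch of the ticket randomization described above, the following draws the quantity, picking order, and delivery delay from the shared RNG. The function name `schedule_tickets`, the `max_delay` bound, and the string ticket representation are hypothetical.

```python
import random


def schedule_tickets(tickets, rng, max_delay=5):
    """Pick a random subset of tickets, in random order, each with a delay.

    Returns a list of (delivery_delay_in_slots, ticket) pairs.
    """
    quantity = rng.randint(0, len(tickets))  # how many tickets to deliver
    picked = rng.sample(tickets, quantity)   # random subset in random order
    return [(rng.randint(0, max_delay), ticket) for ticket in picked]


tickets = [f"ticket-{i}" for i in range(8)]
plan_a = schedule_tickets(tickets, random.Random(42))
plan_b = schedule_tickets(tickets, random.Random(42))
```

Because every choice is drawn from the seeded RNG, the same seed always produces the same delivery plan.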
### Preimages
When a service solicits a preimage, the fuzzer intercepts the request and
provides the preimage with a delay. Occasionally, the fuzzer may ignore
these requests.
Services may also request to forget or look up a preimage.
The order and method of these operations are likewise chosen pseudo-randomly.
Bootstrap service related `Instruction`s: `Solicit`, `Forget`, `Lookup`.
**Fault Injection:**
- More TBD.
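
The delayed, occasionally ignored delivery of solicited preimages could look like the sketch below. The `PreimageQueue` class, its parameters, and the slot-based delay are assumptions for illustration, not the fuzzer's actual data structures.

```python
import random


class PreimageQueue:
    """Tracks solicited preimages and releases them after a random delay."""

    def __init__(self, rng, max_delay=4, ignore_probability=0.1):
        self.rng = rng
        self.max_delay = max_delay
        self.ignore_probability = ignore_probability
        self.pending = []  # list of (due_slot, preimage_hash)

    def solicit(self, slot, preimage_hash):
        """Intercept a solicit request; return False if it is ignored."""
        if self.rng.random() < self.ignore_probability:
            return False
        due = slot + self.rng.randint(1, self.max_delay)
        self.pending.append((due, preimage_hash))
        return True

    def due(self, slot):
        """Pop and return the hashes whose delay has elapsed at `slot`."""
        ready = [h for d, h in self.pending if d <= slot]
        self.pending = [(d, h) for d, h in self.pending if d > slot]
        return ready


queue = PreimageQueue(random.Random(7), ignore_probability=0.0)
accepted = queue.solicit(slot=10, preimage_hash="h1")
early = queue.due(10)  # nothing yet: the minimum delay is one slot
late = queue.due(14)   # slot 10 + max_delay, so the preimage is due by now
```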
### Guarantees
The fuzzer provides guarantees that trigger accumulation for the following
services (see the [Services](#services) section for details):
- `Bootstrap`
- `Fuzzy`
Some cores may remain idle. Cores with new work reports (WR) are assigned various
work items from one of the two services, with the choice made pseudo-randomly.
The target functionality for each work item is selected pseudo-randomly.
**Fault Injection:**
- Assign the same WR to multiple cores.
- Assign multiple WRs to the same core.
- More TBD.
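
The sketch below illustrates one way to model the core assignment and the first fault listed above (the same WR on multiple cores). The function names, the `idle_probability` parameter, and the dictionary representation are hypothetical.

```python
import random


def assign_reports(cores, reports, rng, idle_probability=0.25):
    """Map each non-idle core to at most one distinct work report."""
    available = list(reports)
    rng.shuffle(available)
    assignment = {}
    for core in range(cores):
        if not available or rng.random() < idle_probability:
            continue  # this core stays idle
        assignment[core] = available.pop()
    return assignment


def inject_duplicate_report(assignment, cores, rng):
    """Fault: copy an already assigned work report onto a second core."""
    faulty = dict(assignment)
    if not faulty:
        return faulty  # nothing to duplicate
    source = rng.choice(sorted(faulty))
    others = [c for c in range(cores) if c != source]
    if others:
        faulty[rng.choice(others)] = faulty[source]
    return faulty


rng = random.Random(1)
valid = assign_reports(4, ["wr-a", "wr-b", "wr-c"], rng, idle_probability=0.0)
faulty = inject_duplicate_report(valid, 4, rng)
```

In `valid` every work report appears on at most one core, while `faulty` carries one report on two cores and should be rejected on import.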
### Assurances
Outstanding work reports may eventually be assured, at which point their
accumulation function is executed.
**Fault Injection:**
- Bad assurances.
- More TBD.
### Disputes
**TODO**
---
## Services
The fuzzer includes services designed to exercise and expose potential flaws in
host calls and PVM. These services offer diverse behaviors, ranging from well-behaved
operations to malicious activities aimed at triggering faults.
Service actions are invoked based entirely on the internal RNG state. For
example, up to `max-work-items` can be executed per package, targeting multiple
services or multiple functionalities of the same service without any correlation
between the actions.
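
A minimal sketch of this uncorrelated selection is shown below. The Bootstrap operation names are taken from the Bootstrap section; the Fuzzy operation names and the `build_package` helper are hypothetical labels introduced only for this example.

```python
import random

# Bootstrap names appear in this document; the Fuzzy names are hypothetical
# labels for the scenarios that service is described as simulating.
OPERATIONS = {
    "Bootstrap": ["SolicitImage", "InsertItems", "RemoveItems"],
    "Fuzzy": ["MemoryViolation", "WriteReadOnly", "GasExhaustion"],
}


def build_package(rng, max_work_items=16):
    """Return 1..max_work_items independent (service, operation) pairs."""
    count = rng.randint(1, max_work_items)
    items = []
    for _ in range(count):
        # Each item is drawn independently of the previous ones.
        service = rng.choice(sorted(OPERATIONS))
        items.append((service, rng.choice(OPERATIONS[service])))
    return items


package = build_package(random.Random(2024))
```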
### **Bootstrap**
The bootstrap service contains the majority of the instructions required to
exercise host calls. It is a "well-behaved" service, meaning it is expected to
operate correctly and avoid malicious behavior, such as attempting to exploit
unexpected code paths.
#### Example Operations:
- **`SolicitImage`**: Requests an image that is generated and stored within the
  fuzzer. The image is provided later (see the [Preimages](#preimages) section).
- **`InsertItems`**: Inserts *a* number of random items into the service's storage.
- **`RemoveItems`**: Removes *a* number of random items from the service's storage.
- ...
### **Fuzzy**
The fuzzy service is specifically designed to trigger faults inside and outside
the PVM engine. It is loaded by the bootstrap service at a later stage
after system initialization and provides test cases that are not supported
by the bootstrap service. It attempts to simulate problematic or malicious
scenarios such as:
- **Memory Violations**: Access invalid memory (e.g., invalid pointers or incorrect lengths).
- **Write to Read-Only Memory**: Attempt writes into memory regions marked as read-only.
- **Gas Exhaustion**: Enter an infinite loop to cause excessive gas consumption.
- **Host Calls**: Explore possible vulnerabilities in the host calls.
  - The order in which host calls check for errors matters and should match the Graypaper (GP).
  - All host calls should be exercised.
- ... TBD

Also exercise logic related to transfers and authorizations.
---
## Faults Injection
Blocks produced by the built-in authoring engine are assumed to be semantically
and syntactically correct by default. However, after block production but
before block import, a range of faults can be injected into the block to simulate
invalid or unexpected behavior.
Blocks that undergo these alterations must be (re)signed by the block author to
ensure they contain a valid signature before importing.
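
The mutate-then-re-sign step can be illustrated as follows. This sketch uses HMAC-SHA256 purely as a stand-in so the example runs with the standard library; it is not the signature scheme used by the protocol, and the `Block` type and helper names are hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Block:
    payload: bytes
    signature: bytes = b""


def sign(block, author_key):
    # Assumption: HMAC-SHA256 stands in for the real block signature scheme.
    tag = hmac.new(author_key, block.payload, hashlib.sha256).digest()
    return replace(block, signature=tag)


def verify(block, author_key):
    expected = hmac.new(author_key, block.payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, block.signature)


def inject_fault(block, author_key, mutate):
    """Apply `mutate` to the payload, then re-sign the altered block."""
    return sign(replace(block, payload=mutate(block.payload)), author_key)


key = b"author-secret"
good = sign(Block(b"header|extrinsic"), key)
bad = inject_fault(good, key, lambda p: p + b"|extra-ticket")
stale = replace(good, payload=good.payload + b"|extra-ticket")  # not re-signed
```

Without re-signing (`stale`), the importer would reject the block at signature verification and never reach the injected fault; `bad` passes that check, so the fault itself is what gets exercised.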
### Gray Box
Faults are injected with the help of a third-party coverage-guided fuzzing engine (currently libFuzzer).
Given a well-formed block (the *corpus*), the engine uses instrumented code to try to
increase the code coverage reached by the test.
This may trigger faults that are difficult to predict (*"unknown unknowns"*).
### White Box
White box faults intentionally alter specific aspects of the protocol at a
higher level, testing scenarios where block properties diverge from standard
behavior.
Examples:
- Adding an **extra ticket** that exceeds the valid extrinsic capacity.
- Duplicating **work items** across multiple cores, violating protocol rules.
- Altering the block authoring process, such as:
- Block authoring from an **unexpected validator**.
- Fallback mechanisms executing where **primary authorization** is required.
- A block containing a **valid signature** where the VRF output does not match the expected ticket.
- Assigning an **unexpected block author** to the block header.
- ...
---
## Protocol Conformance Testing
The fuzzer can also function as a JAM protocol conformance testing tool,
enabling validation of third-party implementations (the "target") against
expected behaviors.
Through targeted testing, the fuzzer exercises the target implementation,
verifying its conformance with the protocol by comparing key elements
(state root, key-value storage, etc.) against locally computed results.
In this case, the testing approach is strictly **black-box**, with no knowledge
of or access to the internal structure of the system under test.
### Workflow
The conformance testing process follows these steps:
1. Select a **run seed** to guarantee deterministic and reproducible execution.
2. Generate a block using the internal authoring engine (or take it from a
   precomputed trace, as an alternative reference).
3. Optionally inject faults into the block before processing (as detailed in
   [Faults Injection](#faults-injection)).
4. Locally import the block.
5. Forward the block to the target implementation endpoint for processing.
6. Retrieve the **posterior state root** from the target and compare it with the
   locally computed one. If the roots match, move on to the next iteration
   (step 2); otherwise continue with step 7.
7. Attempt to read the target's full **key/value storage** (if implemented by the target).
8. Terminate the execution and produce an execution **report** containing:
   - **Seed**: The seed value used, for deterministic reproduction.
   - **Inputs and Results**: Prior state, block, and the locally computed
     posterior state.
   - **Target Comparison**: If the target's posterior state is available,
     a diff against the expected posterior state.
The resulting report can be used to construct a precise, specialized test
vector designed to immediately reproduce the discrepancy observed in the target
implementation.
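
The workflow above can be sketched as a small loop. Everything here is a simplified assumption: the local importer and the target are modelled as in-process callables, blocks are random bytes, and `state_root` is a plain hash of the sorted key/value pairs rather than the state root defined by the Graypaper.

```python
import hashlib
import random


def state_root(state):
    """Hypothetical root: SHA-256 over the sorted key/value pairs."""
    h = hashlib.sha256()
    for key in sorted(state):
        h.update(key + b"\x00" + state[key] + b"\x00")
    return h.digest()


def diff_states(expected, actual):
    """Return {key: (expected_value, actual_value)} for every differing key."""
    keys = set(expected) | set(actual)
    return {k: (expected.get(k), actual.get(k))
            for k in sorted(keys) if expected.get(k) != actual.get(k)}


def run(seed, steps, local_import, target_import):
    """Import the same blocks on both sides; report the first mismatch."""
    rng = random.Random(seed)
    local, target = {}, {}
    for step in range(steps):
        block = rng.getrandbits(64).to_bytes(8, "big")  # stand-in block
        prior = dict(local)
        local = local_import(prior, block)
        target = target_import(target, block)
        if state_root(local) != state_root(target):
            return {"seed": seed, "step": step, "prior": prior, "block": block,
                    "posterior": local, "diff": diff_states(local, target)}
    return None  # no discrepancy observed


def good_import(state, block):
    new = dict(state)
    count = int(state.get(b"count", b"0").decode()) + 1
    new[b"count"] = str(count).encode()
    new[b"last"] = block
    return new


def buggy_import(state, block):
    new = good_import(state, block)
    if new[b"count"] == b"3":
        new[b"last"] = b""  # deliberate bug on the third block
    return new


report = run(b"seed", 10, good_import, buggy_import)
```

The report pins down the failing step together with the prior state and the block, which is the material needed to turn the discrepancy into a standalone test vector.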
The communication protocol between the fuzzer and the target implementation
is intentionally kept simple; the current plan is to use a named pipe.
### No Reference Implementation
As there will never be a definitive reference implementation, and the Graypaper
is the only authoritative specification, treating the local harness as a
reference is inaccurate. A mismatch between the harness and the target does not
automatically imply a fault in the target.
In case of discrepancy, the test vector must be examined and the expected
behavior verified against the Graypaper to resolve the inconsistency.