# Poseidon2 migration plan (Quantus external miner)
Objective:
Align all mining engines in this repository with the upstream QPoW changes that switch the nonce-score mapping to Poseidon2 and alter (m, n) derivation, as specified in mining.md. This keeps results produced by the external miner valid against the updated node.
References:
- Upstream algorithm change summary: mining.md
- Key deltas to adopt:
- Poseidon2-256 and Poseidon2-512 permutations are used instead of SHA2/SHA3.
- Derivation of (m, n) from the mining hash now uses Poseidon2 with deterministic re-roll.
- The target/nonce mapping used in distance computation is Poseidon2-512(to_bytes(g)), not SHA3-512(to_bytes(g)), where g = m^(h + s) mod n.
- All other mechanics (incremental exponentiation, XOR distance, threshold, seal length, engine ID) remain per mining.md.
---
## 1) Summary of algorithm changes to implement
From mining.md:
- Group parameter derivation:
- m = U256 produced by Poseidon2-256(header_bytes), then widened into U512 (value in the low 256 bits, high 256 bits zero).
- n = U512 produced by Poseidon2-512(m_bytes); reroll n = Poseidon2-512(n_bytes) until:
- n is odd
- n > m
- gcd(m, n) = 1
- n is composite (Miller–Rabin, run with deterministic Poseidon2-derived bases, does not report n as prime)
- Hash-to-group mapping and distance:
- g(h, s) = m^(h + s) mod n
- G(h, s) = Poseidon2-512(to_bytes(g(h, s)))
- T = G(h, 0)
- X(s) = G(h, s)
- D(s) = T XOR X(s) as U512
- Validity: D(s) <= threshold (U512)
- Mining loop:
- Incremental exponentiation still applies: v_{i+1} = (v_i * m) mod n
- Guards:
- 64-byte nonce (U512)
- Nonce 0 is invalid
- Determinism and endianness must be consistent (see Appendix A)
The external miner must mirror this bit-for-bit.
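
The data flow above can be sketched on 64-bit integers. This is only a toy model under stated assumptions: the real scheme works on U512 values, and `toy_map` is a placeholder mixer standing in for Poseidon2-512, not the upstream permutation. All function names here are hypothetical.

```rust
// Toy model of the QPoW loop on u64. The real scheme uses U512 and
// Poseidon2-512; `toy_map` is a placeholder mixer, NOT Poseidon2.

/// (a * b) mod n without overflow, using a 128-bit intermediate.
fn mul_mod(a: u64, b: u64, n: u64) -> u64 {
    ((a as u128 * b as u128) % n as u128) as u64
}

/// Square-and-multiply: base^exp mod n.
fn pow_mod(mut base: u64, mut exp: u64, n: u64) -> u64 {
    let mut acc = 1 % n;
    base %= n;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul_mod(acc, base, n);
        }
        base = mul_mod(base, base, n);
        exp >>= 1;
    }
    acc
}

/// Placeholder for Poseidon2-512 (a splitmix64-style mixer). Assumption:
/// any deterministic mixer is enough to illustrate the data flow.
fn toy_map(y: u64) -> u64 {
    let mut z = y.wrapping_add(0x9E37_79B9_7F4A_7C15);
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// D(s) = T XOR X(s), with T = map(g(h, 0)) and X(s) = map(g(h, s)).
fn distance(m: u64, n: u64, h: u64, s: u64) -> u64 {
    toy_map(pow_mod(m, h, n)) ^ toy_map(pow_mod(m, h + s, n))
}

/// Scan nonces start..start+count, advancing y with one modular multiply
/// per nonce. Nonce 0 is never reported because it is invalid.
fn mine(m: u64, n: u64, h: u64, start: u64, count: u64, threshold: u64) -> Option<(u64, u64)> {
    let target = toy_map(pow_mod(m, h, n));
    let mut y = pow_mod(m, h + start, n); // y_0 = m^(h + start) mod n
    for s in start..start + count {
        let d = target ^ toy_map(y);
        if s != 0 && d <= threshold {
            return Some((s, d));
        }
        y = mul_mod(y, m, n); // v_{i+1} = (v_i * m) mod n
    }
    None
}
```

Only the initial `pow_mod` call costs a full exponentiation; every later nonce costs one multiply plus one mapping, which is why swapping the mapping function leaves the loop structure untouched.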
---
## 2) Cross-crate impact matrix
- crates/pow-core: High impact
- Replace SHA2/SHA3-based (m, n) derivation with Poseidon2-256/512 flows and reroll semantics
- Replace SHA3-512 mapping with Poseidon2-512 in distance computation
- Expose a clean, tested Poseidon2 API for engines
- Keep incremental exponentiation APIs unchanged
- crates/engine-cpu: Low impact
- No loop changes; `pow-core` update is sufficient
- Update tests for Poseidon2 semantics
- crates/engine-montgomery: Low impact
- No loop changes; `pow-core` update is sufficient
- Update tests for Poseidon2 semantics
- crates/engine-gpu-cuda: Medium impact
- G1 path (device Montgomery multiply + host mapping) remains correct once `pow-core` is updated
- G2 path (device SHA3 + threshold) must move to device Poseidon2-512; until then, default to G1
- Add device Poseidon2-512 kernel and parity tests as a follow-up
- crates/engine-gpu-opencl: Planning impact (placeholder)
- Outline device Poseidon2-512 implementation strategy matching CUDA G2 once implemented
- crates/miner-service, crates/miner-cli, crates/metrics: Very low impact
- Service already uses `distance_threshold` (U512 decimal). Keep as-is
- Clean up protocol/docs to avoid ambiguity with “difficulty”
- EXTERNAL_MINER_PROTOCOL.md: Doc alignment
- Remove remaining references to “difficulty” as the control input; confirm we operate on `distance_threshold` (decimal U512)
---
## 3) Detailed changes by crate
### 3.1. pow-core (authoritative math and API)
Work items:
- Introduce a new `poseidon2` module with:
- `poseidon2_256(input: &[u8]) -> [u8; 32]` (32-byte output)
- `poseidon2_512(input: &[u8]) -> [u8; 64]` (64-byte output)
- Deterministic base generation for Miller–Rabin: e.g., `poseidon2_512(next) % (n-4) + 2` repeated to get 32 bases
- Replace (m, n) derivation:
- m := U256 from `poseidon2_256(header_be32)` placed into U512 (low 256 bits)
- n := U512 from `poseidon2_512(m_bytes_be32)`; reroll with `poseidon2_512(n_be64)` while constraints fail
- Replace target/nonce mapping:
- `hash_to_group_bigint_sha_impl` and `sha3_512_impl` are deprecated
- New: `poseidon_map_512(y: U512) -> U512` using big-endian 64-byte encoding of y
- `JobContext::new` computes `target = poseidon_map_512(m^(h+0) mod n)`
- `distance_from_y(ctx, y)` computes `poseidon_map_512(y)` and XOR with `ctx.target`
- Keep incremental exponent interfaces:
- `init_worker_y0`, `step_mul`, `distance_from_y`, `distance_for_nonce`, `is_valid_distance` — unchanged signatures
- Keep guards:
- A zero nonce is rejected as invalid in validation paths
- Endianness:
- to_bytes(g) = big-endian 64-byte encoding (see Appendix A)
- Tests:
- Replace SHA3-specific tests with Poseidon2-based tests:
- `compat_distance_matches_context_distance`
- `incremental_step_matches_pow_plus_one`
- A property test that rerolling n terminates and yields an n satisfying all constraints (odd, n > m, gcd(m, n) = 1, composite)
- A test that the Miller–Rabin bases are reproducible from the Poseidon2 sequence
- Golden-vector tests against upstream (once samples are available)
- Feature gating (optional for rollout):
- Add `legacy-sha3` feature that flips the mapping back for bisect/debug only; default off
- Dependency choice:
- Use a well-maintained Poseidon2 permutation implementation with fixed, upstream-compatible parameters
- Do not invent constants — mirror upstream node’s Poseidon2 parameters exactly
Acceptance criteria:
- Engines using only `pow-core` produce targets and distances matching upstream rules in mining.md
- All existing engine tests (baseline/fast/montgomery) pass with no changes other than updated expectations that depended on SHA3 outputs
Notes:
- Remove `sha2`/`sha3` dependencies once migration completes unless still used in tests or utilities
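
A minimal sketch of the (m, n) derivation with re-roll and deterministic Miller–Rabin bases, again on 64-bit integers. The assumptions are significant: `toy_hash` stands in for both Poseidon2-256 and Poseidon2-512, the base chain is seeded from n itself, and all names are hypothetical. The upstream node defines the actual widths, constants, and base schedule.

```rust
// Toy (m, n) derivation on u64. `toy_hash` is a placeholder for
// Poseidon2-256/512; the base schedule below is an assumption.

fn toy_hash(x: u64) -> u64 {
    let mut z = x.wrapping_add(0x9E37_79B9_7F4A_7C15);
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

fn mul_mod(a: u64, b: u64, n: u64) -> u64 {
    ((a as u128 * b as u128) % n as u128) as u64
}

fn pow_mod(mut base: u64, mut exp: u64, n: u64) -> u64 {
    let mut acc = 1 % n;
    base %= n;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul_mod(acc, base, n);
        }
        base = mul_mod(base, base, n);
        exp >>= 1;
    }
    acc
}

fn gcd(mut a: u64, mut b: u64) -> u64 {
    while b != 0 {
        let t = a % b;
        a = b;
        b = t;
    }
    a
}

/// Miller-Rabin with `rounds` bases taken from a deterministic hash chain.
/// Returns true if n is probably prime.
fn is_probable_prime(n: u64, rounds: u32) -> bool {
    if n < 5 || n % 2 == 0 {
        return n == 2 || n == 3;
    }
    // Write n - 1 = d * 2^r with d odd.
    let r = (n - 1).trailing_zeros();
    let d = (n - 1) >> r;
    let mut seed = n; // assumption: the chain is seeded from n
    'witness: for _ in 0..rounds {
        seed = toy_hash(seed);
        let a = seed % (n - 4) + 2; // base in [2, n - 3]
        let mut x = pow_mod(a, d, n);
        if x == 1 || x == n - 1 {
            continue;
        }
        for _ in 1..r {
            x = mul_mod(x, x, n);
            if x == n - 1 {
                continue 'witness;
            }
        }
        return false; // a is a witness: n is composite
    }
    true
}

/// Derive (m, n): re-roll n until it is odd, exceeds m, is coprime to m,
/// and is rejected by the primality test.
fn derive_m_n(header: u64) -> (u64, u64) {
    // m uses only the low half of the word, mirroring U256 widened to U512.
    let m = toy_hash(header) & 0xFFFF_FFFF;
    let mut n = toy_hash(m);
    while n % 2 == 0 || n <= m || gcd(m, n) != 1 || is_probable_prime(n, 32) {
        n = toy_hash(n); // deterministic re-roll from the rejected candidate
    }
    (m, n)
}
```

Because every step is a pure function of the header, two implementations agree on (m, n) only if they share the hash, the byte encoding, and the base schedule exactly, which is what the golden-vector tests are meant to pin down.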
### 3.2. engine-cpu
Work items:
- No algorithmic updates (loops call into `pow-core`)
- Ensure both `BaselineCpuEngine` and `FastCpuEngine` parity tests pass with the Poseidon2 mapping
- Add a regression test:
- Fixed header, permissive threshold: ensure both engines find the same nonce and distance under Poseidon2
Acceptance criteria:
- Existing tests succeed without code changes other than updated expectations, if any
### 3.3. engine-montgomery
Work items:
- No algorithmic updates (loop uses `pow-core::distance_from_y`)
- Confirm Montgomery multiply path (y_hat advancement) is independent of mapping
- Add parity test vs `FastCpuEngine` under Poseidon2 (existing tests should cover this)
Acceptance criteria:
- Parity with `FastCpuEngine` on small ranges under Poseidon2
### 3.4. engine-gpu-cuda
Mode overview:
- G1: Device computes y progression (Montgomery domain) and returns y; host computes mapping and threshold check
- G2: Device computes y progression and mapping + threshold check with early-exit
Work items:
- Short term (safe default):
- Keep G1 as the default path; it will remain correct once `pow-core` is switched to Poseidon2 (host mapping)
- Keep the `MINER_CUDA_MODE` default at `g1` (it already is)
- Add a runtime warning if `g2` is requested, stating that device Poseidon2 is not yet implemented and that the engine will fall back to `g1`
- Mid term (enable G2 for Poseidon2):
- Implement Poseidon2-512 device function:
- Input: 64-byte big-endian encoding of y (normalized from Montgomery domain on device)
- Output: 64-byte result used directly for XOR distance
- Use the exact upstream Poseidon2 parameters
- Performance: ensure reasonable throughput; initial focus is correctness and parity
- Replace device SHA3 in `qpow_montgomery_g2_kernel` with Poseidon2-512
- Keep constant memory wiring (n, R2, m_hat, n0_inv, target, threshold) unchanged
- Add device/host parity harness:
- Sample k outputs: y -> device Poseidon2 -> dist vs host Poseidon2 -> dist
- Assert equality and that early-exit winners match in small ranges
- Tests:
- Update `gpu_g1_parity_with_cpu_on_small_range` to expect Poseidon2 distances
- Add `gpu_g2_parity_with_cpu_on_small_range` once Poseidon2 kernel is in place
Acceptance criteria:
- G1 path unchanged: parity with CPU Fast engine under Poseidon2
- G2 path (when implemented) produces the same winners and distances as the CPU engine for small ranges and finds winners correctly in larger ranges
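
The short-term fallback behaviour could look roughly like the sketch below. The `CudaMode` enum, the `resolve_cuda_mode` function, the `device_poseidon2_ready` flag, and the warning texts are all hypothetical and not taken from the crate. Only the `MINER_CUDA_MODE` variable and the `g1`/`g2` values come from this plan.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CudaMode {
    G1,
    G2,
}

/// Resolve the effective mode from the raw `MINER_CUDA_MODE` value.
/// Returns the mode plus an optional warning for the caller to log.
fn resolve_cuda_mode(raw: Option<&str>, device_poseidon2_ready: bool) -> (CudaMode, Option<String>) {
    let normalized = raw.map(|s| s.trim().to_ascii_lowercase());
    match normalized.as_deref() {
        // Unset or explicit g1: the safe default.
        None | Some("") | Some("g1") => (CudaMode::G1, None),
        // g2 is honoured only once the device mapping exists.
        Some("g2") if device_poseidon2_ready => (CudaMode::G2, None),
        Some("g2") => (
            CudaMode::G1,
            Some("g2 requested, but device Poseidon2 is not implemented yet; falling back to g1".to_string()),
        ),
        Some(other) => (
            CudaMode::G1,
            Some(format!("unknown MINER_CUDA_MODE '{}'; using g1", other)),
        ),
    }
}
```

A caller would pass `std::env::var("MINER_CUDA_MODE").ok().as_deref()` as `raw` and log the warning if one is returned. Keeping the decision in a pure function makes the fallback testable without a GPU.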
### 3.5. engine-gpu-opencl
Work items (placeholder crate):
- Document that device-side Poseidon2-512 must be implemented along the same lines as CUDA G2 before enabling a G2-like OpenCL path
- Keep `prepare_context()` unchanged; host precompute stays in `pow-core`
Acceptance criteria:
- N/A (engine is a scaffold); ensure documentation comments reflect Poseidon2 migration
### 3.6. miner-service, miner-cli, metrics
Work items:
- miner-service:
- No algorithm changes; it already consumes `distance_threshold` (decimal U512)
- Update any log messages that mention SHA3 or old terms
- miner-cli:
- No changes required for flags/engine selection
- metrics:
- No algorithm changes; optional label updates if they mention SHA3 by name
Acceptance criteria:
- End-to-end jobs execute and return results under Poseidon2 mapping
- Result revalidation done in `handle_result_request` continues to match engine results (it calls into `pow-core`)
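
To illustrate what consuming a decimal U512 `distance_threshold` involves, here is a small dependency-free sketch. It is not the `miner-service` code: `parse_decimal_u512` and `within_threshold` are hypothetical names, and a plain `[u8; 64]` big-endian array stands in for the U512 type.

```rust
/// Parse a base-10 string into a 64-byte big-endian integer (U512 layout).
/// Returns None on empty input, non-digit characters, or overflow.
fn parse_decimal_u512(s: &str) -> Option<[u8; 64]> {
    if s.is_empty() {
        return None;
    }
    let mut out = [0u8; 64];
    for ch in s.bytes() {
        if !ch.is_ascii_digit() {
            return None;
        }
        // out = out * 10 + digit, starting at the least significant byte.
        let mut carry = (ch - b'0') as u16;
        for byte in out.iter_mut().rev() {
            let v = *byte as u16 * 10 + carry;
            *byte = (v & 0xFF) as u8;
            carry = v >> 8;
        }
        if carry != 0 {
            return None; // value no longer fits in 512 bits
        }
    }
    Some(out)
}

/// D(s) <= threshold. For equal-length big-endian arrays, lexicographic
/// order coincides with numeric order, so array comparison is enough.
fn within_threshold(distance: &[u8; 64], threshold: &[u8; 64]) -> bool {
    distance <= threshold
}
```

Rejecting malformed or oversized input at parse time keeps the later comparison a plain byte-array ordering check.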
### 3.7. EXTERNAL_MINER_PROTOCOL.md (documentation)
Work items:
- Replace remaining references to “difficulty” as the primary control parameter. The node provides `distance_threshold` (decimal U512); difficulty is derived on-chain
- Clarify that `work` is the 64-byte winning nonce (seal) and success derives from D(s) <= threshold under Poseidon2 mapping
- Note determinism and the requirement to use the exact upstream Poseidon2 parameters
Acceptance criteria:
- Protocol doc matches mining.md terminology, types, and semantics
---
## 4) Rollout plan
- Phase 0: Land the pow-core Poseidon2 implementation as the default mapping, alongside a temporary `legacy-sha3` feature (default off) that restores the old mapping for bisect/debug only
- Phase 1: Update engine tests and adjust expectations (should mostly remain unchanged because parity tests are relative engine-to-engine)
- Phase 2: Keep CUDA defaulting to G1 and warn if G2 is requested; document that G2 will be re-enabled after device Poseidon2 lands
- Phase 3: Implement device Poseidon2-512 in CUDA; add parity tests; allow G2 mode again
- Phase 4: Clean up: remove `legacy-sha3` feature, prune unused SHA crates (unless used elsewhere), finalize docs
Milestones:
- M1: Node accepts external miner results under Poseidon2 when running CPU engines
- M2: CUDA G1 parity with CPU under Poseidon2
- M3: CUDA G2 parity with CPU under Poseidon2; early-exit functional
- M4: Documentation aligned; protocol doc no longer references old terms
---
## 5) Risks and mitigations
- Poseidon2 parameter mismatch with upstream:
- Mitigation: mirror upstream node constants/parameters exactly; add golden vectors from upstream once available
- Endianness inconsistencies:
- Mitigation: centralize big-endian encoding in `pow-core` and reuse it in all engines; see Appendix A
- Miller–Rabin false positives (primality):
- Mitigation: use the same 32 deterministic Poseidon2-derived bases as upstream, so that both sides accept and reject the same candidates; add tests comparing `is_prime` with upstream where possible
- CUDA device Poseidon2 performance:
- Mitigation: prioritize correctness first (G1 path is already correct); optimize G2 later; keep G2 optional initially
---
## 6) Deliverables checklist
- pow-core
- [ ] Poseidon2-256/512 implementation with upstream constants
- [ ] (m, n) derivation replaced with Poseidon2 flows and reroll
- [ ] Mapping replaced: Poseidon2-512(y) instead of SHA3-512(y)
- [ ] Tests: parity, reroll, Miller–Rabin determinism, golden vectors
- engine-cpu
- [ ] Parity tests with pow-core Poseidon2 mapping
- engine-montgomery
- [ ] Parity tests with pow-core Poseidon2 mapping
- engine-gpu-cuda
- [ ] Default to G1 with Poseidon2 host mapping (no code change beyond log tweaks)
- [ ] Device Poseidon2-512 implementation (follow-up)
- [ ] G2 parity tests after implementation
- engine-gpu-opencl
- [ ] Update documentation to reflect Poseidon2 requirement (future work)
- miner-service / miner-cli / metrics
- [ ] Minor log/doc updates; no algorithm changes
- EXTERNAL_MINER_PROTOCOL.md
- [ ] Replace mentions of “difficulty” as input with “distance_threshold (U512 decimal)”
- [ ] Clarify Poseidon2 mapping and seal semantics
---
## Appendix A: Endianness and byte encoding
To avoid ambiguity and ensure cross-implementation determinism:
- h (header/pre-hash):
- Input as 32-byte big-endian
- Interpret as U512 via `U512::from_big_endian(&[u8; 32])` (upper 256 bits will be zero)
- m:
- 32-byte output of Poseidon2-256(header_be32)
- Treat as U256, widen to U512 by placing into the low 256 bits (high 256 bits zero)
- n:
- 64-byte output of Poseidon2-512(m_be32), or of Poseidon2-512(n_be64) during reroll
- Treat as U512 from big-endian bytes
- y = m^(h + s) mod n:
- Encode to bytes for permutation as big-endian 64 bytes
- Poseidon2 outputs:
- For both 256-bit and 512-bit variants, treat the permutation output as raw bytes in big-endian order when converting to integers
- Distance:
- XOR is performed on U512 values, which is equivalent to a bytewise XOR of their 64-byte big-endian encodings
Note: Device implementations (CUDA/OpenCL) must conform to this encoding and byte order when converting between limb arrays and permutation inputs/outputs.
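
The encoding rules above can be expressed with plain byte arrays. The helpers below are an illustrative sketch with hypothetical names. The limb layout in `limbs_le_to_be64` (eight `u64` limbs, least significant first) is an assumption about a typical device representation, not something this plan specifies.

```rust
/// Widen a 32-byte big-endian value to 64 bytes. The value occupies the
/// low 256 bits, which are the LAST 32 bytes in big-endian order.
fn widen_be32_to_be64(x: &[u8; 32]) -> [u8; 64] {
    let mut out = [0u8; 64];
    out[32..].copy_from_slice(x);
    out
}

/// Convert eight u64 limbs (limbs[0] least significant, an assumed device
/// layout) into the 64-byte big-endian encoding.
fn limbs_le_to_be64(limbs: &[u64; 8]) -> [u8; 64] {
    let mut out = [0u8; 64];
    for (i, limb) in limbs.iter().rev().enumerate() {
        out[i * 8..(i + 1) * 8].copy_from_slice(&limb.to_be_bytes());
    }
    out
}

/// XOR of two 64-byte big-endian encodings, equivalent to XOR on U512.
fn xor_be64(a: &[u8; 64], b: &[u8; 64]) -> [u8; 64] {
    let mut out = [0u8; 64];
    for i in 0..64 {
        out[i] = a[i] ^ b[i];
    }
    out
}
```

A common mistake when widening is to copy the 32 bytes into the front of the buffer, which multiplies the value by 2^256 instead of zero-extending it.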
---
## Appendix B: Testing strategy
- Unit-level:
- pow-core: (m, n) derivation determinism with reroll; Miller–Rabin deterministic bases from Poseidon2 sequence
- Mapping parity: `distance_for_nonce` equals `target XOR poseidon_map_512(y)` for arbitrary nonces
- Incremental: `distance_from_y(step_mul(y)) == distance_for_nonce(nonce+1)`, where y = m^(h + nonce) mod n
- Cross-engine:
- cpu-baseline vs cpu-fast vs cpu-montgomery parity on small ranges
- cuda G1 vs cpu-fast parity on small ranges
- cuda G2 (post-implementation) vs cpu-fast parity on small ranges; early-exit winners match
- End-to-end (service):
- POST /mine with permissive threshold finds solutions
- GET /result revalidates by recomputing distance with Poseidon2 mapping
- Strict threshold ranges exhaust without false positives
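
The cross-engine checks share one shape: run two engines over the same small ranges and thresholds, then compare winners and distances. A minimal sketch of such a helper follows, assuming a simplified engine interface (a closure taking `start`, `end`, `threshold` as `u64` and returning the first winning nonce with its distance). Real engines operate on U512 and a job context, and `first_parity_mismatch` is a hypothetical name.

```rust
/// Compare two engines over small nonce ranges and thresholds.
/// Returns a description of the first mismatch, or None if they agree.
fn first_parity_mismatch<A, B>(
    reference: A,
    candidate: B,
    ranges: &[(u64, u64)],
    thresholds: &[u64],
) -> Option<String>
where
    A: Fn(u64, u64, u64) -> Option<(u64, u64)>,
    B: Fn(u64, u64, u64) -> Option<(u64, u64)>,
{
    for &(start, end) in ranges {
        for &threshold in thresholds {
            let expected = reference(start, end, threshold);
            let actual = candidate(start, end, threshold);
            if expected != actual {
                return Some(format!(
                    "range {}..={} threshold {}: expected {:?}, got {:?}",
                    start, end, threshold, expected, actual
                ));
            }
        }
    }
    None
}
```

Including both a permissive and a very strict threshold in `thresholds` covers the "finds a winner" and the "exhausts without false positives" cases in one pass.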
---
By executing this plan, the external miner will produce QPoW seals that conform to the updated Poseidon2-based rules and remain effective and interoperable with the upstream node.