# Poseidon2 migration plan (Quantus external miner)

Objective: Align all mining engines in this repository with the upstream QPoW changes that switch the nonce-score mapping to Poseidon2 and alter (m, n) derivation, as specified in mining.md. This enables the external miner to remain effective against the updated node.

References:
- Upstream algorithm change summary: mining.md
- Key deltas to adopt:
  - Poseidon2-256 and Poseidon2-512 permutations are used instead of SHA2/SHA3.
  - Derivation of (m, n) from the mining hash now uses Poseidon2 with deterministic re-roll.
  - The target/nonce mapping used in distance computation is Poseidon2-512(g), not SHA3-512(g), where g is the group element m^(h + s) mod n.
  - All other mechanics (incremental exponentiation, XOR distance, threshold, seal length, engine ID) remain per mining.md.

---

## 1) Summary of algorithm changes to implement

From mining.md:
- Group parameter derivation:
  - m = U256 produced by Poseidon2-256(header_bytes), then widened into U512 (lower 256 bits set).
  - n = U512 produced by Poseidon2-512(m_bytes); reroll n = Poseidon2-512(n_bytes) until:
    - n is odd
    - n > m
    - gcd(m, n) = 1
    - n is composite (Miller–Rabin negative) with deterministic Poseidon2-derived bases
- Hash-to-group mapping and distance:
  - g(h, s) = m^(h + s) mod n
  - G(h, s) = Poseidon2-512(to_bytes(g(h, s)))
  - T = G(h, 0)
  - X(s) = G(h, s)
  - D(s) = T XOR X(s) as U512
  - Validity: D(s) <= threshold (U512)
- Mining loop:
  - Incremental exponentiation still applies: v_{i+1} = (v_i * m) mod n
- Guards:
  - 64-byte nonce (U512)
  - Nonce 0 is invalid
- Determinism and endianness must be consistent (see Appendix A)

The external miner must mirror this bit-for-bit.
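
The mapping, distance, and incremental-exponentiation rules above can be illustrated with a toy sketch. It is not the real algorithm: values are `u64` rather than U512, and `toy_map` is a placeholder mixer standing in for Poseidon2-512. All function names here (`mul_mod`, `pow_mod`, `toy_map`, `distance`, `scan`) are hypothetical and are not taken from `pow-core`.

```rust
/// Modular multiply on u64 through a u128 intermediate (toy stand-in for U512 math).
fn mul_mod(a: u64, b: u64, n: u64) -> u64 {
    ((a as u128 * b as u128) % n as u128) as u64
}

/// Square-and-multiply: base^exp mod n.
fn pow_mod(mut base: u64, mut exp: u64, n: u64) -> u64 {
    let mut acc = 1 % n;
    base %= n;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul_mod(acc, base, n);
        }
        base = mul_mod(base, base, n);
        exp >>= 1;
    }
    acc
}

/// PLACEHOLDER mixer (splitmix64 finalizer). This is NOT Poseidon2; it only
/// plays the role of G = Poseidon2-512(to_bytes(g)) in this sketch.
fn toy_map(y: u64) -> u64 {
    let mut z = y.wrapping_add(0x9E37_79B9_7F4A_7C15);
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// D(s) = T XOR X(s), with T = G(h, 0) and X(s) = G(h, s), computed from scratch.
fn distance(m: u64, n: u64, h: u64, s: u64) -> u64 {
    let target = toy_map(pow_mod(m, h, n));
    target ^ toy_map(pow_mod(m, h + s, n))
}

/// Scan nonces start..start+count with one full exponentiation, then one
/// modular multiply per step. Returns the first (nonce, distance) at or below
/// the threshold. Nonce 0 is never accepted.
fn scan(m: u64, n: u64, h: u64, start: u64, count: u64, threshold: u64) -> Option<(u64, u64)> {
    let target = toy_map(pow_mod(m, h, n));
    let mut y = pow_mod(m, h + start, n);
    for s in start..start + count {
        let d = target ^ toy_map(y);
        if s != 0 && d <= threshold {
            return Some((s, d));
        }
        y = mul_mod(y, m, n); // v_{i+1} = (v_i * m) mod n
    }
    None
}
```

Calling `scan(m, n, h, 1, 1000, threshold)` walks the same sequence of distances that `distance(m, n, h, s)` would produce for each `s`, which is the parity property the real engines rely on.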

---

## 2) Cross-crate impact matrix

- crates/pow-core: High impact
  - Replace SHA2/SHA3-based (m, n) derivation with Poseidon2-256/512 flows and reroll semantics
  - Replace SHA3-512 mapping with Poseidon2-512 in distance computation
  - Expose a clean, tested Poseidon2 API for engines
  - Keep incremental exponentiation APIs unchanged
- crates/engine-cpu: Low impact
  - No loop changes; `pow-core` update is sufficient
  - Update tests for Poseidon2 semantics
- crates/engine-montgomery: Low impact
  - No loop changes; `pow-core` update is sufficient
  - Update tests for Poseidon2 semantics
- crates/engine-gpu-cuda: Medium impact
  - G1 path (device Montgomery multiply + host mapping) remains correct once `pow-core` is updated
  - G2 path (device SHA3 + threshold) must move to device Poseidon2-512; until then, default to G1
  - Add device Poseidon2-512 kernel and parity tests as a follow-up
- crates/engine-gpu-opencl: Planning impact (placeholder)
  - Outline device Poseidon2-512 implementation strategy matching CUDA G2 once implemented
- crates/miner-service, crates/miner-cli, crates/metrics: Very low impact
  - Service already uses `distance_threshold` (U512 decimal). Keep as-is
  - Clean up protocol/docs to avoid ambiguity with “difficulty”
- EXTERNAL_MINER_PROTOCOL.md: Doc alignment
  - Remove remaining references to “difficulty” as the control input; confirm we operate on `distance_threshold` (decimal U512)

---

## 3) Detailed changes by crate

### 3.1. pow-core (authoritative math and API)

Work items:
- Introduce a new `poseidon2` module with:
  - `poseidon2_256(input: &[u8]) -> [u8; 32]` (32-byte output)
  - `poseidon2_512(input: &[u8]) -> [u8; 64]` (64-byte output)
  - Deterministic base generation for Miller–Rabin: e.g., `poseidon2_512(next) % (n-4) + 2` repeated to get 32 bases
- Replace (m, n) derivation:
  - m := U256 from `poseidon2_256(header_be32)` placed into U512 (low 256 bits)
  - n := U512 from `poseidon2_512(m_bytes_be32)`; reroll with `poseidon2_512(n_be64)` while constraints fail
- Replace target/nonce mapping:
  - `hash_to_group_bigint_sha_impl` and `sha3_512_impl` are deprecated
  - New: `poseidon_map_512(y: U512) -> U512` using big-endian 64-byte encoding of y
  - `JobContext::new` computes `target = poseidon_map_512(m^(h+0) mod n)`
  - `distance_from_y(ctx, y)` computes `poseidon_map_512(y)` and XORs it with `ctx.target`
- Keep incremental exponent interfaces:
  - `init_worker_y0`, `step_mul`, `distance_from_y`, `distance_for_nonce`, `is_valid_distance`: signatures unchanged
- Keep guards:
  - Zero nonce -> invalid during validation paths
- Endianness:
  - to_bytes(g) = big-endian 64-byte encoding (see Appendix A)
- Tests:
  - Replace SHA3-specific tests with Poseidon2-based tests:
    - `compat_distance_matches_context_distance`
    - `incremental_step_matches_pow_plus_one`
  - A property test that rerolling n terminates and yields a valid composite
  - Miller–Rabin bases are reproducible from Poseidon2 sequence
  - Golden-vector tests against upstream (once samples are available)
- Feature gating (optional for rollout):
  - Add `legacy-sha3` feature that flips the mapping back for bisect/debug only; default off
- Dependency choice:
  - Use a well-maintained Poseidon2 permutation implementation with fixed, upstream-compatible parameters
  - Do not invent constants; mirror the upstream node’s Poseidon2 parameters exactly

Acceptance criteria:
- Engines using only `pow-core` produce targets and distances matching upstream rules in mining.md
- All existing engine tests (baseline/fast/montgomery) pass unmodified (aside from expectations that depended on SHA3)

Notes:
- Remove `sha2`/`sha3` dependencies once migration completes unless still used in tests or utilities

### 3.2. engine-cpu

Work items:
- No algorithmic updates (loops call into `pow-core`)
- Ensure both `BaselineCpuEngine` and `FastCpuEngine` parity tests pass with the Poseidon2 mapping
- Add a regression test:
  - Fixed header, permissive threshold: ensure both engines find the same nonce and distance under Poseidon2

Acceptance criteria:
- Existing tests succeed without code changes other than updated expectations, if any

### 3.3. engine-montgomery

Work items:
- No algorithmic updates (loop uses `pow-core::distance_from_y`)
- Confirm Montgomery multiply path (y_hat advancement) is independent of mapping
- Add parity test vs `FastCpuEngine` under Poseidon2 (existing tests should cover this)

Acceptance criteria:
- Parity with `FastCpuEngine` on small ranges under Poseidon2

### 3.4. engine-gpu-cuda

Mode overview:
- G1: Device computes y progression (Montgomery domain) and returns y; host computes mapping and threshold check
- G2: Device computes y progression and mapping + threshold check with early-exit

Work items:
- Short term (safe default):
  - Keep G1 as the default path; it will remain correct once `pow-core` is switched to Poseidon2 (host mapping)
  - Keep the `MINER_CUDA_MODE` default at `g1` (this is already the case)
  - Add a runtime warning if `g2` is requested, stating that device Poseidon2 is not yet implemented and the engine will fall back to `g1`
- Mid term (enable G2 for Poseidon2):
  - Implement Poseidon2-512 device function:
    - Input: 64-byte big-endian encoding of y (normalized from Montgomery domain on device)
    - Output: 64-byte result used directly for XOR distance
    - Use the exact upstream Poseidon2 parameters
    - Performance: ensure reasonable throughput; initial focus is correctness and parity
  - Replace device SHA3 in `qpow_montgomery_g2_kernel` with Poseidon2-512
  - Keep constant memory wiring (n, R2, m_hat, n0_inv, target, threshold) unchanged
  - Add device/host parity harness:
    - Sample k outputs: y -> device Poseidon2 -> dist vs host Poseidon2 -> dist
    - Assert equality and that early-exit winners match in small ranges
- Tests:
  - Update `gpu_g1_parity_with_cpu_on_small_range` to expect Poseidon2 distances
  - Add `gpu_g2_parity_with_cpu_on_small_range` once Poseidon2 kernel is in place

Acceptance criteria:
- G1 path unchanged: parity with CPU Fast engine under Poseidon2
- G2 path (when implemented) produces the same winners and distances as the CPU for small ranges and finds winners correctly in larger ranges

### 3.5. engine-gpu-opencl

Work items (placeholder crate):
- Document that device-side Poseidon2-512 must be implemented similar to CUDA G2 before enabling a G2-like OpenCL path
- Keep `prepare_context()` unchanged; host precompute stays in `pow-core`

Acceptance criteria:
- N/A (engine is a scaffold); ensure documentation comments reflect Poseidon2 migration

### 3.6. miner-service, miner-cli, metrics

Work items:
- miner-service:
  - No algorithm changes; it already consumes `distance_threshold` (decimal U512)
  - Update any log messages that mention SHA3 or old terms
- miner-cli:
  - No changes required for flags/engine selection
- metrics:
  - No algorithm changes; optional label updates if they mention SHA3 by name

Acceptance criteria:
- End-to-end jobs execute and return results under Poseidon2 mapping
- Result revalidation done in `handle_result_request` continues to match engine results (it calls into `pow-core`)

### 3.7. EXTERNAL_MINER_PROTOCOL.md (documentation)

Work items:
- Replace remaining references to “difficulty” as the primary control parameter. The node provides `distance_threshold` (decimal U512); difficulty is derived on-chain
- Clarify that `work` is the 64-byte winning nonce (seal) and success derives from D(s) <= threshold under Poseidon2 mapping
- Note determinism and the requirement to use the exact upstream Poseidon2 parameters

Acceptance criteria:
- Protocol doc matches mining.md terminology, types, and semantics

---

## 4) Rollout plan

- Phase 0: Land the pow-core Poseidon2 implementation together with a temporary `legacy-sha3` feature that restores the old mapping. Default: Poseidon2 on; legacy flag for bisect/debug only
- Phase 1: Update engine tests and adjust expectations (should mostly remain unchanged because parity tests are relative engine-to-engine)
- Phase 2: Make CUDA default to G1 with a warning if G2 is requested; document that G2 will be re-enabled after device Poseidon2 lands
- Phase 3: Implement device Poseidon2-512 in CUDA; add parity tests; allow G2 mode again
- Phase 4: Clean up: remove `legacy-sha3` feature, prune unused SHA crates (unless used elsewhere), finalize docs

Milestones:
- M1: Node accepts external miner results under Poseidon2 when running CPU engines
- M2: CUDA G1 parity with CPU under Poseidon2
- M3: CUDA G2 parity with CPU under Poseidon2; early-exit functional
- M4: Documentation aligned; protocol doc no longer references old terms

---

## 5) Risks and mitigations

- Poseidon2 parameter mismatch with upstream:
  - Mitigation: mirror upstream node constants/parameters exactly; add golden vectors from upstream once available
- Endianness inconsistencies:
  - Mitigation: centralize big-endian encoding in `pow-core` and reuse it in all engines; see Appendix A
- Miller–Rabin false positives (a composite n classified as prime):
  - Mitigation: use 32 deterministic bases derived from Poseidon2, identical to upstream; add tests comparing `is_prime` with upstream where possible
- CUDA device Poseidon2 performance:
  - Mitigation: prioritize correctness first (G1 path is already correct); optimize G2 later; keep G2 optional initially

---

## 6) Deliverables checklist

- pow-core
  - [ ] Poseidon2-256/512 implementation with upstream constants
  - [ ] (m, n) derivation replaced with Poseidon2 flows and reroll
  - [ ] Mapping replaced: Poseidon2-512(y) instead of SHA3-512(y)
  - [ ] Tests: parity, reroll, Miller–Rabin determinism, golden vectors
- engine-cpu
  - [ ] Parity tests with pow-core Poseidon2 mapping
- engine-montgomery
  - [ ] Parity tests with pow-core Poseidon2 mapping
- engine-gpu-cuda
  - [ ] Default to G1 with Poseidon2 host mapping (no code change beyond log tweaks)
  - [ ] Device Poseidon2-512 implementation (follow-up)
  - [ ] G2 parity tests after implementation
- engine-gpu-opencl
  - [ ] Update documentation to reflect Poseidon2 requirement (future work)
- miner-service / miner-cli / metrics
  - [ ] Minor log/doc updates; no algorithm changes
- EXTERNAL_MINER_PROTOCOL.md
  - [ ] Replace mentions of “difficulty” as input with “distance_threshold (U512 decimal)”
  - [ ] Clarify Poseidon2 mapping and seal semantics

---

## Appendix A: Endianness and byte encoding

To avoid ambiguity and ensure cross-implementation determinism:
- h (header/pre-hash):
  - Input as 32-byte big-endian
  - Interpret as U512 via `U512::from_big_endian(&[u8; 32])` (upper 256 bits will be zero)
- m:
  - 32-byte output of Poseidon2-256(header_be32)
  - Treat as U256, widen to U512 by placing into the low 256 bits (high 256 bits zero)
- n:
  - 64-byte output of Poseidon2-512 (input is m_be32 initially, n_be64 during reroll)
  - Treat as U512 from big-endian bytes
- y = m^(h + s) mod n:
  - Encode to bytes for permutation as big-endian 64 bytes
- Poseidon2 outputs:
  - For both 256-bit and 512-bit variants, treat the permutation output as raw bytes in big-endian order when converting to integers
- Distance:
  - XOR is performed on U512 values, which is equivalent to XOR on 64-byte big-endian arrays

Note: Device implementations (CUDA/OpenCL) must conform to this encoding and byte order when converting between limb arrays and permutation inputs/outputs.
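
The encoding rules in this appendix can be sketched with plain byte arrays. The helper names below are hypothetical, and the limb layout (eight `u64` limbs, least-significant limb first) is an assumption for illustration; the actual limb order used by the engines should be checked against the code.

```rust
/// Widen a 32-byte big-endian value to 64 bytes: the value occupies the low
/// 256 bits and the high 256 bits are zero.
fn widen_be32_to_be64(x: &[u8; 32]) -> [u8; 64] {
    let mut out = [0u8; 64];
    out[32..].copy_from_slice(x);
    out
}

/// Convert eight 64-bit limbs (ASSUMED least-significant limb first) into
/// 64 big-endian bytes, as needed for a permutation input.
fn limbs_to_be64(limbs: &[u64; 8]) -> [u8; 64] {
    let mut out = [0u8; 64];
    for (i, limb) in limbs.iter().enumerate() {
        let end = 64 - 8 * i;
        out[end - 8..end].copy_from_slice(&limb.to_be_bytes());
    }
    out
}

/// Inverse of `limbs_to_be64`: big-endian bytes back to limbs.
fn be64_to_limbs(bytes: &[u8; 64]) -> [u64; 8] {
    let mut limbs = [0u64; 8];
    for (i, limb) in limbs.iter_mut().enumerate() {
        let end = 64 - 8 * i;
        let mut chunk = [0u8; 8];
        chunk.copy_from_slice(&bytes[end - 8..end]);
        *limb = u64::from_be_bytes(chunk);
    }
    limbs
}

/// XOR distance on 64-byte big-endian arrays (byte-wise XOR).
fn xor_be64(a: &[u8; 64], b: &[u8; 64]) -> [u8; 64] {
    let mut out = [0u8; 64];
    for i in 0..64 {
        out[i] = a[i] ^ b[i];
    }
    out
}

/// For equal-length big-endian arrays, lexicographic byte order equals
/// numeric order, so the threshold check needs no integer conversion.
fn within_threshold(distance: &[u8; 64], threshold: &[u8; 64]) -> bool {
    distance <= threshold
}
```

A round trip through `limbs_to_be64` and `be64_to_limbs` is a cheap parity check to run on both host and device when porting the permutation.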

---

## Appendix B: Testing strategy

- Unit-level:
  - pow-core: (m, n) derivation determinism with reroll; Miller–Rabin deterministic bases from Poseidon2 sequence
  - Mapping parity: `distance_for_nonce` equals `target XOR poseidon_map_512(y)` for arbitrary nonces
  - Incremental: `distance_from_y(step_mul(y)) == distance_for_nonce(nonce+1)`, where y is the group element for `nonce`
- Cross-engine:
  - cpu-baseline vs cpu-fast vs cpu-montgomery parity on small ranges
  - cuda G1 vs cpu-fast parity on small ranges
  - cuda G2 (post-implementation) vs cpu-fast parity on small ranges; early-exit winners match
- End-to-end (service):
  - POST /mine with permissive threshold finds solutions
  - GET /result revalidates by recomputing distance with Poseidon2 mapping
  - Strict threshold ranges exhaust without false positives

---

By executing this plan, the external miner will produce QPoW seals that conform to the updated Poseidon2-based rules and remain effective and interoperable with the upstream node.
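
As an illustration of the unit-level reroll and Miller–Rabin determinism checks in Appendix B, here is a toy sketch on `u64`. It makes several assumptions: `toy_hash` is a placeholder mixer, not Poseidon2; the base sequence is seeded from n, which may differ from upstream; and all names are hypothetical. Only the base formula `hash % (n-4) + 2` and the count of 32 bases come from this plan.

```rust
fn mul_mod(a: u64, b: u64, n: u64) -> u64 {
    ((a as u128 * b as u128) % n as u128) as u64
}

fn pow_mod(mut base: u64, mut exp: u64, n: u64) -> u64 {
    let mut acc = 1 % n;
    base %= n;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul_mod(acc, base, n);
        }
        base = mul_mod(base, base, n);
        exp >>= 1;
    }
    acc
}

fn gcd(mut a: u64, mut b: u64) -> u64 {
    while b != 0 {
        let t = a % b;
        a = b;
        b = t;
    }
    a
}

/// PLACEHOLDER mixer (splitmix64 finalizer), standing in for Poseidon2-512.
fn toy_hash(x: u64) -> u64 {
    let mut z = x.wrapping_add(0x9E37_79B9_7F4A_7C15);
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Miller–Rabin with 32 bases from the placeholder hash chain, each mapped
/// into [2, n-3]. Requires odd n >= 5. Seeding from n is an ASSUMPTION.
fn is_probable_prime(n: u64) -> bool {
    let (mut d, mut r) = (n - 1, 0u32);
    while d % 2 == 0 {
        d /= 2;
        r += 1;
    }
    let mut seed = n;
    'bases: for _ in 0..32 {
        seed = toy_hash(seed);
        let a = seed % (n - 4) + 2;
        let mut x = pow_mod(a, d, n);
        if x == 1 || x == n - 1 {
            continue;
        }
        for _ in 1..r {
            x = mul_mod(x, x, n);
            if x == n - 1 {
                continue 'bases;
            }
        }
        return false; // base a witnesses that n is composite
    }
    true
}

/// Reroll n until it is odd, greater than m, coprime to m, and composite.
/// Termination is expected in practice but not proven for this toy chain.
fn derive_n(m: u64) -> u64 {
    let mut n = toy_hash(m);
    loop {
        if n >= 5 && n % 2 == 1 && n > m && gcd(m, n) == 1 && !is_probable_prime(n) {
            return n;
        }
        n = toy_hash(n);
    }
}
```

A property test over this shape would call `derive_n` twice per input and assert both equality and the four constraints, which is the determinism check the real `pow-core` test is meant to enforce against upstream.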