# 🔧 Lessons Learned: Using a Req/Ack Handshake Protocol for CPU-Accessible Control & Status Registers (CSR) - PART II

In Part I, we discussed the basics of using a `req/ack` handshake protocol to access hardware control/status registers. However, a **1-cycle pulse `ack`** is problematic when the initiator is a CPU that polls from software. This article presents a robust solution: an `ack_counter` that **tracks request completions** reliably without relying on precise timing.

---

## ❓ The Problem: CPU May Miss the 1-Cycle `ack`

In typical req/ack implementations:

- The target module asserts `ack` for **only one clock cycle**
- Software (CPU) polls `ack` periodically
- If the CPU **misses the pulse**, it may block forever

---

## ✅ The Solution: Use an `ack_counter`

Instead of relying on `ack` pulses, we expose a **monotonically increasing counter** (`ack_counter`) that increments **every time an ack event occurs**.

### 🔧 Interface Signals

| Signal        | Direction    | Description                                     |
|---------------|--------------|-------------------------------------------------|
| `req`         | CPU → Target | Asserted by CPU to initiate request (held high) |
| `ack_counter` | Target → CPU | Counter incremented on each `ack` event         |
| `ack_data`    | Target → CPU | Optional data returned on read completion       |

---

## 🔁 CPU Polling Flow Example

```c
// read()/write() stand in for the platform's register accessors.

// Step 1: Read current ack_counter value
uint8_t prev_ack = read(ACK_COUNTER);

// Step 2: Assert req to start the transaction
write(REQ, 1);

// Step 3: Poll until ack_counter changes
while (read(ACK_COUNTER) == prev_ack)
    ;

// Step 4: Ack observed → clear req
write(REQ, 0);

// Step 5 (Optional): Read returned data (RDATA is the register holding ack_data)
uint32_t result = read(RDATA);
```

Because `ack_counter` holds its new value until the next completion, software observes the change whenever it next polls, even if the ack arrives asynchronously.

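
To make the difference concrete, here is a minimal software model of the polling flow above. It is only a sketch under stated assumptions: `target_t`, `target_tick`, `cpu_sees_pulse`, and `cpu_sees_counter_change` are hypothetical names invented for this illustration, the target is modeled as completing at a fixed cycle, and the CPU is modeled as sampling once every `poll_interval` cycles. Real hardware and bus latency will differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the target: at cycle `done_cycle` it pulses `ack`
 * for exactly one cycle and increments `ack_counter` at the same time. */
typedef struct {
    bool    ack;          /* 1-cycle pulse */
    uint8_t ack_counter;  /* persistent, wraps modulo 256 */
} target_t;

static void target_tick(target_t *t, unsigned cycle, unsigned done_cycle)
{
    t->ack = (cycle == done_cycle);
    if (t->ack) {
        t->ack_counter++;   /* uint8_t wraps from 255 back to 0 */
    }
}

/* CPU samples `ack` only every `poll_interval` cycles.
 * Returns true if the pulse was ever sampled high. */
bool cpu_sees_pulse(unsigned done_cycle, unsigned poll_interval,
                    unsigned max_cycles)
{
    assert(poll_interval > 0);
    target_t t = { false, 0 };
    for (unsigned cycle = 0; cycle < max_cycles; cycle++) {
        target_tick(&t, cycle, done_cycle);
        if (cycle % poll_interval == 0 && t.ack) {
            return true;
        }
    }
    return false;
}

/* Same sampling schedule, but the CPU compares `ack_counter` against the
 * snapshot it took before starting the request (Step 1 of the flow). */
bool cpu_sees_counter_change(uint8_t start_count, unsigned done_cycle,
                             unsigned poll_interval, unsigned max_cycles)
{
    assert(poll_interval > 0);
    target_t t = { false, start_count };
    uint8_t prev_ack = t.ack_counter;
    for (unsigned cycle = 0; cycle < max_cycles; cycle++) {
        target_tick(&t, cycle, done_cycle);
        if (cycle % poll_interval == 0 && t.ack_counter != prev_ack) {
            return true;
        }
    }
    return false;
}
```

With a completion at cycle 5 and a CPU that samples every 4 cycles, `cpu_sees_pulse(5, 4, 32)` reports a miss, while `cpu_sees_counter_change(0, 5, 4, 32)` catches the completion at the next sample. Comparing with `!=` rather than `>` also keeps the check valid when the 8-bit counter wraps from 255 to 0.
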
## 💡 Advantages of `ack_counter`

* ✅ Safe for CPU/software polling: no risk of missing an ack
* ✅ Works in mixed-clock or bus-based systems (e.g., APB, AXI-lite)
* ✅ Can carry additional context (e.g., sequence ID, error status)
* ✅ Can be extended to support out-of-order or buffered responses

### 📈 Timing Diagram – Real Example

The diagram below illustrates the `req/ack` protocol with a persistent `cpu_req` signal and a `cpu_ack_cnt` mechanism.

- `cpu_req` is held high until the CPU observes `cpu_ack_cnt` change (from 0 → 1)
- `module_ack` is a 1-cycle pulse triggered by internal completion
- `module_ack_data` is driven with the response (e.g., `"pass"`) during the ack cycle
- `cpu_data` holds the received value, and `cpu_ack_cnt` increments

![image](https://hackmd.io/_uploads/SycW8hM7ll.png)

- Wavedrom Style

```json
{signal: [
  {name: 'cpu_clk',         wave: 'P........', period:3},
  {name: 'module_ready',    wave: '01...|...', period:3},
  {name: 'cpu_req',         wave: '0.1..|..0', period:3},
  {name: 'cpu_req_cmd',     wave: 'x.6..|..x',data:["CMD"], period:3},
  {name: 'cpu_data',        wave: 'x....|5..',data:["pass"], period:3},
  {name: 'cpu_ack_cnt',     wave: '2....|5..',data:[0,1], period:3},
  {},
  {name: 'module_clk',      wave: 'P.............', period:2},
  {name: 'module_req',      wave: '0...1...0.....', period:2},
  {name: 'module_cmd',      wave: 'x...6...x.....',data:["CMD"], period:2},
  {name: 'module_ack',      wave: '0......10.....', period:2},
  {name: 'module_ack_data', wave: 'x......2x.....',data:["pass"], period:2},
]}
```

* NOTE:
    * The diagram uses two different clocks (`cpu_clk` and `module_clk`) to illustrate the clock-domain-crossing case.
    * This method makes it easy to isolate the two clock domains.
    * An interrupt can also solve this problem. The counter is simply another way to approach it.

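
The advantages list above mentions that the counter register can carry extra context such as an error status. The sketch below shows one way software might unpack such a register and count completions between two snapshots. The bit layout (counter in bits [7:0], error flag in bit 8, remaining bits reserved as zero) and the names `ack_status_t`, `decode_ack_status`, and `acks_since` are assumptions made up for this illustration, not part of the design described here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only:
 *   bits [7:0]  ack_counter
 *   bit  [8]    error flag for the most recent completion
 *   bits [31:9] reserved, assumed to read as zero */
#define ACK_COUNT_MASK 0xFFu
#define ACK_ERROR_BIT  (1u << 8)

typedef struct {
    uint8_t count;  /* current ack_counter value */
    bool    error;  /* true if the last completion reported an error */
} ack_status_t;

/* Split one raw register read into its fields. */
ack_status_t decode_ack_status(uint32_t raw)
{
    /* Sanity check on the assumed layout: reserved bits must be zero. */
    assert((raw & ~(ACK_COUNT_MASK | ACK_ERROR_BIT)) == 0);

    ack_status_t s;
    s.count = (uint8_t)(raw & ACK_COUNT_MASK);
    s.error = (raw & ACK_ERROR_BIT) != 0;
    return s;
}

/* Number of acks between two counter snapshots, modulo 256.
 * The subtraction is truncated to 8 bits, so a wrap from 255 to 0
 * is handled without a special case. */
uint8_t acks_since(uint8_t prev, uint8_t now)
{
    return (uint8_t)(now - prev);
}
```

For example, `acks_since(254, 1)` yields 3 even though the counter wrapped in between. The result is only meaningful if fewer than 256 completions happen between the two snapshots, since exactly 256 would look like 0.
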
## 🧠 Summary

By replacing a fragile 1-cycle ack pulse with a persistent `ack_counter`, we:

* Eliminate the risk of software missing the acknowledgment
* Make the handshake CPU-friendly and timing-agnostic
* Enable debug visibility into how many requests were acknowledged

This simple enhancement significantly improves the robustness and observability of your hardware-software interface.

---

#RTLdesign #HandshakeProtocol #CPUInterface #SoCIntegration #FPGA #CSRdesign #EmbeddedSystems #VerificationFriendly