EPF - Week 8 - HackMD

# rated-list-specs - part 1.1 ## Background Description: This specification describe the implementation of rated-list based on DHT. Some background is needed to understand this new proposal: - Chord / Kademlia DHTs - libp2p - discv5 Problem to solve: - Robustness issue: Every “hop” can be malicious and make a request fail - Scalability: How to manage millions of nodes? Rated-list Key Features: - Limited Key Space: Use an array structure rather than a hash table. - Node List: Maintain a list of nodes rather than a DHT for better control. - Discovery Path Tracking: Track discovery paths to identify and eliminate malicious subtrees. - P2P Tree Discovery: Discover nodes through a peer-to-peer tree structure up to a certain depth. ## Protocol illustration: ![Screenshot 2024-08-01 at 16.34.16](https://hackmd.io/_uploads/ryff6GKKC.png) ![Screenshot 2024-08-01 at 16.34.40](https://hackmd.io/_uploads/BJ9GpMtKA.png) ## Algorithm: Notation: - $\mathcal{N}$ : Set of all nodes in the network. - $\mathcal{P}_k$ : Set of peers at level $k$ in the P2P tree. - $\mathcal{L}(p)$ : List of peers known by peer $p$. - $\mathcal{C}(p)$ : Set of children of peer $p$. - $\Pi(p)$ : Set of parents of peer $p$. - $S(p)$ : Score of peer $p$. - $S_{th}$ : Threshold score for marking a node as defunct. - $\mathcal{I}$ : Node ID space. - $S$ : Set of sample numbers. - $s \in S$ : Sample. - $f(s)$ : Function mapping sample $s$ to a node ID. - $d$ : Radius within which peers are responsible for maintaining a sample. - $\operatorname{dist}(p, f(s))$ : Distance between peer $p$ and the mapped node ID $f(s)$. - $S_B(p)$: score of per block. Algorithm Description: 1. Initialization and List Construction: - Node List: Maintain a list $\mathcal{N}$ of all nodes in the network. - P2P Tree Formation: Distribute nodes into a P2P tree with maximum depth $T$. Start with a compiled list of all current peers as the root (Level 1). - Hierarchical Discovery: For each node in Level $k$ (where $1 \leq k<T$ ), query for their peers to form Level $k+1$. Ensure no peer reappears in higher levels if it has already appeared in a lower level. - Parent/Child Relationship Update: Maintain and update parent-child relationships: - For a peer $p \in \mathcal{P}_k$, where $\mathcal{P}_k$ is the set of peers at level $k$ : $$ \begin{aligned} & \mathcal{C}(p)=\left\{p^{\prime} \mid p^{\prime} \in \mathcal{P}_{k+1} \text { and } p \in \mathcal{L}\left(p^{\prime}\right)\right\} \\ & \Pi(p)=\left\{p^{\prime} \mid p^{\prime} \in \mathcal{P}_{k-1} \text { and } p \in \mathcal{L}\left(p^{\prime}\right)\right\} \end{aligned} $$ - Peer Removal: Remove peers with zero parents: $p$ is removed if $\Pi(p)=\emptyset$ 2. Sample Mapping and Maintenance: - Let $\mathcal{I}$ be the node ID space, and $\mathcal{S}$ be the set of sample numbers. - Define the mapping function $f: \mathcal{S} \rightarrow \mathcal{I}$. - For each sample $s \in \mathcal{S}, f(s)$ gives the corresponding node ID. - Each node downloads and custodies a minimum of `CUSTODY_REQUIREMENT` rows and `CUSTODY_REQUIREMENT` columns per slot. - A node utilizes `get_custody_columns` helper to determine which peer(s) it could request from, identifying a list of candidate peers for each selected column. And query for samples from selected peers via `DataColumnSidecarsByRoot `request. - All peers $p \in \mathcal{N}$ within distance $d$ of $f(s)$ are responsible for maintaining sample $s$ : $\forall p \in \mathcal{N}$, if $\operatorname{dist}(p, f(s)) \leq d$ then $p$ maintains $s$. CAN YOU PROVIDE ME THE EXACT FUNCTION? - Where $\operatorname{dist}(p, f(s))$ is the distance between peer $p$ and the mapped node ID $f(s)$ in the node ID space. 3. Node Status and rating: - Inactive State: Block that do not respond to sample requests are marked as inactive. - Active State: Block that respond positively to sample requests are marked as active. - Rating Calculation: The rating of a node is determined by the average rating of all its active descendant nodes (children, grandchildren, etc.): $$ S_B(p)=\frac{1}{|\mathcal{C}(p)|} \sum_{c \in \mathcal{C}(p)} S(c) $$ - Defunct Nodes: Nodes with ratings below a certain threshold $S_{t h}$ are marked as defunct: $p$ is defunct if $S(p)<S_{t h}$. All nodes thst are only descendents of defunct nodes are no longer contacted for sample requests for the current block. ## Python code: ``` import random import math class Node: def __init__(self, node_id): self.node_id = node_id self.peers = set() self.children = set() self.parents = set() self.active = False self.defunct = False self.score = 0 def __repr__(self): return f"Node({self.node_id}, Active={self.active}, Defunct={self.defunct}, Score={self.score})" class RatedListAlgorithm: def __init__(self, max_depth, sample_radius, score_threshold): self.nodes = {} self.max_depth = max_depth self.sample_radius = sample_radius self.score_threshold = score_threshold def add_node(self, node_id): if node_id not in self.nodes: self.nodes[node_id] = Node(node_id) def initialize_peers(self, peer_relationships): for node_id, peers in peer_relationships.items(): self.add_node(node_id) for peer_id in peers: self.add_node(peer_id) self.nodes[node_id].peers.add(peer_id) self.nodes[peer_id].peers.add(node_id) def build_p2p_tree(self): for depth in range(1, self.max_depth): current_level = {node_id for node_id in self.nodes if self.nodes[node_id].active is False} next_level = set() for node_id in current_level: for peer_id in self.nodes[node_id].peers: if peer_id not in current_level and peer_id not in next_level: self.nodes[node_id].children.add(peer_id) self.nodes[peer_id].parents.add(node_id) next_level.add(peer_id) if not next_level: break def query_peers(self): for node in self.nodes.values(): node.active = False node.children.clear() node.parents.clear() self.build_p2p_tree() to_remove = [node_id for node_id, node in self.nodes.items() if not node.parents] for node_id in to_remove: del self.nodes[node_id] def map_sample_to_node(self, sample): node_ids = list(self.nodes.keys()) return node_ids[sample % len(node_ids)] def assign_sample(self, sample): responsible_node = self.map_sample_to_node(sample) for node in self.nodes.values(): if self.calculate_distance(node.node_id, responsible_node) <= self.sample_radius: node.active = True def calculate_distance(self, node_id1, node_id2): return abs(node_id1 - node_id2) def update_node_status(self, sample_requests): for sample in sample_requests: self.assign_sample(sample) def calculate_node_scores(self): for node in self.nodes.values(): if node.active: node.score = self.calculate_average_score(node) node.defunct = node.score < self.score_threshold def calculate_average_score(self, node): if not node.children: return 1 if node.active else 0 return sum(self.nodes[child_id].score for child_id in node.children) / len(node.children) def run_algorithm(self, peer_relationships, sample_requests): self.initialize_peers(peer_relationships) self.query_peers() self.update_node_status(sample_requests) self.calculate_node_scores() for node in self.nodes.values(): print(node) ## Example usage peer_relationships = { 1: [2, 3, 4], 2: [5, 6], 3: [7, 8], 4: [9, 10], 5: [11, 12], 6: [13, 14], # Add more nodes and their peers as needed } sample_requests = [15, 20, 25, 30] # Example samples algorithm = RatedListAlgorithm(max_depth=3, sample_radius=5, score_threshold=0.5) algorithm.run_algorithm(peer_relationships, sample_requests) ``` Explanation: - `Node Class:` Represents a node in the network, holding its ID, peers, children, parents, active status, defunct status, and score. - `RatedListAlgorithm Class:` Implements the Rated List algorithm. - `add_node: `Adds a new node to the network. - `initialize_peers:` Initializes the peer relationships. - `build_p2p_tree:` Builds the P2P tree up to the specified depth. - `query_peers:` Periodically queries peers and updates parent/child relationships. - `map_sample_to_node:` Maps a sample to a node ID. - `assign_sample: `Assigns a sample to nodes within a certain radius. - `calculate_distance:` Calculates the distance between two node IDs. - `update_node_status:` Updates the status of nodes based on sample requests. - `calculate_node_scores:` Calculates the scores of nodes based on their active descendants. - `run_algorithm:` Runs the entire algorithm with given peer relationships and sample requests. Example Usage - `peer_relationships:` Dictionary representing the initial peer relationships. - `sample_requests:` List of sample requests. - `algorithm.run_algorithm:` Runs the algorithm with the provided data and prints the final state of each node.