KadRTT ver.2 - HackMD

# KadRTT ver.2 Author: Hidehiro Kanemitsu, Tokyo University of Technology, Japan. ## 1. Objective We realize a DHT-based routing scheme with underlay-aware lookup optimization for large scale overlay network, named KadRTT. The objectives of KadRTT are as follows: **1. Lookup optimization with the forwarding with RTT and overlay hop count in Kademlia.** - Coordinate each k-bucket in order that each ID is uniformly distributed not to bias a specific ID range to suppress redundant lookup hops. f - For each iterative lookup, a client tries to send request to the next hops according to increasing order of RTT if it satisfies the logarithmic condition: $\frac{d(p_i^{RTT}, CID)}{d(p_i^{kad}, CID)}<2$. **2. Joining time duration / popularity based replication priority for churn resilience.** As for the evaluation, we develop both simulations and a real-world module. Simulations are based on OverSim and Testground, and the real-world module is the one for go-libp2p. ---------- ## 2. Basics and assumption libp2p uses iterative routing in Kademlia, and then the total lookup time can be bounded as follows: $T_{query}^{kad} (p_x ,CID,D_{all} ) \le \sum\limits_{\scriptstyle i \le \log \left( {d(p_{r,0} ,CID,)} \right), \atop \scriptstyle p_x \in D_{all} } {t_{x,i}^{RTT} } ,$ where $p_x$ is a lookup client, CID is Content ID, $p_{r,0}$ is a peer in $r$-th entries of k-bucket on $p_x$ in the following figure. ![Figure 1: Iterative lookup in Kademlia, where $r$ is the maximum hop count and it is decremented by each iterative hop.](https://paper-attachments.dropbox.com/s_D8AC57935117C4198CFABB69F65219C8CF5C262701249CB1AC9D03E01EB2C0D0_1612753765226_image.png "Figure 1: Iterative lookup in Kademlia, where $r$ is the maximum hop count and it is decremented by each iterative hop.") For more detailed lookup, it is shown in the following figure, where CID hit with $r$-th k-bucket on the client peer. ![](https://i.imgur.com/Pnq2KRn.png) Since the actual hop count cannot be known, we estimate the upper bound and then the following condition must be satisfied for the lookup optimization in KadRTT. *$Max. \ hop\ \sharp \ in\ KadRTT = Max. \ hop\ \sharp \ in\ Kademlia$* ---------- ## 3. Condition for suppressing the max. hop count in KadRTT In KadRTT, at least the maximum hop cont must be equals to the one in Kademlia in order to suppress the total lookup time. Here, we formulate the difference of the maximum hop count. Let the client peer as $p_x$, and CID is the content ID. At the $i$-th iterative hop count, let $p_x$ determines the $s$-th k-bucket record is the nearest among all k-bucket records in the routing table. Thus, suppose $PeerID(p_x) + 2^s \leq CID < PeerID(p_x) + 2^{s+1}$, and $p_i^{kad}$ is the entry in $s$-th k-bucket record selected by Kademlia. $p_i^{RTT}$ is the one selected by KadRTT. Then we assume $PeerID(p_x) + 2^s \leq p_i^{kad}, p_i^{RTT} < PeerID(p_x) + 2^{s+1}$, and $d(p_i^{kad}, CID) \leq d(p_i^{RTT}, CID)$, because $p_i^{RTT}$ is the selected one by RTT, not by Key distance. Fig. 2 shows bounds on the difference in terms of the key distance among Kademlia and KadRTT. ![Figure2: DIfference in terms of key distance between Kademlia and KadRTT.](https://paper-attachments.dropbox.com/s_D8AC57935117C4198CFABB69F65219C8CF5C262701249CB1AC9D03E01EB2C0D0_1612847319551_image.png "Figure2: DIfference in terms of key distance between Kademlia and KadRTT.") From Fig. 2, bounds on difference in terms of the key distance can be formulated as follows: Condition for equality of the max. lookup hop count between Kademlia and KadRTT: $\log(d(p_i^{RTT}, CID)) - \log(d(p_i^{kad}, CID))=\log \frac{d(p_i^{RTT}, CID)}{d(p_i^{kad}, CID)}=0 \Leftrightarrow \frac{d(p_i^{RTT}, CID)}{d(p_i^{kad}, CID)}<2$. Thus, if the above condition is satisfied by selecting $p_i^{RTT}$, the maximum hop count is the same as the one in Kademlia. Furthermore, RTT for each iterative lookup is reduced by selecting $p_i^{RTT}$ for each $i$-th lookup. ## 4. k-bucket entry re-arrangement for small RTT and uniform ID distribution In Kademlia, a new entry is added during iterative lookup if the k-bucket is not full. That is, the original Kademlia accepts any IDs and it drops the new entry if the k-bucket is full. As a result, there are possibility that many IDs are located in the specific ID range in the k-bucket and if a CID is in “void” area, the lookup up hop count may takes higher. Thus, an idea for make IDs be distributed uniformly in order not to create void area is necessary. At the same time such IDs has small RTT with the peer. In KadRTT, if a new entry $p_{new}$ having the following condition, is swapped with the existing one $p_{old}$. $T_{rtt}(p_x, p_{new}) < T_{rtt}(p_x, p_{old})$, and $\sigma ^2 (b_s^x - \{ p_{old} \} + \{ p_{new} \} ) = \mathop {\min }\limits_{p_i \in b_s^x } \left\{ {\sigma ^2 (b_s^x - \{ p_i \} + \{ p_{new} \} )} \right\}.$ where $p_x$ is the lookup client and $p_{old} \in b_s^x$, $b_s^x$ is the $s$-th k-bucket at $p_x$. **$\sigma ^2$ is the variance in terms of the ID distance among adjuscent peers $d(p_{i}, p_{i+1})$, where $p_i, p_{i+1}\in b_s^x$ .This procedure is performed during each lookup process.** ![Figure 5: Example of ID re-arrangement in KadRTT. When k-bucket is full, this procedure is performed.](https://i.imgur.com/OIAZlg5.png "Figure 5: Example of ID re-arrangement in KadRTT. When k-bucket is full, this procedure is performed.") ## 5. Comparison results We conducted performance comparisons on OverSim in terms of: **1. Lookup hop count** **2. Lookup latency for success/failure.** As for the condition, we set those following parameters before execution. | Parameter | Value | | ------------------------------------------------------- | ---------------------------------------------- | | $\sharp$ of peers | 1000, 2000, 5000 | | Simulation time | 1000(sec) | | Churn model | Pareto churn | | $\alpha$, the number of peers asked in parallel per one lookup | 2, 5, 10, 15, 20 | | $k$, k-bucket size | 20, 30, 40 | | $b$, Network diameter ($O(\log 2^b)$) | 1 | | Lookup type | iterative lookup | | Application type | KBRTest (Key-based Routing), DHTTest (PUT/GET) | The comparison results are shown as follows: ### 5. 1 KBRTest case: ### 5.1.1 Lookup hop count mean with varying $\alpha$: - $k=20$. ![KBRTest/1000 peers](https://i.imgur.com/Beb5Vwx.png "KBRTest/1000 peers") ![KBRTest/2000 peers](https://i.imgur.com/0UBDjqV.png "KBRTest/2000 peers") ![KBRTest/3000 peers](https://i.imgur.com/MCkY5pP.png "KBRTest/3000 peers") ### 5.1.2 Lookup success latency mean with varying $\alpha$: ![KBRTest/1000 peers](https://paper-attachments.dropbox.com/s_0C44288645FCE0667A4052D6D2EA3386801DFEE2C2CD98D8C1F54CEE741E583E_1613959866898_image.png "KBRTest/1000 peers") ![KBRTest/2000 peers](https://i.imgur.com/NmPnc3r.png "KBRTest/2000 peers") ![KBRTest/3000 peers](https://i.imgur.com/mHc6Jg1.png "KBRTest/3000 peers") ### 5.1.3 Lookup total latency mean (success + failure) with varying $\alpha$: ![KBRTest/1000 peers](https://paper-attachments.dropbox.com/s_0C44288645FCE0667A4052D6D2EA3386801DFEE2C2CD98D8C1F54CEE741E583E_1613960055233_image.png "KBRTest/1000 peers") ![KBRTest/2000 peers](https://i.imgur.com/k45fIyb.png "KBRTest/2000 peers") ![KBRTest/3000 peers](https://i.imgur.com/blQ9YIm.png "KBRTest/3000 peers") ### 5.1.4 Lookup success latency mean with varying $k$ and $\alpha$: -- ${(k, \alpha)=(20,5), (30, 5), (40, 5),(50,5),(60,5)}$ **-- Optimal of ${(k, \alpha) = (30,5)}$ at KadRTT.** ![KBRTest/3000 peers](https://i.imgur.com/7hYntTd.png, "KBRTest/3000 peers") -- ${(k, \alpha)=(10,10),(20,10), (30, 10), (40, 10),(50,10),(60,10)}$ **-- Optimal of ${(k, \alpha) = (20,10)}$ at KadRTT.** ![KBRTest/3000 peers](https://i.imgur.com/O1alE81.png, "KBRTest/3000 peers") -- ${(k, \alpha)=(20,20), (30, 20), (40, 20),(50,20), (60,20)}$ **-- Optimal of ${(k, \alpha) = (30,20)}$ at KadRTT.** ![KBRTest/3000 peers](https://i.imgur.com/4PFmXAS.png, "KBRTest/3000 peers") ![](https://i.imgur.com/4XgDuCP.png) From those results by varying ${k}$, we can see that: * Lookup latencies of Kademlia decreases and approaches to that of KadRTT. * KadRTT takes the minimum at certain point of ${k}$, i.e., local minima. * if ${k}$ takes a large, the time taken to each k-bucket is full becomes larger, i.e., the contents of each k-bucket of KadRTT is similar with Kademlia if ${k}$ is larger ### 5.1.5 Lookup time with varying $t$(sec) (3000 peers, k=60, $\alpha=20$). ![KBRTest/3000 peers](https://i.imgur.com/ZaAvepb.png "KBRTest/3000 peers") - We can see that each ID is re-arranged in same space if $t$ is larger. - If $k$ is large, the time taken to complete the ID re-arrangement becomes larger. ### 5.1.6 Lookup success ratio with varying $\alpha$: ![KBRTest/1000 peers](https://paper-attachments.dropbox.com/s_0C44288645FCE0667A4052D6D2EA3386801DFEE2C2CD98D8C1F54CEE741E583E_1613960260479_image.png "KBRTest/1000 peers") ![KBRTest/2000 peers](https://i.imgur.com/Js68Kgs.png "KBRTest/2000 peers") ![KBRTest/3000 peers](https://i.imgur.com/fWAPnmp.png "KBRTest/3000 peers") ### 5.1.7 FindNode() message bytes/s with varying $\alpha$: ![KBRTest/3000 peers](https://i.imgur.com/GcqdhHb.png "KBRTest/3000 peers") ### 5.1.8 findNode message $\sharp$ with varying $\alpha$: ![KBRTest/3000 peers](https://i.imgur.com/PYeivm4.png "KBRTest/3000 peers") - Message ${\sharp}$ with varying $k$. ![](https://i.imgur.com/B32oZ7S.png) - If $\alpha$ is larger, the message $\sharp$ becomes larger. - If $k$ is larger with fixing $\alpha$ and $\beta$, the message $\sharp$ is smaller. From the results in $\sharp$ of peers = 1000 in KBRTest, $\alpha=5$ is optimal in terms of total lookup latency and hop count, though lookup success latency is not optimal on that value. Thus, lookup parallelism affects on the success ratio, not on the latency itself. ### 5.1.9 Lookup time/success ratio comparisons by varying lookup timeout. - The timeout is varied in the range of 1, 5, 10, 20, 30. Default timeout value in OverSim is 10. ![](https://i.imgur.com/Eokfg63.png) ![](https://i.imgur.com/nQQF6KE.png) ![](https://i.imgur.com/Es8gjAK.png) ![](https://i.imgur.com/DYuEY01.png) ![](https://i.imgur.com/JNYv7Zr.png) ![](https://i.imgur.com/ENa4QOI.png) From those obtained result, we can say: - In the case of low $\alpha$, lookup failure occurs if the time out is low. - Even if the timeout is low, a large the lookup is success if $\alpha$ is large. ### 5.1.10 Lookup time comparisons by varying ${\sharp}$ of next hops. $\sharp$ of next hops (${\beta}$) affects on: - ${\sharp}$ of messages including findNode req/response. - Lookup latency - Lookup hop count However, we can see that KadRTT outperforms Kademlia in terms of lookup hop count. Thus, we avaluate the first two poionts. ![](https://i.imgur.com/qWFBp4N.png) ![](https://i.imgur.com/Nw7kBuK.png) - From the obtained result, we can see: - increasing the number of next hops can decrease the looup latency, but more messages are exchanged. - If the number of next hops is a certain point or more, the lookup latency is almost the same (e.g., 5 or more). ### 5.1.11 Lookup time comparisons by varying routing table refresh frequency. ![](https://i.imgur.com/whM8u8Y.png) - From the obtained results, we can see: - Lookup hop increases if the reflesh time is larger, but the lookup latency has convex nature, i.e., a optimal time duration exists (in this case, "200" is optimal). - Both ${\sharp}$ of messages in total and findNode decreases if the time duration increases. ### 5. 2 DHT GET/PUT case: - GET latency with varying $\alpha$: ![DHTApp/1000 peers](https://paper-attachments.dropbox.com/s_0C44288645FCE0667A4052D6D2EA3386801DFEE2C2CD98D8C1F54CEE741E583E_1613969902339_image.png "DHTApp/1000 peers") ![DHTApp/2000 peers](https://i.imgur.com/U63rkZk.png "DHTApp/2000 peers") - GET success ratio with varying $\alpha$: ![DHTApp/1000 peers](https://paper-attachments.dropbox.com/s_0C44288645FCE0667A4052D6D2EA3386801DFEE2C2CD98D8C1F54CEE741E583E_1613970046027_image.png "DHTApp/1000 peers") ![DHTApp/2000 peers](https://i.imgur.com/a5FOrEa.png "DHTApp/2000 peers") - PUT latency with varying $\alpha$: ![DHTApp/1000 peers](https://paper-attachments.dropbox.com/s_0C44288645FCE0667A4052D6D2EA3386801DFEE2C2CD98D8C1F54CEE741E583E_1613970143113_image.png "DHTApp/1000 peers") ![DHTApp/2000 peers](https://i.imgur.com/vp2IQXB.png "DHTApp/2000 peers") From those obtained results, we can see that $\alpha$ affects on the lookup success rate rather than lookup latency. However, we may confirm the effect on the lookup latency by varying $\alpha$ if other factors are varied. # 6. Theoretical analysis ## 6.1 Bounds on lookup time Let the total lookup time per one CID lookup for Kademlia and KadRTT as $T_{LT}^{kad}$, $T_{LT}^{RTT}$, respectively. Then let define the mean RTT per one lookup RPC for Kademlia as $t_{ave}^{kad}$. Since the initial distance between CID and $p_i^r$ in $r$-th k-bucket is bounded by ${d(CID, p_i^r) \leq 2^r}$, the lookup hop count is bounded as follows: \begin{eqnarray} Hop\ \sharp \ for\ Kademlia \ \leq \log(\varepsilon 2^r). \end{eqnarray} Thus, the total lookup time for Kademlia is bounded as follows: \begin{eqnarray} T_{LT}^{kad} \leq t_{ave}^{kad}\log\left( \varepsilon 2^r\right)\leq \frac{k}{\alpha}t_{ave}^{kad}\log(\varepsilon 2^r) = \overline{T_{LT}^{kad}}, \end{eqnarray} where $\frac{k}{\alpha}$ is the maximum lookup index when all entries in $r$-th k-bucket are traced. ${\varepsilon}$ is the ratio of the minimiun ID distance to the k-bucket size, i.e., ${2^r}$. Next as for KadRTT, let define the RTT ratio to ${t_{ave}^{kad}}$ as ${0<\rho<1}$. The initial ID distance between CID and ${p_i^r}$ is bounded by ${\frac{2^r}{k}}$ with the uniform ID re-arrangement, then ${T_{LT}^{RTT}}$ is bounded by as follows: \begin{eqnarray} T_{LT}^{RTT} \leq \rho \times t_{ave}^{kad}\frac{k}{\alpha}\log\left(\frac{2^r}{k}\right) = \overline{T_{LT}^{RTT}}. \end{eqnarray} Thus, we have \begin{eqnarray} \overline{T_{LT}^{kad}} - \overline{T_{LT}^{RTT}} = \frac{t_{ave}^{kad}k}{\alpha}\{\log \varepsilon + (1-\rho)r + \rho \log{k}\}. \end{eqnarray} If the following condition is satisfied, we have ${\overline{T_{LT}^{kad}} - \overline{T_{LT}^{RTT}} >0}$. \begin{eqnarray} \log \varepsilon > (\rho-1)r - \rho \log k. \end{eqnarray} **Thus, the upper bound for the total lookup in KadRTT outperforms the one in Kademlia.** ## 6.2 Bounds on total message $\sharp$ for findNode(). ### 6.2.1 Total message $\sharp$ per one lookup iteration with pool entry exchange probability=1. At first, the client selects ${\alpha}$ closest peers from its own $r$-th k-bucket, and it sends findNode() to them. Then, $\alpha$ peers send findNode() response including ${\beta}$ next hops (${\beta}$ is "**resiliency**" in libp2p). Thus, in this one round trip, ${2\alpha}$ messages occurs. Each response has ${\beta}$ next hops and they are added into the pool if they have closer ID than existing ones. Let the size is $|PL|$ (typically, ${|PL|=k}$). Then each entris in the pool is asked with ${\alpha}$ concurrent queries. Thus, the actual amount of the pool is ${\min\{\alpha \beta , |PL|\}}$. The client tries to send findNode() to every peer in the pool, and at the same time it send STORE messages. Thus, in this phase, ${2\min\{\alpha \beta , |PL|\}}$ messages are required. Thus, in total, ${\sharp}$ of messages regading findNode() and STORE for **per one iteration** is: \begin{eqnarray} 2\left(\alpha + \sum_{i=1}^{\log\left(\frac{2^r}{k}\right)-1}{{\min\{\alpha \beta , |PL|\}}}\right) =2\left\{\alpha + \min\{\alpha \beta , |PL|\}\left(\log\left(\frac{2^r}{k}\right)-1\right)\right\}. \end{eqnarray} In particular, if ${\beta=|PL|=k}$, the above formula is: \begin{eqnarray} 2\left\{\alpha + k\left(\log\left(\frac{2^r}{k}\right)-1\right)\right\}. \end{eqnarray} The following plot is the ${\sharp}$ of messages with ${\beta=|PL|=k}$. ![${\sharp}$ of messages with ${\beta<k}$.](https://i.imgur.com/A7UvGlE.png) The following plot is the ${\sharp}$ of messages with ${\beta < k}$. ![${\sharp}$ of messages with](https://i.imgur.com/QY5FCMU.png) ### 6.2.2 Total message $\sharp$ per one lookup iteration with pool entry exchange probability<1 (General case). In this case, we define the probability $P$ meaning the exchange probability that a new next hop is inserted into the pool by it having shorter ID distance than the existing one in the pool. Since the total $\sharp$ of next hops returned from $\alpha$ is ${\alpha \beta}$, we can consider two cases: 1. Case of ${\alpha \beta > |PL|}$, where $k$ is the entry ${\sharp}$ of the pool: The expected value in in terms of ${\sharp}$ of exchange is: \begin{eqnarray*} &&N_E(\alpha, \beta, P)=\sum_{i=1}^{|PL|}{\left\{i\times {}_{\alpha \beta} C _iP^i(1-P)^{\alpha \beta -i}\right\}} . \end{eqnarray*} 2. Case of ${\alpha \beta \leq |PL|}$: Since ${\sharp}$ of entries in the pool is at most ${\alpha \beta}$, we have the expected value as \begin{eqnarray*} &&N_E(\alpha, \beta, P)=\sum_{i=1}^{\alpha \beta}{\left\{i\times {}_{\alpha \beta} C _iP^i(1-P)^{\alpha \beta -i}\right\}} . \end{eqnarray*} Thus, the total ${\sharp}$ of messages per one lookup iteration is bounded by \begin{eqnarray} 2\left\{\alpha + N_E(\alpha, \beta, P)\left(\log\left(\frac{2^r}{k}\right)-1\right)\right\}. \end{eqnarray} ## 6.3 Optimal $k$ and $\alpha$: ### 6.3.1 Bounds for Lookup time. ${T_{LT}^{RTT} \leq \rho t_{ave}^{kad}\frac{k}{\alpha}\log\left(\frac{2^r}{k}\right)=\overline{T_{LT}^{RTT}}}$ is an ideal lookup time when every ID in the $r$-th k-bucket is arranged such that each adjacent ID space is equal. However, from the result in Section 5.1.5, the time taken to become such an ideal state depends on $k$. That is, if $k$ takes larger, the time also becomes larger. Thus, effects on setting $k$ are as follows, where $t$ is the time when every ID in $r$-th k-bucket is arranged. * Initial state: If $k$ is larger, the time duration for the state each ID is not arranged becomes larger. * Ideal state: If $k$ is larger, ${\frac{2^r}{k}}$ converge to 0, i.e., as one-hop DHT. Here, we define the arrival rate per time unit for adding a new entry to $r$-th k-bucket as ${\lambda}$. Initially, a new peer obtaines $k$ peers from others(initial phase). Then, at least more $k$ arrivals are necessary for an ideal ID re-arrangement. if the arrivals occurs by Poisson distribution after the initial phase, the possibility of ${k}$ add k-bucket requests occurs during ${t}$ is defined by: \begin{eqnarray} P(x=k,t) = \frac{(\lambda t)^{k}}{(k)!}e^{-\lambda t} \end{eqnarray} Actually, the required possibility for ID re-arrangement is "**${k}$ or more k-bucket add requests have arrived during $t$**", that is derived by: \begin{eqnarray} P(x\geq k+1, t) = 1-\sum_{l=1}^{k}{P(x=l,t)}. \end{eqnarray} Thus, ${\overline{T_{LT}^{RTT}}}$ can be rewritten as follows: \begin{eqnarray} \overline{T_{LT}^{RTT}} &=& \rho t_{ave}^{kad}\frac{k}{\alpha}\log\left(\left(1-\sum_{l=1}^{k}{P(x=l,t)}\right)\frac{2^r}{k}+\epsilon 2^rP(x<=k,t) \right) \\ &=& \frac{k\rho t_{ave}^{kad}}{\alpha}\log\left(\left(1-\sum_{l=1}^{k}{\frac{(\lambda t)^{l}}{l!}e^{-\lambda t}}\right)\frac{2^r}{k}+\epsilon 2^r\sum_{l=1}^{k}{\frac{(\lambda t)^{l}}{l!}e^{-\lambda t}} \right) \\ &=& \frac{k\rho t_{ave}^{kad}}{\alpha}\log\left(\left(1-\int_{t}^{\infty}{\frac{(\lambda t)^k}{(k)!}e^{-\lambda t}dt}\right)\frac{2^r}{k}+\epsilon 2^r\int_{t}^{\infty}{\frac{(\lambda t)^k}{(k)!}e^{-\lambda t}dt} \right). \end{eqnarray} This formula is developed with *Incomplete gamma function* ${\gamma (k + 1,t)}$ and ${\Gamma (k + 1,t)}$ as follow: \begin{eqnarray} \overline{T_{LT}^{RTT}} &=& \frac{k\rho t_{ave}^{kad}}{\alpha}\log\left(\frac{2^r}{k}\int_{0}^{t}{\frac{(\lambda t)^k}{(k)!}e^{-\lambda t}dt}+\epsilon 2^r\int_{t}^{\infty}{\frac{(\lambda t)^k}{(k)!}e^{-\lambda t}dt} \right)\\ &=&\frac{k\rho t_{ave}^{kad}}{\alpha}\log\left(\frac{\lambda^{k}e^{-\lambda}2^r}{k!}\left(\frac{1}{k}\gamma(k+1,t)+\epsilon \Gamma(k+1,t)\right)\right)\\ &=&\frac{k\rho t_{ave}^{kad}}{\alpha}\log\left(\frac{\lambda^{k}e^{-\lambda}2^r}{k!}\left(\frac{1}{k}\left(k\gamma(k,t)-t^ke^{-t}\right)+\epsilon \left(k\Gamma(k,t)+t^k e^{-t}\right)\right)\right). \end{eqnarray} Here, if we set $\epsilon=1$ as a worst ID distance in Kademlia and ${t\rightarrow \infty}$ as the ideal state for KadRTT, we have \begin{eqnarray} \gamma(k,t) \rightarrow \Gamma(k), \Gamma(k,t) \rightarrow 0. \end{eqnarray} Thus, \begin{eqnarray} \overline{T_{LT}^{RTT}} &\rightarrow& \frac{k\rho t_{ave}^{kad}}{\alpha}\log\left(\frac{\lambda^{k}e^{-\lambda}2^r}{k!}\left(\Gamma(k)+\epsilon kt^ke^{-t} - \frac{1}{k}t^ke^{-t}\right)\right) \\ &=&\frac{k\rho t_{ave}^{kad}}{\alpha}\log\left(\frac{\lambda^{k}e^{-\lambda}2^r}{k!}\Gamma(k)\right) \\ &=&\frac{k\rho t_{ave}^{kad}}{\alpha}\log\left(\frac{\lambda^{k}e^{-\lambda}2^r}{k!}(k-1)!\right)\\ &=&\frac{k\rho t_{ave}^{kad}}{\alpha}\log\left(\frac{\lambda^{k}e^{-\lambda}2^r}{k}\right)\\ &=&T_{kadRTT}^{t\rightarrow \infty}(r). \end{eqnarray} In particular, since it is conceived that only one iteration is needed for the lookup, we can define the lookup time for only one iteration, i.e., \begin{eqnarray} T_{kadRTT}^{t\rightarrow \infty}(r) = \log\left(\frac{\lambda^{k}e^{-\lambda}2^r}{k}\right). \end{eqnarray} The following plot(x-axis is $k$) is the one of ${T_{kadRTT}^{t\rightarrow \infty}(r)}$. - Plot for the lookup time for all iteration (${\frac{k}{\alpha}}$ iteration.) ![](https://i.imgur.com/OGZVFoo.png) - Plot for the lookup time for one iteration (without ${\frac{k}{\alpha}}$ iteration.) ![](https://i.imgur.com/gDJOZla.png) $k$ is supposed to have two solutions. The one is 0, and the other is a positive value. Since the number of messages decreases with increasing $k$, the second one is the candidate of the optimal $k$. ### 6.3.2 Optimal value of $k$ with unknown value $\lambda$. The solution of ${T_{kadRTT}^{t\rightarrow \infty}=0}$ other than ${k=0}$ is, \begin{eqnarray} k_{opt}^\prime(r) &=& -\frac{W_n(-2^re^{-\lambda} \log_e\lambda)}{\log_e\lambda}\\ &\approx& -\frac{W(-2^re^{-\lambda} \log_e\lambda)}{\log_e\lambda}, \end{eqnarray} where ${W_n(x)}$ is *Lambert W function*. Since $k$ takes positive integer value and a larger $k$ leads to the reduction of messages, we set the optimal $k$, i.e. ${k_{opt}}$ is: \begin{eqnarray} k_{opt}(r) &=& \left\lceil-\frac{W(-2^re^{-\lambda} \log_e\lambda)}{\log_e\lambda}\right\rceil. \end{eqnarray}  ### 6.3.3 Total lookup time with expected iteration count Though the maximum ${\sharp}$ of ieteration is defeind as ${\frac{k}{\alpha}}$, typically such a case is rare. Thus, we derive the expected iteration count. If we define the expected value for the iteration count as ${C_E(\alpha, \beta, k)}$, the following formula is rewritten as: \begin{eqnarray} T_{kadRTT}^{t\rightarrow \infty}(r) &=& \rho t_{ave}\times C_E(\alpha, \beta, k)\log\left(\frac{\lambda^{k}e^{-\lambda}2^r}{k}\right). \end{eqnarray} We derive ${C_E(\alpha, \beta, k)}$ as follows: Let the propability for content hit with one query as ${P_1}$. For example, ${P_1}$ stands for the one that the ID distance becomes 0. At the pool, in total ${\frac{|PL|}{\alpha}}$ lookup iterations are performed, and for each iteration, ${\alpha \beta}$ next hops are returned without overlap. Since the condition that the process goes to the next iteration is all lookup fails when ${\alpha \beta}$ next hops are asked, its probability is: \begin{eqnarray} (1-P_1)^{\alpha \beta}. \end{eqnarray} That is, the content hit probability is ${1-(1-P_1)^{\alpha \beta}}$.Thus, probability for no content hit through all iteration in the pool is defined by: \begin{eqnarray} \overline {P_{pool}} = \left\{(1-P_1)^{\alpha \beta} \right\}^{\frac{|PL|}{\alpha}}=(1-P_1)^{\beta |PL|}. \end{eqnarray} In the k-bucket, the next iteration for the next ${\alpha}$ entries goes by ${\overline {P_{pool}}}$. Since the content hit probability for one iteration in the pool is ${1-\overline {P_{pool}}}$, the expected value for the iteration count is as follows: \begin{eqnarray} C_E(\alpha, \beta, k) &=& \sum_{i=1}^{\frac{k}{\alpha}}{i \times \overline {P_{pool}}^{i-1}(1-\overline {P_{pool}})}\notag \\ &=&\sum_{i=1}^{\frac{k}{\alpha}}{i\times (1-P_1)^{(i-1)\beta |PL|}\left\{1-(1-P_1)^{\beta|PL|}\right\}}\notag \\ &=&\sum_{i=1}^{\frac{k}{\alpha}}{i\times\left\{(1-P_1)^{(i-1)\beta |PL|}-(1-P_1)^{i\beta|PL|}\right\}}\notag \\ &=&\sum_{i=1}^{\frac{k}{\alpha}-1}{(1-P_1)^{(i-1)\beta |PL|}}-\frac{k}{\alpha}(1-P_1)^{\frac{k}{\alpha}\beta |PL|}\notag \\ &=& \frac{1-(1-P_1)^{\frac{k}{\alpha}\beta |PL|}}{1-(1-P_1)^{\beta |PL|}}-\frac{k}{\alpha}(1-P_1)^{\frac{k}{\alpha}\beta |PL|}. \end{eqnarray} Thus, the expected lookup time is defined as follows: \begin{eqnarray} T_{kadRTT}^{t\rightarrow \infty}(r) &=& \rho t_{ave}\times C_E(\alpha, \beta, k)\log\left(\frac{\lambda^{k}e^{-\lambda}2^r}{k}\right)\notag \\ &=&\rho t_{ave}\times \left\{\frac{1-(1-P_1)^{\frac{k}{\alpha}\beta |PL|}}{1-(1-P_1)^{\beta |PL|}}-\frac{k}{\alpha}(1-P_1)^{\frac{k}{\alpha}\beta |PL|}\right\}\log\left(\frac{\lambda^{k}e^{-\lambda}2^r}{k}\right). \end{eqnarray} As for ${P}$, i.e., the probability for entry exchange in the k-bucket can be derived as by considering two cases. The first one is that whether the new entry has shorter RTT than others in the k-bucket or not. The acceptance probability is ${\frac{1}{2}}$. The next one is that whether the variance by swithing the new entry with an exsisting one is reduced or not. Also the acceptance probability is ${\frac{1}{2}}$. Thus, we define ${P}$ as ${\frac{1}{4}\lambda_{msg}}$, where ${\lambda_{msg}}$ is the probability of findNode message arrival per time unit. For ${P_1}$, it is the probability of content hit per one lookup. Since the closest ID to CID may exist in the 1st k-bucket in the last lookuped peer, we have \begin{eqnarray} P_1 &=& \frac{1}{k_{opt}(1)}. \end{eqnarray} Thus, the actual expected lookup time is \begin{eqnarray} T_{kadRTT}^{t\rightarrow \infty}(r) &=&\rho t_{ave}\times \left\{\frac{1-\left(1-\frac{1}{k_{opt}(1)}\right)^{\frac{k}{\alpha}\beta |PL|}}{1-\left(1-\frac{1}{k_{opt}(1)}\right)^{\beta |PL|}}-\frac{k}{\alpha}\left(1-\frac{1}{k_{opt}(1)}\right)^{\frac{k}{\alpha}\beta |PL|}\right\}\log\left(\frac{(\frac{1}{4}\lambda_{msg})^{k}e^{-\frac{1}{4}\lambda_{msg}}2^r}{k}\right) \notag \\ %&=&\rho t_{ave}\times \left\{\frac{1-(1-P_1)^{\frac{k}{\alpha}\beta |PL|}}{1-(1-P_1)^{\beta |PL|}}-\frac{k}{\alpha}(1-P_1)^{\frac{k}{\alpha}\beta |PL|}\right\}\log\left(\frac{\left(\frac{\alpha \beta }{4|PL|^2}\right)^{k}e^{-\left( \frac{\alpha \beta }{4|PL|^2}\right)}2^r}{k}\right). \end{eqnarray} From the lookup hop count term, we derive the optimal ${k}$ as \begin{eqnarray} k_{opt}(r) &=& \left\lceil-\frac{W(-2^re^{-\frac{1}{4}\lambda_{msg}} \log_e{\frac{1}{4}\lambda_{msg}})}{\log_e{\frac{1}{4}\lambda_{msg}}}\right\rceil. \end{eqnarray} The condition for minimizing the number of iteration is defined by: \begin{eqnarray} 0< Ite(\alpha,\beta)=\left\{\frac{1-\left(1-\frac{1}{k_{opt}(1)}\right)^{\frac{k_{opt}(r)}{\alpha}\beta |PL|}}{1-\left(1-\frac{1}{k_{opt}(1)}\right)^{\beta |PL|}}-\frac{k_{opt}(r)}{\alpha}\left(1-\frac{1}{k_{opt}(1)}\right)^{\frac{k_{opt}(r)}{\alpha}\beta |PL|}\right\} \leq 1 \end{eqnarray} Thus, we can say that both ${\alpha}$ and ${\beta}$ depend on the k-bucket index, i.e., ${r}$. However, the pool is shared among every lookup procedures with the fixed buffer size. Thus, ${|PL|}$ should be a fixed value. Actually, ${|PL|}$ affects ths probability of the incomming next hop discardings. An incomming next hop is discarded if the same entry with it is already included in the pool, or no entry in the pool has a larger ID distance than it when the pool size is full. In ${Ite(\alpha,\beta)}$, its very hard to derive both ${\alpha}$ and ${\beta}$ directly. Thus, we approximate ${Ite(\alpha,\beta)}$ as follows: \begin{eqnarray} Ite(\alpha,\beta)&=&\left\{\frac{1-\left(1-\frac{1}{k_{opt}(1)}\right)^{\frac{k_{opt}(r)}{\alpha}\beta |PL|}}{1-\left(1-\frac{1}{k_{opt}(1)}\right)^{\beta |PL|}}-\frac{k_{opt}(r)}{\alpha}\left(1-\frac{1}{k_{opt}(1)}\right)^{\frac{k_{opt}(r)}{\alpha}\beta |PL|}\right\} \\ &=&\left\{1-\frac{k_{opt}(r)}{\alpha}\left(1-\frac{1}{k_{opt}(1)}\right)^{\frac{k_{opt}(r)}{\alpha}\beta |PL|}\right\}, \end{eqnarray} where we define the first term as follows: \begin{eqnarray} 1-\left(1-\frac{1}{k_{opt}(1)}\right)^{\frac{k_{opt}(r)}{\alpha}\beta |PL|}\approx 1, \\ 1-\left(1-\frac{1}{k_{opt}(1)}\right)^{\beta |PL|}\approx 1 \end{eqnarray} Then we obtain ${\alpha(r)}$ and ${\beta(r)}$ when ${Ite(\alpha, \beta)}$ is minimized as follows: \begin{eqnarray} \alpha(r) &=& |PL|\sqrt{\frac{2k_{opt}(r)\log_e{\left(1-\frac{1}{k_{opt}(1)}\right)}}{W\left(\frac{2}{k_{opt}(r)}|PL|^2\log_e{\left(1-\frac{1}{k_{opt}(1)}\right)}\right)}}, \\ \beta(r) &=& \sqrt{\frac{W\left(\frac{2}{k_{opt}(r)}|PL|^2\log_e{\left(1-\frac{1}{k_{opt}(1)}\right)}\right)}{2k_{opt}(r)\log_e{\left(1-\frac{1}{k_{opt}(1)}\right)}}}. \end{eqnarray} We can see that ${\alpha(r)}$, ${\beta(r)}$ and ${k(r)}$ depend on the k^bucket index ${r}$, and ${k(r)}$ requires the message arrival rate ${\lambda_{msg}}$. Initially, we set those initial values for each k-bucket, and then update according to ${\lambda_{msg}}$. ![](https://i.imgur.com/CyrlFZN.png) ### 7. General analysis for optimal values for ${\alpha}$ and ${\beta}$. ### 7.1 Iteration count model If we define the probability for content hit per one query as $P_{query}$, ${Ite(\alpha, \beta)}$ can be rewritten as follows: \begin{eqnarray} Ite(\alpha,\beta)&=&\left\{\frac{1-\left(1-P_{query}(r)\right)^{\frac{k_{opt}(r)}{\alpha}\beta |PL|}}{1-\left(1-P_{query}(r)\right)^{\beta |PL|}}-\frac{k_{opt}(r)}{\alpha}\left(1-P_{query}(r)\right)^{\frac{k_{opt}(r)}{\alpha}\beta |PL|}\right\} \\ \end{eqnarray} Next, we derive ${P_{query}}$ as follows. When a peer tries to put itself as the content provider in the network, it fetches ${k}$ nearest peers with the CID from k-bucket, and it send STORE messages to ${k}$ peers. Thus, each stored peer has the provider's ID in a specific k-bucket index for the CID. If the provider's ID is included in ${r}$-th k-bucket a next hop peer, the probability for the content hit is ${\frac{1}{k_{opt}(r)}}$ at the next hop peer, not the client. Though ${k_{opt}}$ depends on the findNode message arrival rate, we assume that the arrival rate is equal among peer for deriving the content hit probability for deriving the approximated content hit probability. During the lookup procedures, each step ${i}$ has the content hit probability as ${\frac{1}{k_{opt}(i)}}$. Thus, the actual content hit probability is \begin{eqnarray} P_{query}(r) = \sum_{i=0}^r{\prod_{k=1}^{i-1} \left(1-\frac{1}{k_{opt}(k)}\right)\frac{1}{k_{opt}(i)}}. \end{eqnarray} Next, we define ${\beta}$. Since the ${\beta}$, i.e., the ${\sharp}$ of next hops is determined at the next hop peer, not the client. If ${\beta}$ takes larger, the content hit probability in ${\alpha \beta}$ next hops increases. However, the ${\sharp}$ of messages regarding findNode increases. Since ${k_{opt}}$ regulates the number of each k-bucket size, we define ${\beta}$ as follows: \begin{eqnarray} \beta(r) = \min\left(k_{opt}(r), |PL| \right). \end{eqnarray} Since the closest next hops are included in the one k-bucket, ${\sharp}$ of next hops is bounded by the k-bucket size for achieving the higher content hit probability, while the size can be restricted by the pool size, i.e., ${\sharp}$ of newly put entries is bounded by ${|PL|}$. This is for not increasing ${\sharp}$ of findNode messages. Finally, we derive the optimal value for ${\alpha}$ as ${\alpha(r)}$. from ${Ite(\alpha, \beta)=1}$, we derive ${\alpha(r)}$ as follows: \begin{eqnarray} \alpha(r) = \frac{\beta(r)k_{opt}(r)|PL|\log_e(1-p_{query})}{W\left( -\frac{|PL|(1-p_{query})^{|PL|\beta(r)+\frac{|PL|\beta(r)}{1-(1-p_{query})^{|PL|\beta(r)}}}\beta(r)\log_e(1-p_{query})}{(1-p_{query})^{|PL|\beta(r)}-1}\right)-\frac{\beta(r)|PL|\log_e(1-p_{query})}{1-(1-p_{query})^{\beta(r)|PL|}}}. \end{eqnarray}

Syntax	Example	Reference
# Header	Header	基本排版
- Unordered List	Unordered List
1. Ordered List	Ordered List
- [ ] Todo List	Todo List
> Blockquote	Blockquote
Bold font	Bold font
Italics font	Italics font
~~Strikethrough~~	~~Strikethrough~~
19^th^	19^th
H~2~O	H₂O
++Inserted text++	Inserted text
==Marked text==	Marked text
[link text](https:// "title")	Link
![image alt](https:// "title")	Image
`Code`	`Code`	在筆記中貼入程式碼
```javascript var i = 0; ```	`var i = 0;`
:smile:		Emoji list
{%youtube youtube_id %}	Externals
$L^aT_eX$	L^aT_eX
:::info This is a alert area. :::	This is a alert area.