# Data Communication
NTNU Data Communication
##### [Back to Note Overview](https://reurl.cc/XXeYaE)
{%hackmd @sophie8909/pink_theme %}
###### tags: `NTNU` `CSIE` `選修` `Data Communication`
<!-- tag order: [school] [department, required/elective] or [program, program name (without the word 學程, e.g. 大師創業)] [course] -->
## Score
- Homework 45%
- First Exam 20%
- Final Exam 25%
- Participation 5%
- Attendance 5%

## Course
- Data communication as a network of flows
- Point-to-point communication vs. end-to-end communication
- ARQ at the data-link layer vs. Window flow control at the transport layer
- Error detection vs. error correction vs. retransmission
- Data retransmission and/or Passive replication
- Free-for-all multiaccess vs. Perfectly-scheduled multiaccess
- CAN, Aloha, TDMA and its extension to improve throughput
- Theory and models
- Data communication as a network of flows (a graph)
- Network flow algorithm, shortest-path algorithm, etc.
- Little's theorem, queueing theory, Markov chains, Poisson process
- Scheduling for timely data delivery and/or data-loss tolerance
- Protocols and systems
- CRC, ARQ, multiplexing, CAN, Aloha, TDMA and its extension
- Real-time communication
- Fault-tolerant communication
- flow controls, time synchronization
## OSI layers

## Error Detection and Correction
1. single-parity check
2. horizontal and vertical parity check
3. cyclic redundancy check (CRC)
4. Hamming error detection/correction
### Metrics to evaluate error detection design
1. the minimum distance of a code
2. the probability that a random bit string will be accepted as error-free
3. the burst-detecting capability
- the largest integer $B$ such that a code can detect all bursts of length $B$ or less
## Hamming Code and Error Correction/Detection
- a **parity check code (or linear code)** is a linear transformation from the string of data bits to the string of data bits and parity checks
- In general there can be $K$ data bits, and $L$ parity checks
- a **code word** is the result of applying the transformation of a given parity check code <br/>$\to$ a **code word** is the data bits plus parity checks for:
1. single parity check
2. horizontal & vertical parity check
3. cyclic redundancy check(CRC)
### 2.3.1 Single Parity Checks
- Example
- 1011000 $\to$ 10110001
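
A quick sketch of the even parity bit used in the example above:

```python
def add_parity(bits: str) -> str:
    """Append one parity bit so that the total number of 1s is even."""
    return bits + str(bits.count('1') % 2)

print(add_parity("1011000"))   # -> "10110001", matching the example
```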
### 2.3.2 Horizontal and Vertical Parity Checks
### 2.3.3 Parity Check Codes
### 2.3.4 Cyclic Redundancy Checks
1. represent a code word by $X(D)$.
$$
X(D)=S(D)\cdot D^L+C(D)
$$
- $S(D)$: its coefficients are the data bits
- $C(D)$: its coefficients are the parity checks
2. choose a generator polynomial $g(D)$ and compute $\frac{S(D)\cdot D^L}{g(D)}$; we have:
$$
S(D)\cdot D^L=g(D)\cdot z(D)+C'(D)
$$
3. use the remainder polynomial $C'(D)$ as parity checks
$$
\begin{align}
C(D)&=C'(D)\\
\implies X(D)&=S(D)\cdot D^L+C(D)\\
&=g(D)\cdot z(D)+C'(D)+C(D)\\
&=g(D)\cdot z(D)
\end{align}
$$
which means $g(D)$ divides $X(D)$ (the arithmetic is over GF(2), so $C'(D)+C(D)=0$).
4. send $X(D)$ to the receiver.
5. Now, use $e(D)$ to represent the errors introduced along the transmission path. Then the receiver will get $Y(D)=X(D)+e(D)$.
6. compute the remainder of $\frac{Y(D)}{g(D)}$. Since $g(D)$ divides $X(D)$, this remainder equals the remainder of $\frac{e(D)}{g(D)}$; if it is non-zero, we know an error occurred.
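
A minimal sketch of the encode/check steps above over GF(2). The generator $g(D)=D^3+D+1$ (bit string `1011`, so $L=3$) and the function names are assumptions for illustration, not from the lecture:

```python
def mod2_div(dividend: str, divisor: str) -> str:
    """Return the remainder of bit-string polynomial division over GF(2)."""
    bits = list(dividend)
    n = len(divisor)
    for i in range(len(bits) - n + 1):
        if bits[i] == '1':
            for j in range(n):
                bits[i + j] = str(int(bits[i + j]) ^ int(divisor[j]))
    return ''.join(bits[-(n - 1):])          # remainder has L = deg(g) bits

def crc_encode(data: str, g: str) -> str:
    """X(D) = S(D)*D^L + C(D): append the remainder of S(D)*D^L / g(D)."""
    L = len(g) - 1
    return data + mod2_div(data + '0' * L, g)

def crc_check(codeword: str, g: str) -> bool:
    """Receiver side: accept iff g(D) divides Y(D), i.e. the remainder is 0."""
    return set(mod2_div(codeword, g)) <= {'0'}

g = "1011"                      # g(D) = D^3 + D + 1 (assumed example)
x = crc_encode("1101011011", g)
print(x, crc_check(x, g))       # error-free code word passes
corrupted = '0' + x[1:]         # flip the first bit, so e(D) != 0
print(corrupted, crc_check(corrupted, g))   # detected: remainder is non-zero
```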
### Hamming Code
#### Definition
- Definition 1: ==**Hamming distance**==
- between two code words is the number of bit positions in which the two differ, denoted by $d(u,v)$ for code words $u$ and $v$.
- Definition 2: ==**Hamming Weight**==
- of a code word is the number of non-zero bits in it, denoted by $wt(u)$ for code word $u$.
- the Hamming weight of a **linear code** is **the minimum Hamming weight of any non-zero code word** in the linear code.
- Definition 3: ==**The Nearest-Neighbor Rule**==
- we perform error correction by converting the **received bit vector** into the code word that has **the smallest Hamming distance** to it.

#### Theorem
- If the Hamming weight of a linear code is at least $2t+1$, then the nearest-neighbor rule can correct any $t$ or fewer errors; alternatively, it can detect any $2t$ or fewer errors.
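
A small sketch of the nearest-neighbor rule on an assumed toy linear code whose minimum Hamming weight is $3$ (so $t=1$, and any single-bit error is corrected):

```python
def hamming_distance(u: str, v: str) -> int:
    """d(u, v): number of bit positions in which u and v differ."""
    return sum(a != b for a, b in zip(u, v))

def nearest_neighbor(received: str, codebook: list[str]) -> str:
    """Decode to the code word with the smallest Hamming distance."""
    return min(codebook, key=lambda c: hamming_distance(received, c))

# Toy linear code (closed under bitwise XOR) with minimum Hamming weight 3.
codebook = ["000000", "111000", "000111", "111111"]
received = "101000"                          # "111000" with one flipped bit
print(nearest_neighbor(received, codebook))  # -> "111000"
```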
## 2.4 ARQ: RETRANSMISSION STRATEGIES
ARQ: Automatic Repeat Request
- Model and definition, assuming Node A sends frames to Node B 
### An initial idea of Stop-and-Wait ARQ
- Node A sends a new packet only after it has received an acknowledgement (called an Ack) from Node B
- Node B sends an Ack if the received frame is error-free; otherwise, it sends a negative acknowledgement (called a Nak)
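
A rough sketch of this stop-and-wait behavior at Node A; `send_frame` and `wait_feedback` below are hypothetical placeholders standing in for the channel and the feedback path, not a real DLC API:

```python
import random

def send_frame(frame) -> None:
    """Hypothetical: push one frame onto the (unreliable) link."""
    pass

def wait_feedback(timeout: float = 1.0):
    """Hypothetical: return "ACK", "NAK", or None if the timer expires."""
    return random.choice(["ACK", "NAK", None])

def stop_and_wait_send(frames) -> None:
    for frame in frames:
        while True:
            send_frame(frame)            # (re)transmit the current frame
            feedback = wait_feedback()
            if feedback == "ACK":
                break                    # only now move on to the next frame
            # on a Nak or a timeout, retransmit the same frame

stop_and_wait_send(["frame0", "frame1", "frame2"])
```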
<!-- ## 3.1 INTRODUCTION
### 3.1.1 Multiplexing of Traffic on a Communication Link
## 3.2 QUEUEING MODELS-LITTLE'S THEOREM
### 3.2.1 Little's Theorem
### 3.2.2 Probabilistic Form of Little Theorem
### 3.2.3 Applications of Little's Theorem -->
## Queueing Model and Little's Theorem
### Link delay
- The **processing delay**: from the time the packet is correctly received at the head node of the link to the time the packet is assigned to an outgoing link queue for transmission. (In some systems, we must add to this delay some additional processing time at the DLC and physical layers.)
- The **queueing delay**: from the time the packet is assigned to a queue for transmission to the time it starts being transmitted. During this time, the packet waits while other packets in the transmission queue are transmitted.
- The **transmission delay**: between the times that the first and last bits of the packet are transmitted.
- The **propagation delay**: from the time the last bit is transmitted at the head node of the link to the time the last bit is received at the tail node. This is proportional to the physical distance between transmitter and receiver and can be relatively substantial, particularly for a satellite link or a very high-speed link.
### Standard Queueing Theory Nomenclature
1. The **first letter** indicates the nature of the arrival process \[e.g., $M$ stands for memoryless, which here means a Poisson process (i.e., exponentially distributed interarrival times), $G$ stands for a general distribution of interarrival times, $D$ stands for deterministic interarrival times].
2. The **second letter** indicates the nature of the probability distribution of the service times (e.g., $M$, $G$, and $D$ stand for exponential, general, and deterministic distributions, respectively). In all cases, successive interarrival times and service times are assumed to be statistically independent of each other.
3. The **last number** indicates the number of servers.
- the memoryless property of the exponential distribution:
$P\{x>s+t\mid x>s\} = P\{x>t\}$, where $x$ is the interarrival time
### Analyzing efficiency of the stop-and-wait
#### Way #1
- Let $T$ be the total time between the transmission of a packet and reception of its Ack
- If error free, $T=T_1=T_t+2T_d+T_f$
>$T_t$:packet transmission time
>$T_d$:delay in propagation and processing
>$T_f$:feedback transmission time

- With chances of error, $T=T_2=T_1+T_{to}(EX-1)$.
>$EX=q^{-1}$
>$EX$:the expected number of times a packet must be transmitted for a stop-and-wait system
>$q$:the probability that a packet is correctly received and acked on a given transmission
>$T_{to}$:length of a timer(duration before a time-out)
>$T_t$:packet transmission time
>$T_2$:total time between the transmission of a packet and reception of its **Ack**

- setting $T_{to}>T_{1}$, we have $T_2=T_1+T_{to}(q^{-1}-1)>q^{-1}T_1$.
$$
\begin{align}
T_{to}&>T_1\\
T_{to}(q^{-1}-1)&>(q^{-1}-1)T_1\\
T_1+T_{to}(q^{-1}-1)&>q^{-1}T_1
\end{align}
$$
- setting $T_{to}<T_{1}$ will cause additional redundancy, but then we have $T_2<q^{-1}T_1$, a way to bound the latency.
#### Way #2
- Define efficiency $E$ as the expected number of packets delivered to node $B$ per frame from $A$ to $B$
- Efficiency $E = \frac{1}{EX} = q = 1-p$
>$EX$:the expected number of transmissions before the packet is correctly received
>$p$:the probability of frame error
>$q$:the probability of no frame error (q=1-p)
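
A quick Monte Carlo check (with an assumed $q$) that the expected number of transmissions is $EX=q^{-1}$, which is exactly why the efficiency is $E=q$:

```python
# Each trial counts transmissions until one succeeds (probability q per try).
import random

q = 0.8            # assumed probability of a correctly received and acked frame
trials = 100_000

total = 0
for _ in range(trials):
    x = 1
    while random.random() > q:   # failure -> retransmit
        x += 1
    total += x

print("empirical E[X]:", total / trials, "  theoretical 1/q:", 1 / q)
```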

## 3.3 THE $M/M/1$ QUEUEING SYSTEM
### 3.3.1 Main Results
#### the Poisson Process
A stochastic process $\{A(t)|t > 0\}$ with rate $\lambda$ and
1. $A(t)$ = \# of arrivals in $[0,t]$
2. the numbers of arrivals occurring in disjoint time intervals are independent
3. the \# of arrivals in any interval of length $\tau$ is Poisson distributed with parameter $\lambda\tau$:
$$P\{A(t+\tau) - A(t)=n\}=e^{-\lambda\tau}\frac{(\lambda\tau)^n}{n!},\forall t,\tau > 0$$
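
A small simulation sketch (with an assumed $\lambda$ and $\tau$) checking property 3: exponential($\lambda$) interarrival times give a Poisson($\lambda\tau$) number of arrivals in an interval of length $\tau$:

```python
# Generate arrivals with exponential interarrival times and count how many fall
# in [0, tau]; compare the empirical statistics with the Poisson formula above.
import math
import random

lam, tau, trials = 2.0, 3.0, 100_000

counts = []
for _ in range(trials):
    t, n = 0.0, 0
    while True:
        t += random.expovariate(lam)   # exponential interarrival time
        if t > tau:
            break
        n += 1
    counts.append(n)

mean = sum(counts) / trials
p4_emp = counts.count(4) / trials
p4_theory = math.exp(-lam * tau) * (lam * tau) ** 4 / math.factorial(4)
print("empirical mean:", mean, "  lambda*tau:", lam * tau)
print("empirical P{A(tau)=4}:", p4_emp, "  theoretical:", p4_theory)
```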


$P_{ij} = P\{N_{k+1}= j\mid N_k = i\}$: the transition probabilities of the discrete-time chain $\{N_k\}$, where $N_k$ is the number of customers in the system at the $k$-th sampling instant
### 3.4.1 $M/M/m$: The m-Server Case
## 3.7 TIME REVERSIBILITY-[BURKE'S THEOREM](https://en.wikipedia.org/wiki/Burke%27s_theorem)
The analysis of the $M/M/1$, $M/M/m$, $M/M/\infty$, and $M/M/m/m$ systems was based on the equality of the steady-state frequency of transitions from $j$ to $j + 1$, that is, $p_jP_{j(j+1)}$, with the steady-state frequency of transitions from $j + 1$ to $j$, that is, $p_{j+1}P_{(j+1)j}$.
- [detailed balance equations](https://en.wikipedia.org/wiki/Detailed_balance)
- valid for any Markov chain with integer states in which transitions can occur only between neighboring states
- [birth-death processes](https://en.wikipedia.org/wiki/Birth%E2%80%93death_process)
- time reversibility
### Properties of the Reversed Chain:
1. The reversed chain is irreducible(不可約的), aperiodic(非週期性的), and has the same stationary distribution as the forward chain.
2.
3.
<!-- there is more on textbook p. 217 -->
### Burke's Theorem
- Consider an $M/M/1$, $M/M/m$, or $M/M/\infty$ system with arrival rate $\lambda$.
- Suppose that the system starts in steady-state.
- Then the following hold true:
1. The departure process is ***Poisson*** with rate $\lambda$.
2. At each time $t$, the number of customers in the system is independent of the sequence of departure times prior to $t$
- Proof in the textbook
---
##### [Binomial distribution](https://en.wikipedia.org/wiki/Binomial_distribution) (Bernoulli trials): $P\{S_n = k\} = \binom{n}{k}p^k(1-p)^{n-k}$
##### [Poisson distribution](https://en.wikipedia.org/wiki/Poisson_distribution)
*[ARQ]: Automatic Repeat Request
*[irreducible]:不可約的
*[aperiodic]:非週期性的
## 4.1
## 4.2 Slotted Multiaccess and the Aloha System

Steady-state balance equations for a five-state Markov chain (states $0$–$4$, where downward transitions go only to the neighboring state and $P_{01}=0$), together with the equivalent cut equations obtained by summing them cumulatively:
$$
\begin{cases}
P_0(P_{02}+P_{03}+P_{04})&=P_1(P_{10})\\
P_1(P_{10}+P_{12}+P_{13}+P_{14})&=P_2(P_{21}) \\
P_2(P_{21}+P_{23}+P_{24})&=P_0(P_{02})+P_1(P_{12})+P_3(P_{32})\\
P_3(P_{32}+P_{34})&=P_0(P_{03})+P_1(P_{13})+P_2(P_{23})+P_4(P_{43})\\
P_4(P_{43})&=P_0(P_{04})+P_1(P_{14})+P_2(P_{24})+P_3(P_{34})\\
P_0+P_1+P_2+P_3+P_4&=1
\end{cases}
$$
$$
\implies\begin{cases}
P_1(P_{10})&=P_0(P_{02}+P_{03}+P_{04})\\
P_2(P_{21})&=P_0(P_{02}+P_{03}+P_{04})+P_1(P_{12}+P_{13}+P_{14})\\
P_3(P_{32})&=P_0(P_{03}+P_{04})+P_1(P_{13}+P_{14})+P_2(P_{23}+P_{24})\\
P_4(P_{43})&=P_0(P_{04})+P_1(P_{14})+P_2(P_{24})+P_3(P_{34})\\
P_0+P_1+P_2+P_3+P_4&=1
\end{cases}
$$
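
Given numerical values for the transition probabilities $P_{ij}$, the steady-state probabilities $P_0,\dots,P_4$ can be solved directly; the matrix below is a made-up example that only respects the structure above (each state moves down by at most one, and $P_{01}=0$):

```python
# A sketch of numerically solving for the steady-state distribution P_0..P_4.
# The transition matrix below is hypothetical, chosen only to match the chain's
# structure in the notes.
import numpy as np

P = np.array([
    [0.70, 0.00, 0.15, 0.10, 0.05],
    [0.20, 0.50, 0.15, 0.10, 0.05],
    [0.00, 0.25, 0.50, 0.15, 0.10],
    [0.00, 0.00, 0.30, 0.50, 0.20],
    [0.00, 0.00, 0.00, 0.40, 0.60],
])

# Stack the balance equations pi (P - I) = 0 with the normalization sum(pi) = 1
# and solve the (overdetermined) linear system by least squares.
n = P.shape[0]
A = np.vstack([P.T - np.eye(n), np.ones(n)])
b = np.append(np.zeros(n), 1.0)
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

print(pi)        # steady-state probabilities P_0..P_4
print(pi @ P)    # should equal pi (global balance check)
```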
## 1207
### TDMA Scheduling
- Collision-free TDMA schedule
- Primary interference
- Secondary interference
### Graph Theory
- Terminologies
- $d(v)$ : degree of node $v$
- $T_i$ : a set of assigned time slots for node $i$ in a given cycle
- $|T_i|$ : # of time slots assigned to node $i$ in a given cycle

- Definitions
1. **Saturation Condition**
node $i$ is said to be "saturated" if every link from node $i$ to a neighboring node $v$ has been assigned at least one time slot.
2. **Link Scheduling Criteria**
$T_u\cap T_i\cap T_j=\emptyset,\forall\text{ node }u\text{ and }i,j\in d(u),i\neq j$
- distributed algorithm
- M. Luby's algorithm
- decides a set of independent nodes (an independent set) for each color
1. every unfinished node generates a random number
2. start from any unfinished node
3. each node compares its own number with those of its neighbors
3.1 if its number is the smallest among them, color it (it is an independent node)
3.2 if not, do nothing
4. apply step 3 to every unfinished node
5. once a color is finished, repeat steps 2–4 until every node is colored
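
A minimal sketch of the coloring procedure above; the adjacency list at the bottom is an assumed example graph:

```python
# Luby-style coloring sketch: repeatedly draw random numbers, let every node
# whose number is the smallest among its unfinished neighbors join the current
# independent set (color), remove those nodes, and continue with a new color.
import random

def luby_coloring(adj: dict[int, set[int]]) -> dict[int, int]:
    """adj maps each node to its set of neighbors; returns node -> color."""
    color: dict[int, int] = {}
    remaining = set(adj)
    current_color = 0
    while remaining:
        rnd = {v: random.random() for v in remaining}        # step 1
        independent = {                                       # steps 3-4
            v for v in remaining
            if all(rnd[v] < rnd[u] for u in adj[v] if u in remaining)
        }
        for v in independent:                                 # step 3.1
            color[v] = current_color
        remaining -= independent
        current_color += 1                                    # step 5: next color
    return color

# Assumed example: a 5-node path graph 0-1-2-3-4
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(luby_coloring(adj))
```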
## 1214
### Industrial Internet of Things (IIoT)
- Synergizing sensing, analytics, and control
- Cloud
#### Low-latency → Real-time
- Conceptually, in data communication, **fast enough is good enough**
- Deadline: a way to specify what we meant by fast enough
- e.g., the reaction time of a self-driving car
- Soft deadlines vs. hard deadlines
- Missing a soft deadline is not desirable but may be acceptable
- e.g., live video
- Missing a hard deadline will lead to disastrous consequences
- Soft/hard real-time system: a system that meets soft/hard deadlines
#### Real-time data communications
- For networked applications,
#### Data communication intermediaries
- Purposes
- Decoupling senders and receivers
- Simplifying
- Example intermediaries: TAO, MQTT, NSQ, ...
#### A model for event processing
- latency requirements
- Temporal semantics:
- Absolute time consistency on an event's elapsed time since creation
- Relative time consistency on the difference between events' creation times
#### Real-time cyber
## 1221
- Real-time Fault-Tolerant Data Communication
- 
- 
- 
- 
- ARQ
- MQTT→Mosquitto
- QoS 0: at-most-once semantics
- QoS 1: at-least-once semantics
- QoS 2: exactly-once semantics
- Different layers deal with different things (?)
- strategy 1: Active data replication
- send to all replicas and reach consensus at the end
- state-machine approach
- pros: low latency
- cons: high cost
- strategy 2: Passive data replication
- primary-backup approach
- send to the primary; the primary forwards to the other backups
- when the primary goes out of service, a backup takes over as the new primary
- pros: low cost
- cons: high latency
- *lemma 1*
- $R\leq(N+L)T-\Delta_{PB} - \Delta_{BB} - X$ is the least upper bound of the deadline for replicating the data to the backup server.
- $\Delta_{PB}$: delay from the publisher to the primary
- $\Delta_{BB}$: delay from the primary to the backup
## 1228
### Network Flow Control
textbook 6.1~6.2.1, 6.3
#### Motivation
- To keep the data injected into the network within what the network's capacity can handle
- maximum flow, minimum cut, etc.
#### Challenge

- service requirements
1. fault tolerance (preventing data loss)
2. timing delay / data rate (e.g., FPS)
#### Objectives
1. limiting delay and buffer overflow
2. fairness
   1. fair in terms of rate
      - $n+1$ flows each having a rate of $0.5$
      - overall throughput $=\frac{n+1}{2}$
   2. fair in terms of resource usage
      - $\frac{n}{n+1}$ capacity usage
#### Approaches
1. window flow control

- $W_{AB}$ : **an upper bound** on the # of data units that have been sent by $A$ and are not yet known by $A$ (the data sender) to have been received by $B$ (the data receiver).
- $k$ : upon receiving request number $k$, the sender may send data up to sequence number $k+w-1$

- Delay analysis

- $N = T \cdot \lambda$
- $\rho \cdot W = T \cdot \lambda$, where $0 \leq \rho \leq 1$
- if $W$ increases, $T$ increases
- case 1 : $d<w\cdot x\implies$ data rate $={1\over x}$
- 
- here $w$ does not matter, because the sender keeps transmitting continuously; one packet is delivered every $x$ seconds, so the data rate is ${1\over x}$
- case 2 : $d>w\cdot x\implies$ data rate $={w\over d}$
- 
- every $d$ seconds, $w$ packets are sent/received
3. rate control
- type 1:
- $r$ : data rate (the minimum of the two limits), $r=\min\{{1\over x},{w\over d}\}$
- type 2 (see 0104)
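
A tiny numeric sketch of the two regimes above, with assumed values of $x$, $w$, and $d$:

```python
# r = min(1/x, w/d): x = packet transmission time, w = window size,
# d = round-trip delay (values below are assumed for illustration).
def window_rate(x: float, w: int, d: float) -> float:
    return min(1.0 / x, w / d)

print(window_rate(x=1.0, w=8, d=4.0))  # d < w*x -> limited by 1/x = 1.0
print(window_rate(x=1.0, w=2, d=4.0))  # d > w*x -> limited by w/d = 0.5
```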
## 0104
### Leaky Bucket Flow Control
- Time Synchronization Protocols
- use Markov chain
- 

- state $i$ : $w-i$ permits available if $i\leq w$; $i-w$ packets pending otherwise.
- $a_k$: the probability of $k$ packet arrivals during one permit-generation interval of ${1\over r}$ seconds (arrivals are Poisson with rate $\lambda$):
$$
a_k=\frac{e^{-\frac{\lambda}{r}}({\lambda\over r})^k}{k!}
$$
- special feature : the state can decrease by at most $1$ per step, because only one new permit becomes available within one time step
- $$
\begin{align}
&P_{0i}=\begin{cases}
a_{i+1}&,\text{if }i\geq 1\\
a_0+a_1&,\text{if }i=0
\end{cases}\\
\text{for }j\geq 1,&P_{ji}=\begin{cases}
a_{i-j+1}&,\text{if }j\leq i+1\\
0&,\text{otherwise}
\end{cases}
\end{align}
$$
- 
- average delay : 
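
A sketch (with assumed $\lambda$, $r$, $w$, and a truncated state space) that builds the transition matrix from the $a_k$ above, solves for the stationary distribution, and estimates the average number of pending packets and their average wait via Little's theorem:

```python
# Leaky-bucket chain sketch: states follow the notes (state i has w-i permits
# available if i <= w, and i-w packets pending otherwise); the chain is
# truncated at state M so it can be solved numerically.
import math
import numpy as np

lam, r, w = 0.6, 1.0, 3      # assumed arrival rate, permit rate, bucket size
M = 60                        # truncation level of the (infinite) chain

def a(k: int) -> float:
    """Probability of k Poisson(lambda) arrivals in one permit interval 1/r."""
    mu = lam / r
    return math.exp(-mu) * mu**k / math.factorial(k)

P = np.zeros((M + 1, M + 1))
for j in range(M + 1):
    for i in range(M + 1):
        if j == 0:
            P[0, i] = a(i + 1) if i >= 1 else a(0) + a(1)
        elif j <= i + 1:                 # i >= j - 1
            P[j, i] = a(i - j + 1)
P[:, M] += 1.0 - P.sum(axis=1)           # dump truncated tail mass into state M

# Solve pi = pi P together with sum(pi) = 1.
A = np.vstack([P.T - np.eye(M + 1), np.ones(M + 1)])
b = np.append(np.zeros(M + 1), 1.0)
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

avg_pending = sum(p * max(i - w, 0) for i, p in enumerate(pi))
print("average # of pending packets:", avg_pending)
print("average wait for a permit (Little, T = N/lambda):", avg_pending / lam)
```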
- NTP accuracy
- Conventionally, can achieve milliseconds accuracy
- With improvement, may achieve an accuracy up to tens of microseconds
- Primary time servers
- Synced to national standards by radio
- Accuracy: tens of microseconds
- Secondary time servers
- Synced to primary time servers
- Accuracy: a few hundred microseconds to a few milliseconds
- Clock strata
- NTP topology example
- Synchronizing clocks along the synchronization paths
- NTP synchronization basics
- Suppose that host A is going to synchronize its clock to that of host B
- Notation
- $\delta$ : one-way delay
- $\theta$ : offset between two clocks
- The mean offset between the two clocks is determined by a message exchange between A and B (see the timestamp sketch after this list)
- 
- Synchronization is performed by gradually reducing the mean offset
- PTP: precision time protocol
- Designed to achieve microsecond to sub-microsecond accuracy and precision
- Spatially localized
- Administration free
- Accessible for both high-end devices and low-cost, low-end devices
- 
- 
- 
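
As referenced earlier, a minimal sketch of the standard four-timestamp offset/delay computation used in NTP-style synchronization; the timestamp values are assumed:

```python
# theta = ((t2 - t1) + (t3 - t4)) / 2 estimates how far B's clock is ahead of
# A's, and delta = (t4 - t1) - (t3 - t2) is the round-trip delay, assuming the
# forward and reverse path delays are roughly symmetric.

def ntp_offset_delay(t1: float, t2: float, t3: float, t4: float) -> tuple[float, float]:
    """t1: A sends request, t2: B receives it, t3: B sends reply, t4: A receives it."""
    theta = ((t2 - t1) + (t3 - t4)) / 2.0
    delta = (t4 - t1) - (t3 - t2)
    return theta, delta

# Hypothetical timestamps in seconds (B's clock runs a few ms ahead of A's).
theta, delta = ntp_offset_delay(t1=100.000, t2=100.015, t3=100.016, t4=100.020)
print(f"offset ~= {theta*1000:.1f} ms, round-trip delay ~= {delta*1000:.1f} ms")
```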
## Final exam
NTP (p10)