###### tags: `Paper`
# Exploring the performance of ROS2: Paper Reading Notes
Topics
1. Exploring the performance of ROS2
2. ROS2 QoS
3. ROS2 intraprocess communication
- [ ] ROS2 middleware
- [ ] SLAM
## Exploring the performance of ROS2
source: https://ieeexplore.ieee.org/document/7743223
Test environment: (figure omitted)

Test configurations used in the paper: (figure omitted)
The test results are summarized below; the following cases cause transmission to fail:
ROS1:
- Whether remote or local, the first message published is lost.
ROS2:
- Connext:
    - transmitting more than 64 KB of data with the Reliable QoS policy **between different machines**
    - transmitting more than 64 KB of data with the Reliable QoS policy **on the same machine**
- FastRTPS:
> string of 256B exceeds the maximum length in FastRTPS

> These constraints on large data originate from the fact that the maximum payload of Connext and OpenSplice is 64 KB. This is the maximum packet size of IP protocol.
---
The following describes the latency measurement method used in the paper, which measures the delay from publish to receive.
Inside the talker (publisher), `clock_gettime()` records a timestamp. A socket is used alongside it: when the listener (subscriber) receives the message and its callback is invoked, it writes to the socket to tell the talker (publisher) that the message has arrived, completing one latency measurement.
> When writing to a socket, the data is not sent out immediately; there is a small delay in between. To reduce this delay, use `TCP_NODELAY` or `TCP_CORK`; see the [References](#references) below for usage.
talker_client_interprocess.cpp:
```cpp src/interprocess_remote_eval/src/talker_client_interprocess.cpp
clock_gettime(CLOCK_REALTIME, &tp1);   // timestamp taken right before publishing
...
chatter_pub->publish(msg);
len = read(sock, buf, sizeof(buf));    // block until the listener's acknowledgement arrives
...
clock_gettime(CLOCK_REALTIME, &tp2);   // timestamp taken after the acknowledgement
```
listener_server_interprocess.cpp:
```cpp
void chatterCallback(const std_msgs::msg::String::SharedPtr msg) {
  ...
  write(sock, "x", 1);  // notify the talker that the message has arrived
  ...
}
```
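For reference, a minimal sketch (not taken from the paper's code) of turning the two `timespec` samples `tp1`/`tp2` from the talker snippet into an elapsed time in nanoseconds:
```cpp
#include <cstdint>
#include <time.h>

// Elapsed time between two CLOCK_REALTIME samples, in nanoseconds (end - start).
static int64_t elapsed_ns(const struct timespec & start, const struct timespec & end)
{
  return (static_cast<int64_t>(end.tv_sec) - start.tv_sec) * 1000000000LL
         + (end.tv_nsec - start.tv_nsec);
}

// e.g. int64_t latency_ns = elapsed_ns(tp1, tp2);
```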
### Socket transmission performance
The paper uses `write()`, but according to the man page, `send()` with no flags set (i.e., a flags argument of 0) is equivalent to `write()`.
Example code: https://www.cs.cmu.edu/~srini/15-441/S10/lectures/r01-sockets.pdf
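As a minimal sketch (hypothetical `notify()` helper, assuming `sock` is an already connected TCP socket), the equivalence and the `TCP_NODELAY` hint from the note above look roughly like this:
```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

// Send the 1-byte acknowledgement without letting Nagle's algorithm delay it.
void notify(int sock)
{
  int one = 1;
  setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));  // disable Nagle's algorithm

  write(sock, "x", 1);      // what the listener's callback does
  // send(sock, "x", 1, 0); // equivalent: send() with flags == 0 behaves like write()
}
```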
### Tracing **`send()`**
First, look at how the `send()` system call is defined:
```c /net/socket.c
SYSCALL_DEFINE4(send, int, fd, void __user *, buff, size_t, len,
		unsigned int, flags)
{
	return __sys_sendto(fd, buff, len, flags, NULL, 0);
}
```
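The collapsed block below is a `function_graph` trace of this path. As a rough sketch (the tooling is an assumption; the `cpus=4` / `CPU n is empty` lines suggest the capture was post-processed, e.g. with trace-cmd), the tracer can also be driven directly through tracefs, assuming it is mounted at `/sys/kernel/debug/tracing` and the process runs as root:
```cpp
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper: write a single value into a tracefs control file.
static void tracefs_write(const std::string & file, const std::string & value)
{
  std::ofstream(std::string("/sys/kernel/debug/tracing/") + file) << value;
}

int main()
{
  tracefs_write("current_tracer", "function_graph");    // select the function_graph tracer
  tracefs_write("set_graph_function", "__sys_sendto");  // only graph calls entered via __sys_sendto
  tracefs_write("tracing_on", "1");                     // start recording

  // ... run the talker/listener pair here so send() gets called ...

  tracefs_write("tracing_on", "0");                     // stop recording
  std::ifstream trace("/sys/kernel/debug/tracing/trace");
  std::cout << trace.rdbuf();                           // dump the captured call graph
  return 0;
}
```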
<details><summary>ftrace function_graph results</summary>
<p>
```c
CPU 1 is empty
CPU 2 is empty
CPU 3 is empty
cpus=4
2.737 us | mutex_unlock();
| __sys_sendto() {
| sockfd_lookup_light() {
| __fdget() {
1.702 us | __fget_light();
3.576 us | }
5.803 us | }
| sock_sendmsg() {
| security_socket_sendmsg() {
| apparmor_socket_sendmsg() {
| aa_sock_msg_perm() {
| aa_sk_perm() {
| _cond_resched() {
0.594 us | rcu_all_qs();
1.874 us | }
3.892 us | }
5.481 us | }
6.790 us | }
8.235 us | }
| inet_sendmsg() {
| tcp_sendmsg() {
| lock_sock_nested() {
| _cond_resched() {
0.606 us | rcu_all_qs();
1.922 us | }
0.602 us | _raw_spin_lock_bh();
0.651 us | __local_bh_enable_ip();
5.690 us | }
| tcp_sendmsg_locked() {
0.714 us | tcp_rate_check_app_limited();
| tcp_send_mss() {
| tcp_current_mss() {
0.923 us | ipv4_mtu();
0.859 us | tcp_established_options();
4.635 us | }
6.224 us | }
| sk_stream_alloc_skb() {
| __alloc_skb() {
| kmem_cache_alloc_node() {
| _cond_resched() {
0.642 us | rcu_all_qs();
1.966 us | }
0.694 us | should_failslab();
0.923 us | memcg_kmem_put_cache();
7.649 us | }
| __kmalloc_reserve.isra.54() {
| __kmalloc_node_track_caller() {
0.803 us | kmalloc_slab();
| _cond_resched() {
0.618 us | rcu_all_qs();
1.922 us | }
0.747 us | should_failslab();
0.863 us | memcg_kmem_put_cache();
8.198 us | }
9.543 us | }
1.055 us | ksize();
+ 21.345 us | }
0.811 us | sk_forced_mem_schedule();
+ 24.788 us | }
| skb_entail() {
0.658 us | tcp_chrono_start();
| tcp_cwnd_restart() {
0.642 us | tcp_init_cwnd();
0.690 us | bictcp_cwnd_event();
4.399 us | }
7.573 us | }
0.694 us | skb_put();
| __check_object_size() {
0.743 us | check_stack_object();
1.011 us | __virt_addr_valid();
0.826 us | __check_heap_object();
5.815 us | }
0.694 us | tcp_tx_timestamp();
| tcp_push() {
| __tcp_push_pending_frames() {
| tcp_write_xmit() {
0.855 us | ktime_get();
0.806 us | tcp_small_queue_check.isra.32();
| __tcp_transmit_skb() {
| skb_clone() {
| __skb_clone() {
0.991 us | __copy_skb_header();
2.677 us | }
4.587 us | }
0.674 us | tcp_established_options();
0.718 us | skb_push();
| smp_irq_work_interrupt() {
| irq_enter() {
0.626 us | rcu_irq_enter();
1.962 us | }
| __wake_up() {
| __wake_up_common_lock() {
0.638 us | _raw_spin_lock_irqsave();
0.702 us | __wake_up_common();
0.626 us | _raw_spin_unlock_irqrestore();
5.012 us | }
6.340 us | }
| __wake_up() {
| __wake_up_common_lock() {
0.722 us | _raw_spin_lock_irqsave();
| __wake_up_common() {
| autoremove_wake_function() {
| default_wake_function() {
| try_to_wake_up() {
1.288 us | _raw_spin_lock_irqsave();
| select_task_rq_fair() {
0.779 us | available_idle_cpu();
0.863 us | update_cfs_rq_h_load();
| select_idle_sibling() {
0.899 us | available_idle_cpu();
2.279 us | }
7.729 us | }
0.750 us | _raw_spin_lock();
0.911 us | update_rq_clock();
| ttwu_do_activate() {
| activate_task() {
| psi_task_change() {
0.799 us | record_times();
0.770 us | record_times();
0.858 us | record_times();
0.791 us | record_times();
8.832 us | }
| enqueue_task_fair() {
| enqueue_entity() {
0.843 us | update_curr();
| __update_load_avg_se() {
0.762 us | __accumulate_pelt_segments();
2.733 us | }
0.770 us | __update_load_avg_cfs_rq();
0.679 us | update_cfs_group();
0.963 us | account_entity_enqueue();
0.975 us | __enqueue_entity();
+ 13.596 us | }
| enqueue_entity() {
0.734 us | update_curr();
0.730 us | __update_load_avg_se();
0.863 us | __update_load_avg_cfs_rq();
| update_cfs_group() {
1.344 us | reweight_entity();
3.163 us | }
0.939 us | account_entity_enqueue();
0.686 us | __enqueue_entity();
+ 12.504 us | }
0.642 us | hrtick_update();
+ 30.411 us | }
+ 41.908 us | }
| ttwu_do_wakeup() {
| check_preempt_curr() {
| resched_curr() {
| native_smp_send_reschedule() {
| x2apic_send_IPI() {
0.842 us | __x2apic_send_IPI_dest();
2.428 us | }
3.929 us | }
5.509 us | }
7.027 us | }
9.647 us | }
+ 53.654 us | }
0.590 us | ttwu_stat();
0.634 us | _raw_spin_unlock_irqrestore();
+ 72.635 us | }
+ 73.968 us | }
+ 75.420 us | }
+ 77.206 us | }
0.630 us | _raw_spin_unlock_irqrestore();
+ 81.054 us | }
+ 82.315 us | }
| irq_exit() {
0.767 us | idle_cpu();
0.698 us | rcu_irq_exit();
3.760 us | }
+ 99.871 us | }
| smp_irq_work_interrupt() {
| irq_enter() {
0.642 us | rcu_irq_enter();
2.006 us | }
| __wake_up() {
| __wake_up_common_lock() {
0.626 us | _raw_spin_lock_irqsave();
0.654 us | __wake_up_common();
0.706 us | _raw_spin_unlock_irqrestore();
4.788 us | }
6.119 us | }
| __wake_up() {
| __wake_up_common_lock() {
0.626 us | _raw_spin_lock_irqsave();
| __wake_up_common() {
| autoremove_wake_function() {
| default_wake_function() {
| try_to_wake_up() {
0.671 us | _raw_spin_lock_irqsave();
| select_task_rq_fair() {
0.703 us | available_idle_cpu();
0.682 us | update_cfs_rq_h_load();
| select_idle_sibling() {
0.814 us | available_idle_cpu();
2.058 us | }
6.637 us | }
0.634 us | _raw_spin_lock();
0.767 us | update_rq_clock();
| ttwu_do_activate() {
| activate_task() {
| psi_task_change() {
0.726 us | record_times();
0.682 us | record_times();
0.662 us | record_times();
0.795 us | record_times();
7.160 us | }
| enqueue_task_fair() {
| enqueue_entity() {
0.663 us | update_curr();
0.807 us | __update_load_avg_se();
0.762 us | __update_load_avg_cfs_rq();
0.642 us | update_cfs_group();
0.871 us | account_entity_enqueue();
0.863 us | __enqueue_entity();
9.944 us | }
| enqueue_entity() {
0.602 us | update_curr();
0.783 us | __update_load_avg_se();
0.746 us | __update_load_avg_cfs_rq();
| update_cfs_group() {
0.955 us | reweight_entity();
3.026 us | }
1.533 us | account_entity_enqueue();
0.786 us | __enqueue_entity();
+ 12.577 us | }
0.646 us | hrtick_update();
+ 26.076 us | }
+ 35.386 us | }
| ttwu_do_wakeup() {
| check_preempt_curr() {
| resched_curr() {
| native_smp_send_reschedule() {
| x2apic_send_IPI() {
0.827 us | __x2apic_send_IPI_dest();
2.066 us | }
3.342 us | }
5.145 us | }
6.505 us | }
8.531 us | }
+ 45.940 us | }
0.682 us | ttwu_stat();
0.678 us | _raw_spin_unlock_irqrestore();
+ 61.748 us | }
+ 63.156 us | }
+ 64.613 us | }
+ 66.130 us | }
1.156 us | _raw_spin_unlock_irqrestore();
+ 71.371 us | }
+ 72.647 us | }
| irq_exit() {
0.767 us | idle_cpu();
0.606 us | rcu_irq_exit();
3.696 us | }
+ 88.475 us | }
0.787 us | tcp_options_write();
0.827 us | __tcp_select_window();
| tcp_v4_send_check() {
0.667 us | __tcp_v4_send_check();
1.990 us | }
0.759 us | bictcp_cwnd_event();
| __ip_queue_xmit() {
| __sk_dst_check() {
0.670 us | ipv4_dst_check();
2.307 us | }
0.666 us | skb_push();
0.670 us | ip_copy_addrs();
| ip_local_out() {
| __ip_local_out() {
0.727 us | ip_send_check();
2.135 us | }
| ip_output() {
| ip_finish_output() {
0.951 us | __cgroup_bpf_run_filter_skb();
0.710 us | ipv4_mtu();
| ip_finish_output2() {
| dev_queue_xmit() {
| __dev_queue_xmit() {
| netdev_pick_tx() {
| ieee80211_netdev_select_queue() {
| ieee80211_select_queue() {
1.104 us | sta_info_get();
0.955 us | sta_info_get();
0.951 us | cfg80211_classify8021d();
0.850 us | ieee80211_downgrade_queue();
8.154 us | }
+ 10.008 us | }
+ 12.059 us | }
0.598 us | _raw_spin_lock();
| sch_direct_xmit() {
| validate_xmit_skb_list() {
| validate_xmit_skb() {
| netif_skb_features() {
0.658 us | skb_network_protocol();
2.741 us | }
| skb_csum_hwoffload_help() {
| skb_checksum_help() {
| skb_checksum() {
| __skb_checksum() {
| csum_partial() {
0.887 us | do_csum();
2.291 us | }
3.848 us | }
5.100 us | }
6.826 us | }
8.383 us | }
0.642 us | validate_xmit_xfrm();
+ 15.020 us | }
+ 16.337 us | }
0.706 us | _raw_spin_lock();
| dev_hard_start_xmit() {
| dev_queue_xmit_nit() {
| skb_clone() {
| kmem_cache_alloc() {
0.759 us | should_failslab();
0.722 us | memcg_kmem_put_cache();
4.487 us | }
| __skb_clone() {
0.742 us | __copy_skb_header();
2.272 us | }
8.948 us | }
0.983 us | ktime_get_with_offset();
| packet_rcv() {
0.786 us | skb_push();
0.754 us | consume_skb();
5.153 us | }
| packet_rcv() {
0.810 us | skb_push();
| consume_skb() {
| skb_release_all() {
0.719 us | skb_release_head_state();
0.710 us | skb_release_data();
4.771 us | }
| kfree_skbmem() {
1.091 us | kmem_cache_free();
2.540 us | }
+ 10.310 us | }
+ 14.467 us | }
+ 33.978 us | }
| ieee80211_subif_start_xmit() {
| __ieee80211_subif_start_xmit() {
| ieee80211_lookup_ra_sta() {
0.883 us | sta_info_get();
2.701 us | }
| ieee80211_skb_resize() {
| pskb_expand_head() {
| __kmalloc_reserve.isra.54() {
| __kmalloc_node_track_caller() {
0.710 us | kmalloc_slab();
0.746 us | should_failslab();
0.638 us | memcg_kmem_put_cache();
5.000 us | }
6.489 us | }
0.815 us | ksize();
0.822 us | skb_release_data();
0.827 us | skb_headers_offset_update();
+ 13.849 us | }
+ 15.571 us | }
0.694 us | skb_push();
| ieee80211_tx_h_rate_ctrl() {
| rate_control_get_rate() {
0.762 us | _raw_spin_lock_bh();
| rs_get_rate() {
0.858 us | __iwl_dbg();
0.907 us | rate_control_send_low();
5.438 us | }
| _raw_spin_unlock_bh() {
0.746 us | __local_bh_enable_ip();
2.259 us | }
| ieee80211_get_tx_rates() {
1.509 us | rate_control_cap_mask();
3.973 us | }
+ 17.669 us | }
+ 20.021 us | }
1.248 us | ieee80211_queue_skb();
1.260 us | ieee80211_xmit_fast_finish();
| ieee80211_tx_frags() {
0.658 us | _raw_spin_lock_irqsave();
0.687 us | _raw_spin_unlock_irqrestore();
| iwlagn_mac_tx() {
| iwlagn_tx_skb() {
0.963 us | ieee80211_hdrlen();
| kmem_cache_alloc() {
0.823 us | should_failslab();
0.751 us | memcg_kmem_put_cache();
4.443 us | }
0.827 us | _raw_spin_lock();
0.666 us | __iwl_dbg();
| iwl_trans_pcie_tx() {
0.778 us | ieee80211_hdrlen();
0.919 us | _raw_spin_lock();
0.915 us | iwl_queue_space();
1.485 us | iwl_pcie_txq_build_tfd();
0.803 us | dma_direct_map_page();
0.686 us | iwl_pcie_txq_build_tfd();
| iwl_fill_data_tbs() {
0.714 us | dma_direct_map_page();
0.683 us | iwl_pcie_txq_build_tfd();
3.604 us | }
0.658 us | __iwl_dbg();
| iwl_trans_ref() {
0.706 us | iwl_trans_pcie_ref();
2.621 us | }
| iwl_pcie_txq_inc_wr_ptr() {
0.711 us | __iwl_dbg();
| iwl_write32() {
0.955 us | iwl_trans_pcie_write32();
2.472 us | }
5.405 us | }
+ 28.288 us | }
+ 43.746 us | }
+ 45.499 us | }
+ 50.239 us | }
+ 99.907 us | }
! 101.480 us | }
! 138.440 us | }
0.750 us | _raw_spin_lock();
! 160.471 us | }
| __qdisc_run() {
0.787 us | fq_codel_dequeue();
3.074 us | }
0.799 us | __local_bh_enable_ip();
! 183.053 us | }
! 184.778 us | }
0.670 us | __local_bh_enable_ip();
! 189.951 us | }
! 195.068 us | }
! 196.781 us | }
! 201.207 us | }
! 209.579 us | }
0.843 us | tcp_update_skb_after_send();
0.642 us | tcp_rate_skb_sent();
! 430.763 us | }
| tcp_event_new_data_sent() {
0.960 us | tcp_rbtree_insert();
| tcp_rearm_rto() {
| sk_reset_timer() {
| mod_timer() {
| lock_timer_base() {
0.791 us | _raw_spin_lock_irqsave();
2.280 us | }
0.991 us | detach_if_pending();
0.834 us | get_nohz_timer_target();
0.944 us | _raw_spin_lock();
| __internal_add_timer() {
0.859 us | calc_wheel_index();
2.565 us | }
0.814 us | trigger_dyntick_cpu.isra.36();
0.879 us | _raw_spin_unlock_irqrestore();
+ 16.393 us | }
+ 18.063 us | }
+ 20.121 us | }
+ 23.793 us | }
0.899 us | tcp_chrono_stop();
| tcp_schedule_loss_probe() {
0.798 us | __usecs_to_jiffies();
0.847 us | jiffies_to_usecs();
0.779 us | __usecs_to_jiffies();
| sk_reset_timer() {
0.858 us | mod_timer();
2.874 us | }
9.824 us | }
! 474.552 us | }
! 476.439 us | }
! 478.152 us | }
0.735 us | sock_zerocopy_put();
! 537.496 us | }
| release_sock() {
0.915 us | _raw_spin_lock_bh();
0.867 us | tcp_release_cb();
| _raw_spin_unlock_bh() {
0.803 us | __local_bh_enable_ip();
2.388 us | }
7.842 us | }
! 554.118 us | }
! 556.097 us | }
! 567.233 us | }
! 581.267 us | }
```
</p>
</details>
---
### ROS1, ROS2, ROS bridge (remote, local)
In the local cases the performance differences are small, and transmission within a single machine has a constant time cost. In the remote cases latency rises sharply from 16 KB.
| Fig 7: Medians of end-to-end latencies with small data in remote and local cases. (256B~64K) | Fig 8: Medians of end-to-end latencies with large data in remote and local cases. (64K~1M) |
| -------- | -------- |
|  |  |
> the behavior of all latencies is constant up to 4 KB. In contrast, the latencies in the remote cases grow sharply from 16 KB, as shown in Figures 7 and 8. This is because <ins>**ROS1 and ROS2 divide a message into 15 KB packets to transmit data through Ethernet**</ins>.
---
### ROS1, ROS2 (local), nodelet, intra-process
> nodelet (ROS1), intra-process (ROS2)
Up to 64 KB all of them show a constant time cost; above 64 KB the cost rises sharply (fragmentation).
| Fig 9: Medians of end-to-end latencies with small data in local, nodelet, and intra-process cases. <ins>(256B~64K)</ins> | Fig 10: Medians of end-to-end latencies with large data in local, nodelet, and intra-process cases. <ins>(64K~1M)</ins> |
| -------- | -------- |
|  |  |
---
### ROS2 QoS: reliable, best-effort
Breakdown of the ROS2 time cost (using OpenSplice): every ROS2 transmission performs two data conversions between the ROS2 message and DDS, one at publish time (ROS2 to DDS) and one at subscribe time (DDS to ROS2). This conversion cost grows with the data size; it is not constant.
| Fig 11: (2-b) <ins>Reliable</ins> policy breakdown of ROS2 latencies with the OpenSplice. | Fig 12: (2-b) <ins>Best-effort</ins> policy breakdown of ROS2 latencies with the OpenSplice. |
| -------- | -------- |
|  |  |
The data-type conversion is implemented in the RMW implementation; for FastRTPS the corresponding package is `rmw_fastrtps_*`.

---
### Comparison between DDS vendors (local)
Conclusions first:
1. <ins>OpenSplice</ins> has lower latency than <ins>Connext</ins>
2. <ins>Connext</ins> has lower throughput than <ins>OpenSplice</ins>
3. The QoS depth parameter does not affect latency (OpenSplice)
4. The smaller the fragment size, the greater the performance impact
| <ins>1.</ins> Fig 13: (2-b) Different DDS in ROS2 with best-effort policy. | <ins>2.</ins> Fig 20: (1-a) and (2-b) remote cases throughput with small data. |
| -------- | -------- |
|  |  |
---
### One publisher to multiple subscribers
Conclusion: ROS1 schedules the transmitted data, so each subscriber ends up with a different latency. In ROS2 the differences are smaller, and each subscriber is treated more fairly.
| Fig 17: (1-b) <ins>ROS1</ins> multiple destinations publisher. | Fig 18: (2-b) <ins>ROS2</ins> multiple destinations with OpenSplice reliable policy. |
| -------- | -------- |
|  |  |
## ROS2 QoS
When creating a publisher, a QoS profile can be specified in addition to the topic name:
```cpp rclcpp/node.hpp
namespace rclcpp
{
/// Node is the single point of entry for creating publishers and subscribers.
class Node
{
  ...
  create_publisher(
    const std::string & topic_name,
    const rmw_qos_profile_t & qos_profile = rmw_qos_profile_default,
    std::shared_ptr<AllocatorT> allocator = nullptr);
```
<!--
```cpp
/// ROS MiddleWare quality of service profile.
typedef struct RMW_PUBLIC_TYPE rmw_qos_profile_t
{
enum rmw_qos_history_policy_t history;
size_t depth;
enum rmw_qos_reliability_policy_t reliability;
enum rmw_qos_durability_policy_t durability;
struct rmw_time_t deadline;
struct rmw_time_t lifespan;
enum rmw_qos_liveliness_policy_t liveliness;
struct rmw_time_t liveliness_lease_duration;
bool avoid_ros_namespace_conventions;
} rmw_qos_profile_t;
```
--->

ROS2 default QoS:
By default, publishers and subscribers are reliable in ROS 2, have volatile durability, and “keep last” history.
| History | Depth | Reliability | Durability |
| -------- | -------- | -------- | ------ |
| keep last | 10 | reliable | volatile |
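For illustration, a minimal sketch of overriding these defaults with the profile-based `create_publisher()` signature shown above (the node pointer `node` and topic name are placeholders; newer rclcpp versions wrap QoS in `rclcpp::QoS` instead):
```cpp
// Copy the default profile (keep last, depth 10, reliable, volatile) and relax it.
rmw_qos_profile_t custom_qos = rmw_qos_profile_default;
custom_qos.depth = 1;                                             // keep only the latest sample
custom_qos.reliability = RMW_QOS_POLICY_RELIABILITY_BEST_EFFORT;  // drop rather than retransmit

auto pub = node->create_publisher<std_msgs::msg::String>("chatter", custom_qos);
```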
## intra-process (ROS2)
Similar to ROS1's nodelet (which passes `boost::shared_ptr`), this achieves zero-copy message passing.
Enable the intra-process flag when the node is created:
```cpp
rclcpp::Node::Node(
  const std::string & node_name,
  const std::string & namespace_ = "",
  bool use_intra_process_comms = false)
```
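For example, a minimal sketch of turning the flag on at construction time, assuming the pre-NodeOptions signature shown above (newer rclcpp releases pass `rclcpp::NodeOptions().use_intra_process_comms(true)` instead):
```cpp
// The third constructor argument enables intra-process communication for this node.
auto node = std::make_shared<rclcpp::Node>("intra_talker", "", true);
```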
It must be used together with C++11 `std::unique_ptr`:
```cpp=
std_msgs::msg::String::UniquePtr msg(new std_msgs::msg::String());
// or
auto msg = std::make_unique<std_msgs::msg::String>();
...
pub_ = this->create_publisher<std_msgs::msg::String>(topic_name, custom_qos_profile);
pub_->publish(std::move(msg));  // hand ownership of the message to the publisher
```
## Other ROS2 performance measurement tools
ApexAI/performance_test
: https://github.com/ApexAI/performance_test
https://discourse.ros.org/t/latency-and-throughput-in-ros2/4367/4
performance_test provides ready-made publishers and subscribers so users can tune QoS parameters and measure performance; it can also measure transmission between machines (round-trip).
Limitations:
1. Only a single publisher and a single subscriber are supported
## References
ROS middleware
: https://design.ros2.org/articles/ros_middleware_interface.html
ROS2 core stack
: http://docs.ros2.org/dashing/developer_overview.html#id20
MSS
: https://medium.com/fcamels-notes/tcp-maximum-segment-size-%E6%98%AF%E4%BB%80%E9%BA%BC%E4%BB%A5%E5%8F%8A%E6%98%AF%E5%A6%82%E4%BD%95%E6%B1%BA%E5%AE%9A%E7%9A%84-b5fd9005702e
**TCP\_***
- https://medium.com/fcamels-notes/tcp-%E5%8F%83%E6%95%B8%E5%B0%8D%E5%BB%B6%E9%81%B2%E5%92%8C%E5%82%B3%E8%BC%B8%E9%87%8F%E7%9A%84%E5%BD%B1%E9%9F%BF-12910f8d82bd
- https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_for_real_time/7/html/tuning_guide/tcp_nodelay_and_small_buffer_writes
socket programming
: https://www.cs.cmu.edu/~srini/15-441/S10/lectures/r01-sockets.pdf
Nagle algorithm
: https://tools.ietf.org/html/rfc896