###### tags: `Profiling` # eBPF for ROS2 / 2019-09-10 Topic: eBPF for ROS2 #1 在我的[上一篇文](https://st9540808.github.io/2019/09/02/ros2-get-started/)中實作了最簡單的 publisher 和 subscriber,而當中的 callback 是使用 C\++11 lambda,現在為了開始作效能分析,其中會需要找到 callback 的 symbol,因為 C\++11 lambda 會是一個 unnamed object,每一個 lambda 都有自己的類型,要找到對應的 lambda 函式執行的位址 (其實是 unnamed object 實作的 `operator()`) 會相當困難,便改成以 member function 實作。 ## Note Rewrite C++11 lambda callback function into class method. #### subscriber: ```c++=1 #include <cstdio> #include <memory> #include <string> #include <vector> #include "rclcpp/rclcpp.hpp" #include "std_msgs/msg/string.hpp" // Create a Listener class that subclasses the generic rclcpp::Node base class. // The main function below will instantiate the class as a ROS node. class Listener : public rclcpp::Node { public: Listener() : Node("listener") { sub_ = this->create_subscription<std_msgs::msg::String>( "chatter", std::bind(&Listener::callback, this, std::placeholders::_1)); } private: void callback(const std_msgs::msg::String::SharedPtr msg) { RCLCPP_INFO(this->get_logger(), "I heard: [%s]", msg->data.c_str()); } rclcpp::Subscription<std_msgs::msg::String>::SharedPtr sub_; }; int main(int argc, char *argv[]) { rclcpp::init(argc, argv); auto node = std::make_shared<Listener>(); auto node2 = rclcpp::Node::make_shared("talker"); rclcpp::spin(node); rclcpp::shutdown(); return 0; } ``` `rclcpp::Node::create_subscription` requires callback functions to be type of function object. *function object (or functor) is a class that defines `operator()`* `callback()` is non-static member function, so it's required to pass `this` pointer. ```c++ std::bind(&Listener::callback, this, std::placeholders::_1) ``` #### publisher: ```c++ #include <chrono> #include <cstdio> #include <memory> #include <string> #include <utility> #include <vector> #include "rclcpp/rclcpp.hpp" #include "std_msgs/msg/string.hpp" using namespace std::chrono_literals; class Talker : public rclcpp::Node { public: Talker() : Node("talker"), count_(0) { pub_ = this->create_publisher<std_msgs::msg::String>("chatter"); timerPtr_ = this->create_wall_timer( 500ms, std::bind(&Talker::callback, this)); } private: void callback(void) { char str[7]; this->timerPtr_->reset(); std::snprintf(str, 7, "%06d", this->count_++); auto message = std_msgs::msg::String(); message.data = "Hello, world! " + std::string(str); RCLCPP_INFO(this->get_logger(), "Publishing: '%s'", message.data.c_str()); this->pub_->publish(message); } int count_; rclcpp::TimerBase::SharedPtr timerPtr_; rclcpp::Publisher<std_msgs::msg::String>::SharedPtr pub_; }; int main(int argc, char *argv[]) { rclcpp::init(argc, argv); auto node = std::make_shared<Talker>(); rclcpp::spin(node); rclcpp::shutdown(); } ``` ## Measure latency on the same machine ### Environment Distro: Dashing Diademata OS: 18.04.1 Ubuntu CPU: Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz (2C4T) :::info At ROS2 workspace. The executables are placed under `install` directory. ```bash $ ls build install log src ``` path of talker (publisher) and listener (subscriber): > `ros_course_demo` is the package name. | Executable | path | | -------- | -------- | | talker | `install/ros_course_demo/lib/ros_course_demo/talker` | | listener | `install/ros_course_demo/lib/ros_course_demo/listener` | ::: ### Find out the symbol for `callback()` 目標是找到 `Talker::callback()` 在 ELF 檔中對應的 symbol List all exported symbols in the executable. ```bash $ objdump install/ros_course_demo/lib/ros_course_demo/talker -t install/ros_course_demo/lib/ros_course_demo/talker: file format elf64-x86-64 SYMBOL TABLE: ... 0000000000015d40 w F .text 000000000000004d _ZN6TalkerD1Ev 0000000000011806 w F .text 000000000000015b _ZNSt23_Sp_counted_ptr_inplaceIN6rclcpp9PublisherIN8 std_msgs3msg7String_ISaIvEEES5_EESaIS7_ELN9__gnu_cxx12_Lock_policyE2EEC2IJRPNS0_15node_interfaces17NodeBaseInterfa ceERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEER23rcl_publisher_options_tRKNS0_23PublisherEventCallbacks ERSt10shared_ptrISaIS6_EEEEES8_DpOT_ ... ``` C++ uses name mangling to handle classes, templates, namespaces, etc. Use `c++filt` to demangle symbols: ```bash $ objdump install/ros_course_demo/lib/ros_course_demo/talker -t \ | c++filt | grep callback ... 0000000000008944 w F .text 000000000000031c Talker::callback() ... ``` Retrieve the symbol in ELF file for `Talker::callback()` using address `0000000000008944`. ```bash $ objdump install/ros_course_demo/lib/ros_course_demo/talker -t \ | grep 0000000000008944 0000000000008944 w F .text 000000000000031c _ZN6Talker8callbackEv ``` :::info 其實要找到 demangled symbol 只需要使用 `$ nm -C` 就可以了,感覺根本不需要這麼複雜。 ::: ### Register probe function 使用剛剛找到的 symbol 並將其帶入 `attach_uprobe()` 便能完成註冊,一旦 `Talker::callback()` 被呼叫,註冊的 `talker_probe` 就會被執行。 更多 uprobe (在 userspace 追蹤程式) 的相關的資訊請看底下 [References](##References)。 ```c b.attach_uprobe(name="./ros2_course/install/ros_course_demo/lib/ros_course_demo/talker", sym="_ZN6Talker8callbackEv", fn_name="talker_probe") ```` ### Measure latency 以下為測量方法的時序圖。 首先使用 uprobe 註冊 talker 和 listener 探針函式,分別會在 callback 被呼叫時執行,並測量兩個 callback 之間的時間差。 ![](https://i.imgur.com/aPUiZE5.png) 寫一個 eBPF 的程式追蹤兩個 callback 的呼叫時間: ```python #!/usr/bin/env python from __future__ import print_function from bcc import BPF from time import sleep # load BPF program b = BPF(text=""" #include <uapi/linux/ptrace.h> BPF_HISTOGRAM(dist); BPF_ARRAY(start_time, uint64_t, 1); int talker_probe(struct pt_regs *ctx) { uint64_t curr = bpf_ktime_get_ns(); uint32_t key = 0; start_time.update(&key, &curr); // bpf_trace_printk("talker_probe curr=%lu\\n", curr); return 0; }; int listener_probe(struct pt_regs *ctx) { uint64_t curr = bpf_ktime_get_ns(); uint64_t *prev, lat; uint32_t key = 0; prev = start_time.lookup(&key); if (prev) { lat = (curr - *prev) / 1000; dist.increment(bpf_log2l(lat)); bpf_trace_printk("listener_probe lat=%luusecs\\n", lat); } return 0; } """) b.attach_uprobe(name="./ros2_course/install/ros_course_demo/lib/ros_course_demo/talker", sym="_ZN6Talker8callbackEv", fn_name="talker_probe") b.attach_uprobe(name="./ros2_course/install/ros_course_demo/lib/ros_course_demo/listener", sym="_ZN8Listener8callbackESt10shared_ptrIN8std_msgs3msg7String_ISaIvEEEE", fn_name="listener_probe") while 1: try: try: (task, pid, cpu, flags, ts, msg) = b.trace_fields() except ValueError: continue print("%-18.9f %-16s %-6d %s" % (ts, task, pid, msg)) # print("%-16s %-6d %s" % (task, pid, msg)) except KeyboardInterrupt: break print("") b["dist"].print_log2_hist("usec") ``` 以下為統計結果,每次均 publish 一個長度為 21 的字串。 ![](https://i.imgur.com/4JIOmoo.png) <!-- ```bash $ sudo python first_ebpf.py ... 362862.082012000 listener 7536 listener_probe lat=398usecs ^C usec : count distribution 0 -> 1 : 0 | | 2 -> 3 : 0 | | 4 -> 7 : 0 | | 8 -> 15 : 0 | | 16 -> 31 : 0 | | 32 -> 63 : 0 | | 64 -> 127 : 0 | | 128 -> 255 : 133 | | 256 -> 511 : 8041 |****************************************| 512 -> 1023 : 1727 |******** | 1024 -> 2047 : 65 | | 2048 -> 4095 : 22 | | 4096 -> 8191 : 7 | | ``` --> ## Future work STM32MP157A-DK1 ![](https://i.imgur.com/l3eCIZW.png) > STM32MP157 dual Cortex®-A7 32 bits + Cortex®-M4 32 bits MPU openSTLinux kernel 4.19 ![](https://wiki.st.com/stm32mpu/nsfr_img_auth.php/9/9c/STM32MPU_Embedded_Software_architecture_overview.png) Build a PREEMPT_RT kernel for stm32mp1 Cross compile ROS2 for ARMv7 ## References BPF In Depth: Communicating with Userspace : https://blogs.oracle.com/linux/notes-on-bpf-3 kprobes : https://www.kernel.org/doc/Documentation/kprobes.txt uprobe-tracer : https://www.kernel.org/doc/Documentation/trace/uprobetracer.txt Meet cute-between-ebpf-and-tracing : https://www.slideshare.net/vh21/meet-cutebetweenebpfandtracing iovisor/bcc : https://github.com/iovisor/bcc