# 2019q1 Homework4 (riscv)
contributed by < `lineagech` >

## Code Trace
Starting from `temu.c`, the emulator goes through the following steps. It first loads the configuration file:
```clike
virt_machine_load_config_file(p, path, NULL, NULL);
```
```clike
void virt_machine_load_config_file(VirtMachineParams *p,
                                   const char *filename,
                                   void (*start_cb)(void *opaque),
                                   void *opaque)
{
    VMConfigLoadState *s;

    s = mallocz(sizeof(*s));
    s->vm_params = p;
    s->start_cb = start_cb;
    s->opaque = opaque;
    p->cfg_filename = strdup(filename);

    config_load_file(s, filename, config_file_loaded, s);
}
```
Then it parses the config file, which here is `root_9p-riscv64.cfg`.
```clike
static void config_file_loaded(void *opaque, uint8_t *buf, int buf_len)
{
    VMConfigLoadState *s = opaque;
    VirtMachineParams *p = s->vm_params;

    if (virt_machine_parse_config(p, (char *)buf, buf_len) < 0)
        exit(1);

    /* load the additional files */
    s->file_index = 0;
    config_additional_file_load(s);
}
```
Finally, it runs the virtual machine in an infinite loop.
```clike=
for(;;) {
    virt_machine_run(s);
}
```
`VirtMachine` maintains a `VIRTIODevice` named `console_dev`, and `virtio_console_get_write_len` reports how many bytes can currently be written to it. Data is read and written through the registered callbacks `m->console->read_data(...)` and `m->console->write_data(...)`.
```clike
void virt_machine_run(VirtMachine *m)
{
    ......
    tv.tv_sec = delay / 1000;
    tv.tv_usec = (delay % 1000) * 1000;
    ret = select(fd_max + 1, &rfds, &wfds, &efds, &tv);
    if (m->net) {
        m->net->select_poll(m->net, &rfds, &wfds, &efds, ret);
    }
    if (ret > 0) {
#ifndef _WIN32
        if (m->console_dev && FD_ISSET(stdin_fd, &rfds)) {
            uint8_t buf[128];
            int ret, len;
            len = virtio_console_get_write_len(m->console_dev);
            len = min_int(len, sizeof(buf));
            ret = m->console->read_data(m->console->opaque, buf, len);
            if (ret > 0) {
                virtio_console_write_data(m->console_dev, buf, ret);
            }
        }
#endif
    ......
    virt_machine_interp(m, MAX_EXEC_CYCLE);
}
```
Looking inside `VirtMachine`, it contains a `VirtMachineClass` pointer as well as several devices:
```clike
typedef struct VirtMachine {
    const VirtMachineClass *vmc;
    /* network */
    EthernetDevice *net;
    /* console */
    VIRTIODevice *console_dev;
    CharacterDevice *console;
    /* graphics */
    FBDevice *fb_dev;
} VirtMachine;
```
The `vmc` contains several function pointers to set configuration defaults, run the machine, and deliver input events:
```clike
struct VirtMachineClass {
    const char *machine_names;
    void (*virt_machine_set_defaults)(VirtMachineParams *p);
    VirtMachine *(*virt_machine_init)(const VirtMachineParams *p);
    void (*virt_machine_end)(VirtMachine *s);
    int (*virt_machine_get_sleep_duration)(VirtMachine *s, int delay);
    void (*virt_machine_interp)(VirtMachine *s, int max_exec_cycle);
    bool (*vm_mouse_is_absolute)(VirtMachine *s);
    void (*vm_send_mouse_event)(VirtMachine *s1, int dx, int dy, int dz,
                                unsigned int buttons);
    void (*vm_send_key_event)(VirtMachine *s1, bool is_down, uint16_t key_code);
};
```

## Problems
* The [riscv-emu](https://github.com/sysprog21/riscv-emu) source code mentions [virtio](https://www.linux-kvm.org/page/Virtio) many times. What role does this mechanism play for the host and the guest? After reading [Virtio: An I/O virtualization framework for Linux](https://www.ibm.com/developerworks/library/l-virtio/index.html), what do you find when you compare it with the source code?

According to the article, virtio provides a standardized interface for developing emulated devices, which promotes code reuse and increases efficiency. The guest has front-end drivers and the hypervisor has back-end drivers, and the two sides communicate through virtio. The front-end architecture of virtio consists of:
> virtio_driver: the front-end driver in the guest
> virtio_device: a representation of the device in the guest
> virtio_config_ops: defines the operations for configuring the virtio device
> virtqueue
> virtqueue_ops

![](https://www.ibm.com/developerworks/library/l-virtio/figure2.gif)

temu seems to have similar structures.
It defines:
```clike
struct VIRTIODevice {
    PhysMemoryMap *mem_map;
    PhysMemoryRange *mem_range;
    /* PCI only */
    PCIDevice *pci_dev;
    /* MMIO only */
    IRQSignal *irq;
    VIRTIOGetRAMPtrFunc *get_ram_ptr;
    int debug;

    uint32_t int_status;
    uint32_t status;
    uint32_t device_features_sel;
    uint32_t queue_sel; /* currently selected queue */
    QueueState queue[MAX_QUEUE];

    /* device specific */
    uint32_t device_id;
    uint32_t vendor_id;
    uint32_t device_features;
    VIRTIODeviceRecvFunc *device_recv;
    void (*config_write)(VIRTIODevice *s); /* called after the config is written */
    uint32_t config_space_size; /* in bytes, must be multiple of 4 */
    uint8_t config_space[MAX_CONFIG_SPACE_SIZE];
};
```
Here is an example from the console device in temu:
```clike
int virtio_console_write_data(VIRTIODevice *s, const uint8_t *buf, int buf_len)
{
    int queue_idx = 0;
    QueueState *qs = &s->queue[queue_idx];
    int desc_idx;
    uint16_t avail_idx;

    if (!qs->ready)
        return 0;
    avail_idx = virtio_read16(s, qs->avail_addr + 2);
    if (qs->last_avail_idx == avail_idx)
        return 0;
    desc_idx = virtio_read16(s, qs->avail_addr + 4 +
                             (qs->last_avail_idx & (qs->num - 1)) * 2);
    memcpy_to_queue(s, queue_idx, desc_idx, 0, buf, buf_len);
    virtio_consume_desc(s, queue_idx, desc_idx, buf_len);
    qs->last_avail_idx++;
    return buf_len;
}
```
`QueueState` plays the role of `virtqueue`: it handles data movement between the guest and the hypervisor, which communicate through shared buffers. `virtio_consume_desc` reports the consumed descriptor index and the buffer length back to the guest and then raises an IRQ.

* With `$ temu root-riscv64.cfg` we can run `gcc` inside the emulated RISC-V/Linux environment and produce an executable, and later we run `riscv64-buildroot-linux-gnu-gcc` instead. How do the two differ?
(Hint: cross-compiler; review [你所不知道的 C 語言: 編譯器和最佳化原理篇](https://hackmd.io/s/Hy72937Me))
* On the guest, `$ dmesg | grep 9pnet` shows the string `9P2000`. How is this related to the [VirtFS](https://wiki.qemu.org/Documentation/9psetup) mentioned above? Explain how it works and design an experiment.

[9P (protocol)](https://en.wikipedia.org/wiki/9P_(protocol))
9P is a network protocol for Plan 9 from Bell Labs. It was revised for the 4th edition of Plan 9 under the name 9P2000.
According to [VirtFS—A virtualization aware File System pass-through](https://landley.net/kdocs/ols/2010/ols2010-pages-109-120.pdf), VirtFS provides functionality that is somewhat similar to traditional network file systems (NFS/CIFS). The VirtFS server exports a portion of its file system hierarchy, and the client on the guest mounts it using the 9P2000 protocol, so the virtual machine sees the mount point just like any local file system.

* [TinyEMU System Emulator by Fabrice Bellard](https://bellard.org/tinyemu/readme.txt) mentions "Network block device". Following the instructions there, can you let the guest reach the Internet through the host?
    * tap, bridge, NAT, iptables

According to the instructions, we need to run `netinit.sh` first. One setting, `internet_ifname`, has to be changed to match the host:
```sh=
# host network interface connected to Internet (change it)
#internet_ifname="enp0s20f0u1"
internet_ifname="wlp3s0"

# setup bridge interface
ip link add br0 type bridge
# create and add tap0 interface to bridge
ip tuntap add dev tap0 mode tap user $USER
ip link set tap0 master br0
ip link set dev br0 up
ip link set dev tap0 up
ifconfig br0 192.168.3.1

# setup NAT to access to Internet
echo 1 > /proc/sys/net/ipv4/ip_forward
# delete forwarding reject rule if present
#iptables -D FORWARD 1
iptables -t nat -A POSTROUTING -o $internet_ifname -j MASQUERADE
```
TUN/TAP is a virtual network interface. VPN tunneling, for example, is implemented this way.
[TUN/TAP](https://steemit.com/tun/@leonshilu/tun-tap): a tap device is simulated as a virtual network adapter working at layer 2. We can use `ip tuntap add name tap0 mode tap` to create a virtual network interface.

* The first experiment ran `$ temu https://bellard.org/jslinux/buildroot-riscv64.cfg`, which was enough to load the RISC-V/Linux system. What is the underlying principle? Explain with reference to the VirtIO 9P filesystem and the corresponding [riscv-emu](https://github.com/sysprog21/riscv-emu) source code.

> TinyEMU supports the VirtIO 9P filesystem to access local or remote filesystems. For remote filesystems, it does HTTP requests to download the files.
> The protocol is compatible with the vfsync utility. In the "mount" command, "/dev/rootN" must be used as device name where N is the index of the filesystem. When N=0 it is omitted.

* [riscv-emu](https://github.com/sysprog21/riscv-emu) has a built-in floating-point emulator that uses the [SoftFP Library](https://bellard.org/softfp/). Taking `sqrt` as an example, explain how `sqrt_sf32`, `sqrt_sf64`, and `sqrt_sf128` work and how they map to the RISC-V CPU emulator.

Basically, the square root of a floating-point number is the square root of its mantissa times 2 to the power of half its exponent. Comments are added below:
```clike=
F_UINT sqrt_sf(F_UINT a, RoundingModeEnum rm, uint32_t *pfflags)
{
    ......
    /* remove the bias to get the real exponent (E - 127 for sf32) */
    a_exp -= EXP_MASK / 2;
    /* simpler to handle an even exponent */
    /* if the exponent is odd, make it even by subtracting 1 from the
       exponent and multiplying the mantissa by 2, which is equivalent
       to shifting it left by 1 bit */
    if (a_exp & 1) {
        a_exp--;
        a_mant <<= 1;
    }
    /* halve the exponent and add the bias back */
    a_exp = (a_exp >> 1) + EXP_MASK / 2;
    a_mant <<= (F_SIZE - 4 - MANT_SIZE);
    /* find the square root of the mantissa */
    if (sqrtrem_u(&a_mant, a_mant, 0))
        a_mant |= 1;
    return normalize_sf(a_sign, a_exp, a_mant, rm, pfflags);
}
```
`sqrtrem_u` computes the integer square root. It first picks the largest possible value as the initial guess, `(F_UINT)1 << ((l + 1) / 2)`. In every iteration, `s` is the current estimate and `q` is the original value divided by `s`. The closer `s` is to the answer, the closer `q` is to `s`.
In `sqrtrem_u`, because the estimate approaches the answer from above, `u = (q + s) / 2` being larger than or equal to `s` means that `q` is larger than or equal to `s`, so the answer lies between `s` and `q`. The loop then takes `s` as the result, which is the integer closest to the square root.
```clike=
static int sqrtrem_u(F_UINT *pr, F_UINT a1, F_UINT a0)
{
    int l, inexact;
    F_UINT u, s, r, q, sq0, sq1;

    /* 2^l >= a */
    if (a1 != 0) {
        l = 2 * F_SIZE - clz(a1 - 1);
    } else {
        if (a0 == 0) {
            *pr = 0;
            return 0;
        }
        l = F_SIZE - clz(a0 - 1);
    }
    u = (F_UINT)1 << ((l + 1) / 2);
    for(;;) {
        s = u;
        q = divrem_u(&r, a1, a0, s);
        u = (q + s) / 2;
        if (u >= s)
            break;
    }
    sq1 = mul_u(&sq0, s, s);
    inexact = (sq0 != a0 || sq1 != a1);
    *pr = s;
    return inexact;
}
```

* The `root-riscv64.cfg` config file contains the entry `bios: "bbl64.bin"`. What is it for? Hint: see [Booting a RISC-V Linux Kernel](https://www.sifive.com/blog/all-aboard-part-6-booting-a-risc-v-linux-kernel)

The `Berkeley Boot Loader` is referred to as bbl. It performs several tasks mentioned in [Booting a RISC-V Linux Kernel](https://www.sifive.com/blog/all-aboard-part-6-booting-a-risc-v-linux-kernel): it strips out information that Linux should not be interested in, such as the SiFive system's power management, and it sets up the harts' PMP and trap handlers before entering supervisor mode.
PMP is set up to allow access to all of memory. bbl is expected to run in machine mode, so it sets up machine-mode trap handlers as well as a machine-mode stack. It then executes `mret` to switch from machine mode to supervisor mode and jumps to the start of Linux.

* Can buildroot be used to build the Linux kernel? Be sure to read the [Buildroot Manual](https://buildroot.org/downloads/manual/manual.html)
    * `BR2_LINUX_KERNEL_CUSTOM_CONFIG_FILE="/tmp/diskimage-linux-riscv-2018-09-23/patches/config_linux_riscv64"`
* What do the kernel boot parameters `console=hvc0 root=/dev/vda rw` mean? Which parts of the emulator's internal design do they correspond to?
* What does the output of `$ cat /proc/loadavg` mean? Can you explain it with reference to the Linux kernel source? (Hint: the familiar fixed-point arithmetic)
* Why do we need the e2fsprogs tools on the host? What exactly are they used for?
* What is the purpose of the root file system in the Linux kernel? What considerations led to [initramfs](http://blog.linux.org.tw/~jserv/archives/001954.html)?
* What is a tool like busybox for? Explain with reference to the source code (Hint: see [取得 GNU/Linux 行程的執行檔路徑](http://blog.linux.org.tw/~jserv/archives/002041.html))

## Kilo
Refer to the [buildroot manual](http://nightly.buildroot.org/manual.pdf) to make buildroot aware of the new package to build.
Modify `/tmp/buildroot-riscv-2018-10-20/package/Config.in` so that the package shows up in `make menuconfig`:
```
menu "Text editors and viewers"
	source "package/ed/Config.in"
	source "package/joe/Config.in"
	source "package/kilo/Config.in"
```
Add `/tmp/buildroot-riscv-2018-10-20/package/kilo/Config.in`:
```
config BR2_PACKAGE_KILO
	bool "kilo"
	help
	  kilo editor
```
As well as `/tmp/buildroot-riscv-2018-10-20/package/kilo/kilo.mk`:
```
################################################################################
#
# kilo
#
################################################################################

KILO_VERSION = af3919d68cb2e70a3d9a2309596cf290cf6bc1ac
KILO_SITE = https://github.com/lineagech/kilo.git
KILO_SITE_METHOD = git

define KILO_BUILD_CMDS
	$(MAKE) CC="$(TARGET_CC)" LD="$(TARGET_LD)" -C $(@D) all
endef

define KILO_INSTALL_TARGET_CMDS
	$(INSTALL) -D -m 0755 $(@D)/kilo $(TARGET_DIR)/usr/bin
endef

$(eval $(generic-package))
```
Then rerun `make` and start `temu test64.cfg`. `kilo` is installed under `/usr/bin` and can be used directly.