# LinuxCon China 2017

# Day 1

## ONAP Landscape of Data Modelling Distribution in SDN/NFV

- Package software switches and similar components into a pkg and send it to the ?-vendor for validation // probably the distributor
- Allocate services per enterprise, based on requirements such as bandwidth
- Then hand it to ops, and it gets deployed
- Generated automatically
- The core of the ecosystem: VNF SDK
- VNF package upload to the VNF portal marketplace
  - feels like Google Play and Android apps
  - validation and other tests are required before it goes into the market
  - only then does it reach commercial deployment
- Follow standards as much as possible
- The goal sounds like making SDN app-like

## VGPU

// Only caught the Q&A; the notes below are from the Q&A

- Intel GPU cgroup
- vGPU can basically do everything a physical GPU can do
- currently reaches about 80% of host performance
- the more a GPU is partitioned, the larger the performance loss (not always)
- more media workloads raise GPU utilization
- because of the pipeline, results vary widely across workloads, and utilization is high

## Quickly debug VM failures in OpenStack

// Beginner-oriented, plus some product pitching

### No valid host

- Neutron, Glance, Cinder, Libvirt, QEMU
- nova-scheduler has not reached the compute node yet

### NUMA hugepage failure

- plain memory usually does not error out
- but with hugepages there can be bigger problems

### SR-IOV

- some flag failure

### VM live migration

- NFS shared storage
- check libvirt.log
- check nova-compute.log first (the conductor already worked, so the problem is in nova-compute)
- may be SELinux (virt_use_nfs)
- VM dir permission or NFS permission

### VM block migration

- not supported
- check status with the `virsh` command
- some parameters can be dropped to make the migration work

### Block migration

- libvirt.log
- `virsh command`
- ABRT
- debug: qemuProcessHandleMonitorEOF: assuming the domain crashed

## OpenStack on AArch64

- Nova: most of the work
- Neutron: few problems
- DPDK: more troublesome
- Cinder: few problems
- Ironic: also successfully patched
- they do not care much about VNC or SPICE access, since there is little need for a UX
- they have to keep chasing Libvirt & QEMU versions to get ARM64 feature support

## Day 1 End

- many talks relate to SDN/NFV, with quite a few 5G topics
- basically everything is built on virtualization or containers to construct the new telecom environment
- OpenStack is hot in China
- x86 faces policy and other restrictions in China
- RISC-V and other non-x86, IP-fee-free architectures will be popular for a while
- Huawei makes ARM64 servers, reportedly more stable than Cavium's
- open-source businesses in China need to be careful, or they will follow in Canonical/Ubuntu's footsteps
  - Kylin Linux took their market; Canonical has almost vanished from China
- many sponsors are heavily sales-oriented
- Huawei eSDK
  - provides an online development platform, hardware environments, and some SDK tutorials
- Cloud Native
  - they support several major cloud projects, covering most of what is needed

# LinuxCon Day 2

# Low-Latency KVM Hypervisor

- Adaptive halt-polling
- VMX preemption timer
- Message-passing workloads
- Event-driven workloads
  - LAMP servers
  - memcached
  - redis
- RUN - IDLE - RUN (switching overhead)
  - inter-process communication
  - TCP_RR
  - frequent transitions between running and idle; little time is spent processing each message
  - the vCPU switches between running and idle
- HLT
  - x86 instruction
  - stops executing instructions until an interrupt arrives
  - in KVM
    - place the vCPU thread on a wait queue
    - yield the CPU to another task
  - overhead
    - ~8500 cycles between `kvm_vcpu_kick` and the subsequent `kvm_sched_in`
- Never schedule?
  - defeats the purpose of CPU overcommit in the cloud
  - some cloud management programs monitor pCPU usage and load-balance; never scheduling just confuses them
- Halt-polling
  - step 1: poll
    - for up to `halt_poll_ns` nanoseconds
    - if a task is waiting to run on our CPU, go to step 2
    - check whether a guest interrupt arrived; if so, we are done
    - repeat
  - step 2: schedule()
    - schedule out until it is time to come out of HLT
  - pros
    - works for short HLT (< `halt_poll_ns`)
    - the polling vCPU still does not block the progress of other threads
  - cons
    - increases CPU usage (+14% per idle pCPU with a Windows guest when `halt_poll_ns` = 500,000 ns)
- Adaptive polling for guest halt
  - step 3: adaptive polling (a small sketch of the grow/shrink policy follows at the end of this section)
    - the poll window can adaptively shrink/grow according to past behavior
    - grow `halt_poll_ns` progressively when a short halt is detected (we can benefit from polling)
    - shrink `halt_poll_ns` aggressively when a long halt is detected (we cannot benefit from polling)

### VMX preemption timer

- the VMX preemption timer counts down in VMX non-root mode and causes a VM-exit when it reaches zero
- it reduces the cost of
  - hrtimer handling
  - the timer interrupt ISR
  - VM-exit/entry
- whether `hrtimer_start`/`hrtimer_cancel` is needed depends on the current situation
- the timer interrupt ISR costs far more than VM-exit handling
- works for both LAPIC timer TSC-deadline mode and periodic mode
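The grow/shrink policy above fits in a few lines. The sketch below is illustrative only: the constants and names (`HALT_POLL_NS_MAX`, `GROW_FACTOR`, `SHRINK_DIVISOR`, the simulated halt durations) are assumptions, not KVM's actual code; it just mirrors the idea from the talk.

```c
/*
 * Illustrative sketch of adaptive halt-polling: grow the poll window on
 * short halts, shrink it aggressively on long halts. Constants are made up.
 */
#include <stdint.h>
#include <stdio.h>

#define HALT_POLL_NS_START 10000ULL   /* assumed initial poll window */
#define HALT_POLL_NS_MAX   500000ULL  /* assumed upper bound of the window */
#define GROW_FACTOR        2          /* grow progressively on short halts */
#define SHRINK_DIVISOR     2          /* shrink aggressively on long halts */

static uint64_t halt_poll_ns = HALT_POLL_NS_START; /* per-vCPU poll window */

/* Called after each HLT with how long the vCPU actually stayed blocked. */
static void adjust_poll_window(uint64_t block_ns)
{
    if (block_ns < HALT_POLL_NS_MAX) {
        /* Short halt: polling could have caught the wakeup, so grow. */
        if (halt_poll_ns == 0)
            halt_poll_ns = HALT_POLL_NS_START;
        halt_poll_ns *= GROW_FACTOR;
        if (halt_poll_ns > HALT_POLL_NS_MAX)
            halt_poll_ns = HALT_POLL_NS_MAX;
    } else {
        /* Long halt: polling would only burn CPU, so shrink hard. */
        halt_poll_ns /= SHRINK_DIVISOR;
    }
}

int main(void)
{
    /* Simulated halt durations in ns: a few short halts, then a long one. */
    uint64_t samples[] = { 8000, 9000, 7000, 2000000, 6000 };

    for (unsigned i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
        adjust_poll_window(samples[i]);
        printf("block=%llu ns -> halt_poll_ns=%llu ns\n",
               (unsigned long long)samples[i],
               (unsigned long long)halt_poll_ns);
    }
    return 0;
}
```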
# 64-bit ARM Unikernels on uKVM [E]

- Wei Chen, ARM
- unikernel: a single-address-space machine image constructed using a library operating system
- unikernels can be designed to run on bare metal directly
- two big drawbacks
  - resource isolation between multiple unikernels
  - the variety of different devices
- fortunately, modern hypervisors provide virtual machines with
  - a consistent set of virtual devices
  - strong context isolation
- why unikernels are needed
  - to address the issues of traditional workloads on the cloud
    - slower initialization
    - more resources used
    - more opportunities to exploit
- do containers help?
  - cons: less secure
  - pros: lightweight
    - faster startup
    - high density
    - fast deployment
    - efficient resource utilization
- are unikernels a better solution?
  - package only the needed modules into an image
  - package everything into a single-address-space image
- MirageOS as an example
  - runs on Xen or Linux as a guest
- VENOM vulnerability
  - originated from a bug in the virtual floppy emulation
  - millions of VMs were potentially at risk
- uKVM unikernel monitor
  - only pack the dependencies into the unikernel and the monitor

### uKVM on AArch64

- set up the guest CPU
- guest memory
- guest timer
- some thing here
- the guest needs the MMU to share data with the host on AArch64

[uKVM](https://github.com/Weichen8/ukvm-solo5-arm64) demo_for_oss_2017

- only supports Linux now
- virtio support
- verify and improve the compatibility of MirageOS libraries on AArch64

Summary

- unikernels on uKVM are an approach to make workloads smaller and faster, with fewer opportunities to exploit
- running unikernels inside containers
- single thread
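For reference, the sketch below shows the kind of KVM setup a ukvm-style monitor performs on Linux: open `/dev/kvm`, create a VM, register one block of guest memory, create a vCPU, and (on AArch64) initialize it for the host CPU type. The memory size and the omitted image loading, register/timer setup, and run loop are assumptions for illustration; this is not the actual ukvm-solo5-arm64 code.

```c
/* Minimal, illustrative KVM setup in the spirit of a unikernel monitor. */
#include <fcntl.h>
#include <linux/kvm.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

#define GUEST_MEM_SIZE (16 * 1024 * 1024) /* assumed 16 MiB of guest RAM */

int main(void)
{
    int kvm = open("/dev/kvm", O_RDWR | O_CLOEXEC);
    if (kvm < 0) { perror("open /dev/kvm"); return 1; }

    int vm = ioctl(kvm, KVM_CREATE_VM, 0);
    if (vm < 0) { perror("KVM_CREATE_VM"); return 1; }

    /* Back guest physical memory with one anonymous mapping. */
    void *mem = mmap(NULL, GUEST_MEM_SIZE, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    struct kvm_userspace_memory_region region = {
        .slot = 0,
        .guest_phys_addr = 0,
        .memory_size = GUEST_MEM_SIZE,
        .userspace_addr = (unsigned long)mem,
    };
    if (ioctl(vm, KVM_SET_USER_MEMORY_REGION, &region) < 0) {
        perror("KVM_SET_USER_MEMORY_REGION"); return 1;
    }

    int vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);
    if (vcpu < 0) { perror("KVM_CREATE_VCPU"); return 1; }

    /* Map the shared kvm_run structure used to communicate exit reasons. */
    int run_size = ioctl(kvm, KVM_GET_VCPU_MMAP_SIZE, 0);
    struct kvm_run *run = mmap(NULL, run_size, PROT_READ | PROT_WRITE,
                               MAP_SHARED, vcpu, 0);

#ifdef __aarch64__
    /* On AArch64 the vCPU must be initialized for the host's CPU type. */
    struct kvm_vcpu_init init;
    if (ioctl(vm, KVM_ARM_PREFERRED_TARGET, &init) < 0 ||
        ioctl(vcpu, KVM_ARM_VCPU_INIT, &init) < 0) {
        perror("KVM_ARM_VCPU_INIT"); return 1;
    }
#endif

    /* A real monitor would now load the unikernel image into guest memory,
     * set the entry point, then loop on KVM_RUN and dispatch
     * run->exit_reason (MMIO, hypercalls, halt, ...). */
    printf("VM and vCPU created; kvm_run mapped at %p\n", (void *)run);
    return 0;
}
```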