###### tags: `System Software`

# WasmEdge

## Background

### Bytecode Alliance
- a nonprofit organization dedicated to creating secure new software foundations
- building on standards such as WebAssembly and the WebAssembly System Interface (WASI)

### Common Use Cases for WebAssembly
Performance-critical or computation-intensive cases:
- 3D rendering
- video games
- music streaming
- encryption
- image recognition

### History
- WasmEdge was previously maintained by engineers from Skymizer
    - Focus on blockchain
- Later the team separated from Skymizer. Today, they are a standalone company called Second State

### WASI: Portable System Interface for WebAssembly
Released in March 2019

WASI is a system interface for running WebAssembly outside the web
- syscall-like function calls that do I/O on behalf of the program
- provides access to several operating-system-like features, including files and filesystems, Berkeley sockets, clocks, and random numbers
- host binding :question:

## Introduction

Machine learning is a big topic nowadays. WasmEdge already provides a set of TensorFlow host functions to enable ML inference in WebAssembly. However, these TensorFlow host functions are defined by us, and they are just Wasm function bindings to the TensorFlow C API. Here comes a standard: the WASI-NN proposal provides a new way to perform neural network inference by using a runtime-provided implementation that can leverage host-native optimizations, CPU multi-threading, or powerful hardware devices such as GPUs or TPUs.

:::info
<!-- It's fine to adjust or complete the project goals above. -->

## Milestones Template
<!-- Please describe your works in steps, such as verifying problems, implementation, etc. -->
<!-- Please also estimate the durations and review the steps with your mentors. -->
<!-- It's better to list the probable bottlenecks. -->

1. (For example) Study the WASI-NN proposal: 1 week
    * (For example) Understanding the workflow of host functions in WasmEdge and ...
    * (For example) Bottleneck: define the WASI-NN host functions ...
2. (For example) Implementation: 6 weeks
3. <!-- ... -->
:::

## My Milestones

1. Study the WASI-NN proposal: 1 week
    - Study the WITX specification and its relation to the WASI-NN proposal
    - Bottleneck: Understand how binding works in WASI
    - Bottleneck: Understand how host functions interact with WasmEdge
2. Implement the first binding for WasmEdge: 1-2 weeks
    - Try to add my first binding for WasmEdge. It won't go into the code base; it is for me to understand the workflow of adding bindings.
    - Could be the biggest bottleneck for me.
    - I will also record the steps and complete a write-up so that other people can understand how to add a binding.
3. Implement WASI-NN: 5-6 weeks
    - Bottleneck: Find the proper OpenVINO implementation for WASI-NN

I see that [wasmtime](https://github.com/bytecodealliance/wasmtime) already implements [WASI-NN](https://github.com/bytecodealliance/wasmtime/tree/main/crates/wasi-nn) using OpenVINO. Should we create "yet another implementation" in WasmEdge using OpenVINO as the backend?

I am considering using [CMSIS-NN](https://github.com/ARM-software/CMSIS_5/tree/develop/CMSIS/NN) for the backend. Since the WasmEdge use cases include IoT devices, it makes sense when users want to run their applications on ARM Cortex-M based microcontrollers with maximum resource utilization.
- Investigate memory utilization of WasmEdge
- Evaluate feasibility of a CMSIS-NN binding

Also, I found that [Arm-NN](https://www.arm.com/products/silicon-ip-cpu/ethos/arm-nn) exists for Cortex-A processors; it can utilize Mali GPUs and Ethos NPUs once chip manufacturers implement them in the future.
:::info
Meeting Minutes
- [time=Wed, Sep 15, 2021 3:39 PM]
- Implement the first **host function**
- Search for a **testing wasm file** to verify the correctness of the result
- Will only use C++ in the future, no Rust will be used
:::

## Research on NN

wasmtime uses the Rust bindings of OpenVINO. We can refer to their implementation or, alternatively, use open-source Rust bindings for the C++ API of PyTorch.

PyTorch already provides an official C++ API for internal use at Facebook

ONNX tutorials: https://github.com/onnx/tutorials

:::info
- ~~Study PyTorch Rust binding~~ Will not need Rust to implement host functions
- ~~Study feasibility of using CMSIS-NN as binding~~ CMSIS-NN is a set of low-level neural network kernels optimized for Cortex-M microcontrollers. It needs a machine learning framework like TFLM to be retargeted to it.
:::

### CMSIS-NN Workflow

```graphviz
digraph graphname{
    T [label="Trained Model", shape=box]  // node T
    P [label="Quantization", shape=box]   // node P
    Z [label="Transform", shape=box]
    T->P []  // edge T->P
    P->Z
}
```

### Quantization
Quantization used to take a lot of time, typically several hours on a CPU.

Nowadays, TFLM is already integrated with CMSIS-NN, making use of DSP and M-Profile Vector Extension (MVE) instructions for hardware acceleration.
https://blog.tensorflow.org/2021/02/accelerated-inference-on-arm-microcontrollers-with-tensorflow-lite.html

CMSIS-NN only supports fixed-point arithmetic because many microcontrollers do not have an FPU. Therefore, quantization is necessary to convert a floating-point model into a fixed-point model. Another benefit is that it greatly reduces the size of the model and improves computation efficiency.

### Transform
Map the model onto CMSIS-NN function calls.

### TFLM (TensorFlow Lite for Microcontrollers)
tinyML_development_TFL_CMSIS_Ethos-U55.pdf

Ethos-U55 is a microNPU, but it is not yet implemented by any chip manufacturer.
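Returning to the quantization step described above, the following is a minimal sketch of one common scheme: the affine int8 mapping used by TensorFlow Lite, where a real value `x` is stored as `round(x / scale) + zero_point`. The names `QuantParams`, `chooseParams`, `quantize`, and `dequantize` are made up for illustration; this is not the CMSIS-NN or TFLM API, and real converters also calibrate ranges and often quantize per channel.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Parameters of the affine mapping real = (q - ZeroPoint) * Scale.
struct QuantParams {
  float Scale;       // real-value distance covered by one integer step
  int32_t ZeroPoint; // integer code that represents real 0.0
};

// Map the observed range [Min, Max] onto the int8 range [-128, 127].
// The range is widened to contain 0 so that zero is exactly representable.
QuantParams chooseParams(float Min, float Max) {
  Min = std::min(Min, 0.0f);
  Max = std::max(Max, 0.0f);
  assert(Max > Min && "range must not be empty");
  const float Scale = (Max - Min) / 255.0f;
  const int32_t ZeroPoint =
      static_cast<int32_t>(std::lround(-128.0f - Min / Scale));
  return {Scale, ZeroPoint};
}

// Float -> int8, saturating values that fall outside the chosen range.
int8_t quantize(float X, const QuantParams &P) {
  const long Q = std::lround(X / P.Scale) + P.ZeroPoint;
  return static_cast<int8_t>(std::max(-128L, std::min(127L, Q)));
}

// int8 -> float; the round trip loses at most about half a step.
float dequantize(int8_t Q, const QuantParams &P) {
  return static_cast<float>(Q - P.ZeroPoint) * P.Scale;
}
```

For example, with `chooseParams(0.0f, 2.55f)` one integer step is about 0.01, so 1.0 is stored as a single byte instead of a 4-byte float, which is where the model-size reduction comes from.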
![](https://i.imgur.com/C6TQeo8.png)

### Arm-NN
The latest release of Arm-NN supports models created with TensorFlow Lite (TfLite) and ONNX.
![](https://i.imgur.com/vbhRbor.png)

## Research on Host Functions

There are 4 components related to Module
- start function
- table
- memory
- global

### Imports
https://webassembly.github.io/spec/core/syntax/modules.html#syntax-import

add.wasm
```w
(import "wasi_snapshot_preview1" "fd_write" (func $_ZN4wasi13lib_generated22wasi_snapshot_preview18fd_write17h93016769784eae7aE (type $t8)))
```

wasimodule.cpp
```cpp
namespace WasmEdge {
namespace Host {

WasiModule::WasiModule() : ImportObject("wasi_snapshot_preview1") {
  ...
  addHostFunc("fd_write", std::make_unique<WasiFdWrite>(Env));
  ...
}

} // namespace Host
} // namespace WasmEdge
```

wasifunc.cpp
```cpp
Expect<uint32_t> WasiFdWrite::body(Runtime::Instance::MemoryInstance *MemInst,
                                   int32_t Fd, uint32_t IOVsPtr,
                                   uint32_t IOVsLen,
                                   uint32_t /* Out */ NWrittenPtr) {
  ...
}
```

wasmedger.cpp ==is not the place where `WasmEdge::Host::WasiModule` is instantiated==
```cpp
WasmEdge::Host::WasiModule *WasiMod =
    dynamic_cast<WasmEdge::Host::WasiModule *>(
        VM.getImportModule(WasmEdge::HostRegistration::Wasi));
```

vm.cpp
```cpp
void VM::initVM() {
  /// Create import modules from configuration.
  if (Conf.hasHostRegistration(HostRegistration::Wasi)) {
    std::unique_ptr<Runtime::ImportObject> WasiMod =
        std::make_unique<Host::WasiModule>();
    InterpreterEngine.registerModule(StoreRef, *WasiMod.get());
    ImpObjs.insert({HostRegistration::Wasi, std::move(WasiMod)});
  }
  if (Conf.hasHostRegistration(HostRegistration::WasmEdge_Process)) {
    std::unique_ptr<Runtime::ImportObject> ProcMod =
        std::make_unique<Host::WasmEdgeProcessModule>();
    InterpreterEngine.registerModule(StoreRef, *ProcMod.get());
    ImpObjs.insert({HostRegistration::WasmEdge_Process, std::move(ProcMod)});
  }
}
```

### Question
1. What language will programmers write their applications in? Rust or C++?
2. Compilers for C++ into Wasm?
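To tie the excerpts under "Imports" together, here is a self-contained toy model of the flow traced there: an import object registers named host functions, a store resolves the `(module, field)` pair that an `(import ...)` entry names, and the function body reads guest memory and writes its result through an out-pointer. Everything in it (`ToyImportObject`, `ToyStore`, `ToyEnv`, the module name `toy_env`, the function `write_bytes`, and the error codes) is hypothetical and greatly simplified; it is not WasmEdge's actual API.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a Wasm linear memory (NOT WasmEdge's MemoryInstance).
using LinearMemory = std::vector<uint8_t>;

// In this model every host function receives the guest memory plus raw i32
// arguments and returns an errno-style i32, where 0 means success.
using HostFunc =
    std::function<uint32_t(LinearMemory &, const std::vector<uint32_t> &)>;

// Host-side state shared by the functions of one module (hypothetical).
struct ToyEnv {
  std::string Sink; // collects the bytes the guest asked us to write
};

// One import module: a module name plus its named host functions.
class ToyImportObject {
public:
  explicit ToyImportObject(std::string Name) : ModName(std::move(Name)) {}

  void addHostFunc(const std::string &Field, HostFunc F) {
    assert(F && "host function must be callable");
    Funcs[Field] = std::move(F);
  }

  const std::string &name() const { return ModName; }

  const HostFunc *find(const std::string &Field) const {
    auto It = Funcs.find(Field);
    return It == Funcs.end() ? nullptr : &It->second;
  }

private:
  std::string ModName;
  std::map<std::string, HostFunc> Funcs;
};

// Resolves the (module, field) pair named by an `(import ...)` entry.
class ToyStore {
public:
  // The store keeps a pointer, so Obj must outlive the store.
  void registerModule(const ToyImportObject &Obj) { Mods[Obj.name()] = &Obj; }

  const HostFunc *resolve(const std::string &Mod,
                          const std::string &Field) const {
    auto It = Mods.find(Mod);
    return It == Mods.end() ? nullptr : It->second->find(Field);
  }

private:
  std::map<std::string, const ToyImportObject *> Mods;
};

// Builds a module "toy_env" exporting "write_bytes"(ptr, len, nwritten_ptr).
ToyImportObject makeToyModule(ToyEnv &Env) {
  ToyImportObject Mod("toy_env");
  Mod.addHostFunc("write_bytes",
                  [&Env](LinearMemory &Mem,
                         const std::vector<uint32_t> &Args) -> uint32_t {
    if (Args.size() != 3)
      return 1; // hypothetical "invalid argument" code
    const uint64_t Ptr = Args[0], Len = Args[1], OutPtr = Args[2];
    if (Ptr + Len > Mem.size() || OutPtr + sizeof(uint32_t) > Mem.size())
      return 2; // hypothetical "out of bounds" code
    Env.Sink.append(reinterpret_cast<const char *>(Mem.data() + Ptr), Len);
    const uint32_t Written = static_cast<uint32_t>(Len);
    std::memcpy(Mem.data() + OutPtr, &Written, sizeof(Written)); // out-param
    return 0;
  });
  return Mod;
}
```

A caller would create a `ToyEnv`, build the module with `makeToyModule(Env)`, register it in a `ToyStore`, and then call `resolve("toy_env", "write_bytes")`; a null result corresponds to an import that the host never registered.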
## Week 3

### Milestones
- Add host function `compute` in the wasi_nn directory (`compute` is defined in WASI-NN)
    - Why not use `-` (hyphen) in the directory name? May cause problems in cmake
    - `cmake ..` takes a lot of time. May switch to the workstation at school for development
- Successfully built the whole project
- Successfully executed my custom host function
- Possible test program for the test suite, from wasmtime CI
    - https://github.com/bytecodealliance/wasmtime/blob/main/crates/wasi-nn/examples/classification-example/src/main.rs

install rustup

Q: What is the difference between rustup and cargo?
A: Toolchain installer (rustup) vs. build tool and package manager (cargo)

Q: What tool to use for auto-generating templates from witx?
A: See the ==`util` directory==

[witx-bindgen](https://github.com/bytecodealliance/witx-bindgen)?
[WITX-CodeGen](https://github.com/jedisct1/witx-codegen)?

```w
(import "wasi_ephemeral_nn" "load" (func $_ZN7wasi_nn9generated17wasi_ephemeral_nn4load17h78e0b712ae6cca6dE (type $t8)))
(import "wasi_ephemeral_nn" "init_execution_context" (func $_ZN7wasi_nn9generated17wasi_ephemeral_nn22init_execution_context17h3f29b2c9b8850d7fE (type $t2)))
(import "wasi_ephemeral_nn" "set_input" (func $_ZN7wasi_nn9generated17wasi_ephemeral_nn9set_input17h927becfabf8599e3E (type $t7)))
(import "wasi_ephemeral_nn" "get_output" (func $_ZN7wasi_nn9generated17wasi_ephemeral_nn10get_output17h0fe3fded1a895cc4E (type $t8)))
(import "wasi_ephemeral_nn" "compute" (func $_ZN7wasi_nn9generated17wasi_ephemeral_nn7compute17h74f1090b3a05c1cfE (type $t6)))
```

Q: Why use a base class when creating host functions?
A: ~~Some data needs to be shared among functions. The data to be shared will be declared as `protected` member data.~~
A2: ~~Common data that will be used among a set of host functions.
Each child class has one.~~
A3: It's actually a ==**reference**==
```cpp
template <typename T>
class WasmEdgeProcess : public Runtime::HostFunction<T> {
public:
  WasmEdgeProcess(WasmEdgeProcessEnvironment &HostEnv)
      : Runtime::HostFunction<T>(0), Env(HostEnv) {}

protected:
  WasmEdgeProcessEnvironment &Env;
};
```

Q: How to determine the function prototype from witx? It seems there are only four basic types in Wasm.

Q: `__WASI_ERRNO_SUCCESS`, `Expect<uint32_t>` What is the return value of a host function body?

Problem: `cmake ..` takes a lot of time to configure

Possible test program for the test suite (used in wasmtime CI):
https://github.com/bytecodealliance/wasmtime/blob/main/crates/wasi-nn/examples/classification-example/src/main.rs

~~Still~~ encountered an error when executing the wasm
```w
(module
  (type $t0 (func (param i32) (result i32)))
  (import "wasi_ephemeral_nn" "compute" (func $wasi_ephemeral_nn.compute (type $t0))))
```
```bash
$ ./wasmedge --reactor examples/hello_copy.wasm wasi_ephemeral_nn.compute 20
[2021-09-28 16:11:38.963] [error] wasmedge runtime failed: wasm function not found, Code: 0x05
[2021-09-28 16:11:38.964] [error]     When executing function name: "wasi_ephemeral_nn.compute"
```

:::info
Reason: did not export the function name

Final wasm

hello_copy.wasm
```w
(module
  (type $t0 (func (param i32) (result i32)))
  (import "wasi_ephemeral_nn" "compute" (func $wasi_ephemeral_nn.compute (type $t0)))
  (export "wasi_ephemeral_nn.compute" (func $wasi_ephemeral_nn.compute)))
```
:::

```bash
$ ./wasmedge --reactor examples/hello_copy.wasm wasi_ephemeral_nn.compute 222
WasiNNModule::WasiNNModule()
WasiNNCompute::body
0
```

## Week 4

- Trouble generating a C++ header from **wasi_ephemeral_nn.witx** using **wasi-cpp-header**
- Learning Rust syntax in order to improve the header generating tool (wasi-cpp-header)
- Study the witx Rust crate

```
(typename $tensor_data (list u8))

;;; A tensor.
(typename $tensor
  (record
    ;;; Describe the size of the tensor (e.g. 2x2x2x2 -> [2, 2, 2, 2]).
    ;;; To represent a tensor containing a single value,
    ;;; use `[1]` for the tensor dimensions.
    (field $dimensions $tensor_dimensions)

    ;; Describe the type of element in the tensor (e.g. f32).
    (field $type $tensor_type)

    ;;; Contains the tensor data.
    (field $data $tensor_data)
  )
)
```

> This script currently only supports using 'list' as a function parameter type, not in a record type
> [name=Shen-Ta Hsieh] [time=Tue, Oct 5, 2021 2:00 PM]

wasi-cpp-header was originally developed by the WebAssembly organization
https://github.com/WebAssembly/wasi-libc/blob/main/tools/wasi-headers/src/c_header.rs

> 1. Modify the wasi-cpp-header tool so that it supports that type, and send a PR back upstream
> 2. Like the wasi-crypto mentee, decide to write your own pure cpp backend: https://github.com/jedisct1/witx-codegen/pull/11/files
> 3. Work around the original witx files; if the changes are not especially large this is acceptable, otherwise it will be hard to sync with upstream
> [name=hydai] [time=Tue, Oct 5, 2021 3:20 PM]

### Moved to next week
- Implement function prototype
    - Compare that of wasmtime and WASI in WasmEdge
- Study the PyTorch C++ API

## Week 6

PyTorch C++

TorchScript

PyTorch models need to be converted to TorchScript to be executed purely from C++
![](https://i.imgur.com/wicdSCt.jpg)
![](https://i.imgur.com/1RUWetU.jpg)

## Week 7
<!-- ![](https://i.imgur.com/J1SangO.png) -->
![](https://i.imgur.com/g8yKWMU.png)

https://docs.openvino.ai/latest/openvino_docs_IE_DG_Deep_Learning_Inference_Engine_DevGuide.html

```
(typename $graph_encoding
  (enum (@witx tag u8)
    ;;; TODO document buffer order
    $openvino
  )
)
```

```
(@interface func (export "load")
  ;;; The bytes necessary to build the graph.
  (param $builder $graph_builder_array)
  ;;; The encoding of the graph.
  (param $encoding $graph_encoding)
  ;;; Where to execute the graph.
  (param $target $execution_target)

  (result $error (expected $graph (error $nn_errno)))
)
```

### ONNX Runtime
ONNX Model Zoo https://github.com/onnx/models
![](https://i.imgur.com/VH91xiJ.png)

## Problems:

witx definition
```clojure
(module $wasi_ephemeral_nn
  (import "memory" (memory))

  (@interface func (export "load")
    (param $builder $graph_builder_array)
    (param $encoding $graph_encoding)
    (param $target $execution_target)
    (result $error (expected $graph (error $nn_errno))))
)
```

Rust backend implementation
```rust
impl WasiEphemeralNn for WasiNnOnnxCtx {
    fn load(
        &mut self,
        builder: &GraphBuilderArray,
        encoding: GraphEncoding,
        target: ExecutionTarget,
    ) -> Result<Graph> {
        ...
    }
```

wasm raw interface
```as
(import "wasi_ephemeral_nn" "load" (func (param I32 I32 I32 I32 I32) (result I32)))
```

Guest
```rust
pub unsafe fn load(
    builder: GraphBuilderArray,
    encoding: GraphEncoding,
    target: ExecutionTarget,
) -> Result<Graph> {
    let mut graph = MaybeUninit::uninit();
    let rc = wasi_ephemeral_nn::load(
        builder.as_ptr(),
        builder.len(),
        encoding,
        target,
        graph.as_mut_ptr(),
    );
    if let Some(err) = Error::from_raw_error(rc) {
        Err(err)
    } else {
        Ok(graph.assume_init())
    }
}
```

## Week 8

```
$ LD_LIBRARY_PATH=/mnt/c/Users/st954/Desktop/VSCode/st9540808/WasmEdge/thirdparty/onnxruntime-linux-x64-1.9.0/lib \
    /mnt/c/Users/st954/Desktop/VSCode/st9540808/WasmEdge/build/tools/wasmedge/wasmedge \
    ./target/wasm32-wasi/debug/wasi-nn-examples.wasm
Hello, world!
```

## 11/23

```
docker exec -e LD_LIBRARY_PATH=/root/WasmEdge/thirdparty/onnxruntime-linux-x64-1.9.0/lib/ unruffled_aryabhata bash -c "cd /root/wasi-nn-examples/ && pwd && /root/WasmEdge/build/tools/wasmedge/wasmedge --dir .:. /root/wasi-nn-examples/target/wasm32-wasi/debug/wasi-nn-examples.wasm"
```

## Future work

I participate in the mentorship as part of the survey for my master's research. The following are things I can do in my master's thesis if I choose WasmEdge as my research subject.
- Evaluate the performance of WasmEdge when using WASI-NN. Specifically, the latency of inference, number of operations per second, etc.
- Deploy an application on my STM32 board. It might be related to the Wasm-C-API proposal because it needs to instantiate a VM instance after booting up the board.

## Appendices
- [How to Do Machine Learning on ARM Cortex-M Microcontrollers (in Chinese)](https://armkeil.blob.core.windows.net/developer/Files/pdf/white-paper/arm-how-to-do-machine-learning-on-arm-cortex-m-microcontrollers-tw.pdf)
- [WebAssembly/wasi-nn](https://github.com/WebAssembly/wasi-nn)
- [A beginner's guide to adding a new WASI syscall in Wasmtime](https://radu-matei.com/blog/adding-wasi-syscall)
- [CMSIS-NN: Efficient neural network kernels for Arm Cortex-M CPUs, arXiv:1801.06601](https://arxiv.org/abs/1801.06601)
- [Rust bindings for the C++ api of PyTorch](https://github.com/LaurentMazare/tch-rs)
- https://github.com/ARM-software/CMSIS_5/tree/develop/CMSIS/NN