# Week 12 - Virtualization, Cloud and Serverless Computing
###### tags: `WS2020-IN2259-DS`

## Virtualization
> "In computing, **`virtualization`** refers to the act of **creating a `virtual (rather than actual) version` of something**, including virtual computer hardware platforms, storage devices, and computer network resources."
>
> by Wikipedia

### Many Forms of Virtualization
* **Server Virtualization**: the most popular form
* Storage Virtualization
* Network Virtualization
* Desktop Virtualization
* I/O Virtualization
    * e.g., network interface cards, fabric adapters

### Why Virtualization?
* Problem: without virtualization, one server can host only one OS or application.
    * High up-front costs
    * High ongoing costs
    * Most of the server resources are wasted

### What is Server Virtualization?
* The OS is **abstracted** away from the hardware
* Multiple OSes and applications run on one server, on top of the **`virtual layer`**, also known as the **`Hypervisor`**
* Diagram:
![](https://i.imgur.com/cyyxqCH.png =200x)

### Hypervisor
* Creates the **`virtualization layer`** that makes server virtualization possible
* Contains the **`Virtual Machine Monitor (VMM)`**
* Examples of hypervisors
    * KVM
    * Oracle VirtualBox

### Type 1 Hypervisor
* **`Native Hypervisor`** or **`Bare-metal Hypervisor`**
* A Type 1 hypervisor is loaded directly on the **`hardware`**
* Example
    * Microsoft Hyper-V
    * VMware ESX/ESXi
    * XEN
    * ==KVM (the lecturer got this one wrong)==
* Diagram:
![](https://i.imgur.com/ET59W8e.png =x250)

### Type 2 Hypervisor
* **`Hosted Hypervisor`**
* A Type 2 hypervisor is loaded on an **`OS`** running on the hardware
* Example
    * Oracle VirtualBox
    * VMware Workstation
* Diagram:
![](https://i.imgur.com/CUhqwOY.png =x250)

### Disadvantages of Hypervisors
* The server's resources are shared among VMs
* Each guest OS consumes resources (CPU/RAM/disk) of its own, independent of the app it hosts
* An OS can be **`expensive`** and **`requires administration`**

## Cloud Computing
### What is Cloud Computing?
* A **`computing service`** that you traditionally **ran locally** is now **performed remotely** (off-premises, on equipment outside your own server room)
* **`Cloud computing`** is an approach to computing that leverages the **efficient pooling** of an on-demand, self-managed, virtual infrastructure

### Advantages of Cloud Computing
* **`Fast`** and **`efficient`** access to the resources that clients actually need
* Pay for what you use
* No initial capital expenditure
    * A startup can begin operating right away.
* Less need for a big administrator team
    * Management costs drop substantially.
* **`Self-maintaining`** and **`fault-tolerant`** resources
    * The cloud provider is responsible for these.

### Disadvantages of Cloud Computing
* Clients need to **trust the cloud providers**
    * Can the cloud provider really be trusted?
* Possibly limited access to your data
* Possible conflicts with government regulations and restrictions
    * May sensitive private data, such as personal health records, be stored in the cloud?
* Potential data loss
* Lock-in to the cloud provider's specifications
    * You have to play by their rules.
* You may encounter unexpected costs (check the SLA)
    * For example, forgetting to shut down cloud services you no longer use.
* Conclusion: Know who you are dealing with!

### Evolution of Cloud Computing
* The 1950s – Mainframe Computing
* 1960 – 1990
    * Internet, VMs, VPNs
* 1999 – Salesforce, the first SaaS
* 2006 – Amazon Web Services
* 2008 – Microsoft Azure
* 2013 – Google Compute Engine
* 2014 – IBM Bluemix

### Cloud Comes in Many Shapes!
* Infrastructure-as-a-Service (IaaS)
* Platform-as-a-Service (PaaS)
* Software-as-a-Service (SaaS)
* Function-as-a-Service (FaaS)
* Database-as-a-Service (DBaaS)
* Everything-as-a-Service (*aaS, XaaS)

### Infrastructure as a Service (IaaS)
* **`VMs`** offered as a service
    * **`Private Clouds`** run on your own premises, e.g., OpenStack.
    * **`Public Clouds`** run over the Internet, e.g., AWS.
    * A **`Hybrid Cloud`** is not a separate architecture of its own but the two kinds of cloud above working together.
![](https://i.imgur.com/3wYtS4k.png =400x)
* Virtualization vs. Private Cloud
    * Cloud computing relies on virtualization technology.
    * Virtualization provides **`scalability`**, **`fault-tolerance`**, **`high availability`**, and **`load balancing`**.
    * A private cloud is built on top of virtualization.
    * A private cloud adds **`abstraction of resources`**, **`secure multi-tenancy`**, and **`better separation of concerns`**.

### Platform-as-a-Service (PaaS)
* Clients do not need to manage storage, network, OS, databases, and so on.
* The cloud provider installs all the software dependencies the client needs.
* Examples
    * Google App Engine
    * AWS Elastic Beanstalk
    * MS Azure App Service
    * Heroku
* IaaS vs PaaS
    * IaaS is a VM with an OS installed; clients must install every package they need themselves.
    * PaaS is built on top of IaaS; clients get direct access to packages that the provider has already installed and configured.
    * PaaS is an excellent fit for **fast and simple** **application design and development**.

### Software-as-a-Service (SaaS)
* Clients do not install, configure, or maintain any hardware or software.
* In many cases it is more affordable than comparable alternatives.
    * Buy your own license, or use SaaS?
* Clients get quick access to the latest features (most recent patches).
* Examples
    * Google Docs
    * Salesforce
    * Gmail

### Pizza-as-a-Service (Funny!!!)
![](https://i.imgur.com/OMI7Vt3.png)

## Serverless Computing
* Further reading: [Serverless & FaaS. 順帶一提 SaaS, PaaS & IaaS | by 施靜樺 | Medium](https://medium.com/@jinghua.shih/serverless-faas-3b607f0158fe)

### Evolution of Serverless
* We already have the cloud, so why serverless computing?
![](https://i.imgur.com/Ii2Dfla.png =400x)
* How the design of large applications has evolved:
    * **`Monolithic Application`**: a single application.
    * **`Microservices`**: spread the traffic across services for more efficient processing, at the cost of many more processes.
    * **`High Availability (HA)`**: redundancy mechanisms, which increase cost.
    * Guarding against **`regional outages`**: partitioning across regions, which adds management cost and complexity.
    * Diagram:
![](https://i.imgur.com/AmoG3Jf.png =600x)
* This is why we need **`serverless computing`**. The figure below shows where serverless fits:
![](https://i.imgur.com/57WpRff.png =400x)

### What is Serverless?
* Also known as **`Function-as-a-Service (FaaS)`**.
* A **cloud-native platform** designed for **`short-running, stateless computation`** and **`event-driven applications`**.
    * It is still essentially a **platform**!
* Scales up and down instantly and automatically.
    > Auto-scalability and maintenance
* Usage is metered and billed at millisecond granularity.
    > Pay for what you use
* Serverless does not mean there are no servers; it means "**worry-less servers**".
    * The provider takes care of that part of the work.

### What is Serverless Good For?
* A lifesaver for **`short-running stateless event-driven operations`**.
    * One request in, one action out.
    * No earlier state has to be kept for later reference.
    * Examples:
        * Microservice
        * Mobile Backends
        * Bots, ML Inferencing
        * IoT
        * Modest Stream Processing
        * Service Integration
* A poor fit for **`long-running stateful computationally heavy operations`**.
    * Examples:
        * Databases
        * Deep Learning Training
        * Heavy-Duty Stream Analytics
        * Spark/Hadoop Analytics
        * Video Streaming
        * Numerical Simulations

### Serverless Platforms
* All major cloud providers offer serverless platforms
![](https://i.imgur.com/ktvb5fb.png)

### Apache OpenWhisk Serverless Architecture
* Open source
* IBM Cloud Functions is implemented on top of OpenWhisk

#### OpenWhisk Programming Model
![](https://i.imgur.com/nNtbAMR.png =300x)

#### Serverless Actions
* An action is a **`stateless function`** that is executed in **response to an `event`**
* Node.js example:
```javascript
// The action receives its parameters as an object and returns a result object.
function main(params) {
    console.log("Hello " + params.name);
    return { msg: "Goodbye " + params.name };
}
```
* Python example:
```python
# Handler written in the AWS Lambda style: it receives the event and a context object.
def lambda_handler(event, context):
    print("hello world")
```

#### Triggers, Rules and Sequences
* An **`event trigger`** starts the execution of one **`action`** or a chain of actions (a **`sequence`**).
* A **`rule`** maps an **`event trigger`** to an **`action`**.
* Diagram:
![](https://i.imgur.com/ftigR94.png)

### Apache OpenWhisk Example

#### Step 1. Entering the system
* Client: `POST /api/v1/namespaces/myNamespace/actions/myAction`
* Diagram:
![](https://i.imgur.com/l5ntACn.png =600x)

#### Step 2. Handle the request
* The `Controller` is built with `Scala` and the `akka` library, which it uses to **handle the `serverless event`**.
* Diagram:
![](https://i.imgur.com/qQVvNtN.png =600x)

#### Step 3. Authentication + Authorization
* The `Controller` **authenticates** the client against an external identity store and then **authorizes** the request.
* `CouchDB` stores the client identity data.
* Diagram:
![](https://i.imgur.com/RqrFMea.png =600x)

#### Step 4. Get the action
* **Fetch the `serverless action` from `CouchDB`**.
* Diagram:
![](https://i.imgur.com/YTMYlT1.png =600x)

#### Step 5. Looking for a home
* Find a "**place**" to execute this `serverless action`.
    * The "place" can be a VM or a container, also called a **`slave`**.
* `Load balancer`: responsible for finding a suitable `slave`.
* `Consul` stores information such as slave health and slave load. It provides:
    * Sequentially consistent KV store
    * Replication, Fault Tolerance
    * Health Check / Monitoring utilities
* Diagram:
![](https://i.imgur.com/hpYBZ6k.png =600x)

#### Step 6. Get in line!
* Once the target `slave` is found, the `action` is first sent to the VM that runs `kafka`.
* **`kafka`** is a **`Pub/Sub-based Messaging Queue Application`**. It places all clients' `actions` in queues and then delivers them in order to the `slaves` for processing.
    * High throughput fault-tolerant queues
    * Point-to-point messages via topics
    * Explicit load balancing
* Diagram:
![](https://i.imgur.com/a2f1vVi.png =600x)

#### Step 7. Get to work!
* Each **`slave`** runs an **`invoker`**, which uses **`containers`** to execute **`actions`**.
    * For example, Docker containers.
    * **Isolation**: every user gets their own container.
    * `Containers` can be reused.
* The **`container pool`** manages all containers:
    * allocates
    * garbage collects
* Diagram:
![](https://i.imgur.com/jVMSINb.png =600x)

#### Step 8. Store the results
* Return the result to the client.
* Store the logs in CouchDB.
    * The logs can be used both to look up past activity and to compute charges.
* Diagram:
![](https://i.imgur.com/ntf2W4T.png =600x)
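
Steps 5 to 7 can be sketched as a small in-memory simulation. This is only an illustrative model, not OpenWhisk code: the `Worker` class, `pick_worker`, `submit`, and `greet` are hypothetical names. A plain list of workers stands in for the health and load data that Consul would hold, a `deque` stands in for a Kafka topic, and a dictionary stands in for the container pool.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical stand-in for one `slave` (a VM or container host)."""
    name: str
    healthy: bool = True
    load: int = 0                                # activations waiting to run
    queue: deque = field(default_factory=deque)  # stands in for a Kafka topic
    pool: dict = field(default_factory=dict)     # user -> warm container id
    created: int = 0                             # containers allocated so far

    def container_for(self, user: str) -> str:
        """Step 7: reuse the user's warm container, or allocate a new one."""
        if user not in self.pool:
            self.created += 1
            self.pool[user] = f"{self.name}-c{self.created}"
        return self.pool[user]

    def drain(self) -> list:
        """Step 7: run queued activations in FIFO order, return the results."""
        results = []
        while self.queue:
            user, action, params = self.queue.popleft()
            container = self.container_for(user)
            results.append({"container": container, "result": action(params)})
            self.load -= 1
        return results


def pick_worker(workers: list) -> Worker:
    """Step 5: the load balancer picks the healthy worker with the lowest load."""
    healthy = [w for w in workers if w.healthy]
    if not healthy:
        raise RuntimeError("no healthy worker available")
    return min(healthy, key=lambda w: w.load)


def submit(workers: list, user: str, action, params: dict) -> Worker:
    """Steps 5-6: find a home for the activation, then put it in that queue."""
    worker = pick_worker(workers)
    worker.queue.append((user, action, params))
    worker.load += 1
    return worker


def greet(params: dict) -> dict:
    """A stateless action in the style of the Node.js example in these notes."""
    return {"msg": "Goodbye " + params["name"]}


workers = [Worker("w1"), Worker("w2", healthy=False), Worker("w3")]
first = submit(workers, "alice", greet, {"name": "Ada"})
second = submit(workers, "alice", greet, {"name": "Bob"})
outputs = [r for w in workers for r in w.drain()]
```

The unhealthy worker `w2` is never chosen, and the second request goes to `w3` because `w1` already has one activation queued. A later request from the same user that lands on `w1` reuses the container allocated there, which mirrors the container reuse described in Step 7.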