###### tags: `Meetup`, `Co-writing`
# CNTUG Meetup #22
[TOC]
---
## Announcement

- Please take part in the community meetup co-writing notes: [HackMD: Cloud Native Taiwan User Group](https://hackmd.io/@CNTUG)
- Please support the Taiwanese startup HackMD: [Use HackMD professionally and collaborate at scale](https://hackmd.io/pricing)

- CNTUG <3 天瓏書局 (Tenlong Bookstore)
> Buying books and reading them at home is the fastest way to enrich your knowledge
---
## Session 1. How to achieve canary deployment on Kubernetes - John Chen ([Grindr](https://www.grindr.com/))
Internal QA considerations?
Stability, and how much service load can be sustained.
### Blue-Green Deployment
Pain points encountered:
* Connections are stateful (e.g. login credentials)
* New and old versions are incompatible
* Clustered systems
* Persistent data stores: data inconsistency is the biggest fear
### Canary Deployment
Gradually shift production traffic from version A to version B; traffic is usually split by percentage.
For example, 90% of requests go to version A and 10% to version B. The percentage to shift is for each team to evaluate: a low percentage may not surface problems promptly, so shift a larger share if you don't want to wait that long.
When it is a good fit:
* Connections are stateless
* There is no QA to run tests
* Traffic and latency issues between services
* Problems that only show up under heavy concurrency
### Original Deployment
* Use Route53 domain weights to split traffic
> Deployments provide version control, splitting responsibilities with ReplicaSets
### Component relationship
| Object Type | Cardinality |
| -------- | -------- |
| Pod : Container | 1:N |
| ReplicaSet : Pod | 1:N |
| Deployment : ReplicaSet | 1:N |
| Deployment : Service | 1:N |
| Service : Deployment | 1:N |
**Kubernetes Canary Deployment**
The percentage of traffic each version receives can be adjusted by tuning replica counts,
e.g. old Deployment => replicas: 3
new Deployment => replicas: 1 => 25% on the new version
The Service configuration needs no change.
By giving the two Deployments different labels and having the Service select only the labels they share, traffic reaches both versions, which achieves the canary deployment:
```yaml
# Stable Deployment: 9 replicas (~90% of traffic)
kind: Deployment
metadata:
  labels:
    app: guestbook
    tier: frontend
    track: stable
spec:
  replicas: 9
---
# Canary Deployment: 1 replica (~10% of traffic)
kind: Deployment
metadata:
  labels:
    app: guestbook
    tier: frontend
    track: canary
spec:
  replicas: 1
---
# Service: the selector omits `track`, so it matches both Deployments
# (in full manifests these labels also go on the Pod template,
# which is what the Service actually selects)
kind: Service
spec:
  selector:
    app: guestbook
    tier: frontend
```
### Questions
* 3 Deployments + 2 Services => canary deployment
* The YAML has to be edited and kubectl run by hand every time
* The commands are complex; what if someone runs the wrong one?
GitHub -> Jenkins -> Helm -> Kubernetes
A `helm upgrade` with the changed chart is all it takes; the value changes are made by Jenkins (see the sketch below).
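As a rough illustration, the canary ratio can live in the chart's values so that Jenkins only has to override numbers on each `helm upgrade`. A minimal `values.yaml` sketch; the key names (`stable.replicas`, `canary.replicas`, `canary.image.tag`) are hypothetical, not from the talk:
```yaml
# values.yaml (hypothetical keys; the chart would template both
# Deployments and the shared Service from these values)
stable:
  replicas: 9        # ~90% of traffic
canary:
  replicas: 1        # ~10% of traffic
  image:
    tag: v2.0.0      # illustrative new version under test
```
Jenkins can then shift the ratio with e.g. `helm upgrade myapp ./chart --set canary.replicas=2` instead of hand-editing manifests.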
### Conclusions
* A script can do it, and so can Jenkins.
* Don't let tools limit how you do CI/CD.
* A useful approach is what matters most.
* Get a firm footing before moving forward.
### Questions
* Preprod : production = 1:1?
=> Will shutting machines down affect releases?
* Traffic still ends up unevenly distributed?
=> Some services, such as Redis, never reconnect once a connection is established, so no matter how you adjust the ratio, traffic may never balance out.
=> Applications should treat connections as a temporary cache and ask for fresh connections every so often (RabbitMQ, etc.).
=> Consul can also be used for service discovery.
* Why not use Ingress?
=> On AWS, `type: LoadBalancer` is enough; one LB per service makes problems easier to trace, though it costs more (see the sketch after this list).
* Why not use Jenkins Pipeline?
=> Learning cost for engineers, plus the risk of Jenkins going down.
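Regarding the Ingress question above, a minimal sketch of the one-LB-per-service approach on AWS, reusing the session's guestbook labels; the ports are illustrative:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: guestbook
spec:
  type: LoadBalancer   # AWS provisions a dedicated ELB for this Service
  selector:
    app: guestbook
    tier: frontend
  ports:
    - port: 80         # load balancer port
      targetPort: 8080 # illustrative container port
```
One load balancer per service costs more, but keeps each service's traffic path isolated when debugging.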
### Supplement
Multiple Redis servers in one VM?
### Q&A
* How do you trace logs for the new version of a service?
=> In ES (Elasticsearch) you can look them up by image, but filtering on the image tag is not recommended, because image tags can keep changing; monitor on a fixed tag instead.
Approach: set tags such as `ENV: staging/prod` in the YAML, then filter on them to find a given stage's logs (see the sketch after this list).
* How do you decide the shift ratio?
=> Do the math, because different services don't necessarily run the same number of Pods, e.g. 50%, 25%.
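For the log-tracking answer above, a minimal sketch of stamping Pods with a fixed stage tag in the Deployment's Pod template; the label and variable names are illustrative:
```yaml
# Fragment of a Deployment spec (illustrative names)
spec:
  template:
    metadata:
      labels:
        env: staging          # fixed tag, stable across image changes
    spec:
      containers:
        - name: guestbook
          image: guestbook:v2
          env:
            - name: ENV
              value: staging  # staging / prod; filter on this in Elasticsearch
```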
---
## Session 2. Massive Bare-Metal Operating System Provisioning Improvement - Date Huang (Engineer, Edgecore Networks)
Focus on bare-metal deployment. Delivery modes:
* Unicast, e.g. MaaS (Metal as a Service)
* Broadcast
* Multicast
### Problems to be resolved
* Unicast is hard to scale up/out
=> server bandwidth is not enough
* Multicast/Broadcast is NOT always available
#### Solution: BitTorrent (peer-to-peer)
* A peer can be a sender too
* Reduces the load on the original sender
* Temporary storage is required
* Data order cannot be guaranteed
* Image size is limited by RAM
#### Solution: [EZIO](https://github.com/tjjh89017/ezio)
* EZIO design
    * Offsets and lengths are recorded in the file
    * A "piece" is the minimum unit in BitTorrent
    * Continuous blocks are cut into pieces
* Legacy approach
    * One server stuck, all servers stuck.
* BitTorrent deployment (EZIO)
    * Performs well in heterogeneous environments
    * Even if one server dies, data can still be fetched from the other servers
#### Data-moving competition
#### Clonezilla
[Download](https://clonezilla.nchc.org.tw/clonezilla-live/download/)
#### QA
* BitTorrent is used, so how is it tracked?
A private tracker.
* For efficiency, how is data synchronized?
Broadcast to everyone what each server holds, and retrieve pieces by rarity.
* What about provisioning a huge number of bare-metal machines?
NCHC's large compute clusters are done this way: install one machine first, then clone it out to the rest.
Multicast kept running into more and more problems, which is how Clonezilla came to be.
#### Reference
* Theoretical basis: https://www.mdpi.com/2076-3417/9/2/296
* https://github.com/tjjh89017/ezio
* https://www.libtorrent.org/
---
## Session 3. The Romance of SDN - First Steps (1/n) - JianHao Chen (SDN storyteller, Edgecore Networks)
Representing ONF
### Open Networking & SDN
```
Switch / Router Software - CP (control plane)
------------------------------- SDN
Switch / Router Software - DP (data plane)
------------------------------- Open Networking
Switch / Router - Hardware
```
### OpenFlow: Enabling
Datacenter to Datacenter => B4 WAN
Cloud Edge to Datacenter => B2 WAN
Google B4 Network: the commercialization of OpenFlow
SD-WAN (software-defined WAN)
hwchiu's B4 notes: https://www.hwchiu.com/b4-after.html
(The speaker welcomes corrections for anything that is wrong.)
### SDN: the Warring States era
1. SDN Controllers - 2010
NOX controller:
* C++
* You can write C++ applications on top of it and have them call the OpenFlow API.
POX controller:
* Python
* Similar to NOX, just written in Python.
2. Floodlight - 2011 ~ mid-2012
REST API
Written in Java
3. Ryu - 2011
Written in Python; performance issues and latency; more complicated than NOX and POX.
4. OpenDaylight - 2013
* OpenFlow backers included Cisco, Juniper(?)
* Supports a RESTful API
* Service Abstraction Layer, pleasant to develop on
* Integrates with OpenStack
* Centralized: once the controller dies, the switches do nothing
5. ONOS - 2014
* Distributed architecture
* Supports a failover mechanism (the remaining controllers take over the work of a failed controller).
### Data Plane
1. Virtual network concept
2. TTP/NDM model - 2012
3. OF-DPA - 2014
* Design of the models
* OF-DPA Premium for supported effort, limited to 100 machines (business case)
4. FlowVisor - 2012
* Inserts a proxy controller layer in the middle: toward the southbound it acts as a controller, while toward the northbound it appears to be just a switch; it works by this kind of deception.
5. VeRTIGO - 2012
* Topology abstraction: A-B-C => only A-B is visible
6. OpenVirteX - 2014
* Uses virtual MAC addresses
* Edge-switch concept: translates host/switch MAC addresses
7. VTN - 2013
* OVS switches
* Bridges overlay and underlay networks
### The questions that follow
How to design the chips, etc.: TBD
### Q&As
## ~~Lightning~~ Heavy Talk 1. LINE DevDay - Kyle Bai
### LINE Dev Day
### Kubernetes Development Survival Kit
* How to simulate the online development workflow locally
* Speed up the image deployment flow
### K8s local
* Minikube:
    * Pros: VM-based, with tons of options for customizing the cluster
    * Easily runs different K8s versions, container runtimes, and controllers
    * Choice of VM drivers
* KIND (Kubernetes IN Docker): ***the final choice***
    * Fast to start up
    * A Kubernetes SIGs project
    * Used to test Kubernetes itself
    * https://github.com/kubernetes-sigs/kind
    * Can run in most CI environments (Travis CI, CircleCI, etc.)
    * Cons: pushing images into the cluster is very slow
* MicroK8s:
    * Pros: does not need to run inside a VM
    * Container-only
    * Linux-only
* K3D (K3s inside a Docker container):
    * K3D runs K3s
    * Lots of optional and legacy features removed
    * A lightweight K8s distro
### KIND benefits
* Customized KIND - node image
* Customized KIND - config (see the sketch below)
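A minimal KIND config sketch, assuming KIND's `kind.x-k8s.io/v1alpha4` config API; the node image tag is illustrative:
```yaml
# kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    image: kindest/node:v1.16.3   # pin a Kubernetes version per cluster
  - role: worker
  - role: worker
```
Create the cluster with `kind create cluster --config kind-config.yaml`.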
### Observation & Debug
* kubectl port-forward
* kubefwd
* kubebox - check cluster status
    * A terminal UI for accessing the cluster
    * Access K8s easily
* kubespy
### Solution
* Draft
* Garden
* Skaffold **most commonly used** (see the sketch after this list)
* workflow:
* Tilt
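Since Skaffold was the most-mentioned option, a minimal `skaffold.yaml` sketch of the build-and-deploy loop, assuming the `skaffold/v1` schema; the image name and manifest path are hypothetical:
```yaml
apiVersion: skaffold/v1
kind: Config
build:
  artifacts:
    - image: example/guestbook   # hypothetical image to rebuild on change
deploy:
  kubectl:
    manifests:
      - k8s/*.yaml               # hypothetical manifest location
```
`skaffold dev` then watches the source, rebuilds the image, and redeploys on every change.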
### Q&A
What do you use for monitoring?
* Prometheus **most common**
* CloudWatch
How are Pods protected from each other?
* VPC
## ~~Lightning~~ Heavy Talk 2. Kubeflow v0.7 Multi-user Auth with Istio - Jack Lin
### What is Kubeflow?
How to train AI models on K8s.
### Service
Kubeflow can be used to spin up a Jupyter Notebook directly (see the sketch below).
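As a hedged sketch of what that boils down to: the notebook-controller reconciles a `Notebook` custom resource into a running Jupyter Pod. The API version and fields may differ between Kubeflow releases, and all names here are illustrative:
```yaml
apiVersion: kubeflow.org/v1beta1    # may vary by Kubeflow release
kind: Notebook
metadata:
  name: my-notebook
  namespace: my-profile             # a user's profile namespace
spec:
  template:
    spec:                           # an ordinary PodSpec
      containers:
        - name: my-notebook
          image: jupyter/base-notebook   # illustrative notebook image
```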
### Multi-user Isolation
### Design Decision
* Istio:
    * Makes controlling the network on K8s more convenient
    * HTTP routing
    * RBAC authorization
### High Level Service Solution
* Istio Gateway (see the sketch after this list)
* Envoy filter -> OIDC AuthService -> third-party auth provider -> IdP (LDAP)
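A minimal sketch of the Istio entry point this chain hangs off, assuming Istio's `networking.istio.io/v1alpha3` API; Kubeflow ships its own gateway, so the names here are illustrative:
```yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: kubeflow-gateway
  namespace: kubeflow
spec:
  selector:
    istio: ingressgateway   # bind to Istio's ingress gateway Pods
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "*"               # the Envoy filter / AuthService guards requests behind this
```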
### Profile Controller
[Component LINK](https://github.com/kubeflow/kubeflow/tree/master/components/profile-controller)
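For reference, a hedged sketch of the Profile resource this controller reconciles into a namespace plus the matching RBAC and Istio policies; the API version follows the Kubeflow v0.7-era CRD and the user identity is illustrative:
```yaml
apiVersion: kubeflow.org/v1beta1
kind: Profile
metadata:
  name: demo-user                  # also becomes the namespace name
spec:
  owner:
    kind: User
    name: demo-user@example.com    # illustrative identity from the IdP
```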