tags: Meetup, Co-writing

CNTUG Meetup #22


Announcement


  • CNTUG <3 天瓏書局

Buying plenty of books to read at home is the fastest way to build up your knowledge.


Session 1. How to achieve canary deployment on Kubernetes - John Chen (Grindr)

Internal QA considerations?

Stability, and how much load the service can handle

Blue-Green Deployment (藍綠部署)

Problems encountered

  • Connections are stateful (account credentials)
  • Old and new versions are incompatible
  • Clustered systems
  • Persistent data stores: data inconsistency is the worst-case scenario

Canary Deployment (金絲雀部署)

Gradually shift production traffic from version A to version B, usually split by ratio.
For example, 90% of requests go to version A and 10% to version B. The shift percentage is up to each team to judge: a low percentage may not surface problems promptly, so if you don't want to wait that long, shift a larger share.

When it is a good fit:

  • Connections are stateless
  • No QA available for testing
  • Traffic and latency issues between services
  • Problems that only appear under heavy concurrency

Original Deployment

  • Use Route53 weighted DNS records (domain weight) to split traffic

Deployments are version-controlled and divide the work with ReplicaSets

Component relationships

Object type and corresponding ratio
Pod : Container = 1:N
ReplicaSet : Pod = 1:N
Deployment : ReplicaSet = 1:N
Deployment : Service = 1:N
Service : Deployment = 1:N

Kubernetes Canary Deployment
You can adjust the percentage of traffic each version receives by tuning replicas,
e.g. old Deployment => replicas: 3
new Deployment => replicas: 1 => 25% to the new version

The Service configuration does not need to change
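The replica arithmetic can be sketched quickly (a minimal illustration, not tooling from the talk; it assumes the Service load-balances evenly across all the Pods it selects):

```python
def canary_share(stable_replicas: int, canary_replicas: int) -> float:
    """Percent of traffic reaching the canary, assuming the Service
    spreads requests evenly across every Pod it selects."""
    total = stable_replicas + canary_replicas
    return 100.0 * canary_replicas / total

# old deployment replicas: 3, new deployment replicas: 1 => 25% to the new version
print(canary_share(3, 1))  # 25.0
# a 9:1 split gives the canary 10% of the traffic
print(canary_share(9, 1))  # 10.0
```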

By assigning different labels, the Service routes to both Deployments at once, which is how the canary split is achieved:

Deployment (Stable Version)
labels:
 - app: guestbook
 - tier: frontend
 - track: stable
Replicas: 9

Deployment (Canary Version)
labels:
 - app: guestbook
 - tier: frontend
 - track: canary
Replicas: 1

Service 
selector:
 - app: guestbook
 - tier: frontend
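Written out as full manifests, that label scheme might look like this (a sketch with assumed image names and ports; only `track` differs between the two Deployments, and the Service selector deliberately omits `track` so it matches the Pods of both):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: guestbook-stable
spec:
  replicas: 9
  selector:
    matchLabels: {app: guestbook, tier: frontend, track: stable}
  template:
    metadata:
      labels: {app: guestbook, tier: frontend, track: stable}
    spec:
      containers:
        - name: guestbook
          image: example/guestbook:v1   # assumed image name
          ports: [{containerPort: 80}]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: guestbook-canary
spec:
  replicas: 1
  selector:
    matchLabels: {app: guestbook, tier: frontend, track: canary}
  template:
    metadata:
      labels: {app: guestbook, tier: frontend, track: canary}
    spec:
      containers:
        - name: guestbook
          image: example/guestbook:v2   # assumed image name
          ports: [{containerPort: 80}]
---
apiVersion: v1
kind: Service
metadata:
  name: guestbook
spec:
  selector:        # no `track` key: selects stable and canary Pods alike
    app: guestbook
    tier: frontend
  ports:
    - port: 80
```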

Questions

  • 3 Deployments + 2 Services => canary deployment
  • The YAML has to be edited and kubectl run every single time
  • The commands are complex; what if someone runs the wrong one?

Github -> Jenkins -> Helm -> Kubernetes
A helm upgrade with the changed chart is all it takes; the value changes are driven by Jenkins

Conclusions

  • A script can do it, Jenkins can do it, either works.
  • Don't be limited by tools when doing CI/CD
  • A workable approach is what matters most.
  • Get a stable footing before moving forward

Questions

  • Preprod : production = 1:1?
    => Would shutting those machines down affect releases?
  • Traffic still isn't evenly distributed?
    => Some services, such as Redis, never reconnect once a connection is established, so no matter how you adjust the ratio, traffic may never balance out.
    => Applications should treat connections as a temporary cache and periodically ask for new connections (RabbitMQ, etc.)
    => Consul can also be used for service discovery
  • Why not use Ingress?
    => On AWS, type = LoadBalancer is enough; one LB per service makes problems easier to trace, but costs more
  • Why not use Jenkins Pipeline?
    => Engineers' learning cost, plus the risk of Jenkins going down

Supplement

Multiple redis servers in one VM?

Q&A

  • How do you trace the logs of the new version?
    => In ES (Elasticsearch) you can look them up by image, but distinguishing versions by image tag is not recommended, because image tags may keep changing; use a fixed tag for monitoring instead.

Approach: set tags such as ENV: staging/prod in a section of the YAML, then use filtered searches to find the logs of a given stage
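A sketch of that tagging (the variable name ENV and the staging/prod values come from the notes; the container and image names are assumptions):

```yaml
# Pod template fragment: tag each container with its stage so log
# pipelines (e.g. Elasticsearch filters) can search by ENV
spec:
  containers:
    - name: app                    # assumed container name
      image: example/app:stable    # assumed image
      env:
        - name: ENV
          value: staging           # or "prod"
```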

  • How do you decide the shift ratio?
    => Calculate it, because different services don't necessarily run the same number of Pods, e.g. 50%, 25%

Session 2. Massive Bare-Metal Operating System Provisioning Improvement - Date Huang (Engineer, Edgecore Networks)

Focus on Bare Metal Deployment

Unicast:

  • MaaS: Metal as a Service

Broadcast

Multicast

Problems need to be resolved

  • Unicast is hard to scale up/out
    => Server bandwidth is not enough
  • Multicast/Broadcast is NOT always available
Solution: BitTorrent

[Peer to Peer]

  • A peer can be a sender too
    • Reduces the stress on the origin sender
  • Temporary storage is required
    • Can't guarantee data order
    • Image size is limited by RAM

Solution: EZIO

  • EZIO design

    • offsets and lengths are recorded in a file
    • a "piece" is the minimum unit in BitTorrent
    • continuous blocks are cut into pieces
  • Legacy approach

    • One server stuck, all servers stuck.
  • BitTorrent Deployment (EZIO)

    • Performs well in heterogeneous environments
    • Even if one server dies, data can still be pulled from the other servers
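The offset/length idea can be shown with a toy sketch (a simplified model for illustration only, not EZIO's actual code): the used extents of a partition are recorded as (offset, length) pairs, and each continuous range is cut into fixed-size BitTorrent pieces.

```python
PIECE_SIZE = 4  # tiny piece size for illustration; real pieces are far larger

def cut_into_pieces(extents, piece_size=PIECE_SIZE):
    """Cut (offset, length) extents into piece-aligned (offset, length) chunks."""
    pieces = []
    for offset, length in extents:
        pos, end = offset, offset + length
        while pos < end:
            take = min(piece_size, end - pos)  # last chunk may be short
            pieces.append((pos, take))
            pos += take
    return pieces

# a used extent of 10 bytes at offset 100 becomes chunks of at most 4 bytes
print(cut_into_pieces([(100, 10)]))  # [(100, 4), (104, 4), (108, 2)]
```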

Data-moving speed contest

CloneZilla

Download link

QA

  • BT is used; how do you track it?
    A private tracker
  • For efficiency, how is data synchronized?
    Broadcast to everyone what each server has, and retrieve pieces by rarity.
  • What about provisioning bare metal at very large scale?
    NCHC (國網中心) has huge compute clusters: one machine is installed first, then cloned out to the rest.
    As multicast problems kept piling up, that is how CloneZilla came to be.
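"Retrieve by rarity" is BitTorrent's classic rarest-first strategy: count how many peers hold each piece and fetch the scarcest one you still need. A toy sketch (assumed data shapes, not the actual implementation):

```python
from collections import Counter

def rarest_first(peer_pieces, have):
    """Pick the piece we still need that the fewest peers hold (rarest first)."""
    counts = Counter()
    for pieces in peer_pieces.values():
        counts.update(pieces)
    needed = [(count, piece) for piece, count in counts.items() if piece not in have]
    return min(needed)[1] if needed else None  # ties broken by piece id

# piece 2 is held by only one peer, so it is fetched first
peers = {"a": {0, 1}, "b": {0, 1, 2}, "c": {0}}
print(rarest_first(peers, have=set()))  # 2
```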

Reference


Session 3. SDN演義 - 初出茅廬 (1/n) ("The SDN Saga - A Newcomer Sets Out") - JianHao Chen (SDN storyteller, Edgecore Networks)

Representing ONF

Open Networking & SDN

Switch / Router Software - CP
-------------------------------  SDN
Switch / Router Software - DP
-------------------------------  Open Networking
Switch / Router - Hardware

OpenFlow: Enabling

Datacenter to Datacenter => B4 WAN
Cloud Edge to Datacenter => B2 WAN

Google B4 Network: the commercialization of OpenFlow
SD-WAN (software-defined WAN)

hwchiu's (邱牛's) B4 write-up: https://www.hwchiu.com/b4-after.html
(the speaker says corrections are welcome)

The SDN Warring-States Era

  1. SDN Controller - 2010

NOX-Controller :

  • C++
  • You can write C++ applications on top of it that call the OpenFlow API.

POX-Controller :

  • Python
  • Similar to NOX, but written in Python.
  1. Floodlight - 2011 ~ mid-2012
    REST API
    Programmed in Java

  2. Ryu - 2011
    Programmed in Python; performance issues, latency; more complicated than NOX and POX.

  3. OpenDaylight - 2013

    • Supporters of OpenFlow included Cisco, Juniper(?)
    • Supports a RESTful API
    • Service Abstraction Layer, easy to develop against
    • OpenStack integration
    • Centralized - once the controller dies, the switches do nothing
  4. ONOS - 2014

    • Distributed architecture
    • Supports a failover mechanism (the other controllers take over the work of a failed controller).

Data Plane

  1. Virtual Network concept

  2. TTP/NDM model - 2012

  3. OF-DPA - 2014

    • the design of the models
    • OF-DPA Premium for support effort, limited to 100 machines (business case)
  4. FlowVisor - 2012

    • A Proxy Controller sits in the middle: to the southbound side it appears to be a controller, while to the northbound side it appears to be just a switch; it works by this kind of deception.
  5. VeRTIGO - 2012

    • Topology abstraction: A-B-C => only A-B is visible
  6. OpenVirteX - 2014

    • Uses virtual MAC addresses
    • Edge switch concept; translates host/switch MAC addresses
  7. VTN - 2013

    • OVS Switch
    • Bridges overlay and underlay networks

The question that follows

How the chips should be designed, etc.: TBD

Q&As

Lightning Heavy Talk 1. LINE DevDay - Kyle Bai

LINE Dev Day

Kubernetes Development Survival kit

  • How to simulate the production development workflow locally
  • Speed up the image-deploy flow

K8s local

  • Minikube:
    • Pros: VM-based; tons of options for customizing the cluster
    • easily run different k8s versions, container runtimes, and controllers
    • choice of VM drivers
  • KIND (Kubernetes IN Docker): the final choice
    • Fast to start up
    • Kubernetes SIGs
    • Used to test Kubernetes itself.
    • https://github.com/kubernetes-sigs/kind
    • Can run in most CI environments (Travis CI, CircleCI, etc.)
    • Cons: pushing images into the cluster is very slow.
  • MicroK8s:
    • Pros: does not need to run in a VM
    • container-only
    • Linux-only
  • K3D(K3s) - (Inside a Docker container):
    • K3D runs K3s
    • lots of optional and legacy features removed.
    • lightweight K8s distro

Benefits of KIND

  • Customized KIND - node image
  • Customized KIND - config
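A customized KIND cluster config might look like this (a sketch; the node image tag is an assumption, pin whatever Kubernetes version you need to test):

```yaml
# kind-config.yaml: multi-node cluster with a pinned node image
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4   # API version depends on your kind release
nodes:
  - role: control-plane
    image: kindest/node:v1.16.3      # assumed tag; selects the k8s version
  - role: worker
  - role: worker
```

Then: `kind create cluster --config kind-config.yaml`.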

Observation & Debug

  • kubectl port-forward
  • kubefwd
  • kubebox - check cluster status
    • A terminal UI to access cluster
    • access k8s easily
  • kubespy

Solution

  • Draft
  • Garden
  • Skaffold (the most common)
    • workflow:
  • Tilt

Q&A

What do you use for monitoring?

  • Mostly Prometheus
  • CloudWatch
    Protection between Pods
  • VPC

Lightning Heavy Talk 2. Kubeflow v0.7 Multi-user Auth with Istio - Jack Lin

What is Kubeflow ?

How to train AI models on K8s

Service

You can launch a Jupyter Notebook directly from Kubeflow

Multi-user Isolation

Design Decision

  • Istio:
    • makes network control in K8s more convenient
    • HTTP Routing
    • RBAC authorization
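As a concrete illustration of the HTTP-routing role, a VirtualService of roughly this shape (the names, host, and path prefix here are assumptions for illustration, not taken from Kubeflow's actual manifests) routes one user's notebook traffic through the shared gateway into that user's namespace:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: alice-notebook          # assumed name
spec:
  hosts:
    - "*"
  gateways:
    - kubeflow-gateway          # assumed shared gateway name
  http:
    - match:
        - uri:
            prefix: /notebook/alice/   # assumed per-user path
      route:
        - destination:
            host: notebook.alice.svc.cluster.local  # Service in the user's namespace
            port:
              number: 80
```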

High Level Service Solution

  • Istio Gateway
  • Envoy filter -> OIDC AuthService -> 3rd-party auth provider -> IdP (LDAP)

Profile Controller

Component LINK