
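# What Is the Future of Kubernetes Networking? DRA and Multi-Network Evolution as Seen at KubeCon EU 2026 Telco Day

:::danger
⚠ **Note**: "AI" in this post refers to the demands that AI/ML workloads place on Kubernetes' low-level networking capabilities. It has no direct relation to AI on the O-RAN RIC (the RAN Intelligent Controller).
:::

:::spoiler Table of Contents
[TOC]
:::

## Introduction

Today, March 23, 2026, KubeCon + CloudNativeCon Europe 2026 is under way in Amsterdam, and one panel at **Cloud Native Telco Day**, "**Kubernetes Networking: Present and Future**", stood out. Four core SIG Network members from Red Hat, Google, and Nokia, namely Surya Seetharaman (Red Hat), Antonio Ojea (Google), Gergely Csatari (Nokia), and Lionel Jouin (Red Hat, formerly Ericsson), debated one central question:

> **Is Kubernetes networking actually ready to serve telco-grade workloads?**[color=#0284c7]

The panel, together with the in-depth hallway conversations afterwards, covered DRA (Dynamic Resource Allocation), DRANet, Multi-Network standardization, the Multus migration dilemma, and the architectural difficulty of extending Services to secondary networks. I attended in person; these are my in-depth technical notes, compiled from the panel transcript and my records of the post-panel discussion, and cross-checked against online sources.

> Key takeaway: the best parts of this panel were the truths the speakers "let slip", the technical difficulties that never show up in a formal talk but that people will admit to in the hallway.[name=蔡秀吉][color=#0284c7]

## 1. The Kubernetes Networking Landscape: Two Worlds, High-Level and Low-Level

Surya opened the panel with a level-set slide, which I think divides Kubernetes networking neatly into "two worlds":

![image](https://hackmd.io/_uploads/rJT53RyoWg.png)

### High-Level Networking Constructs

| API / Working Group | Description |
|---|---|
| **Services / EndpointSlices** | The oldest Kubernetes networking APIs; everyone uses them daily |
| **Network Policies / Cluster Network Policies** | Network security policies; Cluster Network Policies are the cluster-administrator version |
| **Gateway API** | Described as the evolution of Services, "maybe the future of Services" |
| **Agentic Networking WG** | Explores how AI / agentic solutions affect core Kubernetes networking capabilities |
| **Gateway API Inference Extension WG** | Extensions dedicated to AI inference sessions (long-lived connections, heavy resource use) |

### Low-Level Networking Capabilities

| Technology / Working Group | Description |
|---|---|
| **Network Plumbing WG / NAD** | The multi-networking ecosystem; NAD (Network Attachment Definition) is a CRD widely used in telco |
| **DRA (Dynamic Resource Allocation)** | A mechanism for workloads that must be scheduled onto the node where specific resources live |
| **DRANet** | A DRA-based Kubernetes network driver, positioned as "the future of K8s networking" |

> In short: the high level handles "how traffic is routed", and the low level handles "how the NIC gets attached". Telco's battlefield is the low level.[color=#F4B400]

This panel focused on the **low level** because, in Surya's own words:

> 「AI made it cool, but let's be honest here.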
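Telco has always wanted these capabilities for a really long time.」[color=#0284c7]

In plain terms: **telco has needed these low-level networking capabilities for far longer than AI has; it is only since AI arrived that the needs became "cool"**.

That line got a laugh from the telco engineers in the room. Once the laughter faded, though, everyone was thinking the same thing: **if the needs are the same, can telco ride AI's tailwind and get these capabilities into upstream Kubernetes faster?**

## 2. Network Interface: The Common Language of Telco and AI/ML

Antonio (Google, SIG Network Tech Lead) shared an interesting bit of personal history. After moving from the MPLS / SDN / OpenStack era to Kubernetes, he noticed something:

> 「I asked, can you ping, can you traceroute? No answer.」[color=#F4B400]

While helping teams inside Google debug performance problems, he found that **every problem eventually converged on the same thing: the Network Interface**. At the same time, the multi-networking discussion in SIG Network, driven by Gergely (Nokia) and Lionel (Red Hat), was converging on the same abstraction: the Network Interface.

:::info
#### Why is the Network Interface the shared abstraction?
Whether you are an AI/ML workload that wants a high-speed RDMA NIC or a telco CNF that wants an SR-IOV VF, **you are fundamentally doing the same thing: attaching a NIC (a device) to a Pod and then optimizing for its topology attributes (NUMA affinity, PCI root complex)**.
:::

Conveniently, hardware vendors such as NVIDIA and Intel were already using **DRA (Dynamic Resource Allocation)** to manage GPUs. Antonio's insight:

> 「DRA was managing devices or nodes, and the network interface is a device. So that's how we connected AI/ML and telco.」[color=#0284c7]

**A Network Interface is a Device.** That equation is the starting point of the whole DRA for Networking story.

## 3. DRA: Extending from GPUs to Network Interfaces

### Why Isn't Device Plugin Enough?

Lionel pointed out that most of the telco ecosystem currently uses the **Multus + Device Plugin** combination, which offers only very coarse-grained control:

> 「With Device Plugin we request like, okay, I want this NIC and that's it.」[color=#F4B400]

You can say "I want a NIC", but you **cannot** say "I want a NIC on the same PCI root complex as my GPU, on the NUMA node that matches my CPU cores".

### Current State of DRA

| Covered today | Planned |
|---|---|
| GPU | CPU / Memory (native resources) |
| FPGA | More networking capabilities (bandwidth / IP allocation) |
| Network Interface (via DRANet) | |

DRA's core value is **Topology-Aware Scheduling**: making sure the FPGA, NIC, and GPU sit on the same NUMA node and the same PCI root complex for best performance.

> This concept will be familiar to network engineers with DPDK or SR-IOV experience, but building it into the Kubernetes scheduler is a big step.[name=蔡秀吉][color=#0284c7]

### DRANet: Positioned as the Future of Kubernetes Networking

**DRANet** is a project Antonio incubated inside Google and later donated to the Kubernetes organization at KubeCon NA 2025 (it now lives at `kubernetes-sigs/dranet`). Its architecture is very clean:

- Through the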
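**DRA API** (gRPC), it communicates with the kubelet
- Through **NRI** (Node Resource Interface), it communicates with the container runtime
- It is fully compatible with existing CNI plugins

:::info
#### DRANet Multi-Cloud Support Status (Hallway Intel)
In the hallway discussion after the panel, Antonio shared where DRANet's multi-cloud support stands:

- **GKE (Google)**: done
- **Azure (Microsoft)**: added at this KubeCon
- **AWS (Amazon)**: in progress, being worked on by people from Amazon

He deliberately structured the project so that **each cloud provider has its own folder**, to avoid feature bloat.
:::

But there is an important "but" here. On the panel, Antonio said candidly: 「We do use it in production in very large clusters.」 Yet an audience member also pointed out: 「DRA is still also not on that level yet to be used confidently.」

> In other words: **DRANet is already usable in specific scenarios (even in large production clusters), but the DRA framework as a whole is not yet mature enough for everyone to replace Multus with confidence.** For now the story is "coexistence", not "replacement".[color=#F4B400]

## 4. Multi-Network Standardization: The Bottom-Up Strategy

### Why Is Standardization So Hard?

Antonio put it bluntly:

> 「IPAM is different in each company, in each implementation. And this is the reason why we are so slow on our multi-network.」[color=#0284c7]

Every vendor's IPAM is different, and so is every vendor's definition of a "network". Cilium has its own first-class network concept, OVN-Kubernetes has one too, and everyone goes their own way.

### The Bottom-Up Strategy: Standardize the Interface First

SIG Network's current strategy is **bottom-up**:

1. **Standardize the Network Interface first** (close to GA)
   → This is the common denominator that both AI/ML and telco can agree on
2. **Then grow Services on secondary interfaces on top of it**
   → This touches deeply coupled pieces such as DNS, kubelet probes, and Network Policy
3. **Only then the full "VPC-like story in Kubernetes"**
   → This is the end goal, but "it's super complex"

The experimental API currently under development is **Network Class / Network Kind**, which lets Kubernetes "recognize" secondary network objects without breaking existing implementations such as Multus.

> Lionel's personal guess is that this API will eventually follow **the Gateway API route**: released as third-party but official, rather than merged directly into Kubernetes core.[color=#F4B400]

## 5. The Multus → DRA Migration Path: No Answer Yet

This was the most honest (and most awkward) moment of the session. When an audience member asked "What is the migration path from Multus to DRA?", Lionel answered:

> 「I don't really have any answer around the migration path.」[color=#0284c7]

He had tried bridging the two with a **CNI-DRA driver**, but the complexity exploded. The reason: CNI has no notion of validation or scheduling, and different CNIs (OVN-Kubernetes / SR-IOV / OVS / VPP) expose devices in different ways, so a unified migration tool is close to impossible.

Surya then added an important message:

> 「The CNI API is never going away.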
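It's there and it's there for use.」[color=#F4B400]

**Result of the impromptu show of hands: everyone in the room is still using Multus.**

> This is an important signal. For teams planning a telco cloud architecture, the pragmatic route is **keep Multus in the short term, introduce DRANet for specific workloads in the medium term, and track the Network Class API in the long term**.[name=蔡秀吉][color=#0284c7]

## 6. Services on Secondary Networks: Why Is It So Hard?

This was Antonio's best explanation of the Q&A. Roshni, an engineer from Ericsson, asked: "Can DRA integrate with the Service framework?"

Antonio spent several minutes explaining why this is hard. I have condensed it into a causal chain:

> **You say "I just want one more Service on the secondary network"**
> → but a Service is tied to **discovery** (the service account token is obtained through the Kube API Server's Service)
> → if the secondary network **has no connectivity to the API server**, the pod goes into a **crash loop**
> → then you have to deal with **resolv.conf** ("a nightmare": it is copied straight from the node, and with multiple networks it is unclear which DNS to serve)
> → and on top of that come **kubelet probes**, **Network Policy on multiple subnets**, an **explosion in service CIDR IPAM**, and more[color=#F4B400]

Antonio's own words:

> 「It's an open source project. It's not a standard. So it grew with some assumptions. And then try to decouple that to make this multi-dimensional network model. It's super complex.」[color=#0284c7]

## 7. Hallway Highlights: NRI Required Plugins and the DRA Driver Timing Problem

After the panel, Antonio and a few audience members continued a very deep technical discussion in the hallway. There were no slides and no recording, but the technical density was extremely high.

### The Core Problem: A Timing Race in the DRA Driver's NRI Hook

One attendee, a telco engineer who built his own DRA driver for AWS IPVLAN, raised a thorny problem. His DRA driver has to finish network configuration through NRI (Node Resource Interface) before the container starts. The catch is that **his NRI plugin may "arrive late" and miss the container runtime's processing flow**, causing a race condition.

He wants a mechanism that lets the kubelet tell the CRI (Container Runtime Interface): **"There is a required NRI plugin here; do not proceed until its hook has completed."**

Antonio's response was honest:

> 「We all hack. There are hacks that work, and there are hacks that are not sustainable.」
>
> 「When you have a very large number of nodes and something fails, who is going to find that there is a race between the kubelet and the runtime?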
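That's impossible to debug.」[color=#0284c7]

### Antonio's Stance on Building Your Own DRA Driver: Encouraged!

Another thing that stuck with me is that Antonio **actively encourages users to build their own lightweight DRA drivers** rather than depend on a single "uber driver":

> 「If we try to create an uber driver that does a lot of things, we are going to end in an unmanageable project.」[color=#F4B400]

He even said that if your driver has general value, you can bring it to SIG Network, and the community will vote on adopting it.

> See! This is the exact opposite of ONAP's original mindset of "handling everything under the sun in one go". The Kubernetes community has learned its lesson in networking: **do not build Uber Encapsulation; build small, modular, composable components.** This matches Lesson 2 from my earlier Nephio article exactly![name=蔡秀吉][color=#0284c7]

## 8. Topology-Aware Routing: GA, but with Pitfalls

This point stands somewhat on its own, but it is very practical. An audience member asked about the cost of cross-availability-zone data transfer, and Antonio replied:

- **Topology-Aware Routing has been GA since Kubernetes 1.30/1.31**
- Through the Service API you can keep traffic within the same zone and save on cross-zone transfer fees
- But they **once tried to automate this mechanism, and "it was a disaster"**: traffic latency, scheduling races, and implementation complexity all blew up

> The current approach is to **hand the choice back to the user**. You specify the zone yourself and you are responsible for making sure the scheduling policy is sound. If your region goes down, your workload goes down with it.[color=#F4B400]

:::info
#### Lesson: The More Automation, the Less Transparency
Isn't this exactly "Lesson 1" from my earlier Nephio article?! Any form of automation brings operational invisibility. The Kubernetes community fell into the same trap with topology-aware routing: trying to make too many automatic decisions for users ended up creating more problems.
:::

## 9. The Blast Radius of Adding a New Protocol

One last point worth recording: an audience member complained that Kubernetes Services support only TCP, UDP, and SCTP, and not the other protocol he needs.

Antonio explained why adding a protocol is so hard:

> 「Everything has a very large blast radius. The other value to the enum cascades to the EndpointSlices, to the Network Policies, to the other things, and then it may not work in RHEL 8.」[color=#0284c7]

Surya added that the next-generation Cluster Network Policies API has been designed to be **extensible**, leaving room for protocols such as ICMP, though they have not been added yet.

> In short: adding a new protocol to a Kubernetes Service is not as simple as changing one enum. The change falls like dominoes, affecting EndpointSlices, Network Policies, and even kernel-module stability on specific operating systems. This "blast radius" concept is worth remembering for anyone building on top of Kubernetes.[color=#F4B400]

## Conclusion: Is Kubernetes Networking for Telco Ready Yet?

Back to the panel's central question: how far along are the Present and the Future of Kubernetes networking? My answer: **not yet, but the direction is right.**

1. **Network Interface standardization** is close to GA; it is the common denominator of telco and AI/ML
2. **DRANet already runs in large production clusters** and is rapidly expanding its multi-cloud support
3. **The experimental Multi-Network API (Network Class / Network Kind)** is coming soon
4. But the **Multus → DRA migration path** is still a blank
5.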
   **Services on secondary network** involves coupling problems with no clean solution in sight in the short term

For teams building a telco cloud, my recommendations are:

- **Short term**: keep Multus; it is not going away
- **Medium term**: introduce DRANet as a PoC for AI/ML or high-performance workloads
- **Long term**: track the KEP progress of the Network Class API and DRA for CPU/Memory
- **If you have special requirements**: build your own DRA driver. Antonio said it himself: the community welcomes you

> A closing personal murmur: this panel made me even more convinced that **the Kubernetes networking community is doing a hard thing the right way**. Rather than rushing out a "one-size-fits-all" solution, it is first getting the standardization of the lowest-level interface right and then stacking upward one layer at a time. That is a sharp contrast with the lessons of ONAP. I hope Nephio holds to the same philosophy when it integrates these capabilities in the future.[name=蔡秀吉][color=#0284c7]

---

#### References

- [Panel: Kubernetes Networking: Present and Future](https://colocatedeventseu2026.sched.com/), Cloud Native Telco Day, KubeCon EU 2026
- [DRANet (kubernetes-sigs/dranet)](https://github.com/kubernetes-sigs/dranet), GitHub
- [The Kubernetes Network Driver Model (arXiv:2506.23628)](https://arxiv.org/abs/2506.23628), Research Paper
- [KEP-3698: Multi-Network for Pods](https://github.com/kubernetes/enhancements/pull/3700), Kubernetes Enhancements
- [Gateway API Inference Extension](https://gateway-api-inference-extension.sigs.k8s.io/), Official Docs
- [NRI: Node Resource Interface](https://github.com/containerd/nri), containerd/nri GitHub

---

<style>
.author-card { display: flex; gap: 16px; align-items: center; background-color: #f7f0ff; border-radius: 12px; padding: 28px 20px; box-shadow: 0 0 0 1px #e0ccff; font-family: sans-serif; max-width: 720px; }
.author-name { font-weight: bold; color: #5f3dc4; }
.author-desc { margin-top: 8px; color: #444; line-height: 1.8; }
</style>
<div class="author-card">
<div>
<div class="author-name">蔡秀吉 Tsai, Hsiu-Chi</div>
<div class="author-desc">
The rewards and joys of life lie in living by the "pepper and salt" (胡椒鹽) principle: service, teaching, and research.<br>
As for the remaining value of my life, it is to bring joy to the world.
</div>
</div>
</div>