---
title: "Design Foundational Data Engineering Observability - Shuhsi Lin"
tags: PyConTW2025, 2025-organize, 2025-共筆
---

# Design Foundational Data Engineering Observability - Shuhsi Lin

{%hackmd L_RLmFdeSD--CldirtUhCw %}

<iframe src=https://app.sli.do/event/rnWFcQGCk16ePjNXW7fsQc height=450 width=100%></iframe>

:::success
AI translation subtitles and summaries are available for this talk. Click here to access >> [PyCon Taiwan AI Notebook](https://pycontw.connyaku.app/?room=aVOkUKV4oXw42GbzXgY2)
:::

> Collaborative writing starts from below

### Start with Basic

- Stories in Smart Pizza & AI
- Common Data Engineering Challenges
  - Complex data pipeline
- From Monitoring to Observability
  - Monitoring Focus
    - WHERE the issue is
    - WHEN & WHAT
  - Observability Focus
    - WHY it happened
    - WHY & HOW
- Observability and Data Observability
- 5 Pillars of Data Observability
  - Freshness, Volume, Distribution, Schema, Lineage
- Key Focus Areas of Data Observability
  - Infrastructure
    - Hardware & services running pipelines

### How to DO

- Data Observability Design Patterns
  - Flow Interruption Detector
    - Collecting metadata is important
  - Skew Detector
    - Abnormal data volume must be monitored when it occurs
    - Use a comparison window
  - Lag Detector - Monitoring Latency
    - MAX & P90/P95
    - SLA (Service Level Agreement)
  - Dataset Tracker (Data lineage)
  - Fine-Grained Tracker
- Data Quality Design Pattern
  - Quality Enforcement
    - AWAP (Audit-Write-Audit-Publish)
      - Prevents problematic data from being stored
  - Schema Consistency
    - Schema Versioning
    - Schema Migrator
- The Evolution of Data Observability

### Reference

- [Data Engineering Design Patterns](https://www.oreilly.com/library/view/data-engineering-design/9781098165826/)
- [Slides](https://speakerdeck.com/sucitw/design-foundational-data-engineering-observability)

Below is the part the speaker updated or corrected after the talk/tutorial.
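
Note-taker's sketch, not from the speaker's slides or updates: the Skew Detector (comparison window) and Lag Detector (MAX, P90/P95, SLA) items in the "How to DO" notes might look roughly like this in Python. The function names, the 50% tolerance, and the choice of P95 for the SLA check are all assumptions made for illustration.

```python
from statistics import quantiles


def detect_skew(current_count, window_counts, tolerance=0.5):
    """Flag a run whose row count deviates too far from the comparison window.

    window_counts: row counts of previous runs (the comparison window).
    tolerance: allowed relative deviation, e.g. 0.5 means 50% (assumed value).
    """
    if not window_counts:
        return False  # nothing to compare against yet
    baseline = sum(window_counts) / len(window_counts)
    if baseline == 0:
        return current_count != 0
    return abs(current_count - baseline) / baseline > tolerance


def lag_summary(lags_seconds):
    """Summarize observed lags (in seconds) as MAX, P90 and P95."""
    if len(lags_seconds) < 2:
        raise ValueError("need at least two lag observations")
    # n=100 yields 99 cut points; index 89 is P90 and index 94 is P95.
    cuts = quantiles(lags_seconds, n=100, method="inclusive")
    return {"max": max(lags_seconds), "p90": cuts[89], "p95": cuts[94]}


def breaches_sla(summary, sla_seconds, metric="p95"):
    """Return True when the chosen lag metric exceeds the SLA threshold."""
    return summary[metric] > sla_seconds


if __name__ == "__main__":
    # Today's volume collapsed compared with the last three runs.
    print(detect_skew(100, [980, 1020, 1000]))
    summary = lag_summary([12, 15, 9, 40, 22, 18, 75, 11])
    print(summary, breaches_sla(summary, sla_seconds=60))
```

Checking P90/P95 alongside MAX keeps a single slow record from dominating the alert, while MAX still shows the worst case.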