From Data Pipeline to Airflow - the Obstacles in Our Migration - 莊鐵鴻

Welcome to PyCon APAC 2022 Collaborative Writing

Collaborative writing workspace: https://hackmd.io/@pycontw/2022
On mobile, tap the button above to expand the agenda list.

Slide link:
YouTube link: TBA

Collaborative writing starts below

Below are the parts of the talk/tutorial slides that the speaker updated or corrected after the speech

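  • Q: Among the challenges of migrating the old system to Airflow, which do you think is most worth investigating further?
    • I am particularly interested in two areas:
      • CWL (Common Workflow Language). It appears to be a general standard for describing workflows, and there were related talks at the Airflow conference. If our pipeline could be described in CWL, we might be able to build it through a user interface with an existing app.
      • The benefits of having more control over a DAG Run at execution time. For example, is there any advantage in deciding the EC2 region only when the DAG Run executes? This cannot be controlled in AWS Data Pipeline.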
  • Q: What is your motivation behind the migration to Airflow? Did you consider other pipeline solutions, and how do they compare to Airflow (and AWS Data Pipeline)?
    • We heard that there would be no further improvements to AWS Data Pipeline, and we wanted more flexibility, so we decided to look for alternatives.
    • I had to make the decision without much information, so I chose Airflow because it is built on Python.
    • I did read some material about Apache NiFi later.
      • Cons
        • It is developed in Java, and I do not like Java.
        • It looks like we would have to write Java if we wanted anything unusual, and I do not like Java.
        • It looks like NiFi focuses more on data, while we currently focus more on tasks.
      • Pros
        • The web interface is nice.
        • The data flow builder is almost what I want for our recommender system.
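
The first answer above raises the idea of deciding the EC2 region only when a DAG Run executes. A minimal sketch of that idea in plain Python, not the speaker's implementation: it assumes the region arrives in the configuration dictionary supplied when the run is triggered (Airflow exposes this to tasks as `dag_run.conf`). The `region` key, the default, the allow-list, and the `pick_region` helper are all hypothetical names chosen for illustration.

```python
from typing import Mapping, Optional

# Hypothetical values for illustration; the talk does not name any regions.
DEFAULT_REGION = "us-east-1"
ALLOWED_REGIONS = {"us-east-1", "us-west-2", "ap-northeast-1"}


def pick_region(run_conf: Optional[Mapping[str, object]]) -> str:
    """Choose the EC2 region for one run from its run-time configuration.

    `run_conf` stands in for the dictionary supplied when a DAG Run is
    triggered; it may be None or empty when nothing was passed.
    """
    # Fall back to the default when the (assumed) "region" key is absent.
    region = (run_conf or {}).get("region", DEFAULT_REGION)
    # Reject anything outside the allow-list before any instance is launched.
    if region not in ALLOWED_REGIONS:
        raise ValueError(f"unsupported region: {region!r}")
    return str(region)
```

Inside an Airflow task, a function like this could be called with the run's configuration from the task context, so each DAG Run can target a different region without changing the DAG definition.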