新人資訊
技術-後台類-azure function #3-寫入資料庫
上回已在 azure function 良好的架構下成功爬取網路資料,辛苦得來的當然要存起來囉。資料散在網路世界就如百鳥在林,存在自己精心設計的資料倉儲裡則是一鳥在手,要發展任何應用均少不了資料的儲存,資料的生命週期和各階段的結構轉換幾乎成為大部分的系統的主要構成元素,在此先來搞定資料寫入。
寫程式之前先建立資料結構,確認資料細度為每日一筆,建立以日期為 key 的簡單 table 如下。
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
梳理一下程式結構,將爬蟲功能切開為獨立的程序,以精簡主程式的篇幅。同時把傳入參數簡化成為 YYYYMMDD
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
這樣主程式變成一行,預計把存入資料庫的功能變成一個程序 save_to_db(),接下來就可以專注在此程序的內在邏輯。
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
基本的套件為 pyodbc, 必須進行 install and import,且在 requirements.txt 裡設好,這在前一篇已經實作過了。然後必須存在一個 SQL Server 實體資料庫資源,我也是把它建立在 azure 環境裡,在管理介面裡可以找到連線字串的以下四項關鍵資訊,關於資料庫的建立和管理是很大的議題,暫時無法在此詳述。
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
接下來繼續一步一腳印逐步推進目標,先加入一行確認連線是否建立正確;若 connection string 稍有閃失這一行就會錯誤,試了幾種寫法之後確定已經成功。
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
然後就是組出 SQL 字串對資料庫下達指令,為了讓此功能容許對同一天的資料重複執行,所以實際動作試先刪除再插入資料,寫完程式後出乎意料的一次就成功了,資料已經正確寫入,達陣。
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
最後就是 deploy to azure,享受 always available 的快樂。
本來以為今天就一切順利,沒想到最後這一步又慘遭滑鐵盧,在 azure 的環境執行失敗!沒關係,這就是開發者的日常,系統的坑何止這些!趁機再學一下怎麼看 log,找到如下頁籤,在程式中埋下更多 logging,嘗試了不知道幾次!最終歸納出原因直指:ODBC Driver!我的本機是 windows 環境,azure function app 的底層是 linux 環境,非常有可能連線字串對應不到正確的版本。
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
經過 這篇文章 的啟發,嘗試修改連線字串:
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
注意是「17 版」喔,在 SQL Server Console 目前還提示著 13 版,真是無所不在的坑阿!總算還是過關了,試著傳入不同日期的參數,結果完全正確!開心。
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
自從三天前測通 azure function 以後,已經成功將爬蟲程式包進去,今天進一步將資料寫進資料庫,可以開始做夢了,一步一步打造自己的數據中台,下一篇將研究如何讓 function 定時自動化執行。
By Newman Chen 2022/3/3
參考資料:
https://medium.com/ccclub/ccclub-python-for-beginners-tutorial-533b8d8d96f3
https://thewindowsupdate.com/2021/12/12/how-to-connect-azure-sql-database-from-python-function-app-using-managed-identity-or-access-token/