被窩好舒服
# Pushing 591 Rental Listings to LINE Notify with a Python Scraper

###### tags: `Python` `LINE Notify` `GitHub` `Heroku`

## Background

Thanks to the article [超簡單一鍵推播 591 租屋資訊完全免 Coding-透過 Google Sheet 與 LINE Notify](https://ithelp.ithome.com.tw/articles/10255573), I found the inspiration and motivation to try building a LINE push notification of my own.

The timing was right too: my lease is about to expire, but I keep forgetting, or can't be bothered, to check 591 every day for a suitable place. So a LINE Notify alert tailored to my own needs was born.

I have long wanted to write technical articles or turn my notes into a post, and this was a good chance to practice.

## Goals

1. Fetch listings from the 591 rental site that match my criteria and were updated within the last 3 hours
2. Run the program every three hours and push the updated listings through LINE Notify

## Environment

- Python 3.9.2
- GitHub
- Heroku

## Getting a LINE Notify Token

[LINE Notify](https://notify-bot.line.me/zh_TW/) is an official LINE account. Once it is linked to another web service, that service can send messages through LINE Notify.

After logging in, go to your personal page.

![Login page](https://i.imgur.com/mt0g5kN.png)

Click the button to issue a token.

![Issue a token](https://i.imgur.com/1WrBEaT.png)

Fill in a name for the LINE Notify service, choose the group to link (or choose to receive messages in a one-on-one chat), then click Issue.

<span style="color:#B5495B">If you link a group, remember to invite LINE Notify to that group.</span>

![Enter the details](https://i.imgur.com/ARuoCDM.png)

Once the token is issued, you are shown a token string. Save it somewhere.

![Token issued](https://i.imgur.com/lkDsoF5.png)

When the link succeeds and you have LINE Notify as a friend, LINE sends a message confirming that the link is set up.

![LINE confirms the link](https://i.imgur.com/hSGXXp4.png)

Linked services are listed under **已連動的服務** (linked services) on your personal page.

![Linked services page](https://i.imgur.com/uRrAaL1.png)

## Python Scraper

With LINE Notify set up, it is time to scrape.

Target: rental listings on 591.

Open the [591 rental](https://rent.591.com.tw/?kind=0&region=8) page and set the search filters to match your needs.

![591 web page](https://i.imgur.com/2ikZ5Ih.png)

Press F12 to open the developer tools, go to the Network tab, and find the request for the page you want to scrape.

![F12](https://i.imgur.com/D589Etp.png)

In the Headers tab, look up the URL to fetch and the Request Headers.

![Headers](https://i.imgur.com/DB9FDjB.png)

Use the `requests` package to build the HTTP request.

```` Python
# Fetch rental listings from 591
import requests

# URL of the page to scrape
url = "https://rent.591.com.tw/?kind=0&region=8&section=98,102,101,99,100&rentprice=0,15000&pattern=2&order=posttime&orderType=desc"

# Custom request headers, copied from the browser's developer tools
headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "zh-TW,zh;q=0.9,en-US;q=0.8,en;q=0.7",
    "Connection": "keep-alive",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36",
    "Upgrade-Insecure-Requests": "1",
    "Cache-Control": "max-age=0",
    "Host": "rent.591.com.tw",
    "Cookie": "urlJumpIp=8; urlJumpIpByTxt=%E5%8F%B0%E4%B8%AD%E5%B8%82; is_new_index=1; is_new_index_redirect=1; T591_TOKEN=0mgp6gnmca0m1aes0a653qpk76; _ga=GA1.3.1853129893.1614755590; tw591__privacy_agree=0; _ga=GA1.4.1853129893.1614755590; _fbp=fb.2.1614755592267.503379817; new_rent_list_kind_test=0; _gid=GA1.3.990458239.1615170698; _gid=GA1.4.990458239.1615170698; webp=1; PHPSESSID=ugspv0rqvnetihun53ane0jlc4; XSRF-TOKEN=eyJpdiI6ImloZzR5Qm9SRk1XNVd4bmJ2VG8zNUE9PSIsInZhbHVlIjoiSExCSnRITEZjSE8rWktjVEptSnlEd1AxNEs1cHRcL1dEYktOR0dvUUNwdU9vNVVPUHlaK3UyXC9pOWpCVElxV0JJdzZGWFF0bytcL3MrSGNGSlpyQk96OGc9PSIsIm1hYyI6IjQ5NDgzZjc1YWExYTkyZDQ2YWRjZWQwZDI5YTIwODZhMTJkYzNlMmZiYzUwNmZmMzY2YjNhZjQ4NGI4OGY2NjMifQ%3D%3D; 591_new_session=eyJpdiI6ImpYUE9QWDJWYVwvaVlJc3dUK0ZiY3h3PT0iLCJ2YWx1ZSI6ImVMYnpSQ2ZhNG9VZHNSdWZNMjZTSG5nUTZOaWZlZ05kQkRXVkNLZDAxQlBqWWJneXVZbXZEWmd6SVRrMU5ZbGtrOU9tVG9RZm1CM2ZKUnNYQVlJaTNRPT0iLCJtYWMiOiIwN2UzODgzYWE0OGM2YTlkMDI1YTVjYjkzNmUyYWJiMzA5M2JmN2M0M2Q4NDQ1ODhlYTZkM2E3NzFkMjVjMWZlIn0%3D"
}

response = requests.get(url=url, headers=headers)
print(response)

#### Output ####
# <Response [200]>
#### A response status code of 200 means the request succeeded
````

Import BeautifulSoup and pass it the returned HTML.

```` Python
from bs4 import BeautifulSoup

soup = BeautifulSoup(response.text, "html.parser")

# Print the pretty-printed HTML
print(soup.prettify())
````

Excerpt of the output

``` html
<div class="sub-nav-list nav-wide">
  <dl>
    <dt>
      待租房源
    </dt>
    <dd>
      <a google-data-stat="頭部導航_租屋_所有房源" href="//rent.591.com.tw"> 所有房源 </a>
      <a google-data-stat="頭部導航_租屋_房東出租" href="//rent.591.com.tw?shType=host"> 房東出租 </a>
      <a google-data-stat="頭部導航_租屋_整層住家" href="//rent.591.com.tw?kind=1"> 整層住家 </a>
      <a google-data-stat="頭部導航_租屋_地圖找房" href="//rent.591.com.tw/map-index.html"> 地圖找房 </a>
      <a google-data-stat="頭部導航_租屋_獨立套房" href="//rent.591.com.tw?kind=2"> 獨立套房 </a>
      <a google-data-stat="頭部導航_租屋_捷運找房" href="//rent.591.com.tw?mrt=1"> 捷運找房 </a>
      <a google-data-stat="頭部導航_租屋_分租套房" href="//rent.591.com.tw?kind=3"> 分租套房 </a>
      <a google-data-stat="頭部導航_租屋_學校找房" href="//rent.591.com.tw?school=0"> 學校找房 </a>
      <a google-data-stat="頭部導航_租屋_雅房" href="//rent.591.com.tw?kind=4"> 雅房 </a>
    </dd>
  </dl>
</div>
```

Once the HTML structure has been retrieved, BeautifulSoup can be used to extract the fields we need.

![How the page renders](https://i.imgur.com/fF9CiJU.png)

![Markup shown in the developer tools](https://i.imgur.com/yI7SH8O.png)

```` Python
# Regular expressions
import re

# Get every <ul class="listInfo clearfix"></ul> element
listInfoUl = soup.find_all("ul", class_="listInfo clearfix")
num = 0
for ul in listInfoUl:
    # Photo
    img = ul.find("img").get("data-original")
    # Title
    title = ul.find("a").getText()
    # URL of the detail page
    detailUrl = ul.find("a").get("href")
    # Price
    price = ul.find("div", class_="price").getText().strip()
    # Short description
    wordDetail = ''
    for de in ul.find_all("p", class_="lightBox"):
        wordDetail = wordDetail + " | " + de.getText().replace(" ", "").replace("\n", "")
    # Update time: the <em> whose text contains "更新" (updated)
    uptime = ''  # reset per listing so a missing label cannot reuse the previous value
    for up in ul.find_all("em"):
        pattern = re.compile('更新')
        if len(pattern.findall(up.getText())) > 0:
            uptime = up.getText()
    # Print the extracted fields
    print(
        'title: ' + title + ", " +
        'img: ' + img + ", " +
        'detailUrl: ' + detailUrl + ", " +
        'price: ' + price + ", " +
        'detail: ' + wordDetail + ", " +
        'update: ' + uptime
    )
    print("--------------")
````

Excerpt of the output

![Excerpt of the output](https://i.imgur.com/4Xo3uCC.png)

Keep only the listings updated within the last 3 hours and build the text that LINE will display. This check uses the per-listing variables (`title`, `price`, `wordDetail`, `uptime`, `detailUrl`), so it runs for each listing.

```` Python
# Emoji
import emoji

# Keep listings updated within 3 hours ("N小時內更新" means "updated within N hours")
pattern = re.compile('小時內更新')
if len(pattern.findall(uptime)) > 0:
    # Capture the number in front of "小時" (hours)
    pattern = re.compile('(.*)(?=小時)')
    hours = re.search(pattern, uptime).group(1)
    if int(hours) <= 3:
        # LINE message. The Chinese text reads:
        # "Your helper is here~ / The rental site has new listings! / See the details at the link below"
        # use_aliases=True works on emoji 1.x; emoji 2.0 and later expect language='alias' instead
        msg = (
            emoji.emojize('\n小幫手來啦~ :relaxed: \n租屋網更新資訊啦! :boom: \n :mega: ', use_aliases=True) + title
            + emoji.emojize('\n :dollar: ', use_aliases=True) + price
            + emoji.emojize('\n :memo: ', use_aliases=True) + wordDetail
            + emoji.emojize('\n :alarm_clock: ', use_aliases=True) + uptime
            + emoji.emojize('\n\n :tada: 看更詳細點↓網址 \n https:', use_aliases=True) + detailUrl
        )
        # Print the LINE message that will be sent
        print(msg)
        print('-------------')
````

Excerpt of the output

![Excerpt of the output](https://i.imgur.com/sRa8XKu.png)

Send the message through the LINE Notify API.

```` Python
# LINE Notify setup
def lineNotifyMessage(token, msg, imgUrl):
    # Both of these headers are required
    # token is the token issued by LINE Notify
    headers = {
        "Authorization": "Bearer " + token,
        "Content-Type": "application/x-www-form-urlencoded"
    }
    # message: the text to display
    # imageThumbnail, imageFullsize: the image to display
    # stickerPackageId, stickerId: the sticker
    message = {
        'message': msg,
        'imageThumbnail': imgUrl,
        'imageFullsize': imgUrl,
        'stickerPackageId': 1,
        'stickerId': 13
    }
    # Send with POST
    req = requests.post("https://notify-api.line.me/api/notify", headers=headers, data=message)
    return req.status_code

# Send the LINE message ("申請的權杖" is a placeholder for your issued token)
lineNotifyMessage("申請的權杖", msg, img)
````

Result

![Result in LINE](https://i.imgur.com/xwJfxRz.png)

## Pushing the Files to GitHub

Create a repository, enter a repository name, and set its visibility to Public.

![Create the repository](https://i.imgur.com/GELqnfg.png)

Upload the code to the repository you just created.

```
RentHouseInfo.py   # main program
requirements.txt   # tells Heroku which packages to install
runtime.txt        # tells Heroku which Python version to use
```

![Push to GitHub](https://i.imgur.com/Hr9z3xK.png)

## Setting Up Heroku

[Heroku](https://www.heroku.com/) is a platform as a service (PaaS) on which you can develop and deploy all kinds of sites. Free accounts get a fixed number of runtime hours per month, which I find a good deal when usage is light.

After creating an account, create an app.

![Create an app](https://i.imgur.com/yCn53kn.png)

Enter a name.

![Enter the details](https://i.imgur.com/iyYg1lM.png)

Open the Deploy tab, choose GitHub under Deployment method, then click the Connect to GitHub button.

![Choose the tab](https://i.imgur.com/plAgpFn.png)

A prompt asks whether to authorize the link between Heroku and GitHub.

![Linking GitHub](https://i.imgur.com/Bwfi2ti.png)

Once connected, enter your repository.

![Linking the repository](https://i.imgur.com/sZ7BL5C.png)

Enable automatic deploys.

![Automatic deploys](https://i.imgur.com/FjmDCKK.png)

Switch to the Resources tab.

![Resources](https://i.imgur.com/ynfuHWL.png)

In the Add-ons section, search for Heroku Scheduler.

![Search for the add-on](https://i.imgur.com/hDmFspu.png)

Add the Heroku Scheduler add-on.

![Add Heroku Scheduler](https://i.imgur.com/bkwDxGN.png)

After it is added, click Heroku Scheduler to open its settings page.

![Heroku Scheduler page](https://i.imgur.com/vxcItMP.png)

Click the Create job button to create a job.

![Create a job](https://i.imgur.com/uzlXcng.png)

Configure it to suit your needs.

![Configure the job](https://i.imgur.com/1o6yV79.png)

<span style="color:#B5495B">The schedule can only be set to every 10 minutes, every hour, or a fixed time every day, and that time is in UTC. If you need the job to run at a specific local time, convert the time yourself.</span>

I want the program to run every three hours, so I created jobs that each run at a fixed time every day, spaced three hours apart.

![Jobs set to run every three hours](https://i.imgur.com/5DH2G7B.png)

When the setup is done, wait for the scheduled time and check whether the job runs correctly.

![Test run 1](https://i.imgur.com/rBhm7WB.png)

![Test run 2](https://i.imgur.com/m9ynTcw.png)

![Test run 3](https://i.imgur.com/P6Xg6e1.jpg)

## References

- [超簡單一鍵推播 591 租屋資訊完全免 Coding-透過 Google Sheet 與 LINE Notify](https://ithelp.ithome.com.tw/articles/10255573)
- [使用LINE Notify發送訊息(Heroku+GitHub+Python)](https://rnnnnn.medium.com/%E4%BD%BF%E7%94%A8line-notify%E7%99%BC%E9%80%81%E8%A8%8A%E6%81%AF-heroku-github-python-9132ff9ebe1b)
- [自建 LINE Notify 訊息通知](https://www.oxxostudio.tw/articles/201806/line-notify.html)
- [LINE Notify 入門到進階應用(4) --- 傳送文字網路圖片到Line Notify 其他語言](http://white5168.blogspot.com/2017/01/line-notify-4-line-notify.html#.YD9CM9x-W01)
- [Heroku - 自動執行python腳本](https://yeeinhole.github.io/2020/03/20/heroku-trelloXline/)
- [Python emoji Packages](https://pypi.org/project/emoji/)
- [Python emoji Charts](https://www.webfx.com/tools/emoji-cheat-sheet/)
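
The three-hour freshness check in the scraper can also be pulled out into a small standalone helper, which makes it easy to test without fetching the page. This is only a sketch: `updated_within` and the two pattern names are hypothetical and not part of the original script, and the minutes label (`分鐘內更新`) is an assumption about how 591 words very recent updates, since the original code only matches the hours label.

```python
import re

# Hypothetical helper, not part of the original script.
# "N小時內更新" means "updated within N hours" (the label the scraper matches).
HOURS_PATTERN = re.compile(r"(\d+)\s*小時內更新")
# Assumption: recent updates may be labelled in minutes, "N分鐘內更新".
MINUTES_PATTERN = re.compile(r"(\d+)\s*分鐘內更新")


def updated_within(uptime: str, max_hours: int = 3) -> bool:
    """Return True if the update label falls within max_hours."""
    text = uptime.strip()

    minutes = MINUTES_PATTERN.search(text)
    if minutes:
        return int(minutes.group(1)) <= max_hours * 60

    hours = HOURS_PATTERN.search(text)
    if hours:
        return int(hours.group(1)) <= max_hours

    # Any other label (for example a date) is treated as too old.
    return False


if __name__ == "__main__":
    for label in ["3小時內更新", "4小時內更新", "45分鐘內更新"]:
        print(label, updated_within(label))
```

Inside the listing loop, `if updated_within(uptime):` could then stand in for the two nested regex checks before the message is built.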
