Project Q&A

# Project Q&A

1. If you get stuck, post your question below along with your code and the relevant error messages. Please try to solve it by discussing with each other first.
2. Mark solved questions with ==(Answered)==. Unsolved questions will be collected and passed to the teacher for help.

Example:

4/10 Python project
Q. Why does my input always produce an error message?
```
(the code you are stuck on)
```
A. The main reason is......
==(Answered)==

---

### QA

05/02 Python crawler
Q: Does NewsAPI support ETtoday, Storm Media, China Times, and United Daily News (udn)? None of these sites appear in the output of newsapi.get_sources(), and even passing
```
domains='ettoday.net, udn.com, storm.mg, chinatimes.com',
```
or
```
sources='ettoday.net, udn.com, storm.mg, chinatimes.com',
```
returns no data. Thanks.

---

4/24 Python crawler project
Q. Can the JWT only be obtained after a successful login? (Normally a token is only issued after a correct login.) Which account and password can be used to log in?
A. ==(Answered)== Yes, the JWT is only issued after login. First send a request to the SIGNUP API in the example to register, then log in, and you will receive the JWT.

---

4/24 Python crawler project
Q. Is a JSON response of 500 correct?
```
headers = {'Accept': 'application/json'}
r = requests.get(logninAPI, headers=headers)
console_logs = self.driver.get_log("browser")
#result = self.driver.page_source
return (console_logs, r, True)
```
The result is as follows:
```
stephanieyu@youxinyudeMacBook-Pro VS_Code % /usr/local/bin/python3 /Users/stephanieyu/Desktop/VS_Code/practice/python/hm_autotestbot.py
signup console log: [{'level': 'WARNING', 'message': "security - Error with Permissions-Policy header: Origin trial controlled feature not enabled: 'interest-cohort'.", 'source': 'security', 'timestamp': 1682306498689}, {'level': 'SEVERE', 'message': 'https://yillkid.github.io/favicon.ico - Failed to load resource: the server responded with a status of 404 ()', 'source': 'network', 'timestamp': 1682306499138}]
signup json <Response [500]>
lognin console log: [{'level': 'WARNING', 'message': "security - Error with Permissions-Policy header: Origin trial controlled feature not enabled: 'interest-cohort'.", 'source': 'security', 'timestamp': 1682306507572}]
login json <Response [500]>
Database exists!
Table exists!
```
A. ==(Answered)== The code currently uses the HTTP GET method. According to the frontend code that was provided, this API must be called with the POST method.

---

4/24 Python crawler project
Q. After signing up through the frontend crawler, no log information is shown. Do I need to write a routine that requests the log? Should I request it from the Chrome browser, for example with get_log()?
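A. ==(Answered)== The exercise solution does not need to be this complicated. Here is the signup [code](https://gist.github.com/yillkid/f53322b9af4dd07e66c56a2d0e51d2ce). **Note lines 26 to 27: the browser's console.log output actually comes from the API's return value, for example obj.token.**
- If you do want to use get_log(), here is a hint: to capture the browser's console.log directly from the crawler, [see this example](https://gist.github.com/yillkid/7430a2ec15597b26885e97089417715e).

---

Q: (Following up on the previous question) If get_log() is not needed, should I use BeautifulSoup to obtain the API's return value?
A. Yes. If you do not capture the browser's console.log, you can obtain the API response through BeautifulSoup. ==(Answered)==

---

Q: How should I rewrite the storage format? [Link to the question](https://hackmd.io/@WVuRUYslRZm9PNDSsjjDiA/Hk_lhlNmn)
A. Please refer to the solution from our earlier workshop. The key part is excerpted below. ==(Answered)==
```python
for obj in SetObjs:
    # Find all h3 tags (titles) under this element
    list_title = obj.findChildren("h3")
    # Find the p tags (contents) that are direct children only
    list_content = obj.findChildren("p", recursive=False)
    # Pair each title with its content; the original excerpt indexed [0]
    # on every iteration, which would insert the first pair repeatedly
    for title, content in zip(list_title, list_content):
        # print(title.text)
        # print(content.text)
        # Insert into the database
        insert_data(title.text, content.text)
```

---

Q: Why does an Error appear after I fill in the signup form and press the button, and why can I never get the JWT? Login looks normal, but signing up with another new account still produces an Error. [code](https://hackmd.io/@super0selina/BkKHr1472)
![](https://i.imgur.com/Cc4aj2y.png)
The image below shows the output of the first print(entry):
![](https://i.imgur.com/HAOJ06Q.png)
A: I just tested it, and your code works correctly. ==(Answered)==
![](https://i.imgur.com/wWPPIJl.png)

---

Q: I am stuck. Please help me check where the problem is, thanks. [My dead code](https://docs.google.com/document/d/11-R34MUUNVp4cb0vNtl5rkUYFen29KX-/edit?usp=share_link&ouid=113139451357057309811&rtpof=true&sd=true)
A: What the frontend logs and what the backend returns actually come from the same source, as the following code shows. ==(Answered)==
- This line gets the JWT:
    - `console.log("Get JWT from cookie " + obj.token);`
- This line shows the error message:
    - `console.log(thrownError);`
- See the source code:
    - https://github.com/yillkid/ntc-python-crawler-workshop-frontent/blob/main/signin.html

So there are currently 2 ways to get the log:
1. Read it directly from the API response, as in the example above.
2. Read console.log through Selenium's get_log function.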
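
The two ways listed above can be sketched in Python. This is a minimal illustration, not the workshop's official solution. It assumes the API returns a JSON object with a `token` field (suggested by `obj.token` above) and that the console message contains the text `Get JWT from cookie <token>`. The function names `token_from_api_response` and `token_from_console_logs` are hypothetical, and the exact wording Chrome puts around a console message may differ.

```python
import json
import re
from typing import Iterable, Mapping, Optional

# Assumption: the frontend logs the token as "Get JWT from cookie <token>",
# matching the console.log line quoted in the answer above.
_JWT_LOG_PATTERN = re.compile(r'Get JWT from cookie\s+([^\s"\\]+)')


def token_from_api_response(body: str) -> Optional[str]:
    """Way 1: read the `token` field from the API's JSON response body."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return None  # e.g. an HTML error page instead of JSON
    if not isinstance(data, dict):
        return None
    token = data.get("token")
    return token if isinstance(token, str) else None


def token_from_console_logs(
    entries: Iterable[Mapping[str, object]],
) -> Optional[str]:
    """Way 2: scan entries shaped like driver.get_log("browser") output."""
    for entry in entries:
        match = _JWT_LOG_PATTERN.search(str(entry.get("message", "")))
        if match:
            return match.group(1)
    return None
```

In a real session, the first function would presumably receive `r.text` from the POST response, and the second the list returned by `self.driver.get_log("browser")`. Whether ordinary `console.log` entries appear in that list depends on how browser logging is configured for the driver.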