2023-8-6 Development Log

# 2023-8-6

## LINEbot:

* Song: 我愛他

---

### Process

1. event message text

The starting code was probably copied from some website; it is simply a LINE bot that echoes back whatever you say. Flask acts as the app that handles the messages LINE sends. In the LINE webhook settings, paste the URL copied from Render and append `/callback` (the path has to match the `@app.route("/callback")` route, so that requests land in the right place). The remaining comments were all written by ChatGPT; I wouldn't have known how to write them otherwise.

```python=
import os

from flask import Flask, request, abort
from linebot import LineBotApi, WebhookHandler
from linebot.exceptions import InvalidSignatureError
from linebot.models import MessageEvent, TextMessage, TextSendMessage

app = Flask(__name__)
line_bot_api = LineBotApi(os.environ['CHANNEL_ACCESS_TOKEN'])
handler = WebhookHandler(os.environ['CHANNEL_SECRET'])


@app.route("/callback", methods=['POST'])
def callback():
    # Signature header, used to verify that the request is valid
    signature = request.headers['X-Line-Signature']
    # Read the HTTP request body as text so it is easier to process later
    body = request.get_data(as_text=True)
    # Write the request body to the Flask application log for later inspection
    app.logger.info("Request body: " + body)
    try:
        # Pass the body and signature to the WebhookHandler,
        # which dispatches the events the LINE bot receives
        handler.handle(body, signature)
    except InvalidSignatureError:
        # Signature verification failed
        abort(400)
    return 'OK'


@handler.add(MessageEvent, message=TextMessage)
def handle_message(event):
    # Wrap the reply text in a TextSendMessage object
    message = TextSendMessage(text=event.message.text)
    # Use line_bot_api to reply to the user
    line_bot_api.reply_message(event.reply_token, message)


if __name__ == "__main__":
    port = int(os.environ.get('PORT', 5000))
    app.run(host='0.0.0.0', port=port)
```

---

My goal is that when a message starts with the four characters 梗圖支援 ("meme support"), the bot goes and finds a meme, so I have to inspect the message content. My first attempt was:

```python=
if message.text.startswith("梗圖支援"):
    ...
```

---

But if I want different reply contents, I also can't just write:

```python=
line_bot_api.reply_message(event.reply_token,"蛤")
```

---

because `reply_message` expects a message object such as `TextSendMessage`, not a bare string. So I store the content to reply with in a variable, `reply_text`, and then pass it through

```python=
message = TextSendMessage(text=reply_text)
```

to wrap it in a `TextSendMessage`. Where I originally wrote `message`, the code uses the incoming `event.message`.

---

The result:

```python=
import os

from flask import Flask, request, abort
from linebot import LineBotApi, WebhookHandler
from linebot.exceptions import InvalidSignatureError
from linebot.models import MessageEvent, TextMessage, TextSendMessage

app = Flask(__name__)
line_bot_api = LineBotApi(os.environ['CHANNEL_ACCESS_TOKEN'])
handler = WebhookHandler(os.environ['CHANNEL_SECRET'])


@app.route("/callback", methods=['POST'])
def callback():
    # Signature header, used to verify that the request is valid
    signature = request.headers['X-Line-Signature']
    # Read the HTTP request body as text so it is easier to process later
    body = request.get_data(as_text=True)
    # Write the request body to the Flask application log for later inspection
    app.logger.info("Request body: " + body)
    try:
        # Pass the body and signature to the WebhookHandler,
        # which dispatches the events the LINE bot receives
        handler.handle(body, signature)
    except InvalidSignatureError:
        # Signature verification failed
        abort(400)
    return 'OK'


@handler.add(MessageEvent, message=TextMessage)
def handle_message(event):
    if event.message.text.startswith("梗圖支援 "):
        # If the message starts with "梗圖支援 ", reply "蛤" ("Huh?")
        reply_text = "蛤"
    elif event.message.text.startswith("語錄支援"):
        # If the message starts with "語錄支援" ("quote support"), reply "還很笨" ("still dumb")
        reply_text = "還很笨"
    else:
        # Otherwise echo the user's message back
        reply_text = event.message.text
    # Wrap the reply text in a TextSendMessage object
    message = TextSendMessage(text=reply_text)
    # Use line_bot_api to reply to the user
    line_bot_api.reply_message(event.reply_token, message)


if __name__ == "__main__":
    port = int(os.environ.get('PORT', 5000))
    app.run(host='0.0.0.0', port=port)
```

2. Starting the scraper

#### Motivation

The bot can now accept commands from the LINE side, so the next step is to go scraping once a command arrives. I looked around a few meme sites and found that 迷因倉庫 really has poor coverage (or I just don't know how to use it), so I decided to use Google image search instead. The AI's first suggestion was:

```python=
import requests
from bs4 import BeautifulSoup


def get_image_urls(search_query, num_images=10):
    # Build the Google image search URL
    url = f"https://www.google.com/search?q={search_query}&tbm=isch"
    # Pretend to be a browser so the request is less likely to be blocked
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3"}
    # Send the request and get the response content
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    # Parse the HTML page
    soup = BeautifulSoup(response.content, "html.parser")
    # List that collects the image URLs
    image_urls = []
    # Walk through every <img> tag
    for img in soup.find_all("img"):
        # Stop once the requested number of image URLs has been collected
        if len(image_urls) >= num_images:
            break
        # Extract the image URL
        img_url = img.get("src")
        # Keep only URLs that start with http
        if img_url and img_url.startswith("http"):
            # Add the image URL to the list
            image_urls.append(img_url)
    # Return the list of image URLs
    return image_urls


if __name__ == "__main__":
    # Set the search keyword and the number of images to fetch
    search_query = "請支援收銀"
    num_images = 5
    # Get the list of image URLs
    image_urls = get_image_urls(search_query, num_images)
    # Print each image URL
    for idx, img_url in enumerate(image_urls, start=1):
        print(f"Image {idx}: {img_url}")
```

But everything I fetched was a very blurry thumbnail; after all, the URLs all start the same way.

![](https://hackmd.io/_uploads/H1XCWonj3.png)

![image alt](https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcRupnZE0NrroI3jp1wKa3qD29nMLRVbFn9fMOIAdCQ5asQIwaJKbLKGTXLqKg&s)

Just really blurry.

---

So I have to try to fetch the original images. I looked for a long time and couldn't find anything suitable to scrape. Later I asked in the 建中 Discord (DC), and one of the experts there helped me find a method.

3. Second-pass scraping

![](https://hackmd.io/_uploads/rkRU7sns2.png)

![](https://hackmd.io/_uploads/Hkiw7onj3.png)

But I haven't finished writing it yet.
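
Until the second pass is written, steps 1 and 2 still need some glue: pulling the keyword out of a `梗圖支援 ` message, percent-encoding it for the search URL, and recognising the blurry thumbnail URLs. Below is a minimal sketch using only the standard library. The names `extract_query`, `build_search_url`, and `is_thumbnail_url` are hypothetical, and treating every `gstatic.com` host as a thumbnail is an assumption based on the single URL shown above, not on how Google actually serves images.

```python
from typing import Optional
from urllib.parse import urlencode, urlparse

# Same prefix (with trailing space) that the handler in step 1 checks for
MEME_PREFIX = "梗圖支援 "


def extract_query(text: str, prefix: str = MEME_PREFIX) -> Optional[str]:
    """Return the keyword after the prefix, or None if there is no usable keyword."""
    if not text.startswith(prefix):
        return None
    query = text[len(prefix):].strip()
    return query or None


def build_search_url(query: str) -> str:
    """Build the image search URL from step 2 with the query percent-encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query, "tbm": "isch"})


def is_thumbnail_url(url: str) -> bool:
    """Guess whether a URL is a thumbnail (assumption: it is served from gstatic.com)."""
    host = urlparse(url).hostname or ""
    return host == "gstatic.com" or host.endswith(".gstatic.com")


if __name__ == "__main__":
    query = extract_query("梗圖支援 請支援收銀")
    if query is not None:
        print(build_search_url(query))
```

In `handle_message`, `extract_query(event.message.text)` could supply the keyword to pass on to `get_image_urls`, and `is_thumbnail_url` could be used to skip results that are only thumbnails.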