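Python / requests
===
###### tags: `Python`, `requests`, `response`, `session`, `curl`

[TOC]

## HTTP status code
https://hackmd.io/L4FMNkarTtKU79Dp6_1iRg

## curl / Python equivalents
### GET
```curl=
curl 10.78.26.20:32586
```
```curl=
curl -X GET 10.78.26.20:32586
```
```python=
import requests

response = requests.get('http://10.78.26.20:32586')
print('status_code:', response.status_code)
print('text:', response.text)
```

### POST
```curl=
curl -X POST \
    10.78.26.20:32586/predict \
    -F file=@dataset/classification/penguin/input/test_x.csv \
    -F INPUT_CSV_WITH_HEADER=true
```
Calling the backend from Python:
```python=
import requests

# Form fields are sent as strings; 'true' matches the curl command above
my_data = {'INPUT_CSV_WITH_HEADER': 'true'}

# Open the upload in binary mode and close it when the request is done
with open('dataset/classification/penguin/input/test_x.csv', 'rb') as f:
    my_file = {'file': f}
    response = requests.post(
        'http://10.78.26.20:32586/predict',
        data=my_data,
        files=my_file)

print('status_code:', response.status_code)
print('text:', response.text)
print('json:', response.json())  # parses the body as JSON
```

## request / headers
### Content-Type
> Specifies the content type of the body, which can be JSON or something else.

```
headers = {
    'content-type': 'application/json',
}
```
- Content-Type tells the receiver what type of data the body contains.
- If Content-Type is not set explicitly, the backend typically sees `application/x-www-form-urlencoded` (this is what requests sends for a dict passed via `data=`; passing `json=` sets `application/json` automatically).
    - For example, `flask.request.get_json()` then returns None (newer Flask versions reject the request with a 4xx error instead).
    - Because the JSON data is None, indexing into it raises `'NoneType' object is not subscriptable`.
- The key is case-insensitive:
    - 'Content-Type'
    - 'content-type'

### Connection: keep-alive
- ### What does `Connection: keep-alive` in the request header do? What happens without it?
    [ChatGPT]
    `Connection: keep-alive` is an HTTP header field that tells the other side whether the connection between client and server should stay open so that later requests can reuse it. With `Connection: keep-alive`, the TCP connection is not closed as soon as a request finishes; it stays open and is reused by subsequent requests. The benefits are:
    1. Less overhead from opening and closing TCP connections, because the connection is reused.
    2. Better performance across multiple requests, because no new connection has to be established.

    What happens without the header depends on the protocol version. In HTTP/1.0 the TCP connection is closed after every request, so each request pays the cost of a new connection, which can hurt performance. In HTTP/1.1 connections are persistent by default and stay open unless one side sends `Connection: close`. Modern browsers and servers usually enable keep-alive by default, so you rarely need to add the header manually, although you can still add it explicitly.

- ### If the request is a stream, does the header default to Connection: keep-alive?
    [ChatGPT]
    The requests package includes `Connection: keep-alive` in its default headers for every request, whether or not `stream` is set. Note that a connection is only reused when the requests go through the same `Session`; each bare `requests.get()` / `requests.post()` call uses a fresh session.
    `stream=True` usually means you want to process the response body as a stream, for example when downloading a large file. In that case the connection stays checked out until the body has been fully consumed or `response.close()` is called; only then is it returned to the pool for reuse.
    Overall, `Connection: keep-alive` is very common in modern HTTP and is normally enabled by default by libraries and tools.

- ### How do I turn keep-alive off?
    [ChatGPT]
    To turn keep-alive off in requests, send a `Connection: close` header, for example through the default headers of a Session object:
    ```python=
    import requests

    # Create a Session object
    session = requests.Session()

    # Ask for the connection to be closed after each response
    session.headers['Connection'] = 'close'

    # Make requests through the Session
    response = session.get('https://example.com')

    # Close the Session once all requests are done
    session.close()
    ```
    With this header, each request uses a new connection. This can hurt performance because it prevents connection reuse. In most cases keeping keep-alive on is good practice, since it saves connection time and resources.

## request / post
### stream=True/False
- ### In practice
    - Whether stream is True or False, the response can be read through `iter_lines()`.
    ```python
    if True:
        # read the body line by line
        for line in res.iter_lines():
            print(line)
    else:
        # read the whole body and parse it as JSON
        res.json()
    ```

- ### What is the `stream` parameter of requests.post for? Why does setting it seem to have no effect?
    [ChatGPT]
    `stream` is a parameter of `requests.post` that controls how the response body is read. With `stream=False` (the default), the whole body is downloaded into memory before the call returns. With `stream=True`, only the headers are read up front, and the body is downloaded as you iterate over it, so you can handle a small part of the data at a time.
    `stream` is mainly used for large files or slow data sources, to save memory. With `stream=True` you have to consume the body yourself, usually with `iter_content` or `iter_lines`; accessing `response.content`, `response.text` or `response.json()` reads the whole body at once. A simple example:
    ```python=
    import requests

    def process_chunk(chunk):
        # Placeholder: handle the downloaded bytes, e.g. write them to a file
        print(len(chunk), 'bytes received')

    url = 'https://www.example.com/large-file'
    response = requests.post(url, stream=True)

    # Check whether the request succeeded
    if response.status_code == 200:
        # iter_content yields the body in pieces of up to chunk_size bytes
        for chunk in response.iter_content(chunk_size=128):
            process_chunk(chunk)
    else:
        print(f"Request failed with status code: {response.status_code}")
    ```
    If `stream` does not seem to have any effect, possible reasons are:
    1. **The server does not stream**: if the server does not support streaming responses, the `stream` parameter makes no visible difference.
    2. **The server returns everything immediately**: if the server sends the full body at once instead of generating it on demand, the data still arrives in one go even with `stream` enabled.
    3. **Network problems**: check that the network connection is stable; network issues can prevent incremental download.

    Check the points above and make sure the server supports streaming and generates its content incrementally.

## response / headers
- demo
    ```python=
    res = requests.post('http://10.78.26.241:30080/completion',
                        headers={'Connection': 'keep-alive'},
                        json={"n_predict": 2, "prompt": "Hello", "stream": True},
                        stream=True)
    print(res.headers)
    ```
    Output:
    ```json=
    {
        'Access-Control-Allow-Headers': 'content-type',
        'Access-Control-Allow-Origin': '*',
        'Content-Type': 'text/event-stream',
        'Keep-Alive': 'timeout=5, max=5',
        'Server': 'llama.cpp',
        'Transfer-Encoding': 'chunked'
    }
    ```

### Check whether it is a large file
- [requests之stream参数分析以及用with确保请求关闭](http://www.xoxxoo.com/index/index/article/id/770.html) (analysis of the requests `stream` parameter, and using `with` to make sure the request is closed)
    - stream=True is usually used for large files
    - stream=False is usually used for small files
- Use content-type to determine the file type
    ```python=
    import requests

    def fetch_html(url):
        # stream=True defers the body download so the headers can be checked first
        res = requests.get(url, stream=True)
        try:
            ctype = res.headers.get('Content-Type', '').lower()
            if 'text/html' not in ctype:
                return None       # not HTML: skip the body
            return res.content    # HTML: download the body now
        finally:
            res.close()
    ```

## References
- [python requests file upload](https://stackoverflow.com/questions/22567306/python-requests-file-upload)
- [Python 使用 requests 模組產生 HTTP 請求，下載網頁資料教學](https://blog.gtwang.org/programming/python-requests-module-tutorial/) (tutorial: making HTTP requests and downloading web pages with the Python requests module)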
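
## Example: decoding the `text/event-stream` lines

The demo response under "response / headers" comes back with `Content-Type: text/event-stream`, and `iter_lines()` yields its body as bytes, one line at a time. The sketch below shows one way those lines could be turned into Python dicts. It rests on assumptions that are not confirmed by the notes above: each event is a single line starting with `data: ` followed by a JSON object, and the `content` / `stop` keys in the sample are purely illustrative. `iter_sse_json` is a hypothetical helper name, not part of requests.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[bytes], prefix: bytes = b"data: ") -> Iterator[dict]:
    """Yield the JSON payload of every line that starts with `prefix`."""
    for raw in lines:
        if not raw:
            continue  # blank lines separate events (or are keep-alive chunks)
        if not raw.startswith(prefix):
            continue  # skip other fields such as "event:" or "id:"
        yield json.loads(raw[len(prefix):].decode("utf-8"))


# Canned lines standing in for res.iter_lines(); the keys are made up for illustration.
sample = [
    b'data: {"content": "Hello", "stop": false}',
    b"",
    b'data: {"content": " world", "stop": true}',
    b"",
]

events = list(iter_sse_json(sample))
text = "".join(event["content"] for event in events)
print(text)
```

Against a live server the same helper would be fed `res.iter_lines()` from a request made with `stream=True`, so that each event can be handled as soon as it arrives rather than after the whole body has been downloaded.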