2022/11/22 Club Class Shared Notes

---
title: 2022/11/22 Club Class Shared Notes
tags: 社課
---

# 2022/11/22 Club Class Shared Notes

:::warning
- Class resources:
    - [Slides](https://hackmd.io/@c4t0212/SyLVAAK8s#/)
    - [IDE environment](https://colab.research.google.com/)
    - [slido](https://app.sli.do/event/1o5MqmJDyzG9wnYNGUNwpk/live/questions)
    - [Feedback form](https://forms.gle/USxTEx9Vv5pKTV5i6)
    - ![](https://i.imgur.com/vDtzPFM.png =50%x)
    - [Event website](https://gdg.community.dev/gdg-taipei/)
:::

[Ming Chuan Golden Finger crawler](https://colab.research.google.com/drive/1MPHQlPn3LoFNYAviuV8RPf4snXn3QHAN?usp=sharing)

PTT
https://www.ptt.cc/bbs/index.html

PTT Gossiping board
https://www.ptt.cc/bbs/Gossiping/M.1669098866.A.563.html

PTT post with multiple images
https://www.ptt.cc/bbs/PlayStation/M.1669083797.A.372.html

Python requests library example
```python=
import requests as req

# response = req.request(method, url[, headers, cookies, data...])
pp = {'q': 'mcu123'}
url = 'https://www.google.com/search'

rsp = req.request('GET', url=url)                # plain GET request
srsp = req.request('POST', url=url, params=pp)   # POST request with query parameters

print(rsp.text)         # print the page as text
print(rsp.status_code)  # check the response's status code
```

Python Beautiful Soup example
```python=
from bs4 import BeautifulSoup as bs

soup = bs(rsp.text, 'html.parser')  # parse the HTML fetched above
rst = soup.find_all('title')        # every <title> tag in the page
print(rst)
```

Using Beautiful Soup to grab the links on the Google home page
```python=
from bs4 import BeautifulSoup as bs

soup = bs(rsp.text, 'html.parser')
rst = soup.find_all('a')      # every <a> tag in the page
for x in rst:
    print(x.get('href'))      # the link target, or None if the tag has no href
# print(rst)
```

Cookie exercise
```python=
# Reuses req and bs imported in the examples above.
url = 'https://www.ptt.cc/bbs/Gossiping/M.1669098866.A.563.html'
ck = {'over18': '1'}  # cookie that answers the board's over-18 prompt
response = req.request('GET', url=url, cookies=ck)
soup = bs(response.text, 'html.parser')
rst = soup.find_all('a')
for x in rst:
    print(x.get('href'))
# print(response.text)
```
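
The link-listing loops above print each `href` exactly as it appears in the page, so relative paths and `None` (for `<a>` tags without an `href`) can show up mixed in with full URLs. Below is a minimal sketch of one way to tidy that output, using only `urljoin` from the standard library. The helper names `absolute_links` and `image_links`, the extension list, and the sample `href` values are illustrative assumptions, not part of the class material.

```python
from urllib.parse import urljoin

# Hypothetical helper (not from the class notes): resolve scraped href values
# against the page they came from and drop missing or non-HTTP links.
def absolute_links(page_url, hrefs):
    links = []
    for href in hrefs:
        if not href:                      # None or empty string: nothing to resolve
            continue
        full = urljoin(page_url, href)    # relative path -> absolute URL
        if full.startswith(('http://', 'https://')):
            links.append(full)
    return links

# Hypothetical helper: keep only links whose URL ends in a common image extension.
def image_links(links, extensions=('.jpg', '.jpeg', '.png', '.gif')):
    return [u for u in links if u.lower().endswith(extensions)]

# Sample values standing in for what x.get('href') might return.
page = 'https://www.ptt.cc/bbs/PlayStation/M.1669083797.A.372.html'
sample_hrefs = [
    '/bbs/PlayStation/index.html',
    None,
    'https://i.imgur.com/vDtzPFM.png',
    'mailto:someone@example.com',
]

links = absolute_links(page, sample_hrefs)
print(links)
print(image_links(links))
```

With a real page, the list passed in could be built as `[x.get('href') for x in rst]`. Matching on the file extension is only a rough filter, since image hosts can also serve pictures from URLs without one.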