# Web Scraping

* [Python Web Scraping Tutorial: Combining Selenium and BeautifulSoup to Scrape Dynamic Pages](https://www.learncodewithmike.com/2020/05/python-selenium-scraper.html)
* [The First Lock on Dynamic-Page Scraping — Selenium Tutorial: How to Use Webdriver and send_keys (with Python Code)](https://medium.com/marketingdatascience/selenium%E6%95%99%E5%AD%B8-%E4%B8%80-%E5%A6%82%E4%BD%95%E4%BD%BF%E7%94%A8webdriver-send-keys-988816ce9bed)
* [Day 1: A Little Life of Python Scraping](https://ithelp.ithome.com.tw/articles/10202121)
* [Extracting Scraped Content with XPath](https://lufor129.medium.com/%E6%89%8B%E6%8A%8A%E6%89%8B%E5%AF%AB%E5%80%8B%E7%88%AC%E8%9F%B2%E6%95%99%E5%AD%B8-%E4%B8%80-xpath-518553fd676d)
* [Slider CAPTCHAs](https://blog.v123582.tw/2021/06/05/%E8%AE%93-Python-%E7%88%AC%E8%9F%B2%E4%B9%9F%E8%83%BD%E8%AE%80%E5%BE%97%E6%87%82%E3%80%8C%E6%BB%91%E5%8B%95%E9%A9%97%E8%AD%89%E7%A2%BC%E3%80%8D/)

----

==**Auto-login to easytest**==

```python
from selenium import webdriver
from time import sleep

# Selenium 3 style API; Selenium 4 replaces find_element_by_* with
# find_element(By.CLASS_NAME, ...) etc.
driver = webdriver.Chrome('./chromedriver.exe')
driver.get('https://easytest.yzu.edu.tw')

# the login link sits inside the left navbar
searchbar = driver.find_element_by_class_name('navbar-left')
login_button = searchbar.find_element_by_tag_name('a')
login_button.click()
sleep(1)  # wait for the login form to render

# fill in the credentials and submit
account = driver.find_element_by_name("cust_id")
password = driver.find_element_by_name("cust_pass")
account.send_keys("s1071426")
password.send_keys("Oct19961126")
driver.find_element_by_id("send").click()
```

----

==**Auto-login to the YZU portal**==

```python
from selenium import webdriver

driver = webdriver.Chrome('./chromedriver.exe')
driver.get('https://lib.yzu.edu.tw/ajaxYZlib/PersonLogin/PersonLogin.aspx')

searchbar = driver.find_element_by_class_name('word10')  # login form area

# fill in the credentials and submit
account = driver.find_element_by_name("txtUserID")
password = driver.find_element_by_name("txtUserPWD")
account.send_keys("s1071426")
password.send_keys("Oct19961126")
driver.find_element_by_id("send").click()
```

----

## XPath

```python
from selenium import webdriver

driver = webdriver.Chrome('./chromedriver.exe')

# walk through three pages of the news list (note: the pagination offset
# `start` still has to be passed to the site, e.g. as a URL query
# parameter, for pages after the first to actually load)
for start in range(0, 21, 10):
    driver.get('https://www.yzu.edu.tw/admin/pr/index.php/tw/news-tw')
    # the XPath of every row differs only in the tr index,
    # so substitute i to visit rows 1-10
    for i in range(1, 11):
        title = driver.find_element_by_xpath(
            f"/html/body/div[2]/div[3]/div[2]/div/div/div/div/div/form/table/tbody/tr[{i}]/td[1]/a")
        print(title.text)
```

---

## Common attribute-locating methods for scraping

* https://www.geeksforgeeks.org/find_element_by_class_name-driver-method-selenium-python/

---

## Course Content

![](https://i.imgur.com/DJ9wYxt.png)

----

![](https://i.imgur.com/XIT12zZ.png)

----

![](https://i.imgur.com/nXj8NlS.png)

---

* Special tags
    * ![](https://i.imgur.com/MTOsrtr.png =x250)
* [attribute tags](https://www.w3schools.com/html/html_attributes.asp)
* More HTML tags (optional reference): [w3school html tags](https://www.w3schools.com/html/)

![](https://i.imgur.com/ueHnZzN.png =x100)

---

![](https://i.imgur.com/hI6BgXg.png)

* Scraping packages
![](https://i.imgur.com/SifVgrd.png =x150)
* Installing the packages
    * pip
        * pip install requests
        * pip install BeautifulSoup4
    * python
        * import requests
        * from bs4 import BeautifulSoup
    * ![](https://i.imgur.com/KdpBK2j.png =x130)
    * [Jupyter Install](https://jupyter.org/install)
* Fetching page content with GET
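The notes end at fetching page content with GET but show no code for it. A minimal sketch of that step with `requests` plus `BeautifulSoup` (the `fetch_titles` helper and the demo HTML string are illustrative, not from the course):

```python
import requests
from bs4 import BeautifulSoup

def fetch_titles(url):
    """GET the page and return the text of every <a> element in it."""
    resp = requests.get(url)
    resp.raise_for_status()                  # stop early on HTTP errors
    soup = BeautifulSoup(resp.text, 'html.parser')
    return [a.get_text(strip=True) for a in soup.find_all('a')]

# parsing works the same on any HTML string; a tiny offline demo:
demo = BeautifulSoup('<html><body><a href="/n1">News 1</a></body></html>',
                     'html.parser')
print(demo.a.get_text())   # the link text
print(demo.a['href'])      # the href attribute
```

Unlike the Selenium examples above, this only sees the HTML the server returns, so it works for static pages but not for content rendered by JavaScript.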