# **Scraping posts from a Facebook group**

## 1. Import the libraries

```
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup
import time
```

## 2. Set the variables and fill in the login form

```
# Login with Selenium
FACEBOOK_ID = "cyhan014@gmail.com"
FACEBOOK_PW = "fill in your own password"
TARGET_URL = "https://www.facebook.com/groups/pythontw"

options = Options()
options.add_argument("--incognito")  # open Chrome in incognito mode
driver = webdriver.Chrome(options=options)
driver.get("https://www.facebook.com")

# Type the credentials into the login form and submit it
username = driver.find_element(By.ID, 'email')
username.send_keys(FACEBOOK_ID)
password = driver.find_element(By.ID, 'pass')
password.send_keys(FACEBOOK_PW)
login = driver.find_element(By.NAME, 'login')
login.submit()
```

## 3. After the login attempt, open the group page

```
time.sleep(3)  # wait three seconds, then open the group page
driver.get(TARGET_URL)
```

## 4. Scroll down four times to load more posts

```
time.sleep(3)
for _ in range(4):  # range(1, 4) would only scroll three times
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
    time.sleep(5)  # give the newly requested posts time to load
```

## 5. Extract the posts with BeautifulSoup

```
# BeautifulSoup takes over from here
# Get source
soup = BeautifulSoup(driver.page_source, 'html.parser')
# Facebook changes these class values from time to time, so update them when nothing matches
titles = soup.find_all("div", {"class": "x11i5rnm xat24cr x1mh8g0r x1vvkbs xdj266r x126k92a"})
```

## 6. Print the number of posts and read each one from the list

```
print("Total: " + str(len(titles)) + " posts...")
for title in titles:  # find_all returns a list, so iterate over it
    posts = title.find_all('div', {'dir': 'auto'})
    for post in posts:
        text = post.get_text()
        if text:  # skip elements that contain no text
            print(text)
    print("----------------------------------------")  # separator between posts

# Exit
driver.quit()  # close the WebDriver
```

## 7. Output

![](https://i.imgur.com/6DTQVfu.png)

![](https://i.imgur.com/3Z7YL7H.png)

![](https://i.imgur.com/5QDxDxx.png)
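
Step 4 scrolls a fixed number of times and always waits the full pause, even when nothing new is loaded. A possible variation, given here only as a minimal sketch, stops early once the page height stops growing. The helper name `scroll_until_stable` and its parameters `max_scrolls` and `pause` are hypothetical and not part of the original script. The only Selenium call it relies on is `driver.execute_script()`.

```python
import time


def scroll_until_stable(driver, max_scrolls=4, pause=5.0):
    """Scroll to the bottom until the page stops growing or max_scrolls is hit.

    `driver` is assumed to be a Selenium WebDriver (anything that offers
    execute_script() works). Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    scrolls = 0
    while scrolls < max_scrolls:
        # Jump to the current bottom of the page to trigger lazy loading
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
        scrolls += 1
        time.sleep(pause)  # give the page time to append more posts
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # the page did not grow, so assume nothing more was loaded
        last_height = new_height
    return scrolls
```

In the script above, the loop in step 4 could be replaced by `scroll_until_stable(driver, max_scrolls=4, pause=5)`. An unchanged height only suggests that no new posts arrived within the pause; on a slow connection a longer `pause` may be needed.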