NKUST ITC
Python Club Course
Date | Topic |
---|---|
11/19 | DOM parsing, APIs |
11/26 | To be decided from here on |
12/17 | Possibly an event |
12/26 | End-of-term general members' meeting |
HTML is made up of the following 3 kinds of things:
<!DOCTYPE html>
<html>
<body>
<h1>My First Heading</h1>
<div class="content">My first paragraph.</div>
</body>
</html>
<div class="content">My first paragraph.</div>
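To make the pieces of that single line concrete, here is a minimal sketch using only Python's standard library. It assumes the three parts meant above are the tag, its attributes, and the text content (the note does not list them explicitly). `PartsCollector` is a hypothetical helper name made up for this illustration.

```python
from html.parser import HTMLParser


class PartsCollector(HTMLParser):
    """Record the tag names, attributes, and text seen while parsing."""

    def __init__(self):
        super().__init__()
        self.tags = []        # tag names, in the order they open
        self.attributes = []  # (name, value) pairs taken from start tags
        self.texts = []       # non-empty text found between tags

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        self.attributes.extend(attrs)

    def handle_data(self, data):
        if data.strip():
            self.texts.append(data.strip())


collector = PartsCollector()
collector.feed('<div class="content">My first paragraph.</div>')
collector.close()

print(collector.tags)        # ['div']
print(collector.attributes)  # [('class', 'content')]
print(collector.texts)       # ['My first paragraph.']
```

BeautifulSoup, introduced next, exposes the same three pieces more conveniently as `tag.name`, `tag.attrs`, and `tag.text`.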
pip install beautifulsoup4 requests lxml
from bs4 import BeautifulSoup
import requests
rs = requests.get('https://movies.yahoo.com.tw/movie_intheaters.html')
dom = BeautifulSoup(rs.text, 'lxml')
print(dom.prettify())  # Print the pretty-printed HTML
from bs4 import BeautifulSoup
with open("index.html") as f:
    dom = BeautifulSoup(f, 'lxml')
print(dom.prettify())  # Print the pretty-printed HTML
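The local-file variant can be tried without having an `index.html` on hand. The sketch below writes a throwaway file first, then parses it. The temporary path and the sample markup are assumptions made for the demo, and it uses the built-in `'html.parser'` so that `lxml` is not required.

```python
import tempfile
from pathlib import Path

from bs4 import BeautifulSoup

html = '<html><body><h1>My First Heading</h1></body></html>'

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file, created only for this demo.
    path = Path(tmp) / 'index.html'
    path.write_text(html, encoding='utf-8')

    # An explicit encoding avoids depending on the platform default.
    with open(path, encoding='utf-8') as f:
        dom = BeautifulSoup(f, 'html.parser')

heading = dom.find('h1').text
print(heading)  # My First Heading
```

The parsed `dom` stays usable after the file is closed, because BeautifulSoup reads the whole file while building the tree.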
# find(): find a single element
from bs4 import BeautifulSoup
import requests
url = 'http://quotes.toscrape.com/'
rs = requests.get(url)
dom = BeautifulSoup(rs.text, 'lxml')
print(dom.find('p').text)
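One thing worth knowing about `find()` is that it returns `None` when nothing matches, so calling `.text` on the result can fail. The sketch below shows this offline, using the sample HTML from earlier in this note rather than the live site.

```python
from bs4 import BeautifulSoup

html = (
    '<html><body><h1>My First Heading</h1>'
    '<div class="content">My first paragraph.</div></body></html>'
)
dom = BeautifulSoup(html, 'html.parser')

heading = dom.find('h1')
paragraph = dom.find('p')  # the sample contains no <p> element

print(heading.text)  # My First Heading
print(paragraph)     # None

# Guard before reading .text; None.text would raise AttributeError.
paragraph_text = paragraph.text if paragraph is not None else ''
```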
# find_all(): find all matching elements
from bs4 import BeautifulSoup
import requests
url = 'http://quotes.toscrape.com/'
rs = requests.get(url)
soup = BeautifulSoup(rs.text, 'lxml')
tags = soup.find_all(class_='text')  # Find every element whose class is "text"
for tag in tags:  # Loop over the matches and print each one's text
    print(tag.text)
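To experiment with `find_all()` without a network connection, the following sketch runs it on a small inline snippet. The markup is a simplified stand-in invented for this example, not the real page source. It also shows the `limit` argument and what comes back when nothing matches.

```python
from bs4 import BeautifulSoup

# Simplified, made-up markup that imitates a list of quotes.
html = '''
<div class="quote"><span class="text">First quote</span></div>
<div class="quote"><span class="text">Second quote</span></div>
<div class="quote"><span class="text">Third quote</span></div>
'''
soup = BeautifulSoup(html, 'html.parser')

# Every element with class "text".
all_texts = [tag.text for tag in soup.find_all(class_='text')]

# Restrict by tag name as well, and stop after two matches.
first_two = [tag.text for tag in soup.find_all('span', class_='text', limit=2)]

# No match gives an empty result, not None, so a for loop simply does nothing.
missing = soup.find_all(class_='no-such-class')

print(all_texts)     # ['First quote', 'Second quote', 'Third quote']
print(first_two)     # ['First quote', 'Second quote']
print(len(missing))  # 0
```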
from bs4 import BeautifulSoup
import requests
url = 'http://quotes.toscrape.com/'
rs = requests.get(url)
soup = BeautifulSoup(rs.text, 'html.parser')
tags = soup.find(class_='quote').find_all('a')
for tag in tags:
    print(tag['href'])  # Read the href attribute of each link
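The `href` values on a page are often relative paths. A possible next step, sketched below, is to turn them into full URLs with `urllib.parse.urljoin`, and to use `tag.get('href')` so that an `<a>` without an `href` does not raise `KeyError`. The inline markup is a made-up stand-in for one quote block, not the actual page source.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

base_url = 'http://quotes.toscrape.com/'

# Made-up markup imitating a single quote block.
html = '''
<div class="quote">
  <a href="/author/Albert-Einstein">(about)</a>
  <a class="tag" href="/tag/change/page/1/">change</a>
  <a name="no-link">anchor without href</a>
</div>
'''
soup = BeautifulSoup(html, 'html.parser')

links = []
for tag in soup.find(class_='quote').find_all('a'):
    href = tag.get('href')  # returns None instead of raising KeyError
    if href is not None:
        links.append(urljoin(base_url, href))

for link in links:
    print(link)
# http://quotes.toscrape.com/author/Albert-Einstein
# http://quotes.toscrape.com/tag/change/page/1/
```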
<!DOCTYPE html>
<html>
<body>
<h1>My First Heading</h1>
<div class="content">My first paragraph.</div>
</body>
</html>
For the HTML document above, this is all it takes to get the content of the div:
print(dom.select(".content")[0].text)
# Output: My first paragraph.
In a CSS selector, a leading "." denotes a class,
so class="content" can be written in a CSS selector simply as
".content"
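A few more selector forms can be tried the same way. The sketch below extends the sample document with an `id` and a second `div`, both added only for illustration. It uses `select()` for all matches and `select_one()` for the first match.

```python
from bs4 import BeautifulSoup

# Sample document from above, with an id and a second div added for the demo.
html = '''
<html><body>
<h1 id="title">My First Heading</h1>
<div class="content">My first paragraph.</div>
<div class="content">My second paragraph.</div>
</body></html>
'''
dom = BeautifulSoup(html, 'html.parser')

# ".content" matches every element with class="content".
by_class = [tag.text for tag in dom.select('.content')]

# "#title" matches the element with id="title".
by_id = dom.select_one('#title').text

# "div.content" requires both the tag name and the class.
by_tag_and_class = dom.select_one('div.content').text

print(by_class)          # ['My first paragraph.', 'My second paragraph.']
print(by_id)             # My First Heading
print(by_tag_and_class)  # My first paragraph.
```

`select()` always returns a list, which is why the earlier example indexes it with `[0]`. `select_one()` returns the first match directly, or `None` if there is none.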