Scratch and Python 2018 - Python Lecture 4

File I/O

open
- Read text file: FILE = open(filename,'r')
  - Read whole file as a string: FILE.read()
  - Read whole file as a list of lines: FILE.readlines()
  - Read a single line: FILE.readline()
  - Is iterable: for line in FILE:
- Write text file: FILE = open(filename,'w')
  - Print to file: print(things,file=FILE)
  - Write a string: FILE.write(string)
close
- FILE.close()
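
The calls listed above can be sketched as follows. This is only a minimal illustration: the file name `demo.txt` and the temporary folder are assumptions made for the demo, not part of the lecture.

```python
import os
import tempfile

# Hypothetical demo file inside a temporary folder.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

# Write: print() appends a newline, write() does not.
FILE = open(path, 'w')
print('first line', file=FILE)
FILE.write('second line\n')
FILE.close()

# Read the whole file as one string.
FILE = open(path, 'r')
whole = FILE.read()
FILE.close()

# Read the whole file as a list of lines (each keeps its newline).
FILE = open(path, 'r')
lines = FILE.readlines()
FILE.close()

# A file object is iterable: loop over it line by line.
FILE = open(path, 'r')
stripped = [line.rstrip('\n') for line in FILE]
FILE.close()

print(repr(whole))
print(lines)
print(stripped)
```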

with block

A with block closes the file automatically, so you cannot forget to close it.

with open(filename,'r') as FILE:
    lines = FILE.readlines()
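
To see what the with block buys you, the sketch below checks `FILE.closed` after the block ends, once normally and once after an exception. The file name and the simulated failure are assumptions made for the demo.

```python
import os
import tempfile

# Hypothetical demo file inside a temporary folder.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

with open(path, 'w') as FILE:
    FILE.write('hello\n')

# Outside the block the file is already closed.
closed_after_write = FILE.closed

try:
    with open(path, 'r') as FILE:
        lines = FILE.readlines()
        raise ValueError('simulated failure inside the block')
except ValueError:
    pass

# The file is closed even though the block ended with an exception.
closed_after_error = FILE.closed

print(closed_after_write, closed_after_error)
```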

JSON

Reference
import json
dumps
- json.dumps({'name':'MZ','height':177.5,'weight':115})
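
As a small illustration, `json.dumps` returns an ordinary Python string. The `indent` and `sort_keys` arguments in the second call are optional extras that the lecture does not cover; they are shown here only as an assumption about what you might want for readable output.

```python
import json

person = {'name': 'MZ', 'height': 177.5, 'weight': 115}

# dumps: Python object -> JSON string
text = json.dumps(person)
print(type(text).__name__)   # str
print(text)                  # {"name": "MZ", "height": 177.5, "weight": 115}

# Optional: human-readable output with sorted keys.
pretty = json.dumps(person, indent=2, sort_keys=True)
print(pretty)
```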

dump

with open(filename,'w') as FILE:
    json.dump({'name':'MZ','height':177.5,'weight':115},FILE)

loads
- lst = json.loads('[1,2,3,4,5]')

load

with open(filename,'r') as FILE:
    obj = json.load(FILE)
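
Putting dump and load together gives a round trip through a file. The following is a minimal sketch; the file name `person.json` and the temporary folder are assumptions made for the demo.

```python
import json
import os
import tempfile

# Hypothetical demo file inside a temporary folder.
filename = os.path.join(tempfile.mkdtemp(), 'person.json')
record = {'name': 'MZ', 'height': 177.5, 'weight': 115}

# dump: write a Python object to an open file as JSON.
with open(filename, 'w') as FILE:
    json.dump(record, FILE)

# load: read JSON from an open file back into a Python object.
with open(filename, 'r') as FILE:
    obj = json.load(FILE)

# loads: the same thing, but from a string instead of a file.
lst = json.loads('[1,2,3,4,5]')

print(obj == record)
print(lst)
```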

CSV

Reference
import csv

csv.reader

import csv
with open(csvfile,'r',newline='') as FILE:
    rd = csv.reader(FILE)
    rows = [row for row in rd]
    
print(rows)

csv.writer

import csv
with open(csvfile,'w',newline='') as FILE:
    wt = csv.writer(FILE)
    wt.writerow(['Name','Height','Weight'])
    wt.writerow(['MZ',177.5,115])

csv.DictReader

import csv
with open(csvfile,'r',newline='') as FILE:
    rd = csv.DictReader(FILE)
    rows = [row for row in rd]
    
print(rows)

csv.DictWriter

import csv
with open(csvfile,'w',newline='') as FILE:
    fns = ['Name','Height','Weight']
    wt = csv.DictWriter(FILE,fieldnames=fns)
    wt.writeheader()  # write the header row so csv.DictReader can read the file back
    wt.writerow({'Name':'MZ','Height':177.5,'Weight':115})
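
One detail worth seeing in action: CSV stores everything as text, so numbers come back as strings. The sketch below writes one row with `csv.DictWriter`, including a header row via `writeheader()`, and reads it back with `csv.DictReader`. The file name `people.csv` is an assumption made for the demo.

```python
import csv
import os
import tempfile

# Hypothetical demo file inside a temporary folder.
csvfile = os.path.join(tempfile.mkdtemp(), 'people.csv')
fns = ['Name', 'Height', 'Weight']

# newline='' lets the csv module control line endings itself.
with open(csvfile, 'w', newline='') as FILE:
    wt = csv.DictWriter(FILE, fieldnames=fns)
    wt.writeheader()
    wt.writerow({'Name': 'MZ', 'Height': 177.5, 'Weight': 115})

with open(csvfile, 'r', newline='') as FILE:
    rows = [dict(row) for row in csv.DictReader(FILE)]

print(rows)

# Values are strings after reading; convert them when you need numbers.
height = float(rows[0]['Height'])
```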

`requests`: a browser (?)

import requests: don't forget the trailing s
URL: Uniform Resource Locator
- That term is a mouthful, so let's just call it a web address.
result = requests.get('http://www.nctu.edu.tw/')
- The URL is long, so try copy and paste.
result.text
- The content is a string: type(result.text)
- String processing
  - str
  - import re: regular expression
result.raise_for_status()
- Loading a web page involves many steps. If the URL is wrong, does the browser simply crash on you?
- It is OK to let an exception be raised; wrapping every call in a try-except block is tedious.
  - Java is tedious: you almost always need try or throws (raise in Python)
- raise_for_status(): if the response carries an HTTP error status (4xx or 5xx), raise an exception now.
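
The "String processing" item above can be sketched without touching the network. The HTML string below is an invented stand-in for `result.text` (a real page is much longer), and the regular expression is a rough pattern for illustration only, not a robust HTML parser.

```python
import re

# Hypothetical stand-in for result.text.
text = ('<html><head><title>Demo Page</title></head><body>'
        '<a href="http://example.com/a.png">A</a> '
        '<a href="http://example.com/b.png">B</a>'
        '</body></html>')

# Plain str methods: locate the title by searching for its tags.
start = text.find('<title>') + len('<title>')
end = text.find('</title>')
title = text[start:end]

# Regular expression: collect every href="..." value.
links = re.findall(r'href="([^"]+)"', text)

print(title)
print(links)
```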

Save the content

Open a file to save it: the_file = open('a_name.html', 'wb')
- Filename: a_name.html
- Mode: wb means "write binary"
  - the_file.write(chunk): write a chunk of bytes
- Mode: wt means "write text"
  - print(some_str,file=the_file): write a string
for chunk in result.iter_content(102400): iterate over the content of result in chunks of up to 102400 bytes
Remember to close the file: the_file.close()
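
The difference between the two modes can be tried offline. In the sketch below, `iter_chunks` is a hypothetical helper that only imitates the way `result.iter_content(size)` hands out pieces of bytes; it is not part of requests, and the data and file names are invented for the demo.

```python
import os
import tempfile

folder = tempfile.mkdtemp()

# Invented bytes standing in for downloaded content (1028 bytes).
data = b'\x89PNG' + bytes(range(256)) * 4

def iter_chunks(data, size):
    """Hypothetical helper: yield successive pieces of at most `size` bytes."""
    for i in range(0, len(data), size):
        yield data[i:i + size]

# Mode wb: write chunks of bytes.
bin_path = os.path.join(folder, 'a_name.bin')
the_file = open(bin_path, 'wb')
count = 0
for chunk in iter_chunks(data, 100):
    the_file.write(chunk)
    count += 1
the_file.close()

# Mode wt: write strings.
txt_path = os.path.join(folder, 'a_name.txt')
the_file = open(txt_path, 'wt')
print('some text', file=the_file)
the_file.close()

# Read the binary file back to confirm nothing was lost.
with open(bin_path, 'rb') as f:
    saved = f.read()

print(count, len(saved))
```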

Sample code

import requests

url = input('Input URL: ')
result = requests.get(url)
result.raise_for_status()

name = input('input filename: ')
FILE = open(name,'wb')
for chunk in result.iter_content(102400):
    FILE.write(chunk)

FILE.close()

Sample code 2: with block (everybody forgets to close a file sometimes)

import requests

url = input('Input URL: ')
result = requests.get(url)
result.raise_for_status()

name = input('input filename: ')
with open(name,'wb') as FILE:
    for chunk in result.iter_content(102400):
        FILE.write(chunk)

Try downloading some images with the sample code.

import requests

def url_to_file(url,filename):
    result = requests.get(url)
    result.raise_for_status()
    with open(filename,'wb') as FILE:
        for chunk in result.iter_content(102400):
            FILE.write(chunk)

Open Data

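Government Open Data Platform (政府開放資料平台)

Real-time UV monitoring data (紫外線即時監測資料)

Task: Write a program that uses the real-time UV monitoring data to find the three places with the highest UVI.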

Hint: requests.get('http://opendata.epa.gov.tw/ws/Data/UV/?$format=json')
Sample Code

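import requests, json

res = requests.get('http://opendata.epa.gov.tw/ws/Data/UV/?$format=json')
res.raise_for_status()
data = json.loads(res.text)
# Skip stations with an empty UVI, then pair each reading with its site name.
uvi_place = [(float(d['UVI']),d['SiteName']) for d in data if d['UVI']!='']
# Sort from highest to lowest UVI and keep the top three.
uvi_place.sort(reverse=True)
print(uvi_place[:3])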
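
To test the ranking logic without a network connection, the same idea can be run on a small hand-written sample. The records below are invented; only the field names `SiteName` and `UVI` come from the sample code, and the real service may return more fields or different values.

```python
import json

# Invented sample in the shape the sample code expects.
text = '''[
  {"SiteName": "A", "UVI": "3.2"},
  {"SiteName": "B", "UVI": ""},
  {"SiteName": "C", "UVI": "7.5"},
  {"SiteName": "D", "UVI": "5.1"},
  {"SiteName": "E", "UVI": "0.4"}
]'''

data = json.loads(text)

# Drop stations that report no reading.
valid = [d for d in data if d['UVI'] != '']

# Sort by the numeric UVI, highest first, and keep three.
top3 = sorted(valid, key=lambda d: float(d['UVI']), reverse=True)[:3]
names = [d['SiteName'] for d in top3]

print(names)   # ['C', 'D', 'A']
```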

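National e-invoice B2C issuance dataset (全國電子發票B2C開立資料集)

Task: Write a program that computes the average transaction value per customer for each industry from 2014 through 2017, and outputs the result as a CSV file.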

Hint: requests.get('http://sip.einvoice.nat.gov.tw/ods-main/ODS308E/download/3886F055-EB77-4DF9-98E2-F3F49A7D3434/1/845E38D0-76D4-4B49-922A-96F41705F175/0/?fileType=csv')
Sample Code

import requests, csv

def url_to_file(url,filename):
    result = requests.get(url)
    result.raise_for_status()
    with open(filename,'wb') as FILE:
        for chunk in result.iter_content(102400):
            FILE.write(chunk)

url = 'http://sip.einvoice.nat.gov.tw/ods-main/ODS308E/download/3886F055-EB77-4DF9-98E2-F3F49A7D3434/1/845E38D0-76D4-4B49-922A-96F41705F175/0/?fileType=csv'
filename = 'C:\\Users\\user\\Desktop\\task2.csv'
url_to_file(url, filename)

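# utf-8-sig strips the byte order mark (BOM) at the start of the file,
# so the first column name is read as '發票年月' rather than '\ufeff發票年月'.
with open(filename,'r',encoding='utf-8-sig',newline='') as FILE:
    rd = csv.DictReader(FILE)
    rows = [row for row in rd]

# Collect the average transaction values per industry, skipping 2018.
avg = {}
for row in rows:
    if row['發票年月'].startswith('2018'): continue
    avg.setdefault(row['行業名稱'],[]).append(float(row['平均客單價']))

for k, v in avg.items():
    print(k,sum(v)/len(v))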
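
The task asks for a CSV file, while the sample code prints to the screen. One possible way to finish the job is sketched below. The rows are invented and assume the BOM has already been stripped from the first column name; the date format, the industry names, and the output file name `average.csv` are all assumptions made for the demo.

```python
import csv
import os
import tempfile

# Invented rows using the column names from the sample code.
rows = [
    {'發票年月': '201401', '行業名稱': '零售', '平均客單價': '100'},
    {'發票年月': '201702', '行業名稱': '零售', '平均客單價': '200'},
    {'發票年月': '201506', '行業名稱': '餐飲', '平均客單價': '80'},
    {'發票年月': '201801', '行業名稱': '餐飲', '平均客單價': '999'},
]

# Group the values by industry, skipping 2018 as in the sample code.
avg = {}
for row in rows:
    if row['發票年月'].startswith('2018'):
        continue
    avg.setdefault(row['行業名稱'], []).append(float(row['平均客單價']))

# Write one line per industry: name, mean of its values.
out = os.path.join(tempfile.mkdtemp(), 'average.csv')
with open(out, 'w', newline='', encoding='utf-8') as FILE:
    wt = csv.writer(FILE)
    wt.writerow(['行業名稱', '平均客單價'])
    for k, v in avg.items():
        wt.writerow([k, sum(v) / len(v)])

# Read the file back to check what was written.
with open(out, 'r', newline='', encoding='utf-8') as FILE:
    written = list(csv.reader(FILE))

print(written)
```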
