owned this note
owned this note
Published
Linked with GitHub
---
disqus: hackmd
---
Databases and SQL for Data Science with Python <br> WEEK_4 - Access Databases Using Python
====
###### tags: `IBM Data Engineering Professional Certificate`,`Reading Note`,`Coursera`,`Databases and SQL for Data Science with Python`
### Overview
>Kitty ipsum dolor sit amet, shed everywhere shed everywhere stretching attack your ankles chase the red dot, hairball run catnip eat the grass sniff.
<br>
## Access Databases Using Python
### 1. How to Access Databases Using Python
* SQL API

* APIs used by popular SQL-based DBMS systems

<br>
### 2. Writing code using DB-API
* Concepts of the Python DB API
* Connection Objects
* 連結 Database
* 管理 transaction
* Cursor object
* 訪問 Database
* 提取資料
* Connection methods
* .cursor()
* .commit()
* .rollback()
* .close()
* cursor methods
* .callproc()
* .execute()
* .executemany()
* .fetchone()
* .fetchmany()
* .fetchall()
* .nextset()
* .arraysize()
* .close()
* Writing code using DB-API
* Create connection object
```python=
connection = connect('databasename', 'username', 'pswd')
```
* Create a cursor object
```python=
cursor = connection.cursor()
```
* Run Queries
```python=
cursor.execute('SELECT * FROM <table_name>')
results = cursor.fetchall()
```
* Free resources
```python=
cursor.close()
```
<br>
### 3. Connecting to a database using ibm_db API
* 連結資料庫的憑證

* 資料庫連線

* 關閉資料庫連線

### 4. Creating tables, loading data and querying data
* 使用 Python 創建資料庫的 table
```python=
stmt = ibm_db.exec_immediate(conn,
"CREATE TABLE Trucks(
serial_no varchar(20) PRIMARY KEY NOT NULL,
model VARCHAR(20) NOT NULL,
manufacture VARCHAR(20) NOT NULL,
Emgine_size VARCHAR(20) NOT NULL,
Truck_Class VARCHAR(20) NOT NULL)")
```
* 使用 Python 匯出資料至 table
```python=
stmt = ibm_db.exec_immediate(conn,
"INSERT INTO Trucks(serial_no,
model, manufacturer, Engine_size, Truck_Class)
VALUES('A1234', 'Lonestar', 'International Trucks', 'Cummins ISX15', 'Class 8');")
```
* 使用 Python 提取資料
```python=
stmt = ibm_db.exec_immediate(conn, "SELECT * FROM Trucks")
ibm_db.fetch_both(stmt)
```
OUTPUT:

* 使用 Pandas 提取資料
```python=
import pandas
import ibm_db_dbi
pconn = ibm_db_dbi.Connection(conn)
df = pandas.read_sql("SELECT * FROM Trucks", pconn)
df
```
OUTPUT:

### 5. Analyzing data with Python
* 匯入資料的過程

* Load CSV file into DB2 on cloud
1. Source
* 匯入 CSV 檔

2. Target
* 新增 table


3. Define
* 定義資料

4. Finalize
* 開始把資料匯入 table 裡

* 成功匯入資料至 table

* 點擊查閱 table

* Using pandas
* 使用 pandas 探索資料
```python=
import pandas
import ibm_db_dbi
pconn = ibm_db_dbi.Connection(conn)
df = pandas.read_sql('SELECT * FROM MCDONALDS_NUTRITION', pconn)
df
```
OUTPUT:

* View first few rows
```python=
df.head()
```
OUTPUT:

* Learn about your data
```python=
df.describe(include='all')
```
OUTPUT:

* Which food item has maximum sodium content ?
* 以繪圖的方式找出 sodium 的最大值
```python=
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
### Categorical scatterplots
plot = sns.swarmplot(x="Category", y="Sodium", data=df)
plt.setp(plot.get_xticklabels(), rotation=70)
plt.title("Sodium Content")
plt.show()
```
OUTPUT:

* 以 Code 的方式找出 sodium 的最大值
* Code 1
```python=
df['Sodium'].describe()
```
OUTPUT:

* Code 2
```python=
df['Sodium'].idxmax()
```
OUTPUT:

* Code 3
```python=
df.at[82, 'Item']
```
OUTPUT:

* Further data exploration using visualizations
```python=
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
plot = sns.jointplot(x="Protein", y="Total Fat", data=df)
plot.show()
```
OUTPUT:

<br>
## Summary & Highlights
* 課程完整整理的內容,所以把它記錄下來
* You can access a database from a language like Python by using the appropriate API. Examples include ibm_db API for IBM DB2, psycopg2 for ProstgreSQL, and dblib API for SQL Server.
* DB-API is Python's standard API for accessing relational databases. It allows you to write a single program that works with multiple kinds of relational databases instead of writing a separate program for each one.
* The DB_API connect constructor creates a connection to the database and returns a Connection Object, which is then used by the various connection methods.
* The connection methods are:
* The cursor() method, which returns a new cursor object using the connection.
* The commit() method, which is used to commit any pending transaction to the database.
* The rollback() method, which causes the database to roll-back to the start of any pending transaction.
* The close() method, which is used to close a database connection.
* You can use SQL Magic commands to execute queries more easily from Jupyter Notebooks.
* Magic commands have the general format %sql select * from tablename.
* Cell magics start with a double %% (percent) sign and apply to the entire cell.
* Line magics start with a single % (percent) sign and apply to a particular line in a cell.