# Cloud Platform - GCP
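* 6/10
* Cloud: with cloud computing, users do not have to set up their own machines. Serverless services take hardware management out of the picture. Companies avoid paying for their own IT infrastructure, and different departments can use the same resources at the same time.
* IaaS (Infrastructure as a Service): renting the land
PaaS (Platform as a Service): renting the tools and equipment needed to build a house, e.g. Microsoft Azure
SaaS (Software as a Service): renting a finished house, e.g. Slack
* Cloud deployment types
Private cloud: available only to specific users
Public cloud: operated by an external provider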
* Set up the Anaconda/Jupyter environment

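* Jupyter
a: insert a cell above the current one
b: insert a cell below the current one
!: run the rest of the line in the terminal, e.g. !pip list, !python --version
Run a cell: Shift+Enter, or click Run
Ctrl+Shift+-: split the cell into an upper and a lower cell
* BigQuery
Region - Taiwan: asia-east1
or Tokyo (asia-northeast1)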
### BigQuery Hands-on
* GCP Projects > BigQuery Datasets > BigQuery Tables > BigQuery Views
[Documentation](https://cloud.google.com/bigquery/docs/locations)
* Enable the key






* Create a dataset (dataset name: create_new_dataset)
```py=
from google.cloud import bigquery as bq
import datetime
import pandas as pd
import pyarrow
import os

# Point the client library at the service account key file
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "C:\\Users\\Samantha\\Documents\\dv105\\0610雲端平台GCP\\neat-throne-389400-6704b48a595e.json"
client = bq.Client()

# Create the dataset
dataset_id = 'neat-throne-389400.create_new_dataset'  # Dataset ID; change as needed
dataset = bq.Dataset(dataset_id)
dataset.location = "asia-east1"  # Data location; defaults to US if not set
dataset.default_table_expiration_ms = 30*24*60*60*1000  # Default table expiration; here 30 days
dataset.description = 'neat-throne-389400 & expiration in 30 days & location at asia-east1'  # Dataset description
dataset = client.create_dataset(dataset)  # Make an API request.

# List the datasets in the project
datasets = list(client.list_datasets())  # Make an API request.
for dataset in datasets:
    print(dataset.dataset_id)
```
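The expiration above is written as a product of constants. As a minimal sketch, `datetime.timedelta` can do the same conversion while keeping the unit explicit. The helper name `days_to_ms` is hypothetical and not part of the BigQuery client.

```python
import datetime


def days_to_ms(days):
    """Convert a number of days to whole milliseconds (hypothetical helper)."""
    one_ms = datetime.timedelta(milliseconds=1)
    return int(datetime.timedelta(days=days) / one_ms)


# Same value as the hand-written 30*24*60*60*1000 used for the dataset
print(days_to_ms(30))
```

The result can be assigned to `dataset.default_table_expiration_ms` in place of the literal product.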
* Create a table (table name: create_table)
```py=
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "C:\\Users\\Samantha\\Documents\\dv105\\0610雲端平台GCP\\neat-throne-389400-6704b48a595e.json"
client = bq.Client()

# Table ID
table_id = "neat-throne-389400.create_new_dataset.create_table"

# Table schema
schema = [
    bq.SchemaField("name", "STRING"),
    bq.SchemaField("post", "STRING"),
    bq.SchemaField("timestamp", "TIMESTAMP"),
]
table = bq.Table(table_id, schema=schema)
table.expires = datetime.datetime.now() + datetime.timedelta(days=6)  # Table expiration time
table.description = "create a new table and write the description."  # Table description

# Optional: table partitioning
# table.time_partitioning = bq.TimePartitioning(
#     type_=bq.TimePartitioningType.DAY,
#     field="timestamp",
#     expiration_ms=7776000000,  # 90 days
# )

# Create the table
table = client.create_table(table)  # Make an API request.
```
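The table ID above follows the Project > Dataset > Table hierarchy, joined with dots. A small sketch of splitting such an ID back into its parts, which can help when checking that the names match what was created earlier. `split_table_id` is a hypothetical helper, not a BigQuery client function.

```python
def split_table_id(table_id):
    """Split 'project.dataset.table' into its three parts (hypothetical helper)."""
    parts = table_id.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected project.dataset.table, got {table_id!r}")
    project, dataset, table = parts
    return {"project": project, "dataset": dataset, "table": table}


print(split_table_id("neat-throne-389400.create_new_dataset.create_table"))
```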
* Write data into the create_table table
```py=
# Load a DataFrame into BigQuery
df = pd.DataFrame({
    'name': ['Max'],
    'post': ['1'],
    'timestamp': [datetime.datetime.now()]
})
table = client.dataset('create_new_dataset').table('create_table')
job = client.load_table_from_dataframe(df, table)
job.result()  # Wait for the load job to finish
```
* Create a table (table name: create_nested_table)
```py=
table_id = "neat-throne-389400.create_new_dataset.create_nested_table"

# Table schema; 'account' is a repeated RECORD (nested) field
schema = [
    bq.SchemaField('post', 'STRING', mode='NULLABLE'),
    bq.SchemaField('account',
                   'RECORD',
                   mode='REPEATED',
                   fields=[
                       bq.SchemaField('name', "STRING", mode="NULLABLE"),
                       bq.SchemaField('address', "STRING", mode="NULLABLE"),
                       bq.SchemaField('number', "INTEGER", mode="NULLABLE")
                   ])
]
table = bq.Table(table_id, schema=schema)
# Table expiration time
table.expires = datetime.datetime.now() + datetime.timedelta(days=6)
# Table description
table.description = 'create a new table and write the description.'
# Create the table
table = client.create_table(table)  # Make an API request.
```
* Write data into the create_nested_table table
```py=
import json

# Load JSON rows into BigQuery
now_stamp = datetime.datetime.now()
print(now_stamp)
json_data = [{
    'post': 'post01',
    'account': [{
        'name': 'Max',
        'address': '忠孝東路走九遍',
        'number': '0900000000'
    }]
}]
table = client.dataset('create_new_dataset').table('create_nested_table')
job = client.load_table_from_json(json_data, table)
job.result()  # Wait for the load job to finish
```
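One thing to watch in the row above: `number` holds a phone-style value with a leading zero, while the schema declares it as INTEGER. The notes do not show how the load job treats the quoted digits, but an integer cannot keep a leading zero regardless. A quick sketch in plain Python:

```python
phone = '0900000000'

# Converting to an integer drops the leading zero
as_int = int(phone)
print(as_int)

# The digit count shrinks, so the original value cannot be recovered
print(len(phone), len(str(as_int)))
```

If the leading zero matters, declaring the field as STRING in the schema is one way to preserve it.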
### BigQuery Web Scraping Hands-on
* Create the table for the scraper (ifoodie_table)
```py=
table_id = "neat-throne-389400.create_new_dataset.ifoodie_table"
schema = [
    bq.SchemaField('restaurant_url', 'STRING', mode='NULLABLE'),
    bq.SchemaField('name', 'STRING', mode='NULLABLE'),
    bq.SchemaField('address', 'STRING', mode='NULLABLE'),
    bq.SchemaField('category', 'RECORD', mode='REPEATED',
                   fields=[
                       bq.SchemaField('tag', "STRING", mode="NULLABLE"),
                       bq.SchemaField('tag_url', "STRING", mode="NULLABLE")
                   ])
]
table = bq.Table(table_id, schema=schema)
# Table expiration time
table.expires = datetime.datetime.now() + datetime.timedelta(days=6)
# Table description
table.description = 'create a new table and write the description.'
# Create the table
table = client.create_table(table)  # Make an API request.
```
* Scrape the ifoodie page and write the scraped data into the ifoodie_table table
```py=
from bs4 import BeautifulSoup
import requests
import json

# Scrape restaurant data, then load it into BigQuery as JSON rows
now_stamp = datetime.datetime.now()
print(now_stamp)
url = "https://ifoodie.tw/explore/list/%E9%8D%8B%E9%A1%9E"
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
paragraph = soup.find_all('div', {"class": 'restaurant-info'})
data = []
for para in paragraph:
    try:
        url = para.a['href']
        restaurant_url = f'https://ifoodie.tw/{url}'
        name = para.findChild("div").findChild("div").findChild("div").findChild("a").text
        address = para.findChild("div").findChild("div").findNextSibling("div").findNextSibling("div").findNextSibling("div").text
        # Collect every category tag of the restaurant into the repeated 'category' field
        tags = para.find_all('a', {'class': 'category'})
        tags_list = []
        for t in tags:
            tag = t.text
            turl = t['href']
            tag_url = f'https://ifoodie.tw/{turl}'
            tag_obj = {'tag': tag, 'tag_url': tag_url}
            tags_list.append(tag_obj)
        para_obj = {'restaurant_url': restaurant_url, 'name': name, 'address': address, 'category': tags_list}
        data.append(para_obj)
    except Exception as e:
        # Skip entries whose markup does not match the expected structure
        print("Error:", e)
        continue
table = client.dataset('create_new_dataset').table('ifoodie_table')
job = client.load_table_from_json(data, table)
job.result()  # Wait for the load job to finish
```
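The scraper builds absolute URLs with an f-string, which yields a double slash if the scraped `href` already starts with `/`. A minimal sketch of an alternative using the standard library's `urljoin`, which handles both forms. The sample `href` values and the helper name `absolute_url` are made up for illustration; the notes do not show what the site actually returns.

```python
from urllib.parse import urljoin

BASE_URL = "https://ifoodie.tw/"


def absolute_url(href):
    """Resolve a scraped href against the site root (hypothetical helper)."""
    return urljoin(BASE_URL, href)


# Hypothetical hrefs: with and without a leading slash
print(absolute_url("/restaurant/abc"))
print(absolute_url("restaurant/abc"))

# The f-string approach keeps the extra slash
print(f"https://ifoodie.tw/{'/restaurant/abc'}")
```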
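* How it looks in the GCP console

* Enable the Maps JavaScript API service

* Check the key under Credentials


* The key is used when an HTML page calls the Google Maps API

### Convert addresses to latitude/longitude with the Geocoding API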
* First create the BQ table ifoodie_address_info
```py=
from google.cloud import bigquery as bq
import datetime
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "C:\\Users\\Samantha\\Documents\\dv105\\0610雲端平台GCP\\neat-throne-389400-6704b48a595e.json"
client = bq.Client()
table_id = "neat-throne-389400.create_new_dataset.ifoodie_address_info"
schema = [
    bq.SchemaField('name', 'STRING', mode='NULLABLE'),
    bq.SchemaField('address', 'STRING', mode='NULLABLE'),
    bq.SchemaField('lat', 'FLOAT', mode='NULLABLE'),
    bq.SchemaField('lng', 'FLOAT', mode='NULLABLE'),
]
table = bq.Table(table_id, schema=schema)
# Table expiration time
table.expires = datetime.datetime.now() + datetime.timedelta(days=6)
# Table description
table.description = 'create a new table and write the description.'
# Create the table
table = client.create_table(table)  # Make an API request.
```
* Use the API to convert addresses to latitude/longitude and write them into the BQ table
```py=
import json
import requests
from google.cloud import bigquery as bq
import datetime
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "C:\\Users\\Samantha\\Documents\\dv105\\0610雲端平台GCP\\neat-throne-389400-6704b48a595e.json"
client = bq.Client()

# Read the previously scraped restaurant data
with open('C:\\Users\\Samantha\\Documents\\dv105\\0525專題二-數據一條龍\\ifoodie.txt', 'r', encoding='utf-8') as f:
    data = json.load(f)

data_obj = []
for i in data:
    name = i['name']
    address = i['address']
    response = requests.get(f'https://maps.googleapis.com/maps/api/geocode/json?address={address}&key=AIzaSyCg-FTOgv2CzJ0hbNkaiB2xr8z8MBVVQww')
    a = json.loads(response.text)
    # Take the coordinates of the first match
    lati = a['results'][0]['geometry']['location']['lat']
    lont = a['results'][0]['geometry']['location']['lng']
    json_data = {'name': name, 'address': address, 'lat': lati, 'lng': lont}
    data_obj.append(json_data)

table = client.dataset('create_new_dataset').table('ifoodie_address_info')
job = client.load_table_from_json(data_obj, table)
job.result()  # Wait for the load job to finish
```
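The loop above indexes `results[0]` directly, which raises an `IndexError` when an address cannot be geocoded. A minimal sketch of a more defensive lookup, run here on hand-written sample payloads rather than live API calls. `extract_lat_lng` is a hypothetical helper, and the sample coordinates are made up.

```python
def extract_lat_lng(payload):
    """Return (lat, lng) from a Geocoding API JSON payload, or None if there is no match."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


# Hand-written sample payloads shaped like Geocoding API responses
sample_ok = {
    "status": "OK",
    "results": [{"geometry": {"location": {"lat": 25.04, "lng": 121.56}}}],
}
sample_empty = {"status": "ZERO_RESULTS", "results": []}

print(extract_lat_lng(sample_ok))
print(extract_lat_lng(sample_empty))
```

In the loop, a `None` result could be skipped or stored as null `lat`/`lng` values, since both columns are NULLABLE.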
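* The GCP BQ console after completion

* Plot a map from the latitude and longitude
1. On the BQ page, click Export > Explore with Looker Studio

2. Change the chart to a bubble map, and add a new field that combines latitude and longitude.



3. Drag location onto the Location dimension to finish.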
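
The notes do not show the formula used for the combined field in step 2. Assuming it is a comma-separated `latitude,longitude` string (an assumption), the same value could also be prepared in Python before loading the rows. `to_lat_lng_field` is a hypothetical helper.

```python
def to_lat_lng_field(lat, lng):
    """Build a 'lat,lng' string after a basic range check (hypothetical helper)."""
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180 <= lng <= 180:
        raise ValueError(f"longitude out of range: {lng}")
    return f"{lat},{lng}"


# Made-up coordinates for illustration
print(to_lat_lng_field(25.04, 121.56))
```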

### Vision AI
* Enable [Cloud Vision AI](https://cloud.google.com/vision?hl=zh-tw)

* Install the Vision package: `!pip install google-cloud-vision`
* Get the label information of an image
```py=
from google.cloud import vision

def label_image(image_path):
    client = vision.ImageAnnotatorClient()
    # Read the image file as raw bytes
    with open(image_path, 'rb') as image_file:
        content = image_file.read()
    image = vision.Image(content=content)
    response = client.label_detection(image=image)
    labels = response.label_annotations
    print(labels)
    for label in labels:
        print(label.description)

image_path = 'C:\\Users\\Samantha\\Desktop\\PIC\\19.jpg'
label_image(image_path)
```

* Adapt the label example above to get the image_property information (color information) of an image
```py=
from google.cloud import vision

def analyze_image_properties(image_path):
    client = vision.ImageAnnotatorClient()
    with open(image_path, 'rb') as image_file:
        content = image_file.read()
    image = vision.Image(content=content)
    # Request only the IMAGE_PROPERTIES feature
    features = [vision.Feature(type_=vision.Feature.Type.IMAGE_PROPERTIES)]
    request = vision.AnnotateImageRequest(image=image, features=features)
    response = client.batch_annotate_images(requests=[request])
    image_properties = response.responses[0].image_properties_annotation
    dominant_colors = image_properties.dominant_colors.colors
    for color in dominant_colors:
        # Print all three channels plus the score
        print("Color RGB: ({}, {}, {}), Score: {}".format(color.color.red, color.color.green, color.color.blue, color.score))

image_path = 'C:\\Users\\Samantha\\Desktop\\PIC\\123.jpg'
analyze_image_properties(image_path)
```
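
The dominant colors are printed as separate channel values. As a small sketch, they can be turned into a hex string for use in CSS or a chart. This assumes the channels are on a 0-255 scale, which is how the Vision responses appear to report them; `rgb_to_hex` is a hypothetical helper, and the sample values are made up.

```python
def rgb_to_hex(red, green, blue):
    """Convert 0-255 channel values (possibly floats) to '#rrggbb' (hypothetical helper)."""
    channels = [max(0, min(255, round(c))) for c in (red, green, blue)]
    return "#{:02x}{:02x}{:02x}".format(*channels)


# Made-up channel values for illustration
print(rgb_to_hex(255, 128.4, 0))
```

Inside the loop above it could be called as `rgb_to_hex(color.color.red, color.color.green, color.color.blue)`.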

* Text detection
```py=
# Text detection
from google.cloud import vision

def detect_text(image_path):
    client = vision.ImageAnnotatorClient()
    with open(image_path, 'rb') as image_file:
        content = image_file.read()
    image = vision.Image(content=content)
    response = client.text_detection(image=image)
    texts = response.text_annotations
    for text in texts:
        print(text.description)

image_path = 'D:\\APPS\\Desktop\\GCP\\Picture\\123.jpg'
detect_text(image_path)
```
* Get all available information about a photo
```py=
from google.cloud import vision

def detect_labels_landmarks_faces(image_path):
    client = vision.ImageAnnotatorClient()
    with open(image_path, 'rb') as image_file:
        content = image_file.read()
    image = vision.Image(content=content)
    # Features to detect
    features = [
        vision.Feature(type_=vision.Feature.Type.LABEL_DETECTION),
        vision.Feature(type_=vision.Feature.Type.LANDMARK_DETECTION),
        vision.Feature(type_=vision.Feature.Type.FACE_DETECTION),
        vision.Feature(type_=vision.Feature.Type.IMAGE_PROPERTIES),
        vision.Feature(type_=vision.Feature.Type.SAFE_SEARCH_DETECTION)
    ]
    image_context = vision.ImageContext(
        language_hints=["en"]  # Optional language hints
    )
    # Build the image annotation request
    request = vision.AnnotateImageRequest(
        image=image,
        features=features,
        image_context=image_context
    )
    response = client.annotate_image(request)
    print(response)
    print('========')
    # Labels
    if response.label_annotations:
        print("Labels:")
        for label in response.label_annotations:
            print(label.description)
    # Landmarks
    if response.landmark_annotations:
        print("Landmarks:")
        for landmark in response.landmark_annotations:
            print('Landmark name:', landmark.description)
            print('Bounding box:')
            print('Left:', landmark.bounding_poly.vertices[0].x)
            print('Top:', landmark.bounding_poly.vertices[0].y)
            print('Right:', landmark.bounding_poly.vertices[2].x)
            print('Bottom:', landmark.bounding_poly.vertices[2].y)
            print('----------')
    # Face attributes
    if response.face_annotations:
        print('Face attributes:')
        for face in response.face_annotations:
            print('Face bounding box:')
            print('Left:', face.bounding_poly.vertices[0].x)
            print('Top:', face.bounding_poly.vertices[0].y)
            print('Right:', face.bounding_poly.vertices[2].x)
            print('Bottom:', face.bounding_poly.vertices[2].y)
            print('Other attributes:', face)
    # Image properties (field name is singular: image_properties_annotation)
    if response.image_properties_annotation:
        props = response.image_properties_annotation
        print('Dominant colors:')
        for color in props.dominant_colors.colors:
            print('Color:', color.color)
            print('Score:', color.score)
            print('Pixel fraction:', color.pixel_fraction)
    # SafeSearch attributes (field name: safe_search_annotation)
    if response.safe_search_annotation:
        safe_search = response.safe_search_annotation
        print('SafeSearch attributes:')
        print('adult:', safe_search.adult)
        print('spoof:', safe_search.spoof)
        print('medical:', safe_search.medical)
        print('violence:', safe_search.violence)
        print('racy:', safe_search.racy)
    return response

image_path = 'D:\\APPS\\Desktop\\GCP\\Picture\\123.jpg'
detect_labels_landmarks_faces(image_path)
```
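The code above reads the box edges from `vertices[0]` and `vertices[2]`, which assumes a fixed vertex order. A minimal sketch that computes the edges from all vertices instead, so the order does not matter. It runs on stand-in objects with `x` and `y` attributes rather than a real API response; `bounding_box` is a hypothetical helper.

```python
from types import SimpleNamespace


def bounding_box(vertices):
    """Return the left/top/right/bottom edges covering all vertices (hypothetical helper)."""
    xs = [v.x for v in vertices]
    ys = [v.y for v in vertices]
    return {"left": min(xs), "top": min(ys), "right": max(xs), "bottom": max(ys)}


# Stand-in vertices, deliberately not starting at the top-left corner
sample_vertices = [
    SimpleNamespace(x=110, y=80),
    SimpleNamespace(x=10, y=80),
    SimpleNamespace(x=10, y=20),
    SimpleNamespace(x=110, y=20),
]
print(bounding_box(sample_vertices))
```

With a real response it could be called as `bounding_box(face.bounding_poly.vertices)`.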
### DialogFlow
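* In DialogFlow, create a new Agent, select Default Welcome Intent, and under Training phrases enter phrases the bot should accept, such as 「你好」 ("hello").


* In GCP, go to IAM > Service Accounts > Create Service Account. The service account name must match the name of the new Agent above, and the role must be DialogFlow > DialogFlow API Client. After creating the account, add a key and save the key file to your computer.



* Install the DialogFlow package: `!pip install google-cloud-dialogflow`
* The following code accesses DialogFlow through its API
Remember to adjust the key path (os.environ) and the project name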
```py=
import os
from google.cloud import dialogflow_v2
from google.cloud.dialogflow_v2 import types

# The variable name must use underscores, otherwise the client cannot find the key
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "C:\\Users\\Samantha\\Downloads\\neat-throne-389400-9879a39cdb8d.json"

# Dialogflow project ID and language code
project_id = "neat-throne-389400"
language_code = "en-US"
# Session ID; should be unique per conversation
session_id = "1234"
# Text to send
# text = "Hello, how are you?"
# text = "妳好"
# text = "classs"
text = "你好"

def detect_intent(project_id, session_id, text, language_code):
    session_client = dialogflow_v2.SessionsClient()
    session = session_client.session_path(project_id, session_id)
    text_input = dialogflow_v2.types.TextInput(text=text, language_code=language_code)
    query_input = dialogflow_v2.types.QueryInput(text=text_input)
    response = session_client.detect_intent(session=session, query_input=query_input)
    return response.query_result.fulfillment_text

# Call detect_intent to get the Dialogflow response
response = detect_intent(project_id, session_id, text, language_code)
# Print the response
print("Dialogflow Response: ", response)
```
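
The session ID above is hard-coded as "1234", so every run shares one conversation. A minimal sketch of generating a fresh ID per conversation with `uuid`. The path format shown is what `session_path` is understood to produce for Dialogflow ES (v2); treat it as an assumption. `new_session_id` and `session_path_string` are hypothetical helpers.

```python
import uuid


def new_session_id():
    """Return a random 32-character hex string to use as a session ID (hypothetical helper)."""
    return uuid.uuid4().hex


def session_path_string(project_id, session_id):
    """Format a session resource path (assumed Dialogflow ES v2 layout)."""
    return f"projects/{project_id}/agent/sessions/{session_id}"


session_id = new_session_id()
print(session_path_string("neat-throne-389400", session_id))
```

The generated `session_id` can be passed to `detect_intent` in place of "1234".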

###### tags: `cloud platform` `GCP` `web scraping` `python` `jupyter`