# Word Count Example
### Presenter: 黃夙賢
---
## Word Count
- Run the word count example on a Hadoop cluster that simulates three machines
- Reference: https://github.com/kiwenlau/hadoop-cluster-docker

---
## Downloading the Example Container Image
- Pull the example container image
- sudo docker pull kiwenlau/hadoop:1.0

---
## Downloading the Hadoop Run Scripts
- Install git if it is missing (sudo apt install git)
- git clone https://github.com/kiwenlau/hadoop-cluster-docker

---
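## Creating the Container Network
- Three containers simulate three computers
- Create the network that the three Hadoop containers use to communicate
- sudo docker network create --driver=bridge hadoop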
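Re-running the create command fails if the network already exists. A minimal sketch of a guard, assuming a hypothetical helper named `ensure_network` and a `DOCKER` variable that holds the Docker command (neither comes from the referenced repository):

```shell
# Assumption: DOCKER holds the command used to invoke Docker.
DOCKER="${DOCKER:-sudo docker}"

# ensure_network NAME: create a bridge network only if it does not exist yet.
# (Hypothetical helper, written for this illustration.)
ensure_network() {
  name="$1"
  if $DOCKER network inspect "$name" >/dev/null 2>&1; then
    echo "network '$name' already exists"
  else
    $DOCKER network create --driver=bridge "$name" >/dev/null
    echo "network '$name' created"
  fi
}
```

Usage: `ensure_network hadoop`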

---
## Starting the Containers
- cd hadoop-cluster-docker
- sudo ./start-container.sh
- This creates the three Hadoop machines: hadoop-master, hadoop-slave1, hadoop-slave2
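To confirm that all three containers are up, the check below can be run from a second terminal on the host. It is a sketch only: `check_cluster` and the `DOCKER` variable are hypothetical names, and it assumes the containers carry the three names listed above.

```shell
# Assumption: DOCKER holds the command used to invoke Docker.
DOCKER="${DOCKER:-sudo docker}"

# check_cluster: report whether each expected node is running.
# Returns 1 if any node is missing. (Hypothetical helper.)
check_cluster() {
  running="$($DOCKER ps --filter "name=hadoop" --format '{{.Names}}')"
  missing=0
  for node in hadoop-master hadoop-slave1 hadoop-slave2; do
    if printf '%s\n' "$running" | grep -qxF "$node"; then
      echo "$node: running"
    else
      echo "$node: MISSING"
      missing=1
    fi
  done
  return "$missing"
}
```

Usage: `check_cluster`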

---
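## Starting Hadoop
- Once the containers start successfully, you are placed in the Linux system inside the container
- Next, start Hadoop on the three machines created earlier: hadoop-master, hadoop-slave1, hadoop-slave2
- ./start-hadoop.sh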

---
## Word Count Example
- Look at what the wordcount example does
- more run-wordcount.sh
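For intuition, the counting itself can be imitated on a single machine with standard tools. This is only an analogy to the map, shuffle, and reduce phases, not the Hadoop job; the input files and the `count_words` function are made up for the illustration.

```shell
# Hypothetical sample inputs, created only for this illustration.
workdir="$(mktemp -d)"
printf 'big data\n' > "$workdir/a.txt"
printf 'big cluster\n' > "$workdir/b.txt"

# count_words FILE...: print "word<TAB>count" for every word in the files.
count_words() {
  cat "$@" |
    tr -s '[:space:]' '\n' |  # "map": emit one word per line
    sort |                    # "shuffle": group identical words together
    uniq -c |                 # "reduce": count each group
    awk 'NF == 2 { print $2 "\t" $1 }'  # reorder to word<TAB>count
}

count_words "$workdir"/*.txt
```

With these two sample files the output is `big 2`, `cluster 1`, `data 1`, one tab-separated pair per line.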

---
## Running the Word Count Program
- ./run-wordcount.sh

---
## Results
- file1.txt contains two words, and file2.txt contains two words
- The resulting word counts are displayed at the end

---
## Shakespeare Word Statistics
- Count word frequencies in Shakespeare's text ([data source](https://github.com/brunoklein99/deep-learning-notes/blob/master/shakespeare.txt))
- wget https://github.com/shhuangmust/bigdata/raw/main/run-wordcount1.sh
- chmod 755 run-wordcount1.sh
- ./run-wordcount1.sh
---
## Statistics Results
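A large text produces a long list of counts, so sorting by frequency makes the result easier to read. The sketch below assumes the job output consists of `word<TAB>count` lines; `top_words` is a hypothetical helper, not part of the downloaded script.

```shell
# top_words N [FILE...]: print the N most frequent words from
# "word<TAB>count" lines, read from the files or from standard input.
# Ties are broken alphabetically. (Hypothetical helper.)
top_words() {
  n="$1"
  shift
  sort -t "$(printf '\t')" -k2,2nr -k1,1 "$@" | head -n "$n"
}
```

Usage, assuming the counts were saved to a local file named result.txt (a hypothetical name): `top_words 10 result.txt`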

---
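## Removing the Containers
- Type exit to leave the container
- sudo docker rm -f $(sudo docker ps -aq) (note: this removes every container on the host)
- sudo docker network rm hadoop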
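The removal step above force-removes every container on the host, including unrelated ones. A narrower sketch, assuming the cluster containers all have "hadoop" in their names; `remove_hadoop_containers` and the `DOCKER` variable are hypothetical names:

```shell
# Assumption: DOCKER holds the command used to invoke Docker.
DOCKER="${DOCKER:-sudo docker}"

# remove_hadoop_containers: force-remove only the containers whose name
# contains "hadoop". (Hypothetical helper.)
remove_hadoop_containers() {
  ids="$($DOCKER ps -aq --filter "name=hadoop")"
  if [ -z "$ids" ]; then
    echo "no hadoop containers found"
    return 0
  fi
  # $ids is intentionally unquoted so each container ID is its own argument.
  $DOCKER rm -f $ids
}
```

Usage: `remove_hadoop_containers`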
