# Hadoop安裝 # ###### tags: `bigdata` `hadoop` `單機版` ### 單機版 ### 參考: https://blog.gtwang.org/linux/linux-hadoop-single-node-cluster-tutorial/ #### Step1: 安裝 ```bash wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.7/hadoop-2.7.7.tar.gz tar zxvf hadoop-2.7.7.tar.gz sudo mv hadoop-2.7.7 /opt/ ``` #### Step2: Java, Hadoop PATH 在~/.bash_profile(Mac) 或 ~/.bashrc(Linux) 中加入 ```bash export JAVA_HOME="/usr/lib/jvm/java-8-oracle" export HADOOP_HOME=/opt/hadoop-2.7.7 export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin ``` 編輯好執行 `source ~/.bashrc` #### Step3: 測試 Hadoop 安裝後預設就是單機模式,測試時可直接使用設定檔 ```bash # 查版本 hadoop version # 複製設定檔 cp -r $HADOOP_HOME/etc/hadoop /path/to/your/input # 執行內建MR的範例 hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar # 看結果 cat /path/to/your/output/part-r-00000 ``` ### Hadoop Cluster ### 參考: https://hadoop.apache.org/docs/r2.7.7/hadoop-project-dist/hadoop-common/ClusterSetup.html **core-site.xml** ```xml <configuration> <property> <name>fs.default.name</name> <value>hdfs://localhost:9000</value> </property> <property> <name>hadoop.tmp.dir</name> <value>file:/opt/hadoop-2.7.7/tmp</value> </property> </configuration> ``` `fs.default.name` - HDFS節點的URL `hadoop.tmp.dir` - 存放一些namenode,datanode的資料,預設為/tmp **hdfs-site.xml** ```xml <configuration> <property> <name>dfs.replication</name> <value>1</value> </property> <property> <name>dfs.namenode.name.dir</name> <value>/home/hdfs/name</value> </property> <property> <name>dfs.datanode.data.dir</name> <value>/home/hdfs/data</value> </property> <property> <name>dfs.http.address</name> <value>localhost:50070</value> </property> </configuration> ``` `dfs.replication` - 系統文件塊的數據備份數,一般不大於從機個數 `dfs.name.dir` - NameNode的數據存放的本地路徑 `dfs.data.dir` - DataNode的數據存放的本地路徑 `dfs.http.address` - NameNode的tracker頁面監聽地址端口 **執行** ```bash # 格式化namenode hadoop namenode -format # 啟動dfs ./sbin/start-dfs.sh # 啟動yarn ./sbin/start-yarn.sh ``` **瀏覽器** Hadoop Apps: http://localhost:8088 Hadoop Browsing: http://localhost:50070 **註**: * 執行 `start-yarn.sh` 若發生`JAVA_HOME`找不到,請至`hadoop-env.sh`設定