###### tags: `class` `資料探勘` `Hadoop`

# MIS Graduate Institute (Data Mining)

```
sudo apt-get update
```
```
sudo apt-get install default-jdk
[Y/n] Y
```
java -version
> java version 1.7.0_201

- java: update-alternatives --display java
- ssh: sudo apt-get install ssh
- rsync: sudo apt-get install rsync

Generate an SSH key pair for the passwordless authentication used later:
```
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
```
dsa (book):
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
==rsa== (video):
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Hadoop Download
---
https://archive.apache.org/dist/hadoop/common/

The book uses the 2.6.4 binary version (probably the first [tar.gz](https://archive.apache.org/dist/hadoop/common/hadoop-2.6.4/) on that page), chosen to stay compatible with **Spark 2.0** later on. Copy the link to the archive **without** src in its name, then in the VM terminal type wget and paste it:

wget https://archive.apache.org/dist/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz

To change versions, just edit the version number in the link (no promise that every older release is still hosted, but newer ones should all be there; the video uses 3.1.4).

Extract using the exact filename you downloaded, again adjusting the version number if needed:
```
sudo tar -zxvf hadoop-2.6.4.tar.gz
sudo mv hadoop-2.6.4 /usr/local/hadoop
```
![](https://i.imgur.com/JxHYwYq.png)

Check the installation directory:
```
ll /usr/local/hadoop
```

Set the Hadoop environment variables in ~/.bashrc so they are re-applied automatically at every login.

**Stuck at page 68**: editing the contents of ~/.bashrc itself works fine, but nothing happens after saving; the shell only picks up the changes once you source the file (next step).

Paste this block into the file (p.69; the paths follow the video's install location under /home/amrit, so adjust them if you installed to /usr/local/hadoop as above):
```
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
export HADOOP_INSTALL=/home/amrit/hadoop-3.1.4
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
```
source ~/.bashrc

sudo gedit /usr/local/hadoop/etc/hadoop/hadoop-env.sh
```
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
```

sudo gedit /usr/local/hadoop/etc/hadoop/core-site.xml
```
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```

yt (video version)
---
```
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/amrit/hadoop-3.1.4/tmp</value>
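<!-- Hedged note (not from the book or video): this "yt" variant differs from
     the book's core-site.xml above in two ways: it adds hadoop.tmp.dir and
     binds HDFS to port 54310 instead of 9000. Pick one variant and use it
     consistently. fs.default.name still works in Hadoop 2.x/3.x but is the
     deprecated spelling of fs.defaultFS. -->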
  <description>A base for other temporary directories.</description>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system. A URI whose scheme and
  authority determine the FileSystem implementation. The uri's scheme determines
  the config property (fs.SCHEME.impl) naming the FileSystem implementation
  class. The uri's authority is used to determine the host, port, etc. for a
  filesystem.</description>
</property>
```

```
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hdu20/hadoop-3.1.4/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>The name of the default file system. A URI whose scheme and
    authority determine the FileSystem implementation. The uri's scheme
    determines the config property (fs.SCHEME.impl) naming the FileSystem
    implementation class. The uri's authority is used to determine the host,
    port, etc. for a filesystem.</description>
  </property>
</configuration>
```

sudo gedit /usr/local/hadoop/etc/hadoop/yarn-site.xml
```
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
```

sudo gedit /usr/local/hadoop/etc/hadoop/mapred-site.xml
```
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```

sudo gedit /usr/local/hadoop/etc/hadoop/hdfs-site.xml
```
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/usr/local/hadoop/hadoop_data/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/usr/local/hadoop/hadoop_data/hdfs/datanode</value>
</property>
```

---
p.73

Create the HDFS directories and hand ownership of the install to the Hadoop user (the book uses hduser):
```
sudo mkdir -p /usr/local/hadoop/hadoop_data/hdfs/namenode
sudo mkdir -p /usr/local/hadoop/hadoop_data/hdfs/datanode
sudo chown hduser:hduser -R /usr/local/hadoop
```

Format HDFS and start the daemons (answer yes if prompted about host authenticity):
```
hadoop namenode -format
start-dfs.sh
start-yarn.sh
```

[Check and update Python on Linux](https://ubuntuqa.com/zh-tw/article/10336.html)
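After start-dfs.sh and start-yarn.sh finish, a quick sanity check is to confirm the PATH additions from ~/.bashrc actually resolved and that the Java daemons are up. A minimal sketch, assuming the /usr/local/hadoop install path used in this note (jps ships with the JDK):

```shell
# Re-create the PATH additions from ~/.bashrc (install path is an assumption:
# this note's /usr/local/hadoop; change it if you kept the video's
# /home/amrit/hadoop-3.1.4 layout).
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin:$HADOOP_INSTALL/sbin

# Both Hadoop directories should now appear as PATH entries.
echo "$PATH" | tr ':' '\n' | grep hadoop

# If the daemons started, jps lists NameNode, DataNode, SecondaryNameNode,
# ResourceManager and NodeManager; skip silently when jps is unavailable.
if command -v jps >/dev/null 2>&1; then jps; fi
```

If any daemon is missing from the jps output, check the logs under $HADOOP_INSTALL/logs before re-running the start scripts.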