Hadoop pseudo-distributed mode

Virtual machine creation and basic Linux configuration are skipped; this note records the key configuration for building a pseudo-distributed Hadoop cluster on a single node.

Downloading and extracting the Hadoop binary package is also skipped.

The following configuration is required in all modes.

/etc/profile:

export JAVA_HOME=/opt/apps/jdk
export CLASSPATH=.:${JAVA_HOME}/lib
export PATH=$PATH:$JAVA_HOME/bin

export HADOOP_HOME=/opt/apps/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS=-Djava.library.path=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
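The exports above can be sanity-checked with a minimal sketch like the following (the paths are the ones used throughout this guide; on the real machine you would simply run `source /etc/profile` after editing):

```shell
# Set the variables exactly as /etc/profile does in this guide
export JAVA_HOME=/opt/apps/jdk
export HADOOP_HOME=/opt/apps/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# Verify that the Hadoop directories actually landed on PATH
echo "$PATH" | grep -q "$HADOOP_HOME/bin" && echo "PATH ok"
```

Once Hadoop is unpacked at $HADOOP_HOME, `hadoop version` is the quickest way to confirm the shell can find it.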

hadoop-env.sh:

JAVA_HOME must be set explicitly here; the default JAVA_HOME=${JAVA_HOME} may not resolve when the Hadoop scripts run.

export JAVA_HOME=/opt/apps/jdk

Configure the HADOOP_HOME environment variables here as well for convenience:

export HADOOP_HOME=/opt/apps/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS=-Djava.library.path=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Characteristic of pseudo-distributed mode: all services run on one machine, and either the local file system or the distributed file system (HDFS) can be used.


core-site.xml:

fs.defaultFS = hdfs://Master:9000    (default file system URI)
hadoop.tmp.dir = /opt/workspace/hadoop    (Hadoop working directory)
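As an XML fragment, the two properties above would look like this in core-site.xml (the hostname Master is assumed to be mapped in /etc/hosts):

```xml
<configuration>
  <!-- Default file system: the NameNode RPC address -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://Master:9000</value>
  </property>
  <!-- Base directory for Hadoop's temporary and metadata files -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/workspace/hadoop</value>
  </property>
</configuration>
```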

hdfs-site.xml:

dfs.replication = 1    (default replication factor; it can be overridden by the HDFS client, and a pseudo-distributed single node does not need multiple replicas)
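The corresponding hdfs-site.xml fragment:

```xml
<configuration>
  <!-- Single node: one replica per block is enough -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```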

mapred-site.xml:

mapreduce.framework.name = yarn    (the framework MapReduce jobs run on)
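The corresponding mapred-site.xml fragment:

```xml
<configuration>
  <!-- Run MapReduce jobs on YARN instead of the local runner -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```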

yarn-site.xml:

yarn.resourcemanager.hostname = Master    (hostname of the ResourceManager)
yarn.nodemanager.aux-services = mapreduce_shuffle    (auxiliary service for the MapReduce shuffle)
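The corresponding yarn-site.xml fragment (again assuming the hostname Master from this guide):

```xml
<configuration>
  <!-- Hostname of the ResourceManager -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>Master</value>
  </property>
  <!-- Shuffle service required by MapReduce on YARN -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```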

Format the NameNode: hadoop namenode -format (in newer releases this command is deprecated in favor of hdfs namenode -format).

