Virtual machine creation and basic Linux configuration are skipped; these notes record only the key configuration for building a pseudo-distributed Hadoop cluster on a single node.
Downloading and extracting the Hadoop binary package is likewise skipped.
All modes require the following configuration:
/etc/profile
export JAVA_HOME=/opt/apps/jdk
export CLASSPATH=.:${JAVA_HOME}/lib
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/opt/apps/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS=-Djava.library.path=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
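After editing /etc/profile, reload it and do a quick sanity check that both the JDK and Hadoop are picked up (assuming the paths above):
source /etc/profile
java -version        # should report the JDK under /opt/apps/jdk
hadoop version       # should report the Hadoop release under /opt/apps/hadoop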
hadoop-env.sh
JAVA_HOME must be configured explicitly; the default export JAVA_HOME=${JAVA_HOME} may not resolve correctly (for example when the daemons are started over SSH).
export JAVA_HOME=/opt/apps/jdk
Also configure the HADOOP_HOME environment variable here for convenience:
export HADOOP_HOME=/opt/apps/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS=-Djava.library.path=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Characteristics: all services run on a single machine; either the local file system or the distributed file system (HDFS) can be used.
core-site.xml
fs.defaultFS = hdfs://Master:9000    URI of the default file system
hadoop.tmp.dir = /opt/workspace/hadoop    Hadoop working (temporary data) directory
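The two properties above correspond to the following minimal core-site.xml sketch (assuming the hostname Master resolves to this machine, e.g. via /etc/hosts):
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://Master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/workspace/hadoop</value>
  </property>
</configuration>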
hdfs-site.xml
dfs.replication = 1    default number of replicas; can be overridden by the HDFS client; a pseudo-distributed single node does not need multiple replicas
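As a minimal hdfs-site.xml sketch:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>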
mapred-site.xml
mapreduce.framework.name = yarn    framework on which MapReduce jobs run
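As a minimal mapred-site.xml sketch:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>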
yarn-site.xml
yarn.resourcemanager.hostname = Master    hostname of the ResourceManager
yarn.nodemanager.aux-services = mapreduce_shuffle    auxiliary service for the MapReduce shuffle
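As a minimal yarn-site.xml sketch:
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>Master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>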
Format the NameNode: hdfs namenode -format (the older hadoop namenode -format still works but is deprecated)
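A typical sequence to format and bring the single-node cluster up, assuming passwordless SSH to localhost is configured and the sbin directory is on the PATH as above:
hdfs namenode -format    # run only once; reformatting wipes HDFS metadata
start-dfs.sh             # starts NameNode, DataNode, SecondaryNameNode
start-yarn.sh            # starts ResourceManager, NodeManager
jps                      # verify that all daemons are running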