Hadoop pseudo-distributed mode

Virtual machine creation and basic Linux configuration are skipped; this note records the key configuration for building a pseudo-distributed Hadoop cluster on a single node.

Downloading and extracting the Hadoop binary package is also skipped.

The following configuration is required in all modes.

/etc/profile:

export JAVA_HOME=/opt/apps/jdk
export CLASSPATH=.:${JAVA_HOME}/lib
export PATH=$PATH:$JAVA_HOME/bin

export HADOOP_HOME=/opt/apps/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS=-Djava.library.path=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
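The exports above can be sanity-checked with a minimal sketch like the following (the paths are the ones used throughout this guide; on the real machine you would simply run `source /etc/profile` after editing):

```shell
# Set the variables exactly as /etc/profile does in this guide
export JAVA_HOME=/opt/apps/jdk
export HADOOP_HOME=/opt/apps/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# Verify that the Hadoop directories actually landed on PATH
echo "$PATH" | grep -q "$HADOOP_HOME/bin" && echo "PATH ok"
```

Once Hadoop is unpacked at $HADOOP_HOME, `hadoop version` is the quickest way to confirm the shell can find it.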

hadoop-env.sh:

JAVA_HOME must be set explicitly here; the default JAVA_HOME=${JAVA_HOME} may not resolve when the Hadoop scripts run.

export JAVA_HOME=/opt/apps/jdk

Configure the HADOOP_HOME environment variables here as well for convenience:

export HADOOP_HOME=/opt/apps/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS=-Djava.library.path=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Characteristic of pseudo-distributed mode: all services run on one machine, and either the local file system or the distributed file system (HDFS) can be used.


core-site.xml:

fs.defaultFS = hdfs://Master:9000    (default file system URI)
hadoop.tmp.dir = /opt/workspace/hadoop    (Hadoop working directory)
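As an XML fragment, the two properties above would look like this in core-site.xml (the hostname Master is assumed to be mapped in /etc/hosts):

```xml
<configuration>
  <!-- Default file system: the NameNode RPC address -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://Master:9000</value>
  </property>
  <!-- Base directory for Hadoop's temporary and metadata files -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/workspace/hadoop</value>
  </property>
</configuration>
```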

hdfs-site.xml:

dfs.replication = 1    (default replication factor; it can be overridden by the HDFS client, and a pseudo-distributed single node does not need multiple replicas)
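The corresponding hdfs-site.xml fragment:

```xml
<configuration>
  <!-- Single node: one replica per block is enough -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```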

mapred-site.xml:

mapreduce.framework.name = yarn    (the framework MapReduce jobs run on)
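The corresponding mapred-site.xml fragment:

```xml
<configuration>
  <!-- Run MapReduce jobs on YARN instead of the local runner -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```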

yarn-site.xml:

yarn.resourcemanager.hostname = Master    (hostname of the ResourceManager)
yarn.nodemanager.aux-services = mapreduce_shuffle    (auxiliary service for the MapReduce shuffle)
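The corresponding yarn-site.xml fragment (again assuming the hostname Master from this guide):

```xml
<configuration>
  <!-- Hostname of the ResourceManager -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>Master</value>
  </property>
  <!-- Shuffle service required by MapReduce on YARN -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```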

Format the NameNode: hadoop namenode -format (in newer releases this command is deprecated in favor of hdfs namenode -format).

