YARN multi-tenant configuration management (CapacityScheduler)
Hadoop version: 2.7
One: Before enabling multi-tenancy, there is only the single default queue.
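To confirm the starting state, the existing queues can be listed from any client node; a minimal check, assuming the Hadoop client commands are on the PATH:

# List the queues currently known to the ResourceManager (only "default" before this change).
mapred queue -list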
Two: Modify the configuration files
yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.monitor.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>FoxLog17.engine.wx</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>4</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>16000</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.fs.state-store.uri</name>
    <value>/tmp/yarn/system/rmstore</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>6192</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1280</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>607800</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-check-interval-seconds</name>
    <value>10800</value>
  </property>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/hdpdata/hadoop/tmp/nm-local-dirs</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/hdpdata/hadoop/tmp/nm-log-dirs</value>
  </property>
  <property>
    <name>yarn.nodemanager.log.retain-seconds</name>
    <value>10800</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/tmp/logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
    <value>logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.delete.debug-delay-sec</name>
    <value>600</value>
  </property>
  <property>
    <name>yarn.acl.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.admin.acl</name>
    <value>hadp</value>
  </property>
</configuration>
mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>10</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>FoxLog17.engine.wx:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>FoxLog17.engine.wx:19888</value>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1536M</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx3584M</value>
  </property>
  <property>
    <name>mapred.child.env</name>
    <value>LD_LIBRARY_PATH=/home/hadoop/lzo/lib</value>
  </property>
  <property>
    <name>mapreduce.cluster.acls.enabled</name>
    <value>true</value>
  </property>
</configuration>
capacity-scheduler.xml
<configuration>
  <property>
    <name>yarn.scheduler.capacity.maximum-applications</name>
    <value>10000</value>
    <description>Maximum number of applications that can be pending and running.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>0.1</value>
    <description>Maximum percent of resources in the cluster which can be used to run
    application masters, i.e. controls the number of concurrently running applications.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
    <description>The ResourceCalculator implementation to be used to compare Resources in the scheduler.
    The default, DefaultResourceCalculator, only uses Memory, while DominantResourceCalculator
    uses the dominant resource to compare multi-dimensional resources such as Memory, CPU etc.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default,analysis</value>
    <description>The queues at this level (root is the root queue).</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>70</value>
    <description>Default queue target capacity.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
    <value>1.4</value>
    <description>Default queue user limit, a percentage from 0.0 to 1.0.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
    <value>100</value>
    <description>The maximum capacity of the default queue.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.state</name>
    <value>RUNNING</value>
    <description>The state of the default queue. State can be one of RUNNING or STOPPED.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
    <value>*</value>
    <description>The ACL of who can submit jobs to the default queue.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
    <value>*</value>
    <description>The ACL of who can administer jobs on the default queue.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.node-locality-delay</name>
    <value>40</value>
    <description>Number of missed scheduling opportunities after which the CapacityScheduler
    attempts to schedule rack-local containers. Typically this should be set to the number of nodes
    in the cluster; by default it is set to 40, approximately the number of nodes in one rack.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.analysis.capacity</name>
    <value>30</value>
  </property>
  <!-- user-limit-factor: a multiple of the queue capacity that a single user may consume; if the
       value is less than 1, the user is confined to this queue's own resources and will not take
       much idle capacity from other queues. -->
  <property>
    <name>yarn.scheduler.capacity.root.analysis.user-limit-factor</name>
    <value>1.9</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.analysis.maximum-capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.analysis.state</name>
    <value>RUNNING</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.analysis.acl_submit_applications</name>
    <value>*</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.analysis.acl_administer_queue</name>
    <value>*</value>
  </property>
</configuration>
The explanations of the configuration above are only my personal understanding; compare them with the official documentation.
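Changes to yarn-site.xml and mapred-site.xml take effect only after the YARN daemons are restarted (queue changes in capacity-scheduler.xml can instead be refreshed as shown in the next step). A minimal sketch, assuming $HADOOP_HOME is set and the node names are placeholders:

# Distribute the edited files to every node, then restart YARN.
for host in node01 node02 node03; do
  scp $HADOOP_HOME/etc/hadoop/{yarn-site.xml,mapred-site.xml,capacity-scheduler.xml} $host:$HADOOP_HOME/etc/hadoop/
done
$HADOOP_HOME/sbin/stop-yarn.sh
$HADOOP_HOME/sbin/start-yarn.sh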
Three: Update the YARN queue configuration
yarn rmadmin -refreshQueues
View the UI: (the original post showed a screenshot of the ResourceManager scheduler page with the default and analysis queues here)
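After running the refresh command above, the new queue layout can be verified from the command line or in the web UI; the URL below assumes the default ResourceManager web port 8088:

# Verify that both the default and analysis queues are now visible.
mapred queue -list
# The scheduler page of the web UI shows the same queue tree:
#   http://FoxLog17.engine.wx:8088/cluster/scheduler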
Four: Specify the queue when submitting a task (the recommended approach; other methods exist but are not listed)
MapReduce: specify the queue in the code;
config.set("mapred.job.queue.name", "analysis");
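If the job driver goes through ToolRunner/GenericOptionsParser, the queue can also be passed on the command line instead of being hard-coded; a sketch in which the jar name, driver class, and paths are placeholders:

# Submit a MapReduce job directly to the analysis queue.
hadoop jar my-job.jar com.example.MyDriver \
  -D mapreduce.job.queuename=analysis \
  /input /output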
Hive: modify the configuration file; because Hive generally serves as the OLAP platform, its queue can be fixed:
hive-site.xml
<property>
  <name>mapreduce.job.queuename</name>
  <value>analysis</value>
</property>
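If pinning the queue globally is too rigid, the same property can be overridden for a single invocation; a sketch where your_query.hql is a placeholder:

# Run one Hive script against the analysis queue without touching hive-site.xml.
hive --hiveconf mapreduce.job.queuename=analysis -f your_query.hql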
Spark: specify the queue in the submit script or in the code.
1- Script mode (full spark-submit example after this list)
--queue analysis
2- Code method
sparkConf.set("spark.yarn.queue", "your_queue_name")
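For the script mode, a minimal spark-submit sketch; the application class, jar name, and deploy mode are placeholders:

# Submit a Spark application to the analysis queue on YARN.
spark-submit --master yarn --deploy-mode cluster \
  --queue analysis \
  --class com.example.YourApp your-app.jar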
Five: Points to note
1- Resources a queue has lent out are returned only as the borrowing containers are reclaimed; reclamation happens container by container.
2- Queues can share idle resources up to their maximum-capacity limit, usually in combination with the user-limit-factor parameter (its default is 1, which keeps a single user within the queue's own capacity, so it is generally set higher). For example, with the analysis queue at 30% capacity and user-limit-factor 1.9, one user can grow to 1.9 × 30% = 57% of the cluster, but is capped by the queue's maximum-capacity of 50%.
3- The CapacityScheduler cannot eliminate task waiting caused by a genuine shortage of resources.
4- If resources are insufficient, fewer containers are started. (For example, if a task asks for four containers but there is only room for two, those two are started first; as resources are released, the remaining containers are started until the expected four are running.)