YARN multi-tenant management

YARN multi-tenant queue configuration with the CapacityScheduler.

Hadoop version: 2.7

One: Before the multi-tenant setup there is only one queue, default

(Screenshot: ResourceManager web UI showing only the default queue.)

Two: Configuration file modifications

yarn-site.xml

<configuration>

    <property>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.monitor.enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>FoxLog17.engine.wx</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>4</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>16000</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-pmem-ratio</name>
        <value>2.1</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore</value>
    </property>
    <property>
        <name>yarn.resourcemanager.fs.state-store.uri</name>
        <value>/tmp/yarn/system/rmstore</value>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>6192</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>1280</value>
    </property>
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>

    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>607800</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-check-interval-seconds</name>
        <value>10800</value>
    </property>
    <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/hdpdata/hadoop/tmp/nm-local-dirs</value>
    </property>
    <property>
        <name>yarn.nodemanager.log-dirs</name>
        <value>/hdpdata/hadoop/tmp/nm-log-dirs</value>
    </property>
    <property>
        <name>yarn.nodemanager.log.retain-seconds</name>
        <value>10800</value>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/tmp/logs</value>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
        <value>logs</value>
    </property>
    <property>
        <name>yarn.nodemanager.delete.debug-delay-sec</name>
        <value>600</value>
    </property>

    <property>
        <name>yarn.acl.enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.admin.acl</name>
        <value>hadp</value>
    </property>

</configuration>

mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>

    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.reduce.shuffle.parallelcopies</name>
        <value>10</value>
    </property>

    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>FoxLog17.engine.wx:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>FoxLog17.engine.wx:19888</value>
    </property>

    <property>
        <name>mapreduce.map.memory.mb</name>
        <value>2048</value>
    </property>
    <property>
        <name>mapreduce.map.java.opts</name>
        <value>-Xmx1536M</value>
    </property>
    <property>
        <name>mapreduce.reduce.memory.mb</name>
        <value>4096</value>
    </property>
    <property>
        <name>mapreduce.reduce.java.opts</name>
        <value>-Xmx3584M</value>
    </property>
    <property>
        <name>mapred.child.env</name>
        <value>LD_LIBRARY_PATH=/home/hadoop/lzo/lib</value>
    </property>

    <property>
        <name>mapreduce.cluster.acls.enabled</name>
        <value>true</value>
    </property>

</configuration>
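A note on how these memory settings fit together (my own reading, not something stated in the original post): each -Xmx is kept below its container size so the JVM's non-heap overhead still fits inside the container (a 1536M heap in a 2048 MB map container, a 3584M heap in a 4096 MB reduce container). And since yarn-site.xml sets yarn.scheduler.minimum-allocation-mb to 1280, the CapacityScheduler should, as far as I recall, round each request up to a multiple of that value, roughly:

map    request 2048 MB -> allocated 2560 MB (2 x 1280 MB)
reduce request 4096 MB -> allocated 5120 MB (4 x 1280 MB)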



capacity-scheduler.xml

<configuration>

    <property>
        <name>yarn.scheduler.capacity.maximum-applications</name>
        <value>10000</value>
        <description>Maximum number of applications that can be pending and running.</description>
    </property>

    <property>
        <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
        <value>0.1</value>
        <description>Maximum percent of resources in the cluster which can be used to run
        application masters, i.e. controls the number of concurrently running applications.</description>
    </property>

    <property>
        <name>yarn.scheduler.capacity.resource-calculator</name>
        <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
        <description>The ResourceCalculator implementation used to compare resources in the scheduler.
        The default, DefaultResourceCalculator, only uses memory, while DominantResourceCalculator uses
        the dominant resource to compare multi-dimensional resources such as memory and CPU.</description>
    </property>

    <property>
        <name>yarn.scheduler.capacity.root.queues</name>
        <value>default,analysis</value>
        <description>The queues at this level (root is the root queue).</description>
    </property>

    <property>
        <name>yarn.scheduler.capacity.root.default.capacity</name>
        <value>70</value>
        <description>Default queue target capacity.</description>
    </property>

    <property>
        <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
        <value>1.4</value>
        <description>Multiple of the queue capacity that a single user may consume (default is 1).</description>
    </property>

    <property>
        <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
        <value>100</value>
        <description>The maximum capacity of the default queue.</description>
    </property>

    <property>
        <name>yarn.scheduler.capacity.root.default.state</name>
        <value>RUNNING</value>
        <description>The state of the default queue. State can be one of RUNNING or STOPPED.</description>
    </property>

    <property>
        <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
        <value>*</value>
        <description>The ACL of who can submit jobs to the default queue.</description>
    </property>

    <property>
        <name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
        <value>*</value>
        <description>The ACL of who can administer jobs on the default queue.</description>
    </property>

    <property>
        <name>yarn.scheduler.capacity.node-locality-delay</name>
        <value>40</value>
        <description>Number of missed scheduling opportunities after which the CapacityScheduler
        attempts to schedule rack-local containers. Typically this should be set to the number of
        nodes in the cluster; by default it is set to approximately the number of nodes in one rack,
        which is 40.</description>
    </property>

    <property>
        <name>yarn.scheduler.capacity.root.analysis.capacity</name>
        <value>30</value>
    </property>

    <property>
        <name>yarn.scheduler.capacity.root.analysis.user-limit-factor</name>
        <value>1.9</value>
        <description>Multiple of the queue capacity that a single user is allowed to grow to.
        If the value is less than 1, a user stays within this queue's own share and does not
        take a large amount of other queues' idle resources.</description>
    </property>

    <property>
        <name>yarn.scheduler.capacity.root.analysis.maximum-capacity</name>
        <value>50</value>
    </property>

    <property>
        <name>yarn.scheduler.capacity.root.analysis.state</name>
        <value>RUNNING</value>
    </property>

    <property>
        <name>yarn.scheduler.capacity.root.analysis.acl_submit_applications</name>
        <value>*</value>
    </property>

    <property>
        <name>yarn.scheduler.capacity.root.analysis.acl_administer_queue</name>
        <value>*</value>
    </property>

</configuration>

The explanations above reflect my own understanding of these settings; compare them with the official documentation.
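As a quick check of the numbers, reading user-limit-factor as a multiple of the queue's configured capacity (again my interpretation, to be verified against the official docs):

root queues:    default 70% + analysis 30% = 100% (capacities of sibling queues must sum to 100)
default queue:  70% x user-limit-factor 1.4 = 98% of the cluster for a single user, within maximum-capacity 100%
analysis queue: 30% x user-limit-factor 1.9 = 57%, but maximum-capacity 50% caps the queue (and any user in it) at 50%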

Three: Update the YARN queue configuration

yarn rmadmin -refreshQueues

The new analysis queue should now appear next to default in the ResourceManager web UI (screenshot omitted).

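The same check can be done programmatically. A minimal sketch using the standard YarnClient API (the class name ListQueues is just an illustration; it assumes yarn-site.xml is on the classpath so the client can find the ResourceManager):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.QueueInfo;
import org.apache.hadoop.yarn.client.api.YarnClient;

public class ListQueues {
    public static void main(String[] args) throws Exception {
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(new Configuration());   // reads yarn-site.xml from the classpath
        yarnClient.start();
        // getAllQueues() returns every queue known to the ResourceManager
        for (QueueInfo q : yarnClient.getAllQueues()) {
            System.out.printf("queue=%s capacity=%.2f maximum-capacity=%.2f state=%s%n",
                    q.getQueueName(), q.getCapacity(), q.getMaximumCapacity(), q.getQueueState());
        }
        yarnClient.stop();
    }
}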
Four: Specify the queue when submitting a job (the recommended approaches; other methods exist and are not listed)

mapreduce: specify the queue in code;

config.set("mapred.job.queue.name", "analysis");
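In Hadoop 2.x the canonical property name is mapreduce.job.queuename; mapred.job.queue.name is the older, deprecated alias and still works. A minimal driver sketch, assuming a plain word-count style job (the class name, mapper/reducer wiring and paths are placeholders, not from the original post):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class QueueDemoDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Route this job to the analysis queue instead of default
        conf.set("mapreduce.job.queuename", "analysis");

        Job job = Job.getInstance(conf, "queue-demo");
        job.setJarByClass(QueueDemoDriver.class);
        // job.setMapperClass(...); job.setReducerClass(...); etc. omitted
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}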

hive: modify the configuration file; since Hive is generally used as the OLAP platform, its jobs can be pinned to a queue:

hive-site.xml

<property>
    <name>mapreduce.job.queuename</name>
    <value>analysis</value>
</property>

spark: specify the queue in the submit script or in code

1- Script mode

--queue analysis

2- Code mode

sparkConf.set("spark.yarn.queue", "your_queue_name")
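The same thing from Java, as a sketch (the app name and queue value are placeholders); spark.yarn.queue is read when the application is submitted to YARN, so it should be set before the context is created:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkQueueDemo {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("queue-demo")
                .set("spark.yarn.queue", "analysis");   // must be set before the context exists
        JavaSparkContext sc = new JavaSparkContext(conf);
        // ... job logic ...
        sc.stop();
    }
}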

Five: Points to note

1- Resources lent to another queue are only returned as the borrowed containers are reclaimed; reclamation happens container by container.

2- Resources can be shared across queues, up to each queue's maximum capacity; this is usually tuned together with the user-limit-factor parameter (its default is 1, which keeps a user from taking much capacity beyond the queue's own share, so it is generally set higher).

3- The CapacityScheduler cannot eliminate task waiting caused by an overall shortage of resources.

4- If resources are insufficient, fewer containers are started. (For example, a task that wants four containers but only has resources for two will start with two; as resources are released, it grows to the expected four.)





