YARN multi-tenant configuration management (CapacityScheduler)
Hadoop version: 2.7
One: Before enabling multi-tenancy, there is only the single default queue.
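To confirm the starting state, the existing queues can be listed from any client node; a minimal check, assuming the Hadoop client commands are on the PATH:

# List the queues currently known to the ResourceManager (only "default" before this change).
mapred queue -list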
Two: Modify the configuration files
yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.monitor.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>FoxLog17.engine.wx</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>4</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>16000</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.fs.state-store.uri</name>
    <value>/tmp/yarn/system/rmstore</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>6192</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1280</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>607800</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-check-interval-seconds</name>
    <value>10800</value>
  </property>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/hdpdata/hadoop/tmp/nm-local-dirs</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/hdpdata/hadoop/tmp/nm-log-dirs</value>
  </property>
  <property>
    <name>yarn.nodemanager.log.retain-seconds</name>
    <value>10800</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/tmp/logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
    <value>logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.delete.debug-delay-sec</name>
    <value>600</value>
  </property>
  <property>
    <name>yarn.acl.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.admin.acl</name>
    <value>hadp</value>
  </property>
</configuration>
mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>10</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>FoxLog17.engine.wx:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>FoxLog17.engine.wx:19888</value>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1536M</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx3584M</value>
  </property>
  <property>
    <name>mapred.child.env</name>
    <value>LD_LIBRARY_PATH=/home/hadoop/lzo/lib</value>
  </property>
  <property>
    <name>mapreduce.cluster.acls.enabled</name>
    <value>true</value>
  </property>
</configuration>
capacity-scheduler.xml
<configuration>
  <property>
    <name>yarn.scheduler.capacity.maximum-applications</name>
    <value>10000</value>
    <description>Maximum number of applications that can be pending and running.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>0.1</value>
    <description>Maximum percent of resources in the cluster which can be used to run
    application masters, i.e. controls the number of concurrently running applications.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
    <description>The ResourceCalculator implementation to be used to compare Resources in the scheduler.
    The default, DefaultResourceCalculator, only uses Memory, while DominantResourceCalculator
    uses the dominant resource to compare multi-dimensional resources such as Memory, CPU etc.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default,analysis</value>
    <description>The queues at this level (root is the root queue).</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>70</value>
    <description>Default queue target capacity.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
    <value>1.4</value>
    <description>Default queue user limit, a percentage from 0.0 to 1.0.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
    <value>100</value>
    <description>The maximum capacity of the default queue.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.state</name>
    <value>RUNNING</value>
    <description>The state of the default queue. State can be one of RUNNING or STOPPED.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
    <value>*</value>
    <description>The ACL of who can submit jobs to the default queue.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
    <value>*</value>
    <description>The ACL of who can administer jobs on the default queue.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.node-locality-delay</name>
    <value>40</value>
    <description>Number of missed scheduling opportunities after which the CapacityScheduler
    attempts to schedule rack-local containers. Typically this should be set to the number of nodes
    in the cluster; by default it is set to 40, approximately the number of nodes in one rack.</description>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.analysis.capacity</name>
    <value>30</value>
  </property>
  <!-- user-limit-factor: a multiple of the queue capacity that a single user may consume; if the
       value is less than 1, the user is confined to this queue's own resources and will not take
       much idle capacity from other queues. -->
  <property>
    <name>yarn.scheduler.capacity.root.analysis.user-limit-factor</name>
    <value>1.9</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.analysis.maximum-capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.analysis.state</name>
    <value>RUNNING</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.analysis.acl_submit_applications</name>
    <value>*</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.analysis.acl_administer_queue</name>
    <value>*</value>
  </property>
</configuration>
The explanations of the configuration above are only my personal understanding; compare them with the official documentation.
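Changes to yarn-site.xml and mapred-site.xml take effect only after the YARN daemons are restarted (queue changes in capacity-scheduler.xml can instead be refreshed as shown in the next step). A minimal sketch, assuming $HADOOP_HOME is set and the node names are placeholders:

# Distribute the edited files to every node, then restart YARN.
for host in node01 node02 node03; do
  scp $HADOOP_HOME/etc/hadoop/{yarn-site.xml,mapred-site.xml,capacity-scheduler.xml} $host:$HADOOP_HOME/etc/hadoop/
done
$HADOOP_HOME/sbin/stop-yarn.sh
$HADOOP_HOME/sbin/start-yarn.sh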
Three: Update the YARN queue configuration
yarn rmadmin -refreshQueues
View the UI: (the original post showed a screenshot of the ResourceManager scheduler page with the default and analysis queues here)
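After running the refresh command above, the new queue layout can be verified from the command line or in the web UI; the URL below assumes the default ResourceManager web port 8088:

# Verify that both the default and analysis queues are now visible.
mapred queue -list
# The scheduler page of the web UI shows the same queue tree:
#   http://FoxLog17.engine.wx:8088/cluster/scheduler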
Four: Specify the queue when submitting a task (the recommended approach; other methods exist but are not listed)
MapReduce: specify the queue in the code;
config.set("mapred.job.queue.name", "analysis");
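If the job driver goes through ToolRunner/GenericOptionsParser, the queue can also be passed on the command line instead of being hard-coded; a sketch in which the jar name, driver class, and paths are placeholders:

# Submit a MapReduce job directly to the analysis queue.
hadoop jar my-job.jar com.example.MyDriver \
  -D mapreduce.job.queuename=analysis \
  /input /output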
Hive: modify the configuration file; because Hive generally serves as the OLAP platform, its queue can be fixed:
hive-site.xml
<property>
  <name>mapreduce.job.queuename</name>
  <value>analysis</value>
</property>
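If pinning the queue globally is too rigid, the same property can be overridden for a single invocation; a sketch where your_query.hql is a placeholder:

# Run one Hive script against the analysis queue without touching hive-site.xml.
hive --hiveconf mapreduce.job.queuename=analysis -f your_query.hql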
Spark: specify the queue in the submit script or in the code.
1- Script mode (full spark-submit example after this list)
--queue analysis
2- Code method
sparkConf.set("spark.yarn.queue", "your_queue_name")
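For the script mode, a minimal spark-submit sketch; the application class, jar name, and deploy mode are placeholders:

# Submit a Spark application to the analysis queue on YARN.
spark-submit --master yarn --deploy-mode cluster \
  --queue analysis \
  --class com.example.YourApp your-app.jar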
Five: Points to note
1- Resources a queue has lent out are returned only as the borrowing containers are reclaimed; reclamation happens container by container.
2- Queues can share idle resources up to their maximum-capacity limit, usually in combination with the user-limit-factor parameter (its default is 1, which keeps a single user within the queue's own capacity, so it is generally set higher). For example, with the analysis queue at 30% capacity and user-limit-factor 1.9, one user can grow to 1.9 × 30% = 57% of the cluster, but is capped by the queue's maximum-capacity of 50%.
3- The CapacityScheduler cannot eliminate task waiting caused by a genuine shortage of resources.
4- If resources are insufficient, fewer containers are started. (For example, if a task asks for four containers but there is only room for two, those two are started first; as resources are released, the remaining containers are started until the expected four are running.)