In the last post, Generating synthetic MapReduce load, I presented my load generating program. This time I will use it to make my cluster busy so we can examine YARN’s capacity scheduler.
The Capacity Scheduler essentially splits the cluster into virtual sub-clusters. It does so by using queues.
Each queue is allocated a slice of the cluster's resources, and each queue can be further divided into sub-queues.
This way you ensure that no single queue can hog the cluster's resources, and different users with different workloads can share the same cluster.
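As a sketch of what such a hierarchy looks like, a queue can itself be split into sub-queues with the standard Capacity Scheduler properties (the queue names here are hypothetical; the percentages within a parent must add up to 100):

```xml
<!-- Hypothetical example: the "prod" queue split into two sub-queues. -->
<property>
  <name>yarn.scheduler.capacity.root.prod.queues</name>
  <value>etl,adhoc</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.prod.etl.capacity</name>
  <value>70</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.prod.adhoc.capacity</name>
  <value>30</value>
</property>
```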
First we need to configure the cluster to use the Capacity Scheduler, since the CDH5 default is the Fair Scheduler.
So we go to the cluster menu and choose “Dynamic Resource Pools”:
We then go to the Configuration tab, then to “Other Settings”, and at the bottom click “Other Fair Scheduler Settings”:
The name is somewhat misleading, since this section also contains settings for schedulers other than the Fair Scheduler.
We change the scheduler class from Fair Scheduler to Capacity Scheduler and configure it in the box below. I did not find any way to change the Capacity Scheduler settings other than editing this XML directly.
These are the settings I used for this example:
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default,guy</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.capacity</name>
    <value>100</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>60</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.guy.capacity</name>
    <value>40</value>
  </property>
</configuration>
I created two queues, “default” and “guy”. I assigned default 60% of the cluster resources and guy the other 40%.
Now we save the changes and restart the cluster, since the configuration is now stale.
Now we are ready to test it.
My cluster has a total of 5 vcores, and I set up the default queue to use up to 60% of the resources and the guy queue to use the remaining 40%.
I used my HiveLoader program from this post. I started two instances of it, one firing 4 threads at the default queue and the other targeting another 4 threads at the guy queue.
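The HiveLoader handles queue selection itself, but for reference, a generic MapReduce job is normally pointed at a queue with the `mapreduce.job.queuename` property (the Hadoop 2 name, which applies to CDH5; it can be set in the job configuration or passed with `-D` on the command line):

```xml
<!-- Submit the job to the "guy" queue instead of the default one. -->
<property>
  <name>mapreduce.job.queuename</name>
  <value>guy</value>
</property>
```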
First we start the instance that uses the guy queue. Looking at the YARN UI, you can see that it uses 3 containers and 3 vcores, along with 3GB of RAM:
Now I start the other instance, which uses the default queue. Soon you can see that the default queue takes up 60% of the resources (3 containers), while the guy queue is reduced to 40% of the resources (2 containers):
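The container counts line up with simple arithmetic: under contention, each queue gets its configured percentage of the cluster's vcores, while an otherwise idle cluster lets a queue elastically grow beyond its guarantee, which is why guy started out with 3 vcores and was squeezed back down to 2 once default demanded its share. A rough sketch (the numbers come from this cluster; the function name is mine, and the real scheduler's rounding and elasticity rules are more involved):

```python
TOTAL_VCORES = 5  # this cluster's total, as above

def guaranteed_vcores(capacity_percent, total=TOTAL_VCORES):
    """Vcores a queue is guaranteed once every queue demands its share."""
    return round(total * capacity_percent / 100)

print(guaranteed_vcores(60))  # default queue under contention -> 3
print(guaranteed_vcores(40))  # guy queue under contention -> 2
```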
There are two other nice ways to see the queue status. One is via the YARN web UI: under the cluster menu, clicking “Scheduler” shows the current queue status relative to the total capacity of the cluster:
The other is in Cloudera manager itself, in the dynamic resource pool page:
This way you can share the cluster's resources between several entities without any one of them hogging them all.