After successfully upgrading Cloudera Manager in part 1, we are ready to upgrade CDH itself.
For those with an enterprise license, Cloudera supports a rolling upgrade, in which you do not have to shut down all your nodes at once and the cluster keeps functioning while you upgrade. Unfortunately, I do not have such a license, so I will have to perform a full cluster shutdown.
I ran this procedure on a simple cluster that is neither secured nor highly available. A secured or HA cluster may need some extra steps. I also used parcels rather than packages.
First, Cloudera recommends running the host inspector on all hosts and fixing any problems it finds.
Next, we will enter maintenance mode to suppress system alerts during the upgrade process. This is done from the cluster’s actions menu.
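If you would rather script this step than click through the menu, the Cloudera Manager REST API has cluster commands for entering and exiting maintenance mode. Here is a minimal sketch that only builds the URLs to POST to. The host name, port, API version and cluster name are placeholders I made up for illustration, so check which API version your Cloudera Manager supports before relying on it.

```python
from urllib.parse import quote


def maintenance_mode_url(base_url, api_version, cluster_name, enter=True):
    """Build the Cloudera Manager API URL for a maintenance mode command.

    The URL is meant to be called with an HTTP POST and admin credentials.
    base_url, api_version and cluster_name are placeholders (assumptions),
    not values taken from a real cluster.
    """
    command = "enterMaintenanceMode" if enter else "exitMaintenanceMode"
    # Cluster names may contain spaces, so percent-encode the name.
    cluster = quote(cluster_name, safe="")
    return f"{base_url.rstrip('/')}/api/{api_version}/clusters/{cluster}/commands/{command}"


# Hypothetical values, for illustration only.
enter_url = maintenance_mode_url("http://cm-host:7180", "v12", "Cluster 1")
exit_url = maintenance_mode_url("http://cm-host:7180", "v12", "Cluster 1", enter=False)
```

The sketch deliberately stops at building the URL; sending the POST with your HTTP client of choice (and your admin user) is left out so nothing here touches a live cluster by accident.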
We now have to back up the NameNode data directory, which contains the HDFS metadata. Go to the HDFS service, open its configuration, and scroll down until you find NameNode Data Directories. Then log in to the NameNode server and back up this directory. If there is more than one, just pick one of them (they are all identical).
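The backup itself can be as simple as archiving the directory. Below is a minimal sketch, to be run on the NameNode server, that packs one directory into a timestamped tar.gz file. The function name and the paths in the usage comment are my own placeholders; use the directory shown in your HDFS configuration.

```python
import tarfile
import time
from pathlib import Path


def backup_directory(source_dir, backup_root):
    """Archive source_dir into a timestamped .tar.gz under backup_root."""
    source = Path(source_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"{source} is not a directory")

    target_root = Path(backup_root)
    target_root.mkdir(parents=True, exist_ok=True)

    # The timestamp keeps repeated backups from overwriting each other.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = target_root / f"{source.name}-{stamp}.tar.gz"

    with tarfile.open(str(archive), "w:gz") as tar:
        # arcname stores paths relative to the directory name
        # instead of the full absolute path.
        tar.add(str(source), arcname=source.name)
    return archive


# Example call with hypothetical paths (replace with your own):
# backup_directory("/dfs/nn", "/root/namenode-backup")
```

Run it as a user that can read the NameNode directory, and keep the archive somewhere outside the data directory itself.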
You should also back up the Hive, Sentry, and Sqoop metastore databases.
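How you back these up depends on which database engine hosts them. As an illustration only, here is a sketch that assumes the metastores live in MySQL (an assumption on my part, your setup may use something else) and builds a mysqldump command per database. The database names and user are hypothetical, and the sketch only builds the commands without running them.

```python
from pathlib import Path


def mysqldump_command(database, user, output_dir):
    """Build a mysqldump command (as an argument list) for one database.

    Assumes a MySQL-hosted metastore; database, user and output_dir
    are placeholders supplied by the caller.
    """
    output_file = Path(output_dir) / f"{database}-backup.sql"
    return [
        "mysqldump",
        f"--user={user}",
        "--password",            # no value given, so mysqldump prompts for it
        "--single-transaction",  # consistent snapshot for InnoDB tables
        f"--result-file={output_file}",
        "--databases",
        database,
    ]


# Hypothetical database names, for illustration only.
commands = [
    mysqldump_command(name, "root", "/root/metastore-backup")
    for name in ("metastore", "sentry", "sqoop")
]
```

Each list can be handed to subprocess.run on the database host once you have checked the names against your own installation.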
Now let the fun part begin! Start the upgrade wizard by choosing “upgrade cluster” from the cluster actions menu.
The wizard runs some checks and asks for a few confirmations, and then the upgrade itself begins.
This time the host inspector does not complain as it did after the Cloudera Manager upgrade; the upgrade fixed the version incompatibilities.
The cluster will now restart automatically and apply configuration changes.
And voilà! A few minutes later we have an upgraded cluster.
Notice some enhancements in the new version, such as enabling Hive to run on Spark (a new feature in Hadoop, but here it is easily done with Cloudera Manager) and the HiveServer2 web UI. I will explore those in a later post.
That’s it! Don’t forget to exit maintenance mode!