How to find the physical location of an HDFS block

This is not a full-size post, just a quick tip that you may find helpful in some situations:

Every now and then you need to find the physical location of a data block (mainly, which DataNode it resides on).

There is more than one way to get this information, but the one I use is the HDFS WebUI. It works on Apache Hadoop and on Cloudera (I didn’t check other distributions).

Point your browser to: http://[your active NameNode]:50070 (50070 is the default NameNode HTTP port in Hadoop 2.x; Hadoop 3.x uses 9870 by default).
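If you have an HA pair and are not sure which NameNode is the active one, here is a minimal sketch that asks each candidate through its JMX endpoint. It assumes the NameNodeStatus bean is exposed under /jmx and reports a State field of "active" or "standby"; the host names and the helper names are hypothetical, and the port should match your setup.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def namenode_status_url(host, port=50070):
    """Build the JMX query URL for the NameNodeStatus bean (port is an assumption)."""
    query = urlencode({"qry": "Hadoop:service=NameNode,name=NameNodeStatus"})
    return f"http://{host}:{port}/jmx?{query}"


def is_active(jmx_payload):
    """Return True if any bean in the parsed JMX response reports State == 'active'."""
    beans = jmx_payload.get("beans", [])
    return any(bean.get("State") == "active" for bean in beans)


def find_active_namenode(hosts, port=50070, timeout=5):
    """Return the first host that reports itself as active, or None if none does."""
    for host in hosts:
        try:
            with urlopen(namenode_status_url(host, port), timeout=timeout) as resp:
                if is_active(json.load(resp)):
                    return host
        except (OSError, ValueError):
            # Unreachable host or unexpected response: try the next candidate.
            continue
    return None
```

For example, find_active_namenode(["nn1.example.com", "nn2.example.com"]) would return whichever of the two hypothetical hosts answers as active.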

You will see the NameNode web page.

Open the last menu item, “Utilities”, and choose “Browse the file system”.

This will show you all the directories in your HDFS. Browse until you get to the file you want:

The listing also shows the replication factor of the file, which is 3 in this example.
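The same listing can be fetched without a browser through the WebHDFS REST API, assuming WebHDFS is enabled on your cluster. The sketch below builds a LISTSTATUS request and pulls the name, type, replication factor and block size out of the JSON response. The function names, host and path are mine and only for illustration.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def liststatus_url(namenode, hdfs_path, port=50070):
    """Build a WebHDFS LISTSTATUS URL for a directory (port is an assumption)."""
    return f"http://{namenode}:{port}/webhdfs/v1{quote(hdfs_path)}?op=LISTSTATUS"


def summarize_listing(payload):
    """Turn a parsed LISTSTATUS response into (name, type, replication, blockSize) rows."""
    rows = []
    for status in payload["FileStatuses"]["FileStatus"]:
        rows.append(
            (
                status["pathSuffix"],
                status["type"],         # "FILE" or "DIRECTORY"
                status["replication"],  # reported as 0 for directories
                status["blockSize"],
            )
        )
    return rows


def list_directory(namenode, hdfs_path, port=50070, timeout=10):
    """Fetch and summarize a directory listing from the NameNode."""
    with urlopen(liststatus_url(namenode, hdfs_path, port), timeout=timeout) as resp:
        return summarize_listing(json.load(resp))
```

Calling list_directory("nn1.example.com", "/user/me") against a hypothetical NameNode would give one row per entry, with the replication factor in the third position.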

Clicking the file name brings up the file information window.

The “Block information” drop-down list lets you choose which block of the file you want to see (if the file spans more than one block).

Then, under “Availability”, you can see the list of DataNodes that hold a copy of this block.
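As one of the other ways mentioned earlier, the block-to-DataNode mapping can also be read programmatically. The sketch below uses the WebHDFS GETFILEBLOCKLOCATIONS operation, which is documented for recent Hadoop releases; I am assuming your version exposes it, so treat this as a starting point rather than something guaranteed to work everywhere. The function names, host and path are hypothetical. From the command line, hdfs fsck with the -files -blocks -locations options reports similar information.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def block_locations_url(namenode, hdfs_path, port=50070):
    """Build a WebHDFS GETFILEBLOCKLOCATIONS URL for a file (port is an assumption)."""
    return (
        f"http://{namenode}:{port}/webhdfs/v1{quote(hdfs_path)}"
        "?op=GETFILEBLOCKLOCATIONS"
    )


def blocks_to_hosts(payload):
    """Map each block of the file to (offset, length, [DataNode hosts])."""
    result = []
    for location in payload["BlockLocations"]["BlockLocation"]:
        result.append(
            (location["offset"], location["length"], list(location["hosts"]))
        )
    return result


def fetch_block_hosts(namenode, hdfs_path, port=50070, timeout=10):
    """Ask the NameNode where the blocks of hdfs_path are stored."""
    with urlopen(block_locations_url(namenode, hdfs_path, port), timeout=timeout) as resp:
        return blocks_to_hosts(json.load(resp))
```

Each tuple corresponds to one entry of the block drop-down list in the WebUI: the offset identifies the block within the file, and the host list matches what the WebUI shows under “Availability”.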

