Details
Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Description
It would be very useful to have a command that takes an HDFS directory, uses fsck to find the block locations of the data files in that directory, groups the blocks by host, and displays the distribution graphically. We did this by hand, and it was very useful for finding a skewed distribution that was causing performance problems. The tool should also be able to group by rack ID and generate a similar plot.
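A rough sketch of the grouping step, in Python, to make the request concrete. It is not an existing Hadoop tool. The function names `count_replicas` and `render_bars` are hypothetical, and the parsing assumes that each block line in the fsck output contains `blk_` and lists replica locations as `ip:port`, optionally prefixed with a rack path such as `/rackA/`. The exact fsck output format differs between Hadoop versions, so the regular expression may need adjusting.

```python
import re
from collections import Counter

# Assumption: a replica location looks like "10.0.0.1:50010", optionally
# preceded by a rack path such as "/rackA/". Adjust for your Hadoop version.
LOCATION = re.compile(
    r"(?P<rack>(?:/[\w.-]+)+/)?(?P<host>\d{1,3}(?:\.\d{1,3}){3}):\d+"
)


def count_replicas(fsck_lines, by_rack=False):
    """Count block replicas per host, or per rack when by_rack is True."""
    counts = Counter()
    for line in fsck_lines:
        if "blk_" not in line:
            continue  # skip file headers and summary lines
        for match in LOCATION.finditer(line):
            if by_rack:
                key = (match.group("rack") or "/unknown/").rstrip("/")
            else:
                key = match.group("host")
            counts[key] += 1
    return counts


def render_bars(counts, width=40):
    """Render the counts as a text bar chart, largest count first."""
    if not counts:
        return ""
    peak = max(counts.values())
    label_width = max(len(key) for key in counts)
    rows = []
    for key, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        bar = "#" * max(1, round(n * width / peak))
        rows.append(f"{key.ljust(label_width)} {bar} {n}")
    return "\n".join(rows)


if __name__ == "__main__":
    import sys

    # Read fsck output from standard input and print both groupings.
    lines = sys.stdin.read().splitlines()
    print(render_bars(count_replicas(lines)))
    print()
    print(render_bars(count_replicas(lines, by_rack=True)))
```

One way to try it would be to pipe the output of `hdfs fsck <dir> -files -blocks -locations -racks` into the script (saved under any name). A host or rack whose bar is much longer than the others would indicate the kind of skew described above.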