I ran the hdfs oiv -p FileDistribution command to analyze file sizes, but an ArrayIndexOutOfBoundsException was thrown and the process terminated. The stack trace:
I looked into the code and found that the exception is thrown when incrementing a count in the distribution array. The cause is that the computed bucket number falls outside the array: it is equal to the distribution's length, which is one past the last valid index.
Here are my steps:
1) The input command parameters:
In the code, numIntervals is computed as 104857600/1024000 = 102 (integer division; the exact quotient is 102.4), so the distribution's length is numIntervals + 1 = 103.
2) The ArrayIndexOutOfBoundsException happens when the file size is in the range ((maxSize/step)*step, maxSize], which with these parameters is (104448000, 104857600]. For example, a file of size 104800000 falls in that range. Its bucket number is calculated as 104800000/1024000, about 102.3, and the code then applies Math.ceil, so the final value is 103. But the distribution's length is also 103, which means the valid indexes are 0 to 102. So the ArrayIndexOutOfBoundsException is thrown.
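The arithmetic in steps 1) and 2) can be reproduced with a small stand-alone sketch. This is only a model of the calculation as described in this report, not the Hadoop source: the class name BucketOverflowDemo and the helpers distributionLength and bucketFor are hypothetical names introduced for illustration.

```java
// Hypothetical stand-alone model of the arithmetic described in this report.
// It is NOT the Hadoop source; all names here are made up for illustration.
final class BucketOverflowDemo {

    // Array length as described: integer division of maxSize by step, plus one.
    static int distributionLength(long maxSize, int step) {
        return (int) (maxSize / step) + 1;
    }

    // Bucket number as described: the exact quotient rounded up with Math.ceil.
    static int bucketFor(long fileSize, int step) {
        return (int) Math.ceil((double) fileSize / step);
    }

    public static void main(String[] args) {
        long maxSize = 104857600L;   // values taken from the report
        int step = 1024000;
        long fileSize = 104800000L;  // lies in ((maxSize/step)*step, maxSize]

        int[] distribution = new int[distributionLength(maxSize, step)];
        int bucket = bucketFor(fileSize, step);
        System.out.println("length=" + distribution.length + ", bucket=" + bucket);

        try {
            ++distribution[bucket];  // bucket equals the array length here
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("out of bounds: " + e.getMessage());
        }
    }
}
```

With the values above, both the array length and the bucket number come out as 103, so the increment goes one past the last valid index, 102.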
In short, the exception happens when maxSize is not evenly divisible by step and the file size is in the range ((maxSize/step)*step, maxSize]. The related logic should be changed from