Hadoop HDFS / HDFS-5517

Lower the default maximum number of blocks per file

    Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: namenode
    • Labels: None
    • Hadoop Flags: Incompatible change
    • Target Version/s:

      Description

      We introduced the maximum number of blocks per file in HDFS-4305, but we set the default to 1MM. In practice this limit is so high as to never be hit, whereas we know that an individual file with tens of thousands of blocks can cause problems. We should lower the default value, in my opinion to 10k.
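
      The limit in question is the dfs.namenode.fs-limits.max-blocks-per-file property added by HDFS-4305. It is enforced by the NameNode, so changing it means setting it in the NameNode's hdfs-site.xml. A minimal sketch of reading the property, with 1024 * 1024 standing in for the current 1MM default (illustrative code, not part of this issue):

      import org.apache.hadoop.conf.Configuration;

      // Illustrative only: shows the property key and defaults under discussion.
      // The NameNode, not this client code, actually enforces the limit.
      public class MaxBlocksPerFileCheck {
          static final String KEY = "dfs.namenode.fs-limits.max-blocks-per-file";

          public static void main(String[] args) {
              Configuration conf = new Configuration();
              conf.addResource("hdfs-site.xml"); // picks up any operator override
              long limit = conf.getLong(KEY, 1024L * 1024L); // current default
              System.out.println(KEY + " = " + limit + " (proposed default: 10000)");
          }
      }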

      Attachments

      • HDFS-5517.patch (2 kB, Aaron T. Myers)

        Activity

        Uma Maheswara Rao G added a comment -

        +1 for lowering the value. Thanks, Aaron, for proposing this change. We recently faced this issue as well: with a file having 2 lakh (200,000) blocks, the NN already behaves abnormally.
        For every block addition we do an update-blocks operation, so I think this large array is persisted again each time, incrementally. With the current code this becomes real overhead if we allow more and more blocks to be written to the same file. In practice I am not sure what scenario would lead to configuring a small block size and many blocks for the same file, but we did see this in a testing cluster.
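
        A minimal sketch of the cost Uma describes, assuming the NameNode re-persists the file's entire block array on every block allocation (the class below is a toy model for illustration, not actual HDFS code):

        import java.util.ArrayList;
        import java.util.List;

        // Toy model: if the full block list is written out again on each new
        // block, total persisted entries grow quadratically with block count.
        public class UpdateBlocksCost {
            public static void main(String[] args) {
                List<Long> blocks = new ArrayList<>();
                long persistedEntries = 0;
                final int numBlocks = 200_000; // ~2 lakh blocks, as in Uma's test
                for (long id = 0; id < numBlocks; id++) {
                    blocks.add(id);
                    persistedEntries += blocks.size(); // whole array again
                }
                // ~2.0e10 entries at 200k blocks vs ~5.0e7 at a 10k cap.
                System.out.printf("blocks=%d, persisted entries=%d%n",
                        numBlocks, persistedEntries);
            }
        }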

        Aaron T. Myers added a comment -

        Thanks a lot for the support, Uma. Here's a little patch which just changes the default from 1MM to 10,000.
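
        For reference, a sketch of the kind of one-line default change such a patch makes. The constant names below follow DFSConfigKeys as introduced by HDFS-4305, but they are reproduced from memory here, not quoted from the attached patch:

        // DFSConfigKeys.java (sketch; names assumed, not quoted from the patch)
        public static final String DFS_NAMENODE_MAX_BLOCKS_PER_FILE_KEY =
            "dfs.namenode.fs-limits.max-blocks-per-file";
        // Old default: 1024 * 1024 (the "1MM" above); new default: 10,000.
        public static final long DFS_NAMENODE_MAX_BLOCKS_PER_FILE_DEFAULT = 10_000;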

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12614116/HDFS-5517.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5449//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5449//console

        This message is automatically generated.

        Andrew Wang added a comment -

        +1 thanks Aaron.

        Vinayakumar B added a comment -

        +1, ATM. Thanks for the Jira and patch.


          People

          • Assignee: Aaron T. Myers
          • Reporter: Aaron T. Myers
          • Votes: 0
          • Watchers: 6

            Dates

            • Created:
            • Updated:
