HDFS-10448

CacheManager#addInternal tracks bytesNeeded incorrectly when dealing with replication factors other than 1

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.1
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: caching
    • Labels:
      None

      Description

      The logic in CacheManager#checkLimit is not correct. The method does three things:

      First, it computes the bytes needed for the given path.

      CacheDirectiveStats stats = computeNeeded(path, replication);
      

      But the replication parameter is not used there, so bytesNeeded is only the value for a single replica:

      return new CacheDirectiveStats.Builder()
              .setBytesNeeded(requestedBytes)
              .setFilesCached(requestedFiles)
              .build();
      

      Second, because computeNeeded does not use replication, the result has to be multiplied by replication before it is compared with the pool limit.

      pool.getBytesNeeded() + (stats.getBytesNeeded() * replication) > pool.getLimit()
      

      Third, if the size exceeds the limit, it throws an exception with a warning message. The message divides by replication, even though stats.getBytesNeeded() is already the value for a single replica.

            throw new InvalidRequestException("Caching path " + path + " of size "
                + stats.getBytesNeeded() / replication + " bytes at replication "
                + replication + " would exceed pool " + pool.getPoolName()
                + "'s remaining capacity of "
                + (pool.getLimit() - pool.getBytesNeeded()) + " bytes.");
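
The mismatch described above can be reproduced with a small standalone sketch. This is not the Hadoop code: the class and helper names below are hypothetical stand-ins, and the numbers (a 100-byte path, replication 3, a pool limit of 250 bytes) are made up for illustration.

```java
class BytesNeededMismatch {
    // Hypothetical stand-in for the pre-patch computeNeeded: the replication
    // argument is accepted but ignored, so the result covers one replica only.
    static long computeNeeded(long fileBytes, short replication) {
        return fileBytes;
    }

    // The limit check compensates by multiplying at the call site.
    static boolean exceedsLimit(long poolBytesNeeded, long statsBytesNeeded,
                                short replication, long poolLimit) {
        return poolBytesNeeded + statsBytesNeeded * replication > poolLimit;
    }

    // The error message divides the single-replica value by replication again.
    static long sizeInMessage(long statsBytesNeeded, short replication) {
        return statsBytesNeeded / replication;
    }

    public static void main(String[] args) {
        short replication = 3;
        long stats = computeNeeded(100L, replication);

        // 0 + 100 * 3 = 300 > 250, so the request is rejected.
        System.out.println(exceedsLimit(0L, stats, replication, 250L)); // true

        // The message reports 100 / 3 = 33 bytes, although one replica needs
        // 100 bytes and all three replicas need 300.
        System.out.println(sizeInMessage(stats, replication)); // 33
    }
}
```

In this sketch the check itself gives the right answer, but only because the caller remembers to multiply; any other consumer of the single-replica value would see a number that is too small.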
      

        Activity

        linyiqun Yiqun Lin added a comment -

        Thanks, Colin P. McCabe, for the review and the commit!

        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-trunk-Commit #9992 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9992/)
        HDFS-10448. CacheManager#addInternal tracks bytesNeeded incorrectly when (cmccabe: rev 46f1602e896273b308fbd5df6c75f6c142828227)

        • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
        cmccabe Colin P. McCabe added a comment -

        Committed to 2.8. Thanks, Yiqun Lin! Sorry for the delays in reviews.

        cmccabe Colin P. McCabe added a comment -

        Hi Yiqun Lin,

        Sorry, I misread the patch the first time around. You are indeed changing computeNeeded to take the replication factor into account, which seems like a better way to go.

        +1

        linyiqun Yiqun Lin added a comment -

        Hi, Colin P. McCabe, can you take a look at my last comment? It seems that you misunderstood my patch. Thanks.

        linyiqun Yiqun Lin added a comment -

        Hi, Colin P. McCabe, I am a little confused by what you said:

        "I think it should change computeNeeded to take replication into account"

        I think I have already taken replication into account in computeNeeded:

            return new CacheDirectiveStats.Builder()
                .setBytesNeeded(requestedBytes * replication)
                .setFilesCached(requestedFiles)
                .build();
        

        Because of this change, I have to update the original code that calls computeNeeded,
        From

        pool.getBytesNeeded() + (stats.getBytesNeeded() * replication) > pool
                .getLimit()
        

        To

        pool.getBytesNeeded() + stats.getBytesNeeded() > pool.getLimit()
        

        I think this is needed.

        Correct me if I am wrong, thanks.
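
The change described in this comment can be sketched in the same standalone style. Again, this is only an illustration under assumptions: the class and method names are hypothetical, the numbers are invented, and only the two expressions quoted in the comment (multiplying inside computeNeeded, comparing without a further multiplication) come from the thread.

```java
class BytesNeededPatched {
    // Sketch of the patched computeNeeded: replication is applied here,
    // so the returned value covers all replicas.
    static long computeNeeded(long fileBytes, short replication) {
        return fileBytes * replication;
    }

    // The caller now compares the value directly, with no extra multiplication.
    static boolean exceedsLimit(long poolBytesNeeded, long statsBytesNeeded,
                                long poolLimit) {
        return poolBytesNeeded + statsBytesNeeded > poolLimit;
    }

    public static void main(String[] args) {
        short replication = 3;
        long stats = computeNeeded(100L, replication);

        System.out.println(stats);                          // 300
        System.out.println(exceedsLimit(0L, stats, 250L));  // true
        System.out.println(exceedsLimit(0L, stats, 300L));  // false
    }
}
```

With replication handled in one place, every consumer of the returned value sees the same total, which is the reason given in the thread for fixing computeNeeded rather than its callers.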

        cmccabe Colin P. McCabe added a comment -

        I think it should change computeNeeded to take replication into account, rather than modifying the code that calls computeNeeded.

        linyiqun Yiqun Lin added a comment -

        So, Colin P. McCabe, what do you think of the patch for this?

        cmccabe Colin P. McCabe added a comment -

        This is a good find. I think that computeNeeded should take replication into account-- the fact that it doesn't currently is a bug. Then there would be no need to change the callers of computeNeeded.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote  Subsystem   Runtime   Comment
        0     reexec      13m 14s   Docker mode activated.
        +1    @author     0m 0s     The patch does not contain any @author tags.
        -1    test4tests  0m 0s     The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
        +1    mvninstall  6m 17s    trunk passed
        +1    compile     0m 42s    trunk passed
        +1    checkstyle  0m 26s    trunk passed
        +1    mvnsite     0m 51s    trunk passed
        +1    mvneclipse  0m 12s    trunk passed
        +1    findbugs    1m 40s    trunk passed
        +1    javadoc     1m 4s     trunk passed
        +1    mvninstall  0m 46s    the patch passed
        +1    compile     0m 40s    the patch passed
        +1    javac       0m 40s    the patch passed
        +1    checkstyle  0m 23s    the patch passed
        +1    mvnsite     0m 47s    the patch passed
        +1    mvneclipse  0m 9s     the patch passed
        +1    whitespace  0m 0s     Patch has no whitespace issues.
        +1    findbugs    1m 46s    the patch passed
        +1    javadoc     1m 4s     the patch passed
        -1    unit        58m 18s   hadoop-hdfs in the patch failed.
        +1    asflicense  0m 20s    Patch does not generate ASF License warnings.
                          89m 51s



        Reason Tests
        Failed junit tests hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped
          hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:2c91fd8
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12805567/HDFS-10448.001.patch
        JIRA Issue HDFS-10448
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 952920eff414 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / 6161d9b
        Default Java 1.8.0_91
        findbugs v3.0.0
        unit https://builds.apache.org/job/PreCommit-HDFS-Build/15523/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
        unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/15523/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
        Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15523/testReport/
        modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
        Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15523/console
        Powered by Apache Yetus 0.2.0 http://yetus.apache.org

        This message was automatically generated.

        linyiqun Yiqun Lin added a comment -

        Attaching an initial patch. Thanks for reviewing.


          People

          • Assignee:
            linyiqun Yiqun Lin
            Reporter:
            linyiqun Yiqun Lin
          • Votes:
            0
            Watchers:
            3
