Hadoop Common / HADOOP-7133

CLONE to COMMON - HDFS-1445 Batch the calls in DataStorage to FileUtil.createHardLink(), so we call it once per directory instead of once per file

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.2
    • Fix Version/s: 0.23.0
    • Component/s: util
    • Labels:
      None
    • Release Note:
      This is the COMMON portion of a fix requiring coordinated change of COMMON and HDFS. Please see HDFS-1445 for HDFS portion and release note.
    • Tags:
      hard links, upgrade, snapshot

      Description

      The fix for HDFS-1445 "Batch the calls in DataStorage to FileUtil.createHardLink(), so we call it once per directory instead of once per file" requires coordinated change in COMMON and HDFS. This is the COMMON portion, submitted here under a separate bug to activate the automated testing.

      Warning: this patch to COMMON, by itself, will break HDFS. It requires coordinated commit of the HDFS portion of the patch in HDFS-1445.
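
      To illustrate the batching idea, here is a minimal sketch (not the code from the patch). It assumes links are created by launching the POSIX `ln` command, and it only builds the argument list for one invocation per directory. The class name, method name, and the directory and block file names are hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class BatchHardLinkSketch {

    /**
     * Builds a single "ln" invocation that links every named file in
     * srcDir into linkDir, instead of one invocation per file.
     */
    public static List<String> buildBatchCommand(File srcDir, String[] fileNames, File linkDir) {
        List<String> cmd = new ArrayList<>();
        cmd.add("ln");
        for (String name : fileNames) {
            cmd.add(new File(srcDir, name).getPath());
        }
        // With several sources, ln treats the last argument as the target directory.
        cmd.add(linkDir.getPath());
        return cmd;
    }

    public static void main(String[] args) {
        // Hypothetical file and directory names, for illustration only.
        String[] blocks = {"blk_1", "blk_1.meta", "blk_2"};
        List<String> cmd = buildBatchCommand(new File("previous"), blocks, new File("current"));
        // One process launch now covers all three files.
        System.out.println(String.join(" ", cmd));
    }
}
```

      A real implementation would also have to keep each command under the operating system's argument-length limit, for example by splitting a very large directory into several batches.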

          Activity

          Matt Foley added a comment -

          Thanks guys. HADOOP-7181 opened for Hairong's ThreadLocal suggestion.
          HDFS-1445 "submit patch" activated.
          HADOOP-7182 opened to clean up the backward compatibility stub from FileUtil.

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk #627 (See https://hudson.apache.org/hudson/job/Hadoop-Common-trunk/627/)
          HADOOP-7133. Batch the calls in DataStorage to FileUtil.createHardLink(). Contributed by Matt Foley.

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #523 (See https://hudson.apache.org/hudson/job/Hadoop-Common-trunk-Commit/523/)
          HADOOP-7133. Batch the calls in DataStorage to FileUtil.createHardLink(). Contributed by Matt Foley.

          Jakob Homan added a comment -

          I've committed this. Resolving as fixed. Matt, please open a JIRA for Hairong's suggestion. Thanks.

          Hairong Kuang added a comment -

          Sure, please go ahead and commit it. I already +1'd it.

          Matt Foley added a comment -

          Hi Hairong, glad it's working well for you. Your suggestion to use ThreadLocal buffers is perfect. It will take me a little time to implement and test, though, so I'd like to go ahead with Jakob's suggestion, if you don't mind, so we can proceed with review of the second part of the patch in HDFS-1445.

          Jakob Homan added a comment -

          Hairong: Good suggestion on the ThreadLocal. I'm ready to commit this as is. Would you be OK with doing that optimization in a separate JIRA?

          Hairong Kuang added a comment -

          Matt, this optimization is fantastic. I have ported it to our internal branch. Together with parallel upgrades, it cut the DN upgrade time from 1 hour 40 minutes to 1 minute! Awesome job!

          +1. The patch looks good. One minor optimization is that you could use ThreadLocal for shell commands. This will remove the need to create a buffer for each shell command to run.
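
          One way to read the ThreadLocal suggestion is sketched below. This is an assumption about the intent, not code from the patch or from the follow-up issue: each thread keeps one reusable command array and fills in the per-call slots, so no new buffer is allocated for each shell command. The class and method names are hypothetical.

```java
public class ThreadLocalCommandSketch {

    // One reusable command array per thread; slots 1 and 2 are filled in per call.
    private static final ThreadLocal<String[]> LINK_CMD =
        ThreadLocal.withInitial(() -> new String[] {"ln", null, null});

    /** Returns this thread's command array, filled in for one hard link. */
    public static String[] linkCommand(String target, String linkName) {
        String[] cmd = LINK_CMD.get();
        cmd[1] = target;
        cmd[2] = linkName;
        return cmd;
    }

    public static void main(String[] args) {
        // Hypothetical paths, for illustration only.
        String[] first = linkCommand("previous/blk_1", "current/blk_1");
        String[] second = linkCommand("previous/blk_2", "current/blk_2");
        // Both calls hand back the same array object on this thread.
        System.out.println(first == second);
    }
}
```

          The trade-off is that the returned array is overwritten by the next call on the same thread, so a caller must use it (or copy it) before asking for another command.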

          Matt Foley added a comment -

          Yes. s/resubmit/push the "submit patch" button again for/

          Jakob Homan added a comment -

          "I will resubmit part-2, attached to this issue, as soon as part-1 has been approved and committed."

          I had interpreted this to mean there would be a new HDFS patch. Sounds like I'm incorrect and the current patch there is the one you're submitting?

          Matt Foley added a comment -

          Hi Jakob, the HDFS counterpart has been up for some time on HDFS-1445. However, test-patch can't run against the HDFS part until this COMMON part has been committed. So perhaps it would be best if you take a quick look at HDFS-1445, then commit this patch so that I can enable test-patch to run against the other part? Thanks.

          Jakob Homan added a comment -

          +1 on this patch. However, even with the backward compatibility now provided, I'd like to see the HDFS counterpart and commit them both at once. Once that has been posted and reviewed, I'll go ahead and commit both.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12471347/HDFS-1445-trunk.v23_common_1-of-3.patch
          against trunk revision 1071364.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/243//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/243//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/243//console

          This message is automatically generated.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12471347/HDFS-1445-trunk.v23_common_1-of-3.patch
          against trunk revision 1071364.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/240//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/240//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/240//console

          This message is automatically generated.

          Matt Foley added a comment -

          This new version of the part-1 patch is backward compatible and will not break HDFS.

          Please review.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12470703/HDFS-1445-trunk.v22_common_1-of-2.patch
          against trunk revision 1070021.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 5 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/231//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/231//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/231//console

          This message is automatically generated.

          Matt Foley added a comment -

          Hi Todd, that's painful but workable. I'll submit a revised patch shortly. Thanks for the guidance.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12470703/HDFS-1445-trunk.v22_common_1-of-2.patch
          against trunk revision 1068729.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 5 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/224//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/224//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/224//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          Since FileUtil.HardLink is a public class, we should deprecate it in this patch and only remove it in the next release. Instead of removing it, can you have it forward calls to the new implementation? That way we also don't need to coordinate the commit across HDFS and Common.
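
          The deprecate-and-forward approach described above can be sketched as follows. This is a generic illustration with stand-in classes, not the actual Hadoop code: `NewHardLink`, `OldHardLink`, and `describeLink` are hypothetical names, and the method body is a placeholder for real link creation.

```java
public class ForwardingStubSketch {

    /** Stand-in for the new implementation class. */
    public static class NewHardLink {
        public static String describeLink(String target, String linkName) {
            // Placeholder for the real work of creating a hard link.
            return "ln " + target + " " + linkName;
        }
    }

    /**
     * Stand-in for the old public class. It stays for one more release,
     * is marked deprecated, and forwards every call to the new class.
     */
    @Deprecated
    public static class OldHardLink {
        public static String describeLink(String target, String linkName) {
            return NewHardLink.describeLink(target, linkName);
        }
    }

    public static void main(String[] args) {
        // Existing callers of the old class keep compiling and get the new behavior.
        System.out.println(OldHardLink.describeLink("a", "b"));
    }
}
```

          Because old callers keep working, the library change and the caller change can be committed independently, which is the point of the suggestion.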


            People

            • Assignee:
              Matt Foley
              Reporter:
              Matt Foley
            • Votes:
              0
              Watchers:
              4
