Hadoop YARN
  1. Hadoop YARN
  2. YARN-12

Several Findbugs issues with new FairScheduler in YARN

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: 2.0.2-alpha
    • Component/s: scheduler
    • Labels:
      None

      Description

      The good feature of FairScheduler is added recently to YARN. As recently PreCommit test from MAPREDUCE-4309, there are several bugs found by Findbugs related to FairScheduler:
      org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerEventLog.shutdown() might ignore java.lang.Exception
      Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerEventLog.logDisabled; locked 50% of time
      Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.queueMaxAppsDefault; locked 50% of time
      Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.userMaxAppsDefault; locked 50% of time
      The details are in:https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2612//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html#DE_MIGHT_IGNORE

      1. YARN-12.patch
        12 kB
        Junping Du
      2. MAPREDUCE-4452-v3.patch
        12 kB
        Junping Du
      3. MAPREDUCE-4452-v2.patch
        4 kB
        Junping Du
      4. MAPREDUCE-4452-v1.patch
        4 kB
        Junping Du
      5. MAPREDUCE-4452.patch
        4 kB
        Junping Du

        Issue Links

          Activity

          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12536951/MAPREDUCE-4452.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2619//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2619//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2619//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12536951/MAPREDUCE-4452.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2619//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2619//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2619//console This message is automatically generated.
          Hide
          Junping Du added a comment -

          Update for one FindBug issue. As this patch is to fix Findbugs issue, so no additional test is attached.

          Show
          Junping Du added a comment - Update for one FindBug issue. As this patch is to fix Findbugs issue, so no additional test is attached.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12536965/MAPREDUCE-4452-v1.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2620//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2620//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12536965/MAPREDUCE-4452-v1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2620//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2620//console This message is automatically generated.
          Hide
          Karthik Kambatla (Inactive) added a comment -

          Thanks for the patch, Junping.

          The patch definitely fixes the findbugs warnings. However, the patch converts all methods reported by findbugs to synchronized. In cases where findbugs reports a false positive for bugs, that might affect performance. It might be best to commit this patch, but create a separate JIRA to de-synchronize methods. For every method we de-synchronize, we probably need a justification (in comments) as to why it doesn't create a buggy scenario.

          Show
          Karthik Kambatla (Inactive) added a comment - Thanks for the patch, Junping. The patch definitely fixes the findbugs warnings. However, the patch converts all methods reported by findbugs to synchronized. In cases where findbugs reports a false positive for bugs, that might affect performance. It might be best to commit this patch, but create a separate JIRA to de-synchronize methods. For every method we de-synchronize, we probably need a justification (in comments) as to why it doesn't create a buggy scenario.
          Hide
          NO NAME added a comment -

          The warnings have been suppressed manually, so let's hold on with this proposed fix and see if we can justify avoiding synchronization here.

          Show
          NO NAME added a comment - The warnings have been suppressed manually, so let's hold on with this proposed fix and see if we can justify avoiding synchronization here.
          Hide
          Junping Du added a comment -

          Hi Patrick, Looks like the warnings are showed up in current precommit test again so that we don't suppressed that any more? I think it is still risky to have no any protection for thread-safe. The reason is: it is possible for update thread and refreshQueue operation to call reloadAllocs() for updating these properties while other thread is reading. To lower the performance overhead, may be we can reduce lock granularity as new attached patch (v2). What do you think?

          Show
          Junping Du added a comment - Hi Patrick, Looks like the warnings are showed up in current precommit test again so that we don't suppressed that any more? I think it is still risky to have no any protection for thread-safe. The reason is: it is possible for update thread and refreshQueue operation to call reloadAllocs() for updating these properties while other thread is reading. To lower the performance overhead, may be we can reduce lock granularity as new attached patch (v2). What do you think?
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12537332/MAPREDUCE-4452-v2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2635//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2635//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2635//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12537332/MAPREDUCE-4452-v2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2635//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2635//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2635//console This message is automatically generated.
          Hide
          Junping Du added a comment -

          The left two warning in Findbugs is not a issue as get/set primitive is atomic if the size of the variable is less that or equal to 32 bits. Here, both userMaxAppsDefault and queueMaxAppsDefault are int type so accessing them can still keep atomic.

          Show
          Junping Du added a comment - The left two warning in Findbugs is not a issue as get/set primitive is atomic if the size of the variable is less that or equal to 32 bits. Here, both userMaxAppsDefault and queueMaxAppsDefault are int type so accessing them can still keep atomic.
          Hide
          NO NAME added a comment -

          Hey Junping,

          Apologies - my understanding was that the warning suppression was in place and working.

          This looks great, feel free to commit it. I can look at lock-free approaches to this later, but let's get the warnings under control.

          Thanks,

          • Patrick
          Show
          NO NAME added a comment - Hey Junping, Apologies - my understanding was that the warning suppression was in place and working. This looks great, feel free to commit it. I can look at lock-free approaches to this later, but let's get the warnings under control. Thanks, Patrick
          Hide
          Junping Du added a comment -

          Hi Patrick, For warning suppression, that's my previous understanding too. I also see some other precommit test works fine. But the interesting thing is it doesn't work on my PreCommit test in MAPREDUCE-4309 today and the same 4 Findbugs error comes out.
          I will mark this issue as reviewed and good to see someone can commit it.

          Show
          Junping Du added a comment - Hi Patrick, For warning suppression, that's my previous understanding too. I also see some other precommit test works fine. But the interesting thing is it doesn't work on my PreCommit test in MAPREDUCE-4309 today and the same 4 Findbugs error comes out. I will mark this issue as reviewed and good to see someone can commit it.
          Hide
          NO NAME added a comment -

          What really confuses me here is that you've added new locks for (queueMaxApps, userMaxApps, queueWeights, etc) in this patch rather than synchronizing them under the existing lock that's held when they are set.

          Really this doesn't ensure that they won't be concurrently read/modified any more than before, but it looks like findbugs just check's that variables are consistently read or modified under some lock rather than under the same lock. That's odd to me.

          Show
          NO NAME added a comment - What really confuses me here is that you've added new locks for (queueMaxApps, userMaxApps, queueWeights, etc) in this patch rather than synchronizing them under the existing lock that's held when they are set. Really this doesn't ensure that they won't be concurrently read/modified any more than before, but it looks like findbugs just check's that variables are consistently read or modified under some lock rather than under the same lock. That's odd to me.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12537479/MAPREDUCE-4452-v3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 10 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2643//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2643//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2643//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12537479/MAPREDUCE-4452-v3.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. -1 findbugs. The patch appears to introduce 10 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2643//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2643//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2643//console This message is automatically generated.
          Hide
          Junping Du added a comment -

          Patrick, you are right that the threads of modifying and reading are pending on different lock object/attribute in my previous patch, so I update to v3 patch so both of them are synchronized on related attributes. The intention to get rid of synchronized getter methods (as v1 patch) is that will make QueueManager instance can only be accessed by one thread per time although different threads are reading/modifying different attributes. I guess that is the right way to do it. What do you think?

          Show
          Junping Du added a comment - Patrick, you are right that the threads of modifying and reading are pending on different lock object/attribute in my previous patch, so I update to v3 patch so both of them are synchronized on related attributes. The intention to get rid of synchronized getter methods (as v1 patch) is that will make QueueManager instance can only be accessed by one thread per time although different threads are reading/modifying different attributes. I guess that is the right way to do it. What do you think?
          Hide
          Junping Du added a comment -

          In new v3 patch, separating monitor objects from attributes to get rid of ML_SYNC_ON_FIELD_TO_GUARD_CHANGING_THAT_FIELD which is due to updating on lock object. Also, add synchronization to getter/setter of userMaxAppsDefault and queueMaxAppsDefault to get ride of FindBugs warning.(in fact, it is not necessary because this is int type and should be finished in one CPU instruction. However, it is not much overhead as lock level is fine-grained).

          Show
          Junping Du added a comment - In new v3 patch, separating monitor objects from attributes to get rid of ML_SYNC_ON_FIELD_TO_GUARD_CHANGING_THAT_FIELD which is due to updating on lock object. Also, add synchronization to getter/setter of userMaxAppsDefault and queueMaxAppsDefault to get ride of FindBugs warning.(in fact, it is not necessary because this is int type and should be finished in one CPU instruction. However, it is not much overhead as lock level is fine-grained).
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12537485/MAPREDUCE-4452-v3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2644//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2644//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12537485/MAPREDUCE-4452-v3.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2644//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2644//console This message is automatically generated.
          Hide
          Arun C Murthy added a comment -

          Junping - apologies. Can you pls rebase your patch after YARN-1. Thanks!

          Show
          Arun C Murthy added a comment - Junping - apologies. Can you pls rebase your patch after YARN-1 . Thanks!
          Hide
          Junping Du added a comment -

          Hi Arun, Just finish it and upload it as YARN-12.patch. However, it is pretty weird that I cannot mark this issue as patch available.

          Show
          Junping Du added a comment - Hi Arun, Just finish it and upload it as YARN-12 .patch. However, it is pretty weird that I cannot mark this issue as patch available.
          Hide
          NO NAME added a comment -

          Hey Junping,

          Thanks for the contribution, it looks good to me. I find the excess locking a bit pedantic, but if it's necessary to get FindBugs to shut-up, so be it!

          Show
          NO NAME added a comment - Hey Junping, Thanks for the contribution, it looks good to me. I find the excess locking a bit pedantic, but if it's necessary to get FindBugs to shut-up, so be it!
          Hide
          Arun C Murthy added a comment -

          I just committed this. Thanks Junping!

          Also, thanks to Patrick for the review.

          Show
          Arun C Murthy added a comment - I just committed this. Thanks Junping! Also, thanks to Patrick for the review.
          Hide
          Arun C Murthy added a comment -

          However, it is pretty weird that I cannot mark this issue as patch available.

          Yeah, sorry, working with #asfinfra to fix the new YARN jira 'workflow'. On it.

          Show
          Arun C Murthy added a comment - However, it is pretty weird that I cannot mark this issue as patch available. Yeah, sorry, working with #asfinfra to fix the new YARN jira 'workflow'. On it.

            People

            • Assignee:
              Junping Du
              Reporter:
              Junping Du
            • Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development