Hadoop Common
  1. Hadoop Common
  2. HADOOP-4795

Lease monitor may get into an infinite loop

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.18.3
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      If a lease is not found in the namespace for some reasons (e.g. bugs), lease monitor may get into an infinite loop.

      1. 4795_20081205.patch
        4 kB
        Tsz Wo Nicholas Sze
      2. 4795_20081208.patch
        4 kB
        Tsz Wo Nicholas Sze
      3. 4795_20081209.patch
        5 kB
        Tsz Wo Nicholas Sze
      4. 4795_20081209_0.18.patch
        5 kB
        Tsz Wo Nicholas Sze
      5. TestLeaseMonitor.java
        5 kB
        Tsz Wo Nicholas Sze
      6. 4795_20081209_0.19.patch
        5 kB
        Tsz Wo Nicholas Sze

        Issue Links

          Activity

          Hide
          Hudson added a comment -

          Integrated in Hadoop-trunk #685 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/685/)
          . Prevent lease monitor getting into an infinite loop when leases and the namespace tree does not match. (szetszwo)

          Show
          Hudson added a comment - Integrated in Hadoop-trunk #685 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/685/ ) . Prevent lease monitor getting into an infinite loop when leases and the namespace tree does not match. (szetszwo)
          Hide
          Tsz Wo Nicholas Sze added a comment -

          I just committed this.

          Show
          Tsz Wo Nicholas Sze added a comment - I just committed this.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          4795_20081209_0.19.patch: for 0.19

          Show
          Tsz Wo Nicholas Sze added a comment - 4795_20081209_0.19.patch: for 0.19
          Hide
          Tsz Wo Nicholas Sze added a comment -

          Tested locally, only org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.TestStartAtOffset failed. It does not seem related to this patch. Will file a issue I see it failing again.

          Show
          Tsz Wo Nicholas Sze added a comment - Tested locally, only org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.TestStartAtOffset failed. It does not seem related to this patch. Will file a issue I see it failing again.
          Hide
          Tsz Wo Nicholas Sze added a comment -
               [exec] -1 overall.  
          
               [exec]     +1 @author.  The patch does not contain any @author tags.
          
               [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
               [exec]                         Please justify why no tests are needed for this patch.
          
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
          
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
          
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
          
               [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
          

          Tested manually. No new tests added.

          Show
          Tsz Wo Nicholas Sze added a comment - [exec] -1 overall. [exec] +1 @author. The patch does not contain any @author tags. [exec] -1 tests included. The patch doesn't appear to include any new or modified tests. [exec] Please justify why no tests are needed for this patch. [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity. Tested manually. No new tests added.
          Hide
          Konstantin Shvachko added a comment -

          +1.
          If the code in internalReleaseLease() below the new IOExceptions you introduced would throw an exception the data-structures may become inconsistent. But as I can see these exceptions are not thrown anywhere below under the conditions. We should commit this.
          Another concern that existing images will contain FileUnderConstruction, which will unclosable forever, and therefore we should investigate how to convert them into real files in a separate jira.

          Show
          Konstantin Shvachko added a comment - +1. If the code in internalReleaseLease() below the new IOExceptions you introduced would throw an exception the data-structures may become inconsistent. But as I can see these exceptions are not thrown anywhere below under the conditions. We should commit this. Another concern that existing images will contain FileUnderConstruction, which will unclosable forever, and therefore we should investigate how to convert them into real files in a separate jira.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          TestLeaseMonitor.java: This is a manual test for the infinite loop problem.

          Show
          Tsz Wo Nicholas Sze added a comment - TestLeaseMonitor.java: This is a manual test for the infinite loop problem.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          4795_20081209_0.18.patch: for 0.18

          Show
          Tsz Wo Nicholas Sze added a comment - 4795_20081209_0.18.patch: for 0.18
          Hide
          Tsz Wo Nicholas Sze added a comment -

          4795_20081209.patch: incorporated Konstantin's comments.

          Show
          Tsz Wo Nicholas Sze added a comment - 4795_20081209.patch: incorporated Konstantin's comments.
          Hide
          Konstantin Shvachko added a comment -
          1. I would rename Monitor.check() to something like checkLeases().
          2. esle statement can be removed.
          3. Wrapping String Collection into ArrayList can be replaced by collecting paths to be removed in an ArrayList and then removing them afterwards in a separate loop. This will be much less copying.
          Show
          Konstantin Shvachko added a comment - I would rename Monitor.check() to something like checkLeases(). esle statement can be removed. Wrapping String Collection into ArrayList can be replaced by collecting paths to be removed in an ArrayList and then removing them afterwards in a separate loop. This will be much less copying.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          > In the method check(), if the older entry has not expired, we should break out of the while loop?

          You are right. I removed the "break;" in the original codes but forgot to add a "return;"

          4795_20081208.patch: fix the bug above. Will try to add a test.

          Show
          Tsz Wo Nicholas Sze added a comment - > In the method check(), if the older entry has not expired, we should break out of the while loop? You are right. I removed the "break;" in the original codes but forgot to add a "return;" 4795_20081208.patch: fix the bug above. Will try to add a test.
          Hide
          Suresh Srinivas added a comment -

          In the method check(), if the older entry has not expired, we should break out of the while loop? Otherwise this code will cause a tight loop until sortedLease.size() becomes 0 in size?

          Show
          Suresh Srinivas added a comment - In the method check() , if the older entry has not expired, we should break out of the while loop? Otherwise this code will cause a tight loop until sortedLease.size() becomes 0 in size?
          Hide
          Tsz Wo Nicholas Sze added a comment -

          4795_20081205.patch: remove the lease if the corresponding INode is not found.

          Show
          Tsz Wo Nicholas Sze added a comment - 4795_20081205.patch: remove the lease if the corresponding INode is not found.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          > lease is not found in the namespace for some reasons (e.g. bugs)

          Let me clarify this: the file creation lease is in the system but the corresponding file is not found. This should not happen normally.

          Show
          Tsz Wo Nicholas Sze added a comment - > lease is not found in the namespace for some reasons (e.g. bugs) Let me clarify this: the file creation lease is in the system but the corresponding file is not found. This should not happen normally.

            People

            • Assignee:
              Tsz Wo Nicholas Sze
              Reporter:
              Tsz Wo Nicholas Sze
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development