Hadoop Common
  1. Hadoop Common
  2. HADOOP-3121

dfs -lsr fail with "Could not get listing "

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 0.16.1
    • Fix Version/s: 0.18.3
    • Component/s: fs
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      (This happened a lot when namenode was extremely slow due to some other reasons.)

      % hadoop dfs -lsr /

      failed with

      > Could not get listing for /aaa/bbb/randomfile

      It's probably because file was deleted between items = fs.listStatus( and ls(items[i]

      I think lsr should ignore this case and continue.

      1. 3121_20081107_0.18.patch
        8 kB
        Tsz Wo Nicholas Sze
      2. 3121_20081107.patch
        8 kB
        Tsz Wo Nicholas Sze
      3. HADOOP-3121.patch
        7 kB
        Raghu Angadi
      4. 3121_20081106b.patch
        7 kB
        Tsz Wo Nicholas Sze
      5. 3121_20081106.patch
        7 kB
        Tsz Wo Nicholas Sze
      6. 3121_20081105.patch
        0.6 kB
        Tsz Wo Nicholas Sze

        Activity

        Hide
        Hudson added a comment -

        Integrated in Hadoop-trunk #655 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/655/)
        . lsr should keep listing the remaining items but not terminate if there is any IOException. (szetszwo)

        Show
        Hudson added a comment - Integrated in Hadoop-trunk #655 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/655/ ) . lsr should keep listing the remaining items but not terminate if there is any IOException. (szetszwo)
        Hide
        Tsz Wo Nicholas Sze added a comment -

        I just committed this.

        Show
        Tsz Wo Nicholas Sze added a comment - I just committed this.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        3121_20081107_0.18.patch: for 0.18. It also passed all unit tests locally.

        Show
        Tsz Wo Nicholas Sze added a comment - 3121_20081107_0.18.patch: for 0.18. It also passed all unit tests locally.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        The patch passed all unit tests locally.

        Show
        Tsz Wo Nicholas Sze added a comment - The patch passed all unit tests locally.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        ant test-patch

             [exec] +1 overall.  
        
             [exec]     +1 @author.  The patch does not contain any @author tags.
        
             [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
        
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
        
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
        
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
        
             [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
        
        Show
        Tsz Wo Nicholas Sze added a comment - ant test-patch [exec] +1 overall. [exec] +1 @author. The patch does not contain any @author tags. [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
        Hide
        Raghu Angadi added a comment -

        +1 for your patch. thanks.

        Show
        Raghu Angadi added a comment - +1 for your patch. thanks.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        > I don't think most command print extra lines like 'ls: There were totaly 1 exception(s)' at the end..'
        I agree that the extra line is odd. Thanks.

        3121_20081107.patch: We should also change FsShell.run(...) to update exitCode.

        Show
        Tsz Wo Nicholas Sze added a comment - > I don't think most command print extra lines like 'ls: There were totaly 1 exception(s)' at the end..' I agree that the extra line is odd. Thanks. 3121_20081107.patch: We should also change FsShell.run(...) to update exitCode.
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12393479/HADOOP-3121.patch
        against trunk revision 712033.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3545/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3545/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3545/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3545/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12393479/HADOOP-3121.patch against trunk revision 712033. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 Eclipse classpath. The patch retains Eclipse classpath integrity. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3545/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3545/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3545/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3545/console This message is automatically generated.
        Hide
        Raghu Angadi added a comment -

        I hope attached patch clarifies what I mean. It behaves same as before and does not add extra line of output that is unecessary.

        For e.g. compare 'bin/hadoop fs -ls /other' :

        • with the patch attached (and trunk) :
          ls: could not get get listing for '/other' : org.apache.hadoop.security.AccessControlException: Permission denied: [...]
          
        • with your patch :
          ls: could not get get listing for '/other' : org.apache.hadoop.security.AccessControlException: Permission denied: [...]
          ls: There were totaly 1 exception(s).
          

        I don't think most command print extra lines like 'ls: There were totaly 1 exception(s)' at the end..'

        Show
        Raghu Angadi added a comment - I hope attached patch clarifies what I mean. It behaves same as before and does not add extra line of output that is unecessary. For e.g. compare 'bin/hadoop fs -ls /other' : with the patch attached ( and trunk) : ls: could not get get listing for '/other' : org.apache.hadoop.security.AccessControlException: Permission denied: [...] with your patch : ls: could not get get listing for '/other' : org.apache.hadoop.security.AccessControlException: Permission denied: [...] ls: There were totaly 1 exception(s). I don't think most command print extra lines like 'ls: There were totaly 1 exception(s)' at the end..'
        Hide
        Tsz Wo Nicholas Sze added a comment -

        > instead of ls throwing exception with "There were totally..." message, I think it is better to make ls return -1 and replace ls(..) with ' exitCode = ls() ' in doall().

        I am not sure what do you mean. Could you post a patch for this?

        Show
        Tsz Wo Nicholas Sze added a comment - > instead of ls throwing exception with "There were totally..." message, I think it is better to make ls return -1 and replace ls(..) with ' exitCode = ls() ' in doall(). I am not sure what do you mean. Could you post a patch for this?
        Hide
        Raghu Angadi added a comment -

        Thanks Nicholas. looks good. One nit:

        instead of ls throwing exception with "There were totally..." message, I think it is better to make ls return -1 and replace ls(..) with ' exitCode = ls() ' in doall().

        Show
        Raghu Angadi added a comment - Thanks Nicholas. looks good. One nit: instead of ls throwing exception with "There were totally..." message, I think it is better to make ls return -1 and replace ls(..) with ' exitCode = ls() ' in doall().
        Hide
        Tsz Wo Nicholas Sze added a comment - - edited

        3121_20081106b.patch: renamed getFileStatus(...) to shellListStatus(...)

        Show
        Tsz Wo Nicholas Sze added a comment - - edited 3121_20081106b.patch: renamed getFileStatus(...) to shellListStatus(...)
        Hide
        Raghu Angadi added a comment -

        Couple of comments:

        • do we need to rename ListStatus() to FileStatus()?
        • getFileStatus() name is too close to FS.getFileStatus(), might be confusing while reading the code.

        I suggest renaming it to some thing like 'shellListStatus()'

        Show
        Raghu Angadi added a comment - Couple of comments: do we need to rename ListStatus() to FileStatus()? getFileStatus() name is too close to FS.getFileStatus(), might be confusing while reading the code. I suggest renaming it to some thing like 'shellListStatus()'
        Hide
        Tsz Wo Nicholas Sze added a comment -

        Raghu, I see your point now. Here is a new patch.

        3121_20081106.patch: try-catch all IOException.

        Show
        Tsz Wo Nicholas Sze added a comment - Raghu, I see your point now. Here is a new patch. 3121_20081106.patch: try-catch all IOException.
        Hide
        Raghu Angadi added a comment - - edited

        Then I don't think it is a proper fix for the real issue. It fixes just one symptom reported.

        > How about we fix other exceptions in a separated issue?
        may be. +0 from me.

        Show
        Raghu Angadi added a comment - - edited Then I don't think it is a proper fix for the real issue. It fixes just one symptom reported. > How about we fix other exceptions in a separated issue? may be. +0 from me.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        > Shouldn't it be the same for other exceptions? What if dir '/a' does not have permissions for this user?

        Since this issue is target for 0.18, I don't want to introduce big changes. How about we fix other exceptions in a separated issue?

        Show
        Tsz Wo Nicholas Sze added a comment - > Shouldn't it be the same for other exceptions? What if dir '/a' does not have permissions for this user? Since this issue is target for 0.18, I don't want to introduce big changes. How about we fix other exceptions in a separated issue?
        Hide
        Raghu Angadi added a comment -

        Shouldn't it be the same for other exceptions? What if dir '/a' does not have permissions for this user?

        Show
        Raghu Angadi added a comment - Shouldn't it be the same for other exceptions? What if dir '/a' does not have permissions for this user?
        Hide
        Tsz Wo Nicholas Sze added a comment -

        3121_20081105.patch: print an error message and then keep listing other items, instead of throwing an exception and quit.

        Same as before: "lsr /", then remove /a right before lsr printing the result of /a
        
        bash-3.2$ ./bin/hadoop fs -lsr /
        drwxr-xr-x   - tsz supergroup          0 2008-11-05 14:18 /a
        hdfs://namenode:9000/a: No such file or directory.
        drwxr-xr-x   - tsz supergroup          0 2008-11-05 14:18 /b
        ...
        
        Show
        Tsz Wo Nicholas Sze added a comment - 3121_20081105.patch: print an error message and then keep listing other items, instead of throwing an exception and quit. Same as before: "lsr /", then remove /a right before lsr printing the result of /a bash-3.2$ ./bin/hadoop fs -lsr / drwxr-xr-x - tsz supergroup 0 2008-11-05 14:18 /a hdfs://namenode:9000/a: No such file or directory. drwxr-xr-x - tsz supergroup 0 2008-11-05 14:18 /b ...
        Hide
        Tsz Wo Nicholas Sze added a comment -

        > This seems already fixed in trunk...

        Unfortunately, we still have this bug but the error message have been changed.

        "lsr /", then remove /a right before lsr printing the result of /a
        
        bash-3.2$ ./bin/hadoop fs -lsr /
        drwxr-xr-x   - tsz supergroup          0 2008-11-05 14:01 /a
        lsr: hdfs://namenode9000/a: No such file or directory.
        

        I plan to change lsr such that lsr prints an error message and then keep listing other items.

        Show
        Tsz Wo Nicholas Sze added a comment - > This seems already fixed in trunk... Unfortunately, we still have this bug but the error message have been changed. "lsr /", then remove /a right before lsr printing the result of /a bash-3.2$ ./bin/hadoop fs -lsr / drwxr-xr-x - tsz supergroup 0 2008-11-05 14:01 /a lsr: hdfs://namenode9000/a: No such file or directory. I plan to change lsr such that lsr prints an error message and then keep listing other items.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        This seems already fixed in trunk. At least, the behavior should be different since I could not find "Could not get listing " in the ls codes. I will check this in details.

        Show
        Tsz Wo Nicholas Sze added a comment - This seems already fixed in trunk. At least, the behavior should be different since I could not find "Could not get listing " in the ls codes. I will check this in details.
        Hide
        T Meyarivan added a comment -

        Is it likely to be fixed in 0.18.x ?

        Show
        T Meyarivan added a comment - Is it likely to be fixed in 0.18.x ? –
        Hide
        dhruba borthakur added a comment -

        This does not seem to be a regression and I propose that we fix it for 0.17.

        Show
        dhruba borthakur added a comment - This does not seem to be a regression and I propose that we fix it for 0.17.

          People

          • Assignee:
            Tsz Wo Nicholas Sze
            Reporter:
            Koji Noguchi
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development