Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.21.0
    • Component/s: fs
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Two bugs on listStatus for HarFileSystem:

      1) consider the following directory tree inside a hadoop archive
      /foo
      /foo/bar1
      /fooo
      /fooo/bar2

      In this case, listStatus(new Path("/foo")) will include /fooo/bar2 because fileStatusesInIndex is testing a prefix.

      2) HADOOP-6591 didn't take into consideration method fileStatusesInIndex(), and archives v2 return empty results for listStatus()

      1. HADOOP-6645.patch
        1.0 kB
        Rodrigo Schmidt
      2. HADOOP-6645.hotfix.patch
        0.7 kB
        Rodrigo Schmidt

        Issue Links

          Activity

          Hide
          Rodrigo Schmidt added a comment -

          The second bug is my fault. My patch for HADOOP-6591 was based on a patch for a version of hadoop 0.20 that didn't have fileStatusesIndex(). It doesn't break anything because we haven't committed the patch that generates the v2 files. However, it does prevent that patch from passing the unit tests.

          Show
          Rodrigo Schmidt added a comment - The second bug is my fault. My patch for HADOOP-6591 was based on a patch for a version of hadoop 0.20 that didn't have fileStatusesIndex(). It doesn't break anything because we haven't committed the patch that generates the v2 files. However, it does prevent that patch from passing the unit tests.
          Hide
          Rodrigo Schmidt added a comment -

          Unfortunately the unit tests for HarFileSystem must be inside MAPREDUCE, since that is where HadoopArchives (the archive creation tool) resides.

          Show
          Rodrigo Schmidt added a comment - Unfortunately the unit tests for HarFileSystem must be inside MAPREDUCE, since that is where HadoopArchives (the archive creation tool) resides.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12439255/HADOOP-6645.patch
          against trunk revision 923619.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/422/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/422/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/422/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/422/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12439255/HADOOP-6645.patch against trunk revision 923619. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/422/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/422/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/422/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/422/console This message is automatically generated.
          Hide
          Rodrigo Schmidt added a comment -

          Mahadev, Nicholas, can one of you take a quick look at this patch? It's a simple one and I've already tested it with TestHarFileSystem.

          Show
          Rodrigo Schmidt added a comment - Mahadev, Nicholas, can one of you take a quick look at this patch? It's a simple one and I've already tested it with TestHarFileSystem.
          Hide
          Mahadev konar added a comment -
          -      String parentString = parent.getName();
          +      String parentString = parent.getName() + Path.SEPARATOR;
          

          rodrigo,
          do the tests fail without adding the Path.SEPERATOR? Why is it necessary to append a path.separator?

          Show
          Mahadev konar added a comment - - String parentString = parent.getName(); + String parentString = parent.getName() + Path.SEPARATOR; rodrigo, do the tests fail without adding the Path.SEPERATOR? Why is it necessary to append a path.separator?
          Hide
          Rodrigo Schmidt added a comment -

          Thanks for looking at it, Mahadev!
          If we don't add the separator, the selection if will pass for /fooo/bar2 when we ask to list the contents of /foo in the example I gave in the JIRA description. Adding Path.SEPARATOR makes sure that we only select children of the right parent, and not children of a parent that is an extension of the right one.

          The current unit tests don't cover this case... However, unit tests are in the MAPREDUCE project and I can't add them in HADOOP patches like this one. To avoid errors, we have to commit this patch first and then add new unit tests to MAPREDUCE.

          Show
          Rodrigo Schmidt added a comment - Thanks for looking at it, Mahadev! If we don't add the separator, the selection if will pass for /fooo/bar2 when we ask to list the contents of /foo in the example I gave in the JIRA description. Adding Path.SEPARATOR makes sure that we only select children of the right parent, and not children of a parent that is an extension of the right one. The current unit tests don't cover this case... However, unit tests are in the MAPREDUCE project and I can't add them in HADOOP patches like this one. To avoid errors, we have to commit this patch first and then add new unit tests to MAPREDUCE.
          Hide
          Mahadev konar added a comment -

          good catch rodrigo. Sorry just skimmed through the description and thats why the stupid question above!

          +1 for the patch.

          Show
          Mahadev konar added a comment - good catch rodrigo. Sorry just skimmed through the description and thats why the stupid question above! +1 for the patch.
          Hide
          Mahadev konar added a comment -

          also, are you adding test case to MAPREDUCE-1585 ? It would be good to have a good test case for the above!

          Show
          Mahadev konar added a comment - also, are you adding test case to MAPREDUCE-1585 ? It would be good to have a good test case for the above!
          Hide
          Rodrigo Schmidt added a comment -

          Yes I am. I just haven't had the time to upload the new patch yet.

          It's a good test case to have.

          Show
          Rodrigo Schmidt added a comment - Yes I am. I just haven't had the time to upload the new patch yet. It's a good test case to have.
          Hide
          Mahadev konar added a comment -

          I just committed this. thanks rodrigo!

          Show
          Mahadev konar added a comment - I just committed this. thanks rodrigo!
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #208 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/208/)
          . Bugs on listStatus for HarFileSystem (rodrigo via mahadev)

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #208 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/208/ ) . Bugs on listStatus for HarFileSystem (rodrigo via mahadev)
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk #285 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/285/)
          . Bugs on listStatus for HarFileSystem (rodrigo via mahadev)

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk #285 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/285/ ) . Bugs on listStatus for HarFileSystem (rodrigo via mahadev)
          Hide
          Tsz Wo Nicholas Sze added a comment -

          It seems that TestHadoopArchives in mapred is failing after this.

          Show
          Tsz Wo Nicholas Sze added a comment - It seems that TestHadoopArchives in mapred is failing after this.
          Hide
          Rodrigo Schmidt added a comment -

          I'm checking it.

          Show
          Rodrigo Schmidt added a comment - I'm checking it.
          Hide
          Rodrigo Schmidt added a comment -

          I think I found the problem.

          My bad! I tested it against my local version of mapreduce, in which I had taken TestHadoopArchives out. I'm reopening the task and will submit a new patch, after I do some extra testing.

          Show
          Rodrigo Schmidt added a comment - I think I found the problem. My bad! I tested it against my local version of mapreduce, in which I had taken TestHadoopArchives out. I'm reopening the task and will submit a new patch, after I do some extra testing.
          Hide
          Rodrigo Schmidt added a comment -

          I didn't cover the case in which the parent directory is "/"

          Show
          Rodrigo Schmidt added a comment - I didn't cover the case in which the parent directory is "/"
          Hide
          Rodrigo Schmidt added a comment -

          This new patch should fix the problem.

          Show
          Rodrigo Schmidt added a comment - This new patch should fix the problem.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12439606/HADOOP-6645.hotfix.patch
          against trunk revision 926421.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 common tests. The patch failed common unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/428/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/428/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/428/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/428/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12439606/HADOOP-6645.hotfix.patch against trunk revision 926421. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 common tests. The patch failed common unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/428/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/428/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/428/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/428/console This message is automatically generated.
          Hide
          Mahadev konar added a comment -

          trying hudson again!

          Show
          Mahadev konar added a comment - trying hudson again!
          Hide
          Mahadev konar added a comment -

          rodrigo,
          the tests still fail for me. Are you sure it fixes the tests for you?

          Show
          Mahadev konar added a comment - rodrigo, the tests still fail for me. Are you sure it fixes the tests for you?
          Hide
          Rodrigo Schmidt added a comment -

          Yes, they passed for me.

          It's a bit tricky to test things from a different hadoop-common jar, though. For ant to take the right file I had to run "ant mvn-install" on my hadoop-common install. Thencleared the jar inside ~/ivy2/cache/... (this is very important), went to my hadoop-mapreduce install and ran "ant test -Dresolvers=internal -Dtestcase=TestHadoopArchives".

          Did you do all that too?

          Show
          Rodrigo Schmidt added a comment - Yes, they passed for me. It's a bit tricky to test things from a different hadoop-common jar, though. For ant to take the right file I had to run "ant mvn-install" on my hadoop-common install. Thencleared the jar inside ~/ivy2/cache/... (this is very important), went to my hadoop-mapreduce install and ran "ant test -Dresolvers=internal -Dtestcase=TestHadoopArchives". Did you do all that too?
          Hide
          Mahadev konar added a comment -

          I tried a short cut of copying hadoop-core.jar from the common build to build/ivy/lib..... looks like ivy does some weird thing where it replaces it with the one fetched from ivy cache. but clean up everything and use maven install seems to work.... I wonder why we have to use ivy for all this (maven all over would have been a better choice)....

          Show
          Mahadev konar added a comment - I tried a short cut of copying hadoop-core.jar from the common build to build/ivy/lib..... looks like ivy does some weird thing where it replaces it with the one fetched from ivy cache. but clean up everything and use maven install seems to work.... I wonder why we have to use ivy for all this (maven all over would have been a better choice)....
          Hide
          Rodrigo Schmidt added a comment -

          Yes, ivy does replace things from its cache. I was running into this problem before I learned the maven tricks with ivy cache deletion.

          Show
          Rodrigo Schmidt added a comment - Yes, ivy does replace things from its cache. I was running into this problem before I learned the maven tricks with ivy cache deletion.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12439606/HADOOP-6645.hotfix.patch
          against trunk revision 926421.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/429/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/429/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/429/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/429/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12439606/HADOOP-6645.hotfix.patch against trunk revision 926421. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/429/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/429/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/429/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/429/console This message is automatically generated.
          Hide
          Rodrigo Schmidt added a comment -

          Mahadev, Nicholas, Can one of you just double check if this new patch passes the TestHadoopArchives test? I've already tested it in my machine but with this whole ivy/maven complexity, I would feel more comfortable if someone else checked it (so that I don't break trunk once more).

          Feel free to commit the patch afterward, if you think it's okay.

          Show
          Rodrigo Schmidt added a comment - Mahadev, Nicholas, Can one of you just double check if this new patch passes the TestHadoopArchives test? I've already tested it in my machine but with this whole ivy/maven complexity, I would feel more comfortable if someone else checked it (so that I don't break trunk once more). Feel free to commit the patch afterward, if you think it's okay.
          Hide
          Mahadev konar added a comment -

          rodrigo,
          I ran the tests on mapreduce with this patch and the test passes!

          Show
          Mahadev konar added a comment - rodrigo, I ran the tests on mapreduce with this patch and the test passes!
          Hide
          Rodrigo Schmidt added a comment -

          Great! Can you commit it?

          Show
          Rodrigo Schmidt added a comment - Great! Can you commit it?
          Hide
          Mahadev konar added a comment -

          I just committed this. thanks rodrigo!

          Show
          Mahadev konar added a comment - I just committed this. thanks rodrigo!
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #209 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/209/)
          . Re: Bugs on listStatus for HarFileSystem (rodrigo via mahadev)

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #209 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/209/ ) . Re: Bugs on listStatus for HarFileSystem (rodrigo via mahadev)
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk #287 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/287/)
          . Re: Bugs on listStatus for HarFileSystem (rodrigo via mahadev)

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk #287 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/287/ ) . Re: Bugs on listStatus for HarFileSystem (rodrigo via mahadev)

            People

            • Assignee:
              Rodrigo Schmidt
              Reporter:
              Rodrigo Schmidt
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development