Hadoop Map/Reduce / MAPREDUCE-1585

Create Hadoop Archives version 2 with filenames URL-encoded

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: harchive
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Hadoop Archives version 1 does not cope with files that have spaces in their names.

      One proposal is to URL-encode filenames inside the index file (version 2; see HADOOP-6591).

      This task is to allow the creation of version 2 archives whose file names are encoded appropriately. It currently depends on HADOOP-6591.
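
      To illustrate why encoding helps, here is a minimal sketch, not taken from any of the attached patches. It assumes an index line made of space-separated fields; the class `HarNameCodec`, its methods, and the `<name> <offset> <length>` layout are hypothetical and only stand in for the real index format. Once the name is URL-encoded it contains no raw spaces, so splitting the line on spaces recovers the fields again.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper, for illustration only; not the HAR implementation.
public class HarNameCodec {

  // URL-encode a file name so that it contains no raw spaces.
  public static String encode(String name) {
    try {
      return URLEncoder.encode(name, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is a required charset on every JVM, so this should not happen.
      throw new IllegalStateException(e);
    }
  }

  // Reverse of encode().
  public static String decode(String encoded) {
    try {
      return URLDecoder.decode(encoded, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }

  // Assumed layout of one index line: "<encoded name> <offset> <length>".
  public static String toIndexLine(String name, long offset, long length) {
    return encode(name) + " " + offset + " " + length;
  }

  // The first space-separated field is the encoded name.
  public static String nameFromIndexLine(String line) {
    return decode(line.split(" ")[0]);
  }

  public static void main(String[] args) {
    String line = toIndexLine("/user/docs/my report.txt", 0, 42);
    System.out.println(line);                    // encoded index line
    System.out.println(nameFromIndexLine(line)); // original name recovered
  }
}
```

      Note that `URLEncoder` turns a space into `+` and also escapes `/`, which is acceptable for this sketch because the decoder reverses both.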

      1. MAPREDUCE-1585.patch
        6 kB
        Rodrigo Schmidt
      2. MAPREDUCE-1585.1.patch
        19 kB
        Rodrigo Schmidt
      3. MAPREDUCE-1585.2.patch
        15 kB
        Rodrigo Schmidt

          Activity

          Rodrigo Schmidt added a comment -

          I've uploaded a patch, but it is dependent on the code I proposed for HDFS-6591.

          Rodrigo Schmidt added a comment -

          I mean HADOOP-6591

          Tsz Wo Nicholas Sze added a comment -

          Rodrigo, please give me some time to first fix MAPREDUCE-1579, which will be committed back to 0.20.

          Rodrigo Schmidt added a comment -

          No problem! I just wanted to give a heads up on what I was planning. I'll upload a new patch once MAPREDUCE-1579 gets committed.

          Rodrigo Schmidt added a comment -

          Attached a new patch, but it cannot be tested until the pending patch for HADOOP-6591 is committed.

          Tsz Wo Nicholas Sze added a comment -

          Question: Should we throw an exception if har.space.replace.enable=true?

          Patch looks good, otherwise.

          Mahadev konar added a comment -

          I am not sure that makes sense. It looks like both the space replacement and the URL encoding are going into the same version (0.22), so it's hard to see folks using har.space.replace == true, since the archives created with this release will have URL encoding in place... no?

          Rodrigo Schmidt added a comment -

          I agree with Mahadev. However, people running Yahoo 0.20 might apply Nicholas' patch for now and eventually migrate to 0.22. In that case, throwing an exception is not a bad idea.
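
          For readers following the discussion, the check being debated could look roughly like the sketch below. It is an illustration only and is not taken from any attached patch; the thread does not say whether the committed code does this. `java.util.Properties` stands in for the Hadoop configuration object, and the class name, method name, and exception message are hypothetical. Only the key `har.space.replace.enable` comes from the comments above.

```java
import java.util.Properties;

// Hypothetical sketch of rejecting the old space-replacement option.
public class SpaceReplaceCheck {

  // Key quoted in the discussion above.
  static final String SPACE_REPLACE_KEY = "har.space.replace.enable";

  // Throws if the configuration still asks for space replacement.
  public static void rejectSpaceReplacement(Properties conf) {
    boolean enabled =
        Boolean.parseBoolean(conf.getProperty(SPACE_REPLACE_KEY, "false"));
    if (enabled) {
      throw new IllegalArgumentException(
          SPACE_REPLACE_KEY + "=true is not supported when file names are URL-encoded");
    }
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty(SPACE_REPLACE_KEY, "true");
    try {
      rejectSpaceReplacement(conf);
    } catch (IllegalArgumentException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

          Failing fast like this would tell someone migrating from a patched 0.20 setup that the option no longer has any effect, instead of silently ignoring it.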

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12438889/MAPREDUCE-1585.1.patch
          against trunk revision 923907.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/34/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/34/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/34/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/34/console

          This message is automatically generated.

          Rodrigo Schmidt added a comment -

          I'm checking the failed unit tests.

          Tsz Wo Nicholas Sze added a comment -

          TestHadoopArchives contains some useful test cases. How about changing it to work with the new version rather than removing it?

          Rodrigo Schmidt added a comment -

          You definitely have a point there after the last problems with HADOOP-6591. I'll change the patch.

          Mahadev konar added a comment -

          Rodrigo,
          will you be uploading an updated patch?

          Rodrigo Schmidt added a comment -

          I'm sorry, I've been busy with other things and haven't had time to update the unit tests. I'll definitely do it this weekend.

          Rodrigo Schmidt added a comment -

          This new patch adds new test cases to cover the problems found in HADOOP-6645 and HADOOP-6591.

          It also keeps TestHadoopArchives.java, and only changes it so that there are no more tests with space replacement for invalid characters, since space replacement is removed by this patch.

          Rodrigo Schmidt added a comment -

          Trying Hudson again!

          Mahadev konar added a comment -

          +1 the patch looks good...

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12440747/MAPREDUCE-1585.2.patch
          against trunk revision 930423.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/92/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/92/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/92/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/92/console

          This message is automatically generated.

          Rodrigo Schmidt added a comment -

          I looked at the test output and it seems to be unrelated to this patch (java.lang.NoClassDefFoundError: org/apache/hadoop/metrics/jvm/JvmMetrics).

          Mahadev, what do you think? Can we commit it?

          Thanks,
          Rodrigo

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12440747/MAPREDUCE-1585.2.patch
          against trunk revision 930423.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/93/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/93/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/93/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/93/console

          This message is automatically generated.

          Mahadev konar added a comment -

          Looks like Hudson finally +1'ed it... I'll go ahead and commit it...

          Mahadev konar added a comment -

          I just committed this. Thanks, Rodrigo!

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #301 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/301/)
          MAPREDUCE-1585. Create Hadoop Archives version 2 with filenames URL-encoded (rodrigo via mahadev)

          Rodrigo Schmidt added a comment -

          Great! Thanks, Mahadev!

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #280 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/280/)


            People

            • Assignee: Rodrigo Schmidt
            • Reporter: Rodrigo Schmidt
            • Votes: 0
            • Watchers: 5