Hadoop Map/Reduce: MAPREDUCE-1585

Create Hadoop Archives version 2 with filenames URL-encoded

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: harchive
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Hadoop Archives version 1 does not cope with files that have spaces in their names.

      One proposal is to URL-encode filenames inside the index file (version 2; see HADOOP-6591).

      This task is to allow the creation of version 2 files that have file names encoded appropriately. It currently depends on HADOOP-6591.
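
      To illustrate the idea, here is a minimal sketch of encoding a name before it is written to the index and decoding it when it is read back. It is not taken from the attached patches: the class and method names are hypothetical, and it assumes the index treats spaces as field separators, which is why a raw space in a name is a problem. It uses only java.net.URLEncoder and java.net.URLDecoder from the standard library.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

/** Hypothetical helper for illustration only; not part of the actual patch. */
public class HarNameCodec {

    /** Encodes a file name so that it contains no raw spaces. */
    public static String encode(String name) {
        try {
            // URLEncoder turns a space into '+' and escapes other
            // special characters as %XX sequences.
            return URLEncoder.encode(name, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encode(), recovering the original file name. */
    public static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "my file.txt";
        String stored = encode(original);
        System.out.println(stored);                           // encoded form, no raw space
        System.out.println(decode(stored).equals(original));  // round trip check
    }
}
```

      Because a literal '+' in a name is itself escaped (as %2B), decoding does not confuse it with an encoded space, so the round trip is lossless.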

      1. MAPREDUCE-1585.2.patch
        15 kB
        Rodrigo Schmidt
      2. MAPREDUCE-1585.1.patch
        19 kB
        Rodrigo Schmidt
      3. MAPREDUCE-1585.patch
        6 kB
        Rodrigo Schmidt

          Activity

          Rodrigo Schmidt created issue -
          Rodrigo Schmidt made changes -
          Field Original Value New Value
          Attachment MAPREDUCE-1585.patch [ 12438348 ]
          Rodrigo Schmidt added a comment -

          I've uploaded a patch, but it is dependent on the code I proposed for HDFS-6591.

          Rodrigo Schmidt made changes -
          Link This issue is blocked by HADOOP-6591 [ HADOOP-6591 ]
          Rodrigo Schmidt made changes -
          Link This issue is related to MAPREDUCE-1579 [ MAPREDUCE-1579 ]
          Rodrigo Schmidt added a comment -

          I mean HADOOP-6591

          Rodrigo Schmidt made changes -
          Description Both references to HDFS-6591 changed to HADOOP-6591; the text is otherwise unchanged.
          Tsz Wo Nicholas Sze added a comment -

          Rodrigo, please give me some time to first fix MAPREDUCE-1579, which will be committed back to 0.20.

          Rodrigo Schmidt added a comment -

          No problem! I just wanted to give a heads up on what I was planning. I'll upload a new patch once MAPREDUCE-1579 gets committed.

          Mahadev konar made changes -
          Fix Version/s 0.22.0 [ 12314184 ]
          Rodrigo Schmidt made changes -
          Attachment MAPREDUCE-1585.1.patch [ 12438889 ]
          Rodrigo Schmidt added a comment -

          Attached a new patch, but it cannot be tested until the pending patch for HADOOP-6591 is committed.

          Tsz Wo Nicholas Sze added a comment -

          Question: Should we throw an exception if har.space.replace.enable=true?

          Patch looks good, otherwise.

          Mahadev konar added a comment -

          I am not sure that makes sense. It looks like both the space replacement and the URL encoding are going into the same version (0.22). So it's hard to see whether folks will be using har.space.replace == true, since the archives created with this release will have URL encoding in place... no?

          Rodrigo Schmidt made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Rodrigo Schmidt added a comment -

          I agree with Mahadev. However, people running Yahoo 0.20 might apply Nicholas' patch for now and eventually migrate to 0.22. In that case, throwing an exception is not a bad idea.

          Rodrigo Schmidt made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Rodrigo Schmidt made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12438889/MAPREDUCE-1585.1.patch
          against trunk revision 923907.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/34/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/34/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/34/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/34/console

          This message is automatically generated.

          Rodrigo Schmidt added a comment -

          I'm checking the failed unit tests.

          Rodrigo Schmidt made changes -
          Link This issue is blocked by HADOOP-6645 [ HADOOP-6645 ]
          Tsz Wo Nicholas Sze added a comment -

          TestHadoopArchives contains some useful test cases. How about changing it to work with the new version instead of removing it?

          Rodrigo Schmidt added a comment -

          You definitely have a point there after the last problems with HADOOP-6591. I'll change the patch.

          Mahadev konar added a comment -

          Rodrigo,
          will you be uploading an updated patch?

          Rodrigo Schmidt added a comment -

          I'm sorry, I've been busy with other things and haven't had time to update the unit tests. I'll definitely do it this weekend.

          Rodrigo Schmidt added a comment -

          This new patch adds new test cases to cover the problems found in HADOOP-6645 and HADOOP-6591.

          It also keeps TestHadoopArchives.java, and only changes it so that there are no more tests with space replacement for invalid characters, since space replacement is removed by this patch.

          Rodrigo Schmidt made changes -
          Attachment MAPREDUCE-1585.2.patch [ 12440745 ]
          Rodrigo Schmidt made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Rodrigo Schmidt made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Rodrigo Schmidt made changes -
          Attachment MAPREDUCE-1585.2.patch [ 12440745 ]
          Rodrigo Schmidt made changes -
          Attachment MAPREDUCE-1585.2.patch [ 12440747 ]
          Rodrigo Schmidt made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Rodrigo Schmidt made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Rodrigo Schmidt made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Rodrigo Schmidt added a comment -

          Trying Hudson again!

          Rodrigo Schmidt made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Mahadev konar added a comment -

          +1 the patch looks good...

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12440747/MAPREDUCE-1585.2.patch
          against trunk revision 930423.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/92/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/92/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/92/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/92/console

          This message is automatically generated.

          Rodrigo Schmidt added a comment -

          I looked at the test output and it seems to be unrelated to this patch (java.lang.NoClassDefFoundError: org/apache/hadoop/metrics/jvm/JvmMetrics).

          Mahadev, what do you think? Can we commit it?

          Thanks,
          Rodrigo

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12440747/MAPREDUCE-1585.2.patch
          against trunk revision 930423.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/93/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/93/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/93/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/93/console

          This message is automatically generated.

          Mahadev konar added a comment -

          Looks like Hudson finally +1'ed it... I'll go ahead and commit it...

          Mahadev konar added a comment -

          I just committed this. Thanks, Rodrigo!

          Mahadev konar made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Resolution Fixed [ 1 ]
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #301 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/301/)
          Create Hadoop Archives version 2 with filenames URL-encoded (rodrigo via mahadev)

          Rodrigo Schmidt added a comment -

          Great! Thanks, Mahadev!

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #280 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/280/)

          Tom White made changes -
          Fix Version/s 0.21.0 [ 12314045 ]
          Fix Version/s 0.22.0 [ 12314184 ]
          Tom White made changes -
          Status Resolved [ 5 ] Closed [ 6 ]

            People

            • Assignee: Rodrigo Schmidt
            • Reporter: Rodrigo Schmidt
            • Votes: 0
            • Watchers: 5
