Hadoop Common
  1. Hadoop Common
  2. HADOOP-2626

RawLocalFileStatus is badly handling URIs

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.15.2
    • Fix Version/s: 0.16.0
    • Component/s: fs
    • Labels:
      None

      Description

      as a result, files with special characters (that get encoded when translated to URIs) are badly handled using a local filesystem.

      new Path(f.toURI().toString())) should be replaced by new Path(f.toURI().getPath()))

      IMHO, each call to toURI().toString() should be considered suspicious. There's another one in the class CopyFiles at line 641.

      1. HADOOP-2626.patch
        1 kB
        Thomas Friol
      2. HADOOP-2626-1.patch
        3 kB
        Doug Cutting

        Activity

        Hide
        Thomas Friol added a comment -

        Here is a patch.

        Show
        Thomas Friol added a comment - Here is a patch.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12373296/patch-Hadoop-2626.diff
        against trunk revision r612614.

        @author +1. The patch does not contain any @author tags.

        patch -1. The patch command could not apply the patch.

        Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1617/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12373296/patch-Hadoop-2626.diff against trunk revision r612614. @author +1. The patch does not contain any @author tags. patch -1. The patch command could not apply the patch. Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1617/console This message is automatically generated.
        Hide
        Doug Cutting added a comment -

        This patch puts an unqualified path in the returned FileStatus. That's not strictly a bug, but we've found that it's always safest to return fully-qualified paths whenever we can.

        To convert the java.io.File to a Path, we might use new Path(file.getPath()).makeQualified(fs). Perhaps this should be added as a fileToPath method, since there's already a path2File() method.

        Show
        Doug Cutting added a comment - This patch puts an unqualified path in the returned FileStatus. That's not strictly a bug, but we've found that it's always safest to return fully-qualified paths whenever we can. To convert the java.io.File to a Path, we might use new Path(file.getPath()).makeQualified(fs). Perhaps this should be added as a fileToPath method, since there's already a path2File() method.
        Hide
        Thomas Friol added a comment -

        What about this patch then ?

        Show
        Thomas Friol added a comment - What about this patch then ?
        Hide
        Edward J. Yoon added a comment - - edited

        If local file included a '%' character in a file name, we can't copy to dfs becuase FileStatus.getPath() returns urlencoded '%25'.
        IMO, I think it is a bug.

        Show
        Edward J. Yoon added a comment - - edited If local file included a '%' character in a file name, we can't copy to dfs becuase FileStatus.getPath() returns urlencoded '%25'. IMO, I think it is a bug.
        Hide
        Edward J. Yoon added a comment -

        For example :

        FileStatus fileStatus = new RawLocalFileStatus(new File("udanax/15.7%.dat"), getDefaultBlockSize());
        LOG.info(fileStatus.getPath()) // It will log the 'file:/root/workspace/hadoop/udanax/15.7%25.dat' string.

        Show
        Edward J. Yoon added a comment - For example : FileStatus fileStatus = new RawLocalFileStatus(new File("udanax/15.7%.dat"), getDefaultBlockSize()); LOG.info(fileStatus.getPath()) // It will log the 'file:/root/workspace/hadoop/udanax/15.7%25.dat' string.
        Hide
        Nigel Daley added a comment -

        Please include a test case.

        Show
        Nigel Daley added a comment - Please include a test case.
        Hide
        Doug Cutting added a comment -

        > What about this patch then ?

        That looks better to me, in that the returned Path is now fully qualified. Does it handle escapes any better than before? If not, 'new Path(file.toUri().getPath()).makeQualified(fs)' may do better.

        As Nigel indicates, some test cases would be very useful.

        Show
        Doug Cutting added a comment - > What about this patch then ? That looks better to me, in that the returned Path is now fully qualified. Does it handle escapes any better than before? If not, 'new Path(file.toUri().getPath()).makeQualified(fs)' may do better. As Nigel indicates, some test cases would be very useful.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12373373/HADOOP-2626.patch
        against trunk revision r613022.

        @author +1. The patch does not contain any @author tags.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new compiler warnings.

        findbugs +1. The patch does not introduce any new Findbugs warnings.

        core tests +1. The patch passed core unit tests.

        contrib tests -1. The patch failed contrib unit tests.

        Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1631/testReport/
        Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1631/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1631/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1631/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12373373/HADOOP-2626.patch against trunk revision r613022. @author +1. The patch does not contain any @author tags. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new compiler warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests -1. The patch failed contrib unit tests. Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1631/testReport/ Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1631/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1631/artifact/trunk/build/test/checkstyle-errors.html Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1631/console This message is automatically generated.
        Hide
        Thomas Friol added a comment -

        Well, it has at least fix our problem. So it seems to be ok.
        Will try add unit tests soon but if someone has time to do it now, please, do not hesitate.

        BTW, the patch is failing in hudson because of a failing unit test which seems not to be related to this modification.

        Show
        Thomas Friol added a comment - Well, it has at least fix our problem. So it seems to be ok. Will try add unit tests soon but if someone has time to do it now, please, do not hesitate. BTW, the patch is failing in hudson because of a failing unit test which seems not to be related to this modification.
        Hide
        Doug Cutting added a comment -

        Here's a new version that includes a test case. The test fails without the patch and succeeds with it.

        Show
        Doug Cutting added a comment - Here's a new version that includes a test case. The test fails without the patch and succeeds with it.
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12373981/HADOOP-2626-1.patch
        against trunk revision 614721.

        @author +1. The patch does not contain any @author tags.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new compiler warnings.

        findbugs +1. The patch does not introduce any new Findbugs warnings.

        core tests +1. The patch passed core unit tests.

        contrib tests +1. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1674/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1674/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1674/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1674/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12373981/HADOOP-2626-1.patch against trunk revision 614721. @author +1. The patch does not contain any @author tags. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new compiler warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1674/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1674/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1674/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1674/console This message is automatically generated.
        Hide
        Doug Cutting added a comment -

        I just committed this. Thanks, Thomas.

        Show
        Doug Cutting added a comment - I just committed this. Thanks, Thomas.
        Hide
        Hudson added a comment -
        Show
        Hudson added a comment - Integrated in Hadoop-trunk #380 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/380/ )

          People

          • Assignee:
            Doug Cutting
            Reporter:
            Frédéric Bertin
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development