  1. Hadoop HDFS
  2. HDFS-11048

Audit Log should escape control characters

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha2
    • Component/s: None
    • Labels: None
    • Target Version/s:
    • Hadoop Flags: Incompatible change, Reviewed
    • Release Note:
      HDFS audit logs are formatted as individual lines, each consisting of several key-value fields. Some of the values come from the client request (e.g. src, dst). Before this patch, control characters such as \t and \n were not escaped in audit logs. Such characters could break a line unexpectedly, introduce an additional tab character within a field, or in the worst case both. Tools that parse audit logs had to handle this case carefully. After this patch, control characters in the src/dst fields are escaped.

      Description

      Allowing control characters without escaping them allows for spoofing audit log entries at worst and accidentally breaking log parsing at best.
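
      To make the spoofing risk concrete, here is a minimal sketch. The line layout and the helpers formatAuditLine, escapeControl, and lines are hypothetical stand-ins, not the NameNode's actual audit logger; the sketch only assumes one tab-separated key=value entry per line.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the actual FSNamesystem audit logger.
final class AuditForgeryDemo {

  // Builds a simplified audit line from tab-separated key=value fields.
  static String formatAuditLine(String user, String cmd, String src) {
    return "ugi=" + user + "\tcmd=" + cmd + "\tsrc=" + src;
  }

  // Replaces newline, carriage return, and tab with visible two-character escapes.
  static String escapeControl(String s) {
    return s.replace("\n", "\\n").replace("\r", "\\r").replace("\t", "\\t");
  }

  // Splits log output into entries the way a line-oriented parser would.
  static List<String> lines(String log) {
    return Arrays.asList(log.split("\n"));
  }

  public static void main(String[] args) {
    // A client-chosen path that embeds a newline followed by a fake entry.
    String evil = "/tmp/x\nugi=admin\tcmd=delete\tsrc=/important";

    String raw = formatAuditLine("mallory", "create", evil);
    String safe = formatAuditLine("mallory", "create", escapeControl(evil));

    System.out.println(lines(raw).size());   // 2: the parser sees a forged second entry
    System.out.println(lines(safe).size());  // 1: the path stays inside a single entry
  }
}
```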

      1. HDFS-11048.001.patch
        4 kB
        Eric Badger
      2. HDFS-11048.002.patch
        2 kB
        Eric Badger

        Activity

        ebadger Eric Badger added a comment -

        Attaching a patch to escape paths in the audit log

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote  Subsystem   Runtime   Comment
        0     reexec      0m 17s    Docker mode activated.
        +1    @author     0m 0s     The patch does not contain any @author tags.
        +1    test4tests  0m 0s     The patch appears to include 1 new or modified test files.
        0     mvndep      0m 14s    Maven dependency ordering for branch
        +1    mvninstall  7m 22s    trunk passed
        +1    compile     7m 13s    trunk passed
        +1    checkstyle  1m 38s    trunk passed
        +1    mvnsite     1m 58s    trunk passed
        +1    mvneclipse  0m 27s    trunk passed
        +1    findbugs    3m 24s    trunk passed
        +1    javadoc     1m 27s    trunk passed
        0     mvndep      0m 17s    Maven dependency ordering for patch
        +1    mvninstall  1m 24s    the patch passed
        +1    compile     6m 55s    the patch passed
        +1    javac       6m 55s    the patch passed
        -0    checkstyle  1m 39s    root: The patch generated 2 new + 317 unchanged - 0 fixed = 319 total (was 317)
        +1    mvnsite     1m 59s    the patch passed
        +1    mvneclipse  0m 27s    the patch passed
        +1    whitespace  0m 0s     The patch has no whitespace issues.
        +1    findbugs    3m 34s    the patch passed
        +1    javadoc     1m 26s    the patch passed
        +1    unit        7m 40s    hadoop-common in the patch passed.
        -1    unit        62m 49s   hadoop-hdfs in the patch failed.
        +1    asflicense  0m 23s    The patch does not generate ASF License warnings.
                          113m 28s



        Reason Tests
        Failed junit tests hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock
          hadoop.hdfs.server.namenode.TestCacheDirectives



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:9560f25
        JIRA Issue HDFS-11048
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12835002/HDFS-11048.001.patch
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux ef4353183e2e 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / a1a0281
        Default Java 1.8.0_101
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/17264/artifact/patchprocess/diff-checkstyle-root.txt
        unit https://builds.apache.org/job/PreCommit-HDFS-Build/17264/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
        Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/17264/testReport/
        modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs U: .
        Console output https://builds.apache.org/job/PreCommit-HDFS-Build/17264/console
        Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        ebadger Eric Badger added a comment -

        The 2 test failures are unrelated to this patch and do not fail locally for me.

        aw Allen Wittenauer added a comment -

        Marking this as an incompatible change, since users who compare audit log entries from before and after the patch will see mismatched data.

        liuml07 Mingliang Liu added a comment -

        +1 for the proposal.

        public static boolean containsNonPrintableChar(String s1) {
          Pattern regex = Pattern.compile("\\p{C}");
          return regex.matcher(s1).find();
        }
        

        do we better pre-compile this?
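
        For illustration, pre-compiling could look like the sketch below. The class name NonPrintableCheck and the constant name NON_PRINTABLE are hypothetical; only the regex and the method name come from the snippet above.

```java
import java.util.regex.Pattern;

// Sketch of the pre-compilation suggestion; class and field names are hypothetical.
final class NonPrintableCheck {

  // Compiled once at class load. Pattern instances are immutable and safe to
  // share across threads, so a static final field avoids recompiling per call.
  private static final Pattern NON_PRINTABLE = Pattern.compile("\\p{C}");

  static boolean containsNonPrintableChar(String s) {
    return NON_PRINTABLE.matcher(s).find();
  }

  public static void main(String[] args) {
    System.out.println(containsNonPrintableChar("/user/alice/data"));    // false
    System.out.println(containsNonPrintableChar("/user/alice/da\tta"));  // true
  }
}
```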

        ebadger Eric Badger added a comment -

        do we better pre-compile this?

        Good catch, Mingliang Liu. After talking offline with Daryn Sharp, we figured it would be better to go ahead and do the full string replacement each time we log, since we're paying basically the same (if not more) penalty by doing the regex pattern matching. The initial patch was an attempt to avoid having to copy the entire string every time we log. Attaching a new patch that does the string replacement upfront instead of doing the check first. If you have suggestions on a more efficient way of handling this, I welcome your feedback.
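
        One possible direction for the efficiency question, sketched under assumptions: a single pass that needs no regex and copies the string only when a control character is actually present. LazyEscaper and its \uXXXX output format are hypothetical and are not what the attached patch does; per this thread the patch relies on StringEscapeUtils.

```java
// Hypothetical alternative sketch; not the approach taken in the attached patch.
final class LazyEscaper {

  // Returns the same String instance when nothing needs escaping, and
  // allocates a StringBuilder only after the first control character is seen.
  // Backslashes are left untouched in this sketch.
  static String escapeControlChars(String s) {
    StringBuilder out = null;
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (Character.isISOControl(c)) {
        if (out == null) {
          // Copy the clean prefix once, then continue appending.
          out = new StringBuilder(s.length() + 8).append(s, 0, i);
        }
        out.append(String.format("\\u%04X", (int) c));
      } else if (out != null) {
        out.append(c);
      }
    }
    return out == null ? s : out.toString();
  }
}
```

        The common case (a path with no control characters) then costs one scan and no allocation.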

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote  Subsystem   Runtime   Comment
        0     reexec      0m 14s    Docker mode activated.
        +1    @author     0m 0s     The patch does not contain any @author tags.
        +1    test4tests  0m 0s     The patch appears to include 1 new or modified test files.
        +1    mvninstall  8m 6s     trunk passed
        +1    compile     0m 46s    trunk passed
        +1    checkstyle  0m 29s    trunk passed
        +1    mvnsite     0m 54s    trunk passed
        +1    mvneclipse  0m 12s    trunk passed
        +1    findbugs    1m 44s    trunk passed
        +1    javadoc     0m 39s    trunk passed
        +1    mvninstall  0m 45s    the patch passed
        +1    compile     0m 41s    the patch passed
        +1    javac       0m 41s    the patch passed
        -0    checkstyle  0m 26s    hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 232 unchanged - 0 fixed = 233 total (was 232)
        +1    mvnsite     0m 49s    the patch passed
        +1    mvneclipse  0m 9s     the patch passed
        +1    whitespace  0m 0s     The patch has no whitespace issues.
        +1    findbugs    1m 47s    the patch passed
        +1    javadoc     0m 36s    the patch passed
        -1    unit        68m 58s   hadoop-hdfs in the patch failed.
        +1    asflicense  0m 19s    The patch does not generate ASF License warnings.
                          88m 46s



        Reason Tests
        Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeUUID
          hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations
          hadoop.hdfs.TestDecommission
          hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
          hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:9560f25
        JIRA Issue HDFS-11048
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12835598/HDFS-11048.002.patch
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 253568a45f68 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / 0c837db
        Default Java 1.8.0_101
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/17325/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
        unit https://builds.apache.org/job/PreCommit-HDFS-Build/17325/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
        Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/17325/testReport/
        modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
        Console output https://builds.apache.org/job/PreCommit-HDFS-Build/17325/console
        Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        liuml07 Mingliang Liu added a comment -

        +1 Will commit this in 24 hours if no more comments. I'll take care of the trivial checkstyle warning when committing.

        I think this will not get into branch-2 (which is fine to me) as its incompatible change might break existing tools?

        daryn Daryn Sharp added a comment -

        From a purist perspective, it's "incompatible". The question IMHO for branch-2 is whether the scope and impact of the incompatibility is small enough and whether the benefit outweighs the cost. I think the answer to both is yes. The reason for this patch is that a user started creating files with newlines, which broke/confused an internal log ingestion tool. So some tools will "unbreak" while others that couldn't possibly have handled it right "might break". If you think about it, it might be hard to do, but w/o this patch you can forge audit entries.

        liuml07 Mingliang Liu added a comment -

        This makes sense to me. I just wanted to open the discussion sooner rather than later. Will commit to branch-2 as well if there are no objections.

        aw Allen Wittenauer added a comment - - edited

        IMHO, even though it's incompatible, it probably should go into a branch-2 minor (so 2.8.0, not 2.7.x) with the caveat that the release note needs to be very explicit about what happens pre- and post- patch so that teams have an idea of what to expect. e.g., "real world" examples

        hudson Hudson added a comment -

        SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10723 (See https://builds.apache.org/job/Hadoop-trunk-Commit/10723/)
        HDFS-11048. Audit Log should escape control characters. Contributed by (liuml07: rev 8a9388e5f6d622152798aaaa256064919e4fec37)

        • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
        • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogs.java
        liuml07 Mingliang Liu added a comment -

        Committed to trunk through branch-2.8 branches. Thanks for the contribution Eric Badger. Thanks for the discussion Allen Wittenauer and Daryn Sharp.

        ebadger Eric Badger added a comment -

        Thanks Mingliang Liu, Allen Wittenauer, Daryn Sharp!
        aw Allen Wittenauer added a comment -

        What happens if the filename has a backslash in it?

        ebadger Eric Badger added a comment -

        What happens if the filename has a backslash in it?

        The backslash will be escaped and printed as a single backslash.

        aw Allen Wittenauer added a comment -

        So in the log it will be "\\" or "\"?

        ebadger Eric Badger added a comment -

        All backslashes in the input will be printed in the audit log as actual backslashes, because they will be escaped by StringEscapeUtils and replaced with double backslashes. So when they are actually printed, the double backslash will be escaped and you will see a single backslash. All control characters such as "\r" and "\n" will also be escaped and printed in their escaped form.

        You can walk through the TestAuditLogs#testAuditCharacterEscape test in a debugger to see how the backslashes are escaped using StringEscapeUtils.escapeJavaStyleString()

        ebadger Eric Badger added a comment -

        Oops, never actually answered your question. An input of "\" would be printed as "\" in the audit log.

        aw Allen Wittenauer added a comment -

        OK, that's what I thought. We probably need to print that as a double backslash to avoid the ambiguity. e.g., does '\thisfile' begin with a tab or does it begin with a backslash?

        ebadger Eric Badger added a comment -

        e.g., does '\thisfile' begin with a tab or does it begin with a backslash?

        '\thisfile' would begin with a backslash.

        I'm not sure I understand what you mean about the ambiguity. I can think of one pretty contrived case where I think this might cause less than ideal behavior. If you had a file that started with a tab followed by "hisfile", it would be printed as "\thisfile" in the audit log. However, if you had a file called "\thisfile" (where the \t are 2 separate ascii chars), it would also be printed in the audit log as "\thisfile".
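
        The collision described here can be reproduced with a small sketch. The escaper below is a hypothetical model of the behavior stated in this thread (control characters become two-character escapes, backslashes pass through unchanged); it is not the patch's code.

```java
// Hypothetical model of the behavior described in the thread, for illustration only.
final class AmbiguityDemo {

  // Escapes tab, newline, and carriage return, but leaves backslashes alone.
  static String escapeWithoutBackslash(String s) {
    return s.replace("\t", "\\t").replace("\n", "\\n").replace("\r", "\\r");
  }

  public static void main(String[] args) {
    String tabName = "\thisfile";         // TAB followed by "hisfile" (8 chars)
    String backslashName = "\\thisfile";  // backslash, 't', then "hisfile" (9 chars)

    String a = escapeWithoutBackslash(tabName);
    String b = escapeWithoutBackslash(backslashName);

    System.out.println(a.equals(b)); // true: two different names, one log rendering
  }
}
```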

        aw Allen Wittenauer added a comment -

        I can think of one pretty contrived case where I think this might cause less than ideal behavior.

        That's my point. I'm looking at this from the point of view of what is in the log. "\thisfile" is ambiguous. That's super bad.

        ebadger Eric Badger added a comment -

        What do you propose to fix that? Changing the single backslash to a double backslash just moves the problem instead of fixing it. Instead of 'tab + "hisfile"' being the same as '\thisfile', 'tab + "hisfile"' would be the same as '\\thisfile'.

        aw Allen Wittenauer added a comment -

        It's a pretty standard practice to escape the escape character. But I can't help thinking that, instead of using backslash escaping, this patch would have been better off using URI escaping, to match what happens elsewhere in Apache Hadoop.
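
        As a rough illustration of the URI-style alternative, the sketch below percent-encodes only ASCII control characters and the '%' character itself. PercentEscapeDemo is hypothetical and deliberately reduced; it is neither full URI encoding nor Hadoop's own URI handling.

```java
// Hypothetical, reduced percent-encoding sketch; not Hadoop's URI handling.
final class PercentEscapeDemo {

  // Encodes ASCII control characters and '%' as %XX; everything else is kept.
  static String percentEscape(String s) {
    StringBuilder out = new StringBuilder(s.length());
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c < 0x20 || c == 0x7F || c == '%') {
        out.append(String.format("%%%02X", (int) c));
      } else {
        out.append(c);
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(percentEscape("\thisfile"));   // %09hisfile
    System.out.println(percentEscape("\\thisfile"));  // \thisfile
    System.out.println(percentEscape("100%"));        // 100%25
  }
}
```

        Because the escape character '%' is itself encoded, the tab-prefixed name and the backslash-prefixed name render differently.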

        ebadger Eric Badger added a comment -

        Using URI escaping wouldn't be great because it would make more paths look weird, while only giving benefit to this small use-case. I think the best solution would be to replace control characters with their escaped equivalents (e.g. tab becomes '\t', newline becomes '\n', etc.) and escape backslashes with a double backslash (e.g. '\' becomes '\\'). However, this would require creating a new library to do the escaping since we can't touch StringEscapeUtils.
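
        A minimal sketch of the scheme proposed here, assuming a small hand-written escaper. BackslashEscaper is a hypothetical name and is not part of the committed patch; the \uXXXX fallback for other control characters is an added assumption.

```java
// Hypothetical sketch of the proposed scheme; not part of the committed patch.
final class BackslashEscaper {

  // Escapes the escape character itself plus common control characters,
  // so every distinct input maps to a distinct log rendering.
  static String escape(String s) {
    StringBuilder out = new StringBuilder(s.length() + 8);
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      switch (c) {
        case '\\': out.append("\\\\"); break;
        case '\t': out.append("\\t"); break;
        case '\n': out.append("\\n"); break;
        case '\r': out.append("\\r"); break;
        default:
          if (Character.isISOControl(c)) {
            // Assumed fallback for the remaining control characters.
            out.append(String.format("\\u%04X", (int) c));
          } else {
            out.append(c);
          }
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(escape("\thisfile"));   // \thisfile
    System.out.println(escape("\\thisfile"));  // \\thisfile
  }
}
```

        With the backslash doubled, the two names from the earlier example no longer collide in the log.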


          People

          • Assignee: ebadger Eric Badger
          • Reporter: ebadger Eric Badger
          • Votes: 0
          • Watchers: 9
