Uploaded image for project: 'Hadoop Common'
  1. Hadoop Common
  2. HADOOP-13079

Add -q option to Ls to print ? instead of non-printable characters

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: None
    • Labels:
      None

      Description

      Add option -q to "hdfs dfs -ls" to print non-printable characters as "?". Non-printable characters are defined by isprint(3) according to the current locale.

      Default to -q behavior on terminal; otherwise, print raw characters. See the difference in these 2 command lines:

      • hadoop fs -ls /dir
      • hadoop fs -ls /dir | od -c

      In C, isatty(STDOUT_FILENO) is used to find out whether the output is a terminal. Since Java doesn't have isatty, I will use JNI to call C isatty() because the closest test System.console() == null does not work in some cases.

      1. HADOOP-13079.001.patch
        12 kB
        John Zhuge
      2. HADOOP-13079.002.patch
        17 kB
        John Zhuge
      3. HADOOP-13079.003.patch
        18 kB
        John Zhuge
      4. HADOOP-13079.004.patch
        18 kB
        John Zhuge

        Issue Links

          Activity

          Hide
          stevel@apache.org Steve Loughran added a comment -

          Test doesn't work on Windows I'm afraid; HADOOP-14199 covers it. Probably simplest just to skip the test entirely there. Also looking at the test file, it should be using GenericTestUtils.getTempPath() to get the temp dir

          Show
          stevel@apache.org Steve Loughran added a comment - Test doesn't work on Windows I'm afraid; HADOOP-14199 covers it. Probably simplest just to skip the test entirely there. Also looking at the test file, it should be using GenericTestUtils.getTempPath() to get the temp dir
          Hide
          jzhuge John Zhuge added a comment -

          Thanks Andrew Wang for review and commit. Thanks Allen Wittenauer and Colin P. McCabe for review and comments.

          Show
          jzhuge John Zhuge added a comment - Thanks Andrew Wang for review and commit. Thanks Allen Wittenauer and Colin P. McCabe for review and comments.
          Hide
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #9954 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9954/)
          HADOOP-13079. Add -q option to Ls to print ? instead of non-printable (wang: rev 0accc3306d830c3f2b16c4b8abf68729c7aba6cb)

          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/package-info.java
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestPrintableString.java
          • hadoop-common-project/hadoop-common/src/test/resources/testConf.xml
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PrintableString.java
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestFsShellList.java
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Ls.java
          • hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md
          Show
          hudson Hudson added a comment - SUCCESS: Integrated in Hadoop-trunk-Commit #9954 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9954/ ) HADOOP-13079 . Add -q option to Ls to print ? instead of non-printable (wang: rev 0accc3306d830c3f2b16c4b8abf68729c7aba6cb) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/package-info.java hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestPrintableString.java hadoop-common-project/hadoop-common/src/test/resources/testConf.xml hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/PrintableString.java hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestFsShellList.java hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Ls.java hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md
          Hide
          andrew.wang Andrew Wang added a comment -

          Committed back through branch-2.8. Thanks John for pushing on this one, and everyone who helped with reviews.

          Show
          andrew.wang Andrew Wang added a comment - Committed back through branch-2.8. Thanks John for pushing on this one, and everyone who helped with reviews.
          Hide
          andrew.wang Andrew Wang added a comment -

          LGTM will commit shortly

          Show
          andrew.wang Andrew Wang added a comment - LGTM will commit shortly
          Hide
          jzhuge John Zhuge added a comment -

          Hi Andrew Wang and Allen Wittenauer, could you please continue to review patch 004? Thanks.

          Show
          jzhuge John Zhuge added a comment - Hi Andrew Wang and Allen Wittenauer , could you please continue to review patch 004? Thanks.
          Hide
          jzhuge John Zhuge added a comment -

          Unit test failure is not related:

          TestGangliaMetrics.testGangliaMetrics2:139->checkMetrics:161 Missing metrics: test.s1rec.Xxx
          
          Show
          jzhuge John Zhuge added a comment - Unit test failure is not related: TestGangliaMetrics.testGangliaMetrics2:139->checkMetrics:161 Missing metrics: test.s1rec.Xxx
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 27s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 4 new or modified test files.
          +1 mvninstall 6m 35s trunk passed
          +1 compile 6m 54s trunk passed
          +1 checkstyle 0m 24s trunk passed
          +1 mvnsite 1m 3s trunk passed
          +1 mvneclipse 0m 11s trunk passed
          +1 findbugs 1m 22s trunk passed
          +1 javadoc 0m 53s trunk passed
          +1 mvninstall 0m 39s the patch passed
          +1 compile 6m 56s the patch passed
          +1 javac 6m 56s the patch passed
          +1 checkstyle 0m 22s the patch passed
          +1 mvnsite 0m 52s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 1s The patch has no ill-formed XML file.
          +1 findbugs 1m 33s the patch passed
          +1 javadoc 0m 59s the patch passed
          -1 unit 7m 47s hadoop-common in the patch failed.
          +1 asflicense 0m 20s The patch does not generate ASF License warnings.
          38m 12s



          Reason Tests
          Failed junit tests hadoop.metrics2.impl.TestGangliaMetrics



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:2c91fd8
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12808538/HADOOP-13079.004.patch
          JIRA Issue HADOOP-13079
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml
          uname Linux 5f385b64f5d4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 6de9213
          Default Java 1.8.0_91
          findbugs v3.0.0
          unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9674/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt
          Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/9674/testReport/
          modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/9674/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 27s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 4 new or modified test files. +1 mvninstall 6m 35s trunk passed +1 compile 6m 54s trunk passed +1 checkstyle 0m 24s trunk passed +1 mvnsite 1m 3s trunk passed +1 mvneclipse 0m 11s trunk passed +1 findbugs 1m 22s trunk passed +1 javadoc 0m 53s trunk passed +1 mvninstall 0m 39s the patch passed +1 compile 6m 56s the patch passed +1 javac 6m 56s the patch passed +1 checkstyle 0m 22s the patch passed +1 mvnsite 0m 52s the patch passed +1 mvneclipse 0m 11s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 xml 0m 1s The patch has no ill-formed XML file. +1 findbugs 1m 33s the patch passed +1 javadoc 0m 59s the patch passed -1 unit 7m 47s hadoop-common in the patch failed. +1 asflicense 0m 20s The patch does not generate ASF License warnings. 38m 12s Reason Tests Failed junit tests hadoop.metrics2.impl.TestGangliaMetrics Subsystem Report/Notes Docker Image:yetus/hadoop:2c91fd8 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12808538/HADOOP-13079.004.patch JIRA Issue HADOOP-13079 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml uname Linux 5f385b64f5d4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 6de9213 Default Java 1.8.0_91 findbugs v3.0.0 unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9674/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/9674/testReport/ modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/9674/console Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.
          Hide
          jzhuge John Zhuge added a comment -

          Patch 004:

          • Rename "questionMark" to "hideNonPrintable"
          Show
          jzhuge John Zhuge added a comment - Patch 004: Rename "questionMark" to "hideNonPrintable"
          Hide
          jzhuge John Zhuge added a comment -

          Thanks Andrew Wang for the encouragement!

          I can see your point about the question mark which is only what is used to hide the non-printable characters. How about renaming it to "hideNonPrintable" which is consistent to help message and the jira title?

          Show
          jzhuge John Zhuge added a comment - Thanks Andrew Wang for the encouragement! I can see your point about the question mark which is only what is used to hide the non-printable characters. How about renaming it to "hideNonPrintable" which is consistent to help message and the jira title?
          Hide
          andrew.wang Andrew Wang added a comment -

          Also thanks for adding all these new tests (and Allen for suggesting), it's great to see this level of thoroughness come with a contribution.

          Show
          andrew.wang Andrew Wang added a comment - Also thanks for adding all these new tests (and Allen for suggesting), it's great to see this level of thoroughness come with a contribution.
          Hide
          andrew.wang Andrew Wang added a comment -

          Hi John, overall patch LGTM. Only tiny nit, could we rename "questionMark" and "isQuestionMark" and etc to something else? I feel like "isQuestionMark" in particular is confusing, since what we're testing is whether to substitute non-printable characters, not whether something is a question mark.

          FWIW my ls man page calls this "--hide-control-characters", which seems like a reasonable variable name.

          Show
          andrew.wang Andrew Wang added a comment - Hi John, overall patch LGTM. Only tiny nit, could we rename "questionMark" and "isQuestionMark" and etc to something else? I feel like "isQuestionMark" in particular is confusing, since what we're testing is whether to substitute non-printable characters, not whether something is a question mark. FWIW my ls man page calls this "--hide-control-characters", which seems like a reasonable variable name.
          Hide
          jzhuge John Zhuge added a comment -

          Could someone kindly code review the patch please?

          Show
          jzhuge John Zhuge added a comment - Could someone kindly code review the patch please?
          Hide
          jzhuge John Zhuge added a comment -

          Please review patch 003:

          • Update hadoop-common/src/test/resources/testConf.xml to fix TestCLI failure
          Show
          jzhuge John Zhuge added a comment - Please review patch 003: Update hadoop-common/src/test/resources/testConf.xml to fix TestCLI failure
          Hide
          jzhuge John Zhuge added a comment -

          Investigate TestCLI failure.

          Show
          jzhuge John Zhuge added a comment - Investigate TestCLI failure.
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 9s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 3 new or modified test files.
          +1 mvninstall 6m 42s trunk passed
          +1 compile 5m 40s trunk passed with JDK v1.8.0_91
          +1 compile 6m 42s trunk passed with JDK v1.7.0_95
          +1 checkstyle 0m 24s trunk passed
          +1 mvnsite 0m 59s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 1m 36s trunk passed
          +1 javadoc 0m 52s trunk passed with JDK v1.8.0_91
          +1 javadoc 1m 5s trunk passed with JDK v1.7.0_95
          +1 mvninstall 0m 41s the patch passed
          +1 compile 5m 39s the patch passed with JDK v1.8.0_91
          +1 javac 5m 39s the patch passed
          +1 compile 6m 41s the patch passed with JDK v1.7.0_95
          +1 javac 6m 41s the patch passed
          +1 checkstyle 0m 24s the patch passed
          +1 mvnsite 0m 57s the patch passed
          +1 mvneclipse 0m 14s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 53s the patch passed
          +1 javadoc 0m 55s the patch passed with JDK v1.8.0_91
          +1 javadoc 1m 6s the patch passed with JDK v1.7.0_95
          -1 unit 6m 50s hadoop-common in the patch failed with JDK v1.8.0_91.
          -1 unit 7m 11s hadoop-common in the patch failed with JDK v1.7.0_95.
          +1 asflicense 0m 24s The patch does not generate ASF License warnings.
          58m 25s



          Reason Tests
          JDK v1.8.0_91 Failed junit tests hadoop.cli.TestCLI
          JDK v1.7.0_95 Failed junit tests hadoop.cli.TestCLI



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:cf2ee45
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12803165/HADOOP-13079.002.patch
          JIRA Issue HADOOP-13079
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux abdbb76be9b1 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 87f5e35
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          findbugs v3.0.0
          unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9350/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_91.txt
          unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9350/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.7.0_95.txt
          unit test logs https://builds.apache.org/job/PreCommit-HADOOP-Build/9350/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_91.txt https://builds.apache.org/job/PreCommit-HADOOP-Build/9350/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.7.0_95.txt
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/9350/testReport/
          modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/9350/console
          Powered by Apache Yetus 0.3.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 9s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 3 new or modified test files. +1 mvninstall 6m 42s trunk passed +1 compile 5m 40s trunk passed with JDK v1.8.0_91 +1 compile 6m 42s trunk passed with JDK v1.7.0_95 +1 checkstyle 0m 24s trunk passed +1 mvnsite 0m 59s trunk passed +1 mvneclipse 0m 14s trunk passed +1 findbugs 1m 36s trunk passed +1 javadoc 0m 52s trunk passed with JDK v1.8.0_91 +1 javadoc 1m 5s trunk passed with JDK v1.7.0_95 +1 mvninstall 0m 41s the patch passed +1 compile 5m 39s the patch passed with JDK v1.8.0_91 +1 javac 5m 39s the patch passed +1 compile 6m 41s the patch passed with JDK v1.7.0_95 +1 javac 6m 41s the patch passed +1 checkstyle 0m 24s the patch passed +1 mvnsite 0m 57s the patch passed +1 mvneclipse 0m 14s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 1m 53s the patch passed +1 javadoc 0m 55s the patch passed with JDK v1.8.0_91 +1 javadoc 1m 6s the patch passed with JDK v1.7.0_95 -1 unit 6m 50s hadoop-common in the patch failed with JDK v1.8.0_91. -1 unit 7m 11s hadoop-common in the patch failed with JDK v1.7.0_95. +1 asflicense 0m 24s The patch does not generate ASF License warnings. 58m 25s Reason Tests JDK v1.8.0_91 Failed junit tests hadoop.cli.TestCLI JDK v1.7.0_95 Failed junit tests hadoop.cli.TestCLI Subsystem Report/Notes Docker Image:yetus/hadoop:cf2ee45 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12803165/HADOOP-13079.002.patch JIRA Issue HADOOP-13079 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux abdbb76be9b1 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 87f5e35 Default Java 1.7.0_95 Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 findbugs v3.0.0 unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9350/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_91.txt unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9350/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.7.0_95.txt unit test logs https://builds.apache.org/job/PreCommit-HADOOP-Build/9350/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_91.txt https://builds.apache.org/job/PreCommit-HADOOP-Build/9350/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.7.0_95.txt JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/9350/testReport/ modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/9350/console Powered by Apache Yetus 0.3.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 14s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 3 new or modified test files.
          +1 mvninstall 7m 50s trunk passed
          +1 compile 5m 59s trunk passed with JDK v1.8.0_91
          +1 compile 6m 42s trunk passed with JDK v1.7.0_95
          +1 checkstyle 0m 25s trunk passed
          +1 mvnsite 0m 59s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 1m 35s trunk passed
          +1 javadoc 0m 55s trunk passed with JDK v1.8.0_91
          +1 javadoc 1m 8s trunk passed with JDK v1.7.0_95
          +1 mvninstall 0m 41s the patch passed
          +1 compile 6m 2s the patch passed with JDK v1.8.0_91
          +1 javac 6m 2s the patch passed
          +1 compile 6m 47s the patch passed with JDK v1.7.0_95
          +1 javac 6m 47s the patch passed
          +1 checkstyle 0m 24s the patch passed
          +1 mvnsite 0m 56s the patch passed
          +1 mvneclipse 0m 14s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 51s the patch passed
          +1 javadoc 0m 54s the patch passed with JDK v1.8.0_91
          +1 javadoc 1m 6s the patch passed with JDK v1.7.0_95
          -1 unit 7m 5s hadoop-common in the patch failed with JDK v1.8.0_91.
          -1 unit 7m 16s hadoop-common in the patch failed with JDK v1.7.0_95.
          +1 asflicense 0m 24s The patch does not generate ASF License warnings.
          60m 54s



          Reason Tests
          JDK v1.8.0_91 Failed junit tests hadoop.cli.TestCLI
          JDK v1.7.0_95 Failed junit tests hadoop.cli.TestCLI



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:cf2ee45
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12803165/HADOOP-13079.002.patch
          JIRA Issue HADOOP-13079
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 77fd444c328e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 87f5e35
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          findbugs v3.0.0
          unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9348/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_91.txt
          unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9348/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.7.0_95.txt
          unit test logs https://builds.apache.org/job/PreCommit-HADOOP-Build/9348/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_91.txt https://builds.apache.org/job/PreCommit-HADOOP-Build/9348/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.7.0_95.txt
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/9348/testReport/
          modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/9348/console
          Powered by Apache Yetus 0.3.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 14s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 3 new or modified test files. +1 mvninstall 7m 50s trunk passed +1 compile 5m 59s trunk passed with JDK v1.8.0_91 +1 compile 6m 42s trunk passed with JDK v1.7.0_95 +1 checkstyle 0m 25s trunk passed +1 mvnsite 0m 59s trunk passed +1 mvneclipse 0m 14s trunk passed +1 findbugs 1m 35s trunk passed +1 javadoc 0m 55s trunk passed with JDK v1.8.0_91 +1 javadoc 1m 8s trunk passed with JDK v1.7.0_95 +1 mvninstall 0m 41s the patch passed +1 compile 6m 2s the patch passed with JDK v1.8.0_91 +1 javac 6m 2s the patch passed +1 compile 6m 47s the patch passed with JDK v1.7.0_95 +1 javac 6m 47s the patch passed +1 checkstyle 0m 24s the patch passed +1 mvnsite 0m 56s the patch passed +1 mvneclipse 0m 14s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 1m 51s the patch passed +1 javadoc 0m 54s the patch passed with JDK v1.8.0_91 +1 javadoc 1m 6s the patch passed with JDK v1.7.0_95 -1 unit 7m 5s hadoop-common in the patch failed with JDK v1.8.0_91. -1 unit 7m 16s hadoop-common in the patch failed with JDK v1.7.0_95. +1 asflicense 0m 24s The patch does not generate ASF License warnings. 60m 54s Reason Tests JDK v1.8.0_91 Failed junit tests hadoop.cli.TestCLI JDK v1.7.0_95 Failed junit tests hadoop.cli.TestCLI Subsystem Report/Notes Docker Image:yetus/hadoop:cf2ee45 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12803165/HADOOP-13079.002.patch JIRA Issue HADOOP-13079 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 77fd444c328e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 87f5e35 Default Java 1.7.0_95 Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 findbugs v3.0.0 unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9348/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_91.txt unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9348/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.7.0_95.txt unit test logs https://builds.apache.org/job/PreCommit-HADOOP-Build/9348/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_91.txt https://builds.apache.org/job/PreCommit-HADOOP-Build/9348/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.7.0_95.txt JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/9348/testReport/ modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/9348/console Powered by Apache Yetus 0.3.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.
          Hide
          jzhuge John Zhuge added a comment -

          Rename the patch files.

          Show
          jzhuge John Zhuge added a comment - Rename the patch files.
          Hide
          jzhuge John Zhuge added a comment -

          Please review patch 002:

          • Change PrintableString implementation to cover 1 more category of non-printable characters: standalone surrogate characters
          • Add unit tests to verify printable characters including unicode scripts from different BMP and SMP blocks are preserved
          • Update documentation FileSystemShell.md
          • Much more javadoc
          • Add hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/package-info.java to quiet checkstyle

          Welcome to the wonderful world of Unicode. Some references:

          Show
          jzhuge John Zhuge added a comment - Please review patch 002: Change PrintableString implementation to cover 1 more category of non-printable characters: standalone surrogate characters Add unit tests to verify printable characters including unicode scripts from different BMP and SMP blocks are preserved Update documentation FileSystemShell.md Much more javadoc Add hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/package-info.java to quiet checkstyle Welcome to the wonderful world of Unicode. Some references: https://en.wikipedia.org/wiki/Universal_Character_Set_characters Java Character class doc
          Hide
          jzhuge John Zhuge added a comment -

          Thanks Yongjun Zhang.

          Show
          jzhuge John Zhuge added a comment - Thanks Yongjun Zhang .
          Show
          yzhangal Yongjun Zhang added a comment - HI John Zhuge , For your references, https://commons.apache.org/proper/commons-lang/javadocs/api-2.4/org/apache/commons/lang/CharUtils.html http://stackoverflow.com/questions/220547/printable-char-in-java Thanks.
          Hide
          jzhuge John Zhuge added a comment -

          Thanks for the review. I will fix the issues.

          John Zhuge
          Software Engineer, Cloudera

          Show
          jzhuge John Zhuge added a comment - Thanks for the review. I will fix the issues. John Zhuge Software Engineer, Cloudera
          Hide
          aw Allen Wittenauer added a comment -

          Also, test code should test "legal" unicode characters to make sure they come out unscathed, probably in multiple locales as well.

          Show
          aw Allen Wittenauer added a comment - Also, test code should test "legal" unicode characters to make sure they come out unscathed, probably in multiple locales as well.
          Hide
          aw Allen Wittenauer added a comment -

          Patch is missing documentation fixes.

          Show
          aw Allen Wittenauer added a comment - Patch is missing documentation fixes.
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 9s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
          0 mvndep 0m 44s Maven dependency ordering for branch
          +1 mvninstall 6m 58s trunk passed
          +1 compile 5m 53s trunk passed with JDK v1.8.0_91
          +1 compile 6m 47s trunk passed with JDK v1.7.0_95
          +1 checkstyle 1m 6s trunk passed
          +1 mvnsite 1m 53s trunk passed
          +1 mvneclipse 0m 28s trunk passed
          +1 findbugs 3m 33s trunk passed
          +1 javadoc 1m 58s trunk passed with JDK v1.8.0_91
          +1 javadoc 2m 54s trunk passed with JDK v1.7.0_95
          0 mvndep 0m 15s Maven dependency ordering for patch
          +1 mvninstall 1m 30s the patch passed
          +1 compile 5m 44s the patch passed with JDK v1.8.0_91
          +1 javac 5m 44s the patch passed
          +1 compile 6m 44s the patch passed with JDK v1.7.0_95
          +1 javac 6m 44s the patch passed
          +1 checkstyle 1m 4s the patch passed
          +1 mvnsite 1m 52s the patch passed
          +1 mvneclipse 0m 27s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 4m 2s the patch passed
          +1 javadoc 2m 1s the patch passed with JDK v1.8.0_91
          +1 javadoc 2m 57s the patch passed with JDK v1.7.0_95
          -1 unit 7m 39s hadoop-common in the patch failed with JDK v1.8.0_91.
          -1 unit 58m 11s hadoop-hdfs in the patch failed with JDK v1.8.0_91.
          -1 unit 7m 52s hadoop-common in the patch failed with JDK v1.7.0_95.
          -1 unit 56m 32s hadoop-hdfs in the patch failed with JDK v1.7.0_95.
          +1 asflicense 0m 27s The patch does not generate ASF License warnings.
          191m 3s



          Reason Tests
          JDK v1.8.0_91 Failed junit tests hadoop.cli.TestCLI
            hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl
          JDK v1.7.0_95 Failed junit tests hadoop.cli.TestCLI
            hadoop.hdfs.web.TestWebHDFS
            hadoop.hdfs.TestHFlush
            hadoop.hdfs.TestEncryptionZones



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:cf2ee45
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12802610/HDFS-13079.001.patch
          JIRA Issue HADOOP-13079
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 206fbe5386ae 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 8d48266
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          findbugs v3.0.0
          unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_91.txt
          unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_91.txt
          unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.7.0_95.txt
          unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
          unit test logs https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_91.txt https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_91.txt https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.7.0_95.txt https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/testReport/
          modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs U: .
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/console
          Powered by Apache Yetus 0.3.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 9s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 2 new or modified test files. 0 mvndep 0m 44s Maven dependency ordering for branch +1 mvninstall 6m 58s trunk passed +1 compile 5m 53s trunk passed with JDK v1.8.0_91 +1 compile 6m 47s trunk passed with JDK v1.7.0_95 +1 checkstyle 1m 6s trunk passed +1 mvnsite 1m 53s trunk passed +1 mvneclipse 0m 28s trunk passed +1 findbugs 3m 33s trunk passed +1 javadoc 1m 58s trunk passed with JDK v1.8.0_91 +1 javadoc 2m 54s trunk passed with JDK v1.7.0_95 0 mvndep 0m 15s Maven dependency ordering for patch +1 mvninstall 1m 30s the patch passed +1 compile 5m 44s the patch passed with JDK v1.8.0_91 +1 javac 5m 44s the patch passed +1 compile 6m 44s the patch passed with JDK v1.7.0_95 +1 javac 6m 44s the patch passed +1 checkstyle 1m 4s the patch passed +1 mvnsite 1m 52s the patch passed +1 mvneclipse 0m 27s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 4m 2s the patch passed +1 javadoc 2m 1s the patch passed with JDK v1.8.0_91 +1 javadoc 2m 57s the patch passed with JDK v1.7.0_95 -1 unit 7m 39s hadoop-common in the patch failed with JDK v1.8.0_91. -1 unit 58m 11s hadoop-hdfs in the patch failed with JDK v1.8.0_91. -1 unit 7m 52s hadoop-common in the patch failed with JDK v1.7.0_95. -1 unit 56m 32s hadoop-hdfs in the patch failed with JDK v1.7.0_95. +1 asflicense 0m 27s The patch does not generate ASF License warnings. 191m 3s Reason Tests JDK v1.8.0_91 Failed junit tests hadoop.cli.TestCLI   hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl JDK v1.7.0_95 Failed junit tests hadoop.cli.TestCLI   hadoop.hdfs.web.TestWebHDFS   hadoop.hdfs.TestHFlush   hadoop.hdfs.TestEncryptionZones Subsystem Report/Notes Docker Image:yetus/hadoop:cf2ee45 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12802610/HDFS-13079.001.patch JIRA Issue HADOOP-13079 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 206fbe5386ae 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 8d48266 Default Java 1.7.0_95 Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 findbugs v3.0.0 unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_91.txt unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_91.txt unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.7.0_95.txt unit https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt unit test logs https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_91.txt https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_91.txt https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.7.0_95.txt https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/testReport/ modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs U: . Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/9300/console Powered by Apache Yetus 0.3.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.
          Hide
          jzhuge John Zhuge added a comment -

          Please review patch 001:

          • Add option -q to class Ls
          • Add new class PrintableString
          • Add unit test TestPrintableString
          • Add unit test TestFsShellList
          Show
          jzhuge John Zhuge added a comment - Please review patch 001: Add option -q to class Ls Add new class PrintableString Add unit test TestPrintableString Add unit test TestFsShellList
          Hide
          aw Allen Wittenauer added a comment -

          even cause your xterm to run arbitrary code by abusing escape sequences.

          Does it matter what command was used to generate the sequence? No, of course not.... which makes it a hole in xterm, not ls.

          Show
          aw Allen Wittenauer added a comment - even cause your xterm to run arbitrary code by abusing escape sequences. Does it matter what command was used to generate the sequence? No, of course not.... which makes it a hole in xterm, not ls.
          Hide
          cmccabe Colin P. McCabe added a comment -

          It's not a security bug for the reasons you think it's a security bug. After all, wc, find, du, ... tons of other UNIX commands will happily print out terminal escape sequences with no option to turn them off. It is, however, problematic for traditional ftpd implementations since it's a great way to inject buffer overflows and then get root on a remote server.

          This behavior is exploitable. That makes it a security bug, even if lots of traditional UNIX commands have it.

          Just because a behavior is traditional doesn't mean it's right. There was a time when UNIX programs used gets() everywhere. When the world became a less trusting place, they had to be fixed not to do that. We should understand the motivations behind historical decisions before blindly copying them.

          ... and my answer is the same as it was almost a decade ago, in some HDFS JIRA somewhere, where a related topic came up before: HDFS would be better served by having a limit on what consists of a legal file and directory name. With an unlimited namespace, it's impossible to test against and impossible to protect every scenario in which oddball characters show up. What's legal in one locale may not be legal in another.

          That's a very good suggestion. I think we should tackle that for Hadoop 3.

          Also, are you prepared to file a CVE for every single time Hadoop prints out a directory or file name to the screen? There are probably hundreds if not thousands of places, obvious ones like 'fs -count' and less obvious ones like 'yarn logs'. This is a 'tilting at windmills' problem. It is MUCH better to have ls blow up than be taken by surprise by something else later on.

          The problem is, ls isn't necessarily going to "blow up," just display something odd, or even cause your xterm to run arbitrary code by abusing escape sequences.

          Show
          cmccabe Colin P. McCabe added a comment - It's not a security bug for the reasons you think it's a security bug. After all, wc, find, du, ... tons of other UNIX commands will happily print out terminal escape sequences with no option to turn them off. It is, however, problematic for traditional ftpd implementations since it's a great way to inject buffer overflows and then get root on a remote server. This behavior is exploitable. That makes it a security bug, even if lots of traditional UNIX commands have it. Just because a behavior is traditional doesn't mean it's right. There was a time when UNIX programs used gets() everywhere. When the world became a less trusting place, they had to be fixed not to do that. We should understand the motivations behind historical decisions before blindly copying them. ... and my answer is the same as it was almost a decade ago, in some HDFS JIRA somewhere, where a related topic came up before: HDFS would be better served by having a limit on what consists of a legal file and directory name. With an unlimited namespace, it's impossible to test against and impossible to protect every scenario in which oddball characters show up. What's legal in one locale may not be legal in another. That's a very good suggestion. I think we should tackle that for Hadoop 3. Also, are you prepared to file a CVE for every single time Hadoop prints out a directory or file name to the screen? There are probably hundreds if not thousands of places, obvious ones like 'fs -count' and less obvious ones like 'yarn logs'. This is a 'tilting at windmills' problem. It is MUCH better to have ls blow up than be taken by surprise by something else later on. The problem is, ls isn't necessarily going to "blow up," just display something odd, or even cause your xterm to run arbitrary code by abusing escape sequences.
          Hide
          aw Allen Wittenauer added a comment -

          The fact that one component has a security bug

          It's not a security bug for the reasons you think it's a security bug. After all, wc, find, du, ... tons of other UNIX commands will happily print out terminal escape sequences with no option to turn them off. It is, however, problematic for traditional ftpd implementations since it's a great way to inject buffer overflows and then get root on a remote server.

          "Should the filename be able use control characters to hijack the admin's GNU screen session and execute arbitrary code? I would say no, what do you say?"

          ... and my answer is the same as it was almost a decade ago, in some HDFS JIRA somewhere, where a related topic came up before: HDFS would be better served by having a limit on what consists of a legal file and directory name. With an unlimited namespace, it's impossible to test against and impossible to protect every scenario in which oddball characters show up. What's legal in one locale may not be legal in another.

          However, it is not constructive to -1 a patch fixing a security vulnerability without suggesting an alternate way of fixing that vulnerability.

          Again: this is not fixing a security vulnerability.

          If this stays unfixed, it will probably get a CVE number.

          Speaking of, have you located a CVE for GNU ls yet? I mean, if this 'ls printing terminal control characters' is a security problem, surely there would be a CVE for it and du and wc and ...

          Also, are you prepared to file a CVE for every single time Hadoop prints out a directory or file name to the screen? There are probably hundreds if not thousands of places, obvious ones like 'fs -count' and less obvious ones like 'yarn logs'. This is a 'tilting at windmills' problem. It is MUCH better to have ls blow up than be taken by surprise by something else later on.

          Show
          aw Allen Wittenauer added a comment - The fact that one component has a security bug It's not a security bug for the reasons you think it's a security bug. After all, wc, find, du, ... tons of other UNIX commands will happily print out terminal escape sequences with no option to turn them off. It is, however, problematic for traditional ftpd implementations since it's a great way to inject buffer overflows and then get root on a remote server. "Should the filename be able use control characters to hijack the admin's GNU screen session and execute arbitrary code? I would say no, what do you say?" ... and my answer is the same as it was almost a decade ago, in some HDFS JIRA somewhere, where a related topic came up before: HDFS would be better served by having a limit on what consists of a legal file and directory name. With an unlimited namespace, it's impossible to test against and impossible to protect every scenario in which oddball characters show up. What's legal in one locale may not be legal in another. However, it is not constructive to -1 a patch fixing a security vulnerability without suggesting an alternate way of fixing that vulnerability. Again: this is not fixing a security vulnerability. If this stays unfixed, it will probably get a CVE number. Speaking of, have you located a CVE for GNU ls yet? I mean, if this 'ls printing terminal control characters' is a security problem, surely there would be a CVE for it and du and wc and ... Also, are you prepared to file a CVE for every single time Hadoop prints out a directory or file name to the screen? There are probably hundreds if not thousands of places, obvious ones like 'fs -count' and less obvious ones like 'yarn logs'. This is a 'tilting at windmills' problem. It is MUCH better to have ls blow up than be taken by surprise by something else later on.
          Hide
          cmccabe Colin P. McCabe added a comment -

          a) It's not standardized behavior amongst all of the platforms that Apache Hadoop runs

          Linux, OpenBSD, FreeBSD, and OS X pick the behavior of hiding control characters in ls by default. That may not be "all the platforms that Apache Hadoop runs on," but it's certainly the vast majority of real-world deployments. The remaining important platform, Windows, doesn't deal with terminals and control characters in quite the same way, so is probably not vulnerable in any case.

          In any case, the fact that the behavior isn't standardized is not a valid argument either way. Clearly Hadoop needs to pick one behavior or the other. Lack of standardization doesn't dictate that we have to pick one behavior or the other. And certainly it doesn't dictate that we should pick an unpopular and surprising behavior that almost nobody has experience with.

          b) It's not expected behavior relative to the rest of Apache Hadoop

          The fact that one component has a security bug doesn't dictate that the other components also need to have the same security bug. This is like arguing that we can't fix a buffer overflow in one component because then it wouldn't match all the other buffer-overflowable components.

          c) It's not feasible to actually make it expected behavior compared to the rest of Apache Hadoop given the proliferation of places where raw file and directory names are printed to the console

          The only places we've discussed here are ls and fsck. Perhaps there are more, but it hardly seems infeasible to change them based on what we've talked about so far. Perhaps log files are also an issue, but only for people who tail the log file of the server. And to reiterate, a security flaw in X doesn't mean we should reproduce the same security flaw in Y.

          At the end of the day, this is a security vulnerability and it needs to be fixed. I asked you before: "Should the filename be able use control characters to hijack the admin's GNU screen session and execute arbitrary code? I would say no, what do you say?" I would repeat the same question again.

          I understand that you have a personal preference for running without -q. However, it is not constructive to -1 a patch fixing a security vulnerability without suggesting an alternate way of fixing that vulnerability. If this stays unfixed, it will probably get a CVE number.

          Show
          cmccabe Colin P. McCabe added a comment - a) It's not standardized behavior amongst all of the platforms that Apache Hadoop runs Linux, OpenBSD, FreeBSD, and OS X pick the behavior of hiding control characters in ls by default. That may not be "all the platforms that Apache Hadoop runs on," but it's certainly the vast majority of real-world deployments. The remaining important platform, Windows, doesn't deal with terminals and control characters in quite the same way, so is probably not vulnerable in any case. In any case, the fact that the behavior isn't standardized is not a valid argument either way. Clearly Hadoop needs to pick one behavior or the other. Lack of standardization doesn't dictate that we have to pick one behavior or the other. And certainly it doesn't dictate that we should pick an unpopular and surprising behavior that almost nobody has experience with. b) It's not expected behavior relative to the rest of Apache Hadoop The fact that one component has a security bug doesn't dictate that the other components also need to have the same security bug. This is like arguing that we can't fix a buffer overflow in one component because then it wouldn't match all the other buffer-overflowable components. c) It's not feasible to actually make it expected behavior compared to the rest of Apache Hadoop given the proliferation of places where raw file and directory names are printed to the console The only places we've discussed here are ls and fsck. Perhaps there are more, but it hardly seems infeasible to change them based on what we've talked about so far. Perhaps log files are also an issue, but only for people who tail the log file of the server. And to reiterate, a security flaw in X doesn't mean we should reproduce the same security flaw in Y. At the end of the day, this is a security vulnerability and it needs to be fixed. I asked you before: "Should the filename be able use control characters to hijack the admin's GNU screen session and execute arbitrary code? I would say no, what do you say?" I would repeat the same question again. I understand that you have a personal preference for running without -q . However, it is not constructive to -1 a patch fixing a security vulnerability without suggesting an alternate way of fixing that vulnerability. If this stays unfixed, it will probably get a CVE number.
          Hide
          jzhuge John Zhuge added a comment -

          Split the jira. This jira will add option -q; HADOOP-13093 will decide the default behavior on a terminal.

          The split gives us more time to debate the proper default behavior, and weigh trade-offs. It may also enable different target versions.

          Show
          jzhuge John Zhuge added a comment - Split the jira. This jira will add option -q ; HADOOP-13093 will decide the default behavior on a terminal. The split gives us more time to debate the proper default behavior, and weigh trade-offs. It may also enable different target versions.
          Hide
          aw Allen Wittenauer added a comment -

          I've removed the supportability flag. This is not a supportability issue given that making -q the default would harm the ability to use ls as a canary prior to looking at other content.

          Let me be explicit on my stance:

          I'm +1 on adding a -q option.

          I'm -1 on making -q the default given that:

          a) It's not standardized behavior amongst all of the platforms that Apache Hadoop runs

          b) It's not expected behavior relative to the rest of Apache Hadoop

          c) It's not feasible to actually make it expected behavior compared to the rest of Apache Hadoop given the proliferation of places where raw file and directory names are printed to the console

          Show
          aw Allen Wittenauer added a comment - I've removed the supportability flag. This is not a supportability issue given that making -q the default would harm the ability to use ls as a canary prior to looking at other content. Let me be explicit on my stance: I'm +1 on adding a -q option. I'm -1 on making -q the default given that: a) It's not standardized behavior amongst all of the platforms that Apache Hadoop runs b) It's not expected behavior relative to the rest of Apache Hadoop c) It's not feasible to actually make it expected behavior compared to the rest of Apache Hadoop given the proliferation of places where raw file and directory names are printed to the console
          Hide
          aw Allen Wittenauer added a comment -

          It's not worth the effort to play whack-a-mole here for little-to-no real reward.

          Show
          aw Allen Wittenauer added a comment - It's not worth the effort to play whack-a-mole here for little-to-no real reward.
          Hide
          andrew.wang Andrew Wang added a comment -

          I'll note that tools like cat and grep are for printing the contents of files, not file metadata like ls. The default GNU behaviors are simply to print the non-printable characters. So for the audit log example, since the audit log is a file, IMO that grep behavior is expected.

          fsck and find I think are more appropriate analogues, along with things like lsSnapshottableDirs and listEncryptionZones. Likely many others.

          If you can help us identify these commands and their expected default behaviors, we can file more follow-ons for the rest of the FsShell and hdfs commands.

          Show
          andrew.wang Andrew Wang added a comment - I'll note that tools like cat and grep are for printing the contents of files, not file metadata like ls . The default GNU behaviors are simply to print the non-printable characters. So for the audit log example, since the audit log is a file, IMO that grep behavior is expected. fsck and find I think are more appropriate analogues, along with things like lsSnapshottableDirs and listEncryptionZones . Likely many others. If you can help us identify these commands and their expected default behaviors, we can file more follow-ons for the rest of the FsShell and hdfs commands.
          Hide
          aw Allen Wittenauer added a comment -

          Allen, do you have some counter-examples where someone coming from a NIXy background would be confused by -q as default?

          It's going to be why doesn't fsck change the file names? Why doesn't job submission change the filenames? How come ls didn't blow up but when I grep'd the audit log it did? etc, etc. Hadoop itself exposes filenames on the terminal all over the place.

          Unless we plan on touching everything, just changing one place is going to be wildly confusing.

          Show
          aw Allen Wittenauer added a comment - Allen, do you have some counter-examples where someone coming from a NIXy background would be confused by -q as default? It's going to be why doesn't fsck change the file names? Why doesn't job submission change the filenames? How come ls didn't blow up but when I grep'd the audit log it did? etc, etc. Hadoop itself exposes filenames on the terminal all over the place. Unless we plan on touching everything, just changing one place is going to be wildly confusing.
          Hide
          jzhuge John Zhuge added a comment -

          Target 3.0.0 for now.

          Show
          jzhuge John Zhuge added a comment - Target 3.0.0 for now.
          Hide
          andrew.wang Andrew Wang added a comment -

          FWIW I think FreeBSD and OpenBSD default to printing "?" rather than the control character:

          http://www.freebsd.org/cgi/man.cgi?ls
          http://man.openbsd.org/OpenBSD-current/man1/ls.1

          -q Force printing of non-graphic characters in file names as the
          character `?'; this is the default when output is to a terminal.

          Allen, do you have some counter-examples where someone coming from a NIXy background would be confused by -q as default? IIRC OSX uses a FreeBSD userspace, so it seems like for the majority of Hadoop users, -q is already the expectation.

          John, do you mind setting the target version? If we have compatibility concerns, maybe this only targets Hadoop 3.

          Show
          andrew.wang Andrew Wang added a comment - FWIW I think FreeBSD and OpenBSD default to printing "?" rather than the control character: http://www.freebsd.org/cgi/man.cgi?ls http://man.openbsd.org/OpenBSD-current/man1/ls.1 -q Force printing of non-graphic characters in file names as the character `?'; this is the default when output is to a terminal. Allen, do you have some counter-examples where someone coming from a NIXy background would be confused by -q as default? IIRC OSX uses a FreeBSD userspace, so it seems like for the majority of Hadoop users, -q is already the expectation. John, do you mind setting the target version? If we have compatibility concerns, maybe this only targets Hadoop 3.
          Hide
          cmccabe Colin P. McCabe added a comment -

          Yup, I can't think of why -q should be the default either... but more importantly, neither could POSIX to the point that it demanded the standard have -q be the default.

          Please do not misquote what I said. I was arguing that echoing control characters to the terminal should not be the default behavior. You are arguing the opposite.

          ... until such a point that they print the filename to the screen to show what files are being processed. At which point this change has accomplished absolutely nothing. Changing ls is security theater.

          There are a lot of scripts that interact with HDFS via FsShell. These scripts will never "print the filename to the screen" or if they do, it will be a filename that they got from ls itself which does not contain control characters.

          I could come up with examples of how this is helpful all day if needed. Here's another one: Some sysadmin logs in and does an hadoop fs -ls of a directory created by \$BADGUY. Should the filename be able use control characters to hijack the admin's GNU screen session and execute arbitrary code? I would say no, what do you say?

          Are we going to change cat too?

          Most system administrators will not cat a file without checking what type it is. It is well-known that catting an unknown file could mess up the terminal. On the other hand, most system administrators do not think that running ls on a directory could be a security risk. Linux and other well known operating systems also do not protect users from this, so there are no pre-existing expectations of protection.

          Then stop bringing up (traditional) UNIX if you feel it isn't relevant and especially when you've used the term incorrectly.

          There are a huge number of sysadmins who grew up with the GNU tools, which do have the behavior we're describing here. It's a powerful argument for implementing that behavior. When you add the fact that it fixes security vulnerabilities, it's an extremely compelling argument.

          I think it's clear that this change does have a big positive effect in many scenarios, does fix real-world security flaws, and does accord with the expectations of most system administrators. That's three powerful reasons to do it. I can find no valid counter-argument for any of these reasons anywhere in these comments.

          Show
          cmccabe Colin P. McCabe added a comment - Yup, I can't think of why -q should be the default either... but more importantly, neither could POSIX to the point that it demanded the standard have -q be the default. Please do not misquote what I said. I was arguing that echoing control characters to the terminal should not be the default behavior. You are arguing the opposite. ... until such a point that they print the filename to the screen to show what files are being processed. At which point this change has accomplished absolutely nothing. Changing ls is security theater. There are a lot of scripts that interact with HDFS via FsShell. These scripts will never "print the filename to the screen" or if they do, it will be a filename that they got from ls itself which does not contain control characters. I could come up with examples of how this is helpful all day if needed. Here's another one: Some sysadmin logs in and does an hadoop fs -ls of a directory created by \$BADGUY . Should the filename be able use control characters to hijack the admin's GNU screen session and execute arbitrary code? I would say no, what do you say? Are we going to change cat too? Most system administrators will not cat a file without checking what type it is. It is well-known that catting an unknown file could mess up the terminal. On the other hand, most system administrators do not think that running ls on a directory could be a security risk. Linux and other well known operating systems also do not protect users from this, so there are no pre-existing expectations of protection. Then stop bringing up (traditional) UNIX if you feel it isn't relevant and especially when you've used the term incorrectly. There are a huge number of sysadmins who grew up with the GNU tools, which do have the behavior we're describing here. It's a powerful argument for implementing that behavior. When you add the fact that it fixes security vulnerabilities, it's an extremely compelling argument. I think it's clear that this change does have a big positive effect in many scenarios, does fix real-world security flaws, and does accord with the expectations of most system administrators. That's three powerful reasons to do it. I can find no valid counter-argument for any of these reasons anywhere in these comments.
          Hide
          aw Allen Wittenauer added a comment -

          OK, so Linux is technically a UNIX-like system rather than a licensee of the UNIX trademark. I don't feel that this is relevant to the discussion here.

          Then stop bringing up (traditional) UNIX if you feel it isn't relevant and especially when you've used the term incorrectly.

          Dumping control characters out on an interactive terminal is a security vulnerability

          It is, but changing ls' behavior isn't going to fix that vulnerability. There's a reason why all of those links up there you quoted talk about terminals and terminal emulation and what is actually vulnerable without identifying specific vulnerabilities in specific commands.

          I can't think of a single reason why we would want this to be the default.

          Yup, I can't think of why -q should be the default either... but more importantly, neither could POSIX to the point that it demanded the standard have -q be the default.

          Show
          aw Allen Wittenauer added a comment - OK, so Linux is technically a UNIX-like system rather than a licensee of the UNIX trademark. I don't feel that this is relevant to the discussion here. Then stop bringing up (traditional) UNIX if you feel it isn't relevant and especially when you've used the term incorrectly. Dumping control characters out on an interactive terminal is a security vulnerability It is, but changing ls' behavior isn't going to fix that vulnerability. There's a reason why all of those links up there you quoted talk about terminals and terminal emulation and what is actually vulnerable without identifying specific vulnerabilities in specific commands. I can't think of a single reason why we would want this to be the default. Yup, I can't think of why -q should be the default either... but more importantly, neither could POSIX to the point that it demanded the standard have -q be the default.
          Hide
          aw Allen Wittenauer added a comment -

          Are we going to change cat too?

          Show
          aw Allen Wittenauer added a comment - Are we going to change cat too?
          Hide
          aw Allen Wittenauer added a comment -

          ... until such a point that they print the filename to the screen to show what files are being processed. At which point this change has accomplished absolutely nothing. Changing ls is security theater.

          Show
          aw Allen Wittenauer added a comment - ... until such a point that they print the filename to the screen to show what files are being processed. At which point this change has accomplished absolutely nothing. Changing ls is security theater.
          Hide
          cmccabe Colin P. McCabe added a comment -

          OK, so Linux is technically a UNIX-like system rather than a licensee of the UNIX trademark. I don't feel that this is relevant to the discussion here. I feel like you are just being pedantic. Linux's behavior is still the one that most people compare our behavior to, whether we like it or not. And Linux's behavior is to hide control characters by default in ls.

          More importantly, Linux's behavior makes more sense than the other behavior you are suggesting. Dumping control characters out on an interactive terminal is a security vulnerability as well as a giant annoyance. I can't think of a single reason why we would want this to be the default.

          Show
          cmccabe Colin P. McCabe added a comment - OK, so Linux is technically a UNIX-like system rather than a licensee of the UNIX trademark. I don't feel that this is relevant to the discussion here. I feel like you are just being pedantic. Linux's behavior is still the one that most people compare our behavior to, whether we like it or not. And Linux's behavior is to hide control characters by default in ls. More importantly, Linux's behavior makes more sense than the other behavior you are suggesting. Dumping control characters out on an interactive terminal is a security vulnerability as well as a giant annoyance. I can't think of a single reason why we would want this to be the default.
          Hide
          jzhuge John Zhuge added a comment -

          Allen Wittenauer, you mentioned "default -q" will break stuff, do you have any use case or test case in mind? I can only see potential problems in Expect or something similar when they might parse dfs -ls output in the terminal mode. All scripts that redirect dfs -ls stdout should be ok with "default -q" behavior.

          Show
          jzhuge John Zhuge added a comment - Allen Wittenauer , you mentioned "default -q" will break stuff, do you have any use case or test case in mind? I can only see potential problems in Expect or something similar when they might parse dfs -ls output in the terminal mode. All scripts that redirect dfs -ls stdout should be ok with "default -q" behavior.
          Hide
          aw Allen Wittenauer added a comment -

          The most popular UNIX system on Earth, Linux

          Linux is not UNIX(tm). Please stop saying things that aren't true.

          so nobody will be surprised by it.

          ... except for those that actually use Real UNIX(tm) and not some discount knock off whose overly zealous followers believe they invented everything.

          Keep in mind also that in order to do -q, you need an anti-q.

          Show
          aw Allen Wittenauer added a comment - The most popular UNIX system on Earth, Linux Linux is not UNIX(tm). Please stop saying things that aren't true. so nobody will be surprised by it. ... except for those that actually use Real UNIX(tm) and not some discount knock off whose overly zealous followers believe they invented everything. Keep in mind also that in order to do -q, you need an anti-q.
          Hide
          cmccabe Colin P. McCabe added a comment -

          Thank you for the background information. I wasn't aware that the default of suppressing non-printing characters was "optional" according to POSIX.

          I think the important thing is that we've established that:

          • Suppressing non-printing characters by default fixes several serious security vulnerabilties, including some that have CVEs,
          • This suppression behavior is explicitly allowed by POSIX,
          • The most popular UNIX system on Earth, Linux, implements this behavior, so nobody will be surprised by it.

          Essentially interactive sessions with stdin redirected [falsely show up as non-interactive from Java]

          I guess my concern about adding a JNI dependency here is that it will make things too nondeterministic. I've seen too many clusters where JNI was improperly configured.

          Show
          cmccabe Colin P. McCabe added a comment - Thank you for the background information. I wasn't aware that the default of suppressing non-printing characters was "optional" according to POSIX. I think the important thing is that we've established that: Suppressing non-printing characters by default fixes several serious security vulnerabilties, including some that have CVEs, This suppression behavior is explicitly allowed by POSIX, The most popular UNIX system on Earth, Linux, implements this behavior, so nobody will be surprised by it. Essentially interactive sessions with stdin redirected [falsely show up as non-interactive from Java] I guess my concern about adding a JNI dependency here is that it will make things too nondeterministic. I've seen too many clusters where JNI was improperly configured.
          Hide
          aw Allen Wittenauer added a comment -

          ... which is just the long form of what I just said.

          Show
          aw Allen Wittenauer added a comment - ... which is just the long form of what I just said.
          Hide
          jzhuge John Zhuge added a comment -

          Thanks Colin P. McCabe and Allen Wittenauer for the comments. Some quotes from ls(1posix):

          Implementations may make -q the default for terminals to prevent trojan horse attacks on terminals with special escape sequences. This is not required because:

          • Some control characters may be useful on some terminals; for example, a system might write them as "\001" or "^A".
          • Special behavior for terminals is not relevant to applications portability.

          However, it does not cover security concerns similar to CVE-2004-1488.

          A quick survey shows ls defaults to -q on a terminal on Linux, BSD, and OSX; but not on Solaris.

          Show
          jzhuge John Zhuge added a comment - Thanks Colin P. McCabe and Allen Wittenauer for the comments. Some quotes from ls(1posix) : Implementations may make -q the default for terminals to prevent trojan horse attacks on terminals with special escape sequences. This is not required because: Some control characters may be useful on some terminals; for example, a system might write them as "\001" or "^A". Special behavior for terminals is not relevant to applications portability. However, it does not cover security concerns similar to CVE-2004-1488 . A quick survey shows ls defaults to -q on a terminal on Linux, BSD, and OSX; but not on Solaris .
          Hide
          aw Allen Wittenauer added a comment - - edited

          It's not surprising, because it matches the traditional UNIX / Linux behavior.

          The defaulting of -q on is not traditional UNIX behavior. It may be what GNU does ("Linux"), but it's not the expected, standard behavior according to the POSIX spec. (The POSIX spec, does, however, say that individual implementations may turn it on.) The fact that -q is an stanard, single letter option and the way to turn it off is not should have been a very big hint.

          Show
          aw Allen Wittenauer added a comment - - edited It's not surprising, because it matches the traditional UNIX / Linux behavior. The defaulting of -q on is not traditional UNIX behavior. It may be what GNU does ("Linux"), but it's not the expected, standard behavior according to the POSIX spec. (The POSIX spec, does, however, say that individual implementations may turn it on.) The fact that -q is an stanard, single letter option and the way to turn it off is not should have been a very big hint.
          Hide
          jzhuge John Zhuge added a comment -

          Essentially interactive sessions with stdin redirected.

          Show
          jzhuge John Zhuge added a comment - Essentially interactive sessions with stdin redirected.
          Hide
          jzhuge John Zhuge added a comment -

          I only come up with these 2 cases:

          • echo dir1 dir2 | xargs hadoop fs -ls
          • hadoop fs -ls dir1 </dev/null
          Show
          jzhuge John Zhuge added a comment - I only come up with these 2 cases: echo dir1 dir2 | xargs hadoop fs -ls hadoop fs -ls dir1 </dev/null
          Hide
          cmccabe Colin P. McCabe added a comment -

          No way should -q be the default under any circumstances. That is extremely surprising behavior that will definitely break stuff.

          It's not surprising, because it matches the traditional UNIX / Linux behavior. In Linux, /bin/ls will not print control characters by default. you must pass the --show-control-characters option in order to see them. From the man page:

                 --show-control-chars
                        show non graphic characters as-is (default unless program is 'ls' and output is a terminal)
          

          ls blasting raw control characters into an interactive terminal is a very bad idea. It leads to some very serious security vulnerabilities because commonly used software like xterm, GNU screen, tmux and so forth interpret control characters. Using control characters, you can convince these pieces of software to execute arbitrary code. See http://marc.info/?l=bugtraq&m=104612710031920&q=p3 and https://www.proteansec.com/linux/blast-past-executing-code-terminal-emulators-via-escape-sequences/ There are even CVEs for some of these issues.

          We should make the default opt-in for printing control characters in our next compatibility-breaking release (Hadoop 3.x).

          In C, isatty(STDOUT_FILENO) is used to find out whether the output is a terminal. Since Java doesn't have isatty, I will use JNI to call C isatty() because the closest test System.console() == null does not work in some cases.

          It would really be nice if we could determine this without using JNI, because it's often not available. Under what conditions does the System.console() == null check not work? The only case I was able to find in a quick Google search was inside an eclipse console. That seems like a case where the security issues would not be a concern, because it's a debugging environment. Are there other cases where the non-JNI check would fail?

          Show
          cmccabe Colin P. McCabe added a comment - No way should -q be the default under any circumstances. That is extremely surprising behavior that will definitely break stuff. It's not surprising, because it matches the traditional UNIX / Linux behavior. In Linux, /bin/ls will not print control characters by default. you must pass the --show-control-characters option in order to see them. From the man page: --show-control-chars show non graphic characters as-is ( default unless program is 'ls' and output is a terminal) ls blasting raw control characters into an interactive terminal is a very bad idea. It leads to some very serious security vulnerabilities because commonly used software like xterm , GNU screen , tmux and so forth interpret control characters. Using control characters, you can convince these pieces of software to execute arbitrary code. See http://marc.info/?l=bugtraq&m=104612710031920&q=p3 and https://www.proteansec.com/linux/blast-past-executing-code-terminal-emulators-via-escape-sequences/ There are even CVEs for some of these issues. We should make the default opt-in for printing control characters in our next compatibility-breaking release (Hadoop 3.x). In C, isatty(STDOUT_FILENO) is used to find out whether the output is a terminal. Since Java doesn't have isatty, I will use JNI to call C isatty() because the closest test System.console() == null does not work in some cases. It would really be nice if we could determine this without using JNI, because it's often not available. Under what conditions does the System.console() == null check not work? The only case I was able to find in a quick Google search was inside an eclipse console. That seems like a case where the security issues would not be a concern, because it's a debugging environment. Are there other cases where the non-JNI check would fail?
          Hide
          aw Allen Wittenauer added a comment -

          No way should -q be the default under any circumstances. That is extremely surprising behavior that will definitely break stuff.

          Show
          aw Allen Wittenauer added a comment - No way should -q be the default under any circumstances. That is extremely surprising behavior that will definitely break stuff.

            People

            • Assignee:
              jzhuge John Zhuge
              Reporter:
              jzhuge John Zhuge
            • Votes:
              0 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development