Hadoop Common / HADOOP-6714

FsShell 'hadoop fs -text' does not support compression codecs

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.22.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Currently, 'hadoop fs -text myfile' looks at the first few magic bytes of a file to determine whether it is gzip compressed or a sequence file. This means 'fs -text' cannot properly decode .deflate or .bz2 files (or other codecs specified via configuration).

      It should be fairly straightforward to add support for other codecs by checking the file extension against the CompressionCodecFactory to retrieve an appropriate Codec.
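
A minimal sketch of the approach described above, written against the JDK only so that it runs without Hadoop on the classpath. In Hadoop itself the extension lookup would go through CompressionCodecFactory.getCodec(Path); here the class TextStreams, the Decoder interface, and the extension table are hypothetical stand-ins for illustration and are not the contents of the attached patch. The sketch tries the file extension first and only then falls back to the gzip magic-byte check; sequence file detection is left out.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical stand-in for the codec lookup in 'fs -text'; not Hadoop code. */
final class TextStreams {

  /** Wraps a raw stream in a decompressing stream. */
  interface Decoder {
    InputStream wrap(InputStream raw) throws IOException;
  }

  // Extension -> decoder table, playing the role of a codec factory.
  // Only formats shipped with the JDK are registered (assumption for the
  // sketch); .bz2 would need an additional library.
  private static final Map<String, Decoder> BY_EXTENSION = new LinkedHashMap<>();
  static {
    BY_EXTENSION.put(".gz", GZIPInputStream::new);
    BY_EXTENSION.put(".deflate", InflaterInputStream::new);
  }

  /** Returns the decoder registered for the file name's extension, or null. */
  static Decoder decoderFor(String fileName) {
    for (Map.Entry<String, Decoder> e : BY_EXTENSION.entrySet()) {
      if (fileName.endsWith(e.getKey())) {
        return e.getValue();
      }
    }
    return null;
  }

  /** Opens a stream for display: extension lookup first, magic bytes second. */
  static InputStream open(String fileName, InputStream raw) throws IOException {
    Decoder decoder = decoderFor(fileName);
    if (decoder != null) {
      return decoder.wrap(raw);
    }
    // Prior behavior: peek at the first two bytes for the gzip magic number.
    BufferedInputStream in = new BufferedInputStream(raw);
    in.mark(2);
    int b0 = in.read();
    int b1 = in.read();
    in.reset();
    if (b0 == 0x1f && b1 == 0x8b) {
      return new GZIPInputStream(in);
    }
    return in; // treat as plain text
  }

  /** Reads the whole stream and decodes it as UTF-8. */
  static String readAll(InputStream in) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    int n;
    while ((n = in.read(buf)) != -1) {
      out.write(buf, 0, n);
    }
    return out.toString("UTF-8");
  }
}
```

With this layout, a call such as TextStreams.open("part-00000.deflate", rawStream) selects the zlib decoder from the name alone, while a file with an unregistered extension still goes through the magic-byte check as before.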

      1. hadoop-6714-20-1.patch
        1 kB
        Eli Collins
      2. HADOOP-6714.patch
        1 kB
        Patrick Angeles

        Activity

        Lianhui Wang added a comment -

        That's great. But if the first bytes of an uncompressed file happen to match the header of a compressed file, the file will be misdetected, which is an error.
        In addition, can we use FileOutputStream instead of stdout?

        Eli Collins added a comment -

        Patch for 20 attached.

        Hudson added a comment -

        Integrated in Hadoop-Common-trunk #346 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/346/)

        Hudson added a comment -

        Integrated in Hadoop-Common-trunk-Commit #260 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/260/)
        HADOOP-6714. Resolve compressed files using CodecFactory in FsShell::text. Contributed by Patrick Angeles

        Chris Douglas added a comment -

        +1

        I committed this. Thanks, Patrick!

        Jeff Hammerbacher added a comment -

        Sorry, linked an issue in the wrong window and can't figure out how to delete the link. This issue is in no way related to HDFS-1051.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12442200/HADOOP-6714.patch
        against trunk revision 934619.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/464/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/464/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/464/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/464/console

        This message is automatically generated.

        Patrick Angeles added a comment -

        Attaching simple patch to resolve a file decompressor on 'hadoop fs -text' before continuing with prior behavior.


          People

          • Assignee:
            Patrick Angeles
            Reporter:
            Patrick Angeles
          • Votes:
            0
            Watchers:
            3
