Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.5
    • Fix Version/s: 1.7.6
    • Component/s: java
    • Labels:

      Description

      The Java Avro projects already depend on the commons-compress library; the attached patch adds support for the XZ codec using classes from it. I've also refactored some of the duplicated code in the tools project (the code that deals with the --codec and --level options) by moving a single copy to the Util class and calling it from the various tools that produce Avro files.

      The XZ codec by default uses the LZMA2 algorithm to compress files - this produces noticeably smaller files than the current bzip2 or deflate codecs.
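
As a rough illustration of the shared --codec/--level handling described above, here is a minimal sketch using only the JDK. It is not the code from the patch: the class name CodecOptions, the method resolve, the string it returns, and the assumed XZ default level of 6 are all hypothetical. The per-tool default argument reflects a point raised later in the comments, where the Recodec tool is said to default to null compression while the other tools default to deflate.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.zip.Deflater;

/** Hypothetical helper for illustration; not the actual org.apache.avro.tool.Util code. */
final class CodecOptions {
    // Codec names assumed to be accepted by the tools.
    static final List<String> KNOWN_CODECS =
        Arrays.asList("null", "deflate", "snappy", "bzip2", "xz");

    // Assumption: preset 6 is used when --level is omitted for xz.
    static final int ASSUMED_XZ_DEFAULT_LEVEL = 6;

    private CodecOptions() {}

    /**
     * Resolves the --codec/--level pair to a readable description.
     * codecArg and levelArg are null when the option was not given;
     * toolDefault is the codec a given tool falls back to.
     */
    static String resolve(String codecArg, Integer levelArg, String toolDefault) {
        String codec = (codecArg == null ? toolDefault : codecArg).toLowerCase(Locale.ROOT);
        if (!KNOWN_CODECS.contains(codec)) {
            throw new IllegalArgumentException("Unknown codec: " + codec);
        }
        // In this sketch only deflate and xz take a compression level.
        boolean takesLevel = codec.equals("deflate") || codec.equals("xz");
        if (!takesLevel) {
            return codec;
        }
        int level;
        if (levelArg != null) {
            if (levelArg < 0 || levelArg > 9) {
                throw new IllegalArgumentException("Level must be between 0 and 9: " + levelArg);
            }
            level = levelArg;
        } else {
            level = codec.equals("xz") ? ASSUMED_XZ_DEFAULT_LEVEL : Deflater.DEFAULT_COMPRESSION;
        }
        return codec + "(level=" + level + ")";
    }

    public static void main(String[] args) {
        // A Recodec-like tool (default "null") versus a tool that defaults to deflate.
        System.out.println(resolve(null, null, "null"));
        System.out.println(resolve(null, null, "deflate"));
        System.out.println(resolve("xz", null, "deflate"));
    }
}
```

Keeping the tool default as an explicit argument makes a difference like the Recodec one visible at each call site. In Avro itself the resolved pair would presumably be turned into a CodecFactory, for example through CodecFactory.xzCodec(level), rather than a string.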

      1. xz-codec.patch
        24 kB
        Nick White
      2. AVRO-1373.patch
        23 kB
        Doug Cutting
      3. AVRO-1373.1.patch
        33 kB
        Nick White

        Activity

        Nick White added a comment -

        Could I get a review for this patch? Thanks -

        Doug Cutting added a comment -

        This looks great. Trunk has drifted a bit since this was first submitted.

        Here's a version that applies cleanly against current trunk.

        This fails tests with:
        testRecodec(org.apache.avro.tool.TestRecodecTool): expected:<4835> but was:<290479>

        If we can fix this then I'm up for committing this.

        Nick White added a comment -

        I've attached a modified version of your re-based patch (AVRO-1373.1.patch) that fixes the test. The Recodec tool defaults to null compression, unlike the other tools (that default to deflate)! Thanks -

        Doug Cutting added a comment -

        The latest patch appears to be missing the file XZCodec.java.

        Nick White added a comment -

        Ah - sorry. I've updated AVRO-1373.1.patch with the missing file.

        Doug Cutting added a comment -

        I committed this. Thanks, Nick.

        ASF subversion and git services added a comment -

        Commit 1540620 from Doug Cutting in branch 'avro/trunk'
        [ https://svn.apache.org/r1540620 ]

        AVRO-1373. Java: Add support for xz compression codec, using LZMA2. Contributed by Nick White.

        Nick White added a comment -

        Thanks!

        Hudson added a comment -

        SUCCESS: Integrated in AvroJava #404 (See https://builds.apache.org/job/AvroJava/404/)
        AVRO-1373. Java: Add support for xz compression codec, using LZMA2. Contributed by Nick White. (cutting: rev 1540620)

        • /avro/trunk/CHANGES.txt
        • /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/file/CodecFactory.java
        • /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/file/DataFileConstants.java
        • /avro/trunk/lang/java/avro/src/main/java/org/apache/avro/file/XZCodec.java
        • /avro/trunk/lang/java/avro/src/test/java/org/apache/avro/TestDataFile.java
        • /avro/trunk/lang/java/avro/src/test/java/org/apache/avro/TestDataFileConcat.java
        • /avro/trunk/lang/java/mapred/src/main/java/org/apache/avro/mapred/AvroOutputFormat.java
        • /avro/trunk/lang/java/mapred/src/main/java/org/apache/avro/mapred/tether/TetherOutputFormat.java
        • /avro/trunk/lang/java/mapred/src/main/java/org/apache/avro/mapreduce/AvroOutputFormatBase.java
        • /avro/trunk/lang/java/tools/src/main/java/org/apache/avro/tool/CreateRandomFileTool.java
        • /avro/trunk/lang/java/tools/src/main/java/org/apache/avro/tool/DataFileWriteTool.java
        • /avro/trunk/lang/java/tools/src/main/java/org/apache/avro/tool/FromTextTool.java
        • /avro/trunk/lang/java/tools/src/main/java/org/apache/avro/tool/RecodecTool.java
        • /avro/trunk/lang/java/tools/src/main/java/org/apache/avro/tool/Util.java

          People

          • Assignee: Nick White
          • Reporter: Nick White