  Flink / FLINK-5944

Flink should support reading Snappy Files

Details

    Description

      Snappy is a widely used compression format that offers fast compression and decompression.

      This can be implemented easily by creating a SnappyInflaterInputStreamFactory and registering it in initDefaultInflateInputStreamFactories in FileInputFormat.
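
      A minimal sketch of that idea, using only the JDK so it runs without dependencies. The interface below is a local stand-in assumed to have the same shape as Flink's InflaterInputStreamFactory, and the registry, Decoder hook, and demo() are hypothetical names introduced for illustration. The JDK ships no Snappy codec, so the demo plugs in GZIPInputStream purely as a placeholder; in Flink the hook would presumably wrap org.xerial.snappy.SnappyInputStream instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class SnappyFactorySketch {

    /** Local stand-in assumed to mirror Flink's InflaterInputStreamFactory. */
    interface InflaterInputStreamFactory<T extends InputStream> {
        T create(InputStream in) throws IOException;
        Collection<String> getCommonFileExtensions();
    }

    /** Hypothetical hook for the actual decompressor (e.g. SnappyInputStream::new). */
    interface Decoder {
        InputStream wrap(InputStream in) throws IOException;
    }

    static final class SnappyInflaterInputStreamFactory
            implements InflaterInputStreamFactory<InputStream> {
        private final Decoder decoder;

        SnappyInflaterInputStreamFactory(Decoder decoder) {
            this.decoder = decoder;
        }

        @Override
        public InputStream create(InputStream in) throws IOException {
            return decoder.wrap(in);
        }

        @Override
        public Collection<String> getCommonFileExtensions() {
            return Collections.singletonList("snappy");
        }
    }

    // Extension -> factory map, standing in for what
    // initDefaultInflateInputStreamFactories would populate.
    private static final Map<String, InflaterInputStreamFactory<?>> FACTORIES = new HashMap<>();

    static void register(InflaterInputStreamFactory<?> factory) {
        for (String ext : factory.getCommonFileExtensions()) {
            FACTORIES.put(ext, factory);
        }
    }

    /** Returns the factory registered for the file's extension, or null if none. */
    static InflaterInputStreamFactory<?> forFile(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? null : FACTORIES.get(fileName.substring(dot + 1));
    }

    /** Round trip with GZIP as a placeholder codec (NOT Snappy). */
    static String demo() {
        try {
            ByteArrayOutputStream compressed = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(compressed)) {
                gz.write("hello snappy".getBytes(StandardCharsets.UTF_8));
            }
            register(new SnappyInflaterInputStreamFactory(GZIPInputStream::new));

            InflaterInputStreamFactory<?> factory = forFile("part-0001.snappy");
            ByteArrayOutputStream plain = new ByteArrayOutputStream();
            try (InputStream in =
                    factory.create(new ByteArrayInputStream(compressed.toByteArray()))) {
                byte[] buf = new byte[256];
                for (int n; (n = in.read(buf)) != -1; ) {
                    plain.write(buf, 0, n);
                }
            }
            return new String(plain.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

      Keeping the decompressor behind a small hook is only a convenience for this sketch. It also means the same factory shape could serve both the xerial and the Hadoop variants by swapping the decoder.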

      Flink already includes the Snappy dependency in the project.

      There is one minor gotcha. To use this with Hadoop, we must provide two separate implementations, because Hadoop uses a different variant of the Snappy format than snappy-java (xerial/snappy, the library included in Flink).
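
      One possible way to choose between two implementations is to inspect the first bytes of the file. The sketch below is an assumption-laden illustration, not part of the proposal: to my understanding, streams written by snappy-java's SnappyOutputStream begin with the 8-byte magic 0x82 "SNAPPY" 0x00, the official Snappy framing format begins with the stream identifier ff 06 00 00 "sNaPpY", and Hadoop's block-compressed output has no magic header at all. The class and enum names are hypothetical.

```java
final class SnappyFormatSniffer {

    enum Format { XERIAL_STREAM, FRAMED, UNKNOWN }

    // Magic header assumed to be written by snappy-java's SnappyOutputStream.
    private static final byte[] XERIAL_MAGIC =
            {(byte) 0x82, 'S', 'N', 'A', 'P', 'P', 'Y', 0};

    // Stream identifier chunk of the Snappy framing format.
    private static final byte[] FRAMED_MAGIC =
            {(byte) 0xff, 0x06, 0x00, 0x00, 's', 'N', 'a', 'P', 'p', 'Y'};

    /**
     * Classifies a stream by its leading bytes. UNKNOWN covers everything
     * without a recognized header, which may include Hadoop's block format.
     */
    static Format sniff(byte[] head) {
        if (startsWith(head, XERIAL_MAGIC)) {
            return Format.XERIAL_STREAM;
        }
        if (startsWith(head, FRAMED_MAGIC)) {
            return Format.FRAMED;
        }
        return Format.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

      Because the Hadoop variant carries no identifying header, a sniffer like this can only rule formats in, not out. An explicit configuration option or a distinct file extension would likely be a more reliable way to select the Hadoop-compatible implementation.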

            People

              Assignee: Unassigned
              Reporter: Ilya Ganelin (ilganeli)
              Votes: 1
              Watchers: 5
