Hadoop Common / HADOOP-441

SequenceFile should support 'custom compressors'

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.6.0
    • Component/s: io
    • Labels: None

    Description

      SequenceFile should support 'custom compressors' that the user specifies when creating the file.

      The readily available gzip and zip packages (java.util.zip) are obvious first choices to support. There will also be hooks so that other compressors can be added in the future, as long as (input/output) streams can be constructed on top of the compressor/decompressor.

      The 'classname' of the custom compressor/decompressor could be stored in the header of the SequenceFile, where SequenceFile.Reader can use it to pick the appropriate decompressor. I therefore propose adding constructors to SequenceFile.Writer that take the 'classname' of the compressor's input/output stream classes (e.g. DeflaterOutputStream/InflaterInputStream or GZIPOutputStream/GZIPInputStream); these class names act as the hook for future compressors/decompressors.
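
      The proposal above can be illustrated with a minimal sketch. It is not the SequenceFile implementation and not the attached patch: the class CodecHeaderSketch, its methods, and the one-field "header" are hypothetical, chosen only to show how a stream class name stored at the start of a file lets a reader pick the matching decompressor by reflection. It assumes each stream class has a public constructor taking a single OutputStream or InputStream, which holds for the java.util.zip classes named in the description.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration only; none of these names come from Hadoop.
final class CodecHeaderSketch {

    // Instantiate the named stream class around a raw stream, assuming a
    // public one-argument constructor whose parameter type is `kind`.
    static <T> T wrap(String className, Class<T> kind, Object raw) throws Exception {
        return Class.forName(className)
                .asSubclass(kind)
                .getConstructor(kind)
                .newInstance(raw);
    }

    // Write a toy "file": the decompressor class name, then the compressed body.
    static byte[] write(String outClass, String inClass, byte[] payload) throws Exception {
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        DataOutputStream header = new DataOutputStream(file);
        header.writeUTF(inClass); // stored uncompressed so any reader can parse it
        header.flush();
        try (OutputStream body = wrap(outClass, OutputStream.class, file)) {
            body.write(payload);
        }
        return file.toByteArray();
    }

    // Read the class name from the header, then decompress the remaining bytes.
    static byte[] read(byte[] fileBytes) throws Exception {
        ByteArrayInputStream file = new ByteArrayInputStream(fileBytes);
        String inClass = new DataInputStream(file).readUTF();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream body = wrap(inClass, InputStream.class, file)) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = body.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return out.toByteArray();
    }
}
```

      In this sketch the writer is given both class names (for example "java.util.zip.GZIPOutputStream" and "java.util.zip.GZIPInputStream"), while the reader needs nothing beyond the file itself. A compressor added later would only need stream classes with the same constructor shape.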

      Attachments

        1. codec.patch (20 kB, Owen O'Malley)
        2. codec_updated_interfaces_20060830.patch (22 kB, Arun Murthy)
        3. codec20060831.patch (21 kB, Arun Murthy)
        4. codec.patch (45 kB, Arun Murthy)
        5. reports.tgz (81 kB, Arun Murthy)
        6. codec_20060907.patch (44 kB, Arun Murthy)

            People

              Assignee: Arun Murthy (acmurthy)
              Reporter: Arun Murthy (acmurthy)