Hadoop Common / HADOOP-8003

Make SplitCompressionInputStream an interface instead of an abstract class


    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 1.0.0, 0.21.0, 0.22.0, 0.23.0
    • Fix Version/s: None
    • Component/s: io
    • Labels:


      To be splittable, a codec must implement SplittableCompressionCodec, whose createInputStream method returns a SplitCompressionInputStream.

      SplitCompressionInputStream is an abstract class that extends CompressionInputStream, the lowest-level compression stream class.

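      Because Java allows a class to extend only one superclass, the stream of a codec that wants to be splittable cannot also extend DecompressorStream or BlockDecompressorStream, so it cannot reuse any of their code.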

      You either have to duplicate that code, or not be splittable.

      SplitCompressionInputStream adds only a few very thin methods. Can we make it an interface rather than an abstract class, so that splittable decompression streams can extend DecompressorStream, BlockDecompressorStream, or whatever else we come up with in the future?
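
      A rough sketch of the shape being proposed, under stated assumptions: every class and interface below is a hypothetical, simplified stand-in (the names SketchDecompressorStream, SketchSplitStream and SketchSplittableStream do not exist in Hadoop), and only the two offset getters are modelled. It illustrates the inheritance point only, not real decompression.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for a reusable base stream such as DecompressorStream.
// A real one would decompress; this one simply passes bytes through.
class SketchDecompressorStream extends FilterInputStream {
    SketchDecompressorStream(InputStream in) {
        super(in);
    }
}

// The proposal: expose the split bookkeeping as an interface,
// so it no longer occupies the single superclass slot.
interface SketchSplitStream {
    long getAdjustedStart();
    long getAdjustedEnd();
}

// A splittable stream can now inherit the shared read logic
// and still report its split boundaries.
class SketchSplittableStream extends SketchDecompressorStream implements SketchSplitStream {
    private final long start;
    private final long end;

    SketchSplittableStream(InputStream in, long start, long end) {
        super(in);
        this.start = start;
        this.end = end;
    }

    @Override
    public long getAdjustedStart() {
        return start;
    }

    @Override
    public long getAdjustedEnd() {
        return end;
    }
}

class SplitInterfaceDemo {
    public static void main(String[] args) throws IOException {
        byte[] data = {42, 43};
        try (SketchSplittableStream s =
                 new SketchSplittableStream(new ByteArrayInputStream(data), 0L, 2L)) {
            System.out.println(s.read());           // read() is inherited from the base stream
            System.out.println(s.getAdjustedEnd()); // offsets come from the interface
        }
    }
}
```

      With an abstract class in place of SketchSplitStream, the `extends SketchDecompressorStream` clause would not compile, which is the duplication problem described above.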

      To my knowledge, this would affect only the BZip2 codec. None of the others implement this form of splittability yet.

      LineRecordReader looks only at whether the codec is an instance of SplittableCompressionCodec, and then calls the appropriate version of createInputStream. This would not change, so application code should not have to change; only the BZip2 codec and SplitCompressionInputStream would.
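
      The dispatch described above can be sketched roughly as follows. This is not the LineRecordReader source: DemoCodec, DemoSplittableCodec, DemoPlainCodec, DemoBlockCodec and DemoReader are hypothetical stand-ins, and the split-aware overload is simplified to take only a start and end offset. The point is that the caller branches on the codec type alone, so the concrete stream class behind createInputStream can change without touching the caller.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

// Hypothetical stand-in for a plain (non-splittable) codec contract.
interface DemoCodec {
    InputStream createInputStream(InputStream raw);
}

// Hypothetical stand-in for a splittable codec contract:
// adds an overload that receives the byte range of the split.
interface DemoSplittableCodec extends DemoCodec {
    InputStream createInputStream(InputStream raw, long start, long end);
}

class DemoPlainCodec implements DemoCodec {
    @Override
    public InputStream createInputStream(InputStream raw) {
        return raw; // no real decompression in this sketch
    }
}

class DemoBlockCodec implements DemoSplittableCodec {
    @Override
    public InputStream createInputStream(InputStream raw) {
        return raw;
    }

    @Override
    public InputStream createInputStream(InputStream raw, long start, long end) {
        return raw; // a real codec would align start/end to block boundaries
    }
}

class DemoReader {
    // Chooses the overload from the codec type only and reports which path ran.
    static String open(DemoCodec codec, InputStream raw, long start, long end) {
        if (codec instanceof DemoSplittableCodec) {
            ((DemoSplittableCodec) codec).createInputStream(raw, start, end);
            return "split-aware";
        }
        codec.createInputStream(raw);
        return "whole-stream";
    }

    public static void main(String[] args) {
        InputStream raw = new ByteArrayInputStream(new byte[0]);
        System.out.println(open(new DemoBlockCodec(), raw, 0L, 10L));
        System.out.println(open(new DemoPlainCodec(), raw, 0L, 10L));
    }
}
```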

        Issue Links


          Tim Broberg made changes -
          Field         Original Value    New Value
          Status        Open [ 1 ]        Resolved [ 5 ]
          Resolution                      Won't Fix [ 2 ]
          Tim Broberg made changes -
          Field         Original Value    New Value
          Link                            This issue relates to HADOOP-7823 [ HADOOP-7823 ]
          Tim Broberg created issue -


            • Assignee:
              Tim Broberg
            • Votes:
              0
            • Watchers:
              3


              • Created: