Details

    • streaming compaction

    Description

      The task is to support compaction of partitions.

      Rationale: Streaming partitions are composed of a large number of small files (each commit is one file). Since compaction can be an expensive operation (e.g. converting to a single ORC file), we do not compact the streaming partition at the time of rolling it into a standard partition. This keeps rolling quick and atomic.

      Compaction will be performed at a later time. The streaming partition is converted as is (typically with many small files) into a standard partition. This new standard partition is then queued for compaction by a separate job.

      This decouples the compaction feature from streaming support and makes it generally available for any partition.
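
      The roll-then-compact-later flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual implementation: the names `roll_partition`, `compact_partition`, `run_compactor_once` and `compaction_queue` are hypothetical, an in-process queue stands in for whatever mechanism hands partitions to the separate job, and plain byte concatenation stands in for a real conversion such as writing a single ORC file.

```python
import os
import queue

# Hypothetical stand-in for the mechanism that hands partitions
# to the separate compaction job.
compaction_queue = queue.Queue()


def roll_partition(streaming_dir, standard_dir):
    """Roll a streaming partition into a standard one without compacting it."""
    # Assumption: a single directory rename is what keeps the roll quick
    # and atomic. The small per-commit files are moved as is.
    os.rename(streaming_dir, standard_dir)
    # Compaction is deferred: only record that this partition needs it.
    compaction_queue.put(standard_dir)


def compact_partition(partition_dir, merged_name="compacted.dat"):
    """Merge all files of any partition directory into a single file."""
    small_files = sorted(
        name
        for name in os.listdir(partition_dir)
        if name != merged_name
        and os.path.isfile(os.path.join(partition_dir, name))
    )
    # Write to a temporary file first so a failed run leaves the inputs intact.
    tmp_path = os.path.join(partition_dir, merged_name + ".tmp")
    with open(tmp_path, "wb") as out:
        for name in small_files:
            with open(os.path.join(partition_dir, name), "rb") as src:
                # Concatenation is a placeholder for a real format conversion.
                out.write(src.read())
    merged_path = os.path.join(partition_dir, merged_name)
    os.replace(tmp_path, merged_path)
    # Only after the merged file is in place are the small files removed.
    for name in small_files:
        os.remove(os.path.join(partition_dir, name))
    return merged_path


def run_compactor_once():
    """Separate job: drain the queue and compact each queued partition."""
    compacted = []
    while not compaction_queue.empty():
        compacted.append(compact_partition(compaction_queue.get()))
    return compacted
```

      Because `compact_partition` only needs a partition directory, it is not tied to streaming: any partition could be placed on the queue, which is the decoupling the description aims for.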

          People

            roshan_naik Roshan Naik
            Votes: 0
            Watchers: 3
