HBase / HBASE-3649

Separate compression setting for flush files

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      In this thread on user@hbase (http://search-hadoop.com/m/WUnLM6ojHm1), J-D conjectures that compressing flush files leads to a suboptimal situation where "the puts are sometimes blocked on the memstores which are blocked by the flusher thread which is blocked because there's too many files to compact because the compactor is given too many small files to compact and has to compact the same data a bunch of times."

      We already have a separate compression setting for store files written by major compaction versus those written during minor compaction, intended for background/archival apps. Add a separate compression setting for flush files, defaulting to none, to avoid the above condition.
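To make the file-count condition concrete, here is a minimal toy model. It is not HBase code: the function name `simulate`, its parameters, and the compaction schedule are all assumptions chosen for illustration. Each flush writes exactly one store file, a compaction periodically merges all files into one, and a flush counts as "blocked" when the store already holds at least `blocking_files` files.

```python
def simulate(num_flushes, memstore_bytes, compression_ratio,
             blocking_files, flushes_per_compaction):
    """Toy model of one store (hypothetical, not the HBase implementation).

    Returns (blocked_flushes, final_file_count, total_bytes_on_disk).
    """
    files = []    # sizes in bytes of the store files currently on disk
    blocked = 0   # flushes that found too many files already present
    for i in range(1, num_flushes + 1):
        if len(files) >= blocking_files:
            # Simplification: the flush is counted as blocked but still proceeds.
            blocked += 1
        # One flush always produces one file; compression only changes its size.
        files.append(int(memstore_bytes * compression_ratio))
        if i % flushes_per_compaction == 0:
            # Assumed schedule: the compactor merges everything into one file.
            files = [sum(files)]
    return blocked, len(files), sum(files)


if __name__ == "__main__":
    mib = 2 ** 20
    raw = simulate(20, 64 * mib, 1.0, blocking_files=3, flushes_per_compaction=5)
    compressed = simulate(20, 64 * mib, 0.25, blocking_files=3, flushes_per_compaction=5)
    print("uncompressed flushes:", raw)
    print("compressed flushes:  ", compressed)
```

Under these assumptions the blocked-flush count and the file count come out the same with or without compression; only the bytes on disk differ. The model deliberately ignores the CPU cost of compressing during a flush, which would be a separate effect.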

        Activity

        Andrew Purtell added a comment -

        Indeed this doesn't seem necessary/useful upon further reflection.

        Jean-Daniel Cryans added a comment -

        Are you working on this Andrew? Would you mind if we punt it as we are trying to get 0.90.2 ready soon?

        stack added a comment -

        I thought the problem was that compression slowed the flush. If the problem is rather the count of files, then yeah, compression doesn't factor in.

        I think the better solution would be "merging flushes"?

        It's about time we did this (it's only 5 years since it was described in the BT paper). I made HBASE-3656.

        Todd Lipcon added a comment -

        Not sure I follow here... turning off compression for the flush files would just make them bigger, but we'd still have the same number of files....

        I think the better solution would be "merging flushes"?


          People

          • Assignee: Unassigned
          • Reporter: Andrew Purtell
          • Votes: 1
          • Watchers: 2
