Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.98.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Currently, compaction policy decisions are made at the granularity of a single machine. We had the thought of introducing a new cluster-granularity decision, so that compaction can adapt to the running status of the whole cluster in cases such as:
      1) Many nodes are compacting aggressively at the same time (we call this a "cluster compaction storm"); we should throttle it.
      2) Traffic in the cluster is low; we should do more compaction then (similar to the off-peak feature), triggered by load, QPS, or other metrics rather than limited to a configured time range (like the off-peak time range).
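      The two cases above can be illustrated with a minimal sketch of a cluster-level decision. This is not existing HBase code: the class name, the enum, and both thresholds (the fraction of compacting nodes that counts as a storm, and the QPS below which traffic counts as low) are assumptions made up for illustration.

```java
// Hypothetical sketch; none of these names exist in HBase.
enum CompactionMode { THROTTLE, NORMAL, AGGRESSIVE }

final class ClusterCompactionPolicy {
    // Assumed threshold: fraction of nodes compacting at once that counts as a "storm".
    private final double stormFraction;
    // Assumed threshold: cluster-wide QPS below which traffic counts as "low".
    private final double lowTrafficQps;

    ClusterCompactionPolicy(double stormFraction, double lowTrafficQps) {
        this.stormFraction = stormFraction;
        this.lowTrafficQps = lowTrafficQps;
    }

    // Decide a cluster-wide mode from the current cluster status.
    CompactionMode decide(int compactingNodes, int totalNodes, double clusterQps) {
        if (totalNodes <= 0) {
            throw new IllegalArgumentException("totalNodes must be positive");
        }
        double fraction = (double) compactingNodes / totalNodes;
        if (fraction >= stormFraction) {
            return CompactionMode.THROTTLE;   // case 1: compaction storm
        }
        if (clusterQps < lowTrafficQps) {
            return CompactionMode.AGGRESSIVE; // case 2: low traffic, compact more
        }
        return CompactionMode.NORMAL;
    }
}
```

      In this sketch the storm check takes precedence over the low-traffic check, so a quiet cluster that is already compacting heavily is still throttled. That ordering is a design choice of the example, not something settled in this issue.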

      Comments? Thanks.

        Activity

        stack added a comment -

        Sounds like a good idea to me.

        How do you think it would work? When a region wants to compact, it would send its desire, along with stats on the compaction it wants to run (bytes, files), to a central location. Then the coordinator, or 'central planner', would decide who could compact and when? When a compaction starts/stops, it would let the coordinator know.

        The Master should be the coordinator. Would the desire to compact come in on the back of the heartbeats? (We should try and get 'load' off the heartbeats and have 'load' instead come into the server via metrics). Master would need to keep these stats somewhere? In a system table?
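
        The request/grant flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class names, the fixed concurrency limit, and the FIFO grant order are all hypothetical, and the sketch keeps its state in memory rather than in a system table or on the Master.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical request carrying the stats mentioned above (bytes, files).
final class CompactionRequest {
    final String region;
    final long bytes;
    final int files;

    CompactionRequest(String region, long bytes, int files) {
        this.region = region;
        this.bytes = bytes;
        this.files = files;
    }
}

// Hypothetical in-memory coordinator ("central planner").
final class CompactionCoordinator {
    private final int maxConcurrent; // assumed cluster-wide limit
    private final Set<String> running = new HashSet<>();
    private final Queue<CompactionRequest> waiting = new ArrayDeque<>();

    CompactionCoordinator(int maxConcurrent) {
        this.maxConcurrent = maxConcurrent;
    }

    // A region asks to compact. Returns true if it may start now;
    // otherwise the request is queued and false is returned.
    synchronized boolean request(CompactionRequest r) {
        if (running.size() < maxConcurrent) {
            running.add(r.region);
            return true;
        }
        waiting.add(r);
        return false;
    }

    // A region reports that its compaction stopped. Returns the region
    // granted next (FIFO order), or null if nothing was granted.
    synchronized String finished(String region) {
        running.remove(region);
        if (running.size() >= maxConcurrent) {
            return null;
        }
        CompactionRequest next = waiting.poll();
        if (next == null) {
            return null;
        }
        running.add(next.region);
        return next.region;
    }

    synchronized int runningCount() {
        return running.size();
    }
}
```

        A real planner would probably use the bytes and files stats to order or weight the grants; the sketch only carries them along and grants in arrival order.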

        Should we call it "central planning compaction" rather than adaptive?

        Good on you Liang Xie

        Vladimir Rodionov added a comment -

        Compaction Scheduler?

        Vladimir Rodionov added a comment -

        Is it possible to create an umbrella ticket for all compaction-related JIRAs? New-feature JIRAs, not bug JIRAs?

        stack added a comment -

        Vladimir Rodionov Good idea. I made HBASE-9530. Will make this a subtask of it. I made you a contributor so you can do same if inclined. Thanks.

        Liang Xie added a comment -

        Yeah, both "central planning compaction" and "compaction scheduler" seem more suitable.


          People

          • Assignee:
            Unassigned
            Reporter:
            Liang Xie
          • Votes:
            0
            Watchers:
            13
