  1. HBase
  2. HBASE-3325

Optimize log splitter to not output obsolete edits

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 0.92.0
    • Fix Version/s: None
    • Component/s: master, regionserver
    • Labels:
      None

      Description

      Currently, when the master splits logs, it outputs every edit it finds, including those already made obsolete by flushes. At replay time, the region server (RS) discards the edits that have already been flushed.

      A fairly simple optimization is possible here: each RS could publish a map of "region id -> last flushed seq id" to ZooKeeper (the update can lag by some seconds without any problems). Then, during log splitting, if this map is available, the master can discard any edits in the logs that were already flushed, and so output a much smaller amount of data.
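
A minimal sketch of the filtering step proposed above, in plain Java. None of these names are HBase classes: `ObsoleteEditFilter`, `Edit`, and `dropFlushed` are hypothetical, and the map is passed in directly rather than read from ZooKeeper. It assumes an edit is obsolete when its sequence id is less than or equal to the last flushed sequence id recorded for its region.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; these are not HBase classes. */
final class ObsoleteEditFilter {

    /** Hypothetical stand-in for one log entry: owning region plus sequence id. */
    static final class Edit {
        final String regionId;
        final long seqId;

        Edit(String regionId, long seqId) {
            this.regionId = regionId;
            this.seqId = seqId;
        }
    }

    /**
     * Returns the edits the splitter would still need to write out.
     * Assumption: an edit is obsolete when its seqId is <= the last flushed
     * seq id recorded for its region. Regions missing from the map keep all
     * of their edits.
     */
    static List<Edit> dropFlushed(List<Edit> edits, Map<String, Long> lastFlushedSeqId) {
        List<Edit> kept = new ArrayList<>();
        for (Edit e : edits) {
            Long flushed = lastFlushedSeqId.get(e.regionId);
            if (flushed == null || e.seqId > flushed) {
                kept.add(e);
            }
        }
        return kept;
    }
}
```

Under these assumptions, a map entry that lags behind the real flush point only causes fewer edits to be dropped, and the replay-time check on the RS would still discard the rest.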

              People

              • Assignee:
                Unassigned
                Reporter:
                tlipcon Todd Lipcon
              • Votes:
                0
                Watchers:
                2