HBase / HBASE-12311

Version stats in HFiles?

Details

    • Type: Brainstorming
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid

    Description

      In HBASE-9778 I basically punted to the user the decision on whether to issue repeated scanner.next() calls instead of (re)seeks.
      I think we can do better.

      One way to do that is to maintain simple stats on the maximum number of versions we've seen for any row/column combination and store them in the HFile's metadata (just like the time range, oldest Put, etc.).

      Then we can estimate fairly accurately whether to expect lots of versions (i.e., seeking between columns is better) or not (in which case we'd issue repeated next() calls).
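
      A minimal sketch of the idea, not the attached CellStatTracker.java and not an HBase API. The class name VersionStatsSketch, its methods, and the seekThreshold parameter are all hypothetical. It assumes cells arrive in HFile sort order, so all versions of one row/column are adjacent, and it models row and column as plain strings to stay self-contained.

```java
import java.util.Objects;

/** Hypothetical sketch: tracks the largest number of versions seen for any row/column. */
final class VersionStatsSketch {
    private String lastRow;
    private String lastColumn;
    private long currentRun;   // versions seen so far for the current row/column
    private long maxVersions;  // largest run seen so far in this file

    /** Call once per cell while writing, in sort order (row, then column, then timestamp). */
    void track(String row, String column) {
        if (Objects.equals(row, lastRow) && Objects.equals(column, lastColumn)) {
            currentRun++;              // another version of the same row/column
        } else {
            lastRow = row;             // a new row/column starts a new run
            lastColumn = column;
            currentRun = 1;
        }
        maxVersions = Math.max(maxVersions, currentRun);
    }

    /** The value that would be stored in the file's metadata when the file is closed. */
    long maxVersions() {
        return maxVersions;
    }

    /**
     * Hypothetical read-side heuristic: prefer a seek to the next column when the
     * file may hold many more versions than the scan asked for; otherwise prefer
     * repeated next() calls. seekThreshold is an assumed tuning knob.
     */
    static boolean preferSeek(long maxVersionsInFile, int versionsRequested, long seekThreshold) {
        long mayHaveToSkip = maxVersionsInFile - versionsRequested;
        return mayHaveToSkip >= seekThreshold;
    }
}
```

      With a stored maximum of 3 and a scan asking for 1 version, at most 2 cells per column need skipping, so repeated next() calls are likely cheaper. With a stored maximum of 1000, a seek is likely the better choice. The right threshold would have to be measured.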

      Attachments

        1. 12311-indexed-0.98-v2.txt
          19 kB
          Lars Hofhansl
        2. 12311-indexed-0.98.txt
          7 kB
          Lars Hofhansl
        3. 12311-v3.txt
          72 kB
          Lars Hofhansl
        4. 12311-v2.txt
          67 kB
          Lars Hofhansl
        5. 12311.txt
          40 kB
          Lars Hofhansl
        6. CellStatTracker.java
          2 kB
          Lars Hofhansl

            People

              Assignee: Unassigned
              Reporter: Lars Hofhansl (larsh)
              Votes: 0
              Watchers: 10