HBase / HBASE-2673

Investigate consistency of intra-row scans

    Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.90.0
    • Fix Version/s: None
    • Component/s: documentation, regionserver
    • Labels: None

      Description

      I suspect that intra-row scanning does not provide a consistent view of the row. We should investigate whether this is true and document how the feature interacts with the row-level consistency guarantee.
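
To make the concern concrete, here is a minimal sketch of how a row read in several batches could mix two versions of that row. It is a toy in-memory model, not the HBase client API: the class `IntraRowScanSketch` and the methods `putRow`, `nextBatch`, and `tornRead` are hypothetical names introduced for illustration, and the model assumes each batch simply reads whatever the row holds at the moment the batch is fetched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory model of a single row. This is NOT the HBase client API;
// every name here is hypothetical.
class IntraRowScanSketch {
    // Column qualifier -> value, kept sorted like the columns of a row.
    private final TreeMap<String, String> row = new TreeMap<>();

    // Apply a multi-column write in one step, standing in for an atomic row put.
    void putRow(Map<String, String> columns) {
        row.putAll(columns);
    }

    // Return the values of up to `batch` columns strictly after `lastSeen`
    // (null means "start of row"), reading whatever the row holds right now.
    List<String> nextBatch(String lastSeen, int batch) {
        Map<String, String> rest =
            (lastSeen == null) ? row : row.tailMap(lastSeen, false);
        List<String> out = new ArrayList<>();
        for (String value : rest.values()) {
            if (out.size() == batch) {
                break;
            }
            out.add(value);
        }
        return out;
    }

    // Read a four-column row in two batches of two, with a full-row
    // overwrite landing between the batches.
    static List<String> tornRead() {
        IntraRowScanSketch store = new IntraRowScanSketch();
        store.putRow(Map.of("a", "v1", "b", "v1", "c", "v1", "d", "v1"));

        List<String> seen = new ArrayList<>(store.nextBatch(null, 2)); // columns a, b
        store.putRow(Map.of("a", "v2", "b", "v2", "c", "v2", "d", "v2"));
        seen.addAll(store.nextBatch("b", 2)); // columns c, d
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(tornRead()); // prints [v1, v1, v2, v2]
    }
}
```

In this model the reader ends up with columns a and b from the first write and columns c and d from the second, a combination that never existed as a whole row. Whether the real region server behaves this way is exactly what this issue asks to investigate.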

        Activity

        ryan rawson added a comment -

        The fix for HBASE-2616 should help here actually...


        Todd Lipcon added a comment -

        I thought about this a bit tonight. I think this is essentially impossible to implement unless we do the following:

        • Add logical timestamps to HFiles (a few bytes per KV if we use vints and relative to an hfile-wide meta entry)
        • Add to the scanner API so that each scan result object also returns the current logical timestamp
        • Add logical timestamps to HLog entries so that a server that replays the edits maintains the same logical timestamps of each row.

        I think these are all needed in order to maintain consistency in the face of failure or through a flush operation.

        Rather than do all of the above, I think we should simply document in Scanner.setBatch that using intra-row scanning loses the consistency guarantee. Also we'll want to augment the acid guarantees doc to state this.
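
As an illustration of the second bullet only (returning a logical timestamp with each scan result), the following sketch shows how a client could at least detect that a row changed between batches. It is a toy in-memory model under stated assumptions: `LogicalTimestampSketch`, `Batch`, `putRow`, `nextBatch`, and `readIsConsistent` are hypothetical names, not HBase APIs, and the sketch does not model HFiles, HLog replay, flushes, or server failure, which are the cases the other two bullets are meant to cover.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model: every row write bumps a logical timestamp, and every batch
// reports the timestamp that was current when it was read.
// All names are hypothetical; this is not the HBase scanner API.
class LogicalTimestampSketch {
    // One batch of column values plus the row's logical timestamp at read time.
    static final class Batch {
        final List<String> values;
        final long logicalTs;

        Batch(List<String> values, long logicalTs) {
            this.values = values;
            this.logicalTs = logicalTs;
        }
    }

    private final TreeMap<String, String> row = new TreeMap<>();
    private long logicalTs = 0; // incremented once per row write

    void putRow(Map<String, String> columns) {
        row.putAll(columns);
        logicalTs++;
    }

    // Return up to `batch` column values strictly after `lastSeen`
    // (null means "start of row"), tagged with the current logical timestamp.
    Batch nextBatch(String lastSeen, int batch) {
        Map<String, String> rest =
            (lastSeen == null) ? row : row.tailMap(lastSeen, false);
        List<String> out = new ArrayList<>();
        for (String value : rest.values()) {
            if (out.size() == batch) {
                break;
            }
            out.add(value);
        }
        return new Batch(out, logicalTs);
    }

    // Read a four-column row in two batches and report whether both batches
    // carried the same logical timestamp.
    static boolean readIsConsistent(boolean writeBetweenBatches) {
        LogicalTimestampSketch store = new LogicalTimestampSketch();
        store.putRow(Map.of("a", "v1", "b", "v1", "c", "v1", "d", "v1"));

        Batch first = store.nextBatch(null, 2);
        if (writeBetweenBatches) {
            store.putRow(Map.of("a", "v2", "b", "v2", "c", "v2", "d", "v2"));
        }
        Batch second = store.nextBatch("b", 2);
        return first.logicalTs == second.logicalTs;
    }

    public static void main(String[] args) {
        System.out.println(readIsConsistent(false)); // prints true
        System.out.println(readIsConsistent(true));  // prints false
    }
}
```

Detection alone does not restore consistency: a client that sees mismatched timestamps would still have to retry the row, and the timestamps would have to survive flushes and log replay to be trustworthy.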

        stack added a comment -

        I say punt on this. I already updated our ACID doc to say that intra-row scanning does not allow a consistent view.


          People

          • Assignee: Todd Lipcon
          • Reporter: Todd Lipcon
          • Votes: 0
          • Watchers: 3
