HBase / HBASE-7774

RegionObserver.prePut() cannot rely on the Put's timestamps, can even cause data loss

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Duplicate
    • 0.92.2, 0.94.4, 0.95.2

    Description

      A user had code like this in a coprocessor's prePut():

      // put.has() compares KVs, which calls kv.getTimestamp() on them
      if (put.has(expectedKv)) {
        put.add(kvSayingIFoundIt);
      } else {
        put.add(kvSayingNotFound);
      }
      

      With MSLAB turned off and expectedKv present in the Put, a Get issued after the insert returns only kvSayingIFoundIt, not the KV you were actually inserting.

      Moreover, if you only call put.has(expectedKv) and add nothing, the Get returns nothing at all. Your data seems to be gone.

      The reason is that in prePut() the timestamp hasn't been set yet, so the KV still carries the placeholder Long.MAX_VALUE. Calling kv.getTimestamp() during the comparisons in put.has() populates kv.timestampCache with that value. The KV then stays in the MemStore with the cached Long.MAX_VALUE timestamp and is filtered out on read, because TimeRange compares Long.MAX_VALUE >= Long.MAX_VALUE and returns SKIP.
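The sequence described above can be modelled in plain Java. This is only a sketch under stated assumptions, not HBase code: SimpleKv, withinTimeRange, and visibleAfterPut are hypothetical names, the -1 sentinel for the cache is an assumption, and the time range is modelled as half-open so that an upper bound of Long.MAX_VALUE excludes Long.MAX_VALUE itself.

```java
public class TimestampCacheSketch {
    // Placeholder a KV carries until the server assigns the real timestamp.
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    /** Hypothetical stand-in for a KeyValue; not HBase's actual class. */
    static final class SimpleKv {
        private long storedTimestamp = LATEST_TIMESTAMP;
        private long timestampCache = -1L; // assumption: -1 means "not cached yet"

        long getTimestamp() {
            if (timestampCache == -1L) {
                timestampCache = storedTimestamp; // first read freezes whatever is stored now
            }
            return timestampCache;
        }

        void updateLatestStamp(long now) {
            storedTimestamp = now; // the cache is not refreshed, which models the bug
        }
    }

    /** Half-open range [min, max): a simplified model of the read-side check. */
    static boolean withinTimeRange(long ts, long min, long max) {
        return ts >= min && ts < max;
    }

    /** Returns whether a default read would see the KV after the put completes. */
    public static boolean visibleAfterPut(boolean readTimestampInPrePut) {
        SimpleKv kv = new SimpleKv();
        if (readTimestampInPrePut) {
            kv.getTimestamp(); // what the comparison inside put.has() ends up doing
        }
        kv.updateLatestStamp(System.currentTimeMillis()); // server stamps the KV
        return withinTimeRange(kv.getTimestamp(), 0L, Long.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(visibleAfterPut(false)); // true
        System.out.println(visibleAfterPut(true));  // false
    }
}
```

In this model the only difference between the two runs is a single early getTimestamp() call, which is enough to make the KV invisible to a read over the default time range.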

      And the reason it works correctly with MSLAB on is that the KV is cloned in maybeCloneWithAllocator() and the cache is reset.
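Why the clone hides the problem can be sketched the same way. Again this is a simplified model and not the HBase implementation: SimpleKv, cloneData, and timestampSeenByReader are hypothetical names, and the model assumes the server stamps the KV before it is cloned into the MemStore.

```java
public class CloneResetsCacheSketch {
    /** Hypothetical stand-in for a KeyValue; not HBase's actual class. */
    static final class SimpleKv {
        private long storedTimestamp;
        private long timestampCache = -1L; // assumption: -1 means "not cached yet"

        SimpleKv(long storedTimestamp) {
            this.storedTimestamp = storedTimestamp;
        }

        long getTimestamp() {
            if (timestampCache == -1L) {
                timestampCache = storedTimestamp;
            }
            return timestampCache;
        }

        void updateLatestStamp(long now) {
            storedTimestamp = now; // cache left untouched
        }

        /** Copies the stored data only; the copy starts with an empty cache. */
        SimpleKv cloneData() {
            return new SimpleKv(storedTimestamp);
        }
    }

    /** Returns the timestamp a reader would get from the KV held in the store. */
    public static long timestampSeenByReader(boolean cloneBeforeStoring, long now) {
        SimpleKv kv = new SimpleKv(Long.MAX_VALUE);
        kv.getTimestamp();         // prePut() comparison caches Long.MAX_VALUE
        kv.updateLatestStamp(now); // server assigns the real timestamp
        SimpleKv stored = cloneBeforeStoring ? kv.cloneData() : kv;
        return stored.getTimestamp();
    }

    public static void main(String[] args) {
        System.out.println(timestampSeenByReader(true, 1234L));  // 1234
        System.out.println(timestampSeenByReader(false, 1234L)); // Long.MAX_VALUE
    }
}
```

The cloned KV re-reads the stamped value on first access, while the original object keeps returning the stale cached one.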

      Now, I think this has bigger implications. Basically, you can't rely on the timestamp at all in prePut(). I'm sure this can screw someone else in a creative way later.

            People

              Assignee: Unassigned
              Reporter: Jean-Daniel Cryans (jdcryans)