HBase / HBASE-2248

Provide new non-copy mechanism to assure atomic reads in get and scan

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.20.3
    • Fix Version/s: 0.20.4
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      This patch changes the Get code path to be a Scan of one row. This means that inserting cells out of timestamp order should now work (tests to verify will follow as part of HBASE-2294), but also that a delete at an explicit timestamp now overshadows EVEN if the affected cell is put after the delete. (The old Get code path did an early-out, so a subsequent put would not see the delete.)
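The behavior change in the release note can be illustrated with a toy model. The sketch below is not HBase code: `Cell`, `earlyOutGet`, and `scanStyleGet` are hypothetical names, and the "old" path is simplified to "walk the writes newest-first and stop at the first live put", which only approximates an early-out read. The scan-style path sorts by timestamp (newest first, delete markers ahead of puts at the same timestamp) before resolving, so a delete masks a put at the same timestamp regardless of write order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DeleteMaskingSketch {

    /** Hypothetical stand-in for one version of a single column: a put or a delete marker. */
    public static final class Cell {
        final long timestamp;
        final boolean isDelete;
        final String value; // null for delete markers

        private Cell(long timestamp, boolean isDelete, String value) {
            this.timestamp = timestamp;
            this.isDelete = isDelete;
            this.value = value;
        }

        public static Cell put(long timestamp, String value) {
            return new Cell(timestamp, false, value);
        }

        public static Cell delete(long timestamp) {
            return new Cell(timestamp, true, null);
        }
    }

    /** Toy early-out read (assumption): walk writes newest-first, return the first put not yet masked. */
    public static String earlyOutGet(List<Cell> writeOrder) {
        Set<Long> deleted = new HashSet<>();
        for (int i = writeOrder.size() - 1; i >= 0; i--) {
            Cell c = writeOrder.get(i);
            if (c.isDelete) {
                deleted.add(c.timestamp);
            } else if (!deleted.contains(c.timestamp)) {
                return c.value; // early-out: older writes are never inspected
            }
        }
        return null;
    }

    /** Toy scan-style read: order by timestamp descending, deletes before puts, then resolve. */
    public static String scanStyleGet(List<Cell> writeOrder) {
        List<Cell> sorted = new ArrayList<>(writeOrder);
        sorted.sort((a, b) -> {
            if (a.timestamp != b.timestamp) {
                return Long.compare(b.timestamp, a.timestamp); // newest timestamp first
            }
            return Boolean.compare(b.isDelete, a.isDelete);    // delete marker first
        });
        Set<Long> deleted = new HashSet<>();
        for (Cell c : sorted) {
            if (c.isDelete) {
                deleted.add(c.timestamp);
            } else if (!deleted.contains(c.timestamp)) {
                return c.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Delete at timestamp 5 is written first, then a put at the same timestamp.
        List<Cell> deleteThenPut = Arrays.asList(Cell.delete(5L), Cell.put(5L, "v"));
        System.out.println(earlyOutGet(deleteThenPut));  // prints "v"
        System.out.println(scanStyleGet(deleteThenPut)); // prints "null"

        // Cells written out of timestamp order.
        List<Cell> outOfOrder = Arrays.asList(Cell.put(10L, "new"), Cell.put(3L, "old"));
        System.out.println(earlyOutGet(outOfOrder));  // prints "old"
        System.out.println(scanStyleGet(outOfOrder)); // prints "new"
    }
}
```

In this model the scan-style read depends only on timestamps and cell types, not on the order in which the writes arrived, which is the property the release note describes.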

      Description

      HBASE-2037 introduced a new MemStoreScanner which triggers a ConcurrentSkipListMap.buildFromSorted clone of the memstore and snapshot when starting a scan.

      After upgrading to 0.20.3, we noticed a big slowdown in our use of short scans. Some of our data represents a time series. The data is stored in time series order, MR jobs often insert or update new data at the end of the series, and queries usually have to pick up some or all of the series. These are often scans of 0-100 rows at a time. To load one page, we observe about 20 such scans being triggered concurrently, and they take 2 seconds to complete. A thread dump of a region server shows many threads in ConcurrentSkipListMap.buildFromSorted, which traverses the entire map of key values to copy it.
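To make the cost described above concrete, here is a minimal Java sketch, not taken from any HBase patch. `ScanCopySketch`, `scanWithCopy`, and `scanWithView` are hypothetical names, and a plain `ConcurrentSkipListMap<String, String>` stands in for the memstore. The copy-based variant clones the whole map before reading a handful of rows, so its cost grows with the size of the map rather than the size of the scan. The view-based variant reads through `tailMap` without copying.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class ScanCopySketch {

    /** Copy-based short scan: clones every entry first, then reads at most `limit` rows. */
    public static List<String> scanWithCopy(ConcurrentSkipListMap<String, String> memstore,
                                            String startRow, int limit) {
        ConcurrentSkipListMap<String, String> snapshot = memstore.clone(); // O(n) in map size
        return firstKeys(snapshot.tailMap(startRow, true), limit);
    }

    /** View-based short scan: no copy, work is proportional to the rows actually read. */
    public static List<String> scanWithView(ConcurrentSkipListMap<String, String> memstore,
                                            String startRow, int limit) {
        return firstKeys(memstore.tailMap(startRow, true), limit);
    }

    private static List<String> firstKeys(ConcurrentNavigableMap<String, String> view, int limit) {
        List<String> rows = new ArrayList<>();
        for (String key : view.keySet()) {
            if (rows.size() >= limit) {
                break;
            }
            rows.add(key);
        }
        return rows;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<String, String> memstore = new ConcurrentSkipListMap<>();
        for (int i = 0; i < 100_000; i++) {
            memstore.put(String.format("row%06d", i), "value" + i);
        }
        // Both calls return the same five rows; only the first copies 100,000 entries to do so.
        System.out.println(scanWithCopy(memstore, "row099990", 5));
        System.out.println(scanWithView(memstore, "row099990", 5));
    }
}
```

Note that the view alone is only weakly consistent: a concurrent writer may be partly visible to a reader iterating it. That gap is why this issue asks for a new non-copy mechanism to assure atomic reads rather than simply dropping the clone.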

      1. threads.txt
        12 kB
        Dave Latham
      2. hbase-2248.gc
        67 kB
        Dave Latham
      3. Screen shot 2010-02-23 at 10.33.38 AM.png
        70 kB
        stack
      4. HBASE-2248-demonstrate-previous-impl-bugs.patch
        14 kB
        stack
      5. HBASE-2248.patch
        11 kB
        stack
      6. readownwrites-lost.patch
        3 kB
        Todd Lipcon
      7. readownwrites-lost.2.patch
        3 kB
        Todd Lipcon
      8. hbase-2248.txt
        5 kB
        Todd Lipcon
      9. HBASE-2248-GetsAsScans3.patch
        190 kB
        stack
      10. ASF.LICENSE.NOT.GRANTED--profile.png
        183 kB
        Andrew Purtell
      11. ASF.LICENSE.NOT.GRANTED--put_call_graph.png
        127 kB
        Andrew Purtell
      12. ASF.LICENSE.NOT.GRANTED--HBASE-2248-rr-pre-durability4.txt
        126 kB
        ryan rawson
      13. ASF.LICENSE.NOT.GRANTED--HBASE-2248-rr-final1.txt
        106 kB
        ryan rawson
      14. ASF.LICENSE.NOT.GRANTED--HBASE-2248-no-row-locks.txt
        4 kB
        ryan rawson

        Issue Links

          Activity

          Dave Latham created issue -
          Dave Latham made changes -
          Field Original Value New Value
          Attachment threads.txt [ 12436646 ]
          Dan Washusen made changes -
          Link This issue relates to HBASE-2249 [ HBASE-2249 ]
          Dave Latham made changes -
          Attachment hbase-2248.gc [ 12436730 ]
          stack made changes -
          stack made changes -
          Comment [ @Yoram OK. Maybe post patch here if thats possible so others can see old implementation was broke. Good stuff. ]
          stack made changes -
          stack made changes -
          Attachment HBASE-2248.patch [ 12437174 ]
          ryan rawson made changes -
          Summary "New MemStoreScanner copies memstore for each scan, makes short scans slow" -> "Provide new non-copy mechanism to assure atomic reads in get and scan"
          ryan rawson made changes -
          Attachment HBASE-2248-ryan.patch [ 12437878 ]
          Todd Lipcon made changes -
          Attachment readownwrites-lost.patch [ 12437963 ]
          Todd Lipcon made changes -
          Attachment readownwrites-lost.2.patch [ 12437964 ]
          Todd Lipcon made changes -
          Attachment hbase-2248.txt [ 12437967 ]
          Todd Lipcon made changes -
          Link This issue is related to HBASE-2294 [ HBASE-2294 ]
          stack made changes -
          Attachment HBASE-2248-GetsAsScans3.patch [ 12438577 ]
          Todd Lipcon made changes -
          Priority Major [ 3 ] -> Blocker [ 1 ]
          Todd Lipcon made changes -
          Link This issue relates to HBASE-2322 [ HBASE-2322 ]
          ryan rawson made changes -
          Attachment HBASE-2248-rr-alpha1.txt [ 12438861 ]
          Todd Lipcon made changes -
          Link This issue relates to HBASE-2265 [ HBASE-2265 ]
          ryan rawson made changes -
          Attachment HBASE-2248-rr-alpha2.txt [ 12440819 ]
          ryan rawson made changes -
          Attachment HBASE-2248-rr-alpha3.txt [ 12440822 ]
          ryan rawson made changes -
          Attachment HBASE-2248-rr-pre-durability.txt [ 12441217 ]
          ryan rawson made changes -
          Attachment HBASE-2248-fix-delete-test.txt [ 12441221 ]
          ryan rawson made changes -
          Attachment HBASE-2248-rr-pre-durability2.txt [ 12441240 ]
          ryan rawson made changes -
          Attachment HBASE-2248-rr-alpha1.txt [ 12438861 ]
          ryan rawson made changes -
          Attachment HBASE-2248-rr-alpha2.txt [ 12440819 ]
          ryan rawson made changes -
          Attachment HBASE-2248-fix-delete-test.txt [ 12441221 ]
          ryan rawson made changes -
          Attachment HBASE-2248-ryan.patch [ 12437878 ]
          ryan rawson made changes -
          Attachment HBASE-2248-rr-pre-durability.txt [ 12441217 ]
          ryan rawson made changes -
          Attachment HBASE-2248-rr-pre-durability3.txt [ 12441278 ]
          Andrew Purtell made changes -
          Attachment profile.png [ 12441500 ]
          Attachment put_call_graph.png [ 12441501 ]
          stack made changes -
          Assignee ryan rawson [ ryanobjc ]
          ryan rawson made changes -
          Attachment HBASE-2248-rr-pre-durability4.txt [ 12441688 ]
          stack made changes -
          Status Open [ 1 ] -> Resolved [ 5 ]
          Hadoop Flags [Incompatible change, Reviewed]
          Release Note This patch changes the Get code path to be a Scan of one row. This means that inserting cells out of timestamp order should now work (tests to verify will follow as part of HBASE-2294), but also that a delete at an explicit timestamp now overshadows EVEN if the affected cell is put after the delete. (The old Get code path did an early-out, so a subsequent put would not see the delete.)
          Resolution Fixed [ 1 ]
          ryan rawson made changes -
          Attachment HBASE-2248-rr-final1.txt [ 12441785 ]
          ryan rawson made changes -
          Attachment HBASE-2248-rr-alpha3.txt [ 12440822 ]
          ryan rawson made changes -
          Attachment HBASE-2248-rr-pre-durability2.txt [ 12441240 ]
          ryan rawson made changes -
          Attachment HBASE-2248-rr-pre-durability3.txt [ 12441278 ]
          ryan rawson made changes -
          Attachment HBASE-2248-no-row-locks.txt [ 12441881 ]
          Benoit Sigoure made changes -
          Link This issue is related to HBASE-2959 [ HBASE-2959 ]
          stack made changes -
          Status Resolved [ 5 ] -> Closed [ 6 ]

            People

            • Assignee:
              ryan rawson
              Reporter:
              Dave Latham
            • Votes:
              0
              Watchers:
              7
