HBase / HBASE-792

Rewrite getClosestAtOrJustBefore; doesn't scale as currently written

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      As currently written, as a table gets bigger, the number of rows that .META. must keep track of grows with it.

      As written, our getClosestAtOrJustBefore goes through every store file and, in each, picks up any row that could be a candidate for closest-before. It doesn't fetch just the single closest row from the store file, but all keys at or before the target. It can't be more selective because, at the store-file level, there is no way to tell which of the candidates will survive deletes that are sitting in later store files or up in memcache.

      So, if a store file has keys 0-10 and we ask for the row that is closest to or just before 7, it returns rows 0-7, and so on per store file.

      The accumulated candidate set can get big, and weeding out the key wanted can be slow.
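The per-store-file behavior described above can be sketched in miniature. This is a hypothetical model (a plain NavigableMap standing in for a store file, not the actual HBase code) showing how every key at or before the target is kept as a candidate:

```java
import java.util.*;

// Simplified model of the pre-patch behavior: each store file contributes
// EVERY key at or before the target, not just the closest one, because a
// delete in a newer file or in memcache could remove any single candidate.
public class ClosestBefore {
    static SortedSet<Integer> candidates(NavigableMap<Integer, String> storeFile, int target) {
        // headMap(target, true) = all keys <= target; all of them are kept
        return new TreeSet<>(storeFile.headMap(target, true).keySet());
    }

    public static void main(String[] args) {
        NavigableMap<Integer, String> file = new TreeMap<>();
        for (int k = 0; k <= 10; k++) file.put(k, "row-" + k);
        // Asking for closest-at-or-before 7 returns rows 0..7 from this file
        System.out.println(candidates(file, 7)); // [0, 1, 2, 3, 4, 5, 6, 7]
    }
}
```

With many store files, these candidate sets add up, which is the scaling problem the issue describes.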

        Activity

        stack added a comment -

        Thinking on it, my fix for HBASE-751 introduced this issue. With that fix in place, every request for closestAtOrBefore could end up loading everything out on the filesystem, all of the flushes.

        We still need to rewrite this stuff; the number of seeks done per closestAtOrBefore can be astronomical, but this patch takes off some of the heat.

        This patch narrows the number of possible candidates that come back.

        It goes first to the memcache to find candidate rows.

        While there, it puts any deletes found between the ultimate candidate and the desired row into a new delete Set. This delete set is then carried down through the walk of the store files. We add new deletes as we encounter them so that candidates in older store files don't shine through if they've been deleted more recently.
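The walk the comment describes, memcache first, then store files newest to oldest, carrying the delete set down, can be sketched roughly as follows. This is an illustrative toy (Boolean `true` marks a delete entry; it is not the shipped HBase code):

```java
import java.util.*;

// Sketch of the patched candidate walk: layers are visited newest-first
// (memcache, then store files). Deletes seen in newer layers are carried
// down so candidates in older layers don't "shine through" once deleted.
public class CandidateWalk {
    // Each layer maps key -> isDelete (true = delete marker, false = put).
    static Integer closestBefore(List<NavigableMap<Integer, Boolean>> newestFirst, int target) {
        Set<Integer> deletes = new HashSet<>();
        Integer best = null;
        for (NavigableMap<Integer, Boolean> layer : newestFirst) {
            for (Map.Entry<Integer, Boolean> e : layer.headMap(target, true).entrySet()) {
                if (e.getValue()) {
                    deletes.add(e.getKey());           // carry delete to older layers
                } else if (!deletes.contains(e.getKey())) {
                    if (best == null || e.getKey() > best) best = e.getKey();
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        NavigableMap<Integer, Boolean> memcache = new TreeMap<>();
        memcache.put(7, true);                          // row 7 deleted in memcache
        NavigableMap<Integer, Boolean> storeFile = new TreeMap<>();
        for (int k = 0; k <= 7; k++) storeFile.put(k, false);
        // The delete of 7 suppresses the store file's candidate 7, so 6 wins
        System.out.println(closestBefore(List.of(memcache, storeFile), 7)); // 6
    }
}
```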

        Hide
        Jim Kellerman added a comment -

        Patch looks good +1

        Hide
        stack added a comment -

        Committed 792.patch. Leaving issue open. Close after we do rewrite.

        Hide
        stack added a comment -

        This can actually be responsible for slowing down the whole cluster (J-D saw it in an 0.18 hbase up on his openspaces cluster).

        Hide
        stack added a comment -

        Moving it out. The KeyValue changes plus caching may put off the need for this for a while.

        Hide
        stack added a comment -

        I think this is being done in 0.20.0 as part of the major refactor. Bringing it in.

        Hide
        stack added a comment -

        Moving out of 0.20.0. Not going to happen unless it's already done as part of hbase-1304 (haven't heard).

        Hide
        stack added a comment -

        HBASE-1761 rewrote getclosestatorbefore. The code is much cleaner and more focused on the target key. We can do this now because of the axiom that deletes only apply to the file that follows. It no longer carries around bulky Maps of candidates nor of deletes (now we have new-style deletes), so it should be more performant.

        The one thing left to do is an early-out if we get an answer early in the processing, in memstore say. I tried to do this as part of hbase-1761, but it only worked if the client asked for the first row in a region. Need to make it so getclosest on a meta table leverages HRegionInfo: if the target row key falls between the start and end keys of the region, the answer is the one we want, so exit.
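The proposed early-out check amounts to a half-open range test on the region's keys. A minimal sketch of that containment test, using a hypothetical helper rather than the actual HRegionInfo API:

```java
import java.util.Arrays;

// Sketch of the proposed .META. early-out (hypothetical helper, not the
// shipped HBase code): if the target row falls within [startKey, endKey)
// of a candidate region, that region's row is the answer and the walk of
// further store files can stop.
public class MetaEarlyOut {
    static boolean regionContains(byte[] startKey, byte[] endKey, byte[] target) {
        boolean afterStart = Arrays.compare(target, startKey) >= 0;
        boolean beforeEnd = endKey.length == 0              // empty end key = last region
                || Arrays.compare(target, endKey) < 0;
        return afterStart && beforeEnd;
    }

    public static void main(String[] args) {
        byte[] start = "bbb".getBytes(), end = "ddd".getBytes();
        System.out.println(regionContains(start, end, "ccc".getBytes())); // true
        System.out.println(regionContains(start, end, "eee".getBytes())); // false
    }
}
```

The end key is exclusive because it is also the next region's start key; an empty end key conventionally marks the table's last region.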

        Hide
        stack added a comment -

        Resolving as done for now.


          People

          • Assignee: stack
          • Reporter: stack
          • Votes: 0
          • Watchers: 0