Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.90.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      To continue scaling numbers of columns or versions in a single row, we need a mechanism to scan within a row so we can return some columns at a time. Currently, an entire row must come back as one piece.
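      As a rough illustration of the mechanism this issue asks for, here is a minimal client-side sketch assuming the Scan.setBatch(int) API that exposes intra-row scanning in 0.90; the table name and batch size are hypothetical:

          import org.apache.hadoop.hbase.HBaseConfiguration;
          import org.apache.hadoop.hbase.client.HTable;
          import org.apache.hadoop.hbase.client.Result;
          import org.apache.hadoop.hbase.client.ResultScanner;
          import org.apache.hadoop.hbase.client.Scan;
          import org.apache.hadoop.hbase.util.Bytes;

          public class IntraRowScanExample {
            public static void main(String[] args) throws Exception {
              // "mytable" is a hypothetical table name
              HTable table = new HTable(HBaseConfiguration.create(), "mytable");
              Scan scan = new Scan();
              scan.setBatch(100); // return at most 100 cells per call to next()
              ResultScanner scanner = table.getScanner(scan);
              try {
                for (Result partial : scanner) {
                  // A wide row now comes back as several Results of up to
                  // 100 KeyValues each; a Result never spans rows.
                  System.out.println(partial.size() + " cells from row "
                      + Bytes.toString(partial.getRow()));
                }
              } finally {
                scanner.close();
                table.close();
              }
            }
          }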

      Attachments

    1. HBASE-1537-v1.patch
        17 kB
        Andrew Purtell
      2. HBASE-1537-v2.patch
        18 kB
        Andrew Purtell
      3. HBASE-1537-v2-0.20.3.patch
        16 kB
        Andrey Stepachev
      4. HBASE-1537-2.patch
        6 kB
        Andrew Purtell


          Activity

          Jean-Daniel Cryans added a comment -

          Taking it out of 0.20.6, although the last fix did go into it. Confusing.

          HBase Review Board added a comment -

          Message from: "Andrew Purtell" <apurtell@apache.org>

          On 2010-07-22 23:11:32, stack wrote:

          > +1 on commit. Suggestion on how to make some minor savings. Go ahead and commit with it if it makes sense to you (or do w/o the suggestion).

          Committed to branch and trunk, taking into account suggestions for savings in both cases. Also added a conditional to ensure matcher.row is not null before testing it.

          • Andrew

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.hbase.org/r/376/#review463
          -----------------------------------------------------------

          HBase Review Board added a comment -

          Message from: stack@duboce.net

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.hbase.org/r/376/#review463
          -----------------------------------------------------------

          Ship it!

          +1 on commit. Suggestion on how to make some minor savings. Go ahead and commit with it if it makes sense to you (or do w/o the suggestion).

          src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
          <http://review.hbase.org/r/376/#comment1902>

          Consider using method Ryan added today:

          public boolean matchingRows(final KeyValue left, final byte [] right) {

          Then you could do if (!KeyValue.matchingRows(peeked, matcher.row))

          .. and only do a peeked.getRow when you know you've changed rows ..

          Might save a bit of byte array making.

          • stack
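          A sketch of what this suggestion amounts to inside StoreScanner.next(); matcher, heap and the matchingRows helper follow the review thread above, so treat this as illustrative rather than the committed code:

              // Sketch only: field and helper names follow the review thread.
              KeyValue peeked = this.heap.peek();
              if (peeked == null) {
                close();
                return false; // scanner exhausted
              }
              // Compare row bytes in place and only pay for peeked.getRow(),
              // which copies a byte[], when the row has actually changed.
              if (matcher.row == null || !KeyValue.matchingRows(peeked, matcher.row)) {
                matcher.setRow(peeked.getRow());
              }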
          HBase Review Board added a comment -

          Message from: "Andrew Purtell" <apurtell@apache.org>

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          http://review.hbase.org/r/376/
          -----------------------------------------------------------

          Review request for hbase.

          Summary
          -------

          The query matcher gets confused by intra-row scanning. Avoid calling setRow() if the row has not changed; this requires a row comparison per next().

          This addresses bug HBASE-1537.
          http://issues.apache.org/jira/browse/HBASE-1537

          Diffs


          src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java dc2e234

          Diff: http://review.hbase.org/r/376/diff

          Testing
          -------

          Thanks,

          Andrew

          Andrew Purtell added a comment -

          The current situation, where this was dropped from 0.20 but is in trunk, is a reasonable outcome in my opinion.

          ryan rawson added a comment -

          stack just said he wanted to reapply to branch?

          Andrew Purtell added a comment -

          Right, this is only for trunk going forward.

          ryan rawson added a comment -

          The original patch was incompatible since it changed the Scan
          serialization without versioning.

          Andrew Purtell added a comment -

          I noticed this was dropped in the pivot from 0.20_pre_durability to 0.20. But as a matter of fact we have an open internal ticket on this. Unit tests and bugfixes coming. Will submit to reviewboard when ready.

          stack added a comment -

          Reapply to branch. We seem to have lost it (or I never applied it in the first place).

          Andrew Purtell added a comment -

          Apply limit across families.

          stack added a comment -

          Committed to branch. Thanks for backporting, Andrey Stepachev. SCAN_DESCRIPTOR_VERSION being missing should be fine on branch.

          Andrey Stepachev added a comment -

          Fixed version for 0.20.x.

          Only one difference: no SCAN_DESCRIPTOR_VERSION constant serialisation in Scan.

          Yi Liang added a comment -

          This feature looks very useful for my ongoing project, but the attached patches can't be applied to my HBase 0.20.3. There are some unmatched hunks in Scan.java and HRegion.java, and especially large differences in the nextInternal method of HRegion.java. As a common user of HBase, I'm not sure how to adjust the related code. Can anybody provide a patch for 0.20.3? Thanks very much!

          Jean-Daniel Cryans added a comment -

          Ok sorry false alarm.

          Jean-Daniel Cryans added a comment -

          Is it just me, or does this patch not compile because of org.apache.hadoop.hbase.regionserver.transactional.TransactionState?

          Andrew Purtell added a comment -

          Thanks for the review. Committed. Reopen if insufficient.

          Jonathan Gray added a comment -

          Patch looks great, Andrew. I'm probably going to apply this into a 0.20 branch dev cluster soon and play around with some gigantic rows.

          Andrew Purtell added a comment -

          Also versions Scan.

          If Scan had been versioned already, this could have been backported to 0.20 branch.
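          For context, a minimal sketch of the version-prefix pattern being discussed; SCAN_DESCRIPTOR_VERSION is the constant named in this thread, while the surrounding fields are hypothetical:

              import java.io.DataInput;
              import java.io.DataOutput;
              import java.io.IOException;

              // Illustrative Writable-style versioning, not the actual Scan code.
              public class VersionedScan {
                private static final byte SCAN_DESCRIPTOR_VERSION = 1;
                private byte[] startRow = new byte[0];
                private int batch = -1; // new field guarded by the version byte

                public void write(DataOutput out) throws IOException {
                  out.writeByte(SCAN_DESCRIPTOR_VERSION); // always written first
                  out.writeInt(startRow.length);
                  out.write(startRow);
                  out.writeInt(batch);
                }

                public void readFields(DataInput in) throws IOException {
                  byte version = in.readByte();
                  startRow = new byte[in.readInt()];
                  in.readFully(startRow);
                  // Old peers never wrote 'batch'; the version check lets new
                  // fields appear without breaking wire compatibility.
                  if (version >= 1) {
                    batch = in.readInt();
                  }
                }
              }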

          Andrew Purtell added a comment -

          Better test.

          stack added a comment -

          Patch looks good. Only thing is that maybe the test should check that a Result does not contain KeyValues that span rows? (We're not supposed to cross rows inside a call to next, right?)
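          A sketch of the check being suggested, as a JUnit-style helper; the names are illustrative and 'result' would come from a scanner in an actual test:

              import static org.junit.Assert.assertTrue;
              import org.apache.hadoop.hbase.KeyValue;
              import org.apache.hadoop.hbase.client.Result;
              import org.apache.hadoop.hbase.util.Bytes;

              // Assert every KeyValue inside one Result shares the same row.
              static void assertSingleRow(Result result) {
                byte[] row = result.getRow();
                for (KeyValue kv : result.raw()) {
                  assertTrue("Result must not span rows",
                      Bytes.equals(row, kv.getRow()));
                }
              }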

          Andrew Purtell added a comment -

          With simple testcase. Seems to work.

          Andrew Purtell added a comment -

          Naive implementation of per-next() limits via a new InternalScanner method. Aiming for a minimalist approach. Needs testing. Find where/if it doesn't work as expected.
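          The shape of that new method, sketched from the description here; the javadoc wording is mine, not the project's:

              import java.io.IOException;
              import java.util.List;
              import org.apache.hadoop.hbase.KeyValue;

              // Sketch of an InternalScanner with a per-next() limit overload.
              public interface InternalScanner {
                /** Grab all values for the next row. */
                boolean next(List<KeyValue> results) throws IOException;

                /**
                 * Grab up to 'limit' values; the caller may need several calls
                 * to consume one wide row, and a call never crosses rows.
                 * @return true if more values remain to be scanned
                 */
                boolean next(List<KeyValue> results, int limit) throws IOException;

                void close() throws IOException;
              }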

          Andrew Purtell added a comment -

          Patch soon with first cut. Will add a bit of state to the scanner and change next() semantics on the client such that more than one call to next() may be needed to retrieve the full row. Next() will not span rows. The new next() behavior will be configurable, off by default, toggled by a config var or Scan parameter.

          Jonathan Gray added a comment -

          The optimization you mention is a worthwhile one we might look into, in order to reduce RPC payload.

          Jonathan Gray added a comment -

          Result is actually just KeyValue[]... Each KeyValue holds ALL the data, row, family, qualifier, timestamp, value, type. So we already have all the information we need to put things back in any way we want.

          The byte [] row inside Result is actually computed when you ask for it, and it just grabs the row from the first KV, so from a data structure perspective it's already capable of holding multiple rows.
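          To illustrate the point, a small sketch that walks a Result cell by cell; every coordinate below is read from the KeyValue itself, nothing row-level is required of the Result ('result' is assumed to come from a scanner):

              import org.apache.hadoop.hbase.KeyValue;
              import org.apache.hadoop.hbase.client.Result;
              import org.apache.hadoop.hbase.util.Bytes;

              // Each KeyValue carries its own row/family/qualifier/ts/value.
              static void dumpCells(Result result) {
                for (KeyValue kv : result.raw()) {
                  System.out.println(Bytes.toString(kv.getRow()) + "/"
                      + Bytes.toString(kv.getFamily()) + ":"
                      + Bytes.toString(kv.getQualifier()) + "/"
                      + kv.getTimestamp() + " = "
                      + Bytes.toString(kv.getValue()));
                }
              }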

          Good stuff, Andrew.

          Andrew Purtell added a comment -

          What we did for Stargate scanners is make them iterators over cells and then allow scanners to specify the number of cells they'd like to have come back in one batch. The internal mechanics are more complicated for region servers to do this, but I think similar semantics would be good. How to handle crossing row boundaries presents a couple of options:

          • Include row key as well as column and timestamp with each cell value. This is not as expensive as it might sound if a simple string table encoding is used with a marker or two meaning "use last given row key" and "use last given column". Either Thrift or pbufs can handle this by marking row and column keys as optional (see the sketch after this list).
          • Make Result capable of holding more than one row.
          • Return early to the client at row boundary and make it do scanner.next() to start up again on the next row.
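          A toy sketch of the first option, the "use last given row key" marker; all names are hypothetical, and a real Thrift or protobuf schema would express the same thing with optional fields:

              import java.util.ArrayList;
              import java.util.Arrays;
              import java.util.List;

              // Toy row-key encoder: emit the row only when it changes.
              class CellBatchEncoder {
                static final byte[] SAME_ROW = new byte[0]; // marker: reuse last row

                static List<byte[]> encodeRows(List<byte[]> cellRows) {
                  List<byte[]> out = new ArrayList<byte[]>();
                  byte[] last = null;
                  for (byte[] row : cellRows) {
                    // A repeated row key compresses to a zero-length marker.
                    out.add(Arrays.equals(row, last) ? SAME_ROW : row);
                    last = row;
                  }
                  return out;
                }
              }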

            People

             • Assignee:
               Andrew Purtell
             • Reporter:
               Jonathan Gray
             • Votes:
               1
             • Watchers:
               6
