HBase / HBASE-5229

Provide basic building blocks for "multi-row" local transactions.

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.94.0
    • Component/s: Client, regionserver
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      In the final iteration, this issue provides a generalized, public mutateRowsWithLocks method on HRegion that coprocessors can use to implement atomic operations efficiently.
      Coprocessors are already region aware, which makes this a good pairing of APIs. This feature is by design not available to the client via the HTable API.
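
To make the idea concrete, here is a minimal sketch of an all-or-nothing multi-row write inside a single region. It is a toy in-memory model, not the HBase implementation: ToyRegion, ToyPut, and the method signature shown are assumptions made for illustration, and only the method name is taken from the description above. Acquiring the row locks in sorted order is a common way to avoid deadlock; whether HRegion does exactly this is not stated here.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Toy in-memory stand-in for a region; NOT the HBase HRegion class.
class ToyRegion {
    // Hypothetical single-cell write: a row key and the value to store.
    static final class ToyPut {
        final String row;
        final String value;
        ToyPut(String row, String value) { this.row = row; this.value = value; }
    }

    private final Map<String, String> store = new TreeMap<>();
    private final Map<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();
    private final String startKey; // inclusive
    private final String endKey;   // exclusive

    ToyRegion(String startKey, String endKey) {
        this.startKey = startKey;
        this.endKey = endKey;
    }

    private boolean contains(String row) {
        return row.compareTo(startKey) >= 0 && row.compareTo(endKey) < 0;
    }

    // Applies all puts or none. Every touched row must be in this region
    // and must be listed in rowsToLock.
    void mutateRowsWithLocks(List<ToyPut> puts, Collection<String> rowsToLock) {
        // Sorting gives every caller the same lock order, which avoids deadlock.
        SortedSet<String> ordered = new TreeSet<>(rowsToLock);
        // Validate everything before writing anything.
        for (ToyPut p : puts) {
            if (!contains(p.row)) {
                throw new IllegalArgumentException("row outside region: " + p.row);
            }
            if (!ordered.contains(p.row)) {
                throw new IllegalArgumentException("row not locked: " + p.row);
            }
        }
        List<ReentrantLock> held = new ArrayList<>();
        try {
            for (String row : ordered) {
                ReentrantLock lock = rowLocks.computeIfAbsent(row, r -> new ReentrantLock());
                lock.lock();
                held.add(lock);
            }
            // Readers synchronize on the same monitor, so they see all puts or none.
            synchronized (store) {
                for (ToyPut p : puts) {
                    store.put(p.row, p.value);
                }
            }
        } finally {
            // Release in reverse acquisition order.
            for (int i = held.size() - 1; i >= 0; i--) {
                held.get(i).unlock();
            }
        }
    }

    String get(String row) {
        synchronized (store) {
            return store.get(row);
        }
    }
}
```

The region-bounds check mirrors the "local" restriction: a batch that touches a row outside the region is rejected as a whole, so no partial write is ever visible.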

      It took a long time to arrive at this and I apologize for the public exposure of my (erratic in retrospect) thought processes.

      Was:
      HBase should provide basic building blocks for multi-row local transactions. Local means that we do this by co-locating the data. Global (cross region) transactions are not discussed here.

      After a bit of discussion two solutions have emerged:
      1. Keep the row-key for determining grouping and location and allow efficient intra-row scanning. A client application would then model tables as HBase-rows.
      2. Define a prefix-length in HTableDescriptor that defines a grouping of rows. Regions will then never be split inside a grouping prefix.

      #1 is true to the current storage paradigm of HBase.
      #2 is true to the current client side API.
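
Solution 2 can be illustrated with a small sketch of prefix-aligned splitting. This is an assumption-laden illustration, not code from any of the attached patches: PrefixSplit and its methods are hypothetical names, and the prefix length is passed in directly rather than read from HTableDescriptor. The idea is to truncate a candidate split key to the grouping prefix, so that every row sharing that prefix sorts at or after the split key and lands in the same region.

```java
import java.util.Arrays;

// Hypothetical helper; not part of the HBase API.
class PrefixSplit {
    // The grouping prefix of a row key: its first prefixLength bytes,
    // or the whole key when the key is shorter than that.
    static byte[] groupPrefix(byte[] rowKey, int prefixLength) {
        return Arrays.copyOf(rowKey, Math.min(prefixLength, rowKey.length));
    }

    // Moves a candidate split key down to the first possible key of its group,
    // so the split never falls between two rows of the same group.
    static byte[] alignSplitPoint(byte[] candidate, int prefixLength) {
        return groupPrefix(candidate, prefixLength);
    }

    // Unsigned lexicographic byte comparison, the usual ordering for row keys.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // True when the row falls in the region that starts at splitKey.
    static boolean inUpperRegion(byte[] row, byte[] splitKey) {
        return compare(row, splitKey) >= 0;
    }
}
```

With a prefix length of 8, a candidate split key such as user0042:order7 would be aligned to user0042, which keeps all user0042 rows together in the upper region and all user0041 rows in the lower one.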

      I will explore these two with sample patches here.

      --------------------
      Was:
      As discussed (at length) on the dev mailing list, with HBASE-3584 and HBASE-5203 committed, supporting atomic cross-row transactions within a region becomes simple.
      I am aware of the hesitation about the usefulness of this feature, but we have to start somewhere.

      Let's use this jira for discussion; I'll attach a patch (with tests) momentarily to make this concrete.

      Attachments

      1. 5229.txt (27 kB, Lars Hofhansl)
      2. 5229-endpoint.txt (20 kB, Lars Hofhansl)
      3. 5229-final.txt (21 kB, Lars Hofhansl)
      4. 5229-multiRow.txt (29 kB, Lars Hofhansl)
      5. 5229-multiRow-v2.txt (25 kB, Lars Hofhansl)
      6. 5229-seekto.txt (7 kB, Lars Hofhansl)
      7. 5229-seekto-v2.txt (8 kB, Lars Hofhansl)

          Activity

          Lars Hofhansl made changes -
          Status: Resolved [ 5 ] -> Closed [ 6 ]
          Scott Chen made changes -
          Link This issue relates to HBASE-5542 [ HBASE-5542 ]
          Lars Hofhansl made changes -
          Attachment 5229-final.txt [ 12513841 ]
          Lars Hofhansl made changes -
          Status: Open [ 1 ] -> Resolved [ 5 ]
          Hadoop Flags: (none) -> Reviewed [ 10343 ]
          Resolution: (none) -> Fixed [ 1 ]
          Lars Hofhansl made changes -
          Status: Patch Available [ 10002 ] -> Open [ 1 ]
          Lars Hofhansl made changes -
          Summary: Explore building blocks for "multi-row" local transactions. -> Provide basic building blocks for "multi-row" local transactions.
          Description: replaced with the current text shown under Description above, which keeps the earlier versions after each "Was:" marker.
          Lars Hofhansl made changes -
          Attachment 5229-endpoint.txt [ 12513647 ]
          Lars Hofhansl made changes -
          Status: Reopened [ 4 ] -> Patch Available [ 10002 ]
          Lars Hofhansl made changes -
          Attachment 5229-multiRow-v2.txt [ 12513190 ]
          Lars Hofhansl made changes -
          Attachment 5229-multiRow.txt [ 12513089 ]
          Lars Hofhansl made changes -
          Resolution: Not A Problem [ 8 ] -> (none)
          Status: Resolved [ 5 ] -> Reopened [ 4 ]
          Lars Hofhansl made changes -
          Link This issue relates to HBASE-5257 [ HBASE-5257 ]
          Lars Hofhansl made changes -
          Link This issue is related to HBASE-5104 [ HBASE-5104 ]
          Lars Hofhansl made changes -
          Link This issue is related to HBASE-4256 [ HBASE-4256 ]
          Lars Hofhansl made changes -
          Status: Open [ 1 ] -> Resolved [ 5 ]
          Resolution: (none) -> Not A Problem [ 8 ]
          Lars Hofhansl made changes -
          Summary: Support atomic region operations -> Explore building blocks for "multi-row" local transactions.
          Description: replaced the original text (the last "Was:" block in the Description above) with the two-solution proposal, keeping the original text after a "Was:" marker.
          Lars Hofhansl made changes -
          Attachment 5229-seekto-v2.txt [ 12511449 ]
          Lars Hofhansl made changes -
          Attachment 5229-seekto.txt [ 12511366 ]
          Lars Hofhansl made changes -
          Attachment 5229-scanner-seekto.txt [ 12511365 ]
          Lars Hofhansl made changes -
          Attachment 5229-scanner-seekto.txt [ 12511365 ]
          Lars Hofhansl made changes -
          Attachment 5229-scanner-seekto.txt [ 12511361 ]
          Lars Hofhansl made changes -
          Attachment 5229-scanner-seekto.txt [ 12511361 ]
          Lars Hofhansl made changes -
          Field Original Value New Value
          Attachment 5229.txt [ 12511073 ]
          Lars Hofhansl created issue -

            People

            • Assignee: Lars Hofhansl
            • Reporter: Lars Hofhansl
            • Votes: 0
            • Watchers: 16
