HBase / HBASE-707

High-load import of data into single table/family never triggers split

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.1.3
    • Fix Version/s: 0.1.3
    • Component/s: None
    • Labels:
      None
    • Environment:

      Linux 2.6.25-14.fc9.x86_64, Fedora Core 9

      Description

      Importing a heavy amount of data into a single table and family.

      One column in that family (the same fam:col for every row) frequently contains a large amount of UTF-8 data. This column keeps growing but never causes a region split.

      Currently there is a single mapfile containing nearly 10GB.

      Eventually this caused regions to crash with an OOME, as described in HBASE-706.

      Table in question:

      hql > describe items;
      -----------------------------------------------------------------------------
      Column Family Descriptor
      -----------------------------------------------------------------------------
      name: cfrecs, max versions: 2, compression: NONE, in memory: false, max length: 2147483647, bloom filter: none
      -----------------------------------------------------------------------------
      name: clusters, max versions: 2, compression: NONE, in memory: false, max length: 2147483647, bloom filter: none
      -----------------------------------------------------------------------------
      name: content, max versions: 2, compression: NONE, in memory: false, max length: 2147483647, bloom filter: none
      -----------------------------------------------------------------------------
      name: readby, max versions: 2, compression: NONE, in memory: false, max length: 2147483647, bloom filter: none
      -----------------------------------------------------------------------------
      name: receivedby, max versions: 2, compression: NONE, in memory: false, max length: 2147483647, bloom filter: none
      -----------------------------------------------------------------------------
      name: savedby, max versions: 2, compression: NONE, in memory: false, max length: 2147483647, bloom filter: none
      -----------------------------------------------------------------------------
      name: sentby, max versions: 2, compression: NONE, in memory: false, max length: 2147483647, bloom filter: none
      -----------------------------------------------------------------------------
      7 columnfamily(s) in set. (0.34 sec)

          Activity

          Jonathan Gray added a comment -

          The lack of splitting eventually led to the OOME when attempting compaction.

          Jonathan Gray added a comment -

          Added table description

          stack added a comment -

          Have been working with John on his cluster on this issue. This patch seems to fix the issue (more testing to do).

          Splits are triggered only if the compaction run returns true. That return value was passed up from the depths of the store file code, and on the way it could be mangled when a region had multiple families: one family might compact but a subsequent one might not. In the latter case, we would not run the split check.
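
The failure mode described in this comment can be illustrated with a minimal sketch. It is not the actual patch and does not use HBase classes: `Store`, `compactLastWins`, and `compactAnyWins` are hypothetical names chosen for illustration. It assumes the caller runs the split check only when the region-level compaction call returns true, as the comment states.

```java
import java.util.List;

public class SplitCheckSketch {
    /** Hypothetical stand-in for one column family's store; not an HBase class. */
    public interface Store {
        /** Returns true if this store compacted (and so a split check is warranted). */
        boolean compact();
    }

    /**
     * Buggy shape: each store overwrites the previous result, so only the
     * last family's outcome reaches the caller.
     */
    public static boolean compactLastWins(List<Store> stores) {
        boolean result = false;
        for (Store s : stores) {
            result = s.compact();
        }
        return result;
    }

    /**
     * Corrected shape: every store is still compacted, but the result stays
     * true once any store has reported a compaction.
     */
    public static boolean compactAnyWins(List<Store> stores) {
        boolean result = false;
        for (Store s : stores) {
            if (s.compact()) {
                result = true;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // First family compacts, the second does not.
        List<Store> stores = List.<Store>of(() -> true, () -> false);
        System.out.println(compactLastWins(stores)); // prints "false": split check skipped
        System.out.println(compactAnyWins(stores));  // prints "true": split check runs
    }
}
```

In the reported table, one large family sits alongside six others, so under this assumed shape any later family that did not compact would hide the large family's result.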

          Jonathan Gray added a comment -

          Recreated entire use case scenario and the issue is gone. We are now seeing normal region splits.

          However, we have experienced a new behavior during those splits. We are writing and the client receives an IllegalStateException:

          Trying to commit: Exception in thread "main" java.lang.RuntimeException: java.lang.IllegalStateException: region offline: items,823ce1e3-d414-474f-ac70-c4081cecef0f,1214434560891
          at org.apache.hadoop.hbase.HTable.getRegionServerWithRetries(HTable.java:1062)
          at org.apache.hadoop.hbase.HTable.commit(HTable.java:763)
          at org.apache.hadoop.hbase.HTable.commit(HTable.java:744)
          at HBase.AddAttributes(HBase.java:220)
          at PoJaMigratorItems.Add(PoJaMigratorItems.java:143)
          at PoJaMigrator.AddItems(PoJaMigrator.java:123)
          at PoJaMigrator.AddAllData(PoJaMigrator.java:57)
          at PoJaMigrator.<init>(PoJaMigrator.java:27)
          at PoJaMigrator.main(PoJaMigrator.java:35)
          Caused by: java.lang.IllegalStateException: region offline: items,823ce1e3-d414-474f-ac70-c4081cecef0f,1214434560891
          at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:438)
          at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:350)
          at org.apache.hadoop.hbase.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:318)
          at org.apache.hadoop.hbase.HTable.getRegionLocation(HTable.java:114)
          at org.apache.hadoop.hbase.HTable$ServerCallable.instantiateServer(HTable.java:1021)
          at org.apache.hadoop.hbase.HTable.getRegionServerWithRetries(HTable.java:1036)
          ... 8 more

          stack added a comment -

          Thanks for confirming the patch, Jon. The ISE is because your clocks are way skewed. Will fix that over in HBASE-710. I'll commit this patch later tonight.

          stack added a comment -

          Committed to branch. Trunk doesn't have this issue. It has another:


            People

            • Assignee:
              stack
              Reporter:
              Jonathan Gray