HBase / HBASE-5929

HBaseAdmin.compact and flush are giving confusing errors for ROOT, META, and regions that don't exist

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.92.1
    • Fix Version/s: None
    • Component/s: Client, shell
    • Labels: None
    • Environment: Linux Ubuntu Lucid 64bit

      Description

      I have been noticing that calls to HBaseAdmin.majorCompact throw exceptions randomly for some regions. I could not find a pattern to these exceptions. The code I have simply does this: admin.majorCompact(region.getRegionNameAsString()). Here admin is an instance of HBaseAdmin and region is an instance of HRegionInfo. The exception I get is:

      org.apache.hadoop.hbase.TableNotFoundException: ROOT,,0
      at org.apache.hadoop.hbase.client.HBaseAdmin.tableNameString(HBaseAdmin.java:1473) ~[hbase-0.92.1.jar:0.92.1]
      at org.apache.hadoop.hbase.client.HBaseAdmin.compact(HBaseAdmin.java:1235) ~[hbase-0.92.1.jar:0.92.1]
      at org.apache.hadoop.hbase.client.HBaseAdmin.majorCompact(HBaseAdmin.java:1209) ~[hbase-0.92.1.jar:0.92.1]
      at com.stumbleupon.hbaseadmin.HBaseCompact.compactAllServers(Unknown Source) [hbase_compact.jar:na]

      In this case it's the root region, but I get similar exceptions for other tables, like this.

      2012-05-03 19:03:42,994 WARN [main] HBaseCompact: Could not compact:
      org.apache.hadoop.hbase.TableNotFoundException: ad_daily,49842:2009-07-10,1269763588508.1997607018
      at org.apache.hadoop.hbase.client.HBaseAdmin.tableNameString(HBaseAdmin.java:1473) ~[hbase-0.92.1.jar:0.92.1]
      at org.apache.hadoop.hbase.client.HBaseAdmin.compact(HBaseAdmin.java:1235) ~[hbase-0.92.1.jar:0.92.1]
      at org.apache.hadoop.hbase.client.HBaseAdmin.majorCompact(HBaseAdmin.java:1209) ~[hbase-0.92.1.jar:0.92.1]
      at org.apache.hadoop.hbase.client.HBaseAdmin.majorCompact(HBaseAdmin.java:1196) ~[hbase-0.92.1.jar:0.92.1]
      at com.stumbleupon.hbaseadmin.HBaseCompact.compactAllServers(Unknown Source) [hbase_compact.jar:na]
      at com.stumbleupon.hbaseadmin.HBaseCompact.main(Unknown Source) [hbase_compact.jar:na]

      I see this on hbase shell as well. However, I don't see these exceptions if I use admin.majorCompact(region.getRegionName()), so it looks like something gets lost when I use getRegionNameAsString().

      Let me know if I can provide more information.

        Activity

        Aravind Gottipati created issue -
        stack added a comment -

        This seems uninterpretable as table name or region name 'org.apache.hadoop.hbase.TableNotFoundException: ROOT,,0'... I'd have expected it to be "ROOT,,0" if hbase was to have any chance? Is this coming in via jruby mighty Aravind? Does 'ad_daily,49842:2009-07-10,1269763588508.1997607018' exist on the cluster? (I know I should look myself....).

        Aravind Gottipati added a comment -

        Here is the output from hbase shell for a similar table:

        hbase(main):004:0> major_compact 'ad_campaign_daily_stumbles,81738:2009-02-08,1269765634190.1290583321'

        ERROR: Unknown table ad_campaign_daily_stumbles,81738:2009-02-08,1269765634190.1290583321!

        Here is some help for this command:
        Run major compaction on passed table or pass a region row
        to major compact an individual region

        hbase(main):005:0>

        I get these region names by querying the HRegionInterface of the server, and then proceed to compact them. This is all on the dev cluster (if you want to replicate/test).

        Jean-Daniel Cryans added a comment -

        The reason it doesn't work with META is because we don't write the full meta row names in ROOT. The row key should be ".META.,,1.1028785192" but if I scan ROOT I see:

        .META.,,1 column=info:regioninfo, timestamp=1336512980545, value=....
        .META.,,1 column=info:server, timestamp=1336512994161, value=h-25-183.sfo.stumble.net:50063
        .META.,,1 column=info:serverstartcode, timestamp=1336512994161, value=1336512979497
        .META.,,1 column=info:v, timestamp=1336512980545, value=\x00\x00

        It's missing the encoded name. That won't work. Requesting the compaction of a root region won't work either because the code that figures out if a region or a table is passed depends on MetaReader.getCatalogHTable which doesn't take ROOT regions (which is normal since the ROOT address isn't contained in an HTable but in ZK).

        For the other regions that had that issue, it was a bad interaction in our code where the scan of .META. is done possibly hours before the call for compaction is issued. I was able to confirm that all the regions that were affected had been recently split.
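        The stale-scan problem described above can be avoided by re-checking the region list right before acting on it. The following is only a minimal sketch under stated assumptions: StaleRegionFilterSketch, stillPresent, and the Supplier standing in for a fresh .META. lookup are all hypothetical names, not HBase API, and the second region name in main is made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical helper, not part of HBase: drops cached region names
// that no longer appear in a freshly fetched region list.
class StaleRegionFilterSketch {

    // cached: region names collected earlier (possibly hours ago).
    // currentRegions: assumed to return the region names that exist right now.
    static List<String> stillPresent(List<String> cached,
                                     Supplier<Set<String>> currentRegions) {
        Set<String> live = currentRegions.get();
        List<String> keep = new ArrayList<>();
        for (String name : cached) {
            if (live.contains(name)) {
                keep.add(name);
            }
        }
        return keep;
    }

    public static void main(String[] args) {
        // First name is taken from this report; the second is invented.
        List<String> cached = Arrays.asList(
            "ad_daily,49842:2009-07-10,1269763588508.1997607018",
            "ad_daily,60000:2010-01-01,1269763588999.123456789");
        // Pretend the first region has since been split and is gone.
        Set<String> live = new HashSet<>(Arrays.asList(
            "ad_daily,60000:2010-01-01,1269763588999.123456789"));
        for (String name : stillPresent(cached, () -> live)) {
            System.out.println("would compact: " + name);
        }
    }
}
```

        Only the names that survive the filter would then be passed to the compaction call; the ones dropped are the regions that were split or removed after the original scan.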

        The issues that remain:

        • It's impossible to compact the root region directly, but calling compact on the table itself works.
        • It's possible to compact the meta region directly but the user needs to pass ".META.,,1" instead of the full region name.
        • Trying to compact a region that doesn't exist throws a TableNotFoundException, which confuses the user.

        I'd say that this is minor but we should probably fix for usability.
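        To illustrate the usability point, here is a hedged sketch of how an error message could distinguish a missing region from a missing table. It assumes, based only on the names quoted in this thread, that a region-shaped name looks like "table,startKey,id" with an optional ".suffix" after the id. RegionNameSketch, Parts, parse, and describeMissing are hypothetical names, not HBase API, and this is not how HBaseAdmin actually resolves names.

```java
import java.util.Optional;

// Hypothetical sketch, not HBase code: splits a region-shaped name into
// its pieces so a caller can report "unknown region" instead of
// "unknown table" when the input is clearly not a plain table name.
class RegionNameSketch {

    static final class Parts {
        final String table;
        final String startKey;
        final String regionId;
        final Optional<String> suffix; // text after the id, if any

        Parts(String table, String startKey, String regionId,
              Optional<String> suffix) {
            this.table = table;
            this.startKey = startKey;
            this.regionId = regionId;
            this.suffix = suffix;
        }
    }

    // Returns empty when the name does not have the assumed
    // "table,startKey,id[.suffix]" shape (for example a plain table name).
    static Optional<Parts> parse(String name) {
        int first = name.indexOf(',');
        int last = name.lastIndexOf(',');
        if (first <= 0 || first == last) {
            return Optional.empty();
        }
        String tail = name.substring(last + 1);
        int dot = tail.indexOf('.');
        String id = dot < 0 ? tail : tail.substring(0, dot);
        if (id.isEmpty()) {
            return Optional.empty();
        }
        Optional<String> suffix = dot < 0
            ? Optional.<String>empty()
            : Optional.of(tail.substring(dot + 1));
        return Optional.of(new Parts(name.substring(0, first),
            name.substring(first + 1, last), id, suffix));
    }

    // Builds a message that names the right kind of missing object.
    static String describeMissing(String name) {
        return parse(name)
            .map(p -> "Unknown region '" + name + "' of table '" + p.table + "'")
            .orElse("Unknown table '" + name + "'");
    }

    public static void main(String[] args) {
        System.out.println(describeMissing("ad_daily"));
        System.out.println(describeMissing(
            "ad_daily,49842:2009-07-10,1269763588508.1997607018"));
        System.out.println(parse(".META.,,1").get().suffix.isPresent());
        System.out.println(parse(".META.,,1.1028785192").get().suffix.isPresent());
    }
}
```

        The last two lines of main show the difference discussed above between the short meta row key and the full name: only the second one carries a suffix after the region id.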

        Jean-Daniel Cryans added a comment -

        Fixing the title.

        Jean-Daniel Cryans made changes -
        Field: Summary
        Original Value: HBaseAdmin.majorCompact and hbase shell randomly throw exceptions when asked to majorcompact regions.
        New Value: HBaseAdmin.compact and flush are giving confusing errors for ROOT, META, and regions that don't exist
        Jean-Daniel Cryans added a comment -

        I spawned HBASE-5969 for the apparent randomness of the issue with the region that's pre 0.89.


          People

          • Assignee: Unassigned
          • Reporter: Aravind Gottipati
          • Votes: 0
          • Watchers: 2
