HBASE-15121

ConnectionImplementation#locateRegionInMeta() issue when master is restarted

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: Client
    • Labels: None

    Description

      I noticed this issue while running IntegrationTestMTTR#testRestartMaster(): the test was failing on a put operation. Here is the sequence of events from the logs leading to the failed put:
      Master restart:

      INFO  [pool-5-thread-1] util.Shell: Executing full command [/usr/bin/ssh  hnode2 "sudo -u hbase ps aux | grep proc_master | grep -v grep | tr -s ' ' | cut -d ' ' -f2 | xargs kill -s SIGKILL"]
      

      Client trying to locate the region for row=70efdf2ec9b086079795c442636b55fb-17 (this is additional logging that inspects the metaKey used to search hbase:meta):

      2016-01-15 10:26:05,169 INFO  [HBaseWriterThread_9] client.ConnectionImplementation: metaKey inspection: table=IntegrationTestMTTRLoadTestTool row= 70efdf2ec9b086079795c442636b55fb-17 metaKey= IntegrationTestMTTRLoadTestTool,70efdf2ec9b086079795c442636b55fb-17,99999999999999
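
      The metaKey in the log line above has the form table,row,99999999999999. As a rough illustration only, the sketch below composes such a key from plain strings. The class and method names (MetaKeySketch, metaKey) are hypothetical and not part of HBase, and the real client works on byte arrays rather than strings. The trailing run of nines presumably sorts after any real region id, so a reversed scan starting at this key would reach the region that contains the row.

```java
public class MetaKeySketch {
    // Fourteen nines, matching the suffix of the metaKey seen in the log above.
    static final String NINES = "99999999999999";

    // Hypothetical helper: joins table name, row key and the nines suffix with commas.
    static String metaKey(String table, String row) {
        return table + "," + row + "," + NINES;
    }

    public static void main(String[] args) {
        String key = metaKey("IntegrationTestMTTRLoadTestTool",
                "70efdf2ec9b086079795c442636b55fb-17");
        // Prints the same metaKey value that appears in the client log.
        System.out.println(key);
    }
}
```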
      

      Client throwing TableNotFoundException (the hbase:meta scan returned null):

      2016-01-15 10:32:58,154 INFO  [HBaseWriterThread_5] client.ConnectionImplementation: regionInfo result is null: HBaseWriterThread_5 throwing TableNotFoundException logging details table=IntegrationTestMTTRLoadTestTool row=70efdf2ec9b086079795c442636b55fb-17 metaKey=IntegrationTestMTTRLoadTestTool,70efdf2ec9b086079795c442636b55fb-17,99999999999999
      2016-01-15 10:32:58,154 ERROR [HBaseWriterThread_5] client.AsyncProcess: Failed to get region location
      org.apache.hadoop.hbase.TableNotFoundException: IntegrationTestMTTRLoadTestTool
              at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:890)
              at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:781)
              at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:396)
              at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:344)
              at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:239)
              at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:191)
              at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:949)
              at org.apache.hadoop.hbase.client.HTable.put(HTable.java:569)
              at org.apache.hadoop.hbase.util.MultiThreadedWriter$HBaseWriterThread.insert(MultiThreadedWriter.java:146)
              at org.apache.hadoop.hbase.util.MultiThreadedWriter$HBaseWriterThread.run(MultiThreadedWriter.java:111)
      

      As a result, the insert operation fails:

      2016-01-15 10:32:58,179 ERROR [HBaseWriterThread_5] util.MultiThreadedWriter: Failed to insert: 17 after 60046ms; region information: cached: region=IntegrationTestMTTRLoadTestTool,66666660,1452849956427.05b437185a9437f178726a55a29a79b7., hostname=hnode4,16020,1452776418437, seqNum=5; cache is up to date; errors: exception from null for 70efdf2ec9b086079795c442636b55fb-17
      org.apache.hadoop.hbase.TableNotFoundException: IntegrationTestMTTRLoadTestTool
              at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:890)
              at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:781)
              at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:396)
              at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:344)
              at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:239)
              at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:191)
              at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:949)
              at org.apache.hadoop.hbase.client.HTable.put(HTable.java:569)
              at org.apache.hadoop.hbase.util.MultiThreadedWriter$HBaseWriterThread.insert(MultiThreadedWriter.java:146)
              at org.apache.hadoop.hbase.util.MultiThreadedWriter$HBaseWriterThread.run(MultiThreadedWriter.java:111)
      

      which leads to the test failing:

      Failed to write key: 17
      2016-01-15 10:33:53,984 INFO  [main] mttr.IntegrationTestMTTR: RestartMaster failed after 469878ms.
      java.util.concurrent.ExecutionException: java.lang.AssertionError: Load failed expected:<0> but was:<1>
      

      Here is the snippet from ConnectionImplementation#locateRegionInMeta() that throws the exception:

            try {
              Result regionInfoRow = null;
              ReversedClientScanner rcs = null;
              try {
                rcs = new ClientSmallReversedScanner(conf, s, TableName.META_TABLE_NAME, this,
                  rpcCallerFactory, rpcControllerFactory, getMetaLookupPool(), 0);
                regionInfoRow = rcs.next();
              } finally {
                if (rcs != null) {
                  rcs.close();
                }
              }
      
              if (regionInfoRow == null) {
                // The reversed scan of hbase:meta returned no row for this key.
                throw new TableNotFoundException(tableName);
              }
      

      I was able to avoid this issue by removing the throw statement and adding a continue, which allows the client to retry locating the region. This seems like the simplest solution here.
      Thoughts?
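
      To make the retry idea concrete, here is a minimal, self-contained sketch of the pattern, not the attached patch and not HBase code. It treats a null lookup result as transient and retries with continue instead of failing at once. The names LocateRetrySketch and locateWithRetry, the Supplier-based lookup, and the maxAttempts and pauseMillis parameters are all assumptions made for illustration.

```java
import java.util.function.Supplier;

public class LocateRetrySketch {
    /**
     * Calls lookup up to maxAttempts times. A null result is treated as
     * transient: the loop pauses and continues instead of throwing immediately.
     * Throws IllegalStateException once all attempts are exhausted.
     */
    static <T> T locateWithRetry(Supplier<T> lookup, int maxAttempts, long pauseMillis) {
        for (int tries = 0; tries < maxAttempts; tries++) {
            T result = lookup.get();
            if (result == null) {
                try {
                    Thread.sleep(pauseMillis); // back off before the next attempt
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while retrying", e);
                }
                continue; // retry rather than fail on the first empty result
            }
            return result;
        }
        throw new IllegalStateException("no result after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated lookup: returns null twice, then a location string.
        String location = locateWithRetry(
                () -> ++calls[0] < 3 ? null : "hnode4,16020", 5, 10L);
        System.out.println(location + " after " + calls[0] + " calls");
    }
}
```

      One trade-off of this approach is that a table that really does not exist is only reported after every attempt has been used up, so the attempt limit and pause need to be bounded.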

      Attachments

        1. HBASE-15121-v0.patch
          1 kB
          Samir Ahmic
        2. HBASE-15121-v0.patch
          1 kB
          Samir Ahmic

          People

            Assignee: Samir Ahmic (asamir)
            Reporter: Samir Ahmic (asamir)