Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-1603

MR failed "RetriesExhaustedException: Trying to contact region server Some server for region TestTable..."

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Cannot Reproduce
    • None
    • None
    • None
    • None

    Description

      Here is the master. Region TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246462685358 was split at 16:11:42,865. My MR job failed at 18:12:26,462 with this:

      2009-07-01 18:12:26,462 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
      org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server for region TestTable,�,1246464670313, row '��		', but failed after 10 attempts.
      Exceptions:
      ...
      

      Why after ten attempts did the client not find the region?

      2009-07-01 16:11:42,865 [IPC Server handler 2 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_SPLIT: TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246462685358: Daughters; TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313, TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from aa0-000-15.u.powerset.com,60020,1246461673026; 1 of 3
      2009-07-01 16:11:42,866 [IPC Server handler 2 on 60001] INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 to aa0-000-15.u.powerset.com,60020,1246461673026
      2009-07-01 16:11:42,866 [IPC Server handler 2 on 60001] INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 to aa0-000-15.u.powerset.com,60020,1246461673026
      2009-07-01 16:11:45,905 [IPC Server handler 8 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_PROCESS_OPEN: TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from aa0-000-15.u.powerset.com,60020,1246461673026; 1 of 3
      2009-07-01 16:11:45,905 [IPC Server handler 8 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 from aa0-000-15.u.powerset.com,60020,1246461673026; 2 of 3
      2009-07-01 16:11:45,906 [IPC Server handler 8 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from aa0-000-15.u.powerset.com,60020,1246461673026; 3 of 3
      2009-07-01 16:11:45,906 [HMaster] INFO org.apache.hadoop.hbase.master.RegionServerOperation: TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 open on 208.76.44.142:60020
      2009-07-01 16:11:45,906 [HMaster] INFO org.apache.hadoop.hbase.master.RegionServerOperation: updating row TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 in region .META.,,1 with startcode 1246461673026 and server 208.76.44.142:60020
      2009-07-01 16:11:45,908 [HMaster] INFO org.apache.hadoop.hbase.master.RegionServerOperation: TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 open on 208.76.44.142:60020
      2009-07-01 16:11:45,908 [HMaster] INFO org.apache.hadoop.hbase.master.RegionServerOperation: updating row TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 in region .META.,,1 with startcode 1246461673026 and server 208.76.44.142:60020
      2009-07-01 17:46:42,670 [IPC Server handler 0 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_SPLIT: TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313: Daughters; TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246470379467, TestTable,\x00\x00\x08\x04\x05\x07\x02\x05\x04\x08,1246470379467 from aa0-000-15.u.powerset.com,60020,1246461673026; 5 of 7
      

      Here is over on the regionserver:

      2009-07-01 16:11:42,865 [IPC Server handler 2 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_SPLIT: TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246462685358: Daughters; TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313, TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from aa0-000-15.u.powerset.com,60020,1246461673026; 1 of 3
      2009-07-01 16:11:42,866 [IPC Server handler 2 on 60001] INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 to aa0-000-15.u.powerset.com,60020,1246461673026
      2009-07-01 16:11:42,866 [IPC Server handler 2 on 60001] INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 to aa0-000-15.u.powerset.com,60020,1246461673026
      2009-07-01 16:11:45,905 [IPC Server handler 8 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_PROCESS_OPEN: TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from aa0-000-15.u.powerset.com,60020,1246461673026; 1 of 3
      2009-07-01 16:11:45,905 [IPC Server handler 8 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 from aa0-000-15.u.powerset.com,60020,1246461673026; 2 of 3
      2009-07-01 16:11:45,906 [IPC Server handler 8 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from aa0-000-15.u.powerset.com,60020,1246461673026; 3 of 3
      2009-07-01 16:11:45,906 [HMaster] INFO org.apache.hadoop.hbase.master.RegionServerOperation: TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 open on X.X.X.142:60020
      2009-07-01 16:11:45,906 [HMaster] INFO org.apache.hadoop.hbase.master.RegionServerOperation: updating row TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 in region .META.,,1 with startcode 1246461673026 and server X.X.X4.142:60020
      2009-07-01 16:11:45,908 [HMaster] INFO org.apache.hadoop.hbase.master.RegionServerOperation: TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 open on X.X.X.142:60020
      2009-07-01 16:11:45,908 [HMaster] INFO org.apache.hadoop.hbase.master.RegionServerOperation: updating row TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 in region .META.,,1 with startcode 1246461673026 and server X.X.X.142:60020
      

      Attachments

        1. debugging-v4.patch
          10 kB
          Michael Stack

        Activity

          People

            Unassigned Unassigned
            stack Michael Stack
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: