HBASE-506

When an exception has to escape ServerCallable due to exhausted retries, show all the exceptions that led to this situation

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.1.0, 0.2.0
    • Component/s: Client
    • Labels:
      None

      Description

      Every so often, we find ourselves trying to debug a problem that happens in HTable where we exhaust all our retries trying to contact the region server hosting the region we want to operate on. Oftentimes the last exception that comes out is something like WrongRegionException, which should just never be the case.

      As a way to improve our debugging capabilities, when we decide to throw an exception out of ServerCallable, let's show not just the last exception but all the exceptions that caused all the retries in the first place. This will help us understand the sequence of events that led to us running out of retries.
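
      The idea above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patches: the names RetriesExhaustedSketchException and RetrySketch.callWithRetries are hypothetical, and the loop stands in for whatever retry logic surrounds ServerCallable. Each failure is recorded, and once the attempts run out, a single exception lists every failure in order.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

/** Hypothetical exception that carries every failure seen while retrying. */
class RetriesExhaustedSketchException extends IOException {
    private final List<Throwable> causes;

    RetriesExhaustedSketchException(int attempts, List<Throwable> causes) {
        super(buildMessage(attempts, causes));
        this.causes = new ArrayList<>(causes);
    }

    /** All recorded failures, oldest first. */
    List<Throwable> getCauses() {
        return causes;
    }

    private static String buildMessage(int attempts, List<Throwable> causes) {
        StringBuilder sb = new StringBuilder();
        sb.append("Failed after ").append(attempts).append(" attempts. Exceptions:");
        for (int i = 0; i < causes.size(); i++) {
            sb.append('\n').append("  attempt ").append(i + 1).append(": ").append(causes.get(i));
        }
        return sb.toString();
    }
}

/** Hypothetical retry helper; a real client would also pause between attempts. */
class RetrySketch {
    static <T> T callWithRetries(Callable<T> callable, int maxAttempts) throws IOException {
        List<Throwable> failures = new ArrayList<>();
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return callable.call();
            } catch (Exception e) {
                // Remember this failure instead of discarding it on the next retry.
                failures.add(e);
            }
        }
        // Out of retries: report the whole sequence, not just the last exception.
        throw new RetriesExhaustedSketchException(maxAttempts, failures);
    }
}
```

      With something like this, a WrongRegionException on the final attempt would show up alongside whatever failed on the earlier attempts, which is the sequence of events the description asks for.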

      1. 506-0.1.patch
        2 kB
        Bryan Duxbury
      2. 506.patch
        2 kB
        Bryan Duxbury

        Activity

        Bryan Duxbury added a comment -

        I just committed this to 0.1 and trunk.

        Bryan Duxbury added a comment -

        I ran TestEmptyMetaInfo against this patch now that it's committed, and it works fine.

        Jim Kellerman added a comment -

        That is really odd that TestEmptyMetaInfo should fail. Basically what it does is open the META table, stick in a bunch of rows that don't have info:regioninfo in them and wait for the master to clean them up.

        Bryan Duxbury added a comment -

        Please review.

        Bryan Duxbury added a comment -

        Here's the same thing but for trunk.

        stack added a comment -

        Right. My test bed was polluted w/ HBASE-27.

        Bryan Duxbury added a comment -

        Don't see that failure in my test suite because the TestEmptyMetaInfo test doesn't exist yet. HBASE-27 hasn't been applied, right?

        stack added a comment -

        I saw the following when running the tests on the patch:

        [junit] Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 72.628 sec
        [junit] Test org.apache.hadoop.hbase.TestEmptyMetaInfo FAILED

        Do you see the same?

        stack added a comment -

        +1 on patch. Falls into the debugging tools category – will help prove/disprove IRC theory that WREs just happen to be the last of a set of retries – so fine to apply to 0.1 branch.

        Bryan Duxbury added a comment -

        Here's a patch to add this functionality for 0.1. (Nearly the same patch would apply to trunk, but let's see if this works first.)


          People

          • Assignee: Bryan Duxbury
          • Reporter: Bryan Duxbury
