HBase / HBASE-2421

Put hangs for 10 retries on failed region servers

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Fixed
    • None
    • 0.90.0
    • None
    • None
    • Reviewed

    Description

      Since MultiPut went in, instead of calling getRegionLocationForRowWithRetries we now call getRegionServerWithRetries to send an array list of Puts. The problem is that if the region server has failed, we still retry all 10 times with backoff, even though every attempt gets a connection refused. This is also true for a single Put, since it goes through the same code path.

      Marking as critical because it nearly disables our responsiveness to machine failures in cases where we are already sending a batch of edits when the server fails. Assigning to Ryan since he has worked in this code recently.
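
      The behaviour asked for above can be illustrated with a minimal sketch. This is not the code in the attached patches and not HBase's client API: the class FailFastRetry, the method callWithRetries, and its parameters are hypothetical names invented for illustration. It assumes that a refused connection surfaces as java.net.ConnectException and treats that as a signal to stop retrying the same server at once, so the caller can look up the region's new location instead of sleeping through the remaining attempts.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration only; not part of HBase. */
final class FailFastRetry {
    private FailFastRetry() {}

    /**
     * Runs the call up to maxRetries times, pausing longer after each failure.
     * A ConnectException is rethrown immediately: the server is unreachable,
     * so further attempts against the same address would only add delay.
     */
    static <T> T callWithRetries(Callable<T> call, int maxRetries, long basePauseMs)
            throws Exception {
        IOException last = null;
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            try {
                return call.call();
            } catch (ConnectException e) {
                // Assumption: connection refused means the region server is down.
                throw e;
            } catch (IOException e) {
                // Other I/O errors may be transient: back off, then try again.
                last = e;
                if (attempt + 1 < maxRetries) {
                    Thread.sleep(basePauseMs * (attempt + 1));
                }
            }
        }
        throw new IOException("Failed after " + maxRetries + " attempts", last);
    }
}
```

      With this shape, a call that hits a dead server returns control to the caller after a single attempt, while a call that hits a transient I/O error still gets the full retry budget.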

      Attachments

        1. HBASE-2421.txt
          8 kB
          ryan rawson
        2. ASF.LICENSE.NOT.GRANTED--HBASE-2421-2.txt
          10 kB
          ryan rawson
        3. hbase-2421.txt
          10 kB
          Todd Lipcon
        4. HBASE-2421-trunk.patch
          9 kB
          Jean-Daniel Cryans

          People

            Assignee: ryan rawson (ryanobjc)
            Reporter: Jean-Daniel Cryans (jdcryans)
            Votes: 0
            Watchers: 4