HBase / HBASE-4280

[replication] ReplicationSink can deadlock itself via handlers

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.90.4
    • Fix Version/s: 0.90.5
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      I've experienced this problem a few times. ReplicationSink calls are received through the normal handlers and can potentially call back into the same server, which in certain situations can fill up all the handlers. For example, 10 handlers that are all serving replication calls are all trying to talk to the local server at the same time.
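
The handler exhaustion described above can be reproduced outside HBase with plain thread pools. The following is a minimal sketch, not HBase code: the class name `HandlerPoolSketch`, the method `runReplicationCalls`, and the pool sizes are all hypothetical. Each simulated replication call occupies one handler and then issues a nested "local put" that also needs a handler. When both share one pool, every handler waits on work that can never be scheduled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-alone model of the handler pools; not part of HBase.
final class HandlerPoolSketch {

    /**
     * Runs `handlers` simulated replication calls, each of which issues one
     * nested call to the normal pool. Returns true if all of them finish
     * within timeoutMs, false if they are still blocked at the deadline.
     */
    static boolean runReplicationCalls(int handlers, boolean usePriorityPool, long timeoutMs)
            throws InterruptedException {
        ExecutorService normal = Executors.newFixedThreadPool(handlers);
        // Without the fix, replication calls share the normal handlers.
        ExecutorService priority = usePriorityPool ? Executors.newFixedThreadPool(handlers) : normal;
        CountDownLatch allStarted = new CountDownLatch(handlers);
        try {
            List<Future<Boolean>> calls = new ArrayList<>();
            for (int i = 0; i < handlers; i++) {
                calls.add(priority.submit(() -> {
                    // Wait until every handler is busy with a replication call.
                    allStarted.countDown();
                    allStarted.await();
                    // Nested call back into the local server (assumed to use the normal pool).
                    Future<Boolean> localPut = normal.submit(() -> true);
                    return localPut.get();
                }));
            }
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            for (Future<Boolean> call : calls) {
                long remaining = Math.max(deadline - System.nanoTime(), 0L);
                try {
                    call.get(remaining, TimeUnit.NANOSECONDS);
                } catch (TimeoutException e) {
                    return false; // handlers are stuck waiting on each other
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
            return true;
        } finally {
            // Interrupt any blocked handlers so the JVM can exit.
            normal.shutdownNow();
            priority.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("shared pool completes:   " + runReplicationCalls(10, false, 500L));
        System.out.println("separate pool completes: " + runReplicationCalls(10, true, 5000L));
    }
}
```

With a shared pool the nested calls sit in the queue behind handlers that are themselves waiting, so the first call reports `false`; giving the outer calls their own pool lets the nested calls run and the second call reports `true`.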

      HRS.replicateLogEntries should have @QosPriority(priority=HIGH_QOS) to use the other set of handlers.
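
To illustrate how an annotation like this can route a call to a different handler set, here is a self-contained sketch. It re-declares a `QosPriority` annotation and the `HIGH_QOS`/`NORMAL_QOS` constants locally; the numeric values, the `RegionServerStub` class, and the `handlerPoolFor` helper are assumptions made for illustration and are not copied from HRegionServer.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of annotation-driven QoS routing; not the HBase implementation.
final class QosSketch {
    static final int NORMAL_QOS = 0;   // assumed value
    static final int HIGH_QOS = 100;   // assumed value

    // Must be retained at runtime so the dispatcher can read it via reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface QosPriority {
        int priority() default NORMAL_QOS;
    }

    /** Stand-in for the region server's RPC surface. */
    static class RegionServerStub {
        @QosPriority(priority = HIGH_QOS)
        public void replicateLogEntries() { }

        public void put() { }
    }

    /** Builds a method-name to priority map; unannotated methods get NORMAL_QOS. */
    static Map<String, Integer> priorities(Class<?> rpcClass) {
        Map<String, Integer> byMethod = new HashMap<>();
        for (Method m : rpcClass.getDeclaredMethods()) {
            QosPriority p = m.getAnnotation(QosPriority.class);
            byMethod.put(m.getName(), p == null ? NORMAL_QOS : p.priority());
        }
        return byMethod;
    }

    /** Chooses a handler pool name for an incoming call, by method name. */
    static String handlerPoolFor(String methodName) {
        int qos = priorities(RegionServerStub.class).getOrDefault(methodName, NORMAL_QOS);
        return qos >= HIGH_QOS ? "priority" : "normal";
    }
}
```

In this sketch, annotating `replicateLogEntries` is the only change needed to move it off the normal handlers, which matches the small size of the attached patch.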

      1. HBASE-4280-0.90.patch
        0.5 kB
        Jean-Daniel Cryans

        Activity

        jdcryans Jean-Daniel Cryans added a comment -

        Puts HRS.replicateLogEntries in the high priority bucket. Maybe we need another bucket? Maybe for those RS -> RS communications.

        stack stack added a comment -

        Should replication traffic be preferred to the server's main load? Should there be a lower-priority handler set that the replication traffic uses?

        jdcryans Jean-Daniel Cryans added a comment -

        It seems like we shouldn't be the ones deciding that... but currently the user has no control over QOS since it's hard-coded.

        jdcryans Jean-Daniel Cryans added a comment -

        I opened HBASE-4441.

        stack stack added a comment -

        Ok. +1 on this patch for 0.90 (and for 0.92 till we do hbase-4441)

        jdcryans Jean-Daniel Cryans added a comment -

        I've been testing it for 12 hours under a heavy upload workload that used to fail before this patch, and it's still working. Going to commit. Thanks for the +1, Stack.

        jdcryans Jean-Daniel Cryans added a comment -

        Committed to 0.90, 0.92, trunk. Thanks for your comments Stack!

        hudson Hudson added a comment -

        Integrated in HBase-0.92 #18 (See https://builds.apache.org/job/HBase-0.92/18/)
        HBASE-4280 [replication] ReplicationSink can deadlock itself via handlers

        jdcryans :
        Files :

        • /hbase/branches/0.92/CHANGES.txt
        • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
        hudson Hudson added a comment -

        Integrated in HBase-TRUNK #2247 (See https://builds.apache.org/job/HBase-TRUNK/2247/)
        HBASE-4280 [replication] ReplicationSink can deadlock itself via handlers

        jdcryans :
        Files :

        • /hbase/trunk/CHANGES.txt
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
        lars_francke Lars Francke added a comment -

        This issue was closed as part of a bulk closing operation on 2015-11-20. All issues that have been resolved and where all fixVersions have been released have been closed (following discussions on the mailing list).


          People

          • Assignee: Jean-Daniel Cryans
          • Reporter: Jean-Daniel Cryans