  1. Hadoop Common
  2. HADOOP-15449

Increase default timeout of ZK session to avoid frequent NameNode failover

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.7.4
    • Fix Version/s: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.3, 2.8.5
    • Component/s: common
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Several users have reported NameNode failovers (flip-overs) caused by either slow ZooKeeper disks (high fsync cost) or network issues. Increasing the HA session timeout, ha.zookeeper.session-timeout.ms, would avoid some of these failovers.

      The default value is 5000 ms, which seems very low for a production environment. I suggest 10000 ms as the default session timeout.
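
      On releases that still ship the 5000 ms default, the proposed value can be set as a site-level override. A minimal sketch in Python that renders the property entry, assuming the override is placed inside the <configuration> element of core-site.xml; the helper name zk_session_timeout_property is hypothetical and only used for illustration.

```python
import xml.etree.ElementTree as ET


def zk_session_timeout_property(timeout_ms):
    """Build a Hadoop-style <property> element for the ZK session timeout.

    Hypothetical helper: it only renders XML, it does not touch a cluster.
    """
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "ha.zookeeper.session-timeout.ms"
    ET.SubElement(prop, "value").text = str(timeout_ms)
    return prop


# Render the override with the 10000 ms value suggested in this issue.
snippet = ET.tostring(zk_session_timeout_property(10000), encoding="unicode")
print(snippet)
```

      The printed element is meant to be pasted into the site configuration by hand; the sketch does not read or modify any existing file.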

       

      
      ..
      
      2018-05-04 03:54:36,848 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1140)) - Client session timed out, have not heard from server in 4689ms for sessionid 0x260e24bac500aa3, closing socket connection and attempting reconnect 
      2018-05-04 03:56:49,088 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1140)) - Client session timed out, have not heard from server in 3981ms for sessionid 0x360fd152b8700fe, closing socket connection and attempting reconnect
      
      .. 
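
      The two log lines above can be checked against the old and proposed defaults. A rough sketch in Python, under two assumptions that are not stated in this issue: the ZooKeeper Java client gives up on a silent server after about two thirds of the session timeout, and the negotiated session timeout equals the configured value. The function names are illustrative only.

```python
import re

# The two log lines quoted in the description.
LOG_LINES = [
    "2018-05-04 03:54:36,848 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1140)) - "
    "Client session timed out, have not heard from server in 4689ms for sessionid "
    "0x260e24bac500aa3, closing socket connection and attempting reconnect",
    "2018-05-04 03:56:49,088 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1140)) - "
    "Client session timed out, have not heard from server in 3981ms for sessionid "
    "0x360fd152b8700fe, closing socket connection and attempting reconnect",
]

IDLE_RE = re.compile(r"have not heard from server in (\d+)ms for sessionid (0x[0-9a-f]+)")


def read_timeout_ms(session_timeout_ms):
    # Assumption: the client read timeout is 2/3 of the session timeout.
    return session_timeout_ms * 2 // 3


def idle_times(lines):
    """Return (session id, idle milliseconds) for each matching log line."""
    found = []
    for line in lines:
        match = IDLE_RE.search(line)
        if match:
            found.append((match.group(2), int(match.group(1))))
    return found


def would_time_out(idle_ms, session_timeout_ms):
    """True if the observed silence reaches the assumed client read timeout."""
    return idle_ms >= read_timeout_ms(session_timeout_ms)


for session_id, idle in idle_times(LOG_LINES):
    for timeout in (5000, 10000):
        print(session_id, idle, timeout, would_time_out(idle, timeout))
```

      Under those assumptions, both observed gaps (4689 ms and 3981 ms) exceed the roughly 3333 ms read timeout implied by a 5000 ms session, and both stay below the roughly 6666 ms implied by 10000 ms.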
      
      

        Attachments

        1. HADOOP-15449-002.patch
          2 kB
          Karthik Palanisamy
        2. HADOOP-15449.patch
          0.6 kB
          Karthik Palanisamy

            People

            • Assignee: Karthik Palanisamy (kpalanisamy)
            • Reporter: Karthik Palanisamy (kpalanisamy)
            • Votes: 0
            • Watchers: 5
