  1. Hadoop Common
  2. HADOOP-15449

Increase default timeout of ZK session to avoid frequent NameNode failover

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.7.4
    • Fix Version/s: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.3, 2.8.5
    • Component/s: common
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Several users have reported NameNode failovers caused by either ZooKeeper disk slowness (high fsync cost) or network issues. We can avoid these failovers to some extent by increasing the HA session timeout, ha.zookeeper.session-timeout.ms.

      The default value is 5000 ms, which seems very low for any production environment. I suggest 10000 ms as the default session timeout.
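
On releases that still ship the 5000 ms default, the same effect can be had by overriding the property per cluster. The fragment below is a minimal sketch, assuming the override is placed in core-site.xml on the hosts running the ZKFC; the value 10000 is simply the one proposed in this issue, not a tuned recommendation.

```xml
<!-- core-site.xml: sketch of a per-cluster override (assumed location) -->
<configuration>
  <property>
    <!-- ZooKeeper session timeout used by the HA failover controller -->
    <name>ha.zookeeper.session-timeout.ms</name>
    <!-- 10000 ms is the value suggested in this issue; the old default is 5000 ms -->
    <value>10000</value>
  </property>
</configuration>
```

Note that the ZooKeeper server negotiates the session timeout within its own minSessionTimeout and maxSessionTimeout bounds, so the requested value only takes effect if it falls inside that range.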

       

      
      ..
      
      2018-05-04 03:54:36,848 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1140)) - Client session timed out, have not heard from server in 4689ms for sessionid 0x260e24bac500aa3, closing socket connection and attempting reconnect 
      2018-05-04 03:56:49,088 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(1140)) - Client session timed out, have not heard from server in 3981ms for sessionid 0x360fd152b8700fe, closing socket connection and attempting reconnect
      
      .. 
      
      

      Attachments

        1. HADOOP-15449-002.patch
          2 kB
          Karthik Palanisamy
        2. HADOOP-15449.patch
          0.6 kB
          Karthik Palanisamy

          People

            Assignee: kpalanisamy Karthik Palanisamy
            Reporter: kpalanisamy Karthik Palanisamy
