IMPALA-9340

statestore_max_missed_heartbeats is off by one


Details


    Description

      The flag statestore_max_missed_heartbeats says:

      Maximum number of consecutive heartbeat messages an impalad can miss before being declared failed by the statestore.

      However, the implementation actually waits for statestore_max_missed_heartbeats + 1 missed heartbeats before considering the impalad as failed.

      Example when statestore_max_missed_heartbeats is set to 10 (the default value):

      logs/custom_cluster_tests/statestored.impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com.jenkins.log.INFO.20200128-105531.29877:I0128 10:58:04.214053 29932 failure-detector.cc:90] 1 consecutive heartbeats failed for 'impalad@impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com:22002'. State is OK
      logs/custom_cluster_tests/statestored.impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com.jenkins.log.INFO.20200128-105531.29877:I0128 10:58:04.267143 29937 failure-detector.cc:90] 2 consecutive heartbeats failed for 'impalad@impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com:22002'. State is OK
      logs/custom_cluster_tests/statestored.impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com.jenkins.log.INFO.20200128-105531.29877:I0128 10:58:04.320443 29938 failure-detector.cc:90] 3 consecutive heartbeats failed for 'impalad@impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com:22002'. State is OK
      logs/custom_cluster_tests/statestored.impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com.jenkins.log.INFO.20200128-105531.29877:I0128 10:58:04.373548 29934 failure-detector.cc:90] 4 consecutive heartbeats failed for 'impalad@impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com:22002'. State is OK
      logs/custom_cluster_tests/statestored.impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com.jenkins.log.INFO.20200128-105531.29877:I0128 10:58:04.426955 29929 failure-detector.cc:90] 5 consecutive heartbeats failed for 'impalad@impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com:22002'. State is OK
      logs/custom_cluster_tests/statestored.impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com.jenkins.log.INFO.20200128-105531.29877:I0128 10:58:04.479981 29933 failure-detector.cc:90] 6 consecutive heartbeats failed for 'impalad@impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com:22002'. State is SUSPECTED
      logs/custom_cluster_tests/statestored.impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com.jenkins.log.INFO.20200128-105531.29877:I0128 10:58:04.533097 29930 failure-detector.cc:90] 7 consecutive heartbeats failed for 'impalad@impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com:22002'. State is SUSPECTED
      logs/custom_cluster_tests/statestored.impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com.jenkins.log.INFO.20200128-105531.29877:I0128 10:58:04.586172 29934 failure-detector.cc:90] 8 consecutive heartbeats failed for 'impalad@impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com:22002'. State is SUSPECTED
      logs/custom_cluster_tests/statestored.impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com.jenkins.log.INFO.20200128-105531.29877:I0128 10:58:04.639999 29936 failure-detector.cc:90] 9 consecutive heartbeats failed for 'impalad@impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com:22002'. State is SUSPECTED
      logs/custom_cluster_tests/statestored.impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com.jenkins.log.INFO.20200128-105531.29877:I0128 10:58:04.692075 29929 failure-detector.cc:90] 10 consecutive heartbeats failed for 'impalad@impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com:22002'. State is SUSPECTED
      logs/custom_cluster_tests/statestored.impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com.jenkins.log.INFO.20200128-105531.29877:I0128 10:58:04.745105 29931 failure-detector.cc:90] 11 consecutive heartbeats failed for 'impalad@impala-ec2-centos74-m5-4xlarge-ondemand-09f9.vpc.cloudera.com:22002'. State is FAILED 
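
The state transitions in the log above are consistent with a strict "greater than" comparison against the thresholds. The following is a minimal sketch of that reading, not the actual code in failure-detector.cc: the function names `observed_state` and `expected_state` are hypothetical, and the SUSPECTED threshold of half the maximum is an assumption inferred from the log (SUSPECTED first appears at 6 missed heartbeats when the maximum is 10).

```python
def observed_state(missed: int, max_missed: int) -> str:
    """State as seen in the log: strict '>' comparisons (hypothetical sketch)."""
    # Assumption: the suspect threshold is half of max_missed, inferred from the log.
    suspect_threshold = max_missed // 2
    if missed > max_missed:
        return "FAILED"      # first reached at max_missed + 1
    if missed > suspect_threshold:
        return "SUSPECTED"
    return "OK"


def expected_state(missed: int, max_missed: int) -> str:
    """State under the reading this issue expects: FAILED once max_missed is reached."""
    suspect_threshold = max_missed // 2
    if missed >= max_missed:
        return "FAILED"      # reached at exactly max_missed
    if missed > suspect_threshold:
        return "SUSPECTED"
    return "OK"


if __name__ == "__main__":
    # Replay the counts from the log with statestore_max_missed_heartbeats = 10.
    for count in range(1, 12):
        print(count, observed_state(count, 10), expected_state(count, 10))
```

Under this sketch, the two functions differ only at a count of 10: the strict comparison still reports SUSPECTED, while the inclusive comparison reports FAILED.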

          People

            rizaon Riza Suminto
            stakiar Sahil Takiar