IMPALA-6785

Starting an Impalad on an already running cluster may result in inconsistent cluster subscription

Details

    Description

      After starting a couple of Impala daemons on an already running cluster without restarting the Statestore, I noticed that queries are not running against the newly started daemons.
      Checking the /backends page on any of the hosts showed that the list only includes the daemons that have been running for a while.

      Repro steps, on a large cluster with 400 nodes:

      1. Start 300 nodes, run a workload for a couple of hours
      2. Start 100 nodes, wait for a couple of minutes
      3. Check the /backends page on any of the newly started hosts; it lists only the original 300 nodes.
      4. The Statestore shows 400 subscribers as expected; statestore.live-backends.list shows the same.
      5. The Subscribers page on the Statestore shows that all the old daemons have 2 transient entries and the newly started daemons have 0.

      The newly added hosts have a subscriber list that doesn't include themselves.
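
The symptom above can be checked mechanically once the backend list shown on each daemon's /backends page has been collected. The following is only a minimal sketch: the function name `find_membership_gaps`, the input layout (a dict from daemon address to the set of addresses it reports), and the host names are all hypothetical, and collecting the lists from the daemons is left out.

```python
from typing import Dict, Set


def find_membership_gaps(views: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return, for each daemon, the cluster members missing from its backend list.

    `views` maps a daemon's address to the set of backend addresses that
    daemon reports. The expected membership is assumed to be every daemon
    that supplied a view; the Statestore's subscriber list could be used
    instead.
    """
    expected = set(views)
    gaps = {}
    for daemon, seen in views.items():
        missing = expected - seen
        if missing:
            gaps[daemon] = missing
    return gaps


# Hypothetical three-node illustration of the reported symptom: every
# daemon, including the new one, lists only the two original daemons.
example_views = {
    "old-1:22000": {"old-1:22000", "old-2:22000"},
    "old-2:22000": {"old-1:22000", "old-2:22000"},
    "new-1:22000": {"old-1:22000", "old-2:22000"},
}
gaps = find_membership_gaps(example_views)
```

In this illustration every daemon, including `new-1:22000` itself, is reported as missing `new-1:22000`, which mirrors the observation that a newly added host does not appear in its own list.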

      Metrics from daemon that was up from the start

      rpc-method.statestore-subscriber.StatestoreSubscriber.Heartbeat.call_duration	Count: 120972, min / max: 0 / 1ms, 25th %-ile: 0, 50th %-ile: 0, 75th %-ile: 0, 90th %-ile: 0, 95th %-ile: 0, 99.9th %-ile: 1ms	
      rpc-method.statestore-subscriber.StatestoreSubscriber.UpdateState.call_duration	Count: 1243000, min / max: 0 / 1s531ms, 25th %-ile: 0, 50th %-ile: 1ms, 75th %-ile: 1ms, 90th %-ile: 1ms, 95th %-ile: 4ms, 99.9th %-ile: 21ms	
      statestore-subscriber.connected	true	Whether the Impala Daemon considers itself connected to the StateStore.
      statestore-subscriber.heartbeat-interval-time	Last (of 120972): 1.00025. Min: 0, max: 1.04143, avg: 1.00048	The time (sec) between Statestore heartbeats.
      statestore-subscriber.last-recovery-duration	0	The amount of time the StateStore subscriber took to recover the connection the last time it was lost.
      statestore-subscriber.last-recovery-time	N/A	The local time that the last statestore recovery happened.
      statestore-subscriber.topic-catalog-update.processing-time-s	Last (of 60048): 0.0112636. Min: 0, max: 1.53041, avg: 0.0133049	Statestore Subscriber Topic catalog-update Processing Time
      statestore-subscriber.topic-catalog-update.update-interval	Last (of 60048): 2.00077. Min: 1, max: 18.4453, avg: 2.00238	Interval between topic updates for Topic catalog-update
      statestore-subscriber.topic-impala-membership.processing-time-s	Last (of 1182952): 0.000539773. Min: 0, max: 0.0523181, avg: 0.000404144	Statestore Subscriber Topic impala-membership Processing Time
      statestore-subscriber.topic-impala-membership.update-interval	Last (of 1182952): 0.101232. Min: 0, max: 76.2051, avg: 0.101726	Interval between topic updates for Topic impala-membership
      statestore-subscriber.topic-impala-request-queue.processing-time-s	Last (of 1182952): 0.000763235. Min: 0, max: 0.177429, avg: 0.000591509	Statestore Subscriber Topic impala-request-queue Processing Time
      statestore-subscriber.topic-impala-request-queue.update-interval	Last (of 1182952): 0.101234. Min: 0, max: 76.2051, avg: 0.101727	Interval between topic updates for Topic impala-request-queue
      statestore-subscriber.topic-update-duration	Last (of 1243000): 0.000765367. Min: 0, max: 1.53041, avg: 0.00120732	The time (sec) taken to process Statestore subscriber topic updates.
      statestore-subscriber.topic-update-interval-time	Last (of 2425952): 0.101234. Min: 0, max: 76.2051, avg: 0.148772	The time (sec) between Statestore subscriber topic updates.
      

      Metrics from daemon that was newly started

      rpc-method.statestore-subscriber.StatestoreSubscriber.Heartbeat.call_duration	Count: 1280, min / max: 0 / 1ms, 25th %-ile: 0, 50th %-ile: 0, 75th %-ile: 0, 90th %-ile: 0, 95th %-ile: 0, 99.9th %-ile: 1ms	
      rpc-method.statestore-subscriber.StatestoreSubscriber.UpdateState.call_duration	Count: 632, min / max: 8ms / 1s048ms, 25th %-ile: 13ms, 50th %-ile: 13ms, 75th %-ile: 14ms, 90th %-ile: 15ms, 95th %-ile: 17ms, 99.9th %-ile: 81ms	
      statestore-subscriber.connected	true	Whether the Impala Daemon considers itself connected to the StateStore.
      statestore-subscriber.heartbeat-interval-time	Last (of 1280): 1.00096. Min: 0, max: 1.04206, avg: 1.00102	The time (sec) between Statestore heartbeats.
      statestore-subscriber.last-recovery-duration	0	The amount of time the StateStore subscriber took to recover the connection the last time it was lost.
      statestore-subscriber.last-recovery-time	N/A	The local time that the last statestore recovery happened.
      statestore-subscriber.topic-catalog-update.processing-time-s	Last (of 631): 0.0135166. Min: 0, max: 1.0252, avg: 0.0154726	Statestore Subscriber Topic catalog-update Processing Time
      statestore-subscriber.topic-catalog-update.update-interval	Last (of 631): 1.99982. Min: 1, max: 10.2491, avg: 2.01458	Interval between topic updates for Topic catalog-update
      statestore-subscriber.topic-impala-membership.processing-time-s	Last (of 1): 0.0161542. Min: 0, max: 0.0161542, avg: 0.0161542	Statestore Subscriber Topic impala-membership Processing Time
      statestore-subscriber.topic-impala-membership.update-interval	Last (of 1): 0.98709. Min: 0, max: 0.98709, avg: 0.98709	Interval between topic updates for Topic impala-membership
      statestore-subscriber.topic-impala-request-queue.processing-time-s	Last (of 1): 0.0168564. Min: 0, max: 0.0168564, avg: 0.0168564	Statestore Subscriber Topic impala-request-queue Processing Time
      statestore-subscriber.topic-impala-request-queue.update-interval	Last (of 1): 0.986915. Min: 0, max: 0.986915, avg: 0.986915	Interval between topic updates for Topic impala-request-queue
      statestore-subscriber.topic-update-duration	Last (of 632): 0.0135185. Min: 0, max: 1.0252, avg: 0.0154767	The time (sec) taken to process Statestore subscriber topic updates.
      statestore-subscriber.topic-update-interval-time	Last (of 633): 1.99982. Min: 0, max: 10.2491, avg: 2.01134	The time (sec) between Statestore subscriber topic updates.
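
One difference between the two dumps stands out when the "Last (of N)" sample counts are compared. The newly started daemon has processed 1280 heartbeats and 631 catalog-update topic updates, but only 1 update each for impala-membership and impala-request-queue, whereas the long-running daemon has 1182952 for those topics. A minimal sketch for extracting these counts from a pasted dump follows; the helper name `sample_counts` is hypothetical, and it assumes the tab-separated "name, value, description" layout shown above.

```python
import re

# Matches the sample count in a value such as "Last (of 631): 1.99982. Min: ..."
_SAMPLES = re.compile(r"Last \(of (\d+)\)")


def sample_counts(metrics_text: str) -> dict:
    """Map each metric name to the N in its "Last (of N)" value, if present."""
    counts = {}
    for line in metrics_text.splitlines():
        fields = line.strip().split("\t")
        if len(fields) < 2:
            continue  # blank line or a line without a value column
        match = _SAMPLES.search(fields[1])
        if match:
            counts[fields[0]] = int(match.group(1))
    return counts


# Three rows copied from the newly started daemon's metrics.
new_daemon = sample_counts(
    "statestore-subscriber.heartbeat-interval-time\t"
    "Last (of 1280): 1.00096. Min: 0, max: 1.04206, avg: 1.00102\t"
    "The time (sec) between Statestore heartbeats.\n"
    "statestore-subscriber.topic-catalog-update.update-interval\t"
    "Last (of 631): 1.99982. Min: 1, max: 10.2491, avg: 2.01458\t"
    "Interval between topic updates for Topic catalog-update\n"
    "statestore-subscriber.topic-impala-membership.update-interval\t"
    "Last (of 1): 0.98709. Min: 0, max: 0.98709, avg: 0.98709\t"
    "Interval between topic updates for Topic impala-membership\n"
)
```

Running the same helper over both dumps makes it easy to spot topics whose update count has stalled on one daemon while heartbeats keep arriving.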
      

            People

              tarmstrong Tim Armstrong
              mmokhtar Mostafa Mokhtar