IMPALA-6785

Starting an Impalad on an already running cluster may result in inconsistent cluster subscription


Details

    Description

      After starting a couple of Impala daemons on an already running cluster without restarting the Statestore, I noticed that queries are not running against the newly started daemons.
      Checking the /backends page on any of the hosts showed that the list only includes the daemons that have been running for a while.

      Repro steps, on a large cluster with 400 nodes

      1. Start 300 nodes, run a workload for a couple of hours
      2. Start 100 nodes, wait for a couple of minutes
      3. Check the /backends page on any of the newly started hosts; it lists only the original 300 nodes.
      4. The Statestore shows 400 subscribers as expected, and so does statestore.live-backends.list
      5. The Subscribers page on the Statestore shows that all the old daemons have 2 transient entries, while the newly started daemons have 0

      Each newly added host has a subscriber list that doesn't include itself.
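
      The check in steps 3 and 5 could be scripted once the backend list shown by each daemon has been collected. A minimal sketch, assuming the lists are already available as a dict of daemon name to the set of backends it reports (how they are fetched is left out; the function name and host names below are hypothetical and not part of Impala):

```python
from typing import Dict, List, Set


def find_inconsistent_views(views: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return, per daemon, the backends missing from its membership view.

    `views` maps each daemon to the set of backends it lists.
    The expected membership is assumed to be every daemon that reported a view.
    """
    expected = set(views)
    missing: Dict[str, List[str]] = {}
    for daemon, seen in views.items():
        absent = sorted(expected - seen)
        if absent:
            missing[daemon] = absent
    return missing


if __name__ == "__main__":
    # Hypothetical miniature of the report: two old daemons, one new one,
    # and every daemon lists only the old ones.
    old = {"old-1", "old-2"}
    views = {"old-1": set(old), "old-2": set(old), "new-1": set(old)}
    for daemon, absent in find_inconsistent_views(views).items():
        print(daemon, "is missing", absent)
```

      In a healthy cluster the function returns an empty dict. In the situation described here every daemon, including the new one itself, would be reported as missing the newly started hosts.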

      Metrics from daemon that was up from the start

      rpc-method.statestore-subscriber.StatestoreSubscriber.Heartbeat.call_duration	Count: 120972, min / max: 0 / 1ms, 25th %-ile: 0, 50th %-ile: 0, 75th %-ile: 0, 90th %-ile: 0, 95th %-ile: 0, 99.9th %-ile: 1ms	
      rpc-method.statestore-subscriber.StatestoreSubscriber.UpdateState.call_duration	Count: 1243000, min / max: 0 / 1s531ms, 25th %-ile: 0, 50th %-ile: 1ms, 75th %-ile: 1ms, 90th %-ile: 1ms, 95th %-ile: 4ms, 99.9th %-ile: 21ms	
      statestore-subscriber.connected	true	Whether the Impala Daemon considers itself connected to the StateStore.
      statestore-subscriber.heartbeat-interval-time	Last (of 120972): 1.00025. Min: 0, max: 1.04143, avg: 1.00048	The time (sec) between Statestore heartbeats.
      statestore-subscriber.last-recovery-duration	0	The amount of time the StateStore subscriber took to recover the connection the last time it was lost.
      statestore-subscriber.last-recovery-time	N/A	The local time that the last statestore recovery happened.
      statestore-subscriber.topic-catalog-update.processing-time-s	Last (of 60048): 0.0112636. Min: 0, max: 1.53041, avg: 0.0133049	Statestore Subscriber Topic catalog-update Processing Time
      statestore-subscriber.topic-catalog-update.update-interval	Last (of 60048): 2.00077. Min: 1, max: 18.4453, avg: 2.00238	Interval between topic updates for Topic catalog-update
      statestore-subscriber.topic-impala-membership.processing-time-s	Last (of 1182952): 0.000539773. Min: 0, max: 0.0523181, avg: 0.000404144	Statestore Subscriber Topic impala-membership Processing Time
      statestore-subscriber.topic-impala-membership.update-interval	Last (of 1182952): 0.101232. Min: 0, max: 76.2051, avg: 0.101726	Interval between topic updates for Topic impala-membership
      statestore-subscriber.topic-impala-request-queue.processing-time-s	Last (of 1182952): 0.000763235. Min: 0, max: 0.177429, avg: 0.000591509	Statestore Subscriber Topic impala-request-queue Processing Time
      statestore-subscriber.topic-impala-request-queue.update-interval	Last (of 1182952): 0.101234. Min: 0, max: 76.2051, avg: 0.101727	Interval between topic updates for Topic impala-request-queue
      statestore-subscriber.topic-update-duration	Last (of 1243000): 0.000765367. Min: 0, max: 1.53041, avg: 0.00120732	The time (sec) taken to process Statestore subscriber topic updates.
      statestore-subscriber.topic-update-interval-time	Last (of 2425952): 0.101234. Min: 0, max: 76.2051, avg: 0.148772	The time (sec) between Statestore subscriber topic updates.
      

      Metrics from daemon that was newly started

      rpc-method.statestore-subscriber.StatestoreSubscriber.Heartbeat.call_duration	Count: 1280, min / max: 0 / 1ms, 25th %-ile: 0, 50th %-ile: 0, 75th %-ile: 0, 90th %-ile: 0, 95th %-ile: 0, 99.9th %-ile: 1ms	
      rpc-method.statestore-subscriber.StatestoreSubscriber.UpdateState.call_duration	Count: 632, min / max: 8ms / 1s048ms, 25th %-ile: 13ms, 50th %-ile: 13ms, 75th %-ile: 14ms, 90th %-ile: 15ms, 95th %-ile: 17ms, 99.9th %-ile: 81ms	
      statestore-subscriber.connected	true	Whether the Impala Daemon considers itself connected to the StateStore.
      statestore-subscriber.heartbeat-interval-time	Last (of 1280): 1.00096. Min: 0, max: 1.04206, avg: 1.00102	The time (sec) between Statestore heartbeats.
      statestore-subscriber.last-recovery-duration	0	The amount of time the StateStore subscriber took to recover the connection the last time it was lost.
      statestore-subscriber.last-recovery-time	N/A	The local time that the last statestore recovery happened.
      statestore-subscriber.topic-catalog-update.processing-time-s	Last (of 631): 0.0135166. Min: 0, max: 1.0252, avg: 0.0154726	Statestore Subscriber Topic catalog-update Processing Time
      statestore-subscriber.topic-catalog-update.update-interval	Last (of 631): 1.99982. Min: 1, max: 10.2491, avg: 2.01458	Interval between topic updates for Topic catalog-update
      statestore-subscriber.topic-impala-membership.processing-time-s	Last (of 1): 0.0161542. Min: 0, max: 0.0161542, avg: 0.0161542	Statestore Subscriber Topic impala-membership Processing Time
      statestore-subscriber.topic-impala-membership.update-interval	Last (of 1): 0.98709. Min: 0, max: 0.98709, avg: 0.98709	Interval between topic updates for Topic impala-membership
      statestore-subscriber.topic-impala-request-queue.processing-time-s	Last (of 1): 0.0168564. Min: 0, max: 0.0168564, avg: 0.0168564	Statestore Subscriber Topic impala-request-queue Processing Time
      statestore-subscriber.topic-impala-request-queue.update-interval	Last (of 1): 0.986915. Min: 0, max: 0.986915, avg: 0.986915	Interval between topic updates for Topic impala-request-queue
      statestore-subscriber.topic-update-duration	Last (of 632): 0.0135185. Min: 0, max: 1.0252, avg: 0.0154767	The time (sec) taken to process Statestore subscriber topic updates.
      statestore-subscriber.topic-update-interval-time	Last (of 633): 1.99982. Min: 0, max: 10.2491, avg: 2.01134	The time (sec) between Statestore subscriber topic updates.
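
      The clearest difference between the two dumps is the sample count on the per-topic update-interval metrics: the new daemon has seen 631 catalog-update updates but only 1 update each for impala-membership and impala-request-queue. A rough sketch for extracting those "Last (of N)" counts from a pasted dump; the regex and function names are assumptions based on the line format above, and the threshold of 1 is an arbitrary illustrative choice:

```python
import re
from typing import Dict, List

# Matches lines such as:
# statestore-subscriber.topic-impala-membership.update-interval<TAB>Last (of 1): 0.98709. ...
_UPDATE_INTERVAL = re.compile(
    r"statestore-subscriber\.topic-(?P<topic>[\w-]+)\.update-interval\s+"
    r"Last \(of (?P<count>\d+)\)"
)


def topic_update_counts(dump: str) -> Dict[str, int]:
    """Map each topic name to the number of updates recorded for it."""
    counts: Dict[str, int] = {}
    for line in dump.splitlines():
        match = _UPDATE_INTERVAL.search(line)
        if match:
            counts[match.group("topic")] = int(match.group("count"))
    return counts


def stalled_topics(counts: Dict[str, int], threshold: int = 1) -> List[str]:
    """Return topics whose update count is at or below `threshold`."""
    return sorted(topic for topic, n in counts.items() if n <= threshold)


if __name__ == "__main__":
    sample = (
        "statestore-subscriber.topic-catalog-update.update-interval\tLast (of 631): 1.99982.\n"
        "statestore-subscriber.topic-impala-membership.update-interval\tLast (of 1): 0.98709.\n"
        "statestore-subscriber.topic-impala-request-queue.update-interval\tLast (of 1): 0.986915.\n"
    )
    print(stalled_topics(topic_update_counts(sample)))
```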
      

          People

            tarmstrong Tim Armstrong
            mmokhtar Mostafa Mokhtar