
SOLR-10619: Optimize using cache for DistributedQueue in case of single-consumer

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.6, 7.0
    • Component/s: None
    • Security Level: Public (Default Security Level. Issues are Public)
    • Labels:
      None

      Description

      Right now, every time childWatcher is kicked off, we refetch all children of the DQ's node. This is wasteful in the single-consumer case.
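
      A minimal sketch of the wasteful pattern described above, using a simplified stand-in for the real class (the fields knownChildren and isDirty and the raw ZooKeeper calls are assumptions for illustration, not the actual Solr source): every child-watcher event marks the queue dirty, and the next peek()/poll() pulls the whole child list from ZK again even though the cache is still populated.

      import java.util.List;
      import java.util.TreeSet;
      import org.apache.zookeeper.KeeperException;
      import org.apache.zookeeper.ZooKeeper;

      class NaiveDistributedQueue {
        private final ZooKeeper zk;                    // ZK handle
        private final String dir;                      // queue node, e.g. "/overseer/queue"
        private final TreeSet<String> knownChildren = new TreeSet<>();
        private volatile boolean isDirty = true;       // set by the child watcher

        NaiveDistributedQueue(ZooKeeper zk, String dir) {
          this.zk = zk;
          this.dir = dir;
        }

        /** Returns the first child, refetching from ZK whenever the dirty flag is set. */
        String firstChild() throws KeeperException, InterruptedException {
          if (isDirty) {
            // Wasteful for a single consumer: the whole child list (possibly tens of
            // thousands of nodes) is pulled again because one new element arrived.
            List<String> children = zk.getChildren(dir, event -> isDirty = true);
            knownChildren.clear();
            knownChildren.addAll(children);
            isDirty = false;
          }
          return knownChildren.isEmpty() ? null : knownChildren.first();
        }
      }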

      Attachments

      1. SOLR-10619.patch
        7 kB
        Cao Manh Dat
      2. SOLR-10619.patch
        5 kB
        Cao Manh Dat
      3. SOLR-10619.patch
        5 kB
        Cao Manh Dat
      4. SOLR-10619.patch
        5 kB
        Cao Manh Dat
      5. SOLR-10619-dragonsinth.patch
        7 kB
        Scott Blum

          Activity

          Cao Manh Dat added a comment (edited)

          Initial patch for this ticket. The Overseer.testPerformance running time was reduced from 2m18s to 31s.

          Before applying the patch, running time = 2m18s

          Overseer loop finished processing:
          avgRequestsPerSecond: 0.007718809350800701
          5minRateRequestsPerSecond: 0.0
          15minRateRequestsPerSecond: 0.0
          avgTimePerRequest: 129550895586000000
          medianRequestTime: 129550895586000000
          75thPcRequestTime: 129550895586000000
          95thPcRequestTime: 129550895586000000
          99thPcRequestTime: 129550895586000000
          999thPcRequestTime: 129550895586000000
          op: am_i_leader, success: 3, failure: 0
          avgRequestsPerSecond: 0.02319808972135658
          5minRateRequestsPerSecond: 0.2681280184142557
          15minRateRequestsPerSecond: 0.35006932761717896
          avgTimePerRequest: 279884000000
          medianRequestTime: 113274000000
          75thPcRequestTime: 113274000000
          95thPcRequestTime: 1470039000000
          99thPcRequestTime: 1470039000000
          999thPcRequestTime: 1470039000000
          op: update_state, success: 20011, failure: 0
          avgRequestsPerSecond: 154.80879143800885
          5minRateRequestsPerSecond: 102.1642410535161
          15minRateRequestsPerSecond: 87.06135680694149
          avgTimePerRequest: 226193000000
          medianRequestTime: 213152000000
          75thPcRequestTime: 228076000000
          95thPcRequestTime: 258815000000
          99thPcRequestTime: 332711000000
          999thPcRequestTime: 3070637000000
          op: state, success: 20001, failure: 0
          avgRequestsPerSecond: 154.95644369471393
          5minRateRequestsPerSecond: 103.31921470642384
          15minRateRequestsPerSecond: 88.36168489656214
          avgTimePerRequest: 15785000000
          medianRequestTime: 14638000000
          75thPcRequestTime: 19736000000
          95thPcRequestTime: 26959000000
          99thPcRequestTime: 41592000000
          999thPcRequestTime: 82301000000

          After applying the patch, running time = 31s

          Overseer loop finished processing:
          avgRequestsPerSecond: 0.04702007081952135
          5minRateRequestsPerSecond: 0.0
          15minRateRequestsPerSecond: 0.0
          avgTimePerRequest: 21266388653000000
          medianRequestTime: 21266388653000000
          75thPcRequestTime: 21266388653000000
          95thPcRequestTime: 21266388653000000
          99thPcRequestTime: 21266388653000000
          999thPcRequestTime: 21266388653000000
          op: am_i_leader, success: 3, failure: 0
          avgRequestsPerSecond: 0.1424634644143451
          5minRateRequestsPerSecond: 0.3804917698002856
          15minRateRequestsPerSecond: 0.393388581528647
          avgTimePerRequest: 715047000000
          medianRequestTime: 354218000000
          75thPcRequestTime: 1893782000000
          95thPcRequestTime: 1893782000000
          99thPcRequestTime: 1893782000000
          999thPcRequestTime: 1893782000000
          op: update_state, success: 20011, failure: 0
          avgRequestsPerSecond: 952.638561665829
          5minRateRequestsPerSecond: 794.828260100909
          15minRateRequestsPerSecond: 786.4717084310993
          avgTimePerRequest: 214683000000
          medianRequestTime: 197207000000
          75thPcRequestTime: 217411000000
          95thPcRequestTime: 298345000000
          99thPcRequestTime: 363035000000
          999thPcRequestTime: 2295814000000
          op: state, success: 20001, failure: 0
          avgRequestsPerSecond: 953.595055990491
          5minRateRequestsPerSecond: 799.4104704273237
          15minRateRequestsPerSecond: 791.1978869728018
          avgTimePerRequest: 12457000000
          medianRequestTime: 8845000000
          75thPcRequestTime: 14149000000
          95thPcRequestTime: 28728000000
          99thPcRequestTime: 50239000000
          999thPcRequestTime: 79344000000

          Cao Manh Dat added a comment

          I think this patch is ready to be committed. Scott Blum, can you take a look at the patch?

          Scott Blum added a comment

          What's the whole "singleConsumer" thing? Can you just give me an overview of what's happening here?

          Cao Manh Dat added a comment

          Scott Blum, "singleConsumer" (multiple-producer, single-consumer) means many threads enqueue data to the DQ but only one thread removes (processes) messages from the DQ.
          The scenario here is that whenever we call firstChild() (peek or poll) with the isDirty state equal to true, we always call fetchZkChildren, even when knownChildren is not empty. This is wasteful for the Overseer DQ (single consumer), e.g.:

          • knownChildren already holds 20,000 elements
          • some other thread offers one more element to the queue
          • childWatcher is kicked off in the Overseer DQ, so the isDirty state is set to true
          • the next time we call peek, fetchZkChildren is called to refill knownChildren with 20,001 elements
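
          A minimal sketch of the producer side of this scenario, assuming the queue's children are persistent-sequential ZK nodes under the queue path (the "qn-" prefix is illustrative, not necessarily Solr's exact naming scheme). Under the current logic, each such offer causes the single consumer to refetch all 20,001 children on its next peek.

          import org.apache.zookeeper.CreateMode;
          import org.apache.zookeeper.KeeperException;
          import org.apache.zookeeper.ZooDefs;
          import org.apache.zookeeper.ZooKeeper;

          class QueueProducer {
            private final ZooKeeper zk;
            private final String dir;   // e.g. "/overseer/queue"

            QueueProducer(ZooKeeper zk, String dir) {
              this.zk = zk;
              this.dir = dir;
            }

            /** Offer one element; this single create fires the consumer's childWatcher,
             *  which marks the consumer's cached view dirty. */
            void offer(byte[] data) throws KeeperException, InterruptedException {
              zk.create(dir + "/qn-", data, ZooDefs.Ids.OPEN_ACL_UNSAFE,
                  CreateMode.PERSISTENT_SEQUENTIAL);
            }
          }
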
          Scott Blum added a comment

          I completely follow your logic. But then, why do we need a singleConsumer flag? Couldn't we just do the optimization in all cases?

          In other words, let's say we're dirty but we have children in memory. Fine, let's optimistically fetch the first child's data based on the node name we have in memory. If it succeeds, we're done, just return it.

          But if it fails (there's no such node) and we're in a dirty state, THEN we can refetch the child list from ZK.
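
          A sketch of the optimistic approach described above, reusing the simplified fields (zk, dir, knownChildren, isDirty) from the sketch under the issue description; it illustrates the idea only and is not the committed patch.

          byte[] peek() throws KeeperException, InterruptedException {
            // Optimistically trust the in-memory head, even if the dirty flag is set.
            if (!knownChildren.isEmpty()) {
              try {
                return zk.getData(dir + "/" + knownChildren.first(), null, null);
              } catch (KeeperException.NoNodeException e) {
                // The head vanished (another consumer, or manual ZK edits):
                // fall through and refresh the child list below.
              }
            }
            // Dirty with no usable head, or nothing cached: refetch the child list.
            knownChildren.clear();
            knownChildren.addAll(zk.getChildren(dir, event -> isDirty = true));
            isDirty = false;
            // A second miss is still possible in a race; the real code would loop.
            return knownChildren.isEmpty()
                ? null
                : zk.getData(dir + "/" + knownChildren.first(), null, null);
          }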

          Cao Manh Dat added a comment

          I completely follow your logic. But then, why do we need a singleConsumer flag? Couldn't we just do the optimization in all cases?

          I thought about the reason for not doing this in the first place. The only reason is that in the multiple-consumer case, the current logic makes sense.

          Yeah, your idea seems to work. I will update the patch that way.

          Cao Manh Dat added a comment

          But instead of refetching the child list from ZK when there's no node, I think we should try to move on to the next node name we have in memory.

          Scott Blum added a comment

          Good thought. I was debating the same thing in my head. In normal Solr operation it would be an exceptional case: since Overseer is single-consumer, it should almost never happen that resolving the head node in ZK fails. But for the sake of generality, I can see both sides:

          • If you continue trying to iterate through in-memory children, you might hit a really long list of failures (many, many round trips to ZK) before hitting the next live child. Theoretically, re-fetching the child list from ZK would be faster to get a fresh view.
          • But, in a true multi-consumer use case, re-fetching the child list every time you get a "miss" because someone else consumed ahead of you would cause the many consumers to thrash a lot.

          So I'm a little torn, since option 1 might be slightly better for Overseer in particular, but option 2 is clearly more general. Thoughts?

          Cao Manh Dat added a comment

          Yeah, that's the reason I added a flag (single-consumer) to the class. It's much easier to optimize if we know how the DQ is used.
          But in the single-consumer case, a node not existing is kinda rare (right?), so if option 2 is more general, should we go with option 2?

          Scott Blum added a comment

          Sounds good!

          Shalin Shekhar Mangar added a comment

          I was under the impression that the DistributedQueue only sets a watch on ZK if knownChildren is empty. It says so in the implementation notes section of the class. Did that change, or did that never work at all?

          Shalin Shekhar Mangar added a comment (edited)

          I read the code again and yeah, my understanding was wrong. The watcher is set on each fetch from ZK, which happens either because we are in a dirty state or because we have run out of items in knownChildren, so we end up having a watcher all the time. The docs in the implementation notes are also wrong.

          I'll lean towards the option that keeps things simple for the use-case we need this queue for – which is the overseer. There's no harm in explicitly marking/documenting DistributedQueue as a single-consumer queue.

          If the overseer sees a node disappear from ZK, it is either because someone mucked around with ZK manually or because a new overseer has been elected and the old overseer hasn't realised that yet. In such a case, it might be worth adding a callback which the overseer can use to check whether it is still the leader. The result of this callback can be used to abort further calls to ZK.

          Wow, I'm glad Dat caught this! Thanks Dat!

          Scott Blum added a comment

          I'll take another look tomorrow.

          Cao Manh Dat added a comment

          Updated patch for this ticket. In this patch, firstChild() always returns from the cache first; if the node does not exist, the callers of firstChild() clear the cache and call firstChild() again.
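
          A sketch of the behaviour described above, again as a simplified stand-in rather than the actual patch: firstChild() answers from the in-memory cache whenever it is non-empty, and the caller (poll() here) clears the cache and calls it again when the cached head turns out not to exist in ZK.

          import java.util.TreeSet;
          import org.apache.zookeeper.KeeperException;
          import org.apache.zookeeper.ZooKeeper;

          class CacheFirstQueue {
            private final ZooKeeper zk;
            private final String dir;
            private final TreeSet<String> knownChildren = new TreeSet<>();
            private volatile boolean isDirty = true;

            CacheFirstQueue(ZooKeeper zk, String dir) {
              this.zk = zk;
              this.dir = dir;
            }

            /** Cache first: only touches ZK when the cache is empty. */
            private String firstChild() throws KeeperException, InterruptedException {
              if (!knownChildren.isEmpty()) {
                return knownChildren.first();
              }
              knownChildren.addAll(zk.getChildren(dir, event -> isDirty = true));
              isDirty = false;
              return knownChildren.isEmpty() ? null : knownChildren.first();
            }

            byte[] poll() throws KeeperException, InterruptedException {
              while (true) {
                String head = firstChild();
                if (head == null) {
                  return null;                       // queue is empty
                }
                String path = dir + "/" + head;
                try {
                  byte[] data = zk.getData(path, null, null);
                  zk.delete(path, -1);               // -1 = any version
                  knownChildren.remove(head);
                  return data;
                } catch (KeeperException.NoNodeException e) {
                  // The cached head no longer exists: drop the stale cache and
                  // re-call firstChild(). Efficient only for a single consumer.
                  knownChildren.clear();
                }
              }
            }
          }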

          Scott Blum added a comment

          Cao Manh Dat, the patch LGTM. I would update the main class comment on DistributedQueue to read:

          "A distributed queue. Optimized for single-consumer, multiple-producer: if there are multiple consumers on the same ZK queue, the results should be correct but inefficient."

          And then where you call knownChildren.clear(); you could add the comment "Efficient only for single-consumer".

          Cao Manh Dat added a comment

          The patch makes MultiThreadedOCPTest.testFillWorkQueue() fail because DQ.peekElements() changed its behaviour. I'm trying to find a way to solve this problem.

          Cao Manh Dat added a comment

          Introduced a flag to firstChild so peekElements can expect the same behaviour as before. We can open a new ticket if we find peekElements is not efficient enough.
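
          Illustrative only; the parameter name below is an assumption, not necessarily the patch's exact signature. In the same simplified model as the earlier sketches, a flag lets peekElements() keep the old refetch-when-dirty behaviour while peek()/poll() use the cache-first path.

          String firstChild(boolean refetchIfDirty) throws KeeperException, InterruptedException {
            if (!knownChildren.isEmpty() && !(refetchIfDirty && isDirty)) {
              return knownChildren.first();          // cache-first path
            }
            // Either the cache is empty, or the caller asked for a fresh view.
            knownChildren.clear();
            knownChildren.addAll(zk.getChildren(dir, event -> isDirty = true));
            isDirty = false;
            return knownChildren.isEmpty() ? null : knownChildren.first();
          }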

          Scott Blum added a comment

          Hmm lemme take a look. Maybe that test should just have different expectations?

          Scott Blum added a comment

          Cao Manh Dat, okay, I see what's going on here. Your change looks good. Although... I think maybe we should change peekElements so that it only forces a refresh if it's going to have to block?
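
          A sketch of that idea in the same simplified model (the real peekElements signature and wait handling differ): the child list is refetched from ZK only when nothing is cached, i.e. only when the call would otherwise have to block waiting for new elements.

          java.util.List<String> peekElements(int max, long waitMillis)
              throws KeeperException, InterruptedException {
            final long deadline = System.nanoTime() + waitMillis * 1_000_000L;
            while (true) {
              if (!knownChildren.isEmpty()) {
                // Cached elements exist: serve up to max of them without touching ZK.
                return knownChildren.stream().limit(max)
                    .collect(java.util.stream.Collectors.toList());
              }
              // Nothing cached, so we would have to block: only here is a refresh forced.
              knownChildren.addAll(zk.getChildren(dir, event -> isDirty = true));
              isDirty = false;
              if (knownChildren.isEmpty()) {
                if (System.nanoTime() >= deadline) {
                  return java.util.Collections.emptyList();
                }
                Thread.sleep(50);   // crude poll; the real code waits on the child watcher
              }
            }
          }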

          Scott Blum added a comment

          What do you think about this?

          Shalin Shekhar Mangar added a comment (edited)

          +1 LGTM for Scott's patch that does not refetch unless we need to block

          Cao Manh Dat added a comment

          Scott Blum, the patch LGTM. I will commit the patch soon.

          ASF subversion and git services added a comment

          Commit a24fa8d7db5d880723cfc0eaa26d7ae320c4cbeb in lucene-solr's branch refs/heads/master from Cao Manh Dat
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a24fa8d ]

          SOLR-10619: Optimize using cache for DistributedQueue in case of single-consumer

          ASF subversion and git services added a comment

          Commit 7b23d96d326100313fda184811e0a0ff434d7d02 in lucene-solr's branch refs/heads/branch_6x from Cao Manh Dat
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=7b23d96 ]

          SOLR-10619: Optimize using cache for DistributedQueue in case of single-consumer

          Cao Manh Dat added a comment

          Thanks Scott Blum and Shalin Shekhar Mangar

            People

            • Assignee: Cao Manh Dat
            • Reporter: Cao Manh Dat
            • Votes: 0
            • Watchers: 4
