Solr / SOLR-14524

Harden MultiThreadedOCPTest


Details

    • Type: Test
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 9.0
    • Fix Version/s: 9.0
    • Component/s: SolrCloud

    Description

      MultiThreadedOCPTest.test() fails occasionally on Jenkins because of the timing of task enqueues to the Collection API queue.

      In testFillWorkQueue(), this test enqueues a large number of tasks (115, more than the 100 Collection API parallel executors) to the Collection API queue for collection COLL_A, waits for a short delay, then enqueues a task for another collection, COLL_B.
      It then verifies that the COLL_B task (which does not require the same lock as the COLL_A tasks) completes before the third COLL_A task.

      The test fails because, when enqueues are slow enough, the first three COLL_A tasks complete before the COLL_B task is even enqueued.

      In one sample failed Jenkins run, the COLL_B task was enqueued 1275ms after the first COLL_A task, leaving plenty of time for a few (and possibly all) COLL_A tasks to complete.

      The fix will be along the lines of:

      • Make the “blocking” COLL_A task take longer to execute (it currently takes 1 second) to compensate for slow enqueues.
      • Verify that the COLL_B task (a 1ms task) finishes before the long-running COLL_A task does. This would be a good indication that, even though the collection queue was filled with tasks waiting for a busy lock, a non-competing task was picked and executed right away.
      • Delay the enqueue of the COLL_B task until the first COLL_A task has finished processing. This would guarantee that COLL_B is enqueued only after at least some COLL_A tasks have started processing at the Overseer. Possibly also verify that the long-running COLL_A task has not yet finished when the COLL_B task is enqueued.
      • It might be possible to set a (very) long duration for the slow COLL_A task (to be less vulnerable to execution delays) without requiring the test to wait for that task to complete, waiting only for the COLL_B task to complete (so the test doesn't run for too long).
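
      The ordering check proposed in the list above can be sketched outside of Solr with plain java.util.concurrent primitives. This is only an illustration under stated assumptions, not the actual test or the Overseer's implementation: the class name, the method name, the ReentrantLock standing in for the per-collection lock, and the thread pool standing in for the Collection API executors are all hypothetical. For simplicity the COLL_B stand-in is submitted once the long COLL_A stand-in has started, rather than after a first COLL_A task has completed. The sketch waits only for the COLL_B stand-in, so the long COLL_A duration does not lengthen the run.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for the hardened check; it does not use any Solr API. */
public class HardenedOrderingSketch {

  /**
   * Returns true if the COLL_B stand-in completed while the long COLL_A
   * stand-in was still running.
   */
  public static boolean collBFinishesWhileCollARuns(long longTaskMillis, int queuedCollATasks) {
    // Assumption: one thread per task, so a free worker always exists for COLL_B.
    ExecutorService executor = Executors.newFixedThreadPool(queuedCollATasks + 2);
    ReentrantLock collALock = new ReentrantLock(); // stand-in for the COLL_A lock
    CountDownLatch longTaskStarted = new CountDownLatch(1);
    try {
      // The long "blocking" COLL_A task: takes the lock and holds it.
      Future<?> longCollATask = executor.submit(() -> {
        collALock.lockInterruptibly();
        try {
          longTaskStarted.countDown();
          Thread.sleep(longTaskMillis);
        } finally {
          collALock.unlock();
        }
        return null;
      });

      // Do not enqueue anything else until the long task is really running.
      if (!longTaskStarted.await(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("long COLL_A task did not start");
      }

      // Fill the queue with COLL_A tasks that all wait for the busy lock.
      for (int i = 0; i < queuedCollATasks; i++) {
        executor.submit(() -> {
          collALock.lockInterruptibly();
          collALock.unlock();
          return null;
        });
      }

      // The COLL_B task needs no COLL_A lock and should run right away.
      Future<?> collBTask = executor.submit(() -> {
        Thread.sleep(1);
        return null;
      });

      // Wait only for COLL_B, then check that the long COLL_A task is still running.
      collBTask.get(10, TimeUnit.SECONDS);
      return !longCollATask.isDone();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException | TimeoutException e) {
      throw new IllegalStateException(e);
    } finally {
      // Interrupts the long task and the waiters so the caller is not delayed.
      executor.shutdownNow();
    }
  }
}
```

      Because the check compares COLL_B against a single long-running task rather than against the third COLL_A task, a slow enqueue only matters if it exceeds the long task's duration, which can be made generous at no cost to test run time.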

          People

            Assignee: Ilan Ginzburg
            Reporter: Ilan Ginzburg
            Votes: 1
            Watchers: 7

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 1h 50m
