Kafka / KAFKA-9632

Transient test failure: PartitionLockTest.testAppendReplicaFetchWithUpdateIsr


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.5.0
    • Fix Version/s: 2.6.0
    • Component/s: core
    • Labels: None

    Description

      When this test is run with numRecordsPerProducer=500, it fails intermittently. The test uses MockTime and runs concurrent log operations. This can cause problems when a segment is rolled, because Log and MockScheduler do not work well together: MockScheduler currently runs tasks while holding the MockScheduler lock. A deadlock can result if a thread tries to schedule a task while holding a lock that is also acquired inside a scheduled task.

      The issue in this test occurs when these two operations happen concurrently:

      1) LogManager.cleanupLogs is a scheduled task that acquires the Log lock. When it is run by MockScheduler, the thread already holds the MockScheduler lock and then attempts to acquire the Log lock.

      2) Partition.appendLogsToLeader holds the Log lock and attempts to acquire the MockScheduler lock in order to schedule a roll().

      Since the locking order in 1) is the reverse of that in 2), this causes a deadlock.
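The lock-order inversion described above can be reproduced in isolation. The following is a minimal sketch, not Kafka code: `schedulerLock` and `logLock` are hypothetical stand-ins for the MockScheduler lock and the Log lock, and `tryLock` with a timeout is used in place of a blocking acquire so that the example reports the conflict instead of hanging.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

final class LockOrderInversionDemo {

    // Takes `first`, waits until the other thread holds its own first lock,
    // then tries `second` with a timeout instead of blocking forever.
    static boolean tryNested(ReentrantLock first, ReentrantLock second,
                             CyclicBarrier barrier) throws Exception {
        first.lock();
        try {
            barrier.await(5, TimeUnit.SECONDS);  // both first locks are now held
            boolean gotSecond = second.tryLock(100, TimeUnit.MILLISECONDS);
            if (gotSecond) {
                second.unlock();
            }
            barrier.await(5, TimeUnit.SECONDS);  // keep `first` until both attempts end
            return gotSecond;
        } finally {
            first.unlock();
        }
    }

    // Returns { cleanupGotLogLock, appendGotSchedulerLock }.
    static boolean[] run() {
        ReentrantLock schedulerLock = new ReentrantLock(); // stand-in: MockScheduler lock
        ReentrantLock logLock = new ReentrantLock();       // stand-in: Log lock
        CyclicBarrier barrier = new CyclicBarrier(2);
        // Path 1: scheduler lock first, then Log lock (like the scheduled cleanup task).
        Callable<Boolean> cleanup = () -> tryNested(schedulerLock, logLock, barrier);
        // Path 2: Log lock first, then scheduler lock (like append scheduling a roll).
        Callable<Boolean> append = () -> tryNested(logLock, schedulerLock, barrier);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<Boolean> c = pool.submit(cleanup);
            Future<Boolean> a = pool.submit(append);
            return new boolean[] { c.get(), a.get() };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("cleanup got Log lock: " + r[0]);
        System.out.println("append got scheduler lock: " + r[1]);
    }
}
```

Neither thread obtains its second lock, because each keeps its first lock until both attempts have finished. With plain blocking `lock()` calls, the same interleaving would leave both threads waiting forever.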

      The test itself can easily be fixed by avoiding roll() in the test, but it would be good to fix MockScheduler so that it can be used in this case.
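One way a mock scheduler could avoid this kind of deadlock is to hold its lock only while touching the task queue and to run each task after releasing it. The sketch below illustrates that idea under stated assumptions: `SketchMockScheduler`, `schedule`, `tick`, and `pollDue` are hypothetical names for illustration, not Kafka's MockScheduler API and not necessarily the change that resolved this issue.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

final class SketchMockScheduler {

    private static final class Task {
        final long dueMs;
        final Runnable body;

        Task(long dueMs, Runnable body) {
            this.dueMs = dueMs;
            this.body = body;
        }
    }

    // Pending tasks ordered by due time; guarded by this object's monitor.
    private final PriorityQueue<Task> tasks =
        new PriorityQueue<>(Comparator.comparingLong((Task t) -> t.dueMs));
    private long nowMs = 0;

    // Queues a task; the monitor is held only while the queue is updated.
    public synchronized void schedule(Runnable body, long delayMs) {
        tasks.add(new Task(nowMs + delayMs, body));
    }

    // Removes and returns the next due task, or null if none is due.
    private synchronized Task pollDue() {
        Task head = tasks.peek();
        return (head != null && head.dueMs <= nowMs) ? tasks.poll() : null;
    }

    // Advances the mock clock, then runs due tasks one at a time.
    public void tick(long advanceMs) {
        synchronized (this) {
            nowMs += advanceMs;
        }
        Task task;
        while ((task = pollDue()) != null) {
            task.body.run(); // runs WITHOUT the scheduler monitor held
        }
    }
}
```

Because a task body never runs under the scheduler's monitor, a task that acquires some other lock cannot block a thread that holds that lock and calls `schedule`. A task can confirm this with `Thread.holdsLock(scheduler)`, which returns false inside the task body in this sketch.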

       


          People

            rsivaram Rajini Sivaram
            rsivaram Rajini Sivaram
            Manikumar Manikumar