  1. Kafka
  2. KAFKA-4848

Stream thread getting into deadlock state while trying to get rocksdb lock in retryWithBackoff

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.2.0
    • Fix Version/s: 0.10.2.1, 0.11.0.0
    • Component/s: streams
    • Labels:
      None

      Description

      We see a deadlock when a stream thread takes longer than MAX_POLL_INTERVAL_MS_CONFIG to process a task. In this case the thread's partitions, together with the RocksDB lock, are assigned to some other thread. When the original thread tries to process its next task, it cannot acquire the RocksDB lock and simply keeps waiting for that lock forever.

      In retryWithBackoff for AbstractTaskCreator we start with backoffTimeMs = 50L.
      If the thread does not get the lock, we simply increase the backoff time by 10x and keep trying inside the while (true) loop.

      We need an upper bound for backoffTimeMs. If the backoff time is greater than MAX_POLL_INTERVAL_MS_CONFIG and the thread still has not got the lock, this means its partitions have been moved somewhere else and it may never get the lock again.
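
      The bounded retry proposed above could look roughly like the sketch below. This is a standalone illustration, not the Kafka Streams code: the class BoundedBackoff, the method retryWithBoundedBackoff and its parameters are hypothetical names, the BooleanSupplier stands in for the attempt to take the RocksDB state lock, and maxWaitMs plays the role of the value configured for MAX_POLL_INTERVAL_MS_CONFIG. The 10x multiplier follows the description in this report.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of a retry loop whose total wait is bounded. */
final class BoundedBackoff {

    /**
     * Calls tryLock until it succeeds or until the total time slept reaches
     * maxWaitMs. Returns true if the lock was acquired, false if the caller
     * should give up (for example because the partitions were reassigned).
     */
    static boolean retryWithBoundedBackoff(BooleanSupplier tryLock,
                                           long initialBackoffMs,
                                           long maxWaitMs) {
        if (initialBackoffMs <= 0 || maxWaitMs < 0) {
            throw new IllegalArgumentException("backoff must be > 0 and maxWaitMs >= 0");
        }
        long backoffMs = initialBackoffMs;
        long waitedMs = 0L;
        while (true) {
            if (tryLock.getAsBoolean()) {
                return true;
            }
            if (waitedMs >= maxWaitMs) {
                return false; // upper bound reached: stop instead of waiting forever
            }
            // Never sleep past the overall bound.
            long sleepMs = Math.min(backoffMs, maxWaitMs - waitedMs);
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return false;
            }
            waitedMs += sleepMs;
            // Grow the backoff by 10x, capped at maxWaitMs to avoid overflow.
            backoffMs = (backoffMs <= maxWaitMs / 10) ? backoffMs * 10 : maxWaitMs;
        }
    }
}
```

      With an initial backoff of 50 ms, a caller would pass the poll interval as maxWaitMs and treat a false return as a signal that the task can no longer be created on this thread. How the thread should react after giving up is a design decision this sketch leaves open.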

        Attachments

        1. thr-1
          279 kB
          Sachin Mittal

              People

              • Assignee:
                sjmittal Sachin Mittal
                Reporter:
                sjmittal Sachin Mittal