Kafka / KAFKA-4848

Stream thread getting into deadlock state while trying to get rocksdb lock in retryWithBackoff

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.2.0
    • Fix Version/s: 0.10.2.1, 0.11.0.0
    • Component/s: streams
    • Labels: None

    Description

      We see a deadlock when a streams thread takes longer than MAX_POLL_INTERVAL_MS_CONFIG to process a task. In that case the thread's partitions are reassigned to another thread, along with the RocksDB lock. When the original thread tries to process its next task, it cannot acquire the RocksDB lock and keeps waiting for it forever.

      In retryWithBackoff for AbstractTaskCreator we have backoffTimeMs = 50L.
      If the thread does not get the lock, we simply increase the time by 10x and keep trying inside the while (true) loop.

      We need an upper bound for backoffTimeMs. If the time exceeds MAX_POLL_INTERVAL_MS_CONFIG and the thread still has not acquired the lock, its partitions have been moved elsewhere and it may never get the lock again.
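      The bounded retry proposed above could look roughly like the sketch below. This is not the actual Kafka Streams code or the committed fix: the class BoundedBackoff, the method retryWithBoundedBackoff, and the parameters tryLock, initialBackoffMs and maxWaitMs are hypothetical names chosen for illustration. It assumes the lock attempt can be expressed as a BooleanSupplier and that the caller passes the configured max poll interval as maxWaitMs.

```java
import java.util.function.BooleanSupplier;

public class BoundedBackoff {

    /**
     * Hypothetical sketch: retries tryLock with a 10x growing backoff, but
     * gives up once the total time slept reaches maxWaitMs.
     * Returns true if the lock was acquired, false if the bound was hit
     * or the thread was interrupted while sleeping.
     */
    static boolean retryWithBoundedBackoff(BooleanSupplier tryLock,
                                           long initialBackoffMs,
                                           long maxWaitMs) {
        if (initialBackoffMs <= 0) {
            throw new IllegalArgumentException("initialBackoffMs must be positive");
        }
        long backoffMs = initialBackoffMs;
        long waitedMs = 0L;
        while (true) {
            if (tryLock.getAsBoolean()) {
                return true; // lock acquired
            }
            if (waitedMs >= maxWaitMs) {
                // Upper bound reached: the partitions have probably been
                // reassigned, so stop waiting instead of looping forever.
                return false;
            }
            // Never sleep past the remaining budget.
            long sleepMs = Math.min(backoffMs, maxWaitMs - waitedMs);
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
            waitedMs += sleepMs;
            backoffMs *= 10; // same 10x growth as described in the report
        }
    }
}
```

      A caller would presumably treat a false return as a signal that the task no longer belongs to this thread and skip or release it, rather than retrying again. Bounding by time slept (not wall-clock time) keeps the sketch simple; a real implementation might prefer to measure elapsed time.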

      Attachments

        1. thr-1
          279 kB
          Sachin Mittal

            People

              Assignee: Sachin Mittal (sjmittal)
              Reporter: Sachin Mittal (sjmittal)