KAFKA-9458

Kafka crashed in Windows environment

    Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 2.4.0
    • Fix Version/s: None
    • Component/s: log
    • Labels:
    • Environment:
      Windows Server 2019

      Description

      Hi,

      While I was validating the Kafka retention policy, the Kafka server crashed with the exception trace below.

      [2020-01-21 17:10:40,475] INFO [Log partition=test1-3, dir=C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka] Rolled new log segment at offset 1 in 52 ms. (kafka.log.Log)
      [2020-01-21 17:10:40,484] ERROR Error while deleting segments for test1-3 in dir C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka (kafka.server.LogDirFailureChannel)
      java.nio.file.FileSystemException: C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex -> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex.deleted: The process cannot access the file because it is being used by another process.

      at java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
      at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
      at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:395)
      at java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292)
      at java.base/java.nio.file.Files.move(Files.java:1425)
      at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:795)
      at kafka.log.AbstractIndex.renameTo(AbstractIndex.scala:209)
      at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:497)
      at kafka.log.Log.$anonfun$deleteSegmentFiles$1(Log.scala:2206)
      at kafka.log.Log.$anonfun$deleteSegmentFiles$1$adapted(Log.scala:2206)
      at scala.collection.immutable.List.foreach(List.scala:305)
      at kafka.log.Log.deleteSegmentFiles(Log.scala:2206)
      at kafka.log.Log.removeAndDeleteSegments(Log.scala:2191)
      at kafka.log.Log.$anonfun$deleteSegments$2(Log.scala:1700)
      at scala.runtime.java8.JFunction0$mcI$sp.apply(JFunction0$mcI$sp.scala:17)
      at kafka.log.Log.maybeHandleIOException(Log.scala:2316)
      at kafka.log.Log.deleteSegments(Log.scala:1691)
      at kafka.log.Log.deleteOldSegments(Log.scala:1686)
      at kafka.log.Log.deleteRetentionMsBreachedSegments(Log.scala:1763)
      at kafka.log.Log.deleteOldSegments(Log.scala:1753)
      at kafka.log.LogManager.$anonfun$cleanupLogs$3(LogManager.scala:982)
      at kafka.log.LogManager.$anonfun$cleanupLogs$3$adapted(LogManager.scala:979)
      at scala.collection.immutable.List.foreach(List.scala:305)
      at kafka.log.LogManager.cleanupLogs(LogManager.scala:979)
      at kafka.log.LogManager.$anonfun$startup$2(LogManager.scala:403)
      at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:116)
      at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:65)
      at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
      at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
      at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
      at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      at java.base/java.lang.Thread.run(Thread.java:830)
      Suppressed: java.nio.file.FileSystemException: C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex -> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex.deleted: The process cannot access the file because it is being used by another process.

      at java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
      at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
      at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:309)
      at java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292)
      at java.base/java.nio.file.Files.move(Files.java:1425)
      at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:792)
      ... 27 more
      [2020-01-21 17:10:40,495] INFO [ReplicaManager broker=0] Stopping serving replicas in dir C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka (kafka.server.ReplicaManager)
      [2020-01-21 17:10:40,495] ERROR Uncaught exception in scheduled task 'kafka-log-retention' (kafka.utils.KafkaScheduler)
      org.apache.kafka.common.errors.KafkaStorageException: Error while deleting segments for test1-3 in dir C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka
      Caused by: java.nio.file.FileSystemException: C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex -> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex.deleted: The process cannot access the file because it is being used by another process.

      at java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
      at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
      at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:395)
      at java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292)
      at java.base/java.nio.file.Files.move(Files.java:1425)
      at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:795)
      at kafka.log.AbstractIndex.renameTo(AbstractIndex.scala:209)
      at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:497)
      at kafka.log.Log.$anonfun$deleteSegmentFiles$1(Log.scala:2206)
      at kafka.log.Log.$anonfun$deleteSegmentFiles$1$adapted(Log.scala:2206)
      at scala.collection.immutable.List.foreach(List.scala:305)
      at kafka.log.Log.deleteSegmentFiles(Log.scala:2206)
      at kafka.log.Log.removeAndDeleteSegments(Log.scala:2191)
      at kafka.log.Log.$anonfun$deleteSegments$2(Log.scala:1700)
      at scala.runtime.java8.JFunction0$mcI$sp.apply(JFunction0$mcI$sp.scala:17)
      at kafka.log.Log.maybeHandleIOException(Log.scala:2316)
      at kafka.log.Log.deleteSegments(Log.scala:1691)
      at kafka.log.Log.deleteOldSegments(Log.scala:1686)
      at kafka.log.Log.deleteRetentionMsBreachedSegments(Log.scala:1763)
      at kafka.log.Log.deleteOldSegments(Log.scala:1753)
      at kafka.log.LogManager.$anonfun$cleanupLogs$3(LogManager.scala:982)
      at kafka.log.LogManager.$anonfun$cleanupLogs$3$adapted(LogManager.scala:979)
      at scala.collection.immutable.List.foreach(List.scala:305)
      at kafka.log.LogManager.cleanupLogs(LogManager.scala:979)
      at kafka.log.LogManager.$anonfun$startup$2(LogManager.scala:403)
      at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:116)
      at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:65)
      at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
      at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
      at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
      at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      at java.base/java.lang.Thread.run(Thread.java:830)
      Suppressed: java.nio.file.FileSystemException: C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex -> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\00000000000000000000.timeindex.deleted: The process cannot access the file because it is being used by another process.

      at java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
      at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
      at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:309)
      at java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292)
      at java.base/java.nio.file.Files.move(Files.java:1425)
      at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:792)
      ... 27 more
      [2020-01-21 17:10:40,505] INFO [ReplicaFetcherManager on broker 0] Removed fetcher for partitions HashSet(test1-3, test1-7, test-0, test1-0, test1-1, test1-5, test1-2, test1-8, test1-4, test1-9, test1-6) (kafka.server.ReplicaFetcherManager)
      [2020-01-21 17:10:40,507] INFO [ReplicaAlterLogDirsManager on broker 0] Removed fetcher for partitions HashSet(test1-3, test1-7, test-0, test1-0, test1-1, test1-5, test1-2, test1-8, test1-4, test1-9, test1-6) (kafka.server.ReplicaAlterLogDirsManager)
      [2020-01-21 17:10:40,522] INFO [ReplicaManager broker=0] Broker 0 stopped fetcher for partitions test1-3,test1-7,test-0,test1-0,test1-1,test1-5,test1-2,test1-8,test1-4,test1-9,test1-6 and stopped moving logs for partitions because they are in the failed log directory C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka. (kafka.server.ReplicaManager)
      [2020-01-21 17:10:40,523] INFO Stopping serving logs in dir C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka (kafka.log.LogManager)
      [2020-01-21 17:10:40,526] ERROR Shutdown broker because all log dirs in C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka have failed (kafka.log.LogManager)
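
      The trace above shows two failed Files.move calls inside Utils.atomicMoveWithFallback (Utils.java:792 and Utils.java:795), with the first failure attached to the second as a suppressed exception. The following is a minimal sketch of that move-then-fall-back pattern, not Kafka's actual source: the class name MoveSketch and the method name moveWithFallback are hypothetical, and the choice of REPLACE_EXISTING for the second attempt is an assumption. On Windows, a rename like this can be refused while another handle or mapping still holds the file, which would be consistent with the "being used by another process" message, although the report itself does not confirm the cause.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class MoveSketch {

    // Hypothetical helper: try an atomic rename first, then fall back to a
    // plain move. If both attempts fail, the fallback failure is thrown and
    // the atomic failure is attached as a suppressed exception, which is the
    // shape seen in the stack trace ("Suppressed: ...").
    public static void moveWithFallback(Path source, Path target) throws IOException {
        try {
            Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException atomicFailure) {
            try {
                // Assumption: the second attempt is a non-atomic move that
                // may overwrite an existing target.
                Files.move(source, target, StandardCopyOption.REPLACE_EXISTING);
            } catch (IOException fallbackFailure) {
                fallbackFailure.addSuppressed(atomicFailure);
                throw fallbackFailure;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate marking an index file for deletion by renaming it with a
        // ".deleted" suffix inside a scratch directory.
        Path dir = Files.createTempDirectory("segment-demo");
        Path index = Files.createFile(dir.resolve("00000000000000000000.timeindex"));
        Path deleted = dir.resolve("00000000000000000000.timeindex.deleted");

        moveWithFallback(index, deleted);
        System.out.println("renamed: " + Files.exists(deleted));
    }
}
```

      In this sketch no handle is open on the file when it is renamed, so the move is expected to succeed. The failure in the report occurs when the rename is attempted while the file is still in use.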

        Attachments

        1. kafka_windows_crash_by_delete_topic_and_Partition_migration (19 kB, Wenbing Shen)
        2. logs.zip (29 kB, hirik)
        3. Windows_crash_fix.patch (38 kB, hirik)

              People

              • Assignee: Unassigned
              • Reporter: hirik
              • Votes: 2
              • Watchers: 13
