Kafka / KAFKA-15490

Invalid path provided to the log failure channel upon I/O error when writing broker metadata checkpoint

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.4.0, 3.4.1, 3.5.1, 3.6.1
    • Fix Version/s: 3.6.2
    • Component/s: core
    • Labels: None

    Description

      There is a small bug/typo in the handling of I/O errors when writing the broker metadata checkpoint in KafkaServer. The path passed to the log dir failure channel is the full path of the checkpoint file, whereas only the log directory is expected.

      case e: IOException =>
         val dirPath = checkpoint.file.getAbsolutePath
         logDirFailureChannel.maybeAddOfflineLogDir(dirPath, s"Error while writing meta.properties to $dirPath", e)

      As a result, after an IOException is captured and enqueued in the log dir failure channel (<logDir> is to be replaced with the actual path of the log directory):

      [2023-09-22 17:07:32,052] ERROR Error while writing meta.properties to <logDir>/meta.properties (kafka.server.LogDirFailureChannel) java.io.IOException

      The log dir failure handler cannot look up the log directory:

      [2023-09-22 17:07:32,053] ERROR [LogDirFailureHandler]: Error due to (kafka.server.ReplicaManager$LogDirFailureHandler) org.apache.kafka.common.errors.LogDirNotFoundException: Log dir <logDir>/meta.properties is not found in the config.
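
      The failed lookup can be reproduced with a minimal sketch. It assumes the handler resolves the reported path by exact match against the configured log directories. `LogDirLookup`, `LogDirNotFound` and `resolve` are hypothetical names used for illustration only; they are not Kafka's actual classes.

```scala
import java.io.File

object LogDirLookup {
  // Hypothetical stand-in for Kafka's LogDirNotFoundException.
  final case class LogDirNotFound(msg: String) extends RuntimeException(msg)

  // Assumed simplification: a reported path is valid only if it exactly
  // matches one of the configured log directories.
  def resolve(configuredDirs: Set[String], reported: String): String = {
    val abs = new File(reported).getAbsolutePath
    if (configuredDirs.contains(abs)) abs
    else throw LogDirNotFound(s"Log dir $abs is not found in the config.")
  }
}
```

      Under that assumption, passing the directory itself resolves, while passing the path of the meta.properties file inside it throws, which matches the error above.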

      An immediate fix is to use the logDir passed to the checkpointing method instead of the path of the metadata file.
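
      A minimal sketch of that fix follows. It is not the actual KafkaServer code: `FakeFailureChannel` and `CheckpointFix` are hypothetical stand-ins, and the `write` function is injected only so that an I/O failure can be simulated.

```scala
import java.io.{File, IOException}
import scala.collection.mutable

// Hypothetical stand-in for the log dir failure channel: it only records
// the path it was given so the behaviour can be inspected.
class FakeFailureChannel {
  val offlineDirs: mutable.Buffer[String] = mutable.Buffer.empty[String]
  def maybeAddOfflineLogDir(logDir: String, msg: => String, e: IOException): Unit =
    offlineDirs += logDir
}

object CheckpointFix {
  // `write` is an assumption of this sketch, used to simulate the I/O error.
  def checkpoint(logDir: String, channel: FakeFailureChannel)(write: File => Unit): Unit = {
    val file = new File(logDir, "meta.properties")
    try write(file)
    catch {
      case e: IOException =>
        // Report the log directory itself, not file.getAbsolutePath.
        channel.maybeAddOfflineLogDir(
          logDir, s"Error while writing meta.properties to $logDir", e)
    }
  }
}
```

      With this change the failure channel receives the log directory, which is what the failure handler expects to find in the configuration.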

      For brokers with only one log directory, this bug prevents the broker from shutting down as expected.

      The LogDirNotFoundException kills the log dir failure handler thread, so subsequent IOExceptions are not handled and the broker never stops.

      [2024-02-27 02:13:13,564] INFO [LogDirFailureHandler]: Stopped (kafka.server.ReplicaManager$LogDirFailureHandler)

      Another consideration here is whether the LogDirNotFoundException should terminate the log dir failure handler thread.
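
      One possible answer to that question, shown only as a sketch and not as what Kafka does, is to catch non-fatal exceptions per reported path so that a single bad entry does not stop the handler. `ResilientHandler` and `drain` are hypothetical names; a real handler thread would block on the queue instead of draining it once.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.util.control.NonFatal

object ResilientHandler {
  // Processes every queued path and returns the paths whose handling threw.
  // A failure on one path does not prevent later paths from being handled.
  def drain(queue: LinkedBlockingQueue[String])(handle: String => Unit): List[String] = {
    var failed = List.empty[String]
    var next = queue.poll() // null when the queue is empty
    while (next != null) {
      try handle(next)
      catch { case NonFatal(_) => failed ::= next }
      next = queue.poll()
    }
    failed.reverse
  }
}
```

      The trade-off is that an invalid path is only recorded as a failure rather than surfacing as a fatal error, so whether this is desirable depends on how such failures should be reported.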

            People

              Assignee: Divij Vaidya (divijvaidya)
              Reporter: Alexandre Dupriez (adupriez)