KAFKA-742

Existing directories under the Kafka data directory without any data cause process to not start

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.8.3
    • Component/s: config
    • Labels:
      None

      Description

      I incorrectly set up the configuration file so that the metrics went to /var/kafka/metrics while the logs were in /var/kafka. On startup I received the following error, and then the daemon exited:

      30 [main] INFO kafka.log.LogManager - [Log Manager on Broker 0] Loading log 'metrics'
      32 [main] FATAL kafka.server.KafkaServerStartable - Fatal error during KafkaServerStable startup. Prepare to shutdown
      java.lang.StringIndexOutOfBoundsException: String index out of range: -1
      at java.lang.String.substring(String.java:1937)
      at kafka.log.LogManager.kafka$log$LogManager$$parseTopicPartitionName(LogManager.scala:335)
      at kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$3.apply(LogManager.scala:112)
      at kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$3.apply(LogManager.scala:109)
      at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
      at scala.collection.mutable.ArrayOps.foreach(ArrayOps.scala:34)
      at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:109)
      at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:101)
      at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
      at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:32)
      at kafka.log.LogManager.loadLogs(LogManager.scala:101)
      at kafka.log.LogManager.<init>(LogManager.scala:62)
      at kafka.server.KafkaServer.startup(KafkaServer.scala:59)
      at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
      at kafka.Kafka$.main(Kafka.scala:46)
      at kafka.Kafka.main(Kafka.scala)
      34 [main] INFO kafka.server.KafkaServer - [Kafka Server 0], shutting down

      This was on a brand new cluster so no data or metrics logs existed yet.

      Moving the metrics to their own directory (not a child of the logs) allowed the daemon to start.

      Took a few minutes to figure out what was wrong.
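
The trace ends in a `substring` call inside `parseTopicPartitionName` with index -1. A plausible reading, not checked against the 0.8.0 source here, is that the directory name is split at its last `-`, and a name such as `metrics` contains none. The Java sketch below reproduces that failure mode in isolation; `NaiveDirNameParser` and `naiveTopicOf` are hypothetical names used only for illustration and are not Kafka code.

```java
// Hypothetical stand-in for the parsing step; not Kafka's actual code.
final class NaiveDirNameParser {

    // Assumes the name has the form "<topic>-<partition>" and splits at the last '-'.
    static String naiveTopicOf(String dirName) {
        int index = dirName.lastIndexOf('-'); // -1 when the name contains no '-'
        return dirName.substring(0, index);   // throws StringIndexOutOfBoundsException for -1
    }

    public static void main(String[] args) {
        System.out.println(naiveTopicOf("mytopic-0"));
        try {
            naiveTopicOf("metrics");
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("failed on 'metrics': " + e.getMessage());
        }
    }
}
```

Any directory name without a dash would fail the same way under this assumption, regardless of what the directory contains.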

      1. KAFKA-742.patch
        4 kB
        Ashish K Singh
      2. KAFKA-742.1.patch
        5 kB
        Ashish K Singh

        Activity

        Jay Kreps added a comment -

        Getting an error is not really bad since we assume we have control of everything under the log directory (it is obviously hard for us to distinguish a log directory from a non-log directory and silently ignoring could be worse than an error). So I think there are two problems here:
        1. Are we enabling metrics logging by default? Are we creating the metrics dir even if metrics logging is not enabled? This needs to be sanity checked...
        2. If there are bogus directories under the log directory I think the right thing to do is to give a better error message (something like "Found directory /x/y/z, 'z' is not in the form topic-partition").

        If you agree with those fixes I will take this on.

        Chris Curtin added a comment -

        Thanks Jay, a better error is a good idea for this.

        Ashish K Singh added a comment -

        Jay Kreps, Chris Curtin, I would like to take a stab at this. Assigning it to myself.

        Ashish K Singh added a comment -

        Jay Kreps, now that I have actually started to work on this, I re-read your comment above and realized you mentioned that you intended to work on it. I missed that line when I assigned the JIRA to myself. My apologies. Please feel free to take it on and assign it to yourself. However, if you are not planning to work on this, let me know and I can work on it. My apologies for the confusion.

        Jay Kreps added a comment -

        No, I said that but then never did any work. Definitely take it!

        Ashish K Singh added a comment -

        Jay Kreps, OK, I am on it then.

        Ashish K Singh added a comment - edited

        Created RB: https://reviews.apache.org/r/29030/

        > Are we enabling metrics logging by default? Are we creating the metrics dir even if metrics logging is not enabled? This needs to be sanity checked...

        The metrics dir is created by a metrics reporter during its init, which is called by KafkaMetricsReporter only if the reporter is registered.

        > If there are bogus directories under the log directory I think the right thing to do is to give a better error message (something like "Found directory /x/y/z, 'z' is not in the form topic-partition")

        parseTopicPartitionName only gets the log directory name and not the path. I can change it to receive the directory path rather than just the name. However, I think just stating that dir "some_dir" is not in the form topic-partition should be good enough. Let me know if you think otherwise. Another possible solution is to catch the exception thrown by parseTopicPartitionName in the caller, which has the path for the dir, and throw an exception with the full path of the dir.

        Gwen Shapira added a comment -

        I think we need to print the full path to the directory we can't parse - I've met at least two customers who installed Kafka through a distribution and had no idea where their Kafka log directory is.

        Also, I'd add something to the error message along the lines of "If a directory does not contain Kafka topic data it should not exist in Kafka's log directory. Please move it elsewhere."
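
The behaviour discussed in this thread (still fail fast, but name the full path and tell the operator what to do) could look roughly like the Java sketch below. It is an illustration only and not the code in the attached patches; `TopicPartitionDirs`, `parse`, `TopicPartition`, and the exact message wording are assumptions.

```java
import java.io.File;

// Illustrative only: not the code from KAFKA-742.patch.
final class TopicPartitionDirs {

    // Simple holder for the parsed result.
    static final class TopicPartition {
        final String topic;
        final int partition;

        TopicPartition(String topic, int partition) {
            this.topic = topic;
            this.partition = partition;
        }
    }

    // Parses a directory expected to be named "<topic>-<partition>".
    // On failure, the exception carries the full path and a hint for the operator.
    static TopicPartition parse(File dir) {
        String name = dir.getName();
        int index = name.lastIndexOf('-');
        // Require a non-empty topic before the dash and something after it.
        if (index <= 0 || index == name.length() - 1) {
            throw invalid(dir);
        }
        try {
            int partition = Integer.parseInt(name.substring(index + 1));
            return new TopicPartition(name.substring(0, index), partition);
        } catch (NumberFormatException e) {
            throw invalid(dir);
        }
    }

    private static IllegalArgumentException invalid(File dir) {
        return new IllegalArgumentException(
            "Found directory " + dir.getAbsolutePath() + ", '" + dir.getName()
            + "' is not in the form of topic-partition. If a directory does not contain"
            + " Kafka topic data it should not exist in Kafka's log directory."
            + " Please move it elsewhere.");
    }
}
```

Taking a `File` rather than a bare name is one way to have the full path available where the error is raised; catching a name-only exception in the caller and rethrowing with the path would give the same message.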

        Ashish K Singh added a comment -

        Makes sense. Updated patch.

        Neha Narkhede added a comment -

        Thanks for the patch, Ashish K Singh. Pushed to trunk.

        Andrew Olson added a comment -

        Just a side note: we've seen this problem on Red Hat Linux when the OS creates a special directory named "lost+found" in a mounted filesystem.
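
Stray entries such as `lost+found` can be spotted before the broker is started by listing the subdirectories of the data directory that do not look like topic-partition directories. The Java sketch below is a hypothetical pre-flight helper and not part of Kafka; `LogDirCheck`, `strayDirectories`, the `.+-\d+` pattern, and the default path `/var/kafka` (taken from the report above) are assumptions for illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical pre-flight check; not part of Kafka.
final class LogDirCheck {

    // Assumed shape of a valid directory name: "<topic>-<partition number>".
    private static final Pattern TOPIC_PARTITION = Pattern.compile(".+-\\d+");

    // Returns the subdirectories of logDir whose names do not match the assumed form.
    static List<File> strayDirectories(File logDir) {
        List<File> stray = new ArrayList<>();
        File[] children = logDir.listFiles(); // null if logDir is not a readable directory
        if (children == null) {
            return stray;
        }
        for (File child : children) {
            if (child.isDirectory() && !TOPIC_PARTITION.matcher(child.getName()).matches()) {
                stray.add(child);
            }
        }
        return stray;
    }

    public static void main(String[] args) {
        File logDir = new File(args.length > 0 ? args[0] : "/var/kafka");
        for (File dir : strayDirectories(logDir)) {
            System.out.println("Not a topic-partition directory: " + dir.getAbsolutePath());
        }
    }
}
```

Plain files are ignored here on the assumption that only directories are treated as logs at startup.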


          People

          • Assignee: Ashish K Singh
          • Reporter: Chris Curtin
          • Votes: 0
          • Watchers: 7
