Kafka / KAFKA-281

support multiple root log directories

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: core
    • Labels: None

      Description

      Currently, the log layout is {log.dir}/topicname-partitionid and one can only specify 1 {log.dir}. This limits the # of topics we can have per broker. We can potentially support multiple directories for {log.dir} and just assign topics using hashing or round-robin.
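
To make the two assignment strategies in the description concrete, here is a minimal Python sketch. It is illustrative only and is not taken from the Kafka code base: the names `pick_dir_by_hash`, `RoundRobinAssigner`, and `log_path`, and the choice of CRC32 as the hash, are assumptions made for this example.

```python
import itertools
import zlib


def pick_dir_by_hash(log_dirs, topic, partition):
    """Choose a root directory from a stable hash of the log name (hypothetical helper)."""
    name = f"{topic}-{partition}"
    # CRC32 gives the same value in every process, unlike Python's built-in
    # hash() for strings, so the same log always maps to the same root.
    return log_dirs[zlib.crc32(name.encode("utf-8")) % len(log_dirs)]


class RoundRobinAssigner:
    """Hand out root directories in turn (hypothetical helper)."""

    def __init__(self, log_dirs):
        self._cycle = itertools.cycle(log_dirs)

    def pick_dir(self, topic, partition):
        # The choice depends only on call order, not on the log name, so the
        # result has to be remembered or rediscovered later.
        return next(self._cycle)


def log_path(root, topic, partition):
    """Build the {log.dir}/topicname-partitionid path from the description."""
    return f"{root}/{topic}-{partition}"
```

With hashing, the location can be recomputed from the name alone, but it changes for most logs if the list of directories changes. With round-robin, the directories fill evenly, but the location cannot be derived from the name.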

          Activity

          tgautier Taylor Gautier added a comment -

          I would recommend against round-robin, since it would require some metadata that keeps track of which directory each log lives in. Hashing is easy, but the downside is that the layout is not trivially discoverable for a person browsing the directory structure from a command-line shell.
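
One possible way around the metadata concern, sketched here as an assumption and not as what the project actually does: rebuild the mapping at startup by listing every root directory. The function name `discover_logs` is hypothetical.

```python
import os


def discover_logs(log_dirs):
    """Map each 'topicname-partitionid' directory name to the root that holds it."""
    locations = {}
    for root in log_dirs:
        for entry in sorted(os.listdir(root)):
            # Only subdirectories are treated as logs; stray files are ignored.
            if not os.path.isdir(os.path.join(root, entry)):
                continue
            # The same log appearing under two roots is treated as an error.
            if entry in locations:
                raise ValueError(
                    f"{entry} found in both {locations[entry]} and {root}"
                )
            locations[entry] = root
    return locations
```

This trades stored metadata for a directory scan, so its cost grows with the number of logs, and it does not help a person browsing with a shell, who still has to look in each root.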

          jkreps Jay Kreps added a comment -

          Is this to work around the max subdirectory limits some filesystems have (e.g. I think ext4 has a limit of 64k subdirectories per directory)?

          The other advantage of this is that you can actually get rid of RAID and just run with JBOD, using a separate mount point for each drive and having a data directory per drive (à la Hadoop). We wouldn't do this now, but if we had replication this would be a big win. The overhead of RAID is usually something like a 20-30% perf hit, plus the additional disk space it takes up. In this setup you would be depending on replication to handle disk failures. The trade-off is that a single drive failure would kill a machine. In practice, due to the RAID resync perf hit, we seem to have this problem already.

          tgautier Taylor Gautier added a comment -

          Yes. There's not just a hard limit; there is also a practical limit. We've found that on ext3 that limit is around 20k. The limit has to do with some of the low-level POSIX APIs and how they are implemented. I saw a post some time ago about how to make this better, but for the time being it's generally inefficient in most filesystems to have large numbers of files/directories in a single directory.

          Also, as you point out, it makes it next to impossible to easily add additional storage, since there is basically only one mount point.

          brugidou Maxime Brugidou added a comment -

          I guess this was done in 0.8 as part of KAFKA-188.

          junrao Jun Rao added a comment -

          Fixed in KAFKA-188.


            People

            • Assignee: Unassigned
            • Reporter: junrao Jun Rao
            • Votes: 1
            • Watchers: 3
