Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      It would be nice to have Kafka brokers auto-assign node ids rather than requiring them in configuration. Configuring the id is irritating because (1) you have to generate a custom config for each broker and (2) even though it is in configuration, changing the node id can cause all kinds of bad things to happen.

        Activity

        Jay Kreps added a comment -

        I think the right way to do this is to have a sequence in zookeeper that atomically increments and use this for id generation. On startup a node that has no id can generate one for itself and store it.
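A minimal sketch of the id-generation idea, under stated assumptions: `IdSequence` is a hypothetical in-memory stand-in for the atomically incrementing sequence (a real broker would keep the counter in ZooKeeper, not in process memory), and `ensure_node_id` is an illustrative name for the "generate one only if the node has none" step.

```python
import threading


class IdSequence:
    """In-memory stand-in for an atomically incrementing sequence.

    Hypothetical helper: the proposal keeps this counter in ZooKeeper.
    """

    def __init__(self, start=0):
        self._next = start
        self._lock = threading.Lock()

    def next_id(self):
        # The lock makes read-and-increment a single atomic step,
        # so two callers can never receive the same value.
        with self._lock:
            value = self._next
            self._next += 1
            return value


def ensure_node_id(stored_id, sequence):
    """Return the stored id if the node already has one, else draw a new one."""
    if stored_id is not None:
        return stored_id
    return sequence.next_id()
```

The only property the sketch relies on is that no two callers ever see the same value; any shared store that offers an atomic increment would serve.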

        One tricky bit is that this node id needs to be stored with the data, but we now partition data over multiple disks and hope to be able to survive the destruction of any of them. Which disk should we store the node id on? I would recommend we store it on all of them: if it is missing on some disks we will add it there, and if the id is inconsistent between disks we will error out (this should never happen).

        I would recommend adding a properties file named "meta" in every data directory containing the "id=x" value. We can extend this later with more permanent values. For example, I think it would be nice to add a data format version to help with in-place data upgrades.
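One way the "meta" file could be read and written, as a rough sketch. The file name and the `id` key come from the comment; the function names, the `version` key used in the usage note, and the write-to-temp-then-rename step are assumptions added for illustration.

```python
import os
import tempfile

META_FILE = "meta"  # file name suggested in the comment


def write_meta(data_dir, props):
    """Write key=value lines to <data_dir>/meta.

    Assumption: the file is written to a temporary name and then renamed,
    so a crash mid-write does not leave a half-written meta file.
    """
    path = os.path.join(data_dir, META_FILE)
    fd, tmp = tempfile.mkstemp(dir=data_dir)
    with os.fdopen(fd, "w") as f:
        for key, value in sorted(props.items()):
            f.write(f"{key}={value}\n")
    os.replace(tmp, path)


def read_meta(data_dir):
    """Return the properties in <data_dir>/meta, or None if the file is absent."""
    path = os.path.join(data_dir, META_FILE)
    if not os.path.exists(path):
        return None
    props = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#"):
                key, _, value = line.partition("=")
                props[key.strip()] = value.strip()
    return props
```

For example, `write_meta(d, {"id": 3, "version": 1})` followed by `read_meta(d)` yields `{"id": "3", "version": "1"}`; values come back as strings, so the caller converts the id to an integer.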

        On startup the broker would check this value for consistency across directories. If it is not present in any directory it would auto-generate a node id and persist that for future use.
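The startup check could look roughly like the sketch below. It is pure logic over the ids already read from each data directory; `resolve_node_id`, its arguments, and the choice of `RuntimeError` are hypothetical names picked for illustration.

```python
def resolve_node_id(ids_by_dir, generate_id):
    """Decide the broker's node id at startup.

    ids_by_dir maps each data directory to the id found there, or None
    if that directory has no id yet. generate_id is called only when no
    directory has an id.

    Returns (node_id, dirs_missing_id); the caller persists node_id to
    every directory in dirs_missing_id.
    """
    distinct = {i for i in ids_by_dir.values() if i is not None}
    if len(distinct) > 1:
        # Disks disagree: refuse to start rather than guess.
        raise RuntimeError(f"inconsistent node ids across data directories: {ids_by_dir}")
    node_id = distinct.pop() if distinct else generate_id()
    missing = sorted(d for d, i in ids_by_dir.items() if i is None)
    return node_id, missing
```

With `{"/d1": 4, "/d2": None}` the function returns `(4, ["/d2"])`, which covers the case of a replaced disk: the surviving directories supply the id and the new one is backfilled.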

        For compatibility we would retain the current id configuration value--if it is present we will use it and ensure the id sequence is larger than this value.
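The compatibility rule can be sketched as a small pure function. `choose_node_id` and its parameters are hypothetical; in a real deployment the sequence update would have to be applied atomically in the shared store, which this sketch does not attempt.

```python
def choose_node_id(configured_id, next_in_sequence):
    """Return (node_id, updated_next_in_sequence).

    configured_id is the id from the broker config, or None if unset.
    next_in_sequence is the next value the id sequence would hand out.
    """
    if configured_id is None:
        # No manual id: take the next value from the sequence.
        return next_in_sequence, next_in_sequence + 1
    # Manual id wins, and the sequence is pushed past it so that an
    # auto-generated id can never collide with a hand-assigned one.
    return configured_id, max(next_in_sequence, configured_id + 1)
```

For instance, a broker configured with id 10 while the sequence stands at 3 keeps id 10 and moves the sequence to 11.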


          People

          • Assignee: Unassigned
          • Reporter: Jay Kreps
          • Votes: 0
          • Watchers: 1
