Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core
    • Labels: None

      Description

      Currently the LogManager is responsible for deleting and cleaning up messages based on time or size. It would be nice to enhance the LogManager to not only perform the cleanup but also back up the messages eligible for deletion to a custom location (e.g. HDFS). This would provide a backup plan in case a consumer cannot keep up with the messages and data is lost due to log rolling.

      Currently LogManager is sealed, so no one can extend it; additionally, we'd need a way to inject a custom LogManager into KafkaServer.

        Activity

        Neha Narkhede added a comment -

        LogManager is an internal component of a Kafka server that manages a server's log storage. It may not be a good idea to make it pluggable. For the purposes of log backup into HDFS, it makes sense to use something like Camus: https://github.com/linkedin/camus

        Jay Kreps added a comment -

        Yeah, I would also be hesitant to do this. The reason is that each "injectable" interface is something we need to plan compatibility around, and currently we don't do this for internal interfaces.

        With replication it is possible to hang onto data for a long time on the Kafka servers and have it replicated, so it is likely to survive even server failures. If you want still more assurance, since Kafka is easy to consume from, it's easy enough to build a consumer that backs up the data elsewhere. If you want something more low-tech, I think you could just have a shell script or a background Java thread that scps the Kafka files, without needing to inject it into the actual log roll logic.

        Micah Whitacre added a comment -

        Thanks for the feedback. The fact that LogManager was sealed helped indicate this was not a common integration point. What initially inspired this approach was the concern of losing messages when a consumer is slow or down for a period and the log manager removes messages to reclaim space.

        Is there, by chance, a way of alerting/tracking when a log roll is about to occur and not all consumers have an offset higher than the offsets to be deleted?

        Jay Kreps added a comment -

        My recommendation is just to increase the retention. We retain everything for 7 days. We alert off the consumer lag; typically these alerts trigger if the application falls 20 minutes behind, so we would not wait for 7 days.


          People

          • Assignee: Unassigned
          • Reporter: Micah Whitacre
          • Votes: 0
          • Watchers: 4

            Dates

            • Created:
            • Updated:
            • Resolved:
