  1. Hadoop Common
  2. HADOOP-13633

Introduce Apache Kafka as a Service into Hadoop

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      In HDFS-7343 we want to develop a comprehensive storage management solution, originating from community discussions, that allows convenient, intelligent and effective use of HDFS facilities such as erasure coding, HDFS cache and the HSM offering. It would be based on insights from events and data collected from namenodes, datanodes, frameworks and applications via a pub-sub messaging system. In HDFS-8940 it was discussed that the proposed large-scale inotify feature would be better implemented on top of Kafka, to allow thousands of consumers or inotify clients.

      Apache Kafka is a distributed messaging system that aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds, and it is currently widely used for real-time stream processing. Considering the two important use cases above, we would like to propose introducing Kafka as a fundamental event pub-sub service in the Hadoop platform. Like the FileSystem offering, we would like to provide a MessagingSystem in Hadoop style that conforms to Hadoop security, backed by an internal Kafka cluster or an existing external one. The new service is intended to be convenient to use, and can be used to distribute and exchange various types of events across IO, storage and computation that are produced by Hadoop itself or by frameworks and applications on top of it. On this basis, valuable events can be analyzed in a centralized way so that meaningful applications and usages can be developed.
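
      To make the FileSystem analogy concrete, here is a minimal sketch of what such a pub-sub abstraction might look like. It is only an illustration and not the proposed design: the names MessagingSystem, InMemoryMessagingSystem, the get() factory, the "memory" scheme and the topic name are all hypothetical, and the in-memory backend merely stands in for a Kafka-backed implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical sketch: none of these types exist in Hadoop or Kafka.
public class MessagingSystemSketch {

    /** Hypothetical pub-sub abstraction, analogous in spirit to FileSystem. */
    public interface MessagingSystem {
        void publish(String topic, String event);
        void subscribe(String topic, Consumer<String> handler);
    }

    /** In-memory stand-in; a real backend would delegate to a Kafka cluster. */
    public static class InMemoryMessagingSystem implements MessagingSystem {
        private final Map<String, List<Consumer<String>>> subscribers =
            new ConcurrentHashMap<>();

        @Override
        public void publish(String topic, String event) {
            // Deliver the event synchronously to every handler on this topic.
            List<Consumer<String>> handlers =
                subscribers.getOrDefault(topic, Collections.emptyList());
            for (Consumer<String> handler : handlers) {
                handler.accept(event);
            }
        }

        @Override
        public void subscribe(String topic, Consumer<String> handler) {
            subscribers
                .computeIfAbsent(topic, t -> new CopyOnWriteArrayList<>())
                .add(handler);
        }
    }

    /** Factory keyed by a scheme, loosely mirroring how FileSystem picks a backend. */
    public static MessagingSystem get(String scheme) {
        if ("memory".equals(scheme)) {
            return new InMemoryMessagingSystem();
        }
        throw new IllegalArgumentException("No messaging backend for: " + scheme);
    }

    public static void main(String[] args) {
        MessagingSystem ms = get("memory");
        List<String> received = new ArrayList<>();
        ms.subscribe("hdfs.events", received::add);
        ms.publish("hdfs.events", "CREATE /tmp/example");
        System.out.println(received);
    }
}
```

      Callers in this sketch depend only on the interface, so a Kafka-backed implementation could be substituted behind the factory without changing producer or consumer code. Security integration is left out entirely.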

      The design document is in progress and will be submitted within a week. Feedback is very welcome. Thanks!

          Activity

            aw Allen Wittenauer added a comment -

            Be aware that 3.0.0-alpha1 introduced the hadoop-kafka module, underneath hadoop-tools. That should be taken into consideration when/if this gets introduced into the hadoop source tree. For example, it might be worthwhile to rename that one to hadoop-kafkametrics and this one to hadoop-kafkafs or make this feature bundled into that module or ....

            HuafengWang Huafeng Wang added a comment -

            Hi Allen, thanks for the reminder. We do know there is a hadoop-kafka module, which contains only one class. I think it would be better to integrate that Kafka metrics sink into the new hadoop-kafka module.

            HuafengWang Huafeng Wang added a comment -

            Here is the draft design document.
            Many thanks to zhz, drankye, rakeshr, umamaheswararao, hayabusa and zhouwei for their collaboration on this design.
            Any advice or comments on the design are appreciated.


            People

              Assignee: Huafeng Wang
              Reporter: Huafeng Wang
              Votes: 0
              Watchers: 31
