Description
[GSoC] RocketMQ TieredStore Integration with HDFS
GitHub issue: https://github.com/apache/rocketmq/issues/6282
Apache RocketMQ and HDFS
- Apache RocketMQ is a cloud-native messaging and streaming platform that makes it simple to build event-driven applications.
- Hadoop Distributed File System (HDFS) is a distributed file system designed to store and manage large data sets across multiple servers or clusters. HDFS provides a reliable, scalable, and fault-tolerant platform for storing data and making it available to a variety of applications running on the Hadoop cluster.
Background
High-speed storage media, such as solid-state drives (SSDs), are typically more expensive than traditional hard disk drives (HDDs). To minimize storage costs, the local data disk of a RocketMQ broker is often limited in size. HDFS can store large amounts of data at a lower cost, and it is better suited to sequential reads and writes than to random access. To preserve message data over long periods and to facilitate message export, the RocketMQ project previously introduced a tiered storage plugin. The next step is to implement a storage plugin that saves data on HDFS.
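As a rough illustration of the tiering idea described above, the following minimal Java sketch keeps recent messages in a small local buffer and offloads older data to an append-only cold tier. All names here (`ColdStore`, `InMemoryColdStore`, `TieredLog`) are hypothetical and are not part of the RocketMQ tiered storage interface; the cold tier is an in-memory stand-in so that the sketch runs without a Hadoop cluster.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical interface (NOT the RocketMQ tiered storage API): an append-only
// cold tier, matching the sequential access pattern that suits HDFS.
interface ColdStore {
    long append(byte[] data);             // returns the offset where data begins
    byte[] read(long offset, int length); // positional read of stored bytes
    long size();                          // total bytes stored so far
}

// In-memory stand-in for an HDFS-backed implementation (assumption for the sketch).
class InMemoryColdStore implements ColdStore {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    public synchronized long append(byte[] data) {
        long offset = out.size();
        out.write(data, 0, data.length);
        return offset;
    }

    public synchronized byte[] read(long offset, int length) {
        byte[] all = out.toByteArray();
        return Arrays.copyOfRange(all, (int) offset, (int) offset + length);
    }

    public synchronized long size() {
        return out.size();
    }
}

// Hypothetical two-tier log: a size-limited local buffer in front of a cold tier.
class TieredLog {
    private final ColdStore cold;
    private final int localCapacity;
    private final ByteArrayOutputStream local = new ByteArrayOutputStream();

    TieredLog(ColdStore cold, int localCapacity) {
        this.cold = cold;
        this.localCapacity = localCapacity;
    }

    // Appends one message and returns its logical offset across both tiers.
    // If the message would overflow the local buffer, the whole buffer is
    // offloaded first, so a message never spans the two tiers.
    synchronized long append(byte[] message) {
        if (local.size() > 0 && local.size() + message.length > localCapacity) {
            cold.append(local.toByteArray());
            local.reset();
        }
        long offset = cold.size() + local.size();
        local.write(message, 0, message.length);
        return offset;
    }

    // Reads a message by logical offset from whichever tier currently holds it.
    synchronized byte[] read(long offset, int length) {
        long coldSize = cold.size();
        if (offset + length <= coldSize) {
            return cold.read(offset, length);
        }
        if (offset >= coldSize) {
            int start = (int) (offset - coldSize);
            return Arrays.copyOfRange(local.toByteArray(), start, start + length);
        }
        throw new IllegalArgumentException("range spans both tiers");
    }
}
```

In a real plugin, the cold tier would presumably be backed by the Hadoop `FileSystem` API rather than memory, and the actual extension points are those defined by the tiered storage design (RIP-57), not the interface sketched here.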
Relevant Skills
- Interest in messaging middleware and distributed storage systems
- Java development skills
- A good understanding of the RocketMQ and HDFS models
Above all, the most important skill is motivation and readiness to learn during the project!
Tasks
- understand the basic concepts and principles in distributed systems
- provide related design documents
- develop a tiered storage plugin that uses HDFS as the backend to store RocketMQ message data
- write effective unit test code
- *suggest improvements to the tiered storage interface
- *anything else that comes to mind; further ideas are always welcome
Learning Material
- RocketMQ home page (https://rocketmq.apache.org), GitHub: https://github.com/apache/rocketmq
- RocketMQ Tiered Storage Design (https://github.com/apache/rocketmq/wiki/RIP-57-Tiered-storage-for-RocketMQ)
- HDFS User Guide (https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html)
Name and contact information
- Mentor: Zhimin Li, Apache RocketMQ Committer, lizhimin@apache.org
- Mailing List: dev@rocketmq.apache.org
- Website: https://rocketmq.apache.org/ and https://hadoop.apache.org/