Type: New Feature
Status: Patch Available
Affects Version/s: None
Fix Version/s: None
HDFS Scalability-v2.pdf describes where HDFS does well, where it faces scaling challenges, and how to address those challenges. Scaling HDFS requires scaling both the namespace layer and the block layer. This jira provides a new block layer, Hadoop Distributed Storage Layer (HDSL), that scales the block layer by grouping blocks into containers, thereby shrinking the block-to-location map and reducing both the number of block reports and the cost of processing them.
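To illustrate why grouping blocks into containers shrinks the location map, here is a minimal sketch. It is not HDSL code: the function name, the fixed container size, and the assignment of blocks to containers by integer division are all assumptions made for this example only.

```python
from collections import defaultdict


def location_map_sizes(num_blocks, blocks_per_container):
    """Compare how many location entries must be tracked.

    Hypothetical model: every block is assigned to a container by integer
    division of its id. Returns (entries when tracking each block,
    entries when tracking each container).
    """
    containers = defaultdict(list)
    for block_id in range(num_blocks):
        container_id = block_id // blocks_per_container
        containers[container_id].append(block_id)
    # Tracking per block needs one entry per block; tracking per
    # container needs only one entry per container.
    return num_blocks, len(containers)


per_block, per_container = location_map_sizes(1_000_000, 1_000)
print(per_block, per_container)  # 1000000 1000
```

In this toy model, the map of locations drops from one entry per block to one entry per container, and a report that lists containers instead of blocks shrinks by the same factor.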
A scalable namespace can be put on top of this scalable block layer:
- HDFS-10419 describes how the existing NN can be modified to use the new block layer.
- HDFS-13074 also provides, as an interim step, a scalable flat Key-Value namespace on top of the new block layer; while it does not provide the HDFS API, it does support the Hadoop FS APIs (Hadoop FileSystem, FileContext).
This jira proposes to add object store capabilities into HDFS.
As part of the federation work (HDFS-1052) we separated block storage as a generic storage layer. Using the Block Pool abstraction, new kinds of namespaces can be built on top of the storage layer, i.e. the datanodes.
In this jira I will explore building an object store using the datanode storage, but independent of namespace metadata.
I will soon update with a detailed design document.