Details
- Type: New Feature
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
This jira splits Ozone's Key-Value namespace out of HDFS-7240, leaving that jira to focus on the block layer, the Hadoop Distributed Storage Layer (HDSL). HDFS-10419 focuses on the traditional hierarchical namespace/NN on top of HDFS, while this jira focuses on a flat Key-Value namespace called Ozone FS on top of HDSL.
owen.omalley suggested the split in HDFS-7240 in this comment.
Ozone provides two APIs:
- A KV API
- A Hadoop-compatible FS (Hadoop FileSystem and Hadoop FileContext), on top of the KV API.
Ozone FS serves the following purposes:
- It helps test the new storage layer (HDSL)
- It can be used directly by applications (such as Hive and Spark) that are ready for the cloud's native storage systems that use a KV namespace, such as S3 and Azure ADLS.
Ozone's namespace server scales well for the following reasons:
- It keeps only the working set of metadata in memory - this scales the Key-Value namespace.
- Ozone does not need to keep the equivalent of the block-map; instead, the corresponding container-to-location mapping resides in the SCM. This leaves additional free space for caching the metadata. This benefit is also available to an NN that adapts to the new block-storage container layer, as explained in "Evolving NN using new block container layer", attached in HDFS-10419.
- It does not have a single global lock - this will scale well against a large number of concurrent clients/rpcs.
- It can be partitioned/sharded much more easily because it is a flat namespace. Indeed, a natural partitioning level is the bucket. Partitioning would further scale both the number of clients/rpcs and the Key-Value namespace.
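The scaling points above (a bounded in-memory working set, no single global lock, and partitioning by bucket) can be illustrated with a small sketch. This is not Ozone's implementation: the class `ShardedNamespaceSketch`, its methods, and the choice of hashing the bucket name to pick a partition are all assumptions made for illustration. Each partition has its own lock and its own LRU cache, so clients working on buckets in different partitions do not contend with each other.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of a flat KV namespace sharded by bucket; not Ozone code. */
class ShardedNamespaceSketch {

    /** One partition: its own lock plus a bounded LRU cache of key metadata. */
    private static final class Partition {
        private final ReentrantLock lock = new ReentrantLock();
        private final Map<String, String> cache;

        Partition(final int maxEntries) {
            // accessOrder=true keeps entries ordered from least to most recently used.
            this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                    // Evict the least recently used entry once the working set is full.
                    return size() > maxEntries;
                }
            };
        }

        void put(String key, String metadata) {
            lock.lock();
            try {
                cache.put(key, metadata);
            } finally {
                lock.unlock();
            }
        }

        String get(String key) {
            // A lookup reorders an access-ordered map, so it also needs the lock.
            lock.lock();
            try {
                // A real server would fall back to persistent storage on a miss;
                // this sketch simply returns null.
                return cache.get(key);
            } finally {
                lock.unlock();
            }
        }
    }

    private final Partition[] partitions;

    public ShardedNamespaceSketch(int partitionCount, int maxEntriesPerPartition) {
        partitions = new Partition[partitionCount];
        for (int i = 0; i < partitionCount; i++) {
            partitions[i] = new Partition(maxEntriesPerPartition);
        }
    }

    /** The bucket is the partitioning unit: all keys of a bucket share a partition. */
    public int partitionFor(String bucket) {
        return Math.floorMod(bucket.hashCode(), partitions.length);
    }

    public void putKey(String bucket, String key, String metadata) {
        partitions[partitionFor(bucket)].put(bucket + "/" + key, metadata);
    }

    public String getKey(String bucket, String key) {
        return partitions[partitionFor(bucket)].get(bucket + "/" + key);
    }

    public static void main(String[] args) {
        ShardedNamespaceSketch ns = new ShardedNamespaceSketch(4, 2);
        ns.putKey("logs", "2018/01/01.gz", "container=17");
        System.out.println(ns.getKey("logs", "2018/01/01.gz"));
    }
}
```

Because the namespace is flat, the partition for a request can be computed from the bucket name alone, without walking a directory tree. In a hierarchical namespace, operations such as renaming a directory can span subtrees, which makes this kind of partitioning harder.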
Issue Links
- is related to HDFS-7240 Scaling HDFS (Patch Available)