Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Won't Fix
Description
On large clusters, the NameNode can become a performance bottleneck. The NameNode is also a single point of failure. Recent improvements to HDFS that support High Availability and Federation [see ACCUMULO-118] help address these issues, but at the cost of greater administrative overhead and specialized hardware.
We have seen demonstrations of using HBase to host a NameNode. There's Aaron Cordova's example of a Distributed Name Node:
Design for a Distributed Name Node
And Giraffa:
Dynamic Namespace Partitioning with Giraffa File System
We could incrementally implement a self-hosted Accumulo, which would run as its own NameNode. This would be useful for large Accumulo installations. Over the long term, we could incorporate all NameNode functions to provide a scalable, distributed NameNode for other large Hadoop installations.
Hopefully the approach used could be trivially ported to HBase as well.
Attachments
Issue Links
- is related to: ACCUMULO-118 accumulo could work across HDFS instances, which would help it to scale past a single namenode (Resolved)