ACCUMULO-722

Accumulo using Accumulo as its own NameNode

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      On large clusters, the NameNode can become a performance bottleneck. The NameNode is also a single point of failure. Recent improvements to HDFS to support High Availability and Federation [See ACCUMULO-118] help address these issues, but at the cost of greater administrative overhead and specialized hardware.

      We have seen demonstrations of using HBase to host a NameNode. There's Aaron Cordova's example of a Distributed Name Node:

      Design for a Distributed Name Node

      And giraffa:

      Dynamic Namespace Partitioning with Giraffa File System

      We could incrementally implement a self-hosted Accumulo, which would run as its own NameNode. This would be useful for large Accumulo installations. Over the long term, we could incorporate all NameNode functions to provide a scalable, distributed NameNode for other large Hadoop installations.

      Hopefully the approach used could be trivially ported to HBase as well.

        Issue Links

          Activity

          Transition: Open -> Resolved
          Time In Source Status: 621d 2h 24m
          Execution Times: 1
          Last Executer: Eric Newton
          Last Execution Date: 21/Apr/14 21:03
          Christopher Tubbs made changes -
          Fix Version/s 1.7.0 [ 12324607 ]
          Eric Newton made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Won't Fix [ 2 ]
          Eric Newton made changes -
          Link This issue is related to ACCUMULO-118 [ ACCUMULO-118 ]
          Eric Newton added a comment -

          Closing this in favor of the multi-volume approach.

          Sean Busbey added a comment - - edited

          Among recent changes in HDFS:

          • HA NameNode (HDFS-3077)
          • Pluggable locality / failure tolerances (HDFS-385, HADOOP-8468)
          • Improving scalability of namespace operations (HDFS-5389) and block location mapping (HDFS-5711)

          Could we please forgo this ticket in favor of interested parties working on relevant bits within HDFS?
          John Vines made changes -
          Fix Version/s 1.7.0 [ 12324607 ]
          Fix Version/s 1.6.0 [ 12322468 ]
          John Vines added a comment -

          Bumping to 1.7; however, I'm wondering whether this is a feature that should have an undefined fix version.

          Eric Newton made changes -
          Assignee Eric Newton [ ecn ]
          John Vines made changes -
          Issue Type Bug [ 1 ] Improvement [ 4 ]
          Eric Newton added a comment -

          Mike,

          Agreed... been following bookkeeper as a WAL destination for Accumulo, too.

          Josh Elser added a comment -

          Ignoring the issues of "is my NN scalable", I think this is a novel idea which I would be interested in trying to help. I don't have an opinion on whether or not it merits a 1.6 feature request, but that's not my decision to make.

          Mike Drob added a comment -

          An HA NN no longer requires expensive hardware; see HDFS-3077.

          Eric Newton added a comment -

          We've gathered statistics on the number of NN write operations per second from an Accumulo instance running continuous ingest. We are seeing 3-5 write updates to HDFS metadata per node, per second. This implies that the NN becomes a limitation somewhere in the 1200 to 2000 node range.
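
The arithmetic behind that range can be sketched as follows. The comment does not state what write rate the NameNode can sustain; the figure of about 6000 metadata writes per second used here is an assumption, chosen because it is the capacity that the quoted range implies (1200 nodes at 5 writes each, or 2000 nodes at 3 writes each).

```python
# Back-of-the-envelope estimate of when NameNode write load becomes limiting.
# ASSUMPTION: the NameNode sustains about 6000 metadata writes per second.
# That figure is not stated in the comment; it is the capacity implied by
# the quoted 1200 to 2000 node range.
ASSUMED_NN_WRITES_PER_SEC = 6000


def max_nodes(writes_per_node_per_sec, nn_capacity=ASSUMED_NN_WRITES_PER_SEC):
    """Number of nodes whose combined write rate reaches the NN capacity."""
    return nn_capacity // writes_per_node_per_sec


if __name__ == "__main__":
    for rate in (3, 4, 5):
        print(f"{rate} writes/node/s -> about {max_nodes(rate)} nodes")
```

With a different sustained write rate for the NameNode, the node counts scale proportionally.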

          Eric Newton made changes -
          Field Original Value New Value
          Fix Version/s 1.6.0 [ 12322468 ]
          Eric Newton added a comment -

          David, I browsed the CFS site. The first question on that page asks about the tricky part where there's only one writer to a file. This is a key problem, since having a single writer to a write-ahead log is essential for handling failures correctly. In particular, we must ensure that writes by a failing tablet server to its write-ahead log are denied while we use that file for recovery. The response "HDFS does not implement posix semantics" is true, but it understates the importance of this feature to the HBase and/or Accumulo WAL. Also, it appears this is not an open-source solution, so I'm unable to test it without committing additional resources. Does anyone know if they support exclusive writer semantics?
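
To illustrate what "exclusive writer semantics" means for recovery, here is a minimal in-memory sketch. It is not HDFS, CFS, or Accumulo code; the class `SingleWriterLog`, the exception `WriterFenced`, and the writer names are all hypothetical. The idea shown is only that recovery first takes over the lease, so a late write from the failing server is rejected instead of landing in the log being replayed.

```python
class WriterFenced(Exception):
    """Raised when a writer that no longer holds the lease tries to append."""


class SingleWriterLog:
    """Toy in-memory log that admits exactly one writer at a time."""

    def __init__(self):
        self.entries = []
        self.holder = None  # id of the current lease holder
        self.epoch = 0      # bumped every time the lease changes hands

    def acquire(self, writer_id):
        """Take the lease, fencing off any previous holder."""
        self.epoch += 1
        self.holder = writer_id
        return self.epoch

    def append(self, writer_id, epoch, entry):
        """Append only if the caller still holds the current lease."""
        if writer_id != self.holder or epoch != self.epoch:
            raise WriterFenced(f"{writer_id} no longer holds the lease")
        self.entries.append(entry)

    def recover(self, recoverer_id):
        """Fence the old writer, then return a stable snapshot to replay."""
        self.acquire(recoverer_id)
        return list(self.entries)


def demo():
    """Show that a write arriving after recovery starts is denied."""
    log = SingleWriterLog()
    token = log.acquire("tserver-1")          # hypothetical tablet server id
    log.append("tserver-1", token, "mutation-a")
    replay = log.recover("recovery-process")  # hypothetical recoverer id
    try:
        log.append("tserver-1", token, "mutation-b")
        late_write_denied = False
    except WriterFenced:
        late_write_denied = True
    return replay, late_write_denied, log.entries


if __name__ == "__main__":
    print(demo())
```

A file system without this guarantee would let the late write succeed, and the replayed snapshot would no longer match the log contents.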

          David Medinets added a comment -

          As background, I'll reference the Cassandra File System from DataStax. Check http://www.datastax.com/dev/blog/cassandra-file-system-design and http://www.datastax.com/resources/whitepapers/hdfs-vs-cfs for more information.

          Eric Newton added a comment - - edited

          It kinda works... but it's very much a prototype.

          I am able to run the accumulo continuous ingest test, and the verify map-reduce without a name node running.

          There is lots to think about, like permissions, balancing, and replication; even the basic schema is open for review/change.

          This is based primarily on Aaron's prototype, with ideas taken from Giraffa for cleanly wedging a proxy in for the NameNode. There are a few changes to Accumulo-1.4 required, which are in the ACCUMULO-722 branch. Tested against hadoop 1.0.3.
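
As a rough illustration of why a sorted key-value table can hold a file system namespace, here is a toy sketch. It is not the prototype's schema (which the comment says is still open) and uses no Accumulo API; the class `SortedNamespace` and its row-key layout of parent directory, a zero byte, then the file name are assumptions made for this example. Because keys are kept in sorted order, all children of one directory are adjacent, so listing a directory is a single range scan.

```python
import bisect


class SortedNamespace:
    """Toy namespace kept as sorted keys, like rows in a sorted table."""

    def __init__(self):
        self._keys = []  # sorted row keys: "<parent dir>\x00<name>"
        self._meta = {}  # row key -> metadata dict

    @staticmethod
    def _key(path):
        """Build the row key for an absolute path such as "/a/b"."""
        parent, _, name = path.rstrip("/").rpartition("/")
        return f"{parent or '/'}\x00{name}"

    def create(self, path, **meta):
        """Insert or update the entry for a file or directory."""
        key = self._key(path)
        if key not in self._meta:
            bisect.insort(self._keys, key)
        self._meta[key] = meta

    def list_dir(self, directory):
        """List children with one contiguous range scan over the keys."""
        directory = directory.rstrip("/") or "/"
        prefix = directory + "\x00"
        start = bisect.bisect_left(self._keys, prefix)
        names = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            names.append(key[len(prefix):])
        return names


if __name__ == "__main__":
    ns = SortedNamespace()
    for p in ("/a", "/a/b", "/a/c", "/a/b/d"):
        ns.create(p, is_dir=True)
    print(ns.list_dir("/a"))
```

A real design would also need block locations, permissions, and atomic renames, which this sketch leaves out.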

          Eric Newton created issue -

            People

            • Assignee:
              Unassigned
              Reporter:
              Eric Newton
            • Votes:
              2
              Watchers:
              12
