Hadoop Common / HADOOP-1565

DFSScalability: reduce memory usage of namenode



    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.15.0
    • Component/s: None
    • Labels: None


      Experiments have demonstrated that a single file/block needs about 300 to 500 bytes of main memory on a 64-bit namenode. This limits the size of the file system that a single namenode can support. Most of this overhead occurs because a block and/or filename is inserted into multiple TreeMaps and/or HashSets.

      Here are a few ideas that can be measured to see if an appreciable reduction of memory usage occurs:

      1. Change FSDirectory.children from a TreeMap to an array. Do binary search in this array while looking up children. This saves a TreeMap object for every intermediate node in the directory tree.
      2. Change INode from an inner class to a standalone class. This saves one implicit "enclosing object" reference in each INode instance: 4 bytes per inode.
      3. Keep all DatanodeDescriptors in an array. BlocksMap.nodes[] is currently a 64-bit reference to the DatanodeDescriptor object. Instead, it can be a 'short'. This will probably save about 16 bytes per block.
      4. Change DatanodeDescriptor.blocks from a TreeMap (sorted) to a HashMap? Block report processing CPU cost could increase.
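The sorted-array lookup in idea 1 could look roughly like this. The names here (SimpleINode, getChild, addChild) are illustrative, not Hadoop's actual FSDirectory code; the point is that a sorted array plus binary search replaces one TreeMap object (and its per-entry Entry objects) per directory:

```java
import java.util.Arrays;

/** Sketch: directory children kept in a sorted array instead of a TreeMap. */
class SimpleINode implements Comparable<SimpleINode> {
    final String name;
    SimpleINode[] children = new SimpleINode[0]; // kept sorted by name

    SimpleINode(String name) { this.name = name; }

    public int compareTo(SimpleINode other) { return name.compareTo(other.name); }

    /** Binary search replaces TreeMap.get(): still O(log n), no Entry objects. */
    SimpleINode getChild(String childName) {
        int idx = Arrays.binarySearch(children, new SimpleINode(childName));
        return idx >= 0 ? children[idx] : null;
    }

    /** Insert keeping the array sorted (O(n) copy; acceptable for a read-heavy tree). */
    void addChild(SimpleINode child) {
        int idx = Arrays.binarySearch(children, child);
        int pos = idx >= 0 ? idx : -(idx + 1); // binarySearch encodes insertion point
        SimpleINode[] next = new SimpleINode[children.length + 1];
        System.arraycopy(children, 0, next, 0, pos);
        next[pos] = child;
        System.arraycopy(children, pos, next, pos + 1, children.length - pos);
        children = next;
    }
}
```

The trade-off is O(n) array copies on insert, which is acceptable for a namespace tree that is looked up far more often than it is modified.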
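Idea 3 amounts to indexing datanodes through one shared table instead of holding a 64-bit reference per block replica. A minimal sketch, with hypothetical names (DatanodeTable, BlockInfo) and String standing in for DatanodeDescriptor:

```java
/** Sketch: replace per-block 64-bit references with short indices into a
 *  global datanode table. Names are illustrative, not Hadoop's actual code. */
class DatanodeTable {
    // All known datanodes; a cluster has far fewer than Short.MAX_VALUE nodes.
    private final String[] datanodes = new String[Short.MAX_VALUE];
    private short count = 0;

    /** Register a datanode and return its compact index. */
    short register(String datanode) {
        datanodes[count] = datanode;
        return count++;
    }

    String get(short index) { return datanodes[index]; }
}

class BlockInfo {
    // 2 bytes per replica location instead of an 8-byte object reference.
    short[] replicaIndices;
}
```

Each replica slot shrinks from an 8-byte reference to a 2-byte index, which is where the estimated saving of roughly 16 bytes per block (at the usual replication factor) comes from.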

      For the record, each TreeMap entry (java.util.TreeMap.Entry) has the following fields:
      Object key;
      Object value;
      Entry left = null;
      Entry right = null;
      Entry parent;
      boolean color = BLACK;

      and each HashMap entry:
      final Object key;
      Object value;
      final int hash;
      Entry next;
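A back-of-envelope comparison of those two entry layouts, assuming a 64-bit JVM with 16-byte object headers, 8-byte references, and 8-byte alignment (these are assumptions, not measured figures):

```java
/** Rough per-entry footprint of TreeMap vs HashMap entries on a 64-bit JVM.
 *  Assumes 16-byte headers, 8-byte references, 8-byte alignment (no compressed oops). */
class EntryFootprint {
    static final int HEADER = 16, REF = 8, ALIGN = 8;

    static int align(int bytes) { return (bytes + ALIGN - 1) / ALIGN * ALIGN; }

    // TreeMap.Entry: 5 references (key, value, left, right, parent) + 1 boolean color
    static int treeMapEntry() { return align(HEADER + 5 * REF + 1); }

    // HashMap.Entry: 3 references (key, value, next) + 1 int hash
    static int hashMapEntry() { return align(HEADER + 3 * REF + 4); }
}
```

Under these assumptions a TreeMap entry comes to 64 bytes versus 48 bytes for a HashMap entry, so every map entry eliminated (or converted from tree to hash) saves on the order of 16 bytes.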


        1. memoryReduction3.patch (5 kB, Dhruba Borthakur)



            Assignee: Dhruba Borthakur
            Reporter: Dhruba Borthakur