Hadoop HDFS / HDFS-1114

Reducing NameNode memory usage by an alternate hash table

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.22.0
    • Component/s: namenode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      NameNode uses a java.util.HashMap to store BlockInfo objects. When there are many blocks in HDFS, this map uses a lot of memory in the NameNode. We may optimize the memory usage with a lightweight hash table implementation.

      1. gset20100702.tex
        14 kB
        Tsz Wo Nicholas Sze
      2. gset20100702.pdf
        107 kB
        Tsz Wo Nicholas Sze
      3. h1114_20100617b2_y0.20.1xx.patch
        27 kB
        Tsz Wo Nicholas Sze
      4. h1114_20100617b_y0.20.1xx.patch
        27 kB
        Tsz Wo Nicholas Sze
      5. benchmark20100618.patch
        4 kB
        Tsz Wo Nicholas Sze
      6. h1114_20100617b.patch
        27 kB
        Tsz Wo Nicholas Sze
      7. h1114_20100617.patch
        27 kB
        Tsz Wo Nicholas Sze
      8. h1114_20100616b.patch
        27 kB
        Tsz Wo Nicholas Sze
      9. h1114_20100615.patch
        25 kB
        Tsz Wo Nicholas Sze
      10. h1114_20100614b.patch
        24 kB
        Tsz Wo Nicholas Sze
      11. gset20100608.pdf
        105 kB
        Tsz Wo Nicholas Sze
      12. h1114_20100607.patch
        14 kB
        Tsz Wo Nicholas Sze
      13. GSet20100525.pdf
        105 kB
        Tsz Wo Nicholas Sze

        Activity

        Tsz Wo Nicholas Sze added a comment -

        In Java 1.6, java.util.HashMap has an array of HashMap.Entry

        //HashMap
            transient Entry[] table;
        

        where HashMap.Entry has the following members

        //HashMap.Entry
                final K key;
                V value;
                Entry<K,V> next;
                final int hash;
        

        In the BlocksMap, we have the following invariants

        • K == V == BlockInfo
        • key == value, i.e. they point to the same BlockInfo object.

        Therefore, we may reduce the memory footprint by

        1. eliminating HashMap.Entry class
        2. keeping only one object reference
        3. possibly eliminating next and hash

        Will provide more design details.
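        The idea above can be sketched as an intrusive chained hash table: each element carries its own next pointer, so the table is just an array of elements and no per-entry wrapper objects are allocated. The names below (LinkedElement, IntrusiveSet, LongElement) are illustrative, not the committed code.

```java
// Sketch, under the invariant key == value: store elements directly in the
// table and let each element embed its own "next" link, eliminating the
// HashMap.Entry wrapper (one object and two references saved per entry).
interface LinkedElement {
    void setNext(LinkedElement next);
    LinkedElement getNext();
}

class IntrusiveSet {
    // Fixed power-of-two table so the index is a cheap mask operation.
    private final LinkedElement[] table = new LinkedElement[1 << 10];

    private int index(Object e) {
        return e.hashCode() & (table.length - 1);
    }

    /** Add e; returns the previously stored equal element, if any. */
    LinkedElement add(LinkedElement e) {
        LinkedElement prev = remove(e);  // keep at most one equal element
        int i = index(e);
        e.setNext(table[i]);             // push onto the chain head
        table[i] = e;
        return prev;
    }

    /** The "get" a plain Set lacks: return the stored equal element. */
    LinkedElement get(Object key) {
        for (LinkedElement e = table[index(key)]; e != null; e = e.getNext()) {
            if (e.equals(key)) return e;
        }
        return null;
    }

    LinkedElement remove(Object key) {
        int i = index(key);
        LinkedElement prev = null;
        for (LinkedElement e = table[i]; e != null; prev = e, e = e.getNext()) {
            if (e.equals(key)) {
                if (prev == null) table[i] = e.getNext();
                else prev.setNext(e.getNext());
                e.setNext(null);
                return e;
            }
        }
        return null;
    }
}

// A stand-in for BlockInfo: an element keyed by a long ID.
class LongElement implements LinkedElement {
    final long id;
    private LinkedElement next;
    LongElement(long id) { this.id = id; }
    public void setNext(LinkedElement n) { next = n; }
    public LinkedElement getNext() { return next; }
    @Override public int hashCode() { return (int) (id ^ (id >>> 32)); }
    @Override public boolean equals(Object o) {
        return o instanceof LongElement && ((LongElement) o).id == id;
    }
}
```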

        Tsz Wo Nicholas Sze added a comment -

        The data structure we need in BlocksMap is a GettableSet.

        /**
         * A set which supports the get operation.
         * @param <E> The type of the elements.
         */
        public interface GettableSet<E> extends Iterable<E> {
          /**
           * @return the size of this set.
           */
          int size();
        
          /**
           * @return true if the given element equals to a stored element.
           *         Otherwise, return false.
           */
          boolean contains(Object element);
        
          /**
           * @return the stored element if there is any.  Otherwise, return null.
           */
          E get(Object element);
        
          /**
           * Add the given element to this set.
           * @return the previous stored element if there is any.
           *         Otherwise, return null.
           */
          E add(E element);
        
          /**
           * Remove the element from the set.
           * @return the stored element if there is any.  Otherwise, return null.
           */
          E remove(Object element);
        }
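        For contrast, a HashMap-backed implementation of this interface (a sketch along the lines of the GSetByHashMap used in the later benchmarks; the details here are illustrative) stores each element as both key and value, which is exactly the duplicated reference the lightweight implementation avoids:

```java
import java.util.HashMap;
import java.util.Iterator;

// GettableSet as defined above, with the javadoc trimmed.
interface GettableSet<E> extends Iterable<E> {
    int size();
    boolean contains(Object element);
    E get(Object element);
    E add(E element);
    E remove(Object element);
}

// Illustrative HashMap-backed implementation: each element is stored as both
// key and value, so get(element) can return the stored instance.
class GSetByHashMap<E> implements GettableSet<E> {
    private final HashMap<E, E> m = new HashMap<E, E>();
    public int size() { return m.size(); }
    public boolean contains(Object element) { return m.containsKey(element); }
    public E get(Object element) { return m.get(element); }
    public E add(E element) { return m.put(element, element); }
    public E remove(Object element) { return m.remove(element); }
    public Iterator<E> iterator() { return m.keySet().iterator(); }
}
```

        The point of get() is that looking up with an equal probe object returns the stored instance, not the probe.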
        
        Konstantin Shvachko added a comment -

        Do you have an estimate on how much space this will save in NN's memory footprint?

        Tsz Wo Nicholas Sze added a comment -

        I believe we can save 24 to 40 bytes per entry, depending on the chosen implementation (will give more details later).

        In a large cluster, there are ~60m blocks. Then, we may save from 1.5GB to 2.5GB of NN memory.
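        A quick back-of-the-envelope check of that estimate, using the per-entry savings and block count from the comment above:

```java
public class SavingsEstimate {
    public static void main(String[] args) {
        long blocks = 60_000_000L;                 // ~60m blocks on a large cluster
        long perEntryLow = 24, perEntryHigh = 40;  // bytes saved per entry
        // 60M * 24B = 1.44e9 bytes; 60M * 40B = 2.4e9 bytes
        System.out.printf("%.2f GB to %.2f GB%n",
            blocks * perEntryLow / 1e9, blocks * perEntryHigh / 1e9);
    }
}
```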

        Tsz Wo Nicholas Sze added a comment -

        GSet20100525.pdf: a design doc.

        Tsz Wo Nicholas Sze added a comment -

        Thanks to Suresh, who found an interesting post explaining why java.util.HashMap uses a power-of-two table length.

        Suresh Srinivas added a comment -
        1. Not using supplemental hash function will result in severe clustering when we move to sequential block IDs (as only higher bits are used for hash).
        2. Why do we need configurability of either using java HashMap or this new implementation?
          • With new impl, BlockInfo implements LinkedElement interface. On switching to java HashMap would it continue to implement this interface and incur the cost of next member in BlockInfo?
        3. In "Arrays" section the GC behavior description was not clear. Not sure how the GC behavior is better with arrays?
        4. Static array size for the map simplifies the code, but pushes complexity to the cluster admin by adding one more configuration. This configuration is an internal implementation detail which a cluster admin may not understand or get right. If it is configured wrong and the cluster continues to work, the cluster admin may not be aware of the performance degradation.
        5. I feel we should implement resizing to avoid introducing a config param. It is a rare event on a stable cluster. NN has enough heap headroom to account for floating garbage and the YG guarantee. Hence availability of memory should not be an issue. Worst case scenario, a resize may trigger a full GC.
        6. If we implement resizing we should also think about 2^N table size as it has potential to waste a lot of memory during doubling, especially considering millions of entries in the table.
        Tsz Wo Nicholas Sze added a comment -

        Suresh, thanks for your comment. I will update the design doc soon.

        h1114_20100607.patch: LightWeightGSet

        • Not yet changed BlocksMap
        • Need more tests and decrease the test running time (currently is ~15 minutes).
        Tsz Wo Nicholas Sze added a comment -

        > 1. Not using supplemental hash function will result in severe clustering when we move to sequential block IDs (as only higher bits are used for hash).

        It is not a problem in our case because the hashCode() implementation in Block uses both the higher and lower bits of the block ID.

        > 3. In "Arrays" section the GC behavior description was not clear. Not sure how the GC behavior is better with arrays?

        The GC algorithm traverses the objects to determine which objects can be garbage collected. The GC behavior is better with arrays in the sense that arrays contain fewer references to traverse.

        > 2, 4, 5 & 6

        All these items are related to configuration and the hash table length. See the new design doc.

        gset20100608.pdf: rewrote Section 7

        Scott Carey added a comment -

        If you are using a power-of-two hash table, you can avoid problems caused by hash-value clustering by using a Fibonacci hash. Essentially, use a multiplicative hash with a special value g:

        (h * g) & mask

        where h is the hash value and g is the 'golden ratio' number for the size of the table used. Since multiplication on today's processors is far faster than division or remainders, this can be used to 'uncluster' hash values. A single consecutive run of values gets maximally distributed into the space, and high and low bits are redistributed evenly so that the mask does not increase collisions. Whether this is a desired property or not will depend on the properties of the hash values and whether or not an open addressing solution is used.

        Open addressing can further reduce the memory footprint by allowing the raw object to be placed in the map instead of a container object or list.

        some links found from a few searches:
        http://www.brpreiss.com/books/opus4/html/page214.html
        http://staff.newport.ac.uk/ctubb01/ct/advp/hashtables.pdf
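        A minimal sketch of the multiplicative scheme Scott describes, using the 32-bit golden-ratio constant 0x9E3779B9 ≈ 2^32/φ (the constant choice and method names are illustrative):

```java
public class FibonacciHash {
    // 2^32 / phi: the "golden ratio" multiplier for 32-bit hash values.
    private static final int GOLDEN = 0x9E3779B9;

    /** Map hash h into a table of size 2^bits via (h * g) & mask. */
    static int index(int h, int bits) {
        return (h * GOLDEN) & ((1 << bits) - 1);
    }

    public static void main(String[] args) {
        // A run of sequential IDs spreads across the table instead of
        // landing in consecutive slots.
        for (int id = 0; id < 4; id++) {
            System.out.println(index(id, 10));
        }
    }
}
```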

        Tsz Wo Nicholas Sze added a comment -

        Scott, thanks for your comments and the useful links.

        Since Block IDs are random, we don't need the extra computation in our case. For general hash table implementation, it is definitely a good idea.

        Tsz Wo Nicholas Sze added a comment -

        h1114_20100614b.patch:

        • changed BlocksMap and BlockInfo in order to use LightWeightGSet.
        • changed TestGSet for a faster execution time. It took 3.141 seconds on my machine.
        • added new tests for the exception cases.

        Have run a large set of tests (see TestGSet.runMultipleTestGSet()) with various parameters. It took 5.5 hours and passed all tests.

        Tsz Wo Nicholas Sze added a comment -

        h1114_20100615.patch: added close(). Otherwise, some tests will fail with OutOfMemoryError.

        Suresh Srinivas added a comment -
        1. BlocksMap.java
          • typo exponient. Should be exponent?
          • Capacity should be divided by a reference size of 8 or 4, depending on whether the Java version is 64-bit or 32-bit
          • Current capacity calculation seems quite complex. Add more explanation on why it is implemented that way.
        2. LightWeightGSet.java
          • "which uses a hash table for storing the elements" should this say "uses array"?
          • Add a comment that the size of entries is power of two
          • Throw HadoopIllegalArgumentException instead of IllegalArgumentException (for 20 version of the patch it could remain IllegalArugmentException)
          • remove() - for better readability no need for else if and else since the previous block returns
          • toString() - prints all the entries. This is a bad idea if someone passes this object to Log unknowingly. If all the details of the HashMap are needed, we should have some other method such as dump() or printDetails() to do the same.
        3. TestGSet.java
          • In exception tests, instead of printing a log when the expected exception happens, print a log in Assert.fail(), like Assert.fail("Expected exception was not thrown"). Checks for exceptions should be more specific, instead of catching Exception. It is also a good idea to document these exceptions in the javadoc for the methods in GSet.
          • println should use Log.info instead of System.out.println?
          • add some comments to classes on what they do/how they are used
          • add some comments to GSetTestCase members denominator etc. and constructor
          • add comments to testGSet() on what each of the case is accomplishing
        Tsz Wo Nicholas Sze added a comment -

        Thanks for the detailed review, Suresh.

        1. BlocksMap.java

        done.

        2. LightWeightGSet.java

        done all except the following.

        • remove() - for better readability ...

        Isn't an implicit else better than an explicit else?

        3. TestGSet.java

        • In exception tests, ...

        Catching specific exceptions but I did not change the messages.

        • println should use Log.info instead of System.out.println?

        No, print(..) and println(..) work together.

        • add some comments to ...
        • add some comments to ...
        • add comments to ...

        Added some more comments.

        Tsz Wo Nicholas Sze added a comment -

        h1114_20100616b:

        • Rewrote the code for capacity computation
        • Following Java, throwing NPE instead of IllegalArgumentException when the parameter is null.
        • Split toString() into two methods.
        • Catching specific exceptions and added more comments to the test.
        Tsz Wo Nicholas Sze added a comment -

        Hudson does not seem to be working. It did not pick up my previous patch for a long time. Re-submitting.

        Scott Carey added a comment -

        # Capacity should be divided by a reference size 8 or 4 depending on the 64bit or 32bit java version

        What about -XX:+UseCompressedOops ?

        All users should be using this flag on a 64-bit JVM to save a lot of space. It only works up to -Xmx32G though; beyond that, it is large pointers again.

        Tsz Wo Nicholas Sze added a comment -

        > What about -XX:+UseCompressedOops ?

        This is a good point. Is there a way to determine at runtime whether UseCompressedOops is set?

        Tsz Wo Nicholas Sze added a comment -

        h1114_20100617.patch: the UnsupportedOperationException thrown in put(..) should be NullPointerException.

        Suresh Srinivas added a comment -
        1. For figuring out 64-bit, should we consider the max heap size? If the max heap size is > 2G, consider it a 64-bit machine. Since the max heap size on 32-bit machines varies from 1.4G to 2G, machines in that range could be wrongly classified as 32-bit. Is this an alternative worth considering?
        2. Minor: "print detail" to "print detailed"
        3. Minor: For end-of-line comments, should there be a space after //? The Java coding conventions explicitly do not talk about this. Currently there are 3043 comments with a space after // and 384 without.
        4. Minor: In the exception tests, what I meant in my previous comment was that you are better off printing to the log in Assert.fail(). Printing a log when the expected thing happens is not that useful. That said, this is minor; you can leave it as it is.
        5. I am not sure what the point of commenting out the 5-hour test is. When do we expect it to be uncommented and run? Should it be moved to some other test that is run as a smoke test for release qualification?
        Tsz Wo Nicholas Sze added a comment -

        h1114_20100617b.patch: slightly changed the comments and removed unnecessary spaces.

        I did not change the capacity calculation because the current computation is conservative in the special cases.

        Tsz Wo Nicholas Sze added a comment -

        Try resubmitting.

        Suresh Srinivas added a comment -

        +1 for the patch.

        Tsz Wo Nicholas Sze added a comment -

        Thanks Suresh.

        Hudson is not responding. Ran tests locally

             [exec] +1 overall.  
             [exec] 
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec] 
             [exec]     +1 tests included.  The patch appears to include 17 new or modified tests.
             [exec] 
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec] 
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec] 
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec] 
             [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
        

        Passed all tests, except that TestFiHFlush sometimes fails; see HDFS-1206.

        Tsz Wo Nicholas Sze added a comment -

        I have committed this.

        Tsz Wo Nicholas Sze added a comment -

        Ran some benchmarks. When the modulus is large, which means that the number of collisions is small, LightWeightGSet is much better than GSetByHashMap.

        datasize    modulus       GSetByHashMap   LightWeightGSet
           65536          1025             219               234
           65536       1048577             516               296
           65536    1073741825             500               281
          262144          1025            1422              1531
          262144       1048577            3078              2156
          262144    1073741825            3094              2281
         1048576          1025            7172              7313
         1048576       1048577           13531              9844
         1048576    1073741825           14485             10718
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #311 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/311/)
        HDFS-1114. Implement LightWeightGSet for BlocksMap in order to reduce NameNode memory footprint.

        Tsz Wo Nicholas Sze added a comment -

        Comparing memory footprint on a 32-bit VM over 1,000,000 elements

         num     #instances         #bytes  class name
        ----------------------------------------------
           1:       1000040       24000960  java.util.HashMap$Entry
           2:       1000000       24000000  org.apache.hadoop.hdfs.util.TestGSet$IntElement
           3:            23        8390960  [Ljava.util.HashMap$Entry;
        HashMap: 53.78 MB
        
        
         num     #instances         #bytes  class name
        ----------------------------------------------
           1:       1000000       24000000  org.apache.hadoop.hdfs.util.TestGSet$IntElement
           2:             1        4194320  [Lorg.apache.hadoop.hdfs.util.LightWeightGSet$LinkedElement;
        LightWeightGSet: 26.89 MB
        
        Tsz Wo Nicholas Sze added a comment -

        Note that we should subtract 4*1000000 ~= 4MB from the HashMap figure, since with HashMap the elements do not require the reference for LightWeightGSet.LinkedElement.

        Tsz Wo Nicholas Sze added a comment -

        benchmark20100618.patch: additional code for benchmark and memory footprint measurement.

        Scott Carey added a comment -

        > This is a good point. Is there a way to determine if UseCompressedOops is set in runtime?

        Well, there is ManagementFactory.getRuntimeMXBean().getInputArguments(), but later versions of Java are going to make +UseCompressedOops the default.

        There is also a way to check whether the VM is 64-bit or 32-bit, either through ManagementFactory or one of the system properties. Digging around, I don't see it, but I have used it before. I think it is vendor-specific though.

        Tsz Wo Nicholas Sze added a comment -

        > There is also a way to check if the VM is 64 bit or 32 bit, either its out of ManagementFactory or one of the system properties. Digging around I don't see it, but I have used it before. I think it is vendor specific though.

        I used System.getProperty("sun.arch.data.model") in the patch. See http://java.sun.com/docs/hotspot/HotSpotFAQ.html#64bit_detection
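        A sketch combining the two detection hints from this thread. The property is HotSpot-specific, and the flag check only sees an explicit -XX:+UseCompressedOops, so JVMs that enable compressed oops by default will still report 8; this is a heuristic, and the class and method names are illustrative:

```java
import java.lang.management.ManagementFactory;

public class RefSizeGuess {
    /** Heuristic guess of the object-reference size in bytes. */
    static int referenceSize() {
        // HotSpot-specific: "32" or "64"; may be null on other VMs.
        String model = System.getProperty("sun.arch.data.model");
        if ("32".equals(model)) {
            return 4;
        }
        // On 64-bit VMs, compressed oops shrink references back to 4 bytes.
        // This only detects the explicit flag, not a default-on setting.
        boolean compressed = ManagementFactory.getRuntimeMXBean()
            .getInputArguments().contains("-XX:+UseCompressedOops");
        return compressed ? 4 : 8;
    }

    public static void main(String[] args) {
        System.out.println("guessed reference size: " + referenceSize());
    }
}
```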

        Tsz Wo Nicholas Sze added a comment -

        h1114_20100617b_y0.20.1xx.patch: for y0.20.1xx

        Suresh Srinivas added a comment -

        Minor comment: TestGSet in Y20 version can be a junit4 test. You do not need to extend TestCase.

        +1 for the patch.

        Tsz Wo Nicholas Sze added a comment -

        Thanks Suresh.

        h1114_20100617b2_y0.20.1xx.patch: used the original junit 4 TestGSet

        Suresh Srinivas added a comment -

        +1 for the new patch.

        Tsz Wo Nicholas Sze added a comment -

        Ravi has done some performance analysis on Yahoo! production cluster fs image.

        46400490 files & directories, 70074790 blocks = 116475280 objects in total.
        FsImage disk size: 7.5GB.

                            heap size at NN startup   fsimage loading time
        Without the patch   19.64 GB                  603 s
        With the patch      17.03 GB                  577 s
        % improvement       13.2%                     4.3%

        Thanks Ravi!

        Tsz Wo Nicholas Sze added a comment -

        gset20100702.pdf: added a result section.

        Tsz Wo Nicholas Sze added a comment -

        gset20100702.tex: source file of the pdf.


          People

          • Assignee: Tsz Wo Nicholas Sze
          • Reporter: Tsz Wo Nicholas Sze
          • Votes: 0
          • Watchers: 14