Hadoop HDFS / HDFS-16653

Improve error messages in ShortCircuitCache


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.1.3
    • Fix Version/s: 3.4.0
    • Component/s: dfsadmin
    • Environment: Linux version 4.15.0-142-generic (buildd@lgw01-amd64-039) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.12))

    • Hadoop Flags: Reviewed

    Description

       

      <property>
        <name>dfs.client.mmap.cache.size</name>
        <value>256</value>
        <description>
          When zero-copy reads are used, the DFSClient keeps a cache of recently used
          memory mapped regions.  This parameter controls the maximum number of
          entries that we will keep in that cache.
          The larger this number is, the more file descriptors we will potentially
          use for memory-mapped files.  mmaped files also use virtual address space.
          You may need to increase your ulimit virtual address space limits before
          increasing the client mmap cache size.
          
          Note that you can still do zero-copy reads when this size is set to 0.
        </description>
      </property>
      

      When the configuration item “dfs.client.mmap.cache.size” is set to a negative number, every operation option of hdfs dfsadmin -safemode (enter, leave, get, wait, and forceExit) becomes invalid: the terminal reports that safe mode is null, and no exception is thrown.

      In summary, I think we need to improve the validation of this configuration item: add an error message to the Precondition check on maxEvictableMmapedSize (the field backed by "dfs.client.mmap.cache.size"), so that the terminal gives a clear indication when the configuration is abnormal. That would let users diagnose the problem quickly and reduce the impact on safe-mode-related operations.

      The details are as follows.

      The constructor of the ShortCircuitCache class in ShortCircuitCache.java already uses Preconditions.checkArgument() to check that the configuration value is greater than or equal to zero. So when the value is set to a negative number, construction of the ShortCircuitCache object in ClientContext.java fails.
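      The failure mode above can be sketched as follows. This is a minimal self-contained sketch, not the actual Hadoop source: the local checkArgument helper mirrors Guava's Preconditions.checkArgument(boolean) overload, which throws an IllegalArgumentException whose message is null when no message is supplied.

```java
// Minimal sketch (assumed shape, not the real ShortCircuitCache): a constructor
// guard using a message-less Guava-style checkArgument.
public class ShortCircuitCacheSketch {
    private final int maxEvictableMmapedSize;

    // Mirrors Guava's Preconditions.checkArgument(boolean): on failure it
    // throws IllegalArgumentException with a null message.
    static void checkArgument(boolean expression) {
        if (!expression) {
            throw new IllegalArgumentException();
        }
    }

    public ShortCircuitCacheSketch(int maxEvictableMmapedSize) {
        // Without a message argument, a failure yields getMessage() == null,
        // which is what later surfaces on the terminal as "safemode: null".
        checkArgument(maxEvictableMmapedSize >= 0);
        this.maxEvictableMmapedSize = maxEvictableMmapedSize;
    }

    public static void main(String[] args) {
        try {
            new ShortCircuitCacheSketch(-1);
        } catch (IllegalArgumentException e) {
            System.out.println("message = " + e.getMessage()); // prints "message = null"
        }
    }
}
```

      Because dfsadmin prints the exception message when a command fails, a null message is rendered literally as "null".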

      But because that Preconditions.checkArgument() call carries no error message, the hdfs dfsadmin script produces the following output on the terminal:

      hadoop@ljq1:~/hadoop-3.1.3-work/sbin$ hdfs dfsadmin -safemode leave 
      safemode: null
      Usage: hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit]
      hadoop@ljq1:~/hadoop-3.1.3-work/sbin$ hdfs dfsadmin -safemode enter
      safemode: null
      Usage: hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit]
      hadoop@ljq1:~/hadoop-3.1.3-work/sbin$ hdfs dfsadmin -safemode get
      safemode: null
      Usage: hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit]
      hadoop@ljq1:~/hadoop-3.1.3-work/sbin$ hdfs dfsadmin -safemode forceExit
      safemode: null
      Usage: hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit]

      No related exception is reported in the HDFS logs or on the terminal.

      Therefore, after adding an error message to the original Preconditions.checkArgument() call, the cause of the failure can be seen directly, as follows:

      hadoop@ljq1:~/hadoop-3.1.3-work/sbin$ hdfs dfsadmin -safemode leave
      safemode: Invalid argument: dfs.client.mmap.cache.size must be greater than zero.
      Usage: hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit]
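      The improved check might look like the following sketch. Class and method names here are illustrative (the actual patch modifies the existing checkArgument() call inside ShortCircuitCache); the local helper reproduces Guava's Preconditions.checkArgument(boolean, String, Object...) template behavior so the sketch is self-contained.

```java
// Hedged sketch of the proposed improvement: pass a message template to
// checkArgument so the terminal shows the misconfigured key instead of "null".
public class MmapCacheSizeCheck {
    static final String KEY = "dfs.client.mmap.cache.size";

    // Guava-style checkArgument with a %s template, reproduced locally.
    static void checkArgument(boolean expression, String template, Object arg) {
        if (!expression) {
            throw new IllegalArgumentException(String.format(template, arg));
        }
    }

    static void validate(int maxEvictableMmapedSize) {
        checkArgument(maxEvictableMmapedSize >= 0,
            "Invalid argument: " + KEY
                + " must be greater than or equal to zero, but was %s",
            maxEvictableMmapedSize);
    }

    public static void main(String[] args) {
        try {
            validate(-1);
        } catch (IllegalArgumentException e) {
            // The message now names the offending key and value instead of null.
            System.out.println("safemode: " + e.getMessage());
        }
    }
}
```

      With a message attached, the same dfsadmin code path that previously printed "safemode: null" now prints an actionable error naming the configuration key.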

       


      People

        Assignee: fujx ECFuzz
        Reporter: fujx ECFuzz
