  1. Hadoop HDFS
  2. HDFS-16198

Short circuit read leaks Slot objects when InvalidToken exception is thrown

Details

    Description

      In secure mode, 'dfs.block.access.token.enable' should be set to 'true'. With this configuration, a SecretManager.InvalidToken exception may be thrown during a short circuit read if the block access token has expired. The failed read itself is harmless because it is retried, but the failure leaks ShortCircuitShm.Slot objects.

       

      We found this problem in our secure HBase clusters: the number of open file descriptors held by RegionServers that use short circuit reads kept increasing.

       

      The increase was caused by leaked shared memory segments used for short circuit reads:

      [root ~]# lsof -p $(ps -ef | grep proc_regionserver | grep -v grep | awk '{print $2}') | grep /dev/shm | wc -l
      3925
      [root ~]# lsof -p $(ps -ef | grep proc_regionserver | grep -v grep | awk '{print $2}') | grep /dev/shm | head -5
      java 86309 hbase DEL REG 0,19 2308279984 /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_743473959
      java 86309 hbase DEL REG 0,19 2306359893 /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_1594162967
      java 86309 hbase DEL REG 0,19 2305496758 /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_2043027439
      java 86309 hbase DEL REG 0,19 2304784261 /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_689571088
      java 86309 hbase DEL REG 0,19 2302621988 /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_347008590 

       

      We eventually traced the root cause to the leakage of ShortCircuitShm.Slot objects.

       

      The fix is trivial: free the slot when an InvalidToken exception is thrown.
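
      The leak and the fix can be illustrated with a minimal, self-contained Java sketch. It is not the code from the attached patch and does not use any Hadoop classes: SlotPool, InvalidTokenException, leakyOpen, fixedOpen and slotsLeftAfterRetries are hypothetical stand-ins chosen for illustration. It only assumes the pattern described above, in which a slot is allocated before the file descriptors are requested and the request can fail with an invalid token.

```java
public class SlotLeakSketch {
    /** Hypothetical stand-in for SecretManager.InvalidToken (not the Hadoop class). */
    static class InvalidTokenException extends Exception {
        InvalidTokenException(String message) { super(message); }
    }

    /** Hypothetical stand-in for a shared memory segment that hands out slots. */
    static class SlotPool {
        private int inUse;
        Object allocateSlot() { inUse++; return new Object(); }
        void freeSlot(Object slot) { inUse--; }
        int inUse() { return inUse; }
    }

    /** Simulates the remote side rejecting an expired block access token. */
    static void requestFileDescriptors(boolean tokenExpired) throws InvalidTokenException {
        if (tokenExpired) {
            throw new InvalidTokenException("access token expired");
        }
    }

    /** Leaky variant: the slot is never freed if the request throws. */
    static Object leakyOpen(SlotPool pool, boolean tokenExpired) throws InvalidTokenException {
        Object slot = pool.allocateSlot();
        requestFileDescriptors(tokenExpired);
        return slot;
    }

    /** Fixed variant: free the slot before propagating the exception. */
    static Object fixedOpen(SlotPool pool, boolean tokenExpired) throws InvalidTokenException {
        Object slot = pool.allocateSlot();
        try {
            requestFileDescriptors(tokenExpired);
            return slot;
        } catch (InvalidTokenException e) {
            pool.freeSlot(slot);
            throw e;
        }
    }

    /** Runs some failed attempts, then one successful read, and reports slots still held. */
    static int slotsLeftAfterRetries(boolean useFix, int failedAttempts) {
        SlotPool pool = new SlotPool();
        for (int i = 0; i < failedAttempts; i++) {
            try {
                if (useFix) {
                    fixedOpen(pool, true);
                } else {
                    leakyOpen(pool, true);
                }
            } catch (InvalidTokenException e) {
                // The caller would refresh the token and retry the read.
            }
        }
        try {
            // The retry with a valid token succeeds; the slot is released after use.
            Object slot = useFix ? fixedOpen(pool, false) : leakyOpen(pool, false);
            pool.freeSlot(slot);
        } catch (InvalidTokenException e) {
            throw new IllegalStateException(e);
        }
        return pool.inUse();
    }

    public static void main(String[] args) {
        System.out.println("leaky: " + slotsLeftAfterRetries(false, 3)); // prints "leaky: 3"
        System.out.println("fixed: " + slotsLeftAfterRetries(true, 3));  // prints "fixed: 0"
    }
}
```

      In this sketch every failed attempt in the leaky variant leaves one slot allocated even though the retried read eventually succeeds, which mirrors the steady growth seen in the lsof output above.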

      Attachments

        1. HDFS-16198.patch
          10 kB
          Eungsop Yoo
        2. screenshot-2.png
          81 kB
          Eungsop Yoo

            People

              Eungsop Yoo

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2.5h