TAJO-467

Too many open FD when master failed.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels:
      None

      Description

      When the Tajo Master fails while a Worker is still alive, the Worker process accumulates too many open file descriptors (FDs) on its host.
      I checked with the lsof command, which showed the following.

      lsof -l | grep <pid> | wc -l
      2568
      
      lsof -l | grep <pid>
      java      92845      501  110     PIPE 0xd63c81fc1ed6001b       16384          ->0xd63c81fc0aac406b
      java      92845      501  111     PIPE 0xd63c81fc0aac406b       16384          ->0xd63c81fc1ed6001b
      java      92845      501  112u  KQUEUE                                         count=0, state=0x2
      java      92845      501  113     PIPE 0xd63c81fc0d1441cb       16384          ->0xd63c81fc1ed6059b
      java      92845      501  114     PIPE 0xd63c81fc1ed6059b       16384          ->0xd63c81fc0d1441cb
      java      92845      501  115u  KQUEUE                                         count=0, state=0x2
      java      92845      501  116     PIPE 0xd63c81fc1edb140b       16384          ->0xd63c81fc1ed61cfb
      java      92845      501  117     PIPE 0xd63c81fc1ed61cfb       16384          ->0xd63c81fc1edb140b
      java      92845      501  118u  KQUEUE                                         count=0, state=0x2
      java      92845      501  119     PIPE 0xd63c81fc1eba61fb       16384          ->0xd63c81fc1eba727b
      java      92845      501  120     PIPE 0xd63c81fc1eba727b       16384          ->0xd63c81fc1eba61fb
      java      92845      501  121u  KQUEUE                                         count=0, state=0x2
      java      92845      501  122     PIPE 0xd63c81fc163b474b       16384          ->0xd63c81fc1ed61a3b
      java      92845      501  123     PIPE 0xd63c81fc1ed61a3b       16384          ->0xd63c81fc163b474b
      java      92845      501  124u  KQUEUE                                         count=0, state=0x2
      java      92845      501  125     PIPE 0xd63c81fc1ed68e3b       16384          ->0xd63c81fc1d530bfb
      java      92845      501  126     PIPE 0xd63c81fc1d530bfb       16384          ->0xd63c81fc1ed68e3b
      java      92845      501  127u  KQUEUE                                         count=0, state=0x2
      java      92845      501  128     PIPE 0xd63c81fc1d4848ab       16384          ->0xd63c81fc0aac1b4b
      java      92845      501  129     PIPE 0xd63c81fc0aac1b4b       16384          ->0xd63c81fc1d4848ab
      java      92845      501  130u  KQUEUE                                         count=0, state=0x2
      java      92845      501  131     PIPE 0xd63c81fc1fe3d74b       16384          ->0xd63c81fc0aac125b
      java      92845      501  132     PIPE 0xd63c81fc0aac125b       16384          ->0xd63c81fc1fe3d74b
      java      92845      501  133u  KQUEUE                                         count=0, state=0x2
      ...
      
      1. TAJO-467.patch
        4 kB
        Hyoungjun Kim

        Activity

        hjkim Hyoungjun Kim added a comment -

        The code that uses the RPC client looks like the following.

        NettyClientBase tmClient = null;
        try {
          tmClient = new BlockingRpcClient(...);
        } finally {
          if(tmClient != null) {
            tmClient.close();
          }
        }
        

        If creating the NettyClientBase instance fails inside the try block, tmClient is still null in the finally block, so NettyClientBase's close() method is never called.
        NettyClientBase allocates some resources in its constructor, so those resources leak on every failed attempt. I added code to release them when an exception occurs in the constructor.
        Please review this patch.
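
The leak and the fix described above can be illustrated with a minimal, self-contained sketch. It is not the code from TAJO-467.patch and does not use Netty: FakeResource, SketchClient, and the masterReachable flag are hypothetical stand-ins, assuming only that the client acquires a resource in its constructor before the connection attempt can fail.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

public class ConstructorCleanupSketch {
    // Counts resources currently held; stands in for open file descriptors.
    static final AtomicInteger OPEN = new AtomicInteger();

    // Hypothetical resource acquired by the client constructor.
    static class FakeResource implements Closeable {
        FakeResource() { OPEN.incrementAndGet(); }
        @Override public void close() { OPEN.decrementAndGet(); }
    }

    // Hypothetical client; not the real NettyClientBase.
    static class SketchClient implements Closeable {
        private final FakeResource resource;

        SketchClient(boolean masterReachable) throws IOException {
            resource = new FakeResource();   // allocated before connecting
            try {
                connect(masterReachable);
            } catch (IOException | RuntimeException e) {
                // The caller never receives a reference to this object,
                // so the constructor must release what it allocated.
                resource.close();
                throw e;
            }
        }

        private static void connect(boolean masterReachable) throws IOException {
            if (!masterReachable) {
                throw new IOException("connection refused (simulated)");
            }
        }

        @Override public void close() { resource.close(); }
    }

    // Repeats the caller pattern from the comment above against an
    // unreachable master and returns how many resources remain open.
    static int leakedAfterFailedAttempts(int attempts) {
        for (int i = 0; i < attempts; i++) {
            SketchClient client = null;
            try {
                client = new SketchClient(false);
            } catch (IOException expected) {
                // client is still null here, so finally cannot close it
            } finally {
                if (client != null) {
                    client.close();
                }
            }
        }
        return OPEN.get();
    }

    public static void main(String[] args) {
        System.out.println("open resources after 1000 failed attempts: "
                + leakedAfterFailedAttempts(1000));
    }
}
```

With the catch block in the constructor, the count stays at zero after repeated failures. Removing that catch block makes the count grow by one per failed attempt, which matches the symptom reported in the description.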

        hyunsik Hyunsik Choi added a comment -

        +1
        The issue is reasonable, and the fix is straightforward.

        hyunsik Hyunsik Choi added a comment -

        committed it to master. Thanks!

        hudson Hudson added a comment -

        ABORTED: Integrated in Tajo-trunk-postcommit #653 (See https://builds.apache.org/job/Tajo-trunk-postcommit/653/)
        TAJO-467: Too many open FD when master failed. (hyoungjunkim via hyunsik) (hyunsik: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=33606c80268e59693412aa138f27c927b770d43a)

        • CHANGES.txt
        • tajo-rpc/src/main/java/org/apache/tajo/rpc/NettyClientBase.java

          People

          • Assignee:
            hjkim Hyoungjun Kim
            Reporter:
            hjkim Hyoungjun Kim