Hive / HIVE-3335

Thousands of CLOSE_WAIT sockets when using SymbolicInputFormat

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.8.1
    • Fix Version/s: None
    • Component/s: Clients
    • Labels: None
    • Environment:

      Description

      Steps to reproduce:
      1. Set up Hadoop.
      2. Prepare a data file and link.txt:
      data:
      $ hadoop fs -cat /path/to/data/2012-07-01/20120701.csv
      1, 20120701 00:00:00
      2, 20120701 00:00:01
      3, 20120701 01:12:45
      link.txt:
      $ cat link.txt
      /path/to/data/2012-07-01//*

      3. In Hive, create a table as below:
      CREATE TABLE user_logs(id INT, created_at STRING)
      row format delimited fields terminated by ',' lines terminated by '\n'
      stored as inputformat 'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
      outputformat 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';

      4. Put link.txt into /user/hive/warehouse/user_logs:
      $ sudo -u hdfs hadoop fs -put link.txt /user/hive/warehouse/user_logs

      5. Open another session (session A) and watch the sockets:
      $ netstat -a | grep CLOSE_WAIT
      tcp 1 0 localhost:48121 localhost:50010
      CLOSE_WAIT
      tcp 1 0 localhost:48124 localhost:50010
      CLOSE_WAIT
      $

      6. Return to the Hive session and run:
      $ select * from user_logs;

      7. Return to session A and watch the sockets again:
      $ netstat -a | grep CLOSE_WAIT
      tcp 1 0 localhost:48121 localhost:50010
      CLOSE_WAIT
      tcp 1 0 localhost:48124 localhost:50010
      CLOSE_WAIT
      tcp 1 0 localhost:48166 localhost:50010
      CLOSE_WAIT

      If the table has partitions, each query leaks one unclosed socket per
      partition.

      I think this problem is caused by the following: at
      https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java,
      line 66, a BufferedReader is opened but never closed.
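The fix amounts to closing the reader in a finally block. A minimal sketch in plain Java of the leak-free read pattern (this is not the actual Hive code; the class and method names here are illustrative, and a local file stands in for the HDFS symlink file):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class SymlinkReadSketch {
    // Reads every line of a symlink file, guaranteeing the reader is
    // closed even if an IOException is thrown mid-read.
    static int countTargets(String symlinkFile) throws IOException {
        BufferedReader reader = null;
        int count = 0;
        try {
            reader = new BufferedReader(new FileReader(symlinkFile));
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    count++; // each non-empty line names one target path
                }
            }
        } finally {
            if (reader != null) {
                reader.close(); // without this, the underlying fd/socket leaks
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Write a small link file and count its entries.
        java.io.File f = java.io.File.createTempFile("link", ".txt");
        try (FileWriter w = new FileWriter(f)) {
            w.write("/path/to/data/2012-07-01/*\n");
        }
        System.out.println(countTargets(f.getAbsolutePath()));
        f.delete();
    }
}
```

In the Hive case the stream underneath the reader is an HDFS connection to a DataNode, which is why each unclosed reader shows up as a CLOSE_WAIT socket to port 50010.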

        Issue Links

          Activity

          Ashutosh Chauhan added a comment -

          Yuki Yoi Your analysis seems correct. Mind submitting a patch for it?
          Yuki Yoi added a comment -

          A patch for this issue.
          Yuki Yoi added a comment -

          Sorry for the delay. I've attached a patch for this issue.
          Ashutosh Chauhan added a comment -

          A couple of comments:

          • In the finally block, instead of reader.close(), it is better to call org.apache.hadoop.io.IOUtils.closeStream(reader), since reader could be null or could throw IOException in close(). IOUtils handles both of those cases.
          • The same problem exists in the unit test code for this class, where reader.close() is never invoked, resulting in a socket leak. Can you add reader.close() to both tests, testAccuracy1() and testAccuracy2()? I don't think we need a full try-catch block in the test cases, since there, as soon as an exception occurs, we want to start unwinding the stack.
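The behavior the reviewer is asking for can be sketched like this; the method below is a plain-Java stand-in for org.apache.hadoop.io.IOUtils.closeStream (null-safe, and swallows IOException from close()), not Hadoop's actual implementation:

```java
import java.io.Closeable;
import java.io.IOException;

public class CloseStreamSketch {
    // A stand-in for org.apache.hadoop.io.IOUtils.closeStream: safe to
    // call with null, and an IOException thrown by close() is ignored
    // rather than masking an exception already propagating from the
    // caller's try block.
    static void closeStream(Closeable stream) {
        if (stream != null) {
            try {
                stream.close();
            } catch (IOException ignored) {
                // deliberately swallowed: a failure to close should not
                // hide the original error from the read path
            }
        }
    }

    public static void main(String[] args) {
        closeStream(null); // safe: no NullPointerException
        closeStream(new Closeable() {
            public void close() throws IOException {
                throw new IOException("boom");
            }
        }); // safe: the IOException is swallowed
        System.out.println("ok");
    }
}
```

This is why a bare reader.close() in a finally block is not quite enough in production code: it can throw NullPointerException if construction failed, or throw IOException and replace the exception that was already unwinding.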
          Harsh J added a comment -

          This was fixed via HIVE-3480.

            People

            • Assignee: Unassigned
            • Reporter: Yuki Yoi
            • Votes: 0
            • Watchers: 6
