Hive / HIVE-3335

Thousands of CLOSE_WAIT sockets when using SymbolicInputFormat

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.8.1
    • Fix Version/s: None
    • Component/s: Clients
    • Labels:
      None
    • Environment:

      Description

      Procedure for reproduction:
      1. Set up Hadoop.
      2. Prepare a data file and link.txt:
      data:
      $ hadoop fs -cat /path/to/data/2012-07-01/20120701.csv
      1, 20120701 00:00:00
      2, 20120701 00:00:01
      3, 20120701 01:12:45
      link.txt
      $ cat link.txt
      /path/to/data/2012-07-01//*

      3. In Hive, create a table like the one below:
      CREATE TABLE user_logs(id INT, created_at STRING)
      row format delimited fields terminated by ',' lines terminated by '\n'
      stored as inputformat 'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
      outputformat 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';

      4. Put link.txt into /user/hive/warehouse/user_logs:
      $ sudo -u hdfs hadoop fs -put link.txt /user/hive/warehouse/user_logs

      5. Open another session (session A) and check the sockets:
      $ netstat -a | grep CLOSE_WAIT
      tcp 1 0 localhost:48121 localhost:50010 CLOSE_WAIT
      tcp 1 0 localhost:48124 localhost:50010 CLOSE_WAIT
      $

      6. Return to the Hive session and run:
      hive> select * from user_logs;

      7. Return to session A and check the sockets again:
      $ netstat -a | grep CLOSE_WAIT
      tcp 1 0 localhost:48121 localhost:50010 CLOSE_WAIT
      tcp 1 0 localhost:48124 localhost:50010 CLOSE_WAIT
      tcp 1 0 localhost:48166 localhost:50010 CLOSE_WAIT

      If the table has partitions, each query leaves as many unclosed
      sockets as there are partitions.
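
      To compare the socket counts before and after the query, the netstat output can be tallied per remote endpoint. The following is a minimal sketch, assuming each connection is printed on one line with the usual six columns (proto, recv-q, send-q, local address, foreign address, state). The class and method names are hypothetical and only serve this illustration.

      ```java
      import java.util.Map;
      import java.util.TreeMap;

      // Hypothetical helper for this report; not part of Hive or Hadoop.
      class CloseWaitCounterSketch {

          // Counts CLOSE_WAIT connections per foreign address in `netstat -a` text.
          static Map<String, Integer> countCloseWait(String netstatOutput) {
              Map<String, Integer> counts = new TreeMap<String, Integer>();
              for (String line : netstatOutput.split("\n")) {
                  String[] cols = line.trim().split("\\s+");
                  // Assumed columns: proto, recv-q, send-q, local, foreign, state
                  if (cols.length >= 6 && "CLOSE_WAIT".equals(cols[5])) {
                      Integer old = counts.get(cols[4]);
                      counts.put(cols[4], old == null ? 1 : old + 1);
                  }
              }
              return counts;
          }
      }
      ```

      Applied to the second netstat listing in this report, the map would hold a count of 3 for localhost:50010, one more than before the query.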

      I think the problem is caused here:
      In https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java,
      at line 66, a BufferedReader is opened but never closed.
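
      The suspected leak and one possible fix can be sketched as follows. This is a minimal, self-contained illustration and not the actual Hive code: the class SymlinkReaderSketch, its methods, and the TrackingReader helper are hypothetical names, and any java.io.Reader stands in for the HDFS stream that the real code opens.

      ```java
      import java.io.BufferedReader;
      import java.io.FilterReader;
      import java.io.IOException;
      import java.io.Reader;
      import java.util.ArrayList;
      import java.util.List;

      // Hypothetical illustration; names are not taken from Hive.
      class SymlinkReaderSketch {

          // Records whether close() was called; stands in for a stream
          // that holds a socket to a DataNode.
          static class TrackingReader extends FilterReader {
              boolean closed = false;

              TrackingReader(Reader in) {
                  super(in);
              }

              @Override
              public void close() throws IOException {
                  closed = true;
                  super.close();
              }
          }

          // Leaky pattern: the BufferedReader is never closed.
          static List<String> readTargetsLeaky(Reader source) throws IOException {
              BufferedReader reader = new BufferedReader(source);
              List<String> targets = new ArrayList<String>();
              String line;
              while ((line = reader.readLine()) != null) {
                  targets.add(line);
              }
              return targets;
          }

          // Fixed pattern: the reader is closed in a finally block,
          // so it is released even if readLine() throws.
          static List<String> readTargetsClosed(Reader source) throws IOException {
              BufferedReader reader = null;
              try {
                  reader = new BufferedReader(source);
                  List<String> targets = new ArrayList<String>();
                  String line;
                  while ((line = reader.readLine()) != null) {
                      targets.add(line);
                  }
                  return targets;
              } finally {
                  if (reader != null) {
                      reader.close();
                  }
              }
          }
      }
      ```

      Closing the BufferedReader also closes the reader it wraps, which is what would release the underlying connection in the real code.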

        Activity

        Ashutosh Chauhan added a comment -

        Show You, your analysis seems correct. Mind submitting a patch for it?

        Show You added a comment -

        A patch for this issue

        Show You added a comment -

        I'm sorry for the late reply. I have attached a patch for this issue.

        Ashutosh Chauhan added a comment -

        Couple of comments:

        • In the finally block, instead of reader.close(), it is better to call org.apache.hadoop.io.IOUtils.closeStream(reader), since reader could be null or close() could throw an IOException. IOUtils handles both of those cases.
        • The same problem exists in the unit test code for this class, where reader.close() is never invoked, resulting in a socket leak. Can you add reader.close() in both testAccuracy1() and testAccuracy2()? I don't think we need a full try-catch block in the test cases, since there we want to start unwinding the stack as soon as an exception occurs.
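
        The null-safe, exception-swallowing close suggested above can be sketched without a Hadoop dependency. The helper below is a local stand-in that only mimics the behavior described for org.apache.hadoop.io.IOUtils.closeStream; the class QuietCloseSketch, the method closeQuietly, and FailingCloseable are hypothetical names for this illustration. With Hadoop on the classpath, one would presumably call IOUtils.closeStream(reader) in the finally block instead.

        ```java
        import java.io.Closeable;
        import java.io.IOException;

        // Hypothetical stand-in; not the Hadoop implementation.
        class QuietCloseSketch {

            // Ignores null and swallows any IOException thrown by close(),
            // so it is safe to call from a finally block.
            static void closeQuietly(Closeable stream) {
                if (stream == null) {
                    return;
                }
                try {
                    stream.close();
                } catch (IOException e) {
                    // Deliberately ignored: a failure to close should not
                    // mask an exception from the try block.
                }
            }

            // A Closeable whose close() always fails, used to exercise the helper.
            static class FailingCloseable implements Closeable {
                boolean closeAttempted = false;

                @Override
                public void close() throws IOException {
                    closeAttempted = true;
                    throw new IOException("simulated failure in close()");
                }
            }
        }
        ```

        The trade-off is that errors from close() are silently dropped, which is usually acceptable for a reader but may hide problems when closing a writer.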

          People

          • Assignee: Unassigned
          • Reporter: Show You
          • Votes: 0
          • Watchers: 4
