Kafka / KAFKA-3904

File descriptor leaking (Too many open files) for long running stream process


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • None
    • None
    • Component/s: streams

    Description

      I noticed that when my application runs for a long time (> 1 day), it fails with a 'Too many open files' error.

      I used 'lsof' to list all the file descriptors held by the process. There are over 32K of them, and most belong to .lock files; for example, the single lock file shown below appears about 2700 times.

      I looked at the code, and I think the problem is in:

      File lockFile = new File(stateDir, ProcessorStateManager.LOCK_FILE_NAME);
      FileChannel channel = new RandomAccessFile(lockFile, "rw").getChannel();

      Each time new RandomAccessFile is called, a new fd is created. Nothing closes it, so every lock attempt leaks one descriptor. We should probably either close or reuse this RandomAccessFile object.
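
      A minimal sketch of the "reuse" option, under stated assumptions. It is not the fix that went into Kafka (this issue was resolved as a duplicate). The class StateDirLocks and its methods tryLock, unlock and openChannelCount are hypothetical names, and the sketch swaps in FileChannel.open where the snippet above uses RandomAccessFile. The idea is to open the lock file at most once per state directory, retry tryLock on that same channel, and close the channel on unlock.

```java
import java.io.File;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Kafka: keeps one channel per state directory.
final class StateDirLocks {
    // Assumed to match ProcessorStateManager.LOCK_FILE_NAME (".lock" in the lsof output).
    static final String LOCK_FILE_NAME = ".lock";

    private final Map<File, FileChannel> channels = new HashMap<>();
    private final Map<File, FileLock> locks = new HashMap<>();

    // Returns true if this instance holds the lock on stateDir after the call.
    synchronized boolean tryLock(File stateDir) throws IOException {
        if (locks.containsKey(stateDir)) {
            return true; // already held, nothing new is opened
        }
        FileChannel channel = channels.get(stateDir);
        if (channel == null) {
            // Opened once and cached, so repeated attempts do not create new fds.
            channel = FileChannel.open(
                new File(stateDir, LOCK_FILE_NAME).toPath(),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
            channels.put(stateDir, channel);
        }
        FileLock lock;
        try {
            lock = channel.tryLock(); // null if another process holds the lock
        } catch (OverlappingFileLockException e) {
            lock = null; // held through another channel in this JVM
        }
        if (lock == null) {
            return false; // the cached channel stays open for the next attempt
        }
        locks.put(stateDir, lock);
        return true;
    }

    // Releases the lock (if held) and closes the cached channel, freeing its fd.
    synchronized void unlock(File stateDir) throws IOException {
        FileLock lock = locks.remove(stateDir);
        if (lock != null) {
            lock.release();
        }
        FileChannel channel = channels.remove(stateDir);
        if (channel != null) {
            channel.close();
        }
    }

    // Number of lock-file channels currently open through this instance.
    synchronized int openChannelCount() {
        return channels.size();
    }
}
```

      With this shape, calling tryLock(stateDir) in a retry loop holds at most one descriptor per directory no matter how many attempts fail. Reusing the channel is chosen over "open, try, close on every attempt" because the FileChannel documentation warns that on some systems closing any channel on a file releases all locks the JVM holds on that file, and it recommends using a single channel per file for locking within a program.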

      lsof result:

      java 14799 hcai *740u REG 9,0 0 2415928585 /mnt/stream/join/rocksdb/ads-demo-30/0_16/.lock

      java 14799 hcai *743u REG 9,0 0 2415928585 /mnt/stream/join/rocksdb/ads-demo-30/0_16/.lock

      java 14799 hcai *746u REG 9,0 0 2415928585 /mnt/stream/join/rocksdb/ads-demo-30/0_16/.lock

      java 14799 hcai *755u REG 9,0 0 2415928585 /mnt/stream/join/rocksdb/ads-demo-30/0_16/.lock

      hcai@teststream02001:~$ lsof -p 14799 | grep lock | grep 0_16 | wc

      2709 24381 319662

          People

            hcai@pinterest.com Henry Cai