Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10
    • Fix Version/s: 0.10
    • Component/s: parser
    • Labels:
      None
    • Environment:

      Tested on OS X and Debian Linux

      Description

      We have a thread which parses more than 200k files, and we always get a "too many open files" error from the operating system. Using lsof, I noticed that the apache-tika temp files (created by the TemporaryFiles class) are not really deleted by the operating system, even if the delete method returns true.
      Searching the code, I found that the problem (which does not manifest with all files) is probably in the TikaInputStream#close method. There, openContainer is set to null, but when openContainer is an instance of org.apache.poi.poifs.filesystem.NPOIFSFileSystem, the problem disappears if I call close() on openContainer. I modified the NPOIFSFileSystem class to implement java.io.Closeable, and modified the TikaInputStream#close method to do the following:

      if (openContainer instanceof java.io.Closeable) {
          ((java.io.Closeable) openContainer).close();
      }
      openContainer = null;

      I don't know if this is the best solution, but it seems to solve the problem for me.
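The change described above can be sketched in isolation as follows. This is only an illustration under stated assumptions, not the actual Tika or POI code: `FakeContainer` and `ContainerHolder` are hypothetical stand-ins for NPOIFSFileSystem and TikaInputStream, and the try/finally is an addition that is not part of the snippet in the report. It shows the container being closed, when it is Closeable, before the reference is dropped.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical stand-in for a container (such as NPOIFSFileSystem) that holds a file handle. */
class FakeContainer implements Closeable {
    private boolean closed = false;

    @Override
    public void close() {
        // A real container would release its underlying file descriptor here.
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

/** Hypothetical holder that mimics the openContainer handling described in the report. */
class ContainerHolder implements Closeable {
    private Object openContainer;

    void setOpenContainer(Object container) {
        this.openContainer = container;
    }

    Object getOpenContainer() {
        return openContainer;
    }

    @Override
    public void close() throws IOException {
        try {
            // Close the container first if it supports it; setting the field
            // to null alone does not release any resources the container holds.
            if (openContainer instanceof Closeable) {
                ((Closeable) openContainer).close();
            }
        } finally {
            // Assumption beyond the report: drop the reference even if close() throws.
            openContainer = null;
        }
    }
}
```

Containers that do not implement Closeable are simply dereferenced, which matches the behaviour before the change.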

        Activity

        Transition          Time In Source Status  Execution Times  Last Executer  Last Execution Date
        Open -> Resolved    17h 53m                1                Nick Burch     06/May/11 02:24
        Resolved -> Closed  167d 11h 9m            1                Jukka Zitting  20/Oct/11 13:34
        Jukka Zitting made changes -
        Field          Original Value  New Value
        Status         Resolved [ 5 ]  Closed [ 6 ]
        Nick Burch made changes -
        Field          Original Value  New Value
        Status         Open [ 1 ]      Resolved [ 5 ]
        Assignee                       Nick Burch [ gagravarr ]
        Fix Version/s                  1.0 [ 12313535 ]
        Resolution                     Fixed [ 1 ]
        Nick Burch added a comment -

        I've made NPOIFSFileSystem and OPCPackage Closeable in r1100013. That'll be in POI 3.8 beta 3.

        In r1100015 I've made TikaInputStream close the open container as you suggest, thanks for that. For now you'll need to use a nightly build (or your custom build) of POI to see the effect of that, but it'll kick in properly when 3.8 beta 3 is out.

        Enrico Donelli created issue -

          People

          • Assignee:
            Nick Burch
            Reporter:
            Enrico Donelli
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved:
