Jackrabbit Content Repository / JCR-2596

multiple instances of jackrabbit-standalone cause "file backing binary value not found" from org.apache.jackrabbit.util.TransientFileFactory

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.0
    • Fix Version/s: None
    • Component/s: jackrabbit-jcr-commons
    • Labels:
      None
    • Environment:
      any

      Description

      Running 2 or more instances of jackrabbit-standalone causes one instance to delete files (when they are garbage collected) in the temporary folder used by another standalone instance.

      To reproduce, run 2 or more instances, create files in each, then stop one of them and attempt to read cached files from the other. The one that stopped will garbage collect files used by the other. This may be hard to reproduce, as a collision on file names doesn't seem to be guaranteed. The problem "went away" when I forced each instance to use a different temporary folder, but this is not a permanent solution.
      Ex:
      java -Dhostname=standalonejcr -Djava.io.tmpdir=/tmp1 -Xmn100M -Xms500M -Xmx500M -jar jackrabbit-standalone-2.0.0.jar -p 8000 -r jcr-repository
      java -Dhostname=standalonejcr -Djava.io.tmpdir=/tmp2 -Xmn100M -Xms500M -Xmx500M -jar jackrabbit-standalone-2.0.0.jar -p 8001 -r jcr-repository

      Original Emails: (to jackrabbit dev mailing list)
      >>>>>>>>>>>
      Wed, Apr 7, 2010 at 11:21 AM
      subject clustered environment, 2 different jvms, TransientFileFactory, storing file blobs in db

      Hello,

      I would normally file a bug on jira, but it's very difficult to set up/reproduce, so I'm looking for insight first on how temp files/blobs are implemented in jackrabbit.

      We currently run 2 different "standalone" instances of jackrabbit version 2.0.0, each in their own JVM and set up the same way using <cluster>.

      Our application connects to one of the standalone instances remotely (webdavex) for authoring content, and sends publish instructions (via JMS/activemq) to the other.

      The problem though, is that BLOBInTempFile.getStream is occasionally throwing: "file backing binary value not found", and one of the instances sometimes can't read the file.

      I've searched and found this information:
      http://mail-archives.apache.org/mod_mbox//jackrabbit-dev/200603.mbox/<90a8d1c00603150237t4c81df4fx178fcd726a93fe@mail.gmail.com>

      So apparently, when files are read/written, you create a temporary cache, but TransientFileFactory runs as a singleton within a single JVM, correct?

      So can I assume that one of the "singletons", (there will be 2??) will delete files that were created by the other at some DIFFERENT random time when the garbage collector runs?

      I've also attached your Repository.xml that we use for both (with different cluster ids of course)

      Adrien

      Thanks
      Is there some way to avoid this??

      I've attached our repository.xml for you to look at, both are setup the same way for e

      Thanks.

      >>>>>>>>>>>>
      from Stefan Guggisberg
      reply-to dev@jackrabbit.apache.org
      to dev@jackrabbit.apache.org
      date Thu, Apr 8, 2010 at 12:59 AM
      subject Re: clustered environment, 2 different jvms, TransientFileFactory, storing file blobs in db
      On Wed, Apr 7, 2010 at 8:21 PM, Adrien Lamoureux wrote:
      > Hello,
      > I would normally file a bug on jira, but it's very difficult to
      > set up/reproduce, so I'm looking for insight first on how temp files/blobs
      > are implemented in jackrabbit.
      > We currently run 2 different "standalone" instances of jackrabbit version
      > 2.0.0, each in their own JVM and set up the same way using <cluster>.
      > Our application connects to one of the standalone instances
      > remotely (webdavex) for authoring content, and sends publish instructions
      > (via JMS/activemq) to the other.
      > The problem though, is that BLOBInTempFile.getStream is occasionally throwing:
      > "file backing binary value not found", and one of the instances sometimes
      > can't read the file.
      > I've searched and found this information:
      > http://mail-archives.apache.org/mod_mbox//jackrabbit-dev/200603.mbox/<90a8d1c00603150237t4c81df4fx178fcd726a93fe@mail.gmail.com>
      > So apparently, when files are read/written, you create a temporary cache,
      > but TransientFileFactory runs as a singleton within a single JVM correct?

      yes

      > So can I assume that one of the "singletons", (there will be 2??) will
      > delete files that were created by the other at some DIFFERENT random time
      > when the garbage collector runs?

      no, unless java.io.File#createTempFile invoked from 2 different jvm's
      would create colliding temp files. but that's impossible according to
      the javadoc [0]:

      <quote>
      [...] is guaranteed that:
      1. The file denoted by the returned abstract pathname did not exist
      before this method was invoked
      [...]
      </quote>

      cheers
      stefan

      [0] http://java.sun.com/javase/7/docs/api/java/io/File.html#createTempFile(java.lang.String, java.lang.String, java.io.File)

      >>>>>>>>>>>>
      from Thomas Müller
      reply-to dev@jackrabbit.apache.org
      to dev@jackrabbit.apache.org
      date Thu, Apr 8, 2010 at 1:52 AM
      subject Re: clustered environment, 2 different jvms, TransientFileFactory, storing file blobs in db

      Hi,

      Stefan is right, File.createTempFile() doesn't generate colliding
      files. However, there is a potential problem with the
      TransientFileFactory. Consider the following case:

      • The file "bin-1.tmp" is created (BLOBInTempFile line 51).
      • The TransientFileFactory adds a PhantomReference "A" in its queue.
      • BLOBInTempFile.delete() or dispose() is called, the file "bin-1.tmp"
        is deleted.
      • A new file is created, also called "bin-1.tmp"
        (BLOBInTempFile line 51)
        (that's possible because File.createTempFile can re-use file names).
      • The TransientFileFactory adds a second PhantomReference "B" in its
        queue, pointing to a different file with the same name.
      • The first (only the first) BLOBInTempFile is no longer referenced.
      • The TransientFileFactory.ReaperThread gets PhantomReference "A" and
        deletes this file. But the file is still used and referenced ("B").
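
The sequence above can be replayed deterministically. The following is a minimal Java sketch, not Jackrabbit code: the class StaleReaperDemo and its method are hypothetical, the file name reuse is forced by hand rather than obtained from File.createTempFile, and the garbage collection step is replaced by an explicit delete-by-path, which is assumed here to be what the reaper does with a stale reference.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical demo class; it only imitates the steps listed above.
final class StaleReaperDemo {

    /** Replays the sequence and returns true if the second owner's file survived. */
    static boolean secondFileSurvives() throws IOException {
        File dir = Files.createTempDirectory("reaper-demo").toFile();

        // First owner creates "bin-1.tmp"; the reaper remembers its path (reference "A").
        File first = new File(dir, "bin-1.tmp");
        Files.write(first.toPath(), new byte[] {1});
        String pathRememberedForA = first.getPath();

        // First owner calls delete()/dispose(): the file is removed right away,
        // but reference "A" is still pending in the reaper.
        Files.delete(first.toPath());

        // Second owner gets a file with the same name (reference "B").
        // Assumption: name reuse is forced here for the sake of the demo.
        File second = new File(dir, "bin-1.tmp");
        Files.write(second.toPath(), new byte[] {2});

        // The reaper finally processes stale reference "A" and deletes by path.
        new File(pathRememberedForA).delete();

        boolean survived = second.exists();

        // Clean up whatever is left.
        second.delete();
        dir.delete();
        return survived;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("second file survived: " + secondFileSurvives());
    }
}
```

Running main prints "second file survived: false", because a delete keyed only on the path cannot tell the first "bin-1.tmp" from the second.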

      I'm not sure if this is what is happening in your case, but it is a
      potential problem.

      Could you log a bug?

      There are multiple ways to solve the problem. I think the best
      solution is to not use File.createTempFile() and instead use our own
      file name factory (with a random part and a counter part).
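
A file name factory along these lines could look roughly like the sketch below. It is only an illustration of the idea, not a proposed patch: the class UniqueTempNames, its method names, and the name layout (prefix, random part, counter, suffix) are all assumptions.

```java
import java.io.File;
import java.io.IOException;
import java.security.SecureRandom;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper; not part of Jackrabbit.
final class UniqueTempNames {

    // Random part: chosen once per JVM, so two JVMs sharing a temp
    // directory are very unlikely to produce the same names.
    private static final String INSTANCE_ID =
            Long.toHexString(new SecureRandom().nextLong());

    // Counter part: never repeats within this JVM, so a name is not
    // handed out again after the earlier file was deleted.
    private static final AtomicLong COUNTER = new AtomicLong();

    static String next(String prefix, String suffix) {
        return prefix + INSTANCE_ID + "-" + COUNTER.incrementAndGet() + suffix;
    }

    static File create(String prefix, String suffix, File dir) throws IOException {
        while (true) {
            File candidate = new File(dir, next(prefix, suffix));
            // createNewFile() is atomic and returns false if the name is taken.
            if (candidate.createNewFile()) {
                return candidate;
            }
        }
    }
}
```

With names that are never reused inside one JVM, a stale reference can at worst try to delete a file that no longer exists.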

      Regards,
      Thomas

        Issue Links

          Activity

          Adrien Lamoureux created issue -
          Stefan Guggisberg made changes -
          Link: This issue is duplicated by JCR-2609
          Titu Petrea added a comment -

          Can a phantom reachable File instance be deleted by ReaperThread even though it is not deleted/disposed ?

          Stefan Guggisberg added a comment -

          > Can a phantom reachable File instance be deleted by ReaperThread even though it is not deleted/disposed ?

          yes, that's the very purpose of TransientFileFactory.createTransientFile:

          <quote>
          Same as File.createTempFile(String, String, File) except that the newly-created file will be automatically deleted once the returned File object has been gc'ed.
          </quote>
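
For readers unfamiliar with the mechanism, the behaviour described in the quote can be sketched with a PhantomReference and a ReferenceQueue. This is a simplified illustration, not the Jackrabbit source: the class TransientFiles is hypothetical, and it polls the queue on demand in reapPending() where the real class is described in this thread as using a ReaperThread.

```java
import java.io.File;
import java.io.IOException;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a "delete once the File object is gc'ed" factory.
final class TransientFiles {

    private static final ReferenceQueue<File> QUEUE = new ReferenceQueue<>();

    // Keeps the reference objects themselves reachable until they are processed.
    private static final Set<Tracker> TRACKERS = ConcurrentHashMap.newKeySet();

    private static final class Tracker extends PhantomReference<File> {
        // The File is gone by the time the reference is enqueued,
        // so only its path can be remembered.
        final String path;

        Tracker(File file) {
            super(file, QUEUE);
            this.path = file.getPath();
        }
    }

    static File create(String prefix, String suffix, File dir) throws IOException {
        File file = File.createTempFile(prefix, suffix, dir);
        TRACKERS.add(new Tracker(file));
        return file;
    }

    /** Deletes files whose File objects were collected; returns how many references were processed. */
    static int reapPending() {
        int processed = 0;
        Reference<? extends File> ref;
        while ((ref = QUEUE.poll()) != null) {
            Tracker tracker = (Tracker) ref;
            TRACKERS.remove(tracker);
            new File(tracker.path).delete(); // deletion is keyed on the path only
            processed++;
        }
        return processed;
    }
}
```

As long as the caller still holds the returned File, its reference is never enqueued and the file stays on disk. Once the File object becomes unreachable, the next reapPending() call after a garbage collection removes the file.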

          Jukka Zitting added a comment -

          Which JRE are you using? The File.createTempFile() method used by the TransientFileFactory class never returns a file that already existed, so it should be impossible for the method to create and return the same file in two different JVMs.

          Titu Petrea made changes -
          Comment [
          - The file "bin-1.tmp" is created (BLOBInTempFile line 51).
          - The TransientFileFactory adds a PhantomReference "A" in its queue.
          - BLOBInTempFile.delete() or dispose() is called, the file "bin-1.tmp"
            is deleted.
          - A new file is created, also called "bin-1.tmp"
            (BLOBInTempFile line 51)
            (that's possible because File.createTempFile can re-use file names).
          - The TransientFileFactory adds a second PhantomReference "B" in its
            queue, pointing to a different file with the same name.
          - The first (only the first) BLOBInTempFile is no longer referenced.
          - The TransientFileFactory.ReaperThread gets PhantomReference "A" and
            deletes this file. But the file is still used and referenced ("B").

          From http://download.oracle.com/javase/6/docs/api/
          "Neither this method nor any of its variants will return the same abstract pathname again in the current invocation of the virtual machine."
          ]
          Joshua Hyde added a comment - - edited

          I came here via JCR-2609, and we're seeing that issue on the IBM WAS JRE:

          [{me} bin]# ./java -version
          java version "1.6.0"
          Java(TM) SE Runtime Environment (build pxa6460sr7ifix-20100220_02(SR7+IZ69890+IZ70326))
          IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux amd64-64 jvmxa6460sr7-20100219_54097 (JIT enabled, AOT enabled)
          J9VM - 20100219_054097
          JIT - r9_20091123_13891
          GC - 20100216_AA)
          JCL - 20091202_01

          [ Edit: for sake of clarity, the affected application isn't JBoss CMS, as described in JCR-2609, but an application of our own making sitting on top of JackRabbit, but the stacktraces look, line-for-line, the same, and we do have multiple JackRabbit instances running on the same physical box. ]

          Yuri Sarbaev added a comment -

          I have the same issue with only one cms instance.

          java -version
          java version "1.6.0_26"
          Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
          Java HotSpot(TM) Client VM (build 20.1-b02, mixed mode, sharing)

          Titu Petrea added a comment -

          Is it possible that a clean-up job is deleting files in the tmp folder?

          On 09.11.2011, at 09:23, "Yuri Sarbaev (Commented) (JIRA)"

          Yuri Sarbaev added a comment -

          Yes, you are right.
          Debian deletes files from the /tmp dir automatically.
          We've just set TMPTIME=-1 in /etc/default/rcS and now it works fine.


            People

            • Assignee: Unassigned
            • Reporter: Adrien Lamoureux
            • Votes: 3
            • Watchers: 4
