Tika / TIKA-3441

Tika server stuck in a loop trying to bind

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.25
    • Fix Version/s: 2.0.0, 1.27
    • Component/s: docker, server
    • Labels: None

    Description

      Tika seems to be stuck in a loop trying to bind. Please see the attached logs.
      It is also using a lot of CPU while doing that, so in an environment with horizontal scaling it causes other containers to spin up.

      This seems to happen from time to time to one of the multiple containers running; normally everything runs smoothly.

      Not sure why it fails to bind in the first place (it's running in docker).

      I suspect that it may be related to the fact that it is trying to bind very quickly.

      Ideas on how to debug this issue further are welcome.

       

      Attachments

        1. bind_issue.txt
          621 kB
          Cristian Zamfir
        2. logs_before.txt
          12 kB
          Cristian Zamfir

        Activity

          tallison Tim Allison added a comment -

          Are you using 1.25 or 1.26? Is this fairly rare or easily reproducible?

          zamf Cristian Zamfir added a comment -

          Sorry, it is 1.25, I selected the wrong version above.

          It is not easy to reproduce, but on the other hand it is not super rare. I have seen it twice already in a period of about seven days, with 1-6 containers running (depending on load).
          tallison Tim Allison added a comment -

          Is there any logging before the infinite loops so we can see what might have brought it down initially?

          tallison Tim Allison added a comment -

          Correction to the previous comment: "initially" should read "most recently".

          tallison Tim Allison added a comment -

          Is there any chance something like this is happening in your environment: https://github.com/docker/for-linux/issues/1180

          tallison Tim Allison added a comment -

          Or, I’m wondering if the OS is not cleaning up the port quickly enough, as it sounds like you suggested? At the least, we should put a pause after a failed restart if the port is bound. Ideally, we’d log the process that was binding to the port.


          zamf Cristian Zamfir added a comment -

          I attached the logs just before. Not sure if it is a coincidence, but it does happen after a parse timeout hits.

          zamf Cristian Zamfir added a comment -

          I would say that it is not an issue like the one you mentioned with docker. The container is not restarted; instead, the Tika-spawned child is restarted due to the timeout.

          Actually, it is not clear why it is not just the child that restarts after a parse timeout, so port 9998 should still be in use by the parent, is that right?
          tallison Tim Allison added a comment -

          The port is used only by the child. The parent just monitors the subprocess.

          To confirm, the event that triggered this chain was a child timeout?
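
          For context, a minimal sketch of the forking setup described here, assuming the parent does nothing but launch the child JVM and relaunch it when it exits (illustrative only; the command-line flags below are placeholders, not the actual TikaServerWatchDog/TikaServerCli arguments):

          import java.util.concurrent.TimeUnit;

          public class WatchdogSketch {
              public static void main(String[] args) throws Exception {
                  while (true) {
                      // hypothetical child command line; the real arguments differ
                      Process child = new ProcessBuilder(
                              "java", "-cp", "tika-server.jar",
                              "org.apache.tika.server.TikaServerCli", "-child", "-p", "9998")
                              .inheritIO()
                              .start();
                      int exit = child.waitFor();    // the parent only monitors the child
                      System.err.println("child exited with " + exit + "; restarting");
                      TimeUnit.SECONDS.sleep(1);     // brief pause before respawning
                  }
              }
          }

          The port itself is opened by the child process, which is why a restart can hit a bind error if something is still holding the port.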

          tallison Tim Allison added a comment -

          Oops… sorry on phone and missed your earlier comment. That’ll help when I’m back to keyboard. Thank you!


          zamf Cristian Zamfir added a comment -

          A side question: if the parent process only monitors the child, I presume it is safe to assume that it does not need a lot of memory. Can I assume I only need 200MB for the parent?

          A pause after restart on bind issues looks like a good thing to have in any case. But the child restarts successfully (e.g., on OOM) without hitting this issue.

          For a docker setup I can solve it with a liveness check on the /status API endpoint, but it would be nice to also know more about the underlying issue causing this.
          tallison Tim Allison added a comment -

          Sorry, I see the original logs you attached, but that's where the problem starts, right?

          2021-06-09T11:43:22.472293991+02:00 ERROR Could not start Jetty server on port 9,998: Failed to bind to /0.0.0.0:9998
          

          Was there a timeout before: 2021-06-09T11:43:21.647771746+02:00?


          zamf Cristian Zamfir added a comment -

          I have now attached another file that somehow did not get attached before. The new logs show the first time the bind error happened.
          tallison Tim Allison added a comment -

          Very helpful. Thank you!

          tallison Tim Allison added a comment -

          I can't explain why the port was still in use. My guess is that the operating system hadn't yet released it.

          I did find a race condition in that the forked process writes "success" to the forking process before the server is successfully launched. I've fixed this in 1.x.

          I've added a check for a bind exception, and the forking process will now retry on a bind exception 5 times (waiting 1 second between tries).

          This part of the code is quite different in 2.x, but I'll review that to see if we need to change anything there.

          I don't know if you're able to pull a snapshot version of 1.27 and try it out. If you could, that'd be great.
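
          For illustration, a minimal sketch of the retry-on-bind behavior described above, with a plain ServerSocket standing in for the Jetty connector (an assumption for brevity; this is not the actual 1.x patch):

          import java.net.BindException;
          import java.net.InetSocketAddress;
          import java.net.ServerSocket;
          import java.util.concurrent.TimeUnit;

          public class BindRetrySketch {
              static ServerSocket bindWithRetry(int port, int maxTries) throws Exception {
                  for (int attempt = 1; attempt <= maxTries; attempt++) {
                      ServerSocket socket = new ServerSocket();
                      try {
                          socket.bind(new InetSocketAddress("0.0.0.0", port));
                          return socket;                 // bound successfully
                      } catch (BindException e) {
                          socket.close();
                          System.err.println("bind attempt " + attempt + " failed: " + e.getMessage());
                          TimeUnit.SECONDS.sleep(1);     // give the OS time to release the port
                      }
                  }
                  throw new BindException("could not bind to port " + port + " after " + maxTries + " attempts");
              }
          }

          The key points are the bounded retry count and the pause between attempts, so a port that is never released results in a clean failure instead of a tight infinite loop.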

          tallison Tim Allison added a comment -

          >Can I assume I only need 200MB for the parent?

          yes.

          tallison Tim Allison added a comment -

          ndipiazza_gmail, have you seen this? You're working with the 2.x branch, right?

          tallison Tim Allison added a comment -

          I'm wondering if the spawned/abandoned tesseract processes are keeping the port open?

          What may be happening is that your process had a TCP port open when it crashed or otherwise exited without explicitly closing it. Normally the OS cleans up these sorts of things, but only when the process record goes away. While the process may not appear to be running any more, there is at least one thing that can keep a record of it around, in order to prevent reuse of its PID. This is the existence of a child process that is not detached from the parent.
          

          https://serverfault.com/questions/181015/how-do-you-free-up-a-port-being-held-open-by-dead-process


          ndipiazza Nicholas DiPiazza added a comment -

          No, we have not seen this. We do not have a huge number of people using Tesseract, though.

          I have still been primarily working on Tika 1.x.

          I could try to reproduce.

          zamf, can you clarify some more what the exact steps to reproduce are? From the existing details in the ticket I'm having a hard time understanding what your exact steps are.
          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Tika » tika-branch1x-jdk8 #134 (See https://ci-builds.apache.org/job/Tika/job/tika-branch1x-jdk8/134/)
          TIKA-3441 – prevent infinite loop on failure to bind to a port (tallison: https://github.com/apache/tika/commit/fd98eeefc968b7e564d5446ae774f7f1f21aed93)

          • (edit) tika-server/src/main/java/org/apache/tika/server/TikaServerCli.java
          • (edit) tika-server/src/main/java/org/apache/tika/server/TikaServerWatchDog.java

          TIKA-3441 – improve likelihood that tesseract processes will be shutdown on crash. (tallison: https://github.com/apache/tika/commit/d7fa2cd284a0d400a1ef29f7111018bb16b1cc5d)

          • (edit) tika-parsers/src/main/java/org/apache/tika/parser/ocr/TesseractOCRParser.java

          zamf Cristian Zamfir added a comment -

          tallison - unfortunately there is no container built for 1.27 available on hub.docker.com, so it makes it harder to reproduce. I will upgrade to 1.26 though.

          tallison - I think the defunct tesseract processes could explain it. Since they were lingering there for a while, it looks like the solution is to explicitly kill them.

          ndipiazza

          • I am using the docker image apache/tika:latest-full (side note: this appears to be 1.25 instead of 1.26, but most likely the intention is to use 1.26 in latest).
          • The custom parameters (can also be seen in the environment field in Jira) are: -spawnChild TIKA_CHILD_JVM_OPTS=-JXmx3g -JXX:+ExitOnOutOfMemoryError -status
          • To reproduce, just feed it a lot of files. In this particular case the issue appeared after a few hours and after a parse timeout error. That timeout caused a restart of the Tika child, which then got the bind error.
          • Tesseract is enabled using the default settings in the apache docker image.
          • I mentioned some other environment factors like k8s horizontal scaling; you can scrap that, I am pretty sure that this is an issue with the OS keeping the port open, so the issue can manifest with a single Tika docker container.
          tallison Tim Allison added a comment -

          Let me see if I can reproduce locally. The changes I made should help some, and we can add a shutdown hook, but I don't think there's a way around an OS's OOM killer, for example. Still, we should do what we can.

          zamf Cristian Zamfir added a comment - - edited

          I caught another instance of this issue happening and killed this defunct process in the container:
             2498 ?        Z      0:00 [jspawnhelper] <defunct>
           and after that, the child process was able to get past the bind error.
           

          tallison Tim Allison added a comment - - edited

          On Ubuntu 20.04.2 LTS, with openjdk 1.8.0_292 and openjdk 14.0.2 (2020-07-14), and the 1.25 server, the tesseract processes are orphaned, but the restart is able to reclaim the port. Obviously, we want to 1) prevent orphaning tesseract to the best of our ability, and 2) absolutely prevent infinite loops when the port can't be bound; but I'm wondering if we should experiment with different OS images for our docker container so that orphaned processes don't prevent port binding.

          tallison Tim Allison added a comment -

          Nope...that's what we're already using in our image... <face_palm/>


          zamf Cristian Zamfir added a comment -

          It does not reproduce that easily. Among 6 containers, each having several tens of restarts per day, I only saw this issue 3 times.
          tallison Tim Allison added a comment -

          I've added shutdown hooks to TesseractOCRParser so that if the jvm shuts itself down, that should prevent orphaned tesseract processes.

          I was still able to orphan tesseract processes if I kill -9'd the forked process. I'm not sure how to handle that.

          I was not able to reproduce the infinite loop, but, as you said that should be rare.

          As mentioned above, I added code to prevent infinite loops on a failed restart: if there is a failure to bind within 10 seconds (should we make that configurable?), the tika-server won't restart.

          I added a waitFor(10 seconds) in the forking process before forcibly destroying the child process. This should increase the chances that the child process will terminate itself and thereby clean up orphanable tesseract processes.

          I noticed there was resource leakage in the child process's shutdown hooks. With each restart, another shutdown hook was added. The code was intended to remove the previous hook and then add a new one. That was not working, which means that there were 20464 thread objects in the shutdown hook map in the attached logs. This is working now.
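
          For illustration, a minimal sketch of the shutdown-hook approach described here, assuming spawned external processes (e.g. tesseract) are tracked in a map and destroyed when the JVM exits (an illustrative sketch, not the actual TesseractOCRParser/AbstractExternalProcessParser code):

          import java.util.Map;
          import java.util.concurrent.ConcurrentHashMap;
          import java.util.concurrent.atomic.AtomicLong;

          public class ExternalProcessCleanupSketch {
              private static final AtomicLong COUNTER = new AtomicLong();
              private static final Map<Long, Process> PROCESSES = new ConcurrentHashMap<>();

              static {
                  Runtime.getRuntime().addShutdownHook(new Thread(() -> {
                      // destroy anything still running when the JVM shuts down normally
                      for (Process p : PROCESSES.values()) {
                          if (p.isAlive()) {
                              p.destroyForcibly();
                          }
                      }
                  }));
              }

              /** Register a spawned process (e.g. a tesseract run) for cleanup at JVM shutdown. */
              static long track(Process p) {
                  long id = COUNTER.incrementAndGet();
                  PROCESSES.put(id, p);
                  return id;
              }

              /** Deregister once the external process has completed normally. */
              static void untrack(long id) {
                  PROCESSES.remove(id);
              }
          }

          A hook like this only runs on an orderly JVM exit; Runtime.halt() or -XX:+ExitOnOutOfMemoryError terminates the JVM without running shutdown hooks, which is why the next comment warns against that flag.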

          tallison Tim Allison added a comment -

          After the addition of the shutdown hooks in TesseractOCRParser (e.g. when you migrate to 1.27), you should NOT use -JXX:+ExitOnOutOfMemoryError, because this shuts down the JVM without calling the shutdown hooks, and tesseract will be orphaned.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Tika » tika-branch1x-jdk8 #135 (See https://ci-builds.apache.org/job/Tika/job/tika-branch1x-jdk8/135/)
          TIKA-3441 – improve likelihood that tesseract processes will be shutdown on jvm restart. (tallison: https://github.com/apache/tika/commit/1224f881a1dc54282281cdf2fc9ecf3f3e429393)

          • (edit) CHANGES.txt
          • (edit) tika-parsers/src/main/java/org/apache/tika/parser/ocr/TesseractOCRParser.java
          • (add) tika-core/src/main/java/org/apache/tika/parser/AbstractExternalProcessParser.java
          • (edit) tika-server/src/test/java/org/apache/tika/server/TikaServerIntegrationTest.java
          • (edit) tika-server/src/main/java/org/apache/tika/server/TikaServerWatchDog.java
          tallison Tim Allison added a comment -

          I made the changes in main, and the results look decent.

          tallison Tim Allison added a comment -

          With the exception of this gem with Java 14:

          WARN  [qtp1256893889-32] 12:27:19,572 org.eclipse.jetty.server.HttpChannel /rmeta
          java.lang.NoClassDefFoundError: Could not initialize class java.io.DeleteOnExitHook
          	at java.io.File.deleteOnExit(File.java:1094) ~[?:?]
          	at org.apache.cxf.helpers.FileUtils.createTmpDir(FileUtils.java:161) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.helpers.FileUtils.getDefaultTempDir(FileUtils.java:88) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.helpers.FileUtils.createTempFile(FileUtils.java:269) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.helpers.FileUtils.createTempFile(FileUtils.java:263) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.io.CachedOutputStream.createFileOutputStream(CachedOutputStream.java:475) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.io.CachedOutputStream.enforceLimits(CachedOutputStream.java:437) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.io.CachedOutputStream.write(CachedOutputStream.java:445) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.helpers.IOUtils.copy(IOUtils.java:193) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.helpers.IOUtils.copy(IOUtils.java:146) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.io.DelegatingInputStream.cacheInput(DelegatingInputStream.java:54) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.transport.http.AbstractHTTPDestination$1.cacheInput(AbstractHTTPDestination.java:321) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.transport.http.AbstractHTTPDestination.cacheInput(AbstractHTTPDestination.java:600) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.transport.http.AbstractHTTPDestination.flushHeaders(AbstractHTTPDestination.java:622) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.flushHeaders(JettyHTTPDestination.java:272) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.transport.http.AbstractHTTPDestination$WrappedOutputStream.close(AbstractHTTPDestination.java:784) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.transport.AbstractConduit.close(AbstractConduit.java:56) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.transport.http.AbstractHTTPDestination$BackChannelConduit.close(AbstractHTTPDestination.java:722) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.interceptor.MessageSenderInterceptor$MessageSenderEndingInterceptor.handleMessage(MessageSenderInterceptor.java:63) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.interceptor.OutgoingChainInterceptor.handleMessage(OutgoingChainInterceptor.java:90) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:265) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.server.Server.handle(Server.java:516) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:135) ~[tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882) [tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036) [tika-server-standard-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
          	at java.lang.Thread.run(Thread.java:832) [?:?]
          
          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #258 (See https://ci-builds.apache.org/job/Tika/job/tika-main-jdk8/258/)
          TIKA-3441 – increase chances that TesseractOCRParser will not orphan tesseract process and ensure that tika-server never enters an infinite loop on a bind exception. (tallison: https://github.com/apache/tika/commit/51fb33f3067bda551905056110bcf251d4bb3d23)

          • (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-ocr-module/src/main/java/org/apache/tika/parser/ocr/TesseractOCRParser.java
          • (edit) tika-server/tika-server-core/src/main/java/org/apache/tika/server/core/TikaServerWatchDog.java
          • (edit) tika-core/src/main/java/org/apache/tika/pipes/PipesClient.java
          • (add) tika-core/src/main/java/org/apache/tika/parser/AbstractExternalProcessParser.java
          • (edit) tika-server/tika-server-core/src/main/java/org/apache/tika/server/core/TikaServerProcess.java

          People

            Assignee: Unassigned
            Reporter: zamf Cristian Zamfir
            Votes: 0
            Watchers: 4

            Dates

              Created:
              Updated:
              Resolved: