  1. Tika
  2. TIKA-2974

Unable to extract recursive metadata using Tika REST server

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.22
    • Fix Version/s: 1.23
    • Component/s: server
    • Labels: None
    • Environment: Mac OS 10.14.5

    Description

      Steps to reproduce:

      1. Run the Tika 1.22 REST server:
      $ java -jar tika-server-1.22.jar
      2. Attempt to extract the recursive metadata from a file (sample attached) using a multipart form:
      $ curl -F --upload=@test.pdf http://localhost:9998/rmeta/form

      Expected Results:

      • The recursive metadata is output as JSON (this works using 1.21):
        $ curl -F --upload=@test.pdf http://localhost:9998/rmeta/form
        [{"Author":"Loren Siebert, DigitalGov Search Team","Content-Type":"application/pdf","Creation-Date":"2014-04-15T13:06:09Z","Last-Modified":"2014-04-15T13:06:09Z","Last-Save-Date":"2014-04-15T13:06:09Z" 

      Actual Results:

      • A 500 error is returned:
        $ curl -F --upload=@test.pdf http://localhost:9998/rmeta/form
        <html>
        <head>
        <meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
        <title>Error 500 Server Error</title>
        </head>
        <body><h2>HTTP ERROR 500</h2>
        <p>Problem accessing /rmeta/form. Reason:
        <pre>    Server Error</pre></p><hr><a href="http://eclipse.org/jetty">Powered by Jetty:// 9.4.z-SNAPSHOT</a><hr/></body>
        </html> 
      • tika server log:
        $ java -jar tika-server-1.22.jar
        Oct 23, 2019 3:26:10 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
        WARNING: J2KImageReader not loaded. JPEG2000 files will not be processed.
        See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
        for optional dependencies.
        Oct 23, 2019 3:26:10 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
        WARNING: org.xerial's sqlite-jdbc is not loaded.
        Please provide the jar on your classpath to parse sqlite files.
        See tika-parsers/pom.xml for the correct version.
        INFO  Starting Apache Tika 1.22 server
        INFO  Setting the server's publish address to be http://localhost:9998/
        INFO  Logging initialized @1175ms to org.eclipse.jetty.util.log.Slf4jLog
        INFO  jetty-9.4.z-SNAPSHOT; built: 2019-04-29T20:42:08.989Z; git: e1bc35120a6617ee3df052294e433f3a25ce7097; jvm 1.8.0_20-b26
        INFO  Started ServerConnector@475c9c31{HTTP/1.1,[http/1.1]}{localhost:9998}
        INFO  Started @1292ms
        WARN  Empty contextPath
        INFO  Started o.e.j.s.h.ContextHandler@79defdc{/,null,AVAILABLE}
        INFO  Started Apache Tika server at http://localhost:9998/
        INFO  rmeta/form (autodetecting type)
        INFO  Application {http://resource.server.tika.apache.org/}MetadataResource has thrown exception, unwinding now: java.io.IOException: Stream Closed
        WARN  Exception in handleFault on interceptor org.apache.cxf.jaxrs.interceptor.JAXRSDefaultFaultOutInterceptor@4ce1a41e
        org.apache.cxf.interceptor.Fault: Stream Closed
            at org.apache.cxf.service.invoker.AbstractInvoker.createFault(AbstractInvoker.java:162)
            at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:128)
            at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201)
            at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104)
            at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59)
            at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96)
            at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)
            at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
            at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267)
            at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247)
            at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79)
            at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
            at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
            at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
            at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:205)
            at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
            at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
            at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
            at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
            at org.eclipse.jetty.server.Server.handle(Server.java:505)
            at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)
            at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
            at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
            at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
            at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
            at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
            at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
            at java.lang.Thread.run(Thread.java:745)
        Caused by: java.io.IOException: Stream Closed
            at java.io.FileInputStream.available(Native Method)
            at org.apache.cxf.attachment.DelegatingInputStream.available(DelegatingInputStream.java:78)
            at org.apache.cxf.helpers.IOUtils.consume(IOUtils.java:348)
            at org.apache.cxf.attachment.DelegatingInputStream.close(DelegatingInputStream.java:49)
            at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:437)
            at org.apache.tika.server.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:144)
            at org.apache.tika.server.resource.RecursiveMetadataResource.getMetadataFromMultipart(RecursiveMetadataResource.java:85)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:483)
            at org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179)
            at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96)
            ... 26 more
        ERROR An unexpected error occurred during error handling. No further error processing will occur.
        org.apache.cxf.interceptor.Fault: Stream Closed
            at org.apache.cxf.service.invoker.AbstractInvoker.createFault(AbstractInvoker.java:162)
            at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:128)
            at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201)
            at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104)
            at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59)
            at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96)
            at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)
            at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
            at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267)
            at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247)
            at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79)
            at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
            at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
            at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
            at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:205)
            at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
            at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
            at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
            at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
            at org.eclipse.jetty.server.Server.handle(Server.java:505)
            at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)
            at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
            at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
            at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
            at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
            at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
            at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
            at java.lang.Thread.run(Thread.java:745)
        Caused by: java.io.IOException: Stream Closed
            at java.io.FileInputStream.available(Native Method)
            at org.apache.cxf.attachment.DelegatingInputStream.available(DelegatingInputStream.java:78)
            at org.apache.cxf.helpers.IOUtils.consume(IOUtils.java:348)
            at org.apache.cxf.attachment.DelegatingInputStream.close(DelegatingInputStream.java:49)
            at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:437)
            at org.apache.tika.server.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:144)
            at org.apache.tika.server.resource.RecursiveMetadataResource.getMetadataFromMultipart(RecursiveMetadataResource.java:85)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:483)
            at org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179)
            at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96)
            ... 26 more
        WARN  /rmeta/form java.io.IOException: Stream Closed
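      The "Caused by" section of the trace shows close() on the multipart attachment stream calling available() on a java.io.FileInputStream that appears to have been closed already. The sketch below is only an illustration of that failure pattern, not the CXF or Tika code: DrainOnCloseStream and closeAfterUnderlyingClosed are hypothetical names, and the assumption that something else closed the underlying file stream first is a reading of the trace, not a confirmed root cause.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;

public class StreamClosedSketch {

    /**
     * Hypothetical wrapper that drains its delegate before closing it,
     * loosely modeled on the close -> consume -> available chain in the trace.
     */
    static class DrainOnCloseStream extends FilterInputStream {
        DrainOnCloseStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() throws IOException {
            byte[] buf = new byte[1024];
            // available() throws IOException if the underlying
            // FileInputStream has already been closed.
            while (in.available() > 0 && in.read(buf) != -1) {
                // discard any unread bytes
            }
            in.close();
        }
    }

    /** Closes the raw file stream first, then the wrapper, and reports what happens. */
    static String closeAfterUnderlyingClosed() throws IOException {
        File tmp = File.createTempFile("sketch", ".bin");
        tmp.deleteOnExit();
        Files.write(tmp.toPath(), new byte[] {1, 2, 3});

        FileInputStream raw = new FileInputStream(tmp);
        InputStream wrapped = new DrainOnCloseStream(raw);

        raw.close(); // assumption: another component closes the file stream first
        try {
            wrapped.close(); // the wrapper now tries to drain a closed stream
            return "no error";
        } catch (IOException e) {
            return "IOException: " + e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(closeAfterUnderlyingClosed());
    }
}
```

      If this reading is right, the second close is what fails, which would explain why the parse itself is not mentioned in the exception.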

      Attachments

        1. test.pdf
          1.45 MB
          Martha Thompson

          People

            Assignee: Unassigned
            Reporter: Martha Thompson (martha_gsa)
            Votes: 0
            Watchers: 3