Tika / TIKA-1712

GROBID parser fails in tika-app

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.11
    • Component/s: cli, server
    • Labels:
      None

      Description

      Hey Sergey, do you have any idea why the CXF 3.0.3 rt-client would work fine in tika-server but fail in tika-app? I'm seeing that with the GROBID parser. See:

      https://issues.apache.org/jira/browse/CXF-6545

      To reproduce, call the GROBID parser from tika-app:

      java -classpath $HOME/git/grobidparser-resources/:target/tika-app-1.11-SNAPSHOT.jar org.apache.tika.cli.TikaCLI --config=$HOME/git/grobidparser-resources/tika-config.xml -J $HOME/git/grobid/papers/ICSE06.pdf

      Run this after following this guide:

      https://wiki.apache.org/tika/GrobidJournalParser

      This works fine in tika-server but dies in tika-app with:

      java.lang.NullPointerException
      	at org.apache.cxf.jaxrs.client.AbstractClient.setupOutInterceptorChain(AbstractClient.java:849)
      	at org.apache.cxf.jaxrs.client.AbstractClient.createMessage(AbstractClient.java:923)
      	at org.apache.cxf.jaxrs.client.WebClient.finalizeMessage(WebClient.java:1125)
      	at org.apache.cxf.jaxrs.client.WebClient.doChainedInvocation(WebClient.java:1098)
      	at org.apache.cxf.jaxrs.client.WebClient.doInvoke(WebClient.java:894)
      	at org.apache.cxf.jaxrs.client.WebClient.doInvoke(WebClient.java:865)
      	at org.apache.cxf.jaxrs.client.WebClient.invoke(WebClient.java:331)
      	at org.apache.cxf.jaxrs.client.WebClient.post(WebClient.java:340)
      	at org.apache.tika.parser.journal.GrobidRESTParser.parse(GrobidRESTParser.java:82)
      	at org.apache.tika.parser.journal.JournalParser.parse(JournalParser.java:67)
      	at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:177)
      	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
      	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
      	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
      	at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:158)
      	at org.apache.tika.cli.TikaCLI.handleRecursiveJson(TikaCLI.java:504)
      	at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:484)
      	at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:139)
      java.lang.NullPointerException
      	at org.apache.tika.parser.journal.GrobidRESTParser.parse(GrobidRESTParser.java:89)
      	at org.apache.tika.parser.journal.JournalParser.parse(JournalParser.java:67)
      	at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:177)
      	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
      	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
      	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
      	at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:158)
      	at org.apache.tika.cli.TikaCLI.handleRecursiveJson(TikaCLI.java:504)
      	at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:484)
      	at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:139)
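
Reading the two traces together: the first NPE comes out of the CXF client during WebClient.post (GrobidRESTParser.java:82), and the second is thrown directly in GrobidRESTParser.parse a few lines later (line 89). That pattern would be consistent with the first exception being caught and printed, and the code then going on to use a result that was never set. This is an inference from the traces only, not confirmed against the GrobidRESTParser source. A minimal sketch of guarding against that kind of follow-on NPE is below. The class SafeRestCall and its methods are hypothetical, and the Supplier stands in for the real REST call so the example runs without CXF:

```java
import java.util.Optional;
import java.util.function.Supplier;

public class SafeRestCall {

    /**
     * Runs the supplied call (a hypothetical stand-in for the REST request)
     * and returns its body, or an empty Optional if the call throws or
     * returns null. The failure is logged once instead of leaking a null.
     */
    public static Optional<String> callSafely(Supplier<String> restCall) {
        try {
            return Optional.ofNullable(restCall.get());
        } catch (RuntimeException e) {
            System.err.println("REST call failed: " + e);
            return Optional.empty();
        }
    }

    /** Uses the response only when one is present; -1 signals "no response". */
    public static int bodyLength(Supplier<String> restCall) {
        return callSafely(restCall).map(String::length).orElse(-1);
    }

    public static void main(String[] args) {
        // Simulates the client blowing up inside the call, as in the first trace.
        Supplier<String> failing = () -> {
            throw new NullPointerException("simulated client failure");
        };
        // Simulates a successful call returning a small XML body.
        Supplier<String> working = () -> "<TEI/>";

        System.out.println(bodyLength(failing));
        System.out.println(bodyLength(working));
    }
}
```

This would not address whatever makes the CXF client fail in tika-app in the first place. It only keeps a single client failure from surfacing as two stack traces.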
      

              People

              • Assignee: Chris A. Mattmann (chrismattmann)
              • Reporter: Chris A. Mattmann (chrismattmann)