Tika / TIKA-4115

Grobid Quantities Parser not working properly


Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 2.8.0
    • Fix Version/s: None
    • Component/s: parser, tika-app
    • Labels: None

    Description

      I've fixed some bugs in GrobidNERecogniser and pushed the fixes here: https://github.com/apache/tika/pull/1280

      However, the CXF WebClient throws a NullPointerException (NPE) when checking whether the server is alive (a GET request to http://localhost:8060/service/isalive):

      ```

      INFO  [main] 12:51:56,317 org.apache.tika.parser.ner.NamedEntityParser going to load, instantiate and bind the instance of org.apache.tika.parser.ner.grobid.GrobidNERecogniser
      INFO  [main] 12:51:56,484 org.apache.tika.parser.ner.grobid.GrobidNERecogniser Grobid Quantities REST Server is not running
      java.lang.NullPointerException: null
          at org.apache.cxf.jaxrs.client.AbstractClient.setupOutInterceptorChain(AbstractClient.java:937) ~[tika-parser-nlp-package-2.8.1-SNAPSHOT.jar:?]
          at org.apache.cxf.jaxrs.client.AbstractClient.createMessage(AbstractClient.java:1014) ~[tika-parser-nlp-package-2.8.1-SNAPSHOT.jar:?]
          at org.apache.cxf.jaxrs.client.WebClient.finalizeMessage(WebClient.java:1111) ~[tika-parser-nlp-package-2.8.1-SNAPSHOT.jar:?]
          at org.apache.cxf.jaxrs.client.WebClient.doChainedInvocation(WebClient.java:1084) ~[tika-parser-nlp-package-2.8.1-SNAPSHOT.jar:?]
          at org.apache.cxf.jaxrs.client.WebClient.doInvoke(WebClient.java:932) ~[tika-parser-nlp-package-2.8.1-SNAPSHOT.jar:?]
          at org.apache.cxf.jaxrs.client.WebClient.doInvoke(WebClient.java:901) ~[tika-parser-nlp-package-2.8.1-SNAPSHOT.jar:?]
          at org.apache.cxf.jaxrs.client.WebClient.invoke(WebClient.java:364) ~[tika-parser-nlp-package-2.8.1-SNAPSHOT.jar:?]
          at org.apache.cxf.jaxrs.client.WebClient.get(WebClient.java:390) ~[tika-parser-nlp-package-2.8.1-SNAPSHOT.jar:?]
          at org.apache.tika.parser.ner.grobid.GrobidNERecogniser.<init>(GrobidNERecogniser.java:78) [tika-parser-nlp-package-2.8.1-SNAPSHOT.jar:?]
          at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:?]
          at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) [?:?]
          at jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) [?:?]
          at java.lang.reflect.Constructor.newInstance(Constructor.java:490) [?:?]
          at java.lang.Class.newInstance(Class.java:584) [?:?]
          at org.apache.tika.parser.ner.NamedEntityParser.initialize(NamedEntityParser.java:91) [tika-parser-nlp-package-2.8.1-SNAPSHOT.jar:?]
          at org.apache.tika.parser.ner.NamedEntityParser.parse(NamedEntityParser.java:119) [tika-parser-nlp-package-2.8.1-SNAPSHOT.jar:?]
          at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:152) [tika-app-2.8.1-SNAPSHOT.jar:2.8.1-SNAPSHOT]
          at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298) [tika-app-2.8.1-SNAPSHOT.jar:2.8.1-SNAPSHOT]
          at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298) [tika-app-2.8.1-SNAPSHOT.jar:2.8.1-SNAPSHOT]
          at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:203) [tika-app-2.8.1-SNAPSHOT.jar:2.8.1-SNAPSHOT]
          at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:1071) [tika-app-2.8.1-SNAPSHOT.jar:2.8.1-SNAPSHOT]
          at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:493) [tika-app-2.8.1-SNAPSHOT.jar:2.8.1-SNAPSHOT]
          at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:256) [tika-app-2.8.1-SNAPSHOT.jar:2.8.1-SNAPSHOT]
      INFO  [main] 12:51:56,492 org.apache.tika.parser.ner.NamedEntityParser org.apache.tika.parser.ner.grobid.GrobidNERecogniser is available ? false
      INFO  [main] 12:51:56,516 org.apache.tika.parser.sentiment.SentimentAnalysisParser Sentiment Model is at https://raw.githubusercontent.com/USCDataScience/SentimentAnalysisParser/master/sentiment-models/src/main/resources/edu/usc/irds/sentiment/en-netflix-sentiment.bin
      INFO  [main] 12:51:56,885 org.apache.tika.parser.ner.NamedEntityParser Number of NERecognisers in chain 0
      Content-Length: 70
      Content-Type: text/plain
      X-TIKA:Parsed-By: org.apache.tika.parser.CompositeParser
      X-TIKA:Parsed-By: org.apache.tika.parser.ner.NamedEntityParser
      X-TIKA:Parsed-By-Full-Set: org.apache.tika.parser.CompositeParser
      X-TIKA:Parsed-By-Full-Set: org.apache.tika.parser.ner.NamedEntityParser
      resourceName: bao.txt

      ```

      After spending some time on it, I have not managed to find a solution.
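
One way to narrow this down is to probe the same isalive URL without CXF. The following is only a minimal sketch, not part of the pull request: it uses the JDK's `java.net.http.HttpClient` (Java 11+), and the class name `GrobidAliveProbe` and the `isAlive` helper are hypothetical names chosen for illustration. If this probe reports the server as alive while the WebClient call above fails, the NPE is more likely related to how the CXF client is set up inside the packaged jar than to the Grobid Quantities server itself; that is an assumption to verify, not a diagnosis.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper for illustration; it is not part of Tika.
public class GrobidAliveProbe {

    // Returns true only if the URL answers with HTTP 200 within the timeouts.
    static boolean isAlive(String url) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200;
        } catch (IOException e) {
            // Connection refused, timeout, or another I/O failure.
            return false;
        } catch (InterruptedException e) {
            // Restore the interrupt flag and treat the server as unreachable.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // The default URL is the one from the report; pass another as the first argument.
        String url = args.length > 0 ? args[0] : "http://localhost:8060/service/isalive";
        System.out.println(url + " alive? " + isAlive(url));
    }
}
```

Running it with `java GrobidAliveProbe.java` against a local Grobid Quantities instance should print whether the endpoint responded with status 200.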

          People

            Assignee: Unassigned
            Reporter: Luca Foppiano (lfoppiano)