  Tika / TIKA-2243

GrobidRESTParser executes when no parser matches the MIME type

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 1.14
    • Fix Version/s: None
    • Component/s: core
    • Labels:
      None
    • Environment:

      Apache Maven 3.2.5
      Java version: 1.8.0_112, vendor: Oracle Corporation
      Archlinux:
      OS name: "linux", version: "4.8.11-1-arch", arch: "amd64", family: "unix"

      Description

      Generated a tika-config.xml with:

      java -jar target/tika-app-1.14.jar --dump-static-config
      

      Now I'm disabling the PDF parser by excluding its MIME type:

          <parser class="org.apache.tika.parser.pdf.PDFParser">
            <mime-exclude>application/pdf</mime-exclude>
          </parser>
      

      Converting a PDF document with the following code:

      // Load the modified configuration and build the parser from it.
      TikaConfig config = new TikaConfig( "tika-config.xml" );
      AutoDetectParser parser = new AutoDetectParser( config );

      ParseContext context = new ParseContext( );
      String fileName = "test.pdf";

      // Pass the file name as a detection hint.
      Metadata metadata = new Metadata( );
      metadata.add( TikaMetadataKeys.RESOURCE_NAME_KEY, fileName );

      // -1 disables the write limit on the extracted text.
      ContentHandler handler = new WriteOutContentHandler(
        new StringWriter( ), -1 );

      // try-with-resources closes the stream even if parsing throws.
      try ( InputStream stream = new FileInputStream( fileName ) ) {
        parser.parse( stream, handler, metadata, context );
      }

      String content = handler.toString( );
      System.out.println( content );
      

      Then I get:

      ...
              at org.apache.tika.parser.journal.GrobidRESTParser.parse(GrobidRESTParser.java:77)
              at org.apache.tika.parser.journal.JournalParser.parse(JournalParser.java:60)
              at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
              at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
              at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
              at com.eurospider.conversion.filter.test.ConversionTest.setUp(ConversionTest.java:66)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
              at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
              at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
              at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
              at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
              at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
              at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
              at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
              at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
              at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
              at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
              at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
              at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
              at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
              at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
              at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
              at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
              at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
              at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
              at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
      Caused by: javax.ws.rs.ProcessingException: No message body writer has been found for class org.apache.cxf.jaxrs.ext.multipart.MultipartBody, ContentType: multipart/form-data
              at org.apache.cxf.jaxrs.client.AbstractClient.reportMessageHandlerProblem(AbstractClient.java:740)
              at org.apache.cxf.jaxrs.client.AbstractClient.writeBody(AbstractClient.java:469)
              at org.apache.cxf.jaxrs.client.WebClient$BodyWriter.doWriteBody(WebClient.java:1215)
      ..
      

      The GrobidRESTParser is not configured in tika-config.xml, so why does it get executed then?
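
      One possible explanation, consistent with the stack trace: GrobidRESTParser is not called directly but through JournalParser, which may still be registered for application/pdf in the dumped configuration. The sketch below is a simplified stand-alone model of that situation, not Tika code. ParserEntry, buildTable and configFromReport are hypothetical names, and the JournalParser entry is an assumption about the contents of the dumped tika-config.xml. On a real setup, the actual mapping could be inspected with parser.getParsers( context ).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ParserDispatchSketch {

    /** Hypothetical stand-in for one <parser> entry in tika-config.xml. */
    static final class ParserEntry {
        final String name;
        final Set<String> supported;
        final Set<String> excluded;

        ParserEntry(String name, Set<String> supported, Set<String> excluded) {
            this.name = name;
            this.supported = supported;
            this.excluded = excluded;
        }

        /** Types this entry still handles after its own excludes are applied. */
        Set<String> effectiveTypes() {
            Set<String> types = new LinkedHashSet<>(supported);
            types.removeAll(excluded);
            return types;
        }
    }

    /** Builds a MIME type -> parser name table; in this sketch the last entry wins. */
    static Map<String, String> buildTable(List<ParserEntry> entries) {
        Map<String, String> table = new LinkedHashMap<>();
        for (ParserEntry entry : entries) {
            for (String type : entry.effectiveTypes()) {
                table.put(type, entry.name);
            }
        }
        return table;
    }

    /** Models the configuration from the report (JournalParser entry is assumed). */
    static List<ParserEntry> configFromReport() {
        List<ParserEntry> entries = new ArrayList<>();
        // PDFParser with <mime-exclude>application/pdf</mime-exclude>
        entries.add(new ParserEntry("PDFParser",
                Collections.singleton("application/pdf"),
                Collections.singleton("application/pdf")));
        // Assumption: JournalParser is still listed for application/pdf
        entries.add(new ParserEntry("JournalParser",
                Collections.singleton("application/pdf"),
                Collections.<String>emptySet()));
        return entries;
    }

    public static void main(String[] args) {
        Map<String, String> table = buildTable(configFromReport());
        // prints "application/pdf -> JournalParser"
        System.out.println("application/pdf -> " + table.get("application/pdf"));
    }
}
```

      In this model the exclude only removes application/pdf from the PDFParser entry. Any other entry that claims the same type still receives the document, which would match the JournalParser and GrobidRESTParser frames in the trace.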

            People

            • Assignee: Unassigned
            • Reporter: Andreas Baumann (andreasbaumann)