  1. Apache Any23 (Retired)
  2. ANY23-314

Service fails to return extraction in case of extraction error


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1
    • Fix Version/s: 2.2
    • Component/s: service
    • Labels: None
    • Environment: Any23 2.2-SNAPSHOT

    Description

      Consider the following command-line extraction:

      lmcgibbn@LMC-056430 /usr/local/any23(master) $ ./cli/target/appassembler/bin/any23 rover -l output.log -o extraction.json https://www.jobcluster.de
      
      ------------------------------------------------------------------------
      Apache Any23 :: rover
      ------------------------------------------------------------------------
      
      0    [main] WARN  org.apache.tika.parser.image.ImageParser  - JBIG2ImageReader not loaded. jbig2 files will be ignored
      128  [main] INFO  org.apache.any23.rdf.PopularPrefixes  - Loading prefixes from /org/apache/any23/prefixes/prefixes.properties
      1388 [main] WARN  org.apache.commons.httpclient.HttpMethodBase  - Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
      4790 [main] INFO  org.apache.any23.extractor.SingleDocumentExtraction  - Processing https://www.jobcluster.de/
      [Fatal Error] :12:46: The entity name must immediately follow the '&' in the entity reference.
      
      ------------------------------------------------------------------------
      Apache Any23 FAILURE
      
      Execution terminated with errors: Error while parsing RDF document.
      
      Total time: 5s
      Finished at: Tue Dec 12 08:01:14 PST 2017
      Final Memory: 31M/184M
      ------------------------------------------------------------------------
      

      This produces the attached extraction result (extraction.json) and the associated log (output.log).
      If I run the same extraction using the service at any23.org, the (partial) extraction result should be returned regardless of whether the entire extraction succeeded.

      The service servlet seems to return the extraction exception as opposed to the preferred extraction result. This issue will fix that.
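
      The desired behaviour can be illustrated with a minimal, self-contained sketch. It does not use the Any23 API or the actual servlet code: the Extractor interface, the Result class, and the run method are hypothetical names introduced purely for illustration. The idea is to buffer whatever the extractor writes and, if an exception is thrown, return the buffered (partial) output together with the error rather than the error alone.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class PartialExtractionSketch {

    /** Hypothetical stand-in for an extractor that streams its results to an output. */
    public interface Extractor {
        void extract(String uri, OutputStream out) throws Exception;
    }

    /** Hypothetical response holder: the body written so far plus an optional failure. */
    public static final class Result {
        public final String body;
        public final Exception failure; // null when the extraction completed

        Result(String body, Exception failure) {
            this.body = body;
            this.failure = failure;
        }

        public boolean isPartial() {
            return failure != null;
        }
    }

    /** Runs the extractor and keeps whatever was written, even if it fails midway. */
    public static Result run(Extractor extractor, String uri) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        Exception failure = null;
        try {
            extractor.extract(uri, buffer);
        } catch (Exception e) {
            // Remember the error, but do not discard the buffered output.
            failure = e;
        }
        String body = new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        return new Result(body, failure);
    }

    public static void main(String[] args) {
        // Simulated extractor: writes some output, then fails like the parser in the log.
        Extractor failing = (uri, out) -> {
            out.write("{\"partial\": true}".getBytes(StandardCharsets.UTF_8));
            throw new IOException("Error while parsing RDF document.");
        };

        Result result = run(failing, "https://www.jobcluster.de");
        System.out.println("body:    " + result.body);
        System.out.println("partial: " + result.isPartial());
        System.out.println("error:   " + result.failure.getMessage());
    }
}
```

      In a real servlet, the partial body would presumably be written to the HTTP response and the failure reported alongside it (for example in the log or a response header); that choice is an assumption and is left out of the sketch.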

      Attachments

        1. output.log
          0.2 kB
          Lewis John McGibbney
        2. extraction.json
          34 kB
          Lewis John McGibbney

            People

              Assignee: Lewis John McGibbney (lewismc)
              Reporter: Lewis John McGibbney (lewismc)
              Votes: 0
              Watchers: 3
