Nutch / NUTCH-2223

Upgrade xercesImpl to 2.11.0 to fix a hang in Tika MIME type detection


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.11
    • Fix Version/s: 1.12
    • Component/s: parser
    • Labels: None

    Description

      The stack trace for the hang seems to be:

      at org.apache.xerces.impl.XMLScanner.scanExternalID(Unknown Source)
      at org.apache.xerces.impl.XMLDocumentScannerImpl.scanDoctypeDecl(Unknown Source)
      at org.apache.xerces.impl.XMLDocumentScannerImpl$PrologDispatcher.dispatch(Unknown Source)
      at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
      at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
      at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
      at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
      at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
      at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
      at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source)
      at javax.xml.parsers.SAXParser.parse(SAXParser.java:195)
      at org.apache.tika.detect.XmlRootExtractor.extractRootElement(XmlRootExtractor.java:54)
      at org.apache.tika.detect.XmlRootExtractor.extractRootElement(XmlRootExtractor.java:41)
      at org.apache.tika.mime.MimeTypes.getMimeType(MimeTypes.java:192)
      at org.apache.tika.mime.MimeTypes.detect(MimeTypes.java:439)
      at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61)
      at org.apache.tika.cli.TikaCLI$10.process(TikaCLI.java:252)
      at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:417)
      at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:111)
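
The trace above shows the hang inside the Xerces SAX parser while it scans the DOCTYPE declaration (scanExternalID), reached from Tika's XmlRootExtractor during MIME type detection. The following is a minimal sketch of that kind of root-element probe, useful for checking which SAX parser JAXP selects on a given classpath and whether a sample document gets past the prolog. It is not Tika's actual code: the class RootElementProbe, its method names, and the sample document are hypothetical, and only standard JAXP/SAX calls are used.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Nutch or Tika.
public class RootElementProbe {

    /** Remembers the first element seen, then aborts the parse. */
    private static final class RootHandler extends DefaultHandler {
        String rootName;

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attrs) throws SAXException {
            rootName = localName;
            // Stop immediately: only the root element is of interest.
            throw new SAXException("root element found");
        }

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            // Assumption: skip external DTDs by substituting an empty source,
            // so the probe never touches the network.
            return new InputSource(new StringReader(""));
        }
    }

    /** Class name of the SAX parser that JAXP picks on this classpath. */
    public static String parserImplementation() {
        try {
            return SAXParserFactory.newInstance().newSAXParser().getClass().getName();
        } catch (Exception e) {
            throw new IllegalStateException("no usable SAX parser", e);
        }
    }

    /** Local name of the root element, or null if it could not be read. */
    public static String extractRootElement(InputStream in) {
        RootHandler handler = new RootHandler();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            SAXParser parser = factory.newSAXParser();
            parser.parse(in, handler);
        } catch (Exception e) {
            // Expected when the handler aborts the parse; also covers malformed input.
        }
        return handler.rootName;
    }

    public static void main(String[] args) {
        String sample = "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
            + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
            + "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body/></html>";
        System.out.println("SAX parser: " + parserImplementation());
        System.out.println("Root element: " + extractRootElement(
            new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8))));
    }
}
```

Running the class once with the old xercesImpl jar on the classpath and once with 2.11.0 should show which implementation is picked up. Feeding it the document that triggers the hang, rather than the sample above, would indicate whether the upgrade resolves it in a given environment.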
      

      Attachments

        1. NUTCH-2223.patch
          1 kB
          Tien Nguyen Manh


          People

            Assignee: Markus Jelsma (markus17)
            Reporter: Tien Nguyen Manh (tiennm)
            Votes: 0
            Watchers: 4

