Tika / TIKA-2727

Parsing or detecting the MIME type of an XML file gets stuck in an infinite loop

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.17
    • Fix Version/s: 2.0.0, 1.19.1
    • Component/s: detector, parser
    • Labels: None

    Description

      Hi,

      I'm trying to parse (or even just detect the MIME type of) an XML file that is not large but is somewhat tricky, and my process hangs at:

      XMLStringBuffer.append(char[], int, int) line: not available
      XMLStringBuffer.append(XMLString) line: not available
      XMLNSDocumentScannerImpl(XMLScanner).scanAttributeValue(XMLString, XMLString, String, boolean, String) line: not available
      XMLNSDocumentScannerImpl.scanAttribute(XMLAttributesImpl) line: not available
      XMLNSDocumentScannerImpl.scanStartElement() line: not available
      XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook() line: not available
      XMLNSDocumentScannerImpl$NSContentDispatcher(XMLDocumentFragmentScannerImpl$FragmentContentDispatcher).dispatch(boolean) line: not available
      XMLNSDocumentScannerImpl(XMLDocumentFragmentScannerImpl).scanDocument(boolean) line: not available
      XIncludeAwareParserConfiguration(XML11Configuration).parse(boolean) line: not available
      XIncludeAwareParserConfiguration(XML11Configuration).parse(XMLInputSource) line: not available
      SAXParserImpl$JAXPSAXParser(XMLParser).parse(XMLInputSource) line: not available
      SAXParserImpl$JAXPSAXParser(AbstractSAXParser).parse(InputSource) line: not available
      SAXParserImpl$JAXPSAXParser.parse(InputSource) line: not available
      SAXParserImpl.parse(InputSource, DefaultHandler) line: not available
      SAXParserImpl(SAXParser).parse(InputStream, DefaultHandler) line: 195
      XmlRootExtractor.extractRootElement(InputStream) line: 62
      XmlRootExtractor.extractRootElement(byte[]) line: 42
      MimeTypes.getMimeType(byte[]) line: 212
      MimeTypes.detect(InputStream, Metadata) line: 494
      DefaultDetector(CompositeDetector).detect(InputStream, Metadata) line: 84

       

      Please see attached XML file.

      Please advise.

      Thanks
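
      For readers unfamiliar with this code path: the frames from XmlRootExtractor.extractRootElement down to SAXParser.parse in the stack trace above suggest that detection feeds the leading bytes of the file to a SAX parser to learn the root element. The following is a minimal sketch of that general technique using only the JDK, not Tika's actual implementation; the class name RootElementSketch, the RootHandler helper, and the choice to enable secure processing are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; this is not Tika's XmlRootExtractor.
public class RootElementSketch {

    // Records the first element seen, then aborts the parse on purpose.
    private static final class RootHandler extends DefaultHandler {
        String root; // stays null until a start tag has been fully scanned

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attributes) throws SAXException {
            root = localName;
            throw new SAXException("stop: root element seen");
        }
    }

    // Returns the root element's local name, or null if none could be read.
    public static String extractRootElement(byte[] data) {
        RootHandler handler = new RootHandler();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(false);
            // Assumption: ask the JDK parser to apply its built-in processing limits.
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            SAXParser parser = factory.newSAXParser();
            parser.parse(new ByteArrayInputStream(data), handler);
        } catch (Exception e) {
            // Expected both for the deliberate early stop and for malformed input.
        }
        return handler.root;
    }

    public static void main(String[] args) {
        byte[] xml = "<?xml version=\"1.0\"?><doc attr=\"v\"><child/></doc>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(extractRootElement(xml));
    }
}
```

      Note that startElement receives the element's attributes, so the parser has to finish scanning the whole start tag, attribute values included, before the handler gets a chance to stop it. Stopping early in the handler therefore would not avoid time spent inside scanAttributeValue, which is where the trace above shows the process hanging.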

          People

            Assignee: tallison Tim Allison
            Reporter: slavago Slava G