Tika / TIKA-2727

Parsing or detecting the MIME type of an XML file gets stuck in an infinite loop

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.17
    • Fix Version/s: 2.0.0, 1.19.1
    • Component/s: detector, parser
    • Labels:
      None

      Description

      Hi,

      I'm trying to parse (or even just detect the MIME type of) an XML file that is not large but is somewhat tricky, and my process hangs at:

      XMLStringBuffer.append(char[], int, int) line: not available
      XMLStringBuffer.append(XMLString) line: not available
      XMLNSDocumentScannerImpl(XMLScanner).scanAttributeValue(XMLString, XMLString, String, boolean, String) line: not available
      XMLNSDocumentScannerImpl.scanAttribute(XMLAttributesImpl) line: not available
      XMLNSDocumentScannerImpl.scanStartElement() line: not available
      XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook() line: not available
      XMLNSDocumentScannerImpl$NSContentDispatcher(XMLDocumentFragmentScannerImpl$FragmentContentDispatcher).dispatch(boolean) line: not available
      XMLNSDocumentScannerImpl(XMLDocumentFragmentScannerImpl).scanDocument(boolean) line: not available
      XIncludeAwareParserConfiguration(XML11Configuration).parse(boolean) line: not available
      XIncludeAwareParserConfiguration(XML11Configuration).parse(XMLInputSource) line: not available
      SAXParserImpl$JAXPSAXParser(XMLParser).parse(XMLInputSource) line: not available
      SAXParserImpl$JAXPSAXParser(AbstractSAXParser).parse(InputSource) line: not available
      SAXParserImpl$JAXPSAXParser.parse(InputSource) line: not available
      SAXParserImpl.parse(InputSource, DefaultHandler) line: not available
      SAXParserImpl(SAXParser).parse(InputStream, DefaultHandler) line: 195
      XmlRootExtractor.extractRootElement(InputStream) line: 62
      XmlRootExtractor.extractRootElement(byte[]) line: 42
      MimeTypes.getMimeType(byte[]) line: 212
      MimeTypes.detect(InputStream, Metadata) line: 494
      DefaultDetector(CompositeDetector).detect(InputStream, Metadata) line: 84

       

      Please see attached XML file.

      Please advise.

      Thanks
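For readers following the trace: detection goes through XmlRootExtractor into the JDK SAX parser, and the hang is inside scanAttributeValue, which runs before any startElement callback is delivered. The sketch below uses only JDK classes to illustrate that general root-element probing pattern, wrapped in a timeout so a stuck parse cannot block the caller. It is a defensive workaround sketch and not a description of the fix that went into Tika. The names RootElementProbe, parseRoot and rootElement are hypothetical and are not part of the Tika API.

```java
import java.io.ByteArrayInputStream;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not Tika code: probes the root element of an XML document.
final class RootElementProbe {

    // Parses only far enough to see the first start tag.
    // Returns its qualified name, or null if the input is not well-formed up to that point.
    static String parseRoot(byte[] xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        SAXParser parser = factory.newSAXParser();

        final String[] root = new String[1];
        try {
            parser.parse(new ByteArrayInputStream(xml), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                                         Attributes attributes) throws SAXException {
                    root[0] = qName;
                    // Abort the parse: nothing after the root start tag is needed.
                    throw new SAXException("root element found, stopping");
                }
            });
        } catch (SAXException stoppedOrMalformed) {
            // Either our own abort or a genuine parse error; root[0] tells them apart.
        }
        return root[0];
    }

    // Runs parseRoot on a daemon thread and gives up after timeoutMillis.
    // Returns null on timeout. Caveat: a parser spinning in a tight loop may
    // ignore interruption, so the worker can keep consuming CPU until the JVM exits.
    static String rootElement(byte[] xml, long timeoutMillis) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "xml-root-probe");
            t.setDaemon(true);
            return t;
        });
        try {
            Future<String> result = pool.submit(() -> parseRoot(xml));
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException timedOut) {
            return null;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Calling RootElementProbe.rootElement(bytes, 2000) on the first few kilobytes of a file would return the root element name when the parser gets that far, and null otherwise. Because the early abort in startElement only fires after the attributes of the root tag have been scanned, it is the timeout, not the abort, that guards against a hang like the one in the trace above.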

            People

            • Assignee: Tim Allison (tallison@apache.org)
            • Reporter: Slava G (slavago)
