ManifoldCF / CONNECTORS-1325

Invalid XML character causing job to abort

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: ManifoldCF 2.3
    • Fix Version/s: ManifoldCF 2.5
    • Component/s: SharePoint connector
    • Labels: None

    Description

      The following error causes the ManifoldCF job to abort, so the job is never able to finish.

      It would be better for the crawler to log this error and continue, rather than throw an exception that stops the entire job.

      ERROR 2016-06-21 19:01:54,562 (Worker thread '6') system.WorkerThread - Exception tossed: XML parsing error: Character reference "&#xD83D" is an invalid XML character.
      org.apache.manifoldcf.core.interfaces.ManifoldCFException: XML parsing error: Character reference "&#xD83D" is an invalid XML character.
              at org.apache.manifoldcf.core.common.XMLDoc.init(XMLDoc.java:390)
              at org.apache.manifoldcf.core.common.XMLDoc.<init>(XMLDoc.java:286)
              at org.apache.manifoldcf.crawler.connectors.sharepoint.SPSProxyHelper.getFieldValues(SPSProxyHelper.java:2039)
              at org.apache.manifoldcf.crawler.connectors.sharepoint.SharePointRepository.processDocuments(SharePointRepository.java:974)
              at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399)
      Caused by: org.xml.sax.SAXParseException; lineNumber: 18; columnNumber: 64; Character reference "&#xD83D" is an invalid XML character.
              at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
              at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
              at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
              at org.apache.manifoldcf.core.common.XMLDoc.init(XMLDoc.java:359)
              ... 4 more
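
      For illustration only: the reference "&#xD83D" names a lone UTF-16 high surrogate, which falls outside the Char production of XML 1.0, so the parser rejects it. The sketch below shows one possible way to neutralize such references before parsing by replacing them with U+FFFD. The class name XmlCharRefSanitizer and its methods are hypothetical; this is not taken from the attached patches and is not necessarily how the issue was fixed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ManifoldCF: rewrites numeric character
// references that XML 1.0 does not allow, so a parser no longer rejects them.
final class XmlCharRefSanitizer {
    // Matches decimal (&#55357;) and hexadecimal (&#xD83D;) character references.
    private static final Pattern CHAR_REF =
        Pattern.compile("&#(x[0-9A-Fa-f]+|[0-9]+);");

    // XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns the input with every invalid numeric character reference
    // replaced by &#xFFFD; (the Unicode replacement character).
    static String sanitize(String xml) {
        Matcher m = CHAR_REF.matcher(xml);
        StringBuilder out = new StringBuilder(xml.length());
        int last = 0;
        while (m.find()) {
            out.append(xml, last, m.start());
            String body = m.group(1);
            int cp;
            try {
                cp = body.charAt(0) == 'x'
                    ? Integer.parseInt(body.substring(1), 16)
                    : Integer.parseInt(body);
            } catch (NumberFormatException e) {
                cp = -1; // value does not fit in an int, so it cannot be valid
            }
            out.append(isValidXmlChar(cp) ? m.group() : "&#xFFFD;");
            last = m.end();
        }
        out.append(xml, last, xml.length());
        return out.toString();
    }
}
```

      Under these assumptions, XmlCharRefSanitizer.sanitize(responseXml) would be called on the raw response text before it is handed to the XML parser. The sketch does not special-case CDATA sections, where "&#x...;" is literal text, and it does not try to recombine a surrogate pair that was written as two separate references.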
      

      Attachments

        1. CONNECTORS-1325.patch
          7 kB
          Karl Wright
        2. CONNECTORS-1325-2.patch
          2 kB
          Karl Wright
        3. CONNECTORS-1325-3.patch
          5 kB
          Karl Wright
        4. mcf-bad-ms-char.xml
          0.7 kB
          Konstantin Avdeev

          People

            Assignee: Karl Wright (kwright@metacarta.com)
            Reporter: Phil (priethmuller)
            Votes: 0
            Watchers: 3
