Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 2.3
- None
Description
When browsing the page http://www.bray-sur-seine.fr/les-gagnants-du-concours-de-bd/ I encountered the following exception:
org.apache.any23.extractor.ExtractionException: Error while parsing RDF document.
    at org.apache.any23.extractor.rdf.BaseRDFExtractor.run(BaseRDFExtractor.java:175)
    at org.apache.any23.extractor.rdf.BaseRDFExtractor.run(BaseRDFExtractor.java:57)
    at org.apache.any23.extractor.SingleDocumentExtraction.runExtractor(SingleDocumentExtraction.java:471)
    at org.apache.any23.extractor.SingleDocumentExtraction.run(SingleDocumentExtraction.java:259)
    at org.apache.any23.extractor.SingleDocumentExtraction.run(SingleDocumentExtraction.java:323)
    at org.apache.any23.extractor.html.AbstractExtractorTestCase.extract(AbstractExtractorTestCase.java:189)
    at org.apache.any23.extractor.html.AbstractExtractorTestCase.assertExtract(AbstractExtractorTestCase.java:204)
    ... 28 more
Caused by: org.eclipse.rdf4j.rio.RDFParseException: org.xml.sax.SAXParseException; lineNumber: 205; columnNumber: 52; An invalid XML character (Unicode: 0x8) was found in the element content of the document.
    at org.semarglproject.rdf4j.rdf.rdfa.RDF4JRDFaParser.parse(RDF4JRDFaParser.java:111)
    at org.semarglproject.rdf4j.rdf.rdfa.RDF4JRDFaParser.parse(RDF4JRDFaParser.java:95)
    at org.apache.any23.extractor.rdf.BaseRDFExtractor.run(BaseRDFExtractor.java:171)
    ... 34 more
Caused by: org.semarglproject.rdf.ParseException: org.xml.sax.SAXParseException; lineNumber: 205; columnNumber: 52; An invalid XML character (Unicode: 0x8) was found in the element content of the document.
    at org.semarglproject.rdf.rdfa.RdfaParser.processException(RdfaParser.java:1141)
    at org.semarglproject.source.XmlSource.process(XmlSource.java:50)
    at org.semarglproject.source.StreamProcessor.processInternal(StreamProcessor.java:87)
    at org.semarglproject.source.BaseStreamProcessor.process(BaseStreamProcessor.java:167)
    at org.semarglproject.source.BaseStreamProcessor.process(BaseStreamProcessor.java:154)
    at org.semarglproject.rdf4j.rdf.rdfa.RDF4JRDFaParser.parse(RDF4JRDFaParser.java:109)
    ... 36 more
Caused by: org.xml.sax.SAXParseException; lineNumber: 205; columnNumber: 52; An invalid XML character (Unicode: 0x8) was found in the element content of the document.
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.semarglproject.source.XmlSource.process(XmlSource.java:48)
    ... 40 more
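The root cause in the trace is a backspace character (U+0008) in the page content, which XML 1.0 does not permit. One possible workaround, sketched below, is to drop such characters before the text reaches the XML parser. This is only an illustration and not necessarily how the issue was fixed in Any23; the class `XmlCharFilter` and its methods are hypothetical names introduced here.

```java
// Hypothetical helper, not part of Any23: removes code points that the
// XML 1.0 "Char" production does not allow (for example U+0008).
public final class XmlCharFilter {

    private XmlCharFilter() {}

    /** Returns true if the code point is a legal XML 1.0 character. */
    public static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns a copy of the input with every illegal XML 1.0 character removed. */
    public static String stripInvalidXmlChars(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            // Iterate by code point so supplementary characters stay intact;
            // unpaired surrogates fall outside the valid ranges and are dropped.
            int cp = input.codePointAt(i);
            if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String dirty = "concours\b de BD"; // contains U+0008
        System.out.println(stripInvalidXmlChars(dirty)); // prints "concours de BD"
    }
}
```

Filtering silently changes the input, so a caller may prefer to log the removed characters or replace them with U+FFFD instead of deleting them.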