Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- Jena 4.2.0
- None
Description
A "%" character near a syntax error can cause TokenizerText#fatal to throw an UnknownFormatConversionException instead of a RiotParseException. This happens because String#format is called without escaping "%". The following example contains a deliberate syntax error (an extra " after the language tag):
import java.io.ByteArrayInputStream;
import static java.nio.charset.StandardCharsets.UTF_8;

import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFParserBuilder;
import org.junit.jupiter.api.Test;

public class TokenizerTextTest {

    @Test
    public void fatal() {
        // The stray quote after @en-US is the intended syntax error;
        // the "%D8" in the graph IRI triggers the wrong exception type.
        RDFParserBuilder.create()
            .source(new ByteArrayInputStream(
                "<http://example.org/s> <http://example.org/p> \"example\"@en-US\" <http://example.org/%D8-graph>"
                    .getBytes(UTF_8)))
            .lang(Lang.NQUADS)
            .parse(ModelFactory.createDefaultModel());
    }
}
This causes:
java.util.UnknownFormatConversionException: Conversion = 'D'
    at java.base/java.util.Formatter$FormatSpecifier.conversion(Formatter.java:2839)
    at java.base/java.util.Formatter$FormatSpecifier.<init>(Formatter.java:2865)
    at java.base/java.util.Formatter.parse(Formatter.java:2713)
    at java.base/java.util.Formatter.format(Formatter.java:2655)
    at java.base/java.util.Formatter.format(Formatter.java:2609)
    at java.base/java.lang.String.format(String.java:2897)
    at org.apache.jena.riot.tokens.TokenizerText.fatal(TokenizerText.java:1347)
    at org.apache.jena.riot.tokens.TokenizerText.readString(TokenizerText.java:773)
    at org.apache.jena.riot.tokens.TokenizerText.parseToken(TokenizerText.java:238)
    at org.apache.jena.riot.tokens.TokenizerText.hasNext(TokenizerText.java:89)
    at org.apache.jena.atlas.iterator.PeekIterator.fill(PeekIterator.java:50)
    at org.apache.jena.atlas.iterator.PeekIterator.next(PeekIterator.java:92)
    at org.apache.jena.riot.lang.LangEngine.nextToken(LangEngine.java:98)
    at org.apache.jena.riot.lang.LangNQuads.parseOne(LangNQuads.java:78)
    at org.apache.jena.riot.lang.LangNQuads.runParser(LangNQuads.java:53)
    at org.apache.jena.riot.lang.LangBase.parse(LangBase.java:43)
    at org.apache.jena.riot.RDFParserRegistry$ReaderRIOTLang.read(RDFParserRegistry.java:181)
    at org.apache.jena.riot.RDFParser.read(RDFParser.java:358)
    at org.apache.jena.riot.RDFParser.parseNotUri(RDFParser.java:348)
    at org.apache.jena.riot.RDFParser.parse(RDFParser.java:295)
    at org.apache.jena.riot.RDFParser.parse(RDFParser.java:241)
    at org.apache.jena.riot.RDFParser.parse(RDFParser.java:250)
    at org.apache.jena.riot.RDFParserBuilder.parse(RDFParserBuilder.java:574)
    at TokenizerTextTest.fatal(TokenizerTextTest.java:17)
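The underlying String#format behaviour can be reproduced without Jena. The sketch below is only an illustration of the failure mode and of two possible ways to avoid it; it is not the actual Jena code or the actual fix. The class FormatEscapeSketch and the helpers unsafeMessage, safeMessage and escapePercent are hypothetical names introduced for this example.

```java
import java.util.UnknownFormatConversionException;

public class FormatEscapeSketch {

    // Hypothetical stand-in for a fatal(String, Object...) style method that
    // always treats the message as a format string, even with no arguments.
    public static String unsafeMessage(String message, Object... args) {
        return String.format(message, args);
    }

    // One possible remedy (an assumption, not necessarily what Jena does):
    // skip String.format entirely when there is nothing to substitute.
    public static String safeMessage(String message, Object... args) {
        if (args == null || args.length == 0) {
            return message;
        }
        return String.format(message, args);
    }

    // Alternative remedy: escape "%" in untrusted text before it becomes
    // part of a format string ("%%" is rendered as a literal "%").
    public static String escapePercent(String text) {
        return text.replace("%", "%%");
    }

    public static void main(String[] args) {
        // Message that already contains text copied from the parsed input.
        String message = "Broken token: <http://example.org/%D8-graph>";

        try {
            unsafeMessage(message);
        } catch (UnknownFormatConversionException e) {
            // "%D" is parsed as a format specifier with unknown conversion 'D'.
            System.out.println("unsafe: " + e);
        }

        System.out.println("safe: " + safeMessage(message));
        System.out.println("escaped: " + unsafeMessage(escapePercent(message)));
    }
}
```

Passing input-derived text as an argument to a fixed "%s" format string, rather than embedding it in the format string itself, would be a third option with the same effect.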