  1. Rya
  2. RYA-530

Rya can't ingest hostnames beginning with a number

Details

    • Type: Bug
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0
    • Fix Version/s: None
    • Component/s: sail
    • Labels: None

    Description

      I am attempting to ingest the latest DBpedia dataset. Rya errors out whenever it hits a URI whose hostname begins with a digit, such as https://9p.io/plan9. I'm not sure whether the problem is in Rya itself or in RDF4J.

       

      2020-05-28 00:53:07,971 ERROR [main -- parser thread] org.apache.rya.accumulo.mr.RdfFileInputFormat: Invalid IRI 'https://9p.io/plan9 [line 36207]
      org.eclipse.rdf4j.rio.RDFParseException: Invalid IRI 'https://9p.io/plan9 [line 36207]
      at org.eclipse.rdf4j.rio.helpers.RDFParserHelper.reportError(RDFParserHelper.java:322)
      at org.eclipse.rdf4j.rio.helpers.AbstractRDFParser.reportError(AbstractRDFParser.java:684)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.reportError(TurtleParser.java:1309)
      at org.eclipse.rdf4j.rio.helpers.AbstractRDFParser.resolveURI(AbstractRDFParser.java:387)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseURI(TurtleParser.java:941)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseValue(TurtleParser.java:588)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseObject(TurtleParser.java:474)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseObjectList(TurtleParser.java:412)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.parsePredicateObjectList(TurtleParser.java:385)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseTriples(TurtleParser.java:372)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseStatement(TurtleParser.java:239)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.parse(TurtleParser.java:201)
      at org.apache.rya.accumulo.mr.RdfFileInputFormat$RdfFileRecordReader$2.run(RdfFileInputFormat.java:275)
      2020-05-28 00:53:07,972 ERROR [main -- parser thread] org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[main -- parser thread,5,main] threw an Exception.
      java.lang.RuntimeException: Invalid IRI 'https://9p.io/plan9 [line 36207]
      at org.apache.rya.accumulo.mr.RdfFileInputFormat$RdfFileRecordReader$2.run(RdfFileInputFormat.java:280)
      Caused by: org.eclipse.rdf4j.rio.RDFParseException: Invalid IRI 'https://9p.io/plan9 [line 36207]
      at org.eclipse.rdf4j.rio.helpers.RDFParserHelper.reportError(RDFParserHelper.java:322)
      at org.eclipse.rdf4j.rio.helpers.AbstractRDFParser.reportError(AbstractRDFParser.java:684)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.reportError(TurtleParser.java:1309)
      at org.eclipse.rdf4j.rio.helpers.AbstractRDFParser.resolveURI(AbstractRDFParser.java:387)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseURI(TurtleParser.java:941)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseValue(TurtleParser.java:588)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseObject(TurtleParser.java:474)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseObjectList(TurtleParser.java:412)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.parsePredicateObjectList(TurtleParser.java:385)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseTriples(TurtleParser.java:372)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseStatement(TurtleParser.java:239)
      at org.eclipse.rdf4j.rio.turtle.TurtleParser.parse(TurtleParser.java:201)
      at org.apache.rya.accumulo.mr.RdfFileInputFormat$RdfFileRecordReader$2.run(RdfFileInputFormat.java:275)
      2020-05-28 00:53:07,972 ERROR [main -- reader thread] org.apache.rya.accumulo.mr.RdfFileInputFormat: Error processing line 38462 of input
      java.io.InterruptedIOException
      at java.io.PipedReader.receive(PipedReader.java:187)
      at java.io.PipedReader.receive(PipedReader.java:206)
      at java.io.PipedWriter.write(PipedWriter.java:150)
      at java.io.Writer.write(Writer.java:192)
      at java.io.Writer.write(Writer.java:157)
      at org.apache.rya.accumulo.mr.RdfFileInputFormat$RdfFileRecordReader$1.run(RdfFileInputFormat.java:249)
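
      One way to narrow this down is to check the failing IRI with a parser that is independent of both Rya and RDF4J. The sketch below uses only the JDK's java.net.URI. The class and method names (IriHostCheck, hostOf, hostStartsWithDigit) are hypothetical and not part of Rya or RDF4J. Note that java.net.URI implements generic URI syntax rather than the IRI rules RDF4J applies, so a successful parse here only suggests the input is well formed; it does not show where the rejection happens.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical diagnostic helper; not part of Rya or RDF4J. */
final class IriHostCheck {

    /** Returns the host as parsed by java.net.URI, or null if none could be extracted. */
    static String hostOf(String iri) {
        try {
            return new URI(iri).getHost();
        } catch (URISyntaxException e) {
            // The JDK parser rejected the string outright.
            return null;
        }
    }

    /** True if the JDK could extract a host and its first character is an ASCII digit. */
    static boolean hostStartsWithDigit(String iri) {
        String host = hostOf(iri);
        return host != null
                && !host.isEmpty()
                && host.charAt(0) >= '0'
                && host.charAt(0) <= '9';
    }

    public static void main(String[] args) {
        // IRI taken from the error message in the log above.
        String iri = "https://9p.io/plan9";
        System.out.println("host = " + hostOf(iri));
        System.out.println("host starts with digit = " + hostStartsWithDigit(iri));
    }
}
```

      If the JDK extracts a host here, the input data is probably fine and the rejection is more likely to come from the IRI validation performed during Turtle parsing, which is where the stack trace points (AbstractRDFParser.resolveURI).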

          People

            Assignee: brushworth Brad
            Reporter: brushworth Brad
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated: