Apache Jena / JENA-1424

LOAD ... INTO GRAPH relies too much on filename extension

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: Jena 3.5.0
    • Fix Version/s: Jena 3.6.0
    • Component/s: None
    • Labels: None
    • Environment: Ubuntu 16.04

    Description

      I tried to perform this SPARQL Update on an empty TDB:

      LOAD <http://api.finto.fi/rest/v1/yso/data> INTO GRAPH <http://www.yso.fi/onto/yso/>
      

      but got this error from tdbupdate:

      $ tdbupdate --loc tdb --update=load-yso.ru 
      org.apache.jena.update.UpdateException: Attempt to load quads into a graph
              at org.apache.jena.sparql.modify.UpdateEngineWorker.visit(UpdateEngineWorker.java:146)
              at org.apache.jena.sparql.modify.request.UpdateLoad.visit(UpdateLoad.java:64)
              at org.apache.jena.sparql.modify.UpdateVisitorSink.send(UpdateVisitorSink.java:46)
              at org.apache.jena.sparql.modify.UpdateVisitorSink.send(UpdateVisitorSink.java:26)
              at org.apache.jena.atlas.iterator.Iter.sendToSink(Iter.java:546)
              at org.apache.jena.atlas.iterator.Iter.sendToSink(Iter.java:553)
              at org.apache.jena.sparql.modify.UpdateProcessorBase.execute(UpdateProcessorBase.java:59)
              at arq.update.execOneFile(update.java:105)
              at arq.update.execUpdate(update.java:81)
              at arq.cmdline.CmdUpdate.exec(CmdUpdate.java:63)
              at jena.cmd.CmdMain.mainMethod(CmdMain.java:93)
              at jena.cmd.CmdMain.mainRun(CmdMain.java:58)
              at jena.cmd.CmdMain.mainRun(CmdMain.java:45)
              at tdb.tdbupdate.main(tdbupdate.java:37)
      

      So Jena is complaining that I'm trying to load quads into a graph, but that's not true. The URL specified in the LOAD performs content negotiation and then redirects (302) to either an RDF/XML or a Turtle serialization (with the proper Content-Type header). Both are graph (triples) formats, not quad formats.

      The problem seems to be that UpdateEngineWorker checks the URL specified in the LOAD as if it were a filename and throws an exception if its file extension doesn't match the known graph formats. In this case there is no extension so it won't match.
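
      The failure mode described above can be illustrated with a minimal sketch. It is written in Python rather than Jena's Java, and it is not Jena's actual code: the helper names `guess_syntax_from_name` and `check_load_into_graph` and the small extension tables are assumptions made for illustration only. It shows how a guard that looks only at the extension rejects a URL that has none.

```python
import posixpath
from urllib.parse import urlparse

# Illustrative extension tables (an assumption, not Jena's real registry).
TRIPLES_EXTENSIONS = {".ttl": "Turtle", ".rdf": "RDF/XML", ".nt": "N-Triples"}
QUADS_EXTENSIONS = {".trig": "TriG", ".nq": "N-Quads"}


def guess_syntax_from_name(url):
    """Guess the syntax from the URL path's extension alone; None if unknown."""
    ext = posixpath.splitext(urlparse(url).path)[1].lower()
    return TRIPLES_EXTENSIONS.get(ext) or QUADS_EXTENSIONS.get(ext)


def check_load_into_graph(url):
    """Hypothetical guard: reject anything not recognised as a triples syntax."""
    syntax = guess_syntax_from_name(url)
    if syntax not in TRIPLES_EXTENSIONS.values():
        # An unknown syntax (None) is treated the same as a quads syntax.
        raise ValueError("Attempt to load quads into a graph")
    return syntax
```

      With this guard, the `.ttl` download URL passes, while the content-negotiated URL ending in `/data` is rejected before any HTTP request is made, even though the server would have returned Turtle or RDF/XML.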

      The check was introduced 6 months ago in this commit:
      https://github.com/apache/jena/commit/931a437bb49fecdb1cb70a5e6225e27141dec86c#diff-d0b3b8995c502712dac778f5bb61bc9dR146

      If I use the URL that the above URL redirects to, which contains a .ttl file extension, loading works fine:

      LOAD <http://api.finto.fi/download/yso/yso-skos.ttl> INTO GRAPH <http://www.yso.fi/onto/yso/>
      

      But this means that the LOAD ... INTO GRAPH ... command cannot be used with arbitrary Linked Data URIs, only with ones that happen to contain a file extension like .ttl, .rdf or .nt.
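
      One way a check could avoid this limitation is to prefer the Content-Type of the response and fall back to the extension only when the media type is unknown. The following Python sketch illustrates that idea under stated assumptions; the function name `is_triples_source` and the tables are hypothetical, and this is not a description of how Jena resolved the issue.

```python
import posixpath
from urllib.parse import urlparse

# Registered media types for common triples and quads syntaxes.
TRIPLES_MEDIA_TYPES = {"text/turtle", "application/rdf+xml", "application/n-triples"}
QUADS_MEDIA_TYPES = {"application/trig", "application/n-quads"}
# Illustrative extension fallback (an assumption, not Jena's registry).
TRIPLES_EXTENSIONS = {".ttl", ".rdf", ".nt"}


def is_triples_source(url, content_type=None):
    """Decide whether a source holds triples.

    The Content-Type header wins when it names a known RDF media type;
    otherwise the decision falls back to the URL's file extension.
    """
    if content_type:
        # Drop parameters such as "; charset=utf-8".
        media_type = content_type.split(";", 1)[0].strip().lower()
        if media_type in TRIPLES_MEDIA_TYPES:
            return True
        if media_type in QUADS_MEDIA_TYPES:
            return False
    ext = posixpath.splitext(urlparse(url).path)[1].lower()
    return ext in TRIPLES_EXTENSIONS
```

      Under this approach, the extensionless URL is accepted as soon as the final response (after the redirect) declares `text/turtle` or `application/rdf+xml`, and a response declaring a quads media type is still rejected.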

            People

              Assignee: Andy Seaborne (andy)
              Reporter: Osma Suominen (osma)