Apache Jena / JENA-132

N3 / TURTLE serializers ignore relative URI

    Details

    • Type: Wish
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Jena 2.10.1
    • Component/s: Jena, RDF API
    • Labels:
      None

      Description

      Unlike the RDF/XML* serializers, the N3 and TURTLE serializers ignore the base URI in their output.

      import java.io.StringReader
      import com.hp.hpl.jena.rdf.model.ModelFactory

      val turtle =
      """
      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

      <#JL>
      a foaf:Person ;
      foaf:homepage </2007/wiki/people/JoeLambda> ;
      foaf:img <images/me.jpg> ;
      foaf:name "Joe Lambda" .
      """

      val base = "http://w3.org/People/Joe"

      // Parse the Turtle into a fresh model, resolving relative URIs against base
      val model = {
        val m = ModelFactory.createDefaultModel()
        m.getReader("TURTLE").read(m, new StringReader(turtle), base)
        m
      }

      model.getWriter("TTL").write(model, System.out, base) // doesn't work as expected

      model.getWriter("RDF/XML-ABBREV").write(model, System.out, base) // this one is ok
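To make the report concrete, the following sketch shows what the reader does with the three relative references in the example when it resolves them against the base. It uses only `java.net.URI` from the Java standard library, not Jena; the class name `ResolveAgainstBase` and the helper `resolve` are hypothetical names chosen for illustration.

```java
import java.net.URI;

// Illustration only: standard reference resolution with java.net.URI,
// standing in for what an RDF parser does with relative references on read.
class ResolveAgainstBase {

    // Resolve a (possibly relative) reference against a base URI.
    static String resolve(String base, String ref) {
        return URI.create(base).resolve(ref).toString();
    }

    public static void main(String[] args) {
        String base = "http://w3.org/People/Joe";
        String[] refs = {"#JL", "/2007/wiki/people/JoeLambda", "images/me.jpg"};
        for (String ref : refs) {
            System.out.println("<" + ref + "> -> <" + resolve(base, ref) + ">");
        }
    }
}
```

The model therefore holds only absolute URIs such as `http://w3.org/People/Joe#JL`. The request in this issue is for the Turtle writer to turn them back into relative references when it is given the same base.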

        Activity

        Andy Seaborne made changes -
        Field            Original Value    New Value
        Status           Resolved [ 5 ]    Closed [ 6 ]
        Andy Seaborne made changes -
        Field            Original Value    New Value
        Status           Open [ 1 ]        Resolved [ 5 ]
        Assignee                           Andy Seaborne [ andy.seaborne ]
        Fix Version/s                      Jena 2.10.1 [ 12324086 ]
        Resolution                         Fixed [ 1 ]
        Andy Seaborne added a comment -

        Jena RIOT writes the output RDF as given:
        if the data has relative IRIs, then relative IRIs are output.
        The parsers can be used to output relative IRIs by adding a processing stage (StreamRDF).

        Relative URIs are also used in output when the base is set in the write operation.
        In that case the output starts with a @base directive as its first line, which can be removed.
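As a rough illustration of what abbreviating against a base means on output, here is a minimal sketch that uses only `java.net.URI`. It is not Jena's writer code and does not use the RIOT API; the class `RelativizeSketch` and the method `abbreviate` are hypothetical. It assumes the inputs are syntactically valid URIs and handles only two simple cases: same-document fragments, and IRIs that share the base's scheme and authority. Everything else is written out in full.

```java
import java.net.URI;

// Hypothetical sketch, not part of Jena: turn an absolute IRI into a
// Turtle IRI token that is relative to the base where that is easy to do.
class RelativizeSketch {

    static String abbreviate(String base, String iri) {
        URI b = URI.create(base);
        URI u = URI.create(iri);

        // Ignore any fragment on the base itself.
        int hash = base.indexOf('#');
        String baseNoFrag = hash >= 0 ? base.substring(0, hash) : base;

        // Case 1: same document, only the fragment differs -> <#frag>
        if (iri.startsWith(baseNoFrag + "#")) {
            return "<" + iri.substring(baseNoFrag.length()) + ">";
        }

        // Case 2: same scheme and authority -> absolute-path reference.
        // Empty paths and paths starting with "//" are skipped because the
        // resulting reference would not resolve back to the same IRI.
        boolean sameScheme = b.getScheme() != null && b.getScheme().equals(u.getScheme());
        boolean sameAuthority = b.getRawAuthority() != null
                && b.getRawAuthority().equals(u.getRawAuthority());
        String path = u.getRawPath();
        if (sameScheme && sameAuthority && path != null
                && !path.isEmpty() && !path.startsWith("//")) {
            StringBuilder sb = new StringBuilder(path);
            if (u.getRawQuery() != null) sb.append('?').append(u.getRawQuery());
            if (u.getRawFragment() != null) sb.append('#').append(u.getRawFragment());
            return "<" + sb + ">";
        }

        // Fallback: keep the absolute IRI unchanged.
        return "<" + iri + ">";
    }

    public static void main(String[] args) {
        String base = "http://w3.org/People/Joe";
        System.out.println(abbreviate(base, "http://w3.org/People/Joe#JL"));
        System.out.println(abbreviate(base, "http://w3.org/2007/wiki/people/JoeLambda"));
        System.out.println(abbreviate(base, "http://xmlns.com/foaf/0.1/Person"));
    }
}
```

The sketch never produces path-relative forms such as `<images/me.jpg>`; it falls back to the absolute-path form instead, which still resolves to the same IRI against the same base.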

        Andy Seaborne added a comment -

        If I remember correctly, it was commented out because it needs finishing and testing.

        Also, from memory, all the <#> stuff needs removing because the base is not going to have a fragment if used with HTTP GET.

        But it was a while ago.

        Paolo Castagna added a comment -

        Alexandre, the N3JenaWriterCommon.java file which Andy pointed you at is here:
        https://svn.apache.org/repos/asf/incubator/jena/Jena2/jena/trunk/src/main/java/com/hp/hpl/jena/n3/N3JenaWriterCommon.java

        Here is how you can check out Jena sources, run the tests and produce a patch:

        svn co https://svn.apache.org/repos/asf/incubator/jena/Jena2/jena/trunk/ jena
        cd jena
        mvn test
        ... here you make your changes (and add tests) ...
        mvn test
        svn diff > JENA-132.patch

        Use More Actions > Attach Files to attach the patch.

        I wonder why the functionality you want was there but has always been commented-out code (since 2009).

        Tim Berners-Lee added a comment -

        Thanks. I didn't find N3JenaWriterCommon but did find
        http://jena.cvs.sourceforge.net/viewvc/jena/jena/src/com/hp/hpl/jena/n3/N3JenaWriter.java?revision=1.7&view=markup

        Hmm, no doAbbreviatedBaseURIref though ... wrong repo? wrong branch?

        In that file it looks like it is at
        Line 652 return "<" + r.getURI() + ">" ;

        Where is this thing hosted nowadays? Excuse my ignorance.

        Andy Seaborne added a comment -

        Tim - this is not a bug. A bug would be where it does something wrong and the writer writes something that can't be parsed or isn't the RDF given as input.

        The writer writes correct and legal Turtle for the test case given.

        The relevant file is probably N3JenaWriterCommon and unused code controlled by doAbbreviatedBaseURIref. It needs enabling and testing, and test cases written.

        I look forward to a contribution to Jena.

        Tim Berners-Lee added a comment -

        PLEA FOR RELATIVE URIs

        Many systems, including most of the ones I have built and use day to day,
        have many internal relative links and really are better
        built without knowledge of their own base URI, in that stored URIs
        are always relative. Even if the software absolutizes them
        when processing them, no absolute URIs are stored for the local identifiers, so
        one can take a system such as a bug tracker and clone it easily to make
        a new copy of the same system.

        For example, an issue tracking system or a calendar system built with RDF
        is interesting to clone.

        This is not to say all systems are like this, but some are and they are an important
        class.

        For example, in one project I have a bunch of RDF and rules, where there are
        locally defined instances and local ontologies, and I process it
        in file:// space much of the time, and browse it in http://
        space through a web server, but when I edit it the http space
        is actually proxied to a read-write-linked-data server on a different port.
        When Jena serializes, suddenly the URI of the slave server behind the proxy
        crops up in the files.

        All the URIs within the system are relative.

        Attempting to introduce Jena-based code to this system is currently blocked
        on the need for this bug to be resolved in Jena.

        For this reason, for example cwm's serializers use relative URIs, and even the
        RDF/XML one has an option to put relative URIs in namespaces.
        The rdflib.js library from Tabulator uses relative URIs by default.

        Yes, there are cases when people want to design systems without these properties,
        so absolute URIs should be an option.

        Note other reasons for relative URIs include readability, and storage space and transmission length.

        There is a classic failure mode for RDF systems in which
        developers bring up a system on test.acme.com and then move it to production.acme.com
        and everything breaks. I know there are cases for absolute URIs, but relative URIs are
        a very important best practice.

        Tim

        PS: This sort of RDF system is like writing a program or set of programs.
        Imagine writing a program

        pi = 3.14159265359;
        print (2 * pi);

        and it being saved in circles.py as

        <file:///users/andys/programs/play/circles.py#pi> = 3.14159265359;
        print (2 * <file:///users/andys/programs/play/circles.py#pi>);

        Not practical, not the sort of thing you can copy and move around.

        Andy Seaborne made changes -
        Field            Original Value    New Value
        Issue Type       Bug [ 1 ]         Wish [ 5 ]
        Priority         Major [ 3 ]       Minor [ 4 ]
        Andy Seaborne added a comment -

        The output from the TTL writer is correct RDF. It just has not used the base URI to abbreviate the data. It's the same triples.

        The output files from the TTL writer are a faithful representation of the triples and are fully portable (i.e. the RDF will be read the same wherever the file is read from). Some people would argue this is better behaviour than the relative URIs generated by the RDF/XML-ABBREV writer.

        A contribution to enable the writing of relative URIs by the TTL writer would be good.

        Reclassified as a feature request.

        Alexandre Bertails created issue -

          People

          • Assignee: Andy Seaborne
          • Reporter: Alexandre Bertails
          • Votes: 3
          • Watchers: 3
