  1. Apache Jena
  2. JENA-712

ARQ serialises queries and updates using relative URIs but does not include a BASE clause

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Jena 2.11.2
    • Fix Version/s: Jena 2.12.0
    • Component/s: ARQ
    • Labels: None

    Description

      An internal discussion with harschware has raised what we think is a bug in ARQ's behaviour, though it is somewhat open to interpretation and so may be controversial.

      The code I will attach demonstrates the issue.

      The problem arises as follows:

      1 - When given a query/update containing a relative URI, ARQ resolves it against an implicit Base URI, which is the current working directory
      2 - When toString() is called on the parsed Query or UpdateRequest, the implicit Base URI is used and relative URIs are output, but no `BASE` clause is output
      3 - The query is transmitted to a different system which has a different working directory and so interprets the relative URIs differently, resulting in unexpected behaviour/errors

      This causes us issues because the relative URIs are valid relative to the working directory of the client but not relative to the working directory of the server. We therefore want absolute URIs to be transmitted to the server.
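
      The mismatch in step 3 can be illustrated without ARQ at all. The following is a minimal sketch using only `java.net.URI`; the two directory names are hypothetical stand-ins for the client and server working directories, and the class and method names are illustrative and not part of Jena.

```java
import java.net.URI;

// Sketch only: shows how one relative reference resolves differently
// depending on which base URI the receiving system assumes.
final class RelativeUriDemo {

    // Resolve a relative reference against a base URI (RFC 3986 style resolution).
    static URI resolve(String base, String relative) {
        return URI.create(base).resolve(relative);
    }

    public static void main(String[] args) {
        String relative = "path/to/thing";

        // Hypothetical working directories; neither comes from the issue.
        String clientBase = "file:///home/client/work/";
        String serverBase = "file:///var/lib/server/";

        URI clientView = resolve(clientBase, relative);
        URI serverView = resolve(serverBase, relative);

        System.out.println("client resolves to: " + clientView);
        System.out.println("server resolves to: " + serverView);
        System.out.println("same resource? " + clientView.equals(serverView));
    }
}
```

      Because the serialised query carries only `<path/to/thing>` and no `BASE`, the receiving system has no way to recover the base the client used.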

      For example, given the following query string:

      SELECT * WHERE { <path/to/thing> a ?type }
      

      Calling toString() on the resulting Query object gives the following:

      SELECT  *
      WHERE
        { <path/to/thing> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type }
      

      This does not include a `BASE` declaration. However, if we force the `Query` object to have a null base via `setBaseURI((String)null)`, ARQ prints the following when `toString()` is called:

      BASE    <file:///Users/rvesse/Documents/Work/Code/jena-playground/>
      
      SELECT  *
      WHERE
        { <path/to/thing> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type }
      

      More generally, it seems that a BASE declaration is never written whenever an implicit Base URI is used, or when a Base URI is passed only to the QueryFactory.create() or UpdateFactory.create() call. In other words, when there is an IRIResolver set but not a specific Base URI, no BASE declaration will be written, yet URIs will still be serialised in relative form.

      We can appreciate that other people may have use cases where leaving relative URIs as-is and not including a `BASE` is desirable. Our feeling, however, is that in the more general case this does more harm than good and lets users unwittingly shoot themselves in the foot, as we have done in this example.

      We would like to propose that the default behaviour should be for a `BASE` declaration to always be written if relative URIs are being output, or at the very least that the behaviour be configurable.
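
      As a rough illustration of the proposed rule, the sketch below models the choice of whether to emit `BASE` as a plain string operation. It is not ARQ's serialiser and is not taken from any of the attached patches; the class name, the method, and the `writeImplicitBase` flag are all hypothetical.

```java
// Sketch only: models the proposed behaviour, not ARQ's actual serialisation code.
final class BaseClauseSketch {

    /**
     * Prefix a serialised query body with a BASE clause when one should be written.
     *
     * @param body              query text that may contain relative URIs
     * @param explicitBase      base URI set explicitly on the query, or null
     * @param implicitBase      base URI the parser fell back to (e.g. working directory), or null
     * @param writeImplicitBase hypothetical switch: also emit BASE for an implicit base
     */
    static String serialise(String body, String explicitBase,
                            String implicitBase, boolean writeImplicitBase) {
        String base = explicitBase;
        if (base == null && writeImplicitBase) {
            base = implicitBase;
        }
        if (base == null) {
            return body; // nothing to declare; relative URIs go out as-is
        }
        return "BASE    <" + base + ">\n\n" + body;
    }

    public static void main(String[] args) {
        String body = "SELECT  *\nWHERE\n  { <path/to/thing> a ?type }\n";
        // Hypothetical client working directory.
        String implicit = "file:///home/client/work/";

        System.out.println(serialise(body, null, implicit, false)); // behaviour reported here
        System.out.println(serialise(body, null, implicit, true));  // behaviour proposed here
    }
}
```

      With the flag on by default, a query that relied on an implicit base would carry that base with it; turning the flag off would keep the relative-only output for users who depend on it.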

      Attachments

        1. JENA-712-AlwaysWriteBase.patch
          0.7 kB
          Rob Vesse
        2. JENA-712-ConfigurableOutputImplictBase.patch
          2 kB
          Rob Vesse
        3. JENA-712-ConfigurableOutputImplictBaseOnByDefault.patch
          3 kB
          Rob Vesse
        4. JENA-712-UseBaseOnlyIfExplicit.patch
          4 kB
          Andy Seaborne
        5. SparqlRelativeUriTreatment.java
          3 kB
          Rob Vesse

          People

            andy Andy Seaborne
            rvesse Rob Vesse