Apache Jena / JENA-1161

riot cmdline uses wrong base when parsing Turtle over http

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Jena 3.0.1
    • Fix Version/s: Jena 3.1.0
    • Component/s: Cmd line tools
    • Labels: None

    Description

      When riot parses a Turtle file served over http:// or https:// that has no @base and uses relative IRI references, it wrongly resolves those references against the current directory (as a file:/// IRI) instead of the URL the file was retrieved from.

      The command line

      stain@biggie:/tmp$ riot https://cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl
      

      where that URL returns Content-Type: text/turtle;charset=utf-8 with the body:

      @prefix : <#> .
      <> :a <#test> .
      

      is wrongly parsed by the riot command line tool to be relative to the current directory:

      <file:///tmp/https:/cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl> <file:///tmp/https:/cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl#a> <file:///tmp/https:/cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl#test> .
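
The shape of this output is consistent with the URL argument being handled as a relative file name: joined to the working directory, with the doubled slash after `https:` collapsed by path normalization. The Python sketch below reproduces that shape. It only illustrates the symptom and is an assumption about the mechanism, not a trace of the riot code; the function name `as_local_file_iri` is hypothetical.

```python
import posixpath


def as_local_file_iri(arg: str, cwd: str) -> str:
    """Treat arg as a relative file name under cwd (the suspected mistake)."""
    # Joining keeps the URL text as path segments: /tmp/https://cdn...
    joined = posixpath.join(cwd, arg)
    # Normalization collapses "//" into "/", giving /tmp/https:/cdn...
    normalized = posixpath.normpath(joined)
    return "file://" + normalized


URL = (
    "https://cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43"
    "/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl"
)

print(as_local_file_iri(URL, "/tmp"))
```

The printed IRI has the same `file:///tmp/https:/cdn.rawgit.com/...` prefix as the subject in the output above.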
      

      The expected output is what riot produces when the same URI is also supplied as --base:

      stain@biggie:/tmp$ riot --base https://cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl https://cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl
      <https://cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl> <https://cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl#a> <https://cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl#test> .
      

      (The exception is when a Content-Location header is provided or an HTTP redirect has been followed; in that case the resulting URI should be used as the base.)
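
The base selection described here can be sketched as a small precedence function. This is a minimal Python illustration of the order the report asks for, not Jena's implementation: the function `choose_base` and its parameter names are hypothetical, and resolving a relative Content-Location against the retrieval URL is an assumption.

```python
from typing import Optional
from urllib.parse import urljoin


def choose_base(
    requested_url: str,
    final_url: Optional[str] = None,
    content_location: Optional[str] = None,
    explicit_base: Optional[str] = None,
) -> str:
    """Pick the base IRI for a document fetched over HTTP."""
    # An explicit --base style override wins over everything else.
    if explicit_base:
        return explicit_base
    # After redirects, the URL actually retrieved replaces the requested one.
    retrieval_url = final_url or requested_url
    # Content-Location may itself be relative, so resolve it (assumption).
    if content_location:
        return urljoin(retrieval_url, content_location)
    return retrieval_url


print(choose_base("https://example.org/a.ttl"))
print(choose_base("https://example.org/a.ttl", final_url="https://example.org/v2/a.ttl"))
print(choose_base("https://example.org/a.ttl", content_location="/data/b.ttl"))
```

With no override, redirect, or Content-Location, the function returns the requested URL itself, which is the behaviour the report expects from riot.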

      Relevant specs:

      https://www.w3.org/TR/turtle/#sec-iri-references
      https://www.ietf.org/rfc/rfc3986 sections 5.1 and 5.2
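
As a cross-check of the expected triple, the relative references in the example document can be resolved against the retrieval URL with Python's standard `urllib.parse.urljoin`, which implements RFC 3986 style reference resolution. This is an illustrative sketch, not Jena code; the helper name `resolve` is hypothetical.

```python
from urllib.parse import urljoin

BASE = (
    "https://cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43"
    "/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl"
)


def resolve(base: str, reference: str) -> str:
    """Resolve a relative IRI reference against a base IRI."""
    return urljoin(base, reference)


# The document declares "@prefix : <#> .", so ":a" expands to <#a>.
# The empty reference <> denotes the document itself.
for reference in ("", "#a", "#test"):
    print(f"<{resolve(BASE, reference)}>")
```

The three printed IRIs are the base itself, the base with `#a`, and the base with `#test`, matching the subject, predicate, and object of the expected output.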

          People

            andy Andy Seaborne
            stain Stian Soiland-Reyes