Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: ARQ, Fuseki
    • Labels:

      Description

      The capability to generate JSON directly from a SPARQL (or extended SPARQL) query would enable the creation of JSON data APIs over published linked data.

      This project would cover:

      1. Design and publication of a design.
      2. Refinement of design based on community feedback.
      3. Implementation, including testing.
      4. Refinement of implementation based on community feedback.

      Skills required: Java, some parser work, design and discussion with the user community, basic understanding of HTTP and content negotiation.

        Issue Links

          Activity

          andy.seaborne Andy Seaborne added a comment -

          This is not a prescriptive design - just an example of a possibility.
          http://steveharris.tumblr.com/post/4590579712/construct-json

          It may be better to consider it a SELECT query.

          The relationship to content negotiation of JSON-LD with a regular SPARQL CONSTRUCT query needs to be considered.

          This project may include working and interacting with the open source project that provides the JSON-LD engine that Jena uses. See https://github.com/jsonld-java/jsonld-java

          alexsdutton Alexander Dutton added a comment - - edited

          If you'd like another perspective, I've been doing something similar to turn SPARQL resultsets into JSON objects by using carefully crafted variable names. Rows are grouped using the ?id variable, and underscores in variable names lead to nested objects. Separately we say which nested objects can appear multiple times. Hence, a query like this (with 'knows' flagged as an array):

          SELECT * WHERE {
            ?id a foaf:Person ;
              foaf:name ?name .
            OPTIONAL {
              ?id foaf:knows ?knows_id .
              ?knows_id foaf:name ?knows_name
            }
          }
          

          leads to e.g.

          [{
              "id": "http://example.com/alice",
              "name": "Alice",
              "knows": [{
                "id": "http://example.com/bob",
                "name": "Bob"
              }, ...]
          }, ...]
          

          For anything remotely complicated it requires some particularly verbose queries. Something more elegant could probably be done within Jena.
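
As a rough illustration of the variable-naming convention described in this comment (not the commenter's actual code), the grouping could be sketched in plain Java as follows. The class `NestedRows`, its `nest` method, and the representation of a result row as a `Map` from variable name to value are all hypothetical. The sketch handles only one level of nesting and assumes that every array-flagged object binds its own `_id` variable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: folds flat result rows into nested objects by variable name. */
final class NestedRows {

    /**
     * rows: one map per result row (variable name -> value; unbound variables are absent).
     * arrayKeys: prefixes such as "knows" whose nested objects may repeat.
     */
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> nest(List<Map<String, String>> rows, Set<String> arrayKeys) {
        // Rows sharing the same ?id value are merged into one top-level object.
        Map<String, Map<String, Object>> byId = new LinkedHashMap<>();
        for (Map<String, String> row : rows) {
            Map<String, Object> top = byId.computeIfAbsent(row.get("id"), k -> new LinkedHashMap<>());
            for (Map.Entry<String, String> e : row.entrySet()) {
                String var = e.getKey();
                int us = var.indexOf('_');
                if (us < 0) {                       // plain variable: a direct member
                    top.put(var, e.getValue());
                    continue;
                }
                String prefix = var.substring(0, us);
                String field = var.substring(us + 1);
                Map<String, Object> child;
                if (arrayKeys.contains(prefix)) {   // repeated nested object
                    List<Map<String, Object>> items = (List<Map<String, Object>>)
                            top.computeIfAbsent(prefix, k -> new ArrayList<Map<String, Object>>());
                    child = findOrAdd(items, row.get(prefix + "_id"));
                } else {                            // single nested object
                    child = (Map<String, Object>)
                            top.computeIfAbsent(prefix, k -> new LinkedHashMap<String, Object>());
                }
                child.put(field, e.getValue());
            }
        }
        return new ArrayList<>(byId.values());
    }

    /** Returns the list element with the given id, creating it if it does not exist yet. */
    private static Map<String, Object> findOrAdd(List<Map<String, Object>> items, String id) {
        for (Map<String, Object> item : items) {
            if (id != null && id.equals(item.get("id"))) {
                return item;
            }
        }
        Map<String, Object> item = new LinkedHashMap<>();
        if (id != null) {
            item.put("id", id);
        }
        items.add(item);
        return item;
    }
}
```

Because every member has to be spelled out as a variable, a query for a deeply nested object grows quickly, which matches the verbosity noted in the comment.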

          andy.seaborne Andy Seaborne added a comment -

          Yes - another design approach to consider is a separate templating/formatting system that takes the output of a SELECT query. The CONSTRUCT JSON form is one specific style of template - a more general one could be either in-query or part of the JSON data API setup.

          akuckartz Andreas Kuckartz added a comment -

          Generating JSON-LD is an additional feature.

          andy.seaborne Andy Seaborne added a comment -

          Jena supports JSON-LD already as an RDF format (as of last week).

          dileepaj Dileepa Jayakody added a comment -

          Hi All,

          My name is Dileepa Jayakody, a research student from University of Moratuwa, Sri Lanka. My research interests include semantic-web, linked-data domains and I have worked with related linked-data technologies such as RDF, FOAF, SPARQL and semantic-web projects like Apache Stanbol.

          I'm interested in working on this project and wish to submit a GSOC proposal for this after doing some background research. Is this project about supporting the W3C recommendation for serializing SPARQL query results in JSON format [1]?

          Thanks,
          Dileepa
          [1] http://www.w3.org/TR/sparql11-results-json/

          andy.seaborne Andy Seaborne added a comment -

          The standard SPARQL query results JSON format is already supported.

          This project is about allowing the application writer/data publisher to define custom JSON output (see the 'construct-json' link).

          Currently, this project does not have a volunteer mentor.

          dileepaj Dileepa Jayakody added a comment -

          Hi Andy,

          Thanks for your reply.

          Dileepa

          kinow Bruno P. Kinoshita added a comment - - edited

          Hi,

          I've been trying to write a patch for this, but before having a definitive patch to submit I'll need to learn a bunch of things.

          At the moment what I managed to do was to understand how the SPARQL and ARQ grammars are generated (with JavaCC) and I've updated them (master.jj, which creates sparql_11.jj, arq.jj and other files via the grammar executable).

          You can check my work-in-progress updated grammar here https://gist.github.com/kinow/875851f7379abfcb4f13

          I haven't worked with compiler generators and grammars in a while; the last time was in 2010, and then I used ANTLR. So there are probably parts that can be improved (like the part where I couldn't make JavaCC recognize a <COMMA> token).

          Even though the grammar seems to be working, and I'm collecting the "variable": ?variable in a java.util.Map in the Query object, I still have to update SPARQL and Fuseki to return JSON.

          BTW: I'm implementing what Andy suggested in the mailing list, a SELECT JSON statement, rather than CONSTRUCT JSON.

          andy.seaborne Andy Seaborne added a comment -

          Suggestion: make SELECT JSON a whole new top-level query type. It's going to need its own evaluation call anyway.

          <SELECT> <JSON>
          { getQuery().setQueryJsonType() ; }

          Either sprinkle LOOKAHEAD(2) or add a new token "SELECT"<WS>"JSON", or maybe skip all that and make it JSON (no SELECT).

          I don't see why the <COMMA> is a problem.

          A JSON member/value pair needs a value, and the value can be a variable, a string or a number. Something like:

          <STRING1> <COLON> (<VAR> | <STRING> | <NUMBER>)
          

          Whether the field part should also be allowed to be a variable is possibly one step too far at the moment.

          kinow Bruno P. Kinoshita added a comment -

          Thanks for your feedback on this, Andy! I was already going to start hacking the SPARQL code.

          I'm +1 for making it JSON. It will make the grammar simpler, and I think it will be clearer for users that this is a new statement and has a different syntax.

          I'll play with the grammar again during holidays and will post a new version soon.

          Thanks again,
          Bruno

          kinow Bruno P. Kinoshita added a comment -

          Hello

          The grammar has been updated and it now supports the following syntax:

          JSON { "name": ?name } WHERE { ?name ?a ?b } LIMIT 3
          

          The LOOKAHEAD(2) is no longer necessary, thanks to Andy's suggestion, and after the change I find JSON easier to understand and less error-prone than SELECT JSON.

          [1] QueryUnit    ::= Query
          [2] Query           ::= Prologue ( SelectQuery | ConstructQuery | DescribeQuery | AskQuery | JsonQuery ) ValuesClause
          [...] ...
          

          Q1) Does that look correct?

          I've debugged it and, assuming this grammar is correct for what is proposed in this issue, the next step would be to update SPARQL_Query#executeQuery() and prepare the JSON result.

          I'm tempted to re-use what SPARQL_Query does when the query type is SELECT, and simply replace the variable names with the ones provided by the user, i.e.

          JSON { "name": ?abcde } WHERE { ?abcde ?a ?b }
          

          In the example above, I'd replace the var "abcde" by "name" before serializing the result.
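
A minimal sketch of that renaming step, in plain Java and independent of Jena: the template is held as a map from JSON member name to variable name (similar in spirit to the java.util.Map mentioned earlier in this thread), and each solution row is projected through it. The class `JsonTemplate` and its methods are hypothetical names used only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for the { "key": ?var } part of a JSON query. */
final class JsonTemplate {
    // JSON member name -> SPARQL variable name (without the leading '?').
    private final Map<String, String> members = new LinkedHashMap<>();

    JsonTemplate member(String key, String var) {
        members.put(key, var);
        return this;
    }

    /**
     * Builds one output object from one solution row (variable name -> value).
     * Variables not named in the template are dropped; a variable that is
     * unbound in the row yields a null value for its key.
     */
    Map<String, Object> apply(Map<String, Object> row) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> m : members.entrySet()) {
            out.put(m.getKey(), row.get(m.getValue()));
        }
        return out;
    }
}
```

For the query above, `new JsonTemplate().member("name", "abcde")` applied to a row binding `abcde`, `a` and `b` would keep only the `abcde` value, under the key `name`.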

          Q2) Does that sound like a good plan?

          Talking about the serialization... in Fuseki, users define the serialization type (xml, text/plain, json, etc).

          Q3) Should we enforce JSON when the user uses a JSON query? I think it is not necessary, maybe we can simply return the result set with the changed variables.

          Q4) And finally, is it safe and correct to use the JSONOutput#format to return the result of the JSON query? I think the result will be similar to:

          *query:*

          JSON { "name": ?a } WHERE { ?a ?b ?c }
          

          *result:*

          {
            "head": {
              "vars": [ "name" ]
            } ,
            "results": {
              "bindings": [
                {
                  "name": { "type": "uri" , "value": "http://example.org/book/book5" }
                } ,
                {
                  "name": { "type": "uri" , "value": "http://example.org/book/book5" }
                } ,
                {
                  "name": { "type": "uri" , "value": "http://example.org/book/book3" }
                }
              ]
            }
          }
          

          Reading the old comments here and the post in the first comment, I think maybe we should use something new (or add some new method somewhere) to return something close to:

          {
            "head": {
              "vars": [ "name" ]
            } ,
            "results": {
              "bindings": [
                {
                    "name": "http://example.org/book/book5"
                },
                {
                    "name": "http://example.org/book/book5"
                },
                {
                    "name": "http://example.org/book/book3"
                }
              ]
            }
          }
          

          Thanks!

          andy.seaborne Andy Seaborne added a comment -

          Q1

          The grammar looks OK (by visual inspection of the JavaCC, not by reading
          the machine-generated version).

          QueryExecution can have a new execJson (better name?) to make this
          class of query a new, top-level type. SPARQL_Query#executeQuery would then
          have

           if ( query.isJsonType() ) {
              JSONArray results = qExec.execJson() ;
              log.info(format("[%d] exec/json",action.id)) ;
              return new SPARQLResult(results) ;
           }
          

          To break up the work into tasks, maybe start with execution in ARQ, and
          move onto Fuseki. The Fuseki part has some additional stuff around content
          types, nothing mysterious, just more machinery.

          Q2

          The content negotiation is presumably application/json - see Q3 as
          well. As the results are also triggered by the query type (here, JSON, not
          SELECT or a model), we don't get into a circularity of content negotiation
          and query execution interaction.

          Q3

          Not sure I quite understand the question. We can be quite liberal on
          conneg if you want, and include any common, but wrong, MIME types that people
          use. If asked for text/plain, or &format=text, we can either return
          text/plain (convenient when working in a browser to force text/plain),
          or return application/json (maybe better for environments where it is
          tricky to set the outgoing "Accept" header). Fine tuning.

          Q4

          Yes - there are 4 choices for what to substitute for ?abcde, maybe
          more:

          1. A Json value as you suggest. This is my preference.
          2. RDF terms written application/sparql-results+json style.
          3. RDF terms written JSON-LD style.
          4. RDF terms written RDF/JSON style.

          I think the utility of the JSON query feature is to use the data in
          JSON/JavaScript programs. Choices 2, 3 and 4 are about keeping the RDF term
          details, whereas 1 is lossy. But if the app wants the details, there are
          other good formats to use, such as application/sparql-results+json.
          Therefore, idiomatic JSON is the choice I favour.

          That's your second
          example.

          So how to map each kind of term encountered:

          URI => JSON string.

          Blank Node => JSON String "_:label". The only real use of blank nodes is
          to see when they are the same as elsewhere in the results. Nice if the
          label is "b0", "b1" etc.

          Literal : most complicated.

          1. An XSD number should become a JSON number
          2. An XSD boolean should become a JSON boolean.
          3. Anything else, take the lexical form.

          I can't see anything sensible to handle language tags.

          undefined => Either JSON null or omit the field. null is maybe more
          JSON idiomatic?
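
The mapping rules above could be sketched as follows. This is only an illustration in plain Java: the `Term` class is a hypothetical stand-in for an RDF term (it is not Jena's Node API), `TermToJson` is an invented name, and the set of XSD numeric datatypes checked is deliberately incomplete. An unbound variable is represented by a Java `null` and written as JSON null, one of the two options discussed.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical, simplified RDF term; not Jena's Node API. */
final class Term {
    enum Kind { URI, BLANK, LITERAL }
    final Kind kind;
    final String value;     // IRI, blank node label, or literal lexical form
    final String datatype;  // datatype IRI for literals, otherwise null

    Term(Kind kind, String value, String datatype) {
        this.kind = kind;
        this.value = value;
        this.datatype = datatype;
    }
}

final class TermToJson {
    static final String XSD = "http://www.w3.org/2001/XMLSchema#";
    // Illustrative subset of the XSD numeric datatypes.
    private static final Set<String> NUMERIC = new HashSet<>(
            Arrays.asList("integer", "decimal", "double", "float", "int", "long"));

    /** Returns JSON text for one term; a null term means "unbound". */
    static String toJson(Term t) {
        if (t == null) return "null";
        switch (t.kind) {
            case URI:   return quote(t.value);
            case BLANK: return quote("_:" + t.value);
            default:    return literal(t);
        }
    }

    private static String literal(Term t) {
        String dt = t.datatype == null ? "" : t.datatype;
        if (dt.equals(XSD + "boolean")) {
            if (t.value.equals("true") || t.value.equals("1")) return "true";
            if (t.value.equals("false") || t.value.equals("0")) return "false";
        }
        if (dt.startsWith(XSD) && NUMERIC.contains(dt.substring(XSD.length()))) {
            try {
                return new BigDecimal(t.value).toString();
            } catch (NumberFormatException e) {
                // e.g. "NaN" or "INF": no JSON number exists, so use the lexical form.
            }
        }
        return quote(t.value);  // anything else: lexical form (language tags are dropped)
    }

    /** Writes a JSON string, escaping quotes, backslashes and control characters. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }
}
```

The sketch is lossy by design: a URI and a plain literal with the same characters produce the same JSON string, which is the trade-off accepted for choice 1.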

          Other

          We even have our own JSON subsystem, org.apache.jena.atlas.json, which
          post-dates JSONOutput.

          Key features include streaming input (JSONParser/JSONHandler) and
          output (JSWriter), and to some extent it predates some of the commonly
          used libraries. No JSON-Java object-model. It's just
          the JSON language, so pared down for speed.

          I can well imagine that query type JSON will be used for large extractions
          from a dataset. Streaming matters.

          kinow Bruno P. Kinoshita added a comment -

          Hi Andy! Thanks for the detailed response. Lots of things to learn. I'll need some time to digest it and learn other parts of the code, so I'll probably continue to read the docs, look at the other smaller issues in the atlas submodule, arq, and other parts. But I'll update this issue as soon as I've made some progress.

          Learning is always exciting, and this issue seems to require working with many different parts of the project.

          Thanks again!
          Bruno

          andy.seaborne Andy Seaborne added a comment -

          I'm personally interested in seeing this JIRA through. If you'd like to draw up a list of items that need to be done, then I'll pick off those that are to do with the grungy details of wiring out the overall system. There will be some items that are small ... if you already know where to look.

          A first intermediate step might be to get parser-print working. arq.qparse does quite a lot of checking (e.g. query.equals, query.hashCode for the equality contract; checking the associated algebra is the same value-based equality contract).

          kinow Bruno P. Kinoshita added a comment -

          > If you'd like to draw up a list of items that need to be done, then I'll pick off those that are to do with the grungy details of wiring out the overall system.

          Yay, sure Andy. I can do that.

          > A first intermediate step might be to get parser-print working.

          I'll try to summarize what needs to be done, and while I work on that I'll take a look at parser-print.

          kinow Bruno P. Kinoshita added a comment -

          Hi!

          I have updated the grammar to support what was suggested earlier for JSON values: Var() | String() | Number().

          > QueryExecution can have a new execJson (better name?) to make this
          class of query a new, top level type. SPARQL_Query#executeQuery then
          have

          Done! More or less like that?

          kinow Bruno P. Kinoshita added a comment - - edited

          Here's the list of what I think are the remaining activities:

          Activities
          • Change Fuseki and ConNeg to offer the JSON query response to the user (probably as application/json, or rendered as text/plain if the user requests it)
          • Implement QueryEngineHTTP#execJson()?
          • Get parser-print working (as suggested above)
          • Work on how to handle large datasets and streaming. I'm starting to build a JsonArray from the QueryIterator, but probably that's wrong?
          • Write unit tests
          • Get feedback from users/other devs on the mailing list?

          Anything else?

          andy.seaborne Andy Seaborne added a comment -

          Maybe have two operations: QueryExecution.execJson() => one JsonArray and also QueryExecution.execJsonItems() => Iterator<JsonValue>, one iteration per json object generated (one per row of matches to the graph pattern).

          Both the JsonArray form and the iterator form are useful.

          Add the choice to SPARQLResult (small).

          Implement QueryExecutionBase.execJson()

          c.f. QueryExecution.execConstruct, QueryExecution.execConstructTriples.

          JsonWriter is the streaming output. A JsonArray would mean collecting into memory then writing the entire JSON structure, which is non-streaming. That said, I have found it works well to build the non-streaming version first, get everything working (easier to debug), and then go back and replace it with a stream writer once all the machinery is working and tested.
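
To make the difference between the two forms concrete, here is a small plain-Java sketch. It does not use Jena's classes: `JsonRowOutput` is a hypothetical name, each row is assumed to arrive as an already serialized JSON object string, and `java.io.PrintWriter` stands in for whatever streaming writer the real implementation would use.

```java
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical illustration of the "collect" and "stream" output styles. */
final class JsonRowOutput {

    /** Non-streaming: every row is held in memory before anything is written. */
    static List<String> collect(Iterator<String> rows) {
        List<String> all = new ArrayList<>();
        while (rows.hasNext()) {
            all.add(rows.next());
        }
        return all;
    }

    /** Streaming: each row is written as soon as the iterator produces it. */
    static void stream(Iterator<String> rows, PrintWriter out) {
        out.print("[");
        boolean first = true;
        while (rows.hasNext()) {
            if (!first) {
                out.print(",");
            }
            out.print(rows.next());
            first = false;
        }
        out.print("]");
    }
}
```

The streaming form needs only the current row in memory, which matters for large extractions, while the collected form is easier to inspect in a debugger.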

          kinow Bruno P. Kinoshita added a comment -

          Using the books.ttl example dataset, and the following query:

          query-sparql.sparql
          PREFIX purl: <http://purl.org/dc/elements/1.1/>
          PREFIX w3: <http://www.w3.org/2001/vcard-rdf/3.0#> 
          PREFIX : <http://example.org/book/> 
          
          SELECT ?author ?title 
          WHERE 
          {
          ?book purl:creator ?author .
          ?book purl:title ?title . 
          FILTER (?author = 'J.K. Rowling')
          }
          

          Produces the following result set.

          ----------------------------------------------------------------
          | author         | title                                       |
          ================================================================
          | "J.K. Rowling" | "Harry Potter and the Order of the Phoenix" |
          | "J.K. Rowling" | "Harry Potter and the Philosopher's Stone"  |
          | "J.K. Rowling" | "Harry Potter and the Half-Blood Prince"    |
          | "J.K. Rowling" | "Harry Potter and the Deathly Hallows"      |
          ----------------------------------------------------------------
          

          And the JSON query.

          query-json.sparql
          PREFIX purl: <http://purl.org/dc/elements/1.1/>
          PREFIX w3: <http://www.w3.org/2001/vcard-rdf/3.0#> 
          PREFIX : <http://example.org/book/> 
          
          JSON {
          "author": ?author, 
          "title": ?title 
          }
          WHERE 
          {
          ?book purl:creator ?author .
          ?book purl:title ?title . 
          FILTER (?author = 'J.K. Rowling')
          }
          

          Produces:

          [ { 
              "author" : "J.K. Rowling" ,
              "title" : "Harry Potter and the Order of the Phoenix"
            } ,
            { 
              "author" : "J.K. Rowling" ,
              "title" : "Harry Potter and the Philosopher's Stone"
            } ,
            { 
              "author" : "J.K. Rowling" ,
              "title" : "Harry Potter and the Half-Blood Prince"
            } ,
            { 
              "author" : "J.K. Rowling" ,
              "title" : "Harry Potter and the Deathly Hallows"
            }
          ]
          

          How are we supposed to handle this kind of data set in JSON queries?

          andy.seaborne Andy Seaborne added a comment -

          That's the answer I would expect. I don't understand the question at the end.

          kinow Bruno P. Kinoshita added a comment -

          Oh, yeah? If so that's easier than I thought. I thought we would provide some mechanism/syntax to have an output like:

          [ { 
              "author" : "J.K. Rowling" ,
              "title" : [
                  "Harry Potter and the Order of the Phoenix", 
                  "Harry Potter and the Philosopher's Stone",
                  "Harry Potter and the Half-Blood Prince",
                  "Harry Potter and the Deathly Hallows"
              ]
            }
          ]
          
          andy.seaborne Andy Seaborne added a comment - - edited

          I have seen attempts to produce more complex JSON from RDF. They have their place but so does a direct, simple, efficient format.

          One row, one JSON object. Streamed if the underlying query execution streams.

          To produce that title array, the code would need to see the whole result set. It is a very special case: the code must know that only "title" repeats and that there is only one "author", and just expressing that is complicated enough. If the condition is broken mid result set, we end up with the "HTTP 200 already sent" problem: the success status has gone out before the error is found.
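
To make the buffering point concrete, here is a minimal sketch in plain Java. It does not use any Jena API; the `Row` class and the `groupTitles` method are hypothetical names introduced only for illustration. It shows why the grouped form cannot be emitted row by row: every row has to be consumed before the first group is known to be complete.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupingSketch {
    /** Hypothetical stand-in for one query solution (author, title). */
    public static final class Row {
        public final String author;
        public final String title;
        public Row(String author, String title) {
            this.author = author;
            this.title = title;
        }
    }

    /**
     * Collects every title under its author. Nothing can be written out until
     * the iterator is exhausted, because a later row may still add a title to
     * an author seen earlier. Memory use therefore grows with the result set.
     */
    public static Map<String, List<String>> groupTitles(Iterator<Row> rows) {
        Map<String, List<String>> byAuthor = new LinkedHashMap<>();
        while (rows.hasNext()) {
            Row r = rows.next();
            byAuthor.computeIfAbsent(r.author, k -> new ArrayList<>()).add(r.title);
        }
        return byAuthor;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("J.K. Rowling", "Harry Potter and the Order of the Phoenix"),
            new Row("J.K. Rowling", "Harry Potter and the Philosopher's Stone"));
        System.out.println(groupTitles(rows.iterator()));
    }
}
```

The one-row, one-object form avoids this entirely, since each row can be serialised and forgotten as soon as it arrives.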

          kinow Bruno P. Kinoshita added a comment -

          > I have seen attempts to produce more complex JSON from RDF. They have their place but so does a direct, simple, efficient format.
          > One row, one JSON object. Streamed if the underlying query execution streams.

          Good point! Thanks Andy!

          > To produce that title array, the code would need to see the whole result set (very special case - know there is only repeated "title", and only one "author" – just expressing that is complicated enough. If the condition is broken mid result set, we end up with the "HTTP 200" already sent problem).

          I was about to look for ways to aggregate the titles into a single array, so I am very happy I asked here first. Let's do the simpler way: introduce the new JSON statement, write tests and docs, and then see if we need other, more complex methods.

          Thanks a lot Andy! Going to use some spare time in this Carnival to work on this issue
          Bruno

          kinow Bruno P. Kinoshita added a comment -

          Followed the example of execSelect, and used a ResultSet wrapping the QueryIterator. This way, we were able to reuse existing classes. The resulting JSON is as follows:

          {
            "head": {
              "vars": [ "author" , "title" ]
            } ,
            "results": {
              "bindings": [
                {
                  "author": { "type": "literal" , "value": "J.K. Rowling" } ,
                  "title": { "type": "literal" , "value": "Harry Potter and the Order of the Phoenix" }
                } ,
                {
                  "author": { "type": "literal" , "value": "J.K. Rowling" } ,
                  "title": { "type": "literal" , "value": "Harry Potter and the Philosopher's Stone" }
                } ,
                {
                  "author": { "type": "literal" , "value": "J.K. Rowling" } ,
                  "title": { "type": "literal" , "value": "Harry Potter and the Half-Blood Prince" }
                } ,
                {
                  "author": { "type": "literal" , "value": "J.K. Rowling" } ,
                  "title": { "type": "literal" , "value": "Harry Potter and the Deathly Hallows" }
                }
              ]
            }
          }
          

          Here are some GitHub links to the mentioned code:

          New result set - https://github.com/kinow/jena/blob/ae7c50f0f313b7dfa1cd8333fb0072f89e5972c5/jena-arq/src/main/java/com/hp/hpl/jena/sparql/engine/ResultSetJsonStream.java#L37

          The new method that gets called by SPARQL and builds a ResultSetJsonStream - https://github.com/kinow/jena/blob/ae7c50f0f313b7dfa1cd8333fb0072f89e5972c5/jena-arq/src/main/java/com/hp/hpl/jena/sparql/engine/QueryExecutionBase.java#L390

          I'm waiting until we have settled down on the API details to start writing tests and documentation.

          Thanks!

          andy.seaborne Andy Seaborne added a comment -

          That output can be obtained currently using "application/sparql-results+json". It's OK but I wouldn't describe it as idiomatic JSON. My mental use cases are slurping results up in a browser-based application, or reading from some programme that really does not know much about RDF and different types of RDF terms.

          For QueryExecution.execJsonItems, how about making the return type Iterator<JsonObject>? The iterator is one element per row e.g.:

          { 
              "author" : "J.K. Rowling" ,
              "title" : "Harry Potter and the Order of the Phoenix"
            }
          

          and that can be turned into output bytes by printing "[", each object (separated by ","), and a "]". In fact, there is a stream writer for JSON: JSWriter (it's pretty low level, e.g. startArray/finishArray, with each array element being startObject/finishObject).

          How does that sound?
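
The "[", objects separated by ",", then "]" idea can be sketched as follows. This is plain Java under stated assumptions, not Jena code: it does not call JSWriter or use JsonObject, and it assumes each element arrives as an already-serialised JSON object string. The class and method names are hypothetical.

```java
import java.io.PrintStream;
import java.util.Arrays;
import java.util.Iterator;

public class JsonArrayStreamSketch {
    /**
     * Writes "[", each already-serialised JSON object, and "]" to out,
     * with a comma between consecutive objects. Only one element is held
     * at a time, so output can start before the iterator is exhausted.
     */
    public static void writeArray(Iterator<String> objects, PrintStream out) {
        out.print("[ ");
        boolean first = true;
        while (objects.hasNext()) {
            if (!first) {
                out.print(" , "); // separator goes before every element except the first
            }
            out.print(objects.next());
            first = false;
        }
        out.print(" ]");
        out.flush();
    }

    public static void main(String[] args) {
        Iterator<String> rows = Arrays.asList(
            "{ \"title\" : \"Harry Potter and the Order of the Phoenix\" }",
            "{ \"title\" : \"Harry Potter and the Philosopher's Stone\" }").iterator();
        writeArray(rows, System.out);
    }
}
```

Tracking a "first element" flag is the usual way to keep the separator out of the leading and trailing positions without looking ahead in the iterator.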

          kinow Bruno P. Kinoshita added a comment -

          Hi Andy! Sorry for the delay. Sounds good to me. I should have followed your initial suggestion of starting by doing the streaming version first. I will take a look at JSWriter and try to come up with an update in the next few days.

          Thanks!

          kinow Bruno P. Kinoshita added a comment -

          Hi again,

          Here's the updated code https://github.com/kinow/jena/commit/7b3b10134f4201314d5f6c6103a595181e82f997

          I noticed that I have been updating several imports (Eclipse auto-removes the space before the ;). I'll try to fix it before preparing the code for a merge.

          Here's how I used the JSWriter https://github.com/kinow/jena/commit/7b3b10134f4201314d5f6c6103a595181e82f997#diff-c42503247148fd09663639cb9df2e641R693

          And here's the new output:

          [ { 
            "author" : "J.K. Rowling" ,
            "title" : "Harry Potter and the Order of the Phoenix"
          }
          { 
            "author" : "J.K. Rowling" ,
            "title" : "Harry Potter and the Philosopher's Stone"
          }
          { 
            "author" : "J.K. Rowling" ,
            "title" : "Harry Potter and the Half-Blood Prince"
          }
          { 
            "author" : "J.K. Rowling" ,
            "title" : "Harry Potter and the Deathly Hallows"
          }
           ]
          
          githubbot ASF GitHub Bot added a comment -

          GitHub user kinow opened a pull request:

          https://github.com/apache/jena/pull/114

          JENA-632: Generate JSON from SPARQL directly

          This pull request contains code for JENA-632(https://issues.apache.org/jira/browse/JENA-632). The original work is still in a branch in [my fork of Jena](https://github.com/apache/jena/compare/master...kinow:JENA-632). It has been updated after the work on Jena 3 (mainly package renaming). And the web layer has been implemented in fuseki 2, but not backported to fuseki 1.

          Besides reviewing the code, the following steps can be used to quickly test the code.

          • Start Fuseki (debug in Eclipse after checking out this branch, for example)
          • Load the books.ttl from fuseki1/Data directory
          • Query with something like

          ```
          PREFIX purl: <http://purl.org/dc/elements/1.1/>
          PREFIX w3: <http://www.w3.org/2001/vcard-rdf/3.0#>
          PREFIX : <http://example.org/book/>

          JSON { "author": ?author, "title": ?title }
          WHERE { ?book purl:creator ?author . ?book purl:title ?title . FILTER (?author = 'J.K. Rowling') }
          ```

          Which follows the syntax proposed in the issue in JIRA.

          I am still reviewing the code after porting to the new code base, but an extra pair of eyes reviewing it is always welcome! :grin:

          ps: the SPARQL editor may need some tweaking to support the new syntax

          ps2: tried to change the key name in the JSON query but it didn't work. Will try to update the PR if that's really a bug in the code in the next days

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/kinow/jena JENA-632-2

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/jena/pull/114.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #114


          commit 0049f2abcb757c9a190e0017d9369583f4eebf93
          Author: Bruno P. Kinoshita <brunodepaulak@yahoo.com.br>
          Date: 2015-12-27T10:51:24Z

          JENA-632: Generate JSON from SPARQL directly


          githubbot ASF GitHub Bot added a comment -

          Github user afs commented on a diff in the pull request:

          https://github.com/apache/jena/pull/114#discussion_r48532056

          --- Diff: jena-arq/Grammar/master.jj ---
          @@ -100,6 +100,38 @@ import org.apache.jena.sparql.core.Quad ;
          public class CLASS extends PARSERBASE
          {
          boolean allowAggregatesInExpressions = false ;
          +
          --- End diff --

          This looks like debug code - could it go elsewhere?

          githubbot ASF GitHub Bot added a comment -

          Github user afs commented on a diff in the pull request:

          https://github.com/apache/jena/pull/114#discussion_r48532508

          --- Diff: jena-arq/Grammar/master.jj ---
          @@ -327,6 +359,35 @@ void AskQuery() : {}
          SolutionModifier()
          }

          +void JsonQuery() : {}
          +{
          + JsonClause()
          + ( DatasetClause() )*
          + WhereClause()
          + SolutionModifier()
          +}
          +
          +void JsonClause() : { Object o ; String s ; }
          +{
          + <JSON> { getQuery().setQueryJsonType() ; }
          + <LBRACE>
          + s = String() < PNAME_NS >
          + (
          --- End diff --

          Shouldn't PNAME_NS be COLON? PNAME_NS matches `foo:` so "s = String() < PNAME_NS >" matches `"abcd" xyz:`

          It may be hard to parse JSON using the SPARQL tokens, in which case some JavaCC facility for switching how input is tokenized (for example, lexical states) may be needed.
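
The difference between the two token choices can be illustrated with regular expressions. This is only a sketch: the patterns below are simplified stand-ins written for this example, not the real SPARQL token definitions (the actual PNAME_NS allows a much wider character range), and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class TokenChoiceSketch {
    // Simplified stand-in for a quoted string token (no escape handling).
    static final String STRING = "\"[^\"]*\"";

    // String followed by something PNAME_NS-like: an optional prefix, then ':'.
    static final Pattern WITH_PNAME_NS =
        Pattern.compile(STRING + "\\s*(?:[A-Za-z][A-Za-z0-9_-]*)?:");

    // String followed by a bare ':'.
    static final Pattern WITH_COLON = Pattern.compile(STRING + "\\s*:");

    public static boolean acceptsWithPnameNs(String s) {
        return WITH_PNAME_NS.matcher(s).matches();
    }

    public static boolean acceptsWithColon(String s) {
        return WITH_COLON.matcher(s).matches();
    }

    public static void main(String[] args) {
        String bad = "\"abcd\" xyz:";
        System.out.println(acceptsWithPnameNs(bad)); // the prefix "xyz" is swallowed by the token
        System.out.println(acceptsWithColon(bad));   // a bare colon does not allow it
    }
}
```

Both patterns accept a well-formed key such as `"abcd" :`; only the PNAME_NS-like one also accepts the stray prefix, which is the over-matching described above.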

          githubbot ASF GitHub Bot added a comment -

          Github user afs commented on a diff in the pull request:

          https://github.com/apache/jena/pull/114#discussion_r48532742

          --- Diff: jena-arq/Grammar/master.jj ---
          @@ -2166,6 +2227,17 @@ String String() : { Token t ; String lex ; }
          }
          }

          +Number Number() : { Token t ; Number number ; }
          +{
          --- End diff --

          Number -> JSONNumber

          But would the existing NumericLiteral rule work here? Then parse the lexical form to a number.

          However, generally, working in nodes is probably going to be easier because bindings map variables to Nodes.
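
As an illustration of "parse the lexical form to a number", here is a small sketch in plain Java. It is an assumption about how such a conversion could look, not code from the pull request: the class name, the method name, and the choice of BigInteger, BigDecimal and Double as target types are all made up for this example.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class LexicalNumberSketch {
    /**
     * Converts a numeric lexical form into a java.lang.Number.
     * An exponent means a double, a decimal point means a decimal,
     * anything else is treated as an integer.
     * Throws NumberFormatException if the text is not numeric.
     */
    public static Number parse(String lex) {
        if (lex.indexOf('e') >= 0 || lex.indexOf('E') >= 0) {
            return Double.valueOf(lex);
        }
        if (lex.indexOf('.') >= 0) {
            return new BigDecimal(lex);
        }
        return new BigInteger(lex);
    }

    public static void main(String[] args) {
        System.out.println(parse("42").getClass().getSimpleName());
        System.out.println(parse("3.14").getClass().getSimpleName());
        System.out.println(parse("1e3").getClass().getSimpleName());
    }
}
```

Working with nodes instead, as suggested above, would sidestep this conversion in the parser and leave it to the point where the JSON value is written.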

          githubbot ASF GitHub Bot added a comment -

          Github user afs commented on a diff in the pull request:

          https://github.com/apache/jena/pull/114#discussion_r48532825

          --- Diff: jena-arq/Grammar/master.jj ---
          @@ -100,6 +100,38 @@ import org.apache.jena.sparql.core.Quad ;
          public class CLASS extends PARSERBASE
          --- End diff --

          Please update tokens.txt as well - it is used to generate HTML.

          githubbot ASF GitHub Bot added a comment -

          Github user afs commented on a diff in the pull request:

          https://github.com/apache/jena/pull/114#discussion_r48532880

          --- Diff: jena-arq/src/main/java/org/apache/jena/sparql/lang/sparql_11/SPARQLParser11Constants.java ---
          @@ -431,6 +433,7 @@
          "\"select\"",
          "\"distinct\"",
          "\"reduced\"",
          + "\"json\"",
          --- End diff --

          This is the new JSON construct sneaking into pure SPARQL 1.1.

          githubbot ASF GitHub Bot added a comment -

          Github user afs commented on a diff in the pull request:

          https://github.com/apache/jena/pull/114#discussion_r48532962

          --- Diff: jena-arq/Grammar/master.jj ---
          @@ -100,6 +100,38 @@ import org.apache.jena.sparql.core.Quad ;
          public class CLASS extends PARSERBASE
          {
          boolean allowAggregatesInExpressions = false ;
          +
          + public static void main(String args[]) {
          + while (true) {
          --- End diff --

          This looks like debug code - could it go elsewhere?

          githubbot ASF GitHub Bot added a comment -

          Github user afs commented on the pull request:

          https://github.com/apache/jena/pull/114#issuecomment-167763543

          Main comment : the grammar changes haven't been put inside `#ifdef ARQ` so they happen in strict SPARQL 1.1 as well as the extended ARQ language.

          Keeping the SPARQL 1.1 parser exactly as the SPARQL 1.1 spec is important. There should just be some noise changes and no more. (see for example the construct-quad changes).

          githubbot ASF GitHub Bot added a comment -

          Github user kinow commented on a diff in the pull request:

          https://github.com/apache/jena/pull/114#discussion_r48680907

          --- Diff: jena-arq/Grammar/master.jj ---
          @@ -100,6 +100,38 @@ import org.apache.jena.sparql.core.Quad ;
          public class CLASS extends PARSERBASE
          {
          boolean allowAggregatesInExpressions = false ;
          +
          + public static void main(String args[]) {
          + while (true) {
          --- End diff --

          Not sure where else it could go. I copied (shamelessly) from an example I found online while re-reading about JavaCC. With this main method, you can run the grammar in Eclipse, and the Eclipse Console will wait for a String+LFLF (two line breaks, IIRC).

          Then it will use the grammar to parse the string and will output the QueryUnit. I found it useful for reviewing the changes without running some extra class with a main method, or Fuseki.

          What do you think? I'm OK with removing it, or moving it somewhere else. Just don't know where else it could go

          githubbot ASF GitHub Bot added a comment -

          Github user kinow commented on a diff in the pull request:

          https://github.com/apache/jena/pull/114#discussion_r48680948

          --- Diff: jena-arq/Grammar/master.jj ---
          @@ -100,6 +100,38 @@ import org.apache.jena.sparql.core.Quad ;
          public class CLASS extends PARSERBASE
          --- End diff --

          Looking at the Notes file, there is a comment to run "jj2tokens sparql_11.jj > tokens.txt" to create an initial tokens.txt file, and then manually tidy it up.

          Should I re-run jj2tokens, or just manually add the missing entries?

          I'm not sure whether jj2tokens has been used in a while, and re-running it could cause strange behaviour in other grammars later.

          githubbot ASF GitHub Bot added a comment -

          Github user kinow commented on a diff in the pull request:

          https://github.com/apache/jena/pull/114#discussion_r48680972

          --- Diff: jena-arq/Grammar/master.jj ---
          @@ -327,6 +359,35 @@ void AskQuery() : {}
          SolutionModifier()
          }

          +void JsonQuery() : {}
          +{
          + JsonClause()
          + ( DatasetClause() )*
          + WhereClause()
          + SolutionModifier()
          +}
          +
          +void JsonClause() : { Object o ; String s ; }
          +{
          + <JSON> { getQuery().setQueryJsonType() ; }
          + <LBRACE>
          + s = String() < PNAME_NS >
          + (
          --- End diff --

          I will need some time to check in Eclipse why I used PNAME_NS. If I remember correctly, when I used COLON, JavaCC found it ambiguous with some other token or rule.

          githubbot ASF GitHub Bot added a comment -

          Github user kinow commented on a diff in the pull request:

          https://github.com/apache/jena/pull/114#discussion_r48680983

          — Diff: jena-arq/src/main/java/org/apache/jena/sparql/lang/sparql_11/SPARQLParser11Constants.java —
          @@ -431,6 +433,7 @@
          "\"select\"",
          "\"distinct\"",
          "\"reduced\"",
          + "\"json\"",
          — End diff –

          Oh, when updating the branch (it was written when Jena was 2.x) I simply applied the same changes to master.jj and sparql_11.jj, then went through the other classes I had patched and looked for similar classes that seemed to need updating as well.

          But maybe I added Should I revert this one line?

          githubbot ASF GitHub Bot added a comment -

          Github user afs commented on a diff in the pull request:

          https://github.com/apache/jena/pull/114#discussion_r48998941

          — Diff: jena-arq/Grammar/master.jj —
          @@ -100,6 +100,38 @@ import org.apache.jena.sparql.core.Quad ;
          public class CLASS extends PARSERBASE
          {
          boolean allowAggregatesInExpressions = false ;
          +
          + public static void main(String args[]) {
          + while (true) {
          — End diff –

          New `CLASS`es don't appear very often, so this could go in a Java source file for each language.

          In fact, it only needs to work for the ARQ language, which is a superset of SPARQL 1.1.

          Also - have you seen the command `arq.qparse`? It reads in a query and prints it out (and performs internal checks on `.equals`, `.hashCode`, that the output is the same as the input, and on the algebra generated).

          githubbot ASF GitHub Bot added a comment -

          Github user afs commented on a diff in the pull request:

          https://github.com/apache/jena/pull/114#discussion_r48999309

          — Diff: jena-arq/Grammar/master.jj —
          @@ -100,6 +100,38 @@ import org.apache.jena.sparql.core.Quad ;
          public class CLASS extends PARSERBASE
          — End diff –

          tokens.txt is used to produce HTML - hence the additional syntax to indicate inline vs a token rule for each definition.

          It is safer to hand edit for minor changes because otherwise you risk losing the additional edits already made.

          (If running jj2tokens, send to a temporary file and pick out the new bits and edit in manually)
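The "send to a temporary file and pick out the new bits" step can be partly automated. The following is a minimal sketch, assuming both files are plain text with one definition per line; the class name `NewTokenLines` and the file names in the usage note are hypothetical. It only lists candidate lines: entries that were tidied by hand will also differ from the raw jj2tokens output, so the result still needs a manual review before anything is copied into tokens.txt.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NewTokenLines {

    // Returns the non-blank lines of the fresh output whose trimmed text
    // does not occur anywhere in the current (hand-edited) file.
    static List<String> newLines(List<String> current, List<String> fresh) {
        Set<String> known = new HashSet<>();
        for (String line : current) {
            known.add(line.trim());
        }
        List<String> added = new ArrayList<>();
        for (String line : fresh) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !known.contains(trimmed)) {
                added.add(line);
            }
        }
        return added;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java NewTokenLines <current-file> <fresh-file>");
            return;
        }
        List<String> current = Files.readAllLines(Paths.get(args[0]));
        List<String> fresh = Files.readAllLines(Paths.get(args[1]));
        // Print candidates only; merging them stays a manual step.
        for (String line : newLines(current, fresh)) {
            System.out.println(line);
        }
    }
}
```

For example, after writing the jj2tokens output to a temporary file, `java NewTokenLines tokens.txt /tmp/tokens-new.txt` would print the lines to look at.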

          githubbot ASF GitHub Bot added a comment -

          Github user afs commented on a diff in the pull request:

          https://github.com/apache/jena/pull/114#discussion_r49002105

          — Diff: jena-arq/Grammar/master.jj —
          @@ -327,6 +359,35 @@ void AskQuery() : {}
          SolutionModifier()
          }

          +void JsonQuery() : {}
          +{
          +  JsonClause()
          +  ( DatasetClause() )*
          +  WhereClause()
          +  SolutionModifier()
          +}
          +
          +void JsonClause() : { Object o ; String s ; }
          +{
          +  <JSON> { getQuery().setQueryJsonType() ; }
          +  <LBRACE>
          +  s = String() < PNAME_NS >
          +  (
          — End diff –

          Yes - javacc will tokenize to PNAME_NS. There is no COLON token, and if there were, there would be other problems.

          I have a devious idea!

          Use the PNAME_NS and follow with a java fragment that does additional checking:
          ```
          t = <PNAME_NS>
          // Accept the token only if its text is exactly ":"
          { if ( ! t.image.equals(":") ) throwParseException("message", t.beginLine, t.beginColumn) ; }
          ```
          This then restricts the legal token and means you won't have to play complicated games with javacc to switch to a different token set (if that is even possible due to lookahead of characters).
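The same "accept the broad token, then check its image" idea can be tried outside JavaCC. Below is a minimal sketch in plain Java. The `Token` and `ParseError` classes are hypothetical stand-ins written for this illustration, not the JavaCC-generated `Token` or Jena's exception types; only the field names `image`, `beginLine` and `beginColumn` follow the fragment above.

```java
public class ColonCheck {

    // Hypothetical stand-in for a lexer token: its text plus its position.
    static final class Token {
        final String image;
        final int beginLine;
        final int beginColumn;

        Token(String image, int beginLine, int beginColumn) {
            this.image = image;
            this.beginLine = beginLine;
            this.beginColumn = beginColumn;
        }
    }

    // Hypothetical exception type that carries the error position.
    static final class ParseError extends RuntimeException {
        final int line;
        final int column;

        ParseError(String message, int line, int column) {
            super(message);
            this.line = line;
            this.column = column;
        }
    }

    // The tokenizer has already matched something of the PNAME_NS shape
    // (an optional prefix followed by ':'); here we narrow it to a bare ":".
    static void requireBareColon(Token t) {
        if (!":".equals(t.image)) {
            throw new ParseError("Expected ':' but found '" + t.image + "'",
                                 t.beginLine, t.beginColumn);
        }
    }

    public static void main(String[] args) {
        requireBareColon(new Token(":", 1, 15));        // accepted silently
        try {
            requireBareColon(new Token("ex:", 1, 15));  // rejected: has a prefix
        } catch (ParseError e) {
            System.out.println(e.getMessage() + " at line " + e.line + ", column " + e.column);
        }
    }
}
```

The design choice is to leave the tokenizer untouched and enforce the restriction in the grammar action, so no other rule that relies on PNAME_NS is affected.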

          githubbot ASF GitHub Bot added a comment -

          Github user afs commented on a diff in the pull request:

          https://github.com/apache/jena/pull/114#discussion_r49003225

          — Diff: jena-arq/src/main/java/org/apache/jena/sparql/lang/sparql_11/SPARQLParser11Constants.java —
          @@ -431,6 +433,7 @@
          "\"select\"",
          "\"distinct\"",
          "\"reduced\"",
          + "\"json\"",
          — End diff –

          This is a generated file, so it should not be edited by hand. Put the new token inside a `#ifdef ARQ ... #endif` in master.jj instead.
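To illustrate what the `#ifdef ARQ ... #endif` guard achieves, here is a toy line filter in plain Java. It is only a sketch of the idea, assuming master.jj is run through a C-style preprocessor with `ARQ` defined for the ARQ grammar and undefined for the SPARQL 1.1 grammar. It is not the tool the build uses, it understands only `#ifdef NAME` and `#endif`, and the class name and sample lines are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Set;

public class IfdefSketch {

    // Keeps a line only when every enclosing #ifdef names a defined symbol.
    static List<String> preprocess(List<String> lines, Set<String> defined) {
        List<String> out = new ArrayList<>();
        Deque<Boolean> active = new ArrayDeque<>();
        active.push(true);                      // top level is always active
        for (String line : lines) {
            String t = line.trim();
            if (t.startsWith("#ifdef ")) {
                String name = t.substring("#ifdef ".length()).trim();
                active.push(active.peek() && defined.contains(name));
            } else if (t.equals("#endif")) {
                if (active.size() == 1) {
                    throw new IllegalArgumentException("#endif without #ifdef");
                }
                active.pop();
            } else if (active.peek()) {
                out.add(line);
            }
        }
        if (active.size() != 1) {
            throw new IllegalArgumentException("missing #endif");
        }
        return out;
    }

    public static void main(String[] args) {
        // Illustrative lines only, not copied from master.jj.
        List<String> master = Arrays.asList(
            "| < SELECT: \"select\" >",
            "#ifdef ARQ",
            "| < JSON: \"json\" >",
            "#endif",
            "| < ASK: \"ask\" >");

        System.out.println("With ARQ defined:");
        preprocess(master, Collections.singleton("ARQ")).forEach(System.out::println);

        System.out.println("Without ARQ:");
        preprocess(master, Collections.<String>emptySet()).forEach(System.out::println);
    }
}
```

With the guard in place, the JSON keyword would reach only the ARQ parser, and the generated SPARQL 1.1 constants file would stay unchanged.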

          githubbot ASF GitHub Bot added a comment -

          Github user ajs6f commented on the issue:

          https://github.com/apache/jena/pull/114

          @kinow is this still live?

          githubbot ASF GitHub Bot added a comment -

          Github user kinow commented on the issue:

          https://github.com/apache/jena/pull/114

          @ajs6f sorry for the delay. I checked out the branch and just need to refresh my memory. I will need a few more days before pinging you guys to check out this PR again. Next month I will finally have some time to play with Jena again; I plan to complete this PR then, and to check whether there are any Fuseki or other easy tasks in JIRA.

          Cheers
          Bruno

          githubbot ASF GitHub Bot added a comment -

          Github user ajs6f commented on the issue:

          https://github.com/apache/jena/pull/114

          No prob-- good to see you back, @kinow !


            People

            • Assignee:
              kinow Bruno P. Kinoshita
              Reporter:
              andy.seaborne Andy Seaborne
            • Votes:
              4
              Watchers:
              9

                Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 0.5h