Apache Jena
JENA-199

BindingBase can hit a null pointer exception on certain queries against a TDB dataset

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not a Problem
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: TDB

      Description

      This is a strange bug which I have been unable to reduce to a more minimal example than the files I will attach, so I apologize for that.

      Essentially the problem manifests as follows: when using a TDB dataset with Fuseki, some queries will return blank output if the user requests Text, CSV or TSV. When using XML/JSON the output is fine.

      The test data used is SP2B 10k, two of the SP2B queries that exhibit this issue are as follows:

      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      PREFIX foaf: <http://xmlns.com/foaf/0.1/>
      PREFIX bench: <http://localhost/vocabulary/bench/>
      PREFIX dc: <http://purl.org/dc/elements/1.1/>

      SELECT DISTINCT ?person ?name
      WHERE

      { ?article rdf:type bench:Article .
        ?article dc:creator ?person .
        ?inproc rdf:type bench:Inproceedings .
        ?inproc dc:creator ?person .
        ?person foaf:name ?name }

      And:

      PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      PREFIX foaf: <http://xmlns.com/foaf/0.1/>
      PREFIX dc: <http://purl.org/dc/elements/1.1/>

      SELECT DISTINCT ?name
      WHERE {
      ?erdoes rdf:type foaf:Person .
      ?erdoes foaf:name "Paul Erdoes"^^xsd:string .

      { ?document dc:creator ?erdoes .
        ?document dc:creator ?author .
        ?document2 dc:creator ?author .
        ?document2 dc:creator ?author2 .
        ?author2 foaf:name ?name
        FILTER (?author!=?erdoes && ?document2!=?document && ?author2!=?erdoes && ?author2!=?author) }

      UNION

      { ?document dc:creator ?erdoes .
        ?document dc:creator ?author .
        ?author foaf:name ?name
        FILTER (?author!=?erdoes) }

      }

      I will attach these as files as well for convenience.

      If you run Fuseki with a memory dataset using the --mem option, load this data and run the same queries, the Text, CSV and TSV output works fine. This implies that there is something in the TDB code related to how it returns results or iterators which somehow causes the Text, CSV and TSV formatters either to error or to believe that they have no results to format.
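
      For reference, the same kind of comparison can be sketched outside Fuseki with the plain Jena APIs. This is only a sketch under assumptions: the file names ("sp2b10k.nt", "8.txt") refer to the attachments, and the Text serialization stands in for the Text/CSV/TSV family that goes blank via Fuseki.

      import com.hp.hpl.jena.query.Dataset;
      import com.hp.hpl.jena.query.DatasetFactory;
      import com.hp.hpl.jena.query.Query;
      import com.hp.hpl.jena.query.QueryExecution;
      import com.hp.hpl.jena.query.QueryExecutionFactory;
      import com.hp.hpl.jena.query.QueryFactory;
      import com.hp.hpl.jena.query.ResultSetFormatter;
      import com.hp.hpl.jena.util.FileManager;

      public class MemDatasetCheck {
          public static void main(String[] args) {
              // In-memory dataset, equivalent in spirit to Fuseki's --mem option
              Dataset dataset = DatasetFactory.createMem();
              FileManager.get().readModel(dataset.getDefaultModel(), "sp2b10k.nt");

              // "8.txt" is one of the attached SP2B queries
              Query query = QueryFactory.read("8.txt");
              QueryExecution qe = QueryExecutionFactory.create(query, dataset);
              try {
                  // Text output: like the CSV/TSV writers it only becomes useful
                  // after a complete pass over the ResultSet
                  ResultSetFormatter.out(System.out, qe.execSelect());
              } finally {
                  qe.close();
              }
          }
      }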

      I'm completely unfamiliar with the TDB codebase so I haven't attempted to discover the cause of the issue, though I may poke around anyway.

      1. TDBEmptyOutput.java
        3 kB
        Rob Vesse
      2. sp2b10k.nt
        1.55 MB
        Rob Vesse
      3. 8.txt
        0.8 kB
        Rob Vesse
      4. 5b.txt
        0.4 kB
        Rob Vesse

        Activity

        Rob Vesse added a comment -

        SP2B 10k dataset and queries 5b and 8 which exhibit this problem

        Rob Vesse added a comment -

        Ok, so after some playing I have reduced 5b down to a simpler query that exhibits this behavior:

        PREFIX foaf: <http://xmlns.com/foaf/0.1/>
        PREFIX dc: <http://purl.org/dc/elements/1.1/>

        SELECT DISTINCT ?person ?name
        WHERE

        { ?article dc:creator ?person . ?person foaf:name ?name . }

        Interestingly, as soon as DISTINCT is removed, the Text, CSV and TSV output works fine.

        So I'm still uncertain whether this is a TDB or ARQ issue. The problem is that this reduced query works fine with an ARQ in-memory dataset, so I'm still leaning towards it being caused by some interaction of TDB and the DISTINCT operator.

        Yet when I look at OpExecutorTDB I see:

        @Override
        protected QueryIterator execute(OpDistinct opDistinct, QueryIterator input)

        { return super.execute(opDistinct, input) ; }

        Which leaves me none the wiser. Following the code I end up in QueryIterDistinct, but I assume an ARQ in-memory dataset would also use QueryIterDistinct, so why should the query not return properly for some formats on TDB and yet return fine in the same formats with an ARQ in-memory dataset?
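
        For what it's worth, the DISTINCT operator is visible if the reduced query is compiled to SPARQL algebra. A small sketch using ARQ's Algebra.compile (it prints something like (distinct (project (?person ?name) (bgp ...)))):

        import com.hp.hpl.jena.query.Query;
        import com.hp.hpl.jena.query.QueryFactory;
        import com.hp.hpl.jena.sparql.algebra.Algebra;
        import com.hp.hpl.jena.sparql.algebra.Op;

        public class PrintAlgebra {
            public static void main(String[] args) {
                Query query = QueryFactory.create(
                    "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n" +
                    "PREFIX dc: <http://purl.org/dc/elements/1.1/>\n" +
                    "SELECT DISTINCT ?person ?name\n" +
                    "WHERE { ?article dc:creator ?person . ?person foaf:name ?name . }");
                // Compile to the algebra that the OpExecutor evaluates;
                // removing DISTINCT from the query removes the (distinct ...) wrapper
                Op op = Algebra.compile(query);
                System.out.println(op);
            }
        }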

        Andy Seaborne added a comment -

        I can't think of anything TDB specific (but it's been a while since I looked at that code).

        There is a lot of change going on in TDB, and hence Fuseki; exactly which version is this? Some of the bug fixes around December/(very early January?) might account for it, if "blank output" means there is a slot in the TSV but it's just empty.

        Rob Vesse added a comment -

        This is the latest Fuseki 0.2.1 snapshot; --version output from Fuseki:

        Jena: VERSION: 2.7.0-incubating
        Jena: BUILD_DATE: 2011-12-14T14:54:09+0000
        ARQ: VERSION: 2.9.0-incubating
        ARQ: BUILD_DATE: 2011-12-14T15:04:27+0000
        TDB: VERSION: 0.9.0-incubating-SNAPSHOT
        TDB: BUILD_DATE: 2012-01-12T01:15:34+0000
        Fuseki: VERSION: 0.2.1-incubating-SNAPSHOT
        Fuseki: BUILD_DATE: 2012-01-12T20:02:39+0000

        When I say the output is blank, I mean it is totally blank: for all of the TSV, CSV and Text outputs it fails to even show a header with the variable names, which is what I'd expect to see if there were just empty rows.

        There are answers to the queries because the XML and JSON outputs show valid output with multiple non-empty result rows.

        Rob Vesse added a comment -

        Edited title to reflect the actual cause of the issue

        Rob Vesse added a comment - edited

        So I got into this a lot more and was able to reproduce the issue in code and isolate it from Fuseki.

        It appears that something in this data causes TDB to return a null for one of the variables, which is bizarre because there are no optional variables and, AFAIK, it shouldn't be possible to set a null on a Binding.

        The stack trace from the code I will shortly attach is as follows:

        null
        java.lang.NullPointerException
        at com.hp.hpl.jena.sparql.engine.binding.BindingBase.hashCode(BindingBase.java:204)
        at com.hp.hpl.jena.sparql.engine.binding.BindingBase.hashCode(BindingBase.java:185)
        at java.util.HashMap.put(HashMap.java:372)
        at java.util.HashSet.add(HashSet.java:200)
        at org.openjena.atlas.data.SortedDataBag.add(SortedDataBag.java:109)
        at org.openjena.atlas.data.DistinctDataNet.netAdd(DistinctDataNet.java:58)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterDistinct.fill(QueryIterDistinct.java:87)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIterDistinct.moveToNextBinding(QueryIterDistinct.java:118)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.moveToNextBinding(QueryIteratorWrapper.java:43)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.moveToNextBinding(QueryIteratorWrapper.java:43)
        at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
        at com.hp.hpl.jena.sparql.engine.ResultSetStream.nextBinding(ResultSetStream.java:84)
        at com.hp.hpl.jena.sparql.engine.ResultSetStream.nextSolution(ResultSetStream.java:102)
        at com.hp.hpl.jena.sparql.engine.ResultSetStream.next(ResultSetStream.java:111)
        at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:43)
        at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
        at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:481)
        at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:459)
        at bugs.tdb.TDBEmptyOutput.test(TDBEmptyOutput.java:44)
        at bugs.tdb.TDBEmptyOutput.main(TDBEmptyOutput.java:29)

        So the lack of Text, CSV and TSV output appears to be because it hits this error partway through, and all those writers only produce output on a single flush at the end, or (in the case of the Text output) after a complete iteration over the ResultSet, or once they reach whatever Jetty's flush threshold is. Hence the lack of output for these formats, yet some output in the XML/JSON cases.

        What I failed to notice previously was that the XML/JSON output is also incomplete (though strangely it is still valid): it doesn't include as many results as it should. Something in the Fuseki layer appears to catch and clean up the incomplete output, whereas when evaluated via the code you get larger, but still incomplete, output.
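
        To illustrate the failure mode in isolation (this is a toy sketch, not the actual BindingBase code): any hash computed over (variable, node) pairs throws a NullPointerException as soon as a variable maps to a null node, which is consistent with the trace above.

        import java.util.LinkedHashMap;
        import java.util.Map;

        import com.hp.hpl.jena.graph.Node;
        import com.hp.hpl.jena.sparql.core.Var;

        public class NullNodeHashDemo {
            // Toy stand-in for hashing a binding's (variable, node) pairs
            static int hashBinding(Map<Var, Node> binding) {
                int hash = 0xC0;
                for (Map.Entry<Var, Node> e : binding.entrySet()) {
                    hash ^= e.getKey().hashCode();
                    hash ^= e.getValue().hashCode();   // NPE here if the node is null
                }
                return hash;
            }

            public static void main(String[] args) {
                Map<Var, Node> badBinding = new LinkedHashMap<Var, Node>();
                badBinding.put(Var.alloc("person"), Node.createURI("http://example.org/alice"));
                badBinding.put(Var.alloc("name"), null);   // the kind of malformed binding suspected here
                System.out.println(hashBinding(badBinding)); // throws NullPointerException
            }
        }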

        Rob Vesse added a comment -

        Attached code which demonstrates the issue; it allows you to run with the data either as a TDB dataset or as an ARQ in-memory dataset.

        With the TDB dataset I see an NPE in BindingBase.hashCode() whereas with ARQ I do not. The implication is that the QueryIterator TDB returns can yield malformed Binding instances which have/return nulls for some variables, OR whose variable list is incorrect so that BindingBase tries to access variables that don't exist in the Binding.
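
        For reference, a minimal repro along the lines of the attached TDBEmptyOutput.java might look like the following (a sketch, not the actual attachment; the TDB directory name "DB" and the file paths are assumptions):

        import com.hp.hpl.jena.query.Dataset;
        import com.hp.hpl.jena.query.DatasetFactory;
        import com.hp.hpl.jena.query.Query;
        import com.hp.hpl.jena.query.QueryExecution;
        import com.hp.hpl.jena.query.QueryExecutionFactory;
        import com.hp.hpl.jena.query.QueryFactory;
        import com.hp.hpl.jena.query.ResultSetFormatter;
        import com.hp.hpl.jena.tdb.TDBFactory;
        import com.hp.hpl.jena.util.FileManager;

        public class TDBEmptyOutputSketch {
            public static void main(String[] args) {
                boolean useTDB = args.length > 0 && args[0].equals("tdb");

                // Either open an existing TDB database or build an in-memory dataset
                Dataset dataset;
                if (useTDB) {
                    dataset = TDBFactory.createDataset("DB");
                } else {
                    dataset = DatasetFactory.createMem();
                    FileManager.get().readModel(dataset.getDefaultModel(), "sp2b10k.nt");
                }

                Query query = QueryFactory.create(
                    "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n" +
                    "PREFIX dc: <http://purl.org/dc/elements/1.1/>\n" +
                    "SELECT DISTINCT ?person ?name\n" +
                    "WHERE { ?article dc:creator ?person . ?person foaf:name ?name . }");

                QueryExecution qe = QueryExecutionFactory.create(query, dataset);
                try {
                    // With the corrupt TDB database this is where the NPE surfaces
                    ResultSetFormatter.outputAsXML(System.out, qe.execSelect());
                } finally {
                    qe.close();
                }
            }
        }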

        Andy Seaborne added a comment -

        Using the current development TDB with ARQ from Jena 2.7.0, I can run the code either way round. I get the same number of lines of output and no exceptions.

        Also, I ran it with ARQ with the proposed patch for JENA-200 / TSV reading.

        Looking at the code for BindingBase.hashCode, node is null. This is a symptom of old bugs in TDB. I thought they would have been fixed by the build of 12-Jan, but they were DB-corrupting, so if the data is older then it might have already been broken.

        Exact code I'm running:
        https://svn.apache.org/repos/asf/incubator/jena/Scratch/AFS/Jena-Dev/trunk/src/dev/Jena199_BadBindingTDB.java

        See script "dwim" for how I'm running the code.

        Rob - if you have a moment, could you try with latest TDB, and a cleanly built database? Thanks.
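
        For concreteness, "a cleanly built database" here just means loading the original data into a fresh TDB location with the current snapshot rather than reusing the old directory. A sketch (the directory name "DB-clean" is made up):

        import com.hp.hpl.jena.query.Dataset;
        import com.hp.hpl.jena.tdb.TDB;
        import com.hp.hpl.jena.tdb.TDBFactory;
        import com.hp.hpl.jena.util.FileManager;

        public class RebuildCleanDatabase {
            public static void main(String[] args) {
                // Must point at a new, empty directory, not the old (possibly corrupt) one
                Dataset dataset = TDBFactory.createDataset("DB-clean");
                FileManager.get().readModel(dataset.getDefaultModel(), "sp2b10k.nt");
                TDB.sync(dataset);   // flush to disk before exiting
                dataset.close();
            }
        }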

        Rob Vesse added a comment -

        Yes, I believe I did copy an existing dataset created with Fuseki 0.2.0 with TDB 0.8.0, so DB corruption may be the root cause of this.

        Will try and see if I can reproduce on the latest snapshots at some point in the next day or so.

        Andy Seaborne added a comment -

        Is it OK to close this as "can't reproduce"? It can be reopened if new evidence comes to light.

        Rob Vesse added a comment -

        Sorry for taking so long to reply on this. With an updated Fuseki/TDB snapshot and a cleanly built database I can no longer reproduce this, so I'll close as Not a Bug or similar.

        Rob Vesse added a comment -

        Bug was due to using an old corrupt database created with an old Fuseki/TDB snapshot; using an up-to-date Fuseki/TDB snapshot with a cleanly built database does not exhibit the problem.


          People

          • Assignee:
            Unassigned
          • Reporter:
            Rob Vesse