Apache Jena / JENA-236

Allow Service XML Parsing error to cancel the query on a per query basis

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: ARQ 2.9.0
    • Fix Version/s: ARQ 2.9.1
    • Component/s: ARQ
    • Labels:
    • Environment: All

      Description

      Currently an XML parsing error may occur on a hasNext() call on a SPARQL service query. The result is that the entire query (not just the service call) fails.

      The goal of this improvement is to capture the XML parsing error and return false from hasNext(), so that data errors are silently ignored. This should be done on a per endpoint basis, perhaps as an onParseErrorCancel flag.

      I think there is an interplay of several flags in this request:

      1) SILENT service parameter. If silent is true should this also be true by default? (I think not, but perhaps a system setting to make that the case)
      2) cancelAllowDrain. If the error occurs should the cancel flag be raised? (I think not, but again perhaps a per service call flag to enable this)
      3) JENA-93 discusses changing cancelAllowDrain to be a per endpoint setting. If that is the case it may apply here as well.
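The behavior requested above can be sketched with a small iterator wrapper. This is only an illustration, not ARQ code: `StreamParseException`, `LenientIterator`, `TruncatedRows` and the `finishOnError` flag are hypothetical names standing in for the parser's exception type, the service iterator and the proposed per-endpoint setting.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the error a streaming XML result parser would raise.
class StreamParseException extends RuntimeException {
    StreamParseException(String message) { super(message); }
}

// Wraps a row iterator. With finishOnError set, a parse error ends iteration
// quietly (hasNext() returns false); otherwise the error propagates as before.
class LenientIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;
    private final boolean finishOnError; // hypothetical per-endpoint flag
    private boolean finished = false;

    LenientIterator(Iterator<T> delegate, boolean finishOnError) {
        this.delegate = delegate;
        this.finishOnError = finishOnError;
    }

    @Override
    public boolean hasNext() {
        if (finished) return false;
        try {
            if (delegate.hasNext()) return true;
            finished = true;
            return false;
        } catch (StreamParseException e) {
            if (!finishOnError) throw e; // default: the whole query still fails
            finished = true;             // opt-in: treat the error as end of results
            return false;
        }
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return delegate.next();
    }
}

// Simulates a truncated result stream: yields its rows, then fails in hasNext().
class TruncatedRows implements Iterator<String> {
    private final Iterator<String> rows;

    TruncatedRows(List<String> rows) { this.rows = rows.iterator(); }

    @Override
    public boolean hasNext() {
        if (rows.hasNext()) return true;
        throw new StreamParseException("malformed XML in result stream");
    }

    @Override
    public String next() { return rows.next(); }
}
```

Wrapping a `TruncatedRows` in `new LenientIterator<>(rows, true)` returns the rows read before the failure and then stops. With `false` the exception escapes from `hasNext()`, which corresponds to the behavior reported in this issue.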

      1. JENA-236-1.txt
        2 kB
        Claude Warren

        Activity

        Claude Warren created issue -
        Claude Warren added a comment -

        Patch to use a context symbol (finishOnXmlStreamError) to cause XML Stream errors to signal hasNext() to return false rather than aborting the entire query.

        This is not per query as noted in the improvement title, nor does it take into account Silent, cancelAllowDrain or JENA-93 changes.

        Claude Warren made changes -
        Field Original Value New Value
        Attachment JENA-236-1.txt [ 12525413 ]
        Claude Warren added a comment -

        After further research I don't think the Silent and cancelAllowDrain settings apply to this defect.

        Rob Vesse made changes -
        Assignee Rob Vesse [ rvesse ]
        Rob Vesse added a comment -

        Claude

        If I understand correctly the problem you are encountering is that you make a SERVICE call which succeeds but partway through fails due to malformed XML (cough DBPedia) and this failure is a parser error rather than a query error?

        The patch you have proposed does not really seem appropriate because it introduces a coupling between the SPARQL XML parser and the query engine when none is really necessary. A better approach is to wrap the iterator QueryIterService with another iterator which can catch parser errors and turn them into query errors instead. Possibly with some configurable non-default parameter like you have proposed to just silently continue on regardless of the parsing error.

        I will take a look at taking this approach
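The wrapping approach described in this comment might look roughly like the following. It is a sketch under assumptions and does not use the real ARQ classes: `ResultParseException`, `QueryFailedException` and `ParseErrorTranslatingIterator` are hypothetical names for the parser error, the query-level error and the wrapper around the service iterator.

```java
import java.util.Iterator;

// Hypothetical parser-level error raised while reading the XML result stream.
class ResultParseException extends RuntimeException {
    ResultParseException(String message) { super(message); }
}

// Hypothetical query-level error that the query engine already knows how to handle.
class QueryFailedException extends RuntimeException {
    QueryFailedException(String message, Throwable cause) { super(message, cause); }
}

// Decorator that keeps the parser and the query engine decoupled: the parser
// throws its own exception type, and this wrapper translates it at the boundary.
class ParseErrorTranslatingIterator<T> implements Iterator<T> {
    private final Iterator<T> inner;

    ParseErrorTranslatingIterator(Iterator<T> inner) { this.inner = inner; }

    @Override
    public boolean hasNext() {
        try {
            return inner.hasNext();
        } catch (ResultParseException e) {
            throw new QueryFailedException("SERVICE result could not be parsed", e);
        }
    }

    @Override
    public T next() {
        try {
            return inner.next();
        } catch (ResultParseException e) {
            throw new QueryFailedException("SERVICE result could not be parsed", e);
        }
    }
}
```

Because the translation lives in the wrapper, the parser needs no knowledge of query execution, and the original exception stays available as the cause for diagnostics.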

        Rob Vesse added a comment -

        Having started looking at this, I'm not sure that my initial assessment is the appropriate approach.

        We actually materialize results immediately, so it looks like all we need is some error handling in Service.exec() to handle the case where attempting the materialization fails.

        Rob Vesse added a comment -

        Actually this error handling is already in place so I'm failing to see what exactly the issue is here.

        If SILENT is on for the SERVICE clause then we will ignore the error and move on, if it is not then we throw an error and query processing stops. This is completely in line with the SPARQL specification so what exactly is the behavior you were hoping to achieve?

        Adding SILENT to your queries causes a continue on error which AFAICT is what you want or are you wanting to preserve the results retrieved up to the point of the error and make that a configurable non-standard behavior?
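The SILENT handling described in this comment can be illustrated with a minimal sketch of eager materialization. The class and method names (`ServiceMaterializer`, `materialize`) are hypothetical and this is not the ARQ implementation. The sketch assumes that partial rows are discarded on failure and that a silent failure yields a single solution with no bindings, as the SPARQL 1.1 Federated Query specification describes for `SERVICE SILENT`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class ServiceMaterializer {
    // Drains the remote rows into memory up front, so any stream or parse
    // failure surfaces here rather than later during query evaluation.
    static List<Map<String, String>> materialize(Iterator<Map<String, String>> rows,
                                                 boolean silent) {
        List<Map<String, String>> collected = new ArrayList<>();
        try {
            while (rows.hasNext()) {
                collected.add(rows.next());
            }
            return collected;
        } catch (RuntimeException e) {
            if (!silent) {
                throw e; // without SILENT the error stops query processing
            }
            // With SILENT: drop the partial rows and continue as if the service
            // had returned one solution with no bindings.
            return Collections.singletonList(Collections.<String, String>emptyMap());
        }
    }
}
```

In this sketch the rows read before the error are not preserved. Keeping them would be the configurable, non-standard behavior that the comment asks about.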

        Andy Seaborne added a comment -

        Maybe other work in the area corrected the problem; there was a report of Fuseki locking up on loopback, and it was because the service code was not materializing results, leaving results in the server (local case - limited network buffering).

        If so, apologies, I should have noticed and closed the JIRA. Close now?

        Rob Vesse added a comment -

        Appears to have been fixed by some unrelated fixes. This can be reopened if Claude can clarify what behavior he was after and that it wasn't already covered by the existing fixes.

        Rob Vesse made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Fix Version/s ARQ 2.9.1 [ 12319291 ]
        Resolution Fixed [ 1 ]
        Andy Seaborne made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Rob Vesse
            Reporter:
            Claude Warren
          • Votes:
            0
            Watchers:
            3
