Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: Jena 2.11.0
    • Component/s: ARQ, TDB
    • Labels:
      None

      Description

       This improvement and the proposed patch were contributed by Simon Helsen on behalf of IBM.

       ARQ query execution currently has no satisfactory way to cancel a running query safely. Moreover, cancel (unlike a hard abort) is especially useful if it can provide partial result sets (i.e. all the results it managed to compute up to the point the cancellation was requested). Although the exact cancellation behavior depends on the capabilities of the underlying triple store, the proposed patch relies only on the iterators used by ARQ.

      Here is a more detailed explanation of the proposed changes:

       1) The cancel() method on QueryIterator initiates a cancellation request (the first boolean flag). By analogy with closeIterator(), it propagates through all chained iterators, so the entire calculation is aware that a cancellation has been requested.
       2) To ensure thread-safe semantics, the cancel request only becomes a real cancel once nextBinding() has been called. That call sets the second boolean flag, which is consulted by hasNext(). This two-phase approach is critical because cancel() can be called at any time during a query execution by an external thread. Since hasNext() has to return the same value until next() is called, this is the only way to guarantee semantic safety when cancel() is invoked (let me re-phrase: it is the only way I was able to make it actually work).
       3) cancel() does not close anything; it allows execution to finish normally, and the client remains responsible for calling close(), just as with a regular execution. Note that the client has to call cancel() explicitly (typically from another thread) and has to assume that the returned result set may be incomplete once this method has been called (it is undetermined whether the result actually is incomplete).
       4) To deal with ORDER BY and groups, I had to make two more changes. First, I made QueryIterSort and QueryIterGroup slightly lazier. Currently, the full result set is calculated during plan calculation; with my adjustments, it is computed on the first call to any of the Iterator methods (e.g. hasNext()). This change does not, AFAIK, affect the semantics. Second, because the desired behavior when cancelling a sort or group query is to have everything sorted/grouped even if the total result set is incomplete, I added an exception which reverses the cancellation request of the encompassing iterator (see, for example, cancel() in QueryIterSort). This makes sure that the entire subset of found and sorted elements is returned, not just the first element. It also implies, in the sort case, that a cancelled query will first sort the partially complete result set before returning to the client.
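The two-phase scheme in points 1) and 2) can be sketched in plain Java. This is an illustrative sketch only; the class and field names below are hypothetical stand-ins, not ARQ's actual QueryIterator hierarchy.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Illustrative sketch of the two-phase cancellation described above.
// Names are hypothetical; in ARQ this logic would live in the
// QueryIterator implementations.
class CancellableIterator<T> implements Iterator<T> {
    private final Iterator<T> underlying;
    private volatile boolean requestingCancel = false; // phase 1: set by any thread
    private boolean cancelled = false;                 // phase 2: set on the query thread
    private T slot;                                    // element promised by hasNext()

    CancellableIterator(Iterator<T> underlying) { this.underlying = underlying; }

    // May be called at any time, from any thread.
    public void cancel() { requestingCancel = true; }

    @Override public boolean hasNext() {
        if (cancelled) return false;
        if (slot == null && underlying.hasNext())
            slot = underlying.next();
        return slot != null; // stable until next() is called
    }

    @Override public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        T r = slot;
        slot = null;
        // The cancel request only takes effect here, after the element
        // promised by hasNext() has been delivered -- preserving the
        // hasNext()/next() contract.
        if (requestingCancel) cancelled = true;
        return r;
    }
}
```

The point of the sketch: even if cancel() arrives between hasNext() and next(), the element already promised by hasNext() is still delivered, and the iterator only reports exhaustion afterwards.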

       The attached patch is based on ARQ 2.8.5 (and a few classes in TDB 0.8.7 -> possibly the other triple store implementations need adjustment as well).

      1. jena.patch
        53 kB
        Simon Helsen
      2. jenaAddition.patch
        2 kB
        Simon Helsen
      3. JENA-29_ARQ_r8489.patch
        36 kB
        Paolo Castagna
      4. JENA-29_TDB_r8489.patch
        1 kB
        Paolo Castagna
      5. JENA-29_tests_ARQ_r8489.patch
        6 kB
        Paolo Castagna
      6. queryIterRepeatApply.patch
        2 kB
        Simon Helsen
      7. cancelFix3.patch
        2 kB
        Simon Helsen

        Issue Links

        There are no Sub-Tasks for this issue.

          Activity

          castagna Paolo Castagna added a comment -

          > The requested improvement and proposed patch is made by Simon Helsen on behalf of IBM

          First of all, thanks for opening this issue and submitting a patch.

           We (@Talis) also have the need to cancel a query, and we were doing it in a slightly different (and probably not ideal) way. Your approach seems better to me.

          I'd love to see this (or a similar functionality) added to ARQ & TDB and I suspect everybody in the world with a public SPARQL endpoint is eager to use it.

          > the attached patch is based on ARQ 2.8.5 (and a few classes in TDB 0.8.7 -> possibly the other triple store implementations need adjustement as well)

           I had problems applying your patch cleanly to ARQ trunk.

          I needed to manually edit these three files:

          • QueryIterBlockTriplesQH.java
          • QueryIterGroup.java
          • QueryExecutionBase.java

          Not a big deal.

          I cannot possibly comment on all the code changes since I've just looked at it and I have not done any serious testing yet.

          However, ARQ and TDB tests all pass, which is good news. Ideally, we would need to add some unit test for the new functionality.

          I attach patch files which should apply cleanly to ARQ trunk (r8431) and TDB trunk (r8431).

          shelsen Simon Helsen added a comment -

          Hi Paolo,

          thanks for picking this up. Yes, the patch was made for an earlier version, so I am not surprised you had to do a few manual things. I agree there need to be specific cancellation tests. I have a few in our own internal framework, but they are not transportable.

          The tests could take the following form:

           1) First create a setup such that a given query takes at least a certain number of seconds (say X). This can be done in a loop, i.e. by creating more resources until the threshold is reached.
           2) Then create a thread which runs the query again and is now expected to last about X seconds. The main thread simply waits Y seconds, where Y is significantly less than X (possibly even 0), and then invokes cancel().
           3) The query thread joins with the test thread, which then inspects the result to see if it produced a partial result. The main thread should not have to wait very long once cancel() is invoked (this, however, cannot be asserted reliably in an automated way).
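In outline, the three steps might look like this in plain Java. The loop below is just a stand-in for a slow query; a real test would drive ARQ's QueryExecution instead, and the counts and sleep durations are arbitrary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Plain-Java outline of the three test steps above. The loop stands in
// for a query that takes roughly X seconds to complete.
class CancelTestSketch {
    static List<Integer> run(long waitMillis) throws InterruptedException {
        AtomicBoolean cancelRequested = new AtomicBoolean(false);
        List<Integer> results = new ArrayList<>();

        // Step 2: the "query" thread, expected to run for a while.
        Thread query = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                if (cancelRequested.get()) break;   // honour cancel()
                results.add(i);
                try { Thread.sleep(10); } catch (InterruptedException e) { break; }
            }
        });
        query.start();

        Thread.sleep(waitMillis);     // the main thread waits Y << X seconds...
        cancelRequested.set(true);    // ...then invokes cancel()

        query.join();                 // Step 3: join, then inspect the result
        return results;               // partial: some results, but not all 1000
    }
}
```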

           The above scenario does not work equally well for all queries, of course. Ideally you have a query which produces enough results that you can observe a partial result (cancel() does not imply a partial result, just a cancel()) and which takes a relatively long time on your typical test machine (close to X after the first data creation).

          Useful things to test specifically:
          1) different kind of queries like select, describe, etc.
          2) sort (this behaves rather differently)
          3) distinct

          Simon

          castagna Paolo Castagna added a comment -

           I wonder if we can do something simpler: a fast and deterministic approach. I have had bad experiences with timeouts in unit tests.

          Something along these lines:

              @Test public void test_API2_Cancel()
              {
                  QueryExecution qExec = makeQExec("SELECT * {?s ?p ?o}") ;
                  try {
                      ResultSet rs = qExec.execSelect() ;
                      assertTrue(rs.hasNext()) ;
                      qExec.cancel();
                      assertTrue(rs.hasNext()) ;
                      rs.nextSolution();
                      assertFalse("Results not expected after cancel.", rs.hasNext()) ;
                  } finally { qExec.close() ; }
              }
          

           But, let's see what others think.

           Of course, this approach is too simplistic and I don't see how it can be done for DESCRIBE, CONSTRUCT, GROUP BY, ORDER BY, DISTINCT, ...

          shelsen Simon Helsen added a comment -

           Right, my tests were designed to work remotely over HTTP, which is why I didn't come up with your suggestion above. I agree the timeout tests are tricky, but we have them working very reliably in our test suites.

          castagna Paolo Castagna added a comment -

           Ack. So, I'll see if we can have a minimal set of deterministic unit tests alongside the approach shown above and add those to the patch.

           CONSTRUCT and DESCRIBE queries seem impossible to test this way, though. If you have ideas/suggestions, please let me know.

          shellac Damian Steer added a comment -

          'Deterministic' is tough. For non-deterministic how about:

          ... { ?s ?p ?o . FILTER ex:slow(?s, ?p, ?o) }
          

          Where 'ex:slow' is a function which sleeps for a small time. Run on a thread and cancel?

          Better: mock a graph. Let graph#find drip through, halt, issue cancel. You'll still need threads, but they won't have to run concurrently.
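The mock-graph idea can be sketched without any timing dependence in plain Java: the producer blocks on a latch, so the test controls exactly when cancellation happens. This is an illustrative stand-in; a real test would mock Graph#find rather than use a plain loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of "let graph#find drip through, halt, issue cancel":
// the producer pauses on a latch, the test cancels at that exact point.
// No sleeps, so the result is deterministic.
class DripFeedSketch {
    static List<Integer> run() throws InterruptedException {
        CountDownLatch halted = new CountDownLatch(1);  // producer has paused
        CountDownLatch resume = new CountDownLatch(1);  // test says: continue
        AtomicBoolean cancelRequested = new AtomicBoolean(false);
        List<Integer> results = new ArrayList<>();

        Thread query = new Thread(() -> {
            for (int i = 0; i < 10; i++) {
                if (cancelRequested.get()) return;      // honour the cancel
                results.add(i);
                if (i == 2) {                           // drip through 3 results, then halt
                    halted.countDown();
                    try { resume.await(); } catch (InterruptedException e) { return; }
                }
            }
        });
        query.start();
        halted.await();               // deterministic: producer is paused here
        cancelRequested.set(true);    // issue cancel while it is paused
        resume.countDown();
        query.join();
        return results;               // exactly the results produced before cancel
    }
}
```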

          shelsen Simon Helsen added a comment -

           There is a small loophole in both the abort() and cancel() methods on QueryExecutionBase. If, for some reason, abort/cancel is invoked during the construction of a plan, the query is never aborted or cancelled. The addition here is a simple fix for this problem, which I applied onto my original patch. Paolo, it should be straightforward to patch it onto HEAD.

          castagna Paolo Castagna added a comment -

          Hi Simon, thanks again.

           I added the additional patch to JENA-29_ARQ_r8489.patch; JENA-29_TDB_r8489.patch is the same.

          I am now going to add more tests.

          castagna Paolo Castagna added a comment -

          Damian, Andy, I hope this first attempt at adding better tests is going in the right direction, as you suggested.

          I am not so sure I understand the role of ex:slow() when used with ORDER BY ex:slow().

          castagna Paolo Castagna added a comment -

          Snapshots for ARQ and TDB patched with JENA-29 are available here:

          http://oss.talisplatform.com/content/repositories/talis-snapshots/com/hp/hpl/jena/arq/2.8.8-JENA_29-SNAPSHOT/
          http://oss.talisplatform.com/content/repositories/talis-snapshots/com/hp/hpl/jena/tdb/0.8.10-JENA_29-SNAPSHOT/
          andy.seaborne Andy Seaborne added a comment -

          Excellent patch. Looking through it, I have some questions and comments. This is an important feature that touches several areas so I want to make sure I understand the implications across the system.

          Timeouts:

           We need to add a timeout mechanism on top of this patch. Timing out a query is (I think) going to be the #1 use case. One way to do it is to set off a thread when the query starts, which calls cancel() when the timer goes off. But most queries won't time out (!!!), so this seems a little heavy. Instead, I suggest that QueryIteratorBase periodically check the clock if a timeout is set.

          QueryExecution.setTimeout(millis)
          QueryIterator.setTimeout(millis)

           It should also be possible to set/reset the timeout during query execution. Timeouts are "best effort" anyway; a good contract is that an execution will not be timed out before the timeout setting.

          It's going to be hard to implement timeouts everywhere (e.g. SDB will be influenced by the DB capabilities).

          QueryIterator is an interface and is implemented by QueryIteratorBase. This has the machinery for hasNextBinding, moveToNextBinding. Even though there are non-execution query iterators, this seems to me to be one of the places to add the mechanism.

          The alternative is that QueryIter does the work - it is the root of iterators associated with an execution, with registration and tracking

           We'll need to time out sorting and grouping as well, and maybe any materializing iterators.

          What to do on timeout?

          For API use, that's relatively easy. An exception can be thrown on .hasNext() or .next().

           The effect of this on the results output (XML, JSON) needs to be considered, though. The catch is that HTTP sends the status code as the first line of the response, and if it's 2xx, it means the operation succeeded.

           "206 Partial Content" is for partial GETs, so it is not applicable.

           Taking the XML form, I see the following choices:

          1/ Silent truncation - just return less (and wrong) results.
          2/ Return illegal XML - truncate when the exception occurs.
          3/ Buffer - get all the results before sending any, then the HTTP status code can be set.

           (3) is ideal functionally but loses streaming.

          Some questions, discussion points:

           1/ Do we need a separate "cancel" from "abort", given that abort currently does a close and has to be called on the execution thread? We could extend the contract of abort to cover non-execution threads, making it cancel. The cancel mechanism of the patch would provide the machinery.

           2/ I didn't understand the difference between cancel, requestCancel and requestSubCancel. As far as I can see, they are just a way to pass cancel down, with the contract that it's potentially asynchronous. This is what the patch for QueryIter2 does; QueryIter1 has requestCancel/requestSubCancel and QueryIterBase has requestCancel. Can these all be "cancel"? The QueryIterator interface already has abort.

          3/ QueryIterSort , QueryIterGroup

           In trunk, these do all the work in the constructor, then return an iterator over the results. In the patch they return an iterator that delays the work until hasNext() is called. But hasNext() is going to be called immediately anyway, because the results are pulled by the application. I didn't understand the need for the extra iterator; what am I missing?

           For sort, the sorter is pulling on an underlying query iterator, so the cancel mechanism will be there. Maybe we need the comparator to also test the cancel flags; the javadoc for Arrays.sort does not say much here.

           Note the external sort patch in JENA-44.

          For group, the grouping is pulling on an underlying query iterator and the group operation is not a significant cost.

          andy.seaborne Andy Seaborne added a comment -

          Any experiences of running the ARQ-2.8.8-JENA_29-SNAPSHOT build to report yet?

          Show
          andy.seaborne Andy Seaborne added a comment - Any experiences of running the ARQ-2.8.8-JENA_29-SNAPSHOT build to report yet?
          Hide
          shelsen Simon Helsen added a comment -

           Andy, I think I already answered some of your questions in the description. Let me address them here again:

           1) Timeouts: this is the number-one use case, and we (in our own framework) achieve it by having one thread monitor all incoming query executions. We use it for other things too, like advanced scheduling. Personally, I think it is better to leave this to clients instead of trying to jam it into QueryIteratorBase. The key is that a client is able to invoke cancel() from a secondary thread and get partial results without exceptions.

           2) As for the API, we abuse 206 because we feel 200 is wrong: it deceives the client into believing the results may be complete when they are not. Or rather, once cancel() has been invoked, it cannot be known whether the results will be complete. Note that my patch is specifically designed so that the query completes normally, except that it may show only partial results.

           3) This leads me to the mechanism of requestCancel() versus cancel(). This is critical! requestCancel() can be invoked at any moment by a parallel thread. This is not the case for cancel() or abort(). If you randomly invoke cancel or abort, you will get exceptions of all sorts and kinds: NPEs, concurrent modification exceptions, things like that. The reason is simple: if you cancel immediately (and thus make hasNext() return false), you break the invariant that hasNext()==true always allows next() to work fine:

           t1                   t2
           hasNext() == true    ...
           ...                  cancel()
           next() => breaks     ...

           With an immediate cancel, t1's next() would no longer succeed once cancel() has been invoked. So, to alleviate this, cancel() really only requests a cancellation instead of enforcing it, so that all iterators can align before cancelling the execution. I have tests which show that this breaks without requestCancel().

           4) QueryIterSort and QueryIterGroup: again, quite critical to make partial results work. The way the patch is constructed, it is possible for a sorting query to be cancelled somewhere during execution and still return the partial results sorted. I had to refactor the execution to make sure this keeps working; I have tests which prove it. Before my changes, a sorting query would only ever return one result when cancelled; now it returns as many results as the iterator was able to collect, but sorts them before returning. Similarly for groups.

           5) abort(): I would request that we keep abort. Abort, unlike cancel(), is not thread-safe, i.e. it will cause exceptions. However, it is more aggressive and therefore the right thing when you just want to quickly abort an execution without caring about partial results. We have important use cases for that. In those scenarios, we catch and swallow any exceptions around the abort() call as well as around the execute call (in the query thread) itself.

           6) Buffering on a web server: it is correct that we had to buffer results in order to be able to change the status code, but in practice this is behaving well for us.
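The sort behaviour described in point 4 can be sketched in plain Java: stop consuming input when cancellation is requested, but still sort whatever was collected before handing it to the client. This is an illustrative stand-in, not the actual QueryIterSort code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of point 4: on cancel, stop filling the buffer, but still sort
// the partial result set before returning it. Illustrative only.
class PartialSorter {
    static List<Integer> sortUntilCancelled(Iterator<Integer> input,
                                            AtomicBoolean cancelRequested) {
        List<Integer> buffer = new ArrayList<>();
        while (input.hasNext()) {
            if (cancelRequested.get()) break;   // stop consuming input...
            buffer.add(input.next());
        }
        Collections.sort(buffer);               // ...but sort what we have
        return buffer;
    }
}
```

This is why a cancelled sorting query pays the cost of sorting the partial set before returning: the client always sees sorted (if possibly incomplete) results.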

           Andy, the patch as I provided it makes sure the iterator semantics stay intact and truly partial (but correct) results can be computed. As you wrote above, it is best effort, meaning that cancel() is not immediate but tries to bring the computation to a graceful end so that a client can still process the results found so far. That is why cancel(), as provided in the patch, does not throw any exception. It gracefully finishes the computation, and this is critical for us!

           I hope you understand what I tried to do and why the patch does exactly what we need. If you decide to deviate, please explain carefully, as we may be forced to patch any adaptations in those cases.

          thanks,

          Simon

          shelsen Simon Helsen added a comment -

           Not sure what you mean about ARQ-2.8.8-JENA_29-SNAPSHOT; we are currently on 2.8.7 and are not planning to upgrade. What exactly is in it, and what should I test? Is this about the direct-mode corruption?

          castagna Paolo Castagna added a comment - - edited

          > not sure what you mean about ARQ-2.8.8-JENA_29-SNAPSHOT

          It's simply a SNAPSHOT with JENA-29 patch applied, to make it easier for people to try and test it:
          https://issues.apache.org/jira/browse/JENA-29?focusedCommentId=12993115&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12993115
          Having more people looking at a patch and/or using or trying it and reporting problems, even before we commit it into trunk, is very helpful.

          shelsen Simon Helsen added a comment -

          Thanks for the clarification Paolo. I now see that the build has jena_29 in it, i.e. the number of this defect

          andy.seaborne Andy Seaborne added a comment -

          Simon - I still don't understand why the patch to QueryIter1 is different in style to QueryIter2.

          QueryIter1 propagates requestCancel() down the chain.
          QueryIter2 turns requestCancel() into calls of cancel() down the tree. Why not requestCancel?

Currently, in implementation terms, the only difference between close() and abort() is that abort suppresses exceptions.
The implementation of abort() currently calls closeIterator(), so we have the opportunity to extend its contract and make it thread-safe. There is overlap between abort(), cancel() and close(). While we're in the area, it seems better to make a consistent set of operations.

> I have tests which prove this.
Can you share and upload these tests?

          Paolo has added tests but more is good.

          andy.seaborne Andy Seaborne added a comment -

          Checked in to SF.
          New JIRAs for next stage.

          Question about QueryIterAbortCancellationRequestException:

          Why is this thrown? It cuts through the stack bypassing QueryIteratorBase and goes straight back to QueryExecutionBase, where it's caught and treated like .cancel has returned.

Can't QueryIterSort/Group call super.cancel(), or implement requestCancel()? Even being careful of the initialization, can't requestCancel() be used? Bypassing might mean cancel requests on intermediate iterators (subqueries) are missed. That is unlikely to be a problem now, but it could be if something has to free external resources (scalable sort / JENA-44, for example) which might be allocated at constructor time.

          shelsen Simon Helsen added a comment -

          Andy, a few things

          1) the difference between QueryIter1 and QueryIter2 is just following the pattern for closing.
          2) ok, if abort is identical to close except for trying to suppress exceptions, it can probably be removed since we have to catch exceptions surrounding the abort as well as the actual execution
          3) I cannot share my tests since they are tied to our own framework which wraps around jena. They rely on our own monitoring thread and I don't know how I could move that without essentially writing new tests. The key is that in these tests, I can manually verify that partial results come back and that the timeouts are more or less observed. So, I have a query which runs, say 10s, but times out after 1500ms and produces, say 5, instead of 100 results.
          4) QueryIterAbortCancellationRequestException: this exception is thrown whenever there is an embedded iterator which was cancelled (notably the sort). If I do not "abort" the "cancel", I would still only see at most 1 result instead of all results which the embedded iterator found. If you have a better idea on how to handle this, be my guest, but I was not able to get more than one result when a sorting query was cancelled in the middle
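The monitoring-thread setup described in (3) can be sketched independently of any particular framework. Everything below (CancellingHarness, SlowIterator, the 20ms delay) is a hypothetical stand-in for a long-running query, not Jena API:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical harness: the "query" runs on the calling thread while a
// monitoring thread requests cancellation after a timeout. The consumer
// keeps whatever partial results were produced before the cancel.
class CancellingHarness {
    // A deliberately slow iterator standing in for query execution.
    static class SlowIterator implements Iterator<Integer> {
        private volatile boolean cancelled = false;
        private final int total;
        private int i = 0;
        SlowIterator(int total) { this.total = total; }
        void cancel() { cancelled = true; }
        public boolean hasNext() { return !cancelled && i < total; }
        public Integer next() {
            try { Thread.sleep(20); } catch (InterruptedException e) { }
            return i++;
        }
    }

    static List<Integer> run(SlowIterator it, long timeoutMs) {
        ScheduledExecutorService monitor =
            Executors.newSingleThreadScheduledExecutor();
        monitor.schedule(it::cancel, timeoutMs, TimeUnit.MILLISECONDS);
        List<Integer> results = new ArrayList<>();
        while (it.hasNext()) results.add(it.next()); // stops early on cancel
        monitor.shutdownNow();
        return results; // possibly a partial result set
    }
}
```

With 100 items at roughly 20ms each and a 200ms timeout, the harness returns on the order of 10 results instead of 100, which is the "timeout observed, partial results kept" behaviour the tests above verify manually.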

          beobal Sam Tunnicliffe added a comment -

          Andy,

          Am I understanding your last comment correctly? ("Checked in to SF"). I just checked out a fresh copy of trunk, to make an updated patch for JENA-44, but I'm not seeing these modifications there.

          andy.seaborne Andy Seaborne added a comment -

          Have just committed the final changes for this stage after some sorting out and testing.

          rbattle Robert Battle added a comment -

          We (@BBN) also have a need for query cancellation. We have previously implemented it in a non-ideal way (we needed to ensure that all of our external resources were closed) so we are very interested in this patch.

          I'm a little unclear on a few points though:
          1) Is it necessary to supply partially sorted/grouped results for a cancelled query? If the query is cancelled, why should the client care what the partial results were?
2) Is it the case that this patch implies that abort should still be used if you don't care about the results of a query but want to stop its execution?

          andy.seaborne Andy Seaborne added a comment -

          Robert,

Q1: No, not really. One thing to be careful of, though: if the sort is partial, the first element might not be the largest/smallest, i.e. it is a non-answer, not a truncated answer. If a query is aborted by an outside thread, then we need to define the contract on any answers remaining.

          An alternative is if cancelled, then there is an exception, not silent early truncation with undefined answers. Either approach has application implications.

Q2: Calling close() while consuming query results is preferred. Currently, .abort must be called on the same thread as the execution. We are considering extending abort to be callable from another thread, and having abort/close rather than abort/cancel/close.

          shelsen Simon Helsen added a comment -

          Q1: We have concrete use-cases to get partial results, so I really have to stress we need that support very strongly. As for what exactly a truncated answer is, of course, you need to define what it means. But in the case of sorting, it is accepted by our clients that the returned result set S is a subset of TS (where TS is the complete result set if it was allowed to finish) and for all s1,s2 in either S or TS, it holds that s1<=s2 according to the given ordering criteria. My patch does exactly that and does a good job at making S as large as possible until the cancel() is invoked. Without the changes in the sorting iterator, I would only ever see 1 result, which is useless for us

          Q2: yes and no. I need to keep repeating that the current version of abort() is not thread-safe. If you call it from a secondary thread, it will cause exceptions in the main execution thread. If you can live with that, then, yes, you could use abort or even close since either way, you need to catch all exceptions in both the thread which performs the cancel as well as the thread which executes. The way the cancel() patch has been implemented, this is not needed. It is safe being called from an outside thread to then safely finish the cancelled query and return as many results as was possible up to that point

          While I cannot speak for the requirements of other stakeholders, we need the partial results, which is why I made all the changes you see in the patches. Moreover, if you want partial results, you want a graceful completion of the execution, making either abort or close unusable

          shelsen Simon Helsen added a comment -

I just looked at our code: in the scenarios where we do not need partial results (we have those use-cases as well), we use close and just catch anything that happens. One thing we needed was for a given query execution to know what happened, so we have a wrapper which keeps track of whether cancel() or close() was invoked. Ideally, this could move into QueryExecution, because if a secondary thread calls either cancel() or close(), you want to be able to investigate what happened to the query when it returns.

          shelsen Simon Helsen added a comment -

I am looking again at the code. Andy, to clarify: the need for requesting versus cancelling is primarily that when I request a cancel, the iterator only moves to the cancelled state once nextBinding() has executed, because at that point hasNext() is allowed to answer true or false, i.e. when cancelled it will return false. However, if you immediately turn a cancel request into a cancel, you may get a situation where hasNext() would return false but nextBinding() is still going to be called, because the async thread requested the cancel after the executing thread had already passed hasNext(). I hope this is clear now.

          Now, I agree that the pattern between cancel() and requestCancel() is a bit confusing. This has to do with the fact that cancel() actually sets the "requestCancel" flag, whereas requestCancel() just propagates the cancellationRequest over the iterator network. That is also why the pattern between QueryIter1 and QueryIter2 looks different. Perhaps this could all be made more readable by having each iterator just implement requestCancel() and in that method also set the requestCancellation flag. In that case, there is no need to call cancel() again and it is clear how the cancellation requests are propagated.

Finally, and this is a different topic: we were noticing weak cancellation behaviour whenever there are optional clauses, and I now notice that there may be a glitch for optional clauses and, in fact, any QueryIterRepeatApply. The problem is that each time one moves to the next stage, currentStage becomes null for a short time, so neither close() nor cancel() will take effect in that window. Afterwards, the close or cancel will be forgotten. I'll provide a patch for this.
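The stage-gap problem can be illustrated with a small sketch. All names here (RepeatApplySketch, Stage, makeNextStage) are invented for illustration and are not the actual ARQ classes; the point is only that the cancel request must be remembered and re-checked when a new stage starts:

```java
// Hypothetical sketch of the QueryIterRepeatApply gap: cancel() may
// arrive while currentStage is null (between stages), so the request
// must be remembered, not just forwarded to the current stage.
class RepeatApplySketch {
    private volatile boolean requestingCancel = false;
    private Stage currentStage = null; // null for a moment between stages

    void cancel() {
        requestingCancel = true;
        Stage s = currentStage;
        if (s != null) s.cancel(); // alone, this misses the null window
    }

    Stage makeNextStage() {
        Stage s = new Stage();
        // Fix: re-check the remembered flag when a new stage starts,
        // so a cancel that arrived in the gap is not forgotten.
        if (requestingCancel) s.cancel();
        currentStage = s;
        return s;
    }

    static class Stage {
        private volatile boolean cancelled = false;
        void cancel() { cancelled = true; }
        boolean isCancelled() { return cancelled; }
    }
}
```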

          sallen Stephen Allen added a comment -

          I argue that there should be no expectation of returning any results that might be "queued up" in an iterator after cancellation has been requested. It seems arbitrary to decide that sorting/grouping operations should continue after a cancel is requested. These operations may be expensive (especially if JENA-44 is implemented). Instead I think no further results should be returned after cancellation. Also queries with sort/group subqueries may yet have a long way to go after the sort/group finishes (I believe this is what Andy was getting at in JENA-48).

In my mind, if there is a requirement to have some sorted partial results after a cancellation is requested, then the sorting/grouping should be performed on the client side, with streaming results from the server/ARQ.

          shelsen Simon Helsen added a comment -

Stephen, your comments are troublesome because obviously a client cannot do this unless we abandon the use of ORDER BY entirely (which we won't). I think it is reasonable to sort whatever is queued up, even if this is "somewhat" expensive, i.e. as described in JENA-44. That is the whole idea behind cancel(). If you keep insisting that cancel() may NOT return anything useful, that is as much as saying you don't want cancel() support at all; in that case, simply stick to close(). The whole point of my cancel() addition is to still get useful results even if a query execution had to stop prematurely.

Also, sorting should be a fraction of the time compared to the real query execution (currently that is certainly the case, and I don't see why JENA-44 changes that).

Finally, as for JENA-48, I presume this is only an issue for subqueries (is that standard SPARQL?), in which case cancellation could be more conservative, i.e. the cancel of a subquery should probably be ignored. But that is beyond any of the use-cases I know and can probably be resolved without abandoning the expectation I described above.

          shelsen Simon Helsen added a comment -

This should avoid cancellation (and close) accidentally "missing" the next stage.

          sallen Stephen Allen added a comment -

          Why is it not reasonable to abandon ORDER BY in the query and do it client side? This sounds like a specific use case requirement rather than an ARQ requirement.

The semantics of SPARQL specify that a solution to a query is valid if it meets all the conditions specified in the BGPs, filters, and solution sequence modifiers (Order, Project, Distinct, Reduced, Offset and Limit). This means that only full-and-final results should be returned, once those are fully-and-finally computed. In practice, ARQ will build up solutions incrementally in order to stream results back to a client if it is able to. The contract of ORDER BY is that the first item to be returned actually is the first item in the solution set/bag. I would say that to have it return anything else is an incorrect result, not a partial result. If that is the case, then it makes sense to me to simply abandon any work being done and try to cancel the query as soon as possible. If a client wants to rely on any results that have already been returned to it, that is its prerogative.

          Having said all of that, I'm not saying cancel() support is not important, I think just the opposite. I see your patch as valuable because it adds the ability to cancel a query from a different thread by chaining the cancel request through the iterators.

          P.S. Slight bug fix: in QueryIteratorBase and QueryIterRepeatApply, you need to make the "requestingCancel" field volatile to ensure visibility (see [1] for more info).

          [1] http://www.ibm.com/developerworks/java/library/j-jtp06197.html
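The visibility point in the P.S. can be shown with a minimal sketch. The names (CancelFlag, requestCancel, isCancelRequested) are illustrative, not the actual ARQ identifiers:

```java
// Minimal illustration of why the flag must be volatile: it is written
// by the cancelling thread and polled by the query-execution thread.
// Without volatile, the JIT/JVM may legally cache the field value in
// the polling thread and never observe the update.
class CancelFlag {
    private volatile boolean requestingCancel = false;

    // Called from the external (cancelling) thread.
    void requestCancel() { requestingCancel = true; }

    // Polled from the query-execution thread.
    boolean isCancelRequested() { return requestingCancel; }
}
```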

          shelsen Simon Helsen added a comment -

Stephen, this is an academic argument and I find it troublesome that you dismiss our use-case. And that is what you do, because if you follow your argument through, you cannot support any notion of partial result sets at all. Surely you agree that we cannot add support for partial result sets after cancellation without getting into ARQ. That sorting might incidentally be handled on the outside is one case, and I question it, because if cancellation only returned 1 result, sorting after the fact would be useless anyhow. Also, if you insist that we sort ourselves on the outside, you are saying that we should abandon the use of ORDER BY in the query syntax, which is unacceptable given that our use of Jena is motivated by the fact that it implements SPARQL.

          And it is not because SPARQL does not define what happens when a query is cancelled that an implementation cannot be pragmatic and do something useful. Please, tell me, what is an ARQ requirement if not a requirement of its clients?

          andy.seaborne Andy Seaborne added a comment -

          queryIterRepeatApply.patch does not apply. The file name com.hp.hpl.jena.rdf/temp/com/hp/hpl/jena/sparql/engine/iterator/QueryIterRepeatApply.java is confusing Eclipse. Is there a version of the patch against development trunk from SF?

          andy.seaborne Andy Seaborne added a comment - - edited

          Just noticed the "volatile" PS'ed on a comment. Added, mainly for safety.

I don't think this makes a significant difference, because the code isn't spinning on the flag purely within the class (the while(!flag) pattern, where the value can be copied into a thread-local temporary by the JIT/JVM), but it is safer to add it.

          andy.seaborne Andy Seaborne added a comment -

We have two use cases for what happens when a query is cancelled during a long sort operation. I'd characterise them as "do your best" and "truncate results". In "do your best", the results are based on how far the query engine got before the cancel was acted upon, so any work done until then is exposed. In "truncate results", the cancellation is an asynchronous stopping of output at that point.

Example:

SELECT ?x { ... } ORDER BY DESC(?x)

          Suppose the values for ?x are 2,3,1 from the graph pattern and a cancellation happens after 2 has been received by the sort code, but before 3 and 1. "do your best" returns 2 because that is in the sort engine; "truncate results" returns nothing because it does not know 2 comes first.

The same can happen in grouping, because the group may be the whole query results (aggregate used, no GROUP BY in the query) and so be a long operation.

Example:

SELECT ?x { ... } ORDER BY DESC(?x) LIMIT 1

and in SPARQL 1.1:

SELECT (MAX(?x1) AS ?x) { ... }

          The un-cancelled results are one row with ?x=3. The "do your best" gives 2 and "truncate results" gives empty.

          Both seem reasonable application experiences if communicated clearly.
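The two policies on the ORDER BY DESC(?x) example can be made concrete with a small sketch. This is purely illustrative (CancelPolicies and its methods are invented names, not ARQ code); the buffer holds only the value 2, which arrived at the sort before the cancel, while 3 and 1 never did:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Illustration of the two cancellation policies: the graph pattern
// would yield ?x = 2, 3, 1, but cancellation arrives after only 2 has
// reached the sort buffer.
class CancelPolicies {
    static List<Integer> doYourBest(List<Integer> buffered) {
        // Sort whatever arrived before the cancel and return it.
        List<Integer> out = new ArrayList<>(buffered);
        out.sort(Comparator.reverseOrder()); // ORDER BY DESC(?x)
        return out;
    }

    static List<Integer> truncateResults(List<Integer> buffered) {
        // The buffer may not contain the true first element (3 never
        // arrived), so return nothing rather than a wrong ordering.
        return Collections.emptyList();
    }
}
```

With a buffer of [2], doYourBest returns [2] while truncateResults returns []; the un-cancelled query would have returned [3, 2, 1].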

          shelsen Simon Helsen added a comment -

          Yes, I never thought of cancellation as providing truncated answers, because under that understanding the answers would be wrong. But I don't think a client ever expects this. If you "cancel" an execution, you expect the best outcome up to that point (what else would be reasonable?).

          Now, I think in the MAX example above, you don't really care about the answer unless it is final. But that is NOT true for sorted results: a client will certainly be interested in the partially sorted results, knowing they are incomplete. I really do not understand why the only correct results for a sorting query are no results.

          As for the patch, Andy, can you just manually apply it? It is very short...

          sallen Stephen Allen added a comment -

          Here are the concerns I have with the "do your best" approach:

          1) Subqueries: If cancellation lets sorts run to completion, then in some cases you can't effectively cancel queries with subqueries (see [1] for a pathological case where a query cancelled towards the end of the subquery execution would still take a long time to complete)

          2) If query cancellation is performed outside the client program (say by a server-side timer, or through the Joseki web interface), then it would be impossible to know if any particular query returned the correct and full results without performing some (non-SPARQL) call back to the server to check. If, instead, the query results XML were truncated, then you could handle it the same as if the connection were broken, because at least the closing </sparql> tag would be missing.

          3) I feel like the predominant use case is to terminate run-away queries with the expectation that you don't need partial results. That is to say, the regular expectation of your system is to retrieve full and final results. I assert this because if you wanted to build a system that, in the course of its regular execution, used results that were based on a subset of the population, then you probably would want to do proper sampling/extrapolation to get a statistically reliable approximation of the final result.

          Additionally, the "do your best" approach can be implemented on top of the "truncated" approach by removing the ORDER BY from the query and doing the sorting client side. If a server-side solution were desired, one could write a servlet filter that intercepted queries with a top-level ORDER BY, modified the algebra to remove it, performed the query, and then sorted the results in the servlet filter.

          [1]
          select * where
          {
            {
              select ?s ?p ?o where
              { ?s ?p ?o . }
              order by ?s
            }
            ?x ?y ?z .
          }
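Stephen's client-side fallback (run the query without its top-level ORDER BY, stream rows until a cancel arrives, sort afterwards) might look like this sketch, with plain strings standing in for query solutions and all names invented for the illustration:

```java
import java.util.*;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of "do your best" built on top of "truncate results": stream the
// unsorted rows, stop when cancelled, then apply the ordering client-side.
class ClientSideSort {
    static List<String> collectThenSort(Iterator<String> unsorted, AtomicBoolean cancelled) {
        List<String> rows = new ArrayList<>();
        while (unsorted.hasNext() && !cancelled.get())
            rows.add(unsorted.next());          // streaming, truncated on cancel
        Collections.sort(rows);                 // ORDER BY applied after the fact
        return rows;
    }

    public static void main(String[] args) {
        AtomicBoolean cancelled = new AtomicBoolean(false);
        Iterator<String> source = List.of("carol", "alice", "bob").iterator();
        // no cancel: full, sorted result
        System.out.println(collectThenSort(source, cancelled));  // [alice, bob, carol]
    }
}
```

If the cancel flag flips mid-stream, the caller gets a sorted prefix of whatever was collected, which is exactly the "do your best" behaviour, while the server-side machinery only ever needs "truncate results".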

          shelsen Simon Helsen added a comment -

          First, let me say in general that the fact that the "do your best" approach does not work well in some cases is not a reason to discard the entire approach. In some cases it does work well, and those cases are based on real use cases, and that is what matters to us. So I would really appreciate it if you took the academic approach out of this, since it won't help and is irrelevant.

          Concrete:

          1) subqueries: I think, as discussed in JENA-48, if a subquery is sorting, it should give up right away. It should be simple to figure out if a query is a subquery and enrich the logic to behave differently there.
          2) This I don't get at all. If the QueryExecutionBase keeps track of whether a query was requested to cancel, any outside client program can request that information, that is, when partial results are desired. If NO partial results are desired, don't use the cancel() the way I provided it but use abort() or close(). What is problematic here?
          3) again, same as above. If you don't care about partial results, then don't use cancel() ! Instead use abort() or close(). And if you want to enrich abort() in a thread-safe manner, be my guest

          Stephen, again, you keep questioning our use cases because they do not fit a generic theoretical expectation that the approach work well in all cases. If it works well in the relevant cases, I don't see why it cannot be there. And I agree that it is useful to have a version which just aborts, perhaps throws an exception or whatever, but when I provided the cancel() patch, it was not designed to do this.

          Perhaps we could agree to keep cancel() the way I suggested, with small improvements in the style of JENA-48 to take care of special subcases, and at the same time provide a thread-safe way to stop query execution (perhaps using the same 2-stage technique I introduced with cancel()) to abort a query. I think abort() could easily be adapted to that. As long as the javadoc is clear about the difference between cancel() /** I try to give you partial results as good as I can and stop the query as soon as I can */ and abort() /** I will stop asap and throw an exception indicating I aborted */

          But I have to insist that we need and use the "do your best" approach the way I introduced it, and that it works well in the cases our clients are using. If someone expects unreasonable outcomes, it can be pointed out to them that the contract does not guarantee this unreasonable expectation. And a top-level sort is not unreasonable IMO.

          Could this be a way to resolve the difference of opinion? I.e. provide 2 ways to stop a query with different behavior?

          castagna Paolo Castagna added a comment - edited

          We (@Talis) run a few public SPARQL end-points and we want to protect our machines from people running very expensive queries. Cancelling or timing out a long running query is therefore very useful to us, and we currently do it using a separate thread via a Callable and an ExecutorService, which gives us a Future object which we use to set a timeout by calling get(long timeout, TimeUnit unit).

          If one of the QueryExecution.execX() methods fails to return within the timeout, we send back an HTTP 500 error code and call QueryExecution.abort(). Once we start streaming back results (i.e. an HTTP 200 status code has been sent to the client) we never timeout or interrupt.
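The pattern described above can be reduced to a minimal, self-contained form. Here a sleep stands in for QueryExecution.execSelect(), and the returned strings (including the "HTTP 500" message) are illustrative only:

```java
import java.util.concurrent.*;

// Minimal version of the Callable/ExecutorService/Future timeout pattern:
// run the "query" in a worker thread, bound it with Future.get(timeout),
// and abort on timeout. A sleep simulates the long-running query.
class TimeoutPattern {
    static String runWithTimeout(long queryMillis, long timeoutMillis) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> result = pool.submit(() -> {
            Thread.sleep(queryMillis);           // stands in for QueryExecution.execSelect()
            return "results";
        });
        try {
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            result.cancel(true);                 // here one would also call qexec.abort()
            return "HTTP 500: query timed out";
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runWithTimeout(5_000, 100)); // slow query hits the timeout
        System.out.println(runWithTimeout(10, 1_000));  // fast query completes
    }
}
```

As the thread notes, Future.get(timeout, unit) only bounds the caller's wait: the worker thread keeps running (and consuming RAM) unless the underlying execution cooperates with the interrupt, which is exactly the gap JENA-29/JENA-44 aim to close.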

          Typically, we see timeouts for large CONSTRUCT or DESCRIBE queries. It's rare we timeout a SELECT query, unless it has a large sort (i.e. ORDER BY). In this case, we timeout. However, even if we send the HTTP 500 error code back to the client, the thread running the sort will continue until completion (wasting RAM and resources). This is where we would greatly benefit from JENA-29 or JENA-29+JENA-44.

          We do not want or need to send partial results to people. We expect this to be very confusing and even difficult to explain to users. They might fall into the trap that if they are searching the 20 largest cities in Europe and we send them only 10, those 10 are the first 10 largest cities in Europe while in reality they might not even be in the first 20.

          Being able to set a timeout directly in ARQ seems useful and it potentially removes the need for a separate thread. However, what does the timeout exactly represent? Is it the time to get to the first result? Or what? If it's not the time to get to the first result, will it be possible to cancel/reset a timeout once it has been set and the query execution has started?

          Finally, a few (personal) comments:

          @Andy: "Timing out a query is (I think) going to be the #1 use case."

          It is for us at Talis.

          @Andy: "It might as well be possible to set/reset during query execution."

          Indeed. We would like to be able to forget about the timeout once we start streaming back a large result set.

          @Andy: "We'll need to timeout on sorting and grouping as well, and maybe any materializing iterators."

          Yes.

          @Andy: 3/ Buffer - get all the results before sending any, then the HTTP status code can be set. (3) is ideal functionally but loses streaming.

          Yep. Ideal functionally, but not in practice with limited resources and multiple concurrent SPARQL queries.

          @Stephen "I argue that there should be no expectation of returning any results that might be "queued up" in an iterator after cancellation has been requested."

          I tend to agree, since it is unclear to me what exactly are the use-cases Simon referred to.

          Perhaps, the way to move forward on this is to split this issue into two: one is about timing out queries and the other one is about delivering partial results when a query times out.

          shelsen Simon Helsen added a comment -

          I already mentioned before that we do not need the support for timeout inside ARQ because we have a sophisticated monitoring thread which manages timeouts alongside additional sophisticated query scheduling logic. If you decide to add timeout support in ARQ itself, that is fine with us, as long as we can make sure it can be deactivated and hopefully does not take up more resources (including threads) when not desired.

          As for use-cases: of course, in your example you don't want partial results because you are really looking for the "largest", but there are many sorting queries where that is not the case. E.g. if you want to query for all users in a project alphabetically, then a partial result set is still better than no results if the query for whatever reason takes too long. The result on the screen would have to indicate that the results are incomplete but possibly the user already has sufficient information to go ahead. There are endless examples like this.

          Again, I do not understand why there is so much resistance.

          If all of the people commenting here feel that an abort on timeout is all they need, then that is fine with me, but we (IBM) need more, and so I would like to keep my version of cancel() with best-effort partial results.

          andy.seaborne Andy Seaborne added a comment -

          Re: 2) The use case that several groups have is to set an upper cost bound on a query issued by SPARQL protocol through Fuseki or Joseki. No server-side application-specific behaviour. In fact, if the application can be involved, the scope for control is greater. In the case inside Fuseki, there is no application-specific code, and the default behaviour should be most appropriate for multiple use cases.

          Suppose there are 100 countries in a database with their populations. The query is to find the top 10 most populous countries, that is ORDER BY + LIMIT 10. If the sort engine has seen 20 countries and is stopped, 10 results appear in the "do your best" case. But if China isn't in the result, then it is going to be a surprise.

          But note also the application may be doing "top 10", "next 10" in pages in the application logic - ORDER BY and no LIMIT. Simon - you described the contract for sort as "in either S or TS, it holds that s1<=s2 according to the given ordering criteria", which is true. But the contract goes further: item one is the max or min; the first ten are the top 10.

          Choices I see are:
          1/ The cancel call can take an optional argument giving the style.
          2/ The hasNext/next indicates "end" but can be continued to get remaining stuff.
          3/ It's a configuration setting of the execution.

          Of these, 1 or 3 tells the cancellation mechanism what is required of it in doing the cancellation, not after the cancellation as happens in 2. I prefer 3 as it's the earliest point the execution plan can be informed and there's a mechanism that's already in place (e.g. union query).

          Simon - your clients can get the behaviour you currently provide them by setting the context for the execution.

          shelsen Simon Helsen added a comment -

          Thanks Andy. Yes, I am fine with either 1/ or 3/ as well.

          shelsen Simon Helsen added a comment -

          Let me add something here: our "clients" are application programs which communicate with us via REST apis. These client programs know the nature of their queries, i.e. they are never written by end-users. Some of these application programs also permit end-users to write queries (not in SPARQL though) and those are then translated into SPARQL. But either way, these client programs know whether partial results make sense or not. I find it interesting that both Andy and Paolo come up with examples where obviously partial results don't make sense even though there are endless examples where it does make sense.

          castagna Paolo Castagna added a comment -

          I am still a bit confused about the notion of "timeout". Is it about the time to get to the first solution, or about the time to iterate through all the solutions? These are two different timeouts; are we going to support them both or just one? Which one? I think this difference makes sense only for SELECT queries, not with CONSTRUCT or DESCRIBE, since both CONSTRUCT and DESCRIBE buffer all results in memory before returning from the .execX() method. Correct?

          Will it be possible to change a timeout once it has been set?

          Use case: you set a 10 second timeout; you get the first result for a SELECT query before the timeout kicks in, so you send an HTTP 200 status code to your client and start streaming results. However, you need more than 10 seconds to stream all the results back to the client, and you do not want the transfer interrupted after 10 seconds.
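One way to model the two distinct timeouts raised here (time to the first solution vs. total streaming time) is to drop the deadline once the first solution arrives. The sketch below is a plain-Java stand-in with made-up names (TwoPhaseTimeout, serve), not a proposed ARQ API:

```java
import java.util.concurrent.*;

// Sketch of a two-phase timeout: a bound on the time to the FIRST solution,
// after which the deadline is dropped so a long stream of results is not
// cut off mid-transfer. Plain Java stand-in, not an ARQ API.
class TwoPhaseTimeout {
    static String serve(BlockingQueue<String> solutions, long firstResultMillis)
            throws InterruptedException {
        // Phase 1: bounded wait for the first solution.
        String first = solutions.poll(firstResultMillis, TimeUnit.MILLISECONDS);
        if (first == null) return "HTTP 500: no first result in time";
        // Phase 2: first result arrived; stream the rest without a deadline.
        StringBuilder body = new StringBuilder(first);
        String next;
        while ((next = solutions.poll()) != null) body.append(',').append(next);
        return "HTTP 200: " + body;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> q = new LinkedBlockingQueue<>(java.util.List.of("a", "b", "c"));
        System.out.println(serve(q, 100));                          // streams everything
        System.out.println(serve(new LinkedBlockingQueue<>(), 50)); // times out
    }
}
```

This matches the use case above: the HTTP status code is decided by the phase-1 deadline, and once a 200 has been sent, the transfer is never interrupted by the timeout.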

          andy.seaborne Andy Seaborne added a comment -

          Paolo - what do you propose?

          andy.seaborne Andy Seaborne added a comment -

          This is an outline of the contract of "cancel". It is not a description of implementation.

          Currently, QueryExecutionBase.cancel() exists (and is deprecated with the comment "do not use") for testing all this. It will be moved to .abort() when we're ready. A phase of renaming internal methods may happen when the details of implementation make the exact nature of methods and fields quite clear.

          1/ To terminate an execution, something calls .cancel, on the QueryExecution which in turn calls ".cancelRequest()".

          Multiple calls of .cancel result in one call of cancelRequest().

          cancelRequest is async to iterator execution.

          2/ Internal "cancellation" is also possible, i.e. the system chooses to call .cancel itself (e.g. timeout, if done that way, or a limit on the total number of results [not planned], other mysteries).

          3/ Cancellation is not required to happen immediately, or indeed to happen at all, but it would probably be considered a bad implementation not to do something. That is, we don't enforce by contract that cancel has a specific effect at any specific point.

          4/ When cancelled, .hasNext/.next on the results iterator are undefined.

          CONSTRUCT and DESCRIBE will return null.

          5/ There are some internal flags to control the behaviour after cancellation is active.

          Default behaviour:

          A/ Any calls to .hasNext()/.next() throw QueryTerminatedException. They start doing so at some point (cancellation is async to execution) but the intention is as soon as reasonably possible.

          I read the javadoc for the .hasNext contract as meaning that if hasNext() is true, then NoSuchElementException will not be thrown. You can get ConcurrentModificationException from Java collections from .next() anyway, regardless of .hasNext().

          Continuation behaviour:

          B/ The QueryIterator is closed during the next call to .next(). An element is returned. The iterator is not explicitly closed by QueryIteratorBase if NoSuchElementException is thrown. Further calls to .hasNext/.next may return results.

          (a little tighter would be to stop if .hasNext has not been called yet for the next solution - needs another flag for "is hasNext() already decided").
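The default behaviour (A) can be illustrated with a self-contained iterator. QueryCancelledException here is a stand-in declared locally for the exception named in the contract (QueryTerminatedException); everything else is invented for the sketch:

```java
import java.util.*;

// Sketch of the default post-cancellation behaviour: once a cancel is
// observed, both hasNext() and next() throw. Not ARQ's real iterator.
class AbortingIterator implements Iterator<Integer> {
    static class QueryCancelledException extends RuntimeException {}

    private final Iterator<Integer> underlying;
    private volatile boolean cancelled = false;   // set asynchronously

    AbortingIterator(Iterator<Integer> underlying) { this.underlying = underlying; }

    public void cancel() { cancelled = true; }

    @Override public boolean hasNext() {
        if (cancelled) throw new QueryCancelledException();
        return underlying.hasNext();
    }

    @Override public Integer next() {
        if (cancelled) throw new QueryCancelledException();
        return underlying.next();
    }

    public static void main(String[] args) {
        AbortingIterator it = new AbortingIterator(List.of(1, 2).iterator());
        System.out.println(it.next());       // 1
        it.cancel();
        try { it.next(); }
        catch (QueryCancelledException e) { System.out.println("cancelled"); }
    }
}
```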

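The two-phase contract sketched above — an asynchronous cancel request that only becomes effective at an iterator boundary, while still honouring an already-decided hasNext() — can be illustrated with a small self-contained iterator. This is a hypothetical sketch, not the actual ARQ QueryIteratorBase; the class name and the exception used are invented for illustration.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical two-phase cancellable iterator (illustrative only).
// requestCancel() may be called from any thread; it only sets a flag.
// Cancellation takes effect synchronously with the consuming thread at the
// next hasNext()/next() boundary, so a hasNext() that already answered true
// is still honoured by the following next().
class CancellableIterator<T> implements Iterator<T> {
    private final Iterator<T> underlying;
    private volatile boolean requestingCancel = false; // phase 1: async request
    private boolean cancelled = false;                 // phase 2: effective
    private boolean hasNextDecided = false;            // "is hasNext() already decided"
    private T slot;

    CancellableIterator(Iterator<T> underlying) { this.underlying = underlying; }

    public void requestCancel() { requestingCancel = true; }

    @Override
    public boolean hasNext() {
        if (cancelled)
            throw new IllegalStateException("query cancelled");
        if (hasNextDecided)
            return true;               // answer must stay stable until next()
        if (requestingCancel) {        // request becomes a real cancel here
            cancelled = true;
            throw new IllegalStateException("query cancelled");
        }
        if (!underlying.hasNext())
            return false;
        slot = underlying.next();
        hasNextDecided = true;
        return true;
    }

    @Override
    public T next() {
        if (!hasNextDecided && !hasNext())
            throw new NoSuchElementException();
        hasNextDecided = false;
        return slot;
    }
}
```

A cancel request arriving between hasNext() and next() is deferred, so the consumer never sees hasNext() promise an element that next() then refuses to deliver.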
          shelsen Simon Helsen added a comment -

          I don't understand the "Further calls to .hasNext/.next may return results" in the continuation behavior. Also, why do you call it "continuation" behavior? What is continued? In the continuation behavior, are construct and describe also returning null?
          andy.seaborne Andy Seaborne added a comment -

          The yielding of results via hasNext/next is being continued after cancel.

          You want sort to return one or more results even after the cancel. This happens because QueryIterPlainWrapper does nothing on cancel, so all that happens is that QueryIterSort propagates the cancel, gets no more results from the underlying iterator, sorts what it has, and continues to respond to hasNext/next with these partial results.

          Your patch has (in QueryIteratorBase):

          if (cancelled) {
              close();
          }

          which would limit the results to one, except that all this is bypassed by QueryIterAbortCancellationRequestException and requestingCancel is not set.
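The continuation behaviour described above — a sort operator that, on cancel, stops pulling from its input and then serves the partial buffer, sorted — can be sketched as follows. This is an illustrative stand-in for QueryIterSort, not the real class; all names are invented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical sorting iterator with "drain" semantics (illustrative only).
// When a cancel is requested, it stops consuming its input and answers
// hasNext()/next() over the rows already buffered, sorted. Consumers see a
// partial but well-ordered result instead of an exception.
class PartialSortIterator implements Iterator<Integer> {
    private final Iterator<Integer> input;
    private final List<Integer> buffer = new ArrayList<>();
    private volatile boolean cancelRequested = false;
    private Iterator<Integer> sorted = null;

    PartialSortIterator(Iterator<Integer> input) { this.input = input; }

    public void requestCancel() { cancelRequested = true; }

    private void materialise() {
        if (sorted != null)
            return;
        while (!cancelRequested && input.hasNext())
            buffer.add(input.next());   // stop draining once a cancel arrives
        Collections.sort(buffer);       // sort whatever was gathered so far
        sorted = buffer.iterator();
    }

    @Override public boolean hasNext() { materialise(); return sorted.hasNext(); }
    @Override public Integer next()    { materialise(); return sorted.next(); }
}
```

If the cancel arrives before any input is consumed, the partial result is simply empty; if it arrives mid-drain (from another thread), the rows gathered up to that point are sorted and yielded.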
          shelsen Simon Helsen added a comment -

          thanks for the clarification
          shelsen Simon Helsen added a comment -

          Paolo's comment illustrates that once you permit timeouts, you need to buffer query results before streaming them to clients. In many web containers, you cannot change the status code once the first byte is written beyond the internal HttpResponse buffer. In our environment, buffering query results is not a problem, but perhaps that is not the case for Paolo.
          castagna Paolo Castagna added a comment -

          > Paolo - what do you propose?

          A timeout to get to the first result would satisfy our use cases: if it fires, no results are returned. This also allows sending correct HTTP response codes (i.e. 200 and deliver all results, or 500 and no results).
          An (optional, for us) timeout to iterate through all the solutions would support Simon's use case for partial results.
          So, unless we can reset/cancel a timeout, two timeouts are necessary if we want to support all the use cases described in previous comments.
          Describing timeouts in terms of the time to get to the first result and the time to get to the last result (i.e. to iterate through all the results) is, in my opinion, important to avoid confusion.
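The two timeouts Paolo describes — one budget to reach the first result, a second budget to iterate the remainder — could be wired up as an iterator wrapper roughly like this. A hypothetical sketch only: the class and its behaviour are invented for illustration, and real ARQ timeout support need not look like this.

```java
import java.util.Iterator;

// Hypothetical wrapper enforcing two timeouts (illustrative only):
//  - a deadline for producing the first result, and
//  - a separate budget, armed when the first result arrives, for
//    iterating through the rest.
class TwoPhaseTimeoutIterator<T> implements Iterator<T> {
    private final Iterator<T> underlying;
    private final long firstResultDeadline;   // absolute, in nanos
    private final long overallBudgetNanos;
    private long overallDeadline = -1;        // armed on first result
    private boolean first = true;

    TwoPhaseTimeoutIterator(Iterator<T> underlying, long firstMillis, long overallMillis) {
        this.underlying = underlying;
        this.firstResultDeadline = System.nanoTime() + firstMillis * 1_000_000L;
        this.overallBudgetNanos = overallMillis * 1_000_000L;
    }

    private void check() {
        long now = System.nanoTime();
        if (first) {
            if (now > firstResultDeadline)
                throw new RuntimeException("timeout before first result");
        } else if (now > overallDeadline) {
            throw new RuntimeException("timeout while iterating results");
        }
    }

    @Override public boolean hasNext() { check(); return underlying.hasNext(); }

    @Override
    public T next() {
        check();
        T item = underlying.next();
        if (first) {                 // first result seen: switch budgets
            first = false;
            overallDeadline = System.nanoTime() + overallBudgetNanos;
        }
        return item;
    }
}
```

A server that only sets the first timeout can stream freely after the first row (and commit to a 200 status), while setting both lets it bound total execution at the cost of buffering, as discussed above.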
          castagna Paolo Castagna added a comment -

          @Simon More precisely, once you permit a timeout to iterate through all the results to be set, then if you want to send correct HTTP status codes/errors (i.e. 200, 206 or 500) you need to buffer query results, for exactly the reason you said. If the timeout is the time to get to the first result, buffering query results is not necessary, and indeed we try to do streaming as much as we can. In this case, we never send a 206.
          beobal Sam Tunnicliffe added a comment -

          Just to add a little to Paolo's comments: for our (Talis) use case the query form is also relevant to the timeout behaviour. For SELECT queries, we want to be able to time out up to the point at which the first result is returned. If we hit a timeout before this point, we return a 500 status and no results with our HTTP response. After the first result, the rest of the query execution is (relatively) cheap, so we return a 200 response and stream the complete result set out. There is a key difference in the implementation of CONSTRUCT & DESCRIBE queries though. In these, a ResultSet is used internally; it is iterated over completely and its bindings are used to populate an in-memory model representing the ultimate answer. Andy will correct me if I'm wrong here, but this iterator is completely exhausted, and the result model fully constructed, before execXXX() returns. In this case, we do care about timing out the (inner) ResultSet's iteration even after it has reached its first result, as it can still be a considerable expense to build that model. Basically,

          SELECT * WHERE {?S ?P ?O}

          is a lot cheaper than

          CONSTRUCT {?S ?P ?O} WHERE {?S ?P ?O}
          andy.seaborne Andy Seaborne added a comment -

          Simon - exactly - that's a feature of the result delivery and not the ARQ engine itself. If you return 206, you must be buffering somewhere to know whether to return 206 or 200. Joseki buffers exactly for this reason, to get the HTTP status code right. Fuseki does not yet; it will.

          I was expecting resettable timeouts if possible, ideally both sync and async. Sync only makes sense for SELECT.

          My thinking was that if the query times out, then no result from CONSTRUCT or DESCRIBE would be returned. A partial model is possible using the "drain intermediate results" mechanism for Simon, with the same considerations for different answers to a completing query.
          sallen Stephen Allen added a comment -

          Andy - I think your design makes sense.

          The two use cases I would be interested in are manual cancellation, and a timeout if a query is still running after a certain period of time (regardless of whether it has started returning results yet) - what Paolo termed "the time to iterate through all the results".

          You mention that CONSTRUCT and DESCRIBE will return null when cancelled because they currently put all triples in an in-memory model and then return those results. However, this may not always be the case, as the operations could theoretically be implemented in a streaming manner with a distinct iterator. Do we need to plan for this?

          And finally, I can see wanting to cancel update queries (e.g. the system is about to shut down, so it is better to terminate cleanly rather than leave the data files in a corrupt state). Since we don't have transaction support, we should probably make sure not to associate timeouts with update queries though.
          shelsen Simon Helsen added a comment -

          tiny (but relevant) bug fix
          andy.seaborne Andy Seaborne added a comment -

          Simon - in order to incorporate both styles of cancellation behaviour, and because of the nested group/sort issue, using an exception isn't workable. Instead, group/sort need to implement requestCancel(), not override cancel(). group/sort can close(), not cancel(), their underlying iterator, so the sort is only over the results seen so far. They continue to answer hasNext()/next() as usual.

          QueryIteratorBase has a flag as to whether to throw an exception in hasNext()/next(). When the policy is "drain", not "stop immediately", the flag is not set. Default behaviour is to throw an exception - "drain" requires some special setting.

          Next step - prototype all this.
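The flag Andy describes — throw from hasNext()/next() by default after a cancel, or keep yielding under a "drain" policy — might look roughly like this base class. Purely illustrative: the names and structure do not match the real QueryIteratorBase.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical base iterator with a cancellation-policy flag (illustrative
// only). In "stop" mode, any hasNext()/next() after a cancel request throws;
// in "drain" mode, the iterator keeps yielding whatever it can still produce.
abstract class CancelPolicyIterator<T> implements Iterator<T> {
    protected volatile boolean cancelRequested = false;
    private final boolean drainOnCancel;

    CancelPolicyIterator(boolean drainOnCancel) { this.drainOnCancel = drainOnCancel; }

    public final void requestCancel() {
        cancelRequested = true;
        onCancelRequest();            // subclasses stop consuming input here
    }

    // Hook for propagating the request down a chain of iterators.
    protected void onCancelRequest() {}

    private void checkPolicy() {
        if (cancelRequested && !drainOnCancel)
            throw new IllegalStateException("query cancelled");
    }

    @Override public final boolean hasNext() { checkPolicy(); return moreResults(); }

    @Override
    public final T next() {
        checkPolicy();
        if (!moreResults())
            throw new NoSuchElementException();
        return nextResult();
    }

    protected abstract boolean moreResults();
    protected abstract T nextResult();
}
```

Whether the flag lives on the iterator or is looked up in an execution context (as Andy suggests below for group/sort) is a separate design choice; this sketch only shows the two behaviours.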
          andy.seaborne Andy Seaborne added a comment -

          Stephen - a reasonable behaviour possibility for CONSTRUCT and DESCRIBE. I don't see any difficulties. At the moment, it's all in QueryExecution and it specifies a Model, but, especially for the forms that pass the model in, some partial results make sense.

          Updates will need very careful handling. The safest is to materialize the results, then do the changes, but that only protects one operation, and a request can be multiple operations.
          andy.seaborne Andy Seaborne added a comment -

          Something is now checked in at SF that allows .cancel to be called on a query execution.

          Simon:
          I changed QueryIterSort/Group to do a requestCancel, not use an exception (see discussion about nested queries).

          QueryIteratorBase (and only QueryIteratorBase) has cancelAllowContinue().

          Normal cancel will set the flags to cause hasNext()/next() to throw an exception after cancel is called (.close() is called sync to the execution thread).

          cancelAllowContinue does not set the flags, leaving the iterator to continue working.

          We just have to work out how a group or sort knows which cancel-handling mode is in effect. Looking in the ExecutionContext seems best to me, rather than an argument to requestCancel chosen by cancelAllowContinue. By using the execution context, it can be chosen more dynamically.
          shelsen Simon Helsen added a comment -

          Andy, sounds ok to me
          andy.seaborne Andy Seaborne added a comment -

          Leaving sub-task for QueryEngineHTTP open
          andy.seaborne Andy Seaborne added a comment -

          See release ARQ 2.8.8
          andy.seaborne Andy Seaborne added a comment -

          Reopen to set fix version.
          andy.seaborne Andy Seaborne added a comment -

          Set fix version to clear up unmarked ones.

            People

            • Assignee:
              andy.seaborne Andy Seaborne
              Reporter:
              shelsen Simon Helsen
            • Votes: 5
              Watchers: 5