[JENA-2328] Query timeouts failing when plan phase is long - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: Jena 4.6.0
Component/s: ARQ
Labels: None

Description

In a production service with a large TDB store (around 500MT) we find that some complex queries evade the query timeouts (set to 90s for the first result, 120s total) and then run for hours, soaking up all available CPU cores. While the queries show no clear pattern, and it has been hard to replicate in a controlled setting, we do now have one example which is expressible as a test case. See attached.

The behaviour is that the abort() call from the alarm timeout is received by QueryExecDataset before there is an iterator to cancel: the QueryExecDataset instance is deep in getPlan(), which itself executes part of the query. In the specific example it's OpSlice which is iterating through the offset while still in the planning phase, though not all queries which cause this sort of behaviour use offsets.
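The lost abort described above can be illustrated with a small, self-contained model. This is only a sketch: `AbortBeforeIteratorSketch`, `abortNaive`, `abortRemembering` and `finishPlanning` are hypothetical names, not Jena classes or methods, and the "remember the cancel request" variant is one possible mitigation, not necessarily what the eventual fix does. Note that it only makes the abort take effect once planning returns; it does not interrupt work done inside planning itself.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a query execution; this is NOT Jena's QueryExecDataset. */
class AbortBeforeIteratorSketch {
    /** Thrown when a cancelled iterator is asked for more results. */
    static class CancelledException extends RuntimeException {}

    /** Result iterator that stops producing rows once cancel() has been called. */
    static class CancellableIterator implements Iterator<Integer> {
        private final Iterator<Integer> base;
        private volatile boolean cancelled = false;

        CancellableIterator(Iterator<Integer> base) { this.base = base; }

        void cancel() { cancelled = true; }
        boolean isCancelled() { return cancelled; }

        @Override public boolean hasNext() {
            if (cancelled) throw new CancelledException();
            return base.hasNext();
        }
        @Override public Integer next() {
            if (cancelled) throw new CancelledException();
            return base.next();
        }
    }

    private final AtomicBoolean cancelRequested = new AtomicBoolean(false);
    private volatile CancellableIterator results; // stays null while "planning" is in progress

    /** Naive abort: does nothing if the iterator does not exist yet, so the request is lost. */
    void abortNaive() {
        CancellableIterator it = results;
        if (it != null) it.cancel();
    }

    /** Abort that records the request first, so a later-created iterator can honour it. */
    void abortRemembering() {
        cancelRequested.set(true);
        abortNaive();
    }

    /** Simulates the end of planning: the iterator only becomes visible at this point. */
    void finishPlanning(boolean honourPendingCancel) {
        CancellableIterator it = new CancellableIterator(List.of(1, 2, 3).iterator());
        results = it;
        if (honourPendingCancel && cancelRequested.get()) it.cancel();
    }

    boolean isCancelled() {
        CancellableIterator it = results;
        return it != null && it.isCancelled();
    }
}
```

Calling `abortNaive()` before `finishPlanning(false)` leaves the iterator uncancelled, which mirrors the reported behaviour; calling `abortRemembering()` before `finishPlanning(true)` cancels the iterator as soon as it exists.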

Sorry, but I have no PR to offer at this stage. I have looked at whether it's possible to have getPlan() return some future or deferrable plan so that the top level exec has a handle on something that it can abort. However, the changes look far reaching and I don't yet have a satisfactory approach to offer.
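The "future or deferrable plan" idea can be sketched in plain Java, independent of Jena. Everything here is an assumption made for illustration: `DeferredPlanSketch`, `planningTask` and `abortDuringPlanning` are hypothetical names, planning is simulated by a loop that steps through an offset, and the loop is assumed to check the thread's interrupt flag, which real planning code may not do.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of planning run as a cancellable task; not Jena code. */
class DeferredPlanSketch {
    /**
     * Simulated planning that eagerly steps through an offset.
     * ASSUMPTION: the loop cooperates by checking the interrupt flag.
     */
    static Callable<String> planningTask(long offset, CountDownLatch started) {
        return () -> {
            started.countDown(); // signal that planning is under way
            for (long i = 0; i < offset; i++) {
                if (Thread.currentThread().isInterrupted()) {
                    throw new InterruptedException("planning aborted");
                }
            }
            return "plan";
        };
    }

    /** Returns true if planning was stopped by an abort issued while it was still running. */
    static boolean abortDuringPlanning() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        try {
            // The top level holds a Future, so it has a handle to abort even before any iterator exists.
            Future<String> plan = pool.submit(planningTask(Long.MAX_VALUE, started));
            started.await();     // wait until planning has really begun
            plan.cancel(true);   // the "timeout alarm": interrupts the planning thread
            pool.shutdown();
            boolean stopped = pool.awaitTermination(10, TimeUnit.SECONDS);
            return stopped && plan.isCancelled();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

The design choice is that the abort targets the planning task itself instead of the result iterator. The cost, as noted above, is that every long-running step inside planning has to cooperate with cancellation, which is where the far-reaching changes come from.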

Attachments

TestQueryExecutionTimeout3.java
15/May/22 18:38
3 kB
Dave Reynolds

Issue Links

is related to

JENA-2141 Query timeout may not be seen if during execution initialization.

Closed

links to

GitHub Pull Request #1368

People

Assignee: Andy Seaborne

Reporter: Dave Reynolds

Votes: 0

Watchers: 3

Dates

Created: 15/May/22 20:14

Updated: 25/Aug/22 09:43

Resolved: 08/Jun/22 07:30