[JENA-119] Eliminate memory bounds during query execution - ASF JIRA

Details

Type: New Feature
Status: Closed
Priority: Major
Resolution: Done
Affects Version/s: None
Fix Version/s: None
Component/s: ARQ
Labels: None

Description

It would be nice to eliminate all memory bounds on queries. Similar to JENA-44, this would involve modifying every QueryIterator implementation that maintains an unbounded collection of Bindings.

The ones I've identified (let me know if I've missed any):

+ QueryIterSort
Complete!

+ QueryIterGroup
Probably one of the more complicated implementations. I think it can be done with a DistinctDataBag.

+ QueryIterDistinct
Can be implemented trivially using DistinctDataBag, but that would lose streaming. We could stream only until the first spill, which would be a little more difficult but not bad. Streaming even after spilling would need an on-disk hash table or B-tree, which could get expensive for limited benefit (do you really need streaming after 10,000 results?).

+ QueryIteratorCopy
Only appears to be used by QueryIterService. Simple implementation using DefaultDataBag.

+ QueryIteratorCaching
Does not match DataBag's assumption that all writes complete before iteration starts. But it isn't used anywhere, so maybe we should remove it?

+ QueryIterDiff
+ QueryIterMinus
Both of these materialize the RHS into a collection. Can be implemented with DefaultDataBag. As an aside, is materializing necessary for all queries? What if the RHS is cheap (e.g. a single TriplePattern)?

+ QueryIterJoin
+ QueryIterLeftJoin
Both materialize RHS. Are they used anywhere? I was under the impression that ARQ only considered left-deep plans with indexed joins on the RHS TriplePatterns.

+ SubQueries
I'm not sure how this is handled. Are these materialized somewhere?
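
To make the QueryIterDiff / QueryIterMinus idea above concrete, here is a minimal sketch of materializing the RHS into a bag that spills to disk. It does not use Jena: `SpillBag` and `MinusSketch` are hypothetical names, rows are plain strings standing in for Bindings, the threshold is an item count, and the "match" test is simple equality rather than real binding compatibility. Like DataBag, it assumes all writes finish before any read.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-in for a spill-to-disk bag; NOT Jena's DefaultDataBag. */
class SpillBag implements AutoCloseable {
    private final int threshold;                 // max items kept in memory
    private final List<String> memory = new ArrayList<>();
    private Path spillFile;                      // created lazily on first spill
    private BufferedWriter spillWriter;

    SpillBag(int threshold) { this.threshold = threshold; }

    /** Adds an item (must not contain line breaks); overflow goes to a temp file. */
    void add(String item) throws IOException {
        if (memory.size() < threshold) { memory.add(item); return; }
        if (spillWriter == null) {
            spillFile = Files.createTempFile("spillbag", ".tmp");
            spillWriter = Files.newBufferedWriter(spillFile, StandardCharsets.UTF_8);
        }
        spillWriter.write(item);
        spillWriter.newLine();
    }

    /** Scans memory, then the spill file; call only after all adds are done. */
    boolean anyMatch(Predicate<String> test) throws IOException {
        for (String s : memory) if (test.test(s)) return true;
        if (spillFile == null) return false;
        spillWriter.flush();                     // make spilled items readable
        try (BufferedReader in = Files.newBufferedReader(spillFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) if (test.test(line)) return true;
        }
        return false;
    }

    @Override public void close() throws IOException {
        if (spillWriter != null) spillWriter.close();
        if (spillFile != null) Files.deleteIfExists(spillFile);
    }
}

class MinusSketch {
    /** Keeps each LHS row that has no equal row in the materialized RHS. */
    static List<String> minus(List<String> lhs, List<String> rhs, int threshold) {
        List<String> out = new ArrayList<>();
        try (SpillBag bag = new SpillBag(threshold)) {
            for (String r : rhs) bag.add(r);     // materialize RHS, spilling past threshold
            for (String l : lhs) if (!bag.anyMatch(l::equals)) out.add(l);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

Calling `MinusSketch.minus(lhs, rhs, 1)` with a small threshold forces most of the RHS onto disk while the LHS still streams through. Each LHS row rescans the bag, so this bounds memory at the cost of repeated I/O, which is part of why skipping materialization for a cheap RHS may be worth considering.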

Attachments

- JENA-119-r1177452-ARQ-Construct.patch (40 kB, Stephen Allen, 29/Sep/11 22:56)
- JENA-119-r1177090-Fuseki-Construct.patch (0.6 kB, Stephen Allen, 29/Sep/11 22:56)

Issue Links

relates to: JENA-126 Change temporary table threshold policy from count to memory size (Open)

People

Assignee: Stephen Allen

Reporter: Stephen Allen

Votes: 1

Watchers: 3

Dates

Created: 19/Sep/11 23:21

Updated: 09/Feb/22 09:07

Resolved: 09/Feb/22 09:07