Apache Jena / JENA-119

Eliminate memory bounds during query execution

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: ARQ
    • Labels:
      None

      Description

      It would be nice to eliminate all memory bounds on queries. Similar to JENA-44, it would involve modifying all of the QueryIterator objects that maintain unbounded collections of Bindings.

      The ones I've identified (let me know if I've missed any):

      + QueryIterSort
      Complete!

      + QueryIterGroup
      Probably one of the more complicated implementations. I think it can be done with a DistinctDataBag.

      + QueryIterDistinct
      Can be implemented trivially using DistinctDataBag, but would lose streaming capability. We could stream just until the first spill, which would be a little more difficult but not bad. If we wanted streaming even after spilling, we would need an on-disk hash table or B-tree, which could get expensive for possibly limited benefit (do you really need streaming after 10,000 results?).

      + QueryIteratorCopy
      Only appears to be used by QueryIterService. Simple implementation using DefaultDataBag.

      + QueryIteratorCaching
      Does not match DataBag's assumption that all writes complete before iteration begins. But it isn't used anywhere, so maybe we should remove it?

      + QueryIterDiff
      + QueryIterMinus
      Both of these materialize the RHS into a collection. Can be implemented with DefaultDataBag. As an aside, is this necessary for all queries? What if the RHS is cheap (e.g. a single TriplePattern)?

      + QueryIterJoin
      + QueryIterLeftJoin
      Both materialize RHS. Are they used anywhere? I was under the impression that ARQ only considered left-deep plans with indexed joins on the RHS TriplePatterns.

      + SubQueries
      I'm not sure how this is handled. Are these materialized somewhere?
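
      To make the "stream until the first spill" idea for QueryIterDistinct concrete, here is a minimal sketch. It is not ARQ code and does not use DistinctDataBag: the class SpillingDistinct, the method distinct, and the maxInMemory parameter are all hypothetical, rows are assumed to be single-line strings rather than Bindings, and a plain temp file re-read in multiple passes stands in for a real spilling bag.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical sketch: none of these names are ARQ or DataBag APIs.
final class SpillingDistinct {

    // Emits each distinct row once, in first-occurrence order, holding at most
    // maxInMemory rows in memory at a time. Rows must not contain line breaks.
    static void distinct(Iterator<String> input, int maxInMemory, Consumer<String> out) {
        if (maxInMemory < 1) {
            throw new IllegalArgumentException("maxInMemory must be >= 1");
        }
        try {
            // First pass runs over the live input, so results stream until the first spill.
            Path spill = pass(input, maxInMemory, out);
            // Later passes re-read whatever was spilled, until nothing spills any more.
            while (spill != null) {
                Path current = spill;
                try (BufferedReader reader = Files.newBufferedReader(current)) {
                    spill = pass(reader.lines().iterator(), maxInMemory, out);
                } finally {
                    Files.deleteIfExists(current);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // One pass: emit new rows while the in-memory set has room, then write the
    // remaining unseen rows to a temp file. Returns that file, or null if none.
    private static Path pass(Iterator<String> rows, int maxInMemory, Consumer<String> out)
            throws IOException {
        Set<String> seen = new HashSet<>();
        Path spill = null;
        BufferedWriter writer = null;
        try {
            while (rows.hasNext()) {
                String row = rows.next();
                if (seen.contains(row)) {
                    continue; // duplicate of a row already emitted in this pass
                }
                if (seen.size() < maxInMemory) {
                    seen.add(row);
                    out.accept(row); // still streaming
                } else {
                    if (writer == null) {
                        spill = Files.createTempFile("distinct", ".spill");
                        writer = Files.newBufferedWriter(spill);
                    }
                    writer.write(row); // duplicates among spilled rows are resolved in a later pass
                    writer.newLine();
                }
            }
        } finally {
            if (writer != null) {
                writer.close();
            }
        }
        return spill;
    }
}
```

      Calling SpillingDistinct.distinct(rows, 10000, consumer) would stream the first 10,000 distinct rows directly and defer the rest. Each extra pass emits up to maxInMemory further distinct rows, so an input with many distinct values pays for repeated scans of the spill file; a sort-based bag or the on-disk index mentioned above would avoid that at the cost of more machinery.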

              People

              • Assignee:
                Stephen Allen (sallen)
              • Reporter:
                Stephen Allen (sallen)
              • Votes:
                1
              • Watchers:
                2
