Apache Jena: JENA-121

Improvements to Bindings

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: ARQ
    • Labels: None

    Description

      The Binding interface is a key object for query execution. It has some issues that may justify tweaking the design. Some thoughts:

      1) Bindings should be immutable (in the strong Java sense: http://www.ibm.com/developerworks/java/library/j-jtp02183/index.html)

      2) Add a BindingPair class that represents a Var/Node value (could be called something else, BindingValue?)
      2a) Binding constructor/factory method takes an Iterable<BindingPair> to initialize it
      2b) Binding can now implement Iterable<BindingPair> which would be more efficient than iterating over the variables then looking up each node

      3) An implementation that has better memory usage than BindingMap (a HashMap may be overkill here, if we can use the iterator from 2b in more places)

      4) An implementation that copies the parent's BindingPairs instead of maintaining a reference to the parent. If nothing else holds the parent binding after it is incorporated into a child, copying lets the parent be GCed and saves memory (in the common join case, this appears to be what happens). Storing all BindingPairs in a single data structure should also make iterating and looking up values faster. Additionally, more Binding objects die young instead of being held as part of a higher-level algebra collection (such as sort or distinct), which can reduce GC overhead.

      5) Expose an iterator of BindingPairs ordered by variable. This is needed for BindingComparator, and may be an option for Algebra.merge()/compatible() if we eliminate fast get(Var) lookups of nodes (as a consequence of 3). The ordering could be determined at construction or be initialized lazily.

      6) A method for estimating the memory size of a binding. This would be very useful for setting threshold policies for DataBags, although it may be tough to do, especially if Nodes are shared between bindings.
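
      To make points 1 through 4 concrete, here is a minimal sketch of what an immutable, array-backed binding might look like. It is not ARQ code: BindingPair and FlatBinding are hypothetical names, and plain Strings stand in for Var and Node so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical Var/Node pair; Strings stand in for ARQ's Var and Node.
final class BindingPair {
    final String var;
    final String node;

    BindingPair(String var, String node) {
        this.var = var;
        this.node = node;
    }
}

// Hypothetical immutable binding backed by a single flat array.
final class FlatBinding implements Iterable<BindingPair> {
    private final BindingPair[] pairs;

    // 2a) Build from any Iterable<BindingPair>. The pairs are copied, so later
    // changes to the source collection cannot affect this binding.
    FlatBinding(Iterable<BindingPair> source) {
        List<BindingPair> copy = new ArrayList<>();
        for (BindingPair p : source) {
            copy.add(p);
        }
        this.pairs = copy.toArray(new BindingPair[0]);
    }

    // 4) Copy the parent's pairs instead of keeping a reference to the parent,
    // so the parent can be garbage collected if nothing else holds it.
    static FlatBinding extend(FlatBinding parent, BindingPair extra) {
        List<BindingPair> all = new ArrayList<>(Arrays.asList(parent.pairs));
        all.add(extra);
        return new FlatBinding(all);
    }

    // 3) Linear scan instead of a HashMap; plausible only for few variables.
    String get(String var) {
        for (BindingPair p : pairs) {
            if (p.var.equals(var)) {
                return p.node;
            }
        }
        return null;
    }

    int size() {
        return pairs.length;
    }

    // 2b) Iterate over the pairs directly; the returned iterator rejects remove().
    @Override
    public Iterator<BindingPair> iterator() {
        return Arrays.asList(pairs).iterator();
    }
}
```

      Whether the linear scan in get() is acceptable depends on how many variables a typical binding carries, which is exactly the kind of question profiling would need to answer.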

      Some of these points need more investigation and careful profiling to confirm that they are beneficial, especially 3, 4, and 5.
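
      As an illustration of point 5, the sketch below shows how a compatibility check could work over two variable-ordered sequences without any get(Var) lookups. It is only a sketch under stated assumptions: SortedBindingSketch and its methods are hypothetical, and each pair is modeled as a two-element String array {variable, value} rather than ARQ's Var and Node.

```java
import java.util.Arrays;
import java.util.Comparator;

final class SortedBindingSketch {

    // Return a copy of the pairs ordered by variable name; the input is not modified.
    static String[][] sortedByVar(String[][] pairs) {
        String[][] copy = pairs.clone();
        Arrays.sort(copy, Comparator.comparing((String[] p) -> p[0]));
        return copy;
    }

    // Two bindings are compatible when every variable they share has the same value.
    // Both inputs must already be ordered by variable name, so a single
    // merge-style pass over the two sequences is enough.
    static boolean compatible(String[][] left, String[][] right) {
        int i = 0;
        int j = 0;
        while (i < left.length && j < right.length) {
            int cmp = left[i][0].compareTo(right[j][0]);
            if (cmp < 0) {
                i++;            // variable only in left
            } else if (cmp > 0) {
                j++;            // variable only in right
            } else {
                if (!left[i][1].equals(right[j][1])) {
                    return false;   // shared variable, different values
                }
                i++;
                j++;
            }
        }
        return true;
    }
}
```

      The pass is linear in the combined number of pairs, but the cost of producing the ordering (at construction or lazily) still has to be weighed against hash lookups.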

      Attachments

        1. JENA-121-r1174067.patch
          48 kB
          Andy Seaborne

            People

              Assignee: Unassigned
              Reporter: Stephen Allen (sallen)
              Votes: 1
              Watchers: 1