[TINKERPOP-960] Add a Bulk class which is used by Traverser - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Won't Fix
Affects Version/s: 3.1.0-incubating
Fix Version/s: None
Component/s: process
Labels:
- breaking

Description

Currently, Traverser.bulk() is a long. This should be generalized to support any type of "bulk" that can be split, merged, and has a magnitude, where our current representation would be a LongBulk.

There are three reasons to support this.

1. Right now we use Traverser.sack() to handle "energy flows" (0.0 to 1.0 energy in a traverser). We have introduced this ugly TraversalSource.withBulk(false) to allow these to semantically make sense (i.e. remain unitary). The "energy" should be the "bulk", as it can be split, merged, and has a magnitude.

2. We can support complex numbers for bulks (and complex number arrays). There is currently no strong use case for this beyond quantum simulation and wave dynamics, but you never know what might unfold. A solid algorithm here would be epic.

3. We need to be able to support BigInteger, as the bulking model on complex graphs easily overflows long. There have been numerous times where I wanted to repeat() more times, but couldn't without incurring long overflow and thus negative bulk. If the computation calls for it, a user should be able to use BigInteger.
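
The overflow described in point 3 can be reproduced in isolation. The following is a minimal sketch, not TinkerPop code: BulkOverflowDemo, growLong, and growBig are hypothetical names, and the assumption is that each repeat() iteration multiplies the bulk by a constant branching factor.

```java
import java.math.BigInteger;

// Hypothetical demo class; not part of TinkerPop.
class BulkOverflowDemo {

    // Multiplies a long bulk by `factor` once per step; long arithmetic wraps silently on overflow.
    static long growLong(long factor, int steps) {
        long bulk = 1L;
        for (int i = 0; i < steps; i++) {
            bulk *= factor;
        }
        return bulk;
    }

    // Same growth with BigInteger, which never overflows.
    static BigInteger growBig(long factor, int steps) {
        BigInteger bulk = BigInteger.ONE;
        BigInteger f = BigInteger.valueOf(factor);
        for (int i = 0; i < steps; i++) {
            bulk = bulk.multiply(f);
        }
        return bulk;
    }

    public static void main(String[] args) {
        // Assumed branching factor of 2 over 63 iterations: 2^63 does not fit in a long.
        System.out.println(growLong(2L, 63)); // prints -9223372036854775808
        System.out.println(growBig(2L, 63));  // prints 9223372036854775808
    }
}
```

With a branching factor of 2, the long bulk turns negative on the 63rd iteration, while the BigInteger bulk keeps the exact value.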

--------------------------

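public interface Bulk<T> {
  // whether otherBulk can be folded into this bulk
  boolean canMerge(Bulk<T> otherBulk);
  // fold otherBulk into this bulk
  void merge(Bulk<T> otherBulk);
  // produce a new bulk when a traverser splits
  Bulk<T> split();
  // numeric size of the bulk
  Number magnitude();
  // the underlying bulk value
  T get();
  // whether the traverser carrying this bulk should continue
  boolean isAlive();
  // Java has no abstract static interface methods (and T is not in scope in a static context),
  // so each implementation exposes its own static identity() factory.
}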

Our current model would be:

public class LongBulk implements Bulk<Long> {
  private long count;
  public LongBulk(long count) { this.count = count; }
  public boolean canMerge(Bulk<Long> otherBulk) { return true; }
  public void merge(Bulk<Long> otherBulk) { this.count += otherBulk.get(); }
  public LongBulk split() { return new LongBulk(this.count); }
  public Number magnitude() { return this.count; }
  public Long get() { return this.count; }
  public boolean isAlive() { return this.count > 0; }
  public static LongBulk identity() { return new LongBulk(1); }
}

Another benefit of this is that bulking is now decoupled from the Traverser class and thus, you would be able to use different bulk classes with the same Traverser-species:

g.withBulk(BigIntegerBulk.class).V().out().out()

...where the default is assumed to be LongBulk and thus you don't have to specify a withBulk. For backwards compatibility's sake, withBulk(false) would simply be an IdentityLongBulk, which would just return a 1 for everything.
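
To make the two alternatives named above concrete, here is a rough sketch of what a BigIntegerBulk and an IdentityLongBulk might look like. Neither class exists in TinkerPop. The interface is a simplified copy of the one proposed in this issue, with identity() moved to a per-class static factory, and the no-op merge in IdentityLongBulk is an assumption about how "return a 1 for everything" would behave.

```java
import java.math.BigInteger;

// Simplified copy of the interface proposed in this issue (identity() is a per-class factory here).
interface Bulk<T> {
    boolean canMerge(Bulk<T> other);
    void merge(Bulk<T> other);
    Bulk<T> split();
    Number magnitude();
    T get();
    boolean isAlive();
}

// Hypothetical arbitrary-precision bulk: same semantics as LongBulk, but it cannot overflow.
class BigIntegerBulk implements Bulk<BigInteger> {
    private BigInteger count;

    BigIntegerBulk(BigInteger count) { this.count = count; }

    static BigIntegerBulk identity() { return new BigIntegerBulk(BigInteger.ONE); }

    public boolean canMerge(Bulk<BigInteger> other) { return true; }
    public void merge(Bulk<BigInteger> other) { this.count = this.count.add(other.get()); }
    public BigIntegerBulk split() { return new BigIntegerBulk(this.count); }
    public Number magnitude() { return this.count; }
    public BigInteger get() { return this.count; }
    public boolean isAlive() { return this.count.signum() > 0; }
}

// Hypothetical stand-in for withBulk(false): the bulk is always 1 and merging changes nothing.
class IdentityLongBulk implements Bulk<Long> {
    static IdentityLongBulk identity() { return new IdentityLongBulk(); }

    public boolean canMerge(Bulk<Long> other) { return true; }
    public void merge(Bulk<Long> other) { /* assumption: merged traversers keep a bulk of 1 */ }
    public IdentityLongBulk split() { return new IdentityLongBulk(); }
    public Number magnitude() { return 1L; }
    public Long get() { return 1L; }
    public boolean isAlive() { return true; }
}
```

Merging the identity into a BigIntegerBulk holding Long.MAX_VALUE yields 9223372036854775808 rather than wrapping, whereas an IdentityLongBulk reports 1 no matter how many bulks are merged into it.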

We will then want to expose it.bulk() methods so algorithms can alter the bulk as they see fit. However, this is where a backwards compatibility issue would arise: Traverser.bulk() would return a Bulk, and a new Traverser.magnitude() would return a Number (sidenote: perhaps "magnitude" is just called "count").

g.withBulk(UnitaryBulk).V(0).bulk(1.0).outE().bulk(mult).by('weight').inV()
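
What the traversal above might do to the bulk can be sketched outside of Gremlin. This is only an illustration under assumptions: UnitaryBulk and UnitaryBulkDemo are hypothetical classes, mult stands in for bulk(mult).by('weight'), and all edges are assumed to arrive at the same vertex so that their bulks merge.

```java
// Hypothetical double-valued bulk carrying "energy"; not part of TinkerPop.
final class UnitaryBulk {
    private double energy;

    UnitaryBulk(double energy) { this.energy = energy; }

    // Copy the bulk for each outgoing edge.
    UnitaryBulk split() { return new UnitaryBulk(energy); }

    // Scale the energy by an edge weight (assumed analogue of bulk(mult).by('weight')).
    UnitaryBulk mult(double weight) { energy *= weight; return this; }

    // Combine the energy of two traversers that meet at the same vertex.
    void merge(UnitaryBulk other) { energy += other.energy; }

    double magnitude() { return energy; }

    boolean isAlive() { return energy > 0.0; }
}

class UnitaryBulkDemo {
    // Sends 1.0 energy along edges with the given weights, all assumed to reach the same vertex,
    // and returns the merged energy at that vertex.
    static double flow(double... weights) {
        UnitaryBulk source = new UnitaryBulk(1.0);
        UnitaryBulk arrived = new UnitaryBulk(0.0);
        for (double w : weights) {
            arrived.merge(source.split().mult(w));
        }
        return arrived.magnitude();
    }

    public static void main(String[] args) {
        // Edge weights summing to 1.0 keep the total energy unitary.
        System.out.println(flow(0.25, 0.75)); // prints 1.0
    }
}
```

When the outgoing weights sum to 1.0, the energy stays unitary after the merge, which is the property the withBulk(false) workaround is currently used to preserve.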

What is stellar about this is that it frees up "sack" for more "objects" and not for "bulking" when long is not sufficient. Thus, we promote that "sack" is for objects and "bulk" is for, well, bulk! Now we no longer have this weird tension between bulk and sack as they have two different uses. Bulking is for altering the "magnitude" of a traverser and sacking is for gathering statistics/data about the graph along the way.

People

Assignee: Marko A. Rodriguez
Reporter: Marko A. Rodriguez
Votes: 0
Watchers: 2

Dates

Created: 16/Nov/15 15:36
Updated: 12/Jul/17 20:48
Resolved: 12/Jul/17 20:48