[SPARK-28513] Compute distinct label sets instead of subsets - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: 3.1.0
Fix Version/s: None
Component/s: Graph
Labels: None

Epic Link: Property Graph / Cypher / Algorithms

Description

CypherSession::createDataFrame(nodes: DataFrame, rels: DataFrame)

currently computes NodeFrames by filtering the label columns, enumerating every possible subset of them, and creating one NodeFrame per subset. For n label columns this yields 2^n label sets and NodeFrames, regardless of which label combinations are present in the data.

Instead, we should compute the distinct label sets that actually occur in the nodes DataFrame.
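
The difference between the two strategies can be sketched with plain Scala collections. This is an illustration only, not the CypherSession implementation: NodeRow, LabelSets, allSubsets and distinctLabelSets are hypothetical names, and the assumption that each label is stored as one boolean column is made for the example. On an actual DataFrame, the proposed approach would roughly amount to selecting the label columns and taking the distinct rows.

```scala
// Hypothetical stand-in for one row of the nodes DataFrame:
// an id plus one boolean flag per label column (an assumption for this sketch).
final case class NodeRow(id: Long, labelFlags: Map[String, Boolean])

object LabelSets {
  // Behaviour described in this issue: every subset of the label columns,
  // i.e. 2^n label sets for n label columns, whether or not any node uses them.
  def allSubsets(labelColumns: Seq[String]): Set[Set[String]] =
    labelColumns.toSet.subsets().toSet

  // Proposed behaviour: only the label combinations that at least one node carries.
  def distinctLabelSets(nodes: Seq[NodeRow]): Set[Set[String]] =
    nodes.map { node =>
      node.labelFlags.collect { case (label, true) => label }.toSet
    }.toSet
}
```

With three label columns and nodes that only ever carry the combinations {Person} and {Person, Student}, allSubsets returns 8 label sets while distinctLabelSets returns 2, so only two NodeFrames would need to be created.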

People

Assignee: Unassigned

Reporter: Martin Junghanns

Votes: 0

Watchers: 1

Dates

Created: 25/Jul/19 10:24

Updated: 16/Mar/20 22:53