[S2GRAPH-221] Unify configurations for bulk and mutate in S2GraphSink. - ASF JIRA

Details

Type: Improvement
Status: Done
Priority: Minor
Resolution: Done
Affects Version/s: None
Fix Version/s: None
Component/s: s2jobs
Labels: None

Description

Currently, S2GraphSink in s2jobs provides two ways to sink data from Spark into S2Graph.

1. mutate: open an S2Graph instance on each executor, then call the mutateElements method.
2. bulk: run a Spark job to build HFiles, then run loadIncrementalHFiles.

It is hard to keep track of the options for these two methods, since the mutate options are defined in org.apache.s2graph.spark.sql.streaming.S2SinkConfigs, while the bulk load options are defined in org.apache.s2graph.s2jobs.load.GraphFileOptions.

I suggest placing all configurations in one place so that they are easy to maintain.

Also, many options for bulk can be removed.

One example is the dbUrl option, which is the same as "db.default.url"; another is zkQuorum, which is the same as "hbase.zookeeper.quorum".
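
The redundancy described above could be handled by folding the bulk-only names into the shared keys before either sink path reads them. The following is a minimal, language-neutral sketch in Python, not code from s2jobs: the function name unify_options and the table LEGACY_TO_UNIFIED are hypothetical, and only the four option names are taken from this issue. It assumes that an explicitly set shared key should win over its bulk-only duplicate.

```python
# Hypothetical sketch, not S2Graph code. Only the four option names below
# come from the issue text; everything else is an illustrative assumption.
LEGACY_TO_UNIFIED = {
    "dbUrl": "db.default.url",
    "zkQuorum": "hbase.zookeeper.quorum",
}


def unify_options(options):
    """Return a copy of `options` with bulk-only keys renamed to shared keys."""
    unified = {}
    # Copy every key that is not a bulk-only duplicate as-is.
    for key, value in options.items():
        if key not in LEGACY_TO_UNIFIED:
            unified[key] = value
    # Fold bulk-only keys into their shared names, unless the shared key
    # was already set explicitly (assumption: the shared key takes priority).
    for legacy, target in LEGACY_TO_UNIFIED.items():
        if legacy in options:
            unified.setdefault(target, options[legacy])
    return unified
```

With a helper like this, both the mutate and the bulk path would read only "db.default.url" and "hbase.zookeeper.quorum", and the duplicate bulk options could be dropped once callers have migrated.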

Issue Links

links to

GitHub Pull Request #173

People

Assignee: Do Yung Yoon

Reporter: Do Yung Yoon

Votes: 0

Watchers: 3

Dates

Due: 29/Jun/18

Created: 14/Jun/18 08:00

Updated: 21/Jun/18 01:24

Resolved: 21/Jun/18 01:24