Details
- Type: Improvement
- Status: Done
- Priority: Minor
- Resolution: Done
Description
Currently, S2GraphSink in s2jobs provides two ways to sink data from Spark into S2Graph:
1. mutate: open an S2Graph instance on each executor, then call the mutateElements method.
2. bulk: run a Spark job to build HFiles, then run loadIncrementalHFiles.
It is hard to keep track of the options for these two methods, because the mutate options are defined in org.apache.s2graph.spark.sql.streaming.S2SinkConfigs, while the bulk load options are defined in org.apache.s2graph.s2jobs.load.GraphFileOptions.
I suggest placing all configurations in one place so that they are easier to maintain.
Also, many of the bulk options can be removed because they duplicate existing keys.
For example, the dbUrl option is the same as "db.default.url", and zkQuorum is the same as "hbase.zookeeper.quorum".
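To make the proposal concrete, here is a rough sketch of what a single place for sink options might look like. It is not the actual S2Graph code: the object name SinkOptionsSketch, the normalize and writeMethod helpers, and the "writeMethod" key are all hypothetical. Only the two alias pairs (dbUrl and zkQuorum) come from this issue.

```scala
// Hypothetical sketch only: these names are illustrative and are not part of S2Graph.
object SinkOptionsSketch {
  // Bulk-only keys mapped to the canonical keys they duplicate (pairs taken from this issue).
  val legacyAliases: Map[String, String] = Map(
    "dbUrl"    -> "db.default.url",
    "zkQuorum" -> "hbase.zookeeper.quorum"
  )

  // Rewrite legacy keys to their canonical names.
  // If both forms are present, the explicitly set canonical key wins.
  def normalize(raw: Map[String, String]): Map[String, String] = {
    val (legacy, canonical) = raw.partition { case (key, _) => legacyAliases.contains(key) }
    val renamed = legacy.map { case (key, value) => legacyAliases(key) -> value }
    renamed ++ canonical
  }

  // The two sink paths described in this issue.
  sealed trait WriteMethod
  case object Mutate extends WriteMethod
  case object Bulk extends WriteMethod

  // "writeMethod" is an assumed key name; defaulting to mutate is also an assumption.
  def writeMethod(options: Map[String, String]): WriteMethod =
    options.getOrElse("writeMethod", "mutate").toLowerCase match {
      case "mutate" => Mutate
      case "bulk"   => Bulk
      case other    => throw new IllegalArgumentException(s"unknown write method: $other")
    }
}
```

With something like this, both the mutate and bulk paths would read the same normalized map, so an option such as "hbase.zookeeper.quorum" is defined once, and old job descriptions that still pass zkQuorum would keep working.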