Details
Type: New Feature
Status: To Do
Priority: Major
Resolution: Unresolved
Description
Background
Viewed through the lambda architecture, S2Graph provides a real-time view, with its serving layer on HBase.
The input stream that comes from the REST API is stored in HBase and can be served by graph queries in real time.
The stream, which is a write-ahead log (WAL), is also written to Kafka, which lets other components consume it.
Several works (or sub-projects) use this stream:
- S2Counter - computes real-time counts by combinations of properties, using the Kafka stream directly.
- WalToHdfs - moves the Kafka stream into the incremental view.
- S2ML - runs machine learning algorithms on the incremental view.
- …
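To make "count by the combinations of properties" concrete, here is a minimal sketch in plain Python. It is not S2Counter's implementation: the function name, the event shape, and the property names are all hypothetical, and an in-memory list stands in for records read from the Kafka stream.

```python
from collections import Counter
from itertools import combinations


def count_by_property_combinations(events, dimensions):
    """Count events for every non-empty combination of the given properties.

    events: iterable of dicts (hypothetical stand-in for WAL records).
    dimensions: property names to combine (hypothetical).
    """
    counts = Counter()
    for event in events:
        # Only combine the properties that this event actually carries.
        present = sorted(d for d in dimensions if d in event)
        for size in range(1, len(present) + 1):
            for combo in combinations(present, size):
                key = tuple((name, event[name]) for name in combo)
                counts[key] += 1
    return counts


# Hypothetical sample events.
events = [
    {"service": "shop", "action": "click"},
    {"service": "shop", "action": "buy"},
    {"service": "shop", "action": "click"},
]
counts = count_by_property_combinations(events, ["service", "action"])
# counts[(("service", "shop"),)] is 3
# counts[(("action", "click"), ("service", "shop"))] is 2
```

In a streaming setting the same per-event update would be applied incrementally as records arrive, rather than over a finished list.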
S2Lambda
Because the works above were developed separately, they use different Spark versions and contain duplicated code.
This makes the build difficult and hinders code reuse.
S2Lambda should be designed to solve this problem by providing a general framework for the speed and batch layers.
IMHO, a JSON-formatted job description should first be designed so that it is compatible with both the speed and batch layers;
S2Lambda can then be implemented against that description.
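As a rough illustration of the idea, one possible shape for such a job description is sketched below. No schema has been agreed on in this issue, so every field name here (`name`, `source`, `processes`, `sink`) and every step type (`filter`, `count_by`) is an assumption. A plain Python list stands in for the data, to show that the description itself does not say whether it is run by the speed layer or the batch layer.

```python
import json

# Hypothetical job description; none of these field names come from S2Graph.
JOB_DESCRIPTION = """
{
  "name": "wal_click_counts",
  "source": {"type": "kafka", "topic": "wal"},
  "processes": [
    {"type": "filter", "field": "action", "equals": "click"},
    {"type": "count_by", "field": "service"}
  ],
  "sink": {"type": "stdout"}
}
"""

REQUIRED_KEYS = ("name", "source", "processes", "sink")


def parse_job(text):
    """Parse the JSON text and check that the assumed top-level keys exist."""
    job = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in job]
    if missing:
        raise ValueError(f"job description is missing keys: {missing}")
    return job


def run_processes(job, records):
    """Apply the job's process steps, in order, to an in-memory list of dicts."""
    for step in job["processes"]:
        if step["type"] == "filter":
            records = [r for r in records if r.get(step["field"]) == step["equals"]]
        elif step["type"] == "count_by":
            counts = {}
            for r in records:
                key = r.get(step["field"])
                counts[key] = counts.get(key, 0) + 1
            records = [{step["field"]: k, "count": v} for k, v in counts.items()]
        else:
            raise ValueError(f"unknown process type: {step['type']}")
    return records


job = parse_job(JOB_DESCRIPTION)
sample = [
    {"service": "shop", "action": "click"},
    {"service": "shop", "action": "buy"},
    {"service": "shop", "action": "click"},
]
result = run_processes(job, sample)
# result == [{"service": "shop", "count": 2}]
```

A real runner would map each step onto a Spark batch or streaming operation instead of list comprehensions; the point of the sketch is only that both runners could read the same description.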