Details
- New Feature
- Status: Open
- Major
- Resolution: Unresolved
- 0.6.0
- None
Description
Spring XD is a unified, distributed, and extensible runtime platform for data ingestion, real-time analytics, batch processing, and data export. It simplifies the development of big data applications.
Spring XD provides an extensible DSL for defining streams and jobs using a pipes-and-filters abstraction. A simple linear stream consists of a sequence of modules: typically an input source, optional processing steps, and an output sink.
DSL example defining a stream that collects data from an HTTP source and writes it to an HDFS sink:
http --port=9000 | hdfs --fileName=<hdfs file name>
Or a Twitter search stream that stores the incoming tweets in an in-memory data grid such as Geode:
twittersearch --query=Zeppelin --outputType=application/json | gemfire-json-server --host=... --port=... --regionName=... --keyExpression=payload.getField('id_str')
The Spring XD DSL is a good fit for Zeppelin notebooks because it allows ingestion, processing, and export pipelines to be defined declaratively and in a human-readable form.
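To make the pipes-and-filters structure of these definitions concrete, here is a minimal sketch in Python that splits a stream definition into its modules and their `--key=value` options. This is not Spring XD's parser and not a proposed Zeppelin interpreter; the function names `split_outside_quotes` and `parse_stream` are hypothetical, and the sketch assumes every option is written as `--key=value` and ignores the rest of the DSL (labels, named channels, taps, and so on).

```python
def split_outside_quotes(text, sep):
    """Split text on sep, ignoring separators inside single or double quotes."""
    parts, current, quote = [], [], None
    for ch in text:
        if quote:
            # Inside a quoted region: copy characters until the quote closes.
            if ch == quote:
                quote = None
            current.append(ch)
        elif ch in "'\"":
            quote = ch
            current.append(ch)
        elif ch == sep:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_stream(definition):
    """Return a list of (module_name, options) pairs in pipeline order."""
    modules = []
    for segment in split_outside_quotes(definition, "|"):
        tokens = [t for t in split_outside_quotes(segment.strip(), " ") if t]
        if not tokens:
            raise ValueError("empty module in stream definition")
        name, options = tokens[0], {}
        for token in tokens[1:]:
            # Assumption of this sketch: options always use the --key=value form.
            if not token.startswith("--") or "=" not in token:
                raise ValueError(f"expected --key=value, got {token!r}")
            key, value = token[2:].split("=", 1)
            options[key] = value
        modules.append((name, options))
    return modules


# The Twitter example from above, with the elided options left out.
stream = parse_stream(
    "twittersearch --query=Zeppelin --outputType=application/json"
    " | gemfire-json-server --keyExpression=payload.getField('id_str')"
)
for name, options in stream:
    print(name, options)
```

The first module in the resulting list plays the role of the source and the last the role of the sink; anything in between would be a processing step.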
Issue Links
- is related to: ZEPPELIN-274 Add Support for Streaming (long-running) Tasks. (Open)