Details
- Type: New Feature
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
As part of the IoT project I'm working on, I have created a Spark component (1) to make it easier to handle analytics requests from devices. I would like to donate this code to Apache Camel and extend it here, as I expect many people would be interested in using Spark from Camel.
The URI looks like spark:rdd/rddName/rddCallback or spark:dataframe/frameName/frameCallback, depending on whether you want to work with RDDs or DataFrames.
The idea here is that the Camel route acts as a driver application. You specify RDD/DataFrame definitions (and callbacks to act against those) in a registry (for example as Spring beans or OSGi services). Then you send the parameters for the computation as the body of a message.
For example, in Spring Boot you specify the RDD and the callback as:

@Bean
JavaRDD<String> myRdd(JavaSparkContext sparkContext) {
    return sparkContext.textFile("foo.txt");
}

@Bean(name = "MyAnalytics")
MyAnalytics myAnalytics() {
    return new MyAnalytics();
}

class MyAnalytics {
    @RddCallback
    long countLines(JavaRDD<String> textFile, long argument) {
        return textFile.count() * argument;
    }
}
Then you ask for the result of the computation:
long results = producerTemplate.requestBody("spark:rdd/myRdd/MyAnalytics", 10, long.class);
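
The DataFrame variant follows the same pattern. Here is a minimal sketch, assuming Spark 1.x DataFrames and a hypothetical @DataFrameCallback annotation analogous to @RddCallback (the bean names myFrame and MyFrameAnalytics, and the events.json input, are illustrative, not part of the proposal):

import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

@Bean
DataFrame myFrame(SQLContext sqlContext) {
    // Illustrative DataFrame definition, registered under the name "myFrame".
    return sqlContext.read().json("events.json");
}

@Bean(name = "MyFrameAnalytics")
MyFrameAnalytics myFrameAnalytics() {
    return new MyFrameAnalytics();
}

class MyFrameAnalytics {
    // Hypothetical annotation, analogous to @RddCallback above.
    @DataFrameCallback
    long countEvents(DataFrame frame, String deviceId) {
        return frame.filter("deviceId = '" + deviceId + "'").count();
    }
}

The call would then look like:

long events = producerTemplate.requestBody("spark:dataframe/myFrame/MyFrameAnalytics", "device-1", long.class);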
Such a setup is extremely useful for bridging Spark computations to different transports, as in the sketch below.
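For example, a route could bridge HTTP requests into the RDD computation. A minimal sketch, assuming the myRdd and MyAnalytics beans from above and the netty4-http component as one possible transport (any Camel consumer endpoint would do):

import org.apache.camel.builder.RouteBuilder;

public class AnalyticsBridgeRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        // Each incoming HTTP request body becomes the callback argument;
        // the computed value is written back as the HTTP response.
        from("netty4-http:http://0.0.0.0:18080/analytics")
            .to("spark:rdd/myRdd/MyAnalytics");
    }
}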
(1) https://github.com/rhiot/rhiot/tree/master/components/camel-spark