Details
- Type: New Feature
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
Currently the Apex engine checkpoints operators to HDFS, using the HDFS-backed StorageAgents FSStorageAgent and AsyncFSStorageAgent.
Operator checkpointing is critical to the fault tolerance of the Apex streaming platform, so the platform should also provide alternate StorageAgents that work seamlessly with large applications requiring exactly-once semantics.
HDFS read/write latency does not improve beyond a certain point because of disk I/O and staging writes. An alternate strategy that checkpoints to a fault-tolerant, distributed in-memory grid would ensure that application stability and performance are not impacted by checkpointing.
This feature will add the following functionality:
- A KeyValue store interface used by the in-memory checkpointing storage agent.
- An abstract KeyValue storage agent that can be configured with a concrete KeyValue store implementation for checkpointing.
- A concrete in-memory storage agent for Apache Geode.
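The three pieces listed above could fit together roughly as in the following minimal sketch. It is illustrative only, not the implementation attached to this issue: the names KeyValueStore, MapKeyValueStore, AbstractKeyValueStorageAgent and InMemoryStorageAgent are hypothetical, the method shapes (save, load, delete, getWindowIds) are assumed to mirror a checkpoint storage agent, plain Java serialization stands in for whatever serializer the platform uses, and a local ConcurrentHashMap stands in for the distributed grid.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical key-value store abstraction used by the storage agent. */
interface KeyValueStore {
    void put(String key, byte[] value);
    byte[] get(String key);   // returns null when the key is absent
    void remove(String key);
    Set<String> keys();
}

/** Local stand-in for an in-memory grid; a real store would delegate to the grid's client. */
class MapKeyValueStore implements KeyValueStore {
    private final Map<String, byte[]> map = new ConcurrentHashMap<>();
    public void put(String key, byte[] value) { map.put(key, value); }
    public byte[] get(String key) { return map.get(key); }
    public void remove(String key) { map.remove(key); }
    public Set<String> keys() { return map.keySet(); }
}

/** Abstract agent: serializes operator state and delegates storage to a KeyValueStore. */
abstract class AbstractKeyValueStorageAgent {
    private final String applicationId;
    private final KeyValueStore store;

    protected AbstractKeyValueStorageAgent(String applicationId, KeyValueStore store) {
        this.applicationId = applicationId;
        this.store = store;
    }

    // Keys are namespaced as <applicationId>/<operatorId>/<windowId> (an assumed layout).
    private String keyPrefix(int operatorId) {
        return applicationId + "/" + operatorId + "/";
    }

    public void save(Object state, int operatorId, long windowId) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        }
        store.put(keyPrefix(operatorId) + windowId, bytes.toByteArray());
    }

    public Object load(int operatorId, long windowId) throws IOException {
        byte[] data = store.get(keyPrefix(operatorId) + windowId);
        if (data == null) {
            return null; // no checkpoint stored for this operator and window
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (ClassNotFoundException e) {
            throw new IOException(e);
        }
    }

    public void delete(int operatorId, long windowId) {
        store.remove(keyPrefix(operatorId) + windowId);
    }

    /** Returns the checkpointed window ids of one operator in ascending order. */
    public long[] getWindowIds(int operatorId) {
        String prefix = keyPrefix(operatorId);
        return store.keys().stream()
            .filter(k -> k.startsWith(prefix))
            .mapToLong(k -> Long.parseLong(k.substring(prefix.length())))
            .sorted()
            .toArray();
    }
}

/** Concrete agent wired to the local map store; a Geode variant would pass a Geode-backed store. */
class InMemoryStorageAgent extends AbstractKeyValueStorageAgent {
    InMemoryStorageAgent(String applicationId) {
        super(applicationId, new MapKeyValueStore());
    }
}
```

Keeping the store behind a small interface means the serialization and key-layout logic lives in one place, and supporting another grid only requires a new KeyValueStore implementation. Prefixing keys with the application id is one way to keep checkpoints of different applications apart in a shared grid, which is why the agent needs to know the application id.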
This feature depends on the following Apex core feature:
https://issues.apache.org/jira/browse/APEXCORE-283
- Interface for storage agent to provide application id
- Stram client changes to pass applicationId