  1. Apache Apex Malhar
  2. APEXMALHAR-2066

JDBC poller input operator


Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 3.5.0
    • Component/s: None
    • Labels: None

    Description

      Create a JDBC poller input operator that has the following features.

      1. Poll from an external JDBC store asynchronously in the input operator.
      2. Polling frequency and batch size should be configurable.
      3. Should be idempotent.
      4. Should be partitionable.
      5. Should support both batch reads and polling.
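
      Features 1 and 2 can be illustrated with a minimal sketch. This is not the operator's code: the class AsyncPollerSketch, its fields, and the Supplier standing in for a JDBC query are all hypothetical names chosen for illustration. A background thread polls at a configurable interval into a buffer, and the caller drains at most batchSize rows per call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch: polls a source on a background thread, hands out bounded batches. */
class AsyncPollerSketch<T> {
  private final BlockingQueue<T> buffer = new LinkedBlockingQueue<>();
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();
  private final Supplier<List<T>> fetch;   // assumption: stands in for a JDBC query
  private final long pollIntervalMillis;   // configurable polling frequency
  private final int batchSize;             // configurable batch size

  AsyncPollerSketch(Supplier<List<T>> fetch, long pollIntervalMillis, int batchSize) {
    this.fetch = fetch;
    this.pollIntervalMillis = pollIntervalMillis;
    this.batchSize = batchSize;
  }

  /** Starts polling on the background thread, off the caller's thread. */
  void start() {
    scheduler.scheduleWithFixedDelay(
        this::pollOnce, 0, pollIntervalMillis, TimeUnit.MILLISECONDS);
  }

  /** Runs one poll and appends the fetched rows to the buffer. */
  void pollOnce() {
    buffer.addAll(fetch.get());
  }

  /** Returns up to batchSize buffered rows without blocking; may be empty. */
  List<T> nextBatch() {
    List<T> batch = new ArrayList<>(batchSize);
    buffer.drainTo(batch, batchSize);
    return batch;
  }

  void stop() {
    scheduler.shutdownNow();
  }
}
```

      In an input operator, nextBatch() would be called from the emit path, so a slow database query never blocks tuple emission.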

      Assumptions for idempotency and partitioning:
      1. The user needs to provide tableName, dbConnection, setEmitColumnList, and a look-up key.
      2. Optionally, batchSize, pollInterval, look-up key, and a where clause can be given.
      3. This operator uses static partitioning to arrive at range queries for exactly-once reads.
      It creates a configured number of non-polling static partitions for fetching the existing data in the table, plus one additional
      partition for polling newly added data.
      4. The table is assumed to have an ordered column from which range queries can be formed.
      The key column, on which polling is based, can be any column whose values are ever-increasing and which supports greater-than and
      less-than operations in SQL.
      5. If an emitColumnList is provided, ensure that the keyColumn is the first column in the list.
      6. Range queries are formed using the JdbcMetaDataUtility. Output: a comma-separated list of the emit columns, e.g. columnA,columnB,columnC
      7. Only newly added data with increasing ids will be fetched by the polling JDBC partition.
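
      The partitioning scheme in assumptions 3 to 7 can be sketched as follows. This is an illustration only, assuming a numeric key column with known minimum and maximum values; RangeQuerySketch and buildQueries are hypothetical names, not the operator's or the utility's API. It produces one bounded range query per static partition and one open-ended query for the polling partition.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: splits [minKey, maxKey] into static ranges plus one poller query. */
class RangeQuerySketch {

  static List<String> buildQueries(String table, String columns, String key,
                                   long minKey, long maxKey, int numPartitions) {
    List<String> queries = new ArrayList<>();
    long span = maxKey - minKey + 1;
    // Ceiling division so the ranges cover every key in [minKey, maxKey].
    long step = (span + numPartitions - 1) / numPartitions;

    long lower = minKey;
    for (int i = 0; i < numPartitions && lower <= maxKey; i++) {
      long upper = Math.min(lower + step, maxKey + 1);
      // Half-open range [lower, upper): no key is read by two partitions.
      queries.add("SELECT " + columns + " FROM " + table
          + " WHERE " + key + " >= " + lower + " AND " + key + " < " + upper);
      lower = upper;
    }

    // The extra polling partition only sees rows added after the snapshot.
    queries.add("SELECT " + columns + " FROM " + table
        + " WHERE " + key + " > " + maxKey);
    return queries;
  }
}
```

      For example, buildQueries("orders", "id,amount", "id", 1, 10, 2) yields two static queries covering ids 1 to 5 and 6 to 10, followed by a poller query for ids greater than 10. The table and column names here are made up; note that the key column comes first in the column list, as assumption 5 requires.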

      Per window, the first and the last key processed are saved using the FSWindowDataManager as (<lowerBound,upperBound>, operatorId, windowId). This (lowerBound, upperBound) pair is then used for recovery. The queries are constructed using the JDBCMetaDataUtility.
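
      The recovery idea can be sketched with an in-memory stand-in. This does not use the FSWindowDataManager API; WindowBoundsSketch and its methods are hypothetical, and a real implementation would persist the bounds durably. The point is that replaying a window re-reads exactly the key range recorded for it, which is what makes the read idempotent.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: remembers the key range read in each window for replay. */
class WindowBoundsSketch {

  /** First and last key processed in one window. */
  static final class Bounds {
    final long lower;
    final long upper;

    Bounds(long lower, long upper) {
      this.lower = lower;
      this.upper = upper;
    }
  }

  // Assumption: an in-memory map stands in for durable per-window storage.
  private final Map<Long, Bounds> saved = new TreeMap<>();

  /** Records the bounds once a window has finished processing. */
  void save(long windowId, long lower, long upper) {
    saved.put(windowId, new Bounds(lower, upper));
  }

  /** Returns the query that re-reads a saved window, or null if none was saved. */
  String replayQuery(long windowId, String table, String columns, String key) {
    Bounds b = saved.get(windowId);
    if (b == null) {
      return null;
    }
    // Both bounds are inclusive because they are keys that were actually read.
    return "SELECT " + columns + " FROM " + table
        + " WHERE " + key + " >= " + b.lower + " AND " + key + " <= " + b.upper;
  }
}
```

      After a failure, windows with saved bounds would be replayed through replayQuery, and normal range or polling reads would resume from the first window that has no saved bounds.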

      JDBCMetaDataUtility
      A utility class used to retrieve the metadata for a given unique key of a SQL table. This class emits range queries based on the given primary index.

          People

            devendra.tagare devendra tagare
            ashwinchandrap Ashwin Putta