Spark / SPARK-26849

Introduce new option to Kafka source: offset by timestamp (starting/ending)

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels:
      None

      Description

      Currently, the Kafka source provides options to specify a custom offset per topic partition, which determine where reading starts and where it stops.
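For reference, the existing per-partition options take a JSON string keyed by topic and partition. The sketch below only builds those strings; the topic name, offsets, and the `offsets_json` helper are illustrative assumptions, and the commented-out read assumes a SparkSession named `spark` and a reachable broker.

```python
import json

# Sentinel values in the Kafka source's offset JSON: -2 = earliest, -1 = latest.
EARLIEST = -2
LATEST = -1


def offsets_json(offsets):
    """Serialize {topic: {partition: offset}} into the JSON string the options expect.

    Hypothetical helper for illustration; it is not part of Spark.
    """
    return json.dumps(
        {topic: {str(p): o for p, o in parts.items()} for topic, parts in offsets.items()},
        sort_keys=True,
    )


starting = offsets_json({"topicA": {0: 23, 1: EARLIEST}})
ending = offsets_json({"topicA": {0: 50, 1: LATEST}})

# With a SparkSession `spark` and a broker at host:9092 (both assumed), a batch read
# would pass the strings like this:
#
# df = (spark.read.format("kafka")
#       .option("kafka.bootstrap.servers", "host:9092")
#       .option("subscribe", "topicA")
#       .option("startingOffsets", starting)
#       .option("endingOffsets", ending)
#       .load())
```

Every partition has to be listed explicitly, which is the per-partition granularity this issue wants to avoid for timestamps.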

      I'd like to introduce new options that specify a timestamp per topic, so that the source looks up the offset matching that timestamp and starts or stops reading there. The timestamp is per topic rather than per topic partition: per-partition support is possible, but we would be unlikely to set a different timestamp for each partition.
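A minimal sketch of what the lookup could mean, assuming the same rule as Kafka's `KafkaConsumer.offsetsForTimes` (the earliest offset whose timestamp is greater than or equal to the requested one, or nothing if no such record exists). The in-memory records, function names, and the assumption that timestamps increase with offset within a partition are all illustrative; this is not the proposed implementation.

```python
from bisect import bisect_left


def offset_for_timestamp(records, timestamp_ms):
    """Return the earliest offset whose timestamp is >= timestamp_ms, or None.

    `records` is a list of (offset, timestamp_ms) pairs for one partition,
    assumed to be sorted by timestamp.
    """
    timestamps = [ts for _, ts in records]
    i = bisect_left(timestamps, timestamp_ms)
    return records[i][0] if i < len(records) else None


def offsets_for_topic(partitions, timestamp_ms):
    """Apply one per-topic timestamp to every partition of that topic."""
    return {p: offset_for_timestamp(recs, timestamp_ms) for p, recs in partitions.items()}


# Two partitions of a hypothetical topic, as (offset, timestamp_ms) pairs.
topic_a = {
    0: [(10, 1000), (11, 2000), (12, 3000)],
    1: [(5, 1500), (6, 2500)],
}

resolved = offsets_for_topic(topic_a, 2000)
```

Here a single timestamp for the topic resolves to a different offset in each partition, so the user never has to enumerate partitions.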

      The new options would behave much like the existing ones. For example, in a streaming query the ending timestamp option would not be valid, and the starting timestamp option would only take effect when the query first starts; if the query restores from a checkpoint, the option would have no effect.

      The new timestamp options would take precedence over the offset options.
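The rules described above (no ending timestamp in streaming, checkpoint wins on restore, timestamp beats offset) could be sketched as a small resolution function. The option names `startingTimestamp` and `endingTimestamp` are hypothetical, since the issue does not name the new options, and `resolve_start` is an illustration rather than Spark code.

```python
# Hypothetical option names; the issue does not specify them.
START_TS = "startingTimestamp"
END_TS = "endingTimestamp"


def resolve_start(options, is_streaming, checkpoint_offsets=None):
    """Return a (source, value) pair describing where reading would start."""
    # The ending timestamp would be rejected for streaming queries.
    if is_streaming and END_TS in options:
        raise ValueError(f"{END_TS} is not valid in a streaming query")
    # A restored streaming query ignores the starting options entirely.
    if is_streaming and checkpoint_offsets is not None:
        return ("checkpoint", checkpoint_offsets)
    # The timestamp option takes precedence over the offset option.
    if START_TS in options:
        return ("timestamp", options[START_TS])
    if "startingOffsets" in options:
        return ("offsets", options["startingOffsets"])
    return ("default", None)


opts = {START_TS: 2000, "startingOffsets": "earliest"}
fresh = resolve_start(opts, is_streaming=True)
restored = resolve_start(opts, is_streaming=True, checkpoint_offsets={"topicA": {"0": 11}})
```

With both options set, a fresh query starts from the timestamp, while a restored query continues from its checkpointed offsets.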

              People

              • Assignee: Unassigned
              • Reporter: kabhwan Jungtaek Lim