Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.2, 2.1.0
    • Component/s: Structured Streaming
    • Labels: None
    • Target Version/s:

      Description

      Right now you can only run a streaming query starting from either the earliest or the latest offsets available at the moment the query is started. Starting from the earliest offsets can mean processing a lot of data. It would be nice to be able to do the following:

      • seek to user-specified offsets for manually specified topicpartitions

      Currently agreed-on plan:

      Mutually exclusive subscription options (only assign is new to this ticket)

      .option("subscribe","topicFoo,topicBar")
      .option("subscribePattern","topic.*")
      .option("assign","""{"topicFoo": [0, 1],"topicBar": [0, 1]}""")
      

      where assign can only be specified in this form (a JSON map from topic to a list of partition numbers), with no inline offsets
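
The mutual exclusivity described above can be sketched as a small check over the option map. This is only an illustration: `SubscriptionOptions` and `chosenStrategy` are hypothetical names, not part of Spark, and the sketch matches option keys exactly as written.

```scala
// Hypothetical helper, not part of Spark: checks that exactly one of the
// three subscription options from the plan above is present.
object SubscriptionOptions {
  // Option keys taken from the ticket text.
  val StrategyKeys: Set[String] = Set("subscribe", "subscribePattern", "assign")

  // Right(key) when exactly one strategy is set, Left(message) otherwise.
  def chosenStrategy(options: Map[String, String]): Either[String, String] = {
    val present = options.keySet.intersect(StrategyKeys).toList.sorted
    present match {
      case key :: Nil =>
        Right(key)
      case Nil =>
        Left("one of subscribe, subscribePattern, assign is required")
      case many =>
        val found = many.mkString(", ")
        Left("only one subscription option may be set, found: " + found)
    }
  }
}
```

For example, passing both "subscribe" and "assign" yields a Left, while passing only "assign" yields Right("assign").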

      Single starting position option with three mutually exclusive types of value

      .option("startingOffsets", "earliest" | "latest" | """{"topicFoo": {"0": 1234, "1": -2}, "topicBar":{"0": -1}}""")
      

      When startingOffsets is given as JSON, the query fails if any topicpartition in the assignment doesn't have an offset. In the JSON, -2 refers to the earliest offset and -1 to the latest.
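
The completeness rule for JSON starting offsets can be sketched as follows. To stay dependency-free, the sketch works on already-parsed maps rather than JSON strings. `StartingOffsetsCheck`, `Partition`, `missingOffsets` and `validate` are hypothetical names for illustration, not Spark's implementation.

```scala
// Hypothetical sketch, not Spark's code: verifies that every assigned
// topicpartition has a starting offset.
object StartingOffsetsCheck {
  // Sentinel offsets as used in the JSON example: -2 earliest, -1 latest.
  val Earliest: Long = -2L
  val Latest: Long = -1L

  // Stand-in for a Kafka topicpartition (assumption: topic name plus partition id).
  final case class Partition(topic: String, id: Int)

  // Expand an assign-style map (topic -> partition ids) into a set of partitions.
  def assigned(assign: Map[String, Seq[Int]]): Set[Partition] =
    assign.flatMap { case (topic, ids) => ids.map(Partition(topic, _)) }.toSet

  // Partitions that are assigned but have no starting offset.
  def missingOffsets(
      assign: Map[String, Seq[Int]],
      offsets: Map[Partition, Long]): Set[Partition] =
    assigned(assign).diff(offsets.keySet)

  // Right(offsets) when every assigned partition is covered, Left(message) otherwise.
  def validate(
      assign: Map[String, Seq[Int]],
      offsets: Map[Partition, Long]): Either[String, Map[Partition, Long]] = {
    val missing = missingOffsets(assign, offsets)
    if (missing.isEmpty) Right(offsets)
    else {
      val names = missing.toList.map(p => s"${p.topic}-${p.id}").sorted.mkString(", ")
      Left("no starting offset for: " + names)
    }
  }
}
```

Under this sketch, assigning partitions 0 and 1 of both topicFoo and topicBar while supplying offsets only for topicFoo 0, topicFoo 1 and topicBar 0 is rejected, because topicBar partition 1 has no offset.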

            People

            • Assignee: Cody Koeninger (cody@koeninger.org)
            • Reporter: Michael Armbrust (marmbrus)
            • Votes: 0
            • Watchers: 6
