Clear and recreate intermediate and metadata streams for batch processing

    Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      For each run of a batch application, we need to clear the internal streams from the previous run and recreate new ones.

          Activity

          githubbot ASF GitHub Bot added a comment -

          GitHub user xinyuiscool opened a pull request:

          https://github.com/apache/samza/pull/297

          SAMZA-1417: Clear and recreate intermediate and metadata streams for batch processing

          For each run of a batch application, we need to clear the internal streams from the previous run and recreate new ones. This patch introduces the following:
          1) An isBatch flag on StreamSpec.
          2) An app.mode setting (BATCH/STREAM) in the application config.
          3) An app.runId setting, which is used to generate the internal topics for each run.

          Generation of run.id is not addressed in this PR; a separate patch will resolve it for both YARN and standalone. For now, this patch only works on YARN.
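
As an illustration only (not taken from the patch), the idea of deriving per-run internal topic names from the config could be sketched as below. The config keys app.mode and app.runId come from the description above; the class name, the method names, the "-" separator, and the default of STREAM when app.mode is absent are all assumptions made for this sketch, not Samza's actual API or naming scheme.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Samza: shows one way a run id could be
// folded into internal stream names so each batch run gets fresh topics.
class BatchStreamNaming {
    enum Mode { BATCH, STREAM }

    // Reads app.mode from the config. Defaulting to STREAM is an assumption.
    static Mode modeOf(Map<String, String> config) {
        String raw = config.getOrDefault("app.mode", "STREAM");
        return Mode.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }

    // In BATCH mode, appends app.runId to the base name (assumed scheme);
    // in STREAM mode, returns the base name unchanged so topics are reused.
    static String internalStreamName(Map<String, String> config, String baseName) {
        if (modeOf(config) != Mode.BATCH) {
            return baseName;
        }
        String runId = config.get("app.runId");
        if (runId == null || runId.isEmpty()) {
            throw new IllegalStateException("app.runId is required in BATCH mode");
        }
        return baseName + "-" + runId;
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("app.mode", "BATCH");
        config.put("app.runId", "run42"); // illustrative value
        System.out.println(internalStreamName(config, "partition_by"));
    }
}
```

Under this assumed scheme, two runs with different run ids never share an intermediate topic, so leftover data from a previous run cannot leak into the next one, while a streaming job keeps its stable topic names across restarts.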

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/xinyuiscool/samza SAMZA-1417

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/samza/pull/297.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #297




            People

            • Assignee: Xinyu Liu (xinyu)
            • Reporter: Xinyu Liu (xinyu)
            • Votes: 0
            • Watchers: 2
