  1. Samza
  2. SAMZA-882

Detect partition count changes in input streams

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.0
    • Fix Version/s: 0.10.1
    • Component/s: None
    • Labels: None

      Description

      This is a known issue: any change in the partition count of an upstream input stream affects the Samza job, which then needs to be restarted. In such scenarios, we experience data loss or incorrect processing because the application logic depends on the partitioning strategy. The problem is made worse by the fact that we don't have a good mechanism to detect such a change.

      As a first step towards detection, I propose that we modify the stream metadata cache maintained in Samza so that, when there is a change in partition count, we increment a gauge metric. This way we can at least attach a hook to monitor when this happens and take the necessary actions.
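
      A minimal sketch of the detection idea, assuming nothing about the actual patch: the class name PartitionCountMonitor and its methods are hypothetical and are not Samza APIs, and a plain AtomicInteger stands in for the gauge metric. It remembers the last partition count seen per stream and bumps the gauge whenever a refreshed count differs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not the stream metadata cache from the patch.
class PartitionCountMonitor {
    // Last partition count observed for each stream name.
    private final Map<String, Integer> lastSeen = new ConcurrentHashMap<>();
    // Stand-in for a gauge metric: number of partition count changes per stream.
    private final Map<String, AtomicInteger> changeGauges = new ConcurrentHashMap<>();

    // Records a freshly fetched partition count.
    // Returns true only if a previous count existed and differs from the new one.
    boolean update(String stream, int partitionCount) {
        Integer previous = lastSeen.put(stream, partitionCount);
        if (previous != null && previous != partitionCount) {
            changeGauges.computeIfAbsent(stream, s -> new AtomicInteger()).incrementAndGet();
            return true;
        }
        return false;
    }

    // Current gauge value for a stream; 0 if no change has been detected.
    int changeCount(String stream) {
        AtomicInteger gauge = changeGauges.get(stream);
        return gauge == null ? 0 : gauge.get();
    }
}
```

      In this sketch, a monitoring hook would alert when changeCount for any input stream becomes non-zero. The first observation of a stream only records a baseline and is not counted as a change.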

      In the long term, however, we need a better strategy for handling this.

        Attachments

        1. SAMZA-882-1.patch
          35 kB
          Navina Ramesh
        2. SAMZA-882-0.patch
          35 kB
          Navina Ramesh

              People

              • Assignee: Navina Ramesh
              • Reporter: Navina Ramesh
              • Votes: 0
              • Watchers: 4
