Uploaded image for project: 'Samza'
  1. Samza
  2. SAMZA-662

Samza auto-creates changelog stream without sufficient partitions when container number > 1

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 0.9.0
    • 0.9.1
    • container
    • None

    Description

      We currently provide auto-create for changelog streams. However, when there are more than 1 containers, it is possible that Samza creates a changelog stream with insufficient partitions.

      Reason:
      assume we have an input stream with 3 partitions and then we assign 3 containers for this job. According to the JobCoordinator, we will get:

      Container(Model) InputStream Partition Changelog Partition
      0 0 0
      1 1 1
      2 2 2

      If Container 0 is brought up first, it calls

          val maxChangeLogStreamPartitions = containerModel.getTasks.values
                  .max(Ordering.by { task:TaskModel => task.getChangelogPartition.getPartitionId })
                  .getChangelogPartition.getPartitionId + 1
      

      in SamzaContainer.

      The maxChangeLogStreamPartition is 1. So we will auto-create a changelog stream with only 1 partitions.

      Similarly, if the Container 2 is brought up first, we will get a stream with 2 partitions.

      Attachments

        1. SAMZA-662-0.9.1-branch.patch
          3 kB
          Jakob Homan
        2. SAMZA-662.v1.patch
          3 kB
          Guozhang Wang

        Issue Links

          Activity

            People

              guozhang Guozhang Wang
              closeuris Yan Fang
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: