Flink / FLINK-17578

Union of 2 SideOutputs behaviour incorrect

Details

    Description

      Calling union() on two DataStreams that are both side outputs of the same operator produces incorrect results: the elements of one input are emitted twice and the elements of the other input are missing.

      See example code with comments demonstrating the issue:

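        import org.apache.flink.streaming.api.functions.ProcessFunction
        import org.apache.flink.streaming.api.scala._
        import org.apache.flink.util.Collector

        def main(args: Array[String]): Unit = {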
          val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
      
          val input = env.fromElements(1, 2, 3, 4)
      
          val oddTag = OutputTag[Int]("odds")
          val evenTag = OutputTag[Int]("even")
      
          val all =
            input.process {
              (value: Int, ctx: ProcessFunction[Int, Int]#Context, out: Collector[Int]) => {
                if (value % 2 != 0)
                  ctx.output(oddTag, value)
                else
                  ctx.output(evenTag, value)
              }
            }
      
          val odds = all.getSideOutput(oddTag)
          val evens = all.getSideOutput(evenTag)
      
          // Printed separately, each side output is correct
          //
          odds.print                  // -> 1, 3
          evens.print                 // -> 2, 4
      
          // The union of the two side outputs prints incorrectly (expected 1, 2, 3, 4 in both cases) - BUG?
          //
          odds.union(evens).print     // -> 2, 2, 4, 4
          evens.union(odds).print     // -> 1, 1, 3, 3
      
          // Another test to understand normal behaviour of .union, using normal inputs
          //
          val odds1 = env.fromElements(1, 3)
          val evens1 = env.fromElements(2, 4)
      
          // Union of 2 normal inputs
          //
          odds1.union(evens1).print   // -> 1, 2, 3, 4
      
          // Union of a normal input and a side output
          //
          odds.union(evens1).print    // -> 1, 2, 3, 4
          evens1.union(odds).print    // -> 1, 2, 3, 4
      
          //
          // So union() seems to behave incorrectly only when both inputs are side outputs... BUG?
      
          env.execute("Test job")
        }
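
      Since the union of a normal input with a side output prints correctly above, one thing that might be worth trying is to put an identity map between each side output and the union, so that union() no longer sees two side outputs of the same operator directly. The sketch below is an untested assumption, not a confirmed fix: the object name SideOutputUnionWorkaround and the job name are hypothetical, and the ProcessFunction is written as an explicit anonymous class only so that the snippet does not depend on lambda conversion.

```scala
import org.apache.flink.streaming.api.functions.ProcessFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

// Hypothetical object name, used only for this sketch.
object SideOutputUnionWorkaround {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    val oddTag = OutputTag[Int]("odds")
    val evenTag = OutputTag[Int]("even")

    // Same routing as in the report: odd values go to one side output, even values to the other.
    val all = env.fromElements(1, 2, 3, 4).process(new ProcessFunction[Int, Int] {
      override def processElement(
          value: Int,
          ctx: ProcessFunction[Int, Int]#Context,
          out: Collector[Int]): Unit =
        if (value % 2 != 0) ctx.output(oddTag, value) else ctx.output(evenTag, value)
    })

    // Assumption: the identity map turns each side output into an ordinary
    // operator output before the two streams are unioned.
    val odds = all.getSideOutput(oddTag).map((x: Int) => x)
    val evens = all.getSideOutput(evenTag).map((x: Int) => x)

    odds.union(evens).print()

    env.execute("Side output union workaround sketch")
  }
}
```

      If the output still contains duplicated elements with the map in place, the assumption behind this sketch does not hold for the Flink version in use.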
      

          People

            damjad Danish Amjad
            drshade Tom Wells