Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-20567

Failure to bind when using explode and collect_set in streaming

Rank to TopRank to BottomAttach filesAttach ScreenshotBulk Copy AttachmentsBulk Move AttachmentsVotersWatch issueWatchersCreate sub-taskConvert to sub-taskLinkCloneLabelsUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Critical
    • Resolution: Fixed
    • 2.2.0
    • 2.2.0
    • SQL
    • None

    Description

      Here is a small test case:

        test("count distinct") {
          val inputData = MemoryStream[(Int, Seq[Int])]
      
          val aggregated =
            inputData.toDF()
              .select($"*", explode($"_2") as 'value)
              .groupBy($"_1")
              .agg(size(collect_set($"value")))
              .as[(Int, Int)]
      
          testStream(aggregated, Update)(
            AddData(inputData, (1, Seq(1, 2))),
            CheckLastBatch((1, 2))
          )
        }
      

      Attachments

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            marmbrus Michael Armbrust
            marmbrus Michael Armbrust
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment