Spark / SPARK-16995

TreeNodeException when flat mapping RelationalGroupedDataset created from DataFrame containing a column created with lit/expr


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.0.1, 2.1.0
    • Component/s: SQL
    • Labels: None

    Description

      A TreeNodeException is thrown when executing the following minimal example in Spark 2.0. The crucial detail is that the column q is generated with lit or expr.

      import org.apache.spark.sql.functions.{expr, lit}
      import spark.implicits._
      case class test (x: Int, q: Int)
      
      val d = Seq(1).toDF("x")
      // both of these throw the TreeNodeException
      d.withColumn("q", lit(0)).as[test].groupByKey(_.x).flatMapGroups{case (x, iter) => List()}.show
      d.withColumn("q", expr("0")).as[test].groupByKey(_.x).flatMapGroups{case (x, iter) => List()}.show
      
      // this works fine
      d.withColumn("q", lit(0)).as[test].groupByKey(_.x).count()
      

      The exception is: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Binding attribute, tree: x#5

      A possible workaround is to write the DataFrame to disk and read it back before grouping and mapping.
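
      The workaround can be sketched roughly as follows. This is a minimal illustration, not a verified fix: the Parquet format, the temporary directory, the local master setting, and the names Test and WorkaroundSketch are assumptions made for the example and are not part of the original report.

```scala
import java.nio.file.Files

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.lit

// Top-level case class so that Spark can derive an encoder for it.
// (Named Test here; the report uses a lowercase `test`.)
case class Test(x: Int, q: Int)

object WorkaroundSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")                 // assumption: local run for illustration
      .appName("SPARK-16995-workaround")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical scratch location; any path Spark can write to would do.
    val path = Files.createTempDirectory("spark-16995").resolve("d.parquet").toString

    // Build the DataFrame with the literal column, as in the report.
    val d = Seq(1).toDF("x").withColumn("q", lit(0))

    // Write to disk and read back, so that q is a stored column
    // rather than a literal expression in the query plan.
    d.write.parquet(path)
    val reloaded = spark.read.parquet(path).as[Test]

    // Group and flat-map on the reloaded Dataset.
    reloaded
      .groupByKey(_.x)
      .flatMapGroups { case (key, rows) => List.empty[Int] }
      .show()

    spark.stop()
  }
}
```

      The round trip adds I/O cost, so it is only a stopgap on the affected version; the issue is marked as fixed in 2.0.1 and 2.1.0.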

          People

            Assignee: L. C. Hsieh (viirya)
            Reporter: Cédric Perriard (cperriard)
            Votes: 0
            Watchers: 3
