SPARK-29890: Unable to fill na with 0 with duplicate columns


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.2, 2.1.3, 2.2.3, 2.3.3, 2.4.3
    • Fix Version/s: 2.4.5, 3.0.0
    • Component/s: Spark Shell
    • Labels: None

    Description

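      Trying to fill NA values with 0 on a join result that contains duplicate column names (abc appears twice) fails with an AnalysisException: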

      scala> :paste
      // Entering paste mode (ctrl-D to finish)
      val parent = spark.sparkContext.parallelize(Seq((1,2),(3,4),(5,6))).toDF("nums", "abc")
      val c1 = parent.filter(lit(true))
      val c2 = parent.filter(lit(true))
      c1.join(c2, Seq("nums"), "left")
      .na.fill(0).show
      9/11/14 04:24:24 ERROR org.apache.hadoop.security.JniBasedUnixGroupsMapping: error looking up the name of group 820818257: No such file or directory
      org.apache.spark.sql.AnalysisException: Reference 'abc' is ambiguous, could be: abc, abc.;
        at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolve(LogicalPlan.scala:213)
        at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveQuoted(LogicalPlan.scala:117)
        at org.apache.spark.sql.Dataset.resolve(Dataset.scala:220)
        at org.apache.spark.sql.Dataset.col(Dataset.scala:1246)
        at org.apache.spark.sql.DataFrameNaFunctions.org$apache$spark$sql$DataFrameNaFunctions$$fillCol(DataFrameNaFunctions.scala:443)
        at org.apache.spark.sql.DataFrameNaFunctions$$anonfun$7.apply(DataFrameNaFunctions.scala:500)
        at org.apache.spark.sql.DataFrameNaFunctions$$anonfun$7.apply(DataFrameNaFunctions.scala:492)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
        at org.apache.spark.sql.DataFrameNaFunctions.fillValue(DataFrameNaFunctions.scala:492)
        at org.apache.spark.sql.DataFrameNaFunctions.fill(DataFrameNaFunctions.scala:171)
        at org.apache.spark.sql.DataFrameNaFunctions.fill(DataFrameNaFunctions.scala:155)
        at org.apache.spark.sql.DataFrameNaFunctions.fill(DataFrameNaFunctions.scala:134)
        ... 54 elided
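
One possible workaround on affected versions is to rename one of the duplicate columns before the join, so that na.fill can resolve every column by name. The following is a minimal sketch under that assumption, not the fix that was applied in Spark itself. The names FillNaWorkaround, joinAndFill, and abc_right are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.lit

object FillNaWorkaround {
  // Hypothetical helper: rebuilds the join from the report, but renames the
  // right-hand "abc" column first so every output column name is unique.
  def joinAndFill(spark: SparkSession): DataFrame = {
    import spark.implicits._

    val parent = Seq((1, 2), (3, 4), (5, 6)).toDF("nums", "abc")
    val c1 = parent.filter(lit(true))
    // Assumption: renaming the right-hand column is acceptable downstream.
    val c2 = parent.filter(lit(true)).withColumnRenamed("abc", "abc_right")

    // Output columns are nums, abc, abc_right, so na.fill resolves each by name.
    c1.join(c2, Seq("nums"), "left").na.fill(0)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-29890-workaround")
      .getOrCreate()
    try joinAndFill(spark).show()
    finally spark.stop()
  }
}
```

In spark-shell, where spark already exists, the body of joinAndFill can be pasted directly. The trade-off is that the right-hand column carries a different name afterwards, which downstream code has to account for.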

       


          People

            Assignee: Terry Kim (imback82)
            Reporter: sandeshyapuram
            Votes: 0
            Watchers: 4
