Spark / SPARK-10073

Python withColumn for existing column name not consistent with Scala

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.5.0
    • Fix Version/s: 1.5.0
    • Component/s: SQL
    • Labels: None

    Description

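      The code below works in Scala, where withColumn on an existing column name replaces the old column with the new one. In Python, selecting the column afterwards raises an AnalysisException: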

      from pyspark.sql import Row
      df = sc.parallelize([Row(a=1)]).toDF()
      df.withColumn("a", df.a).select("a")
      ---------------------------------------------------------------------------
      AnalysisException                         Traceback (most recent call last)
      <ipython-input-4-d5a4f4132506> in <module>()
            1 from pyspark.sql import Row
            2 df = sc.parallelize([Row(a=1)]).toDF()
      ----> 3 df.withColumn("a", df.a).select("a")
      
      /home/ubuntu/databricks/spark/python/pyspark/sql/dataframe.py in select(self, *cols)
          764         [Row(name=u'Alice', age=12), Row(name=u'Bob', age=15)]
          765         """
      --> 766         jdf = self._jdf.select(self._jcols(*cols))
          767         return DataFrame(jdf, self.sql_ctx)
          768 
      
      /home/ubuntu/databricks/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py in __call__(self, *args)
          536         answer = self.gateway_client.send_command(command)
          537         return_value = get_return_value(answer, self.gateway_client,
      --> 538                 self.target_id, self.name)
          539 
          540         for temp_arg in temp_args:
      
      /home/ubuntu/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
           38             s = e.java_exception.toString()
           39             if s.startswith('org.apache.spark.sql.AnalysisException: '):
      ---> 40                 raise AnalysisException(s.split(': ', 1)[1])
           41             if s.startswith('java.lang.IllegalArgumentException: '):
           42                 raise IllegalArgumentException(s.split(': ', 1)[1])
      
      AnalysisException: Reference 'a' is ambiguous, could be: a#894L, a#895L.;
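
The error message lists two attributes named a (a#894L and a#895L), which suggests the Python call added a second column instead of replacing the first. The following is a minimal pure-Python sketch of that difference, not Spark's implementation: a schema is modeled as a list of (name, id) pairs, and new_column, with_column_append, with_column_replace, and resolve are hypothetical helpers introduced only for illustration.

```python
from itertools import count

# Hypothetical stand-in for Spark's expression IDs (the "#894L" suffixes).
_ids = count(1)


def new_column(name):
    """Create a (name, id) pair; every new column gets a fresh id."""
    return (name, next(_ids))


def with_column_append(columns, col):
    """Always append, even when the name already exists.

    Assumed model of the behavior reported for Python in this issue.
    """
    return columns + [col]


def with_column_replace(columns, col):
    """Replace a same-named column in place, otherwise append.

    Model of the Scala behavior described in this issue.
    """
    name = col[0]
    if any(existing[0] == name for existing in columns):
        return [col if existing[0] == name else existing for existing in columns]
    return columns + [col]


def resolve(columns, name):
    """Look up a column by name; fail if the name matches more than one column."""
    matches = [c for c in columns if c[0] == name]
    if not matches:
        raise KeyError(name)
    if len(matches) > 1:
        candidates = ", ".join("{}#{}".format(n, i) for n, i in matches)
        raise ValueError(
            "Reference '{}' is ambiguous, could be: {}".format(name, candidates)
        )
    return matches[0]


schema = [new_column("a")]
appended = with_column_append(schema, new_column("a"))  # two columns named "a"
replaced = with_column_replace(schema, new_column("a"))  # still one column
```

In this model, resolve(replaced, "a") succeeds, while resolve(appended, "a") raises an ambiguity error analogous to the AnalysisException above.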
      

          People

            Assignee: Davies Liu (davies)
            Reporter: Michael Armbrust (marmbrus)
            Votes: 1
            Watchers: 3
