SPARK-16425

SparkR summary() fails on column of type logical


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.6.1
    • Fix Version/s: 2.0.0
    • Component/s: SparkR, SQL
    • Labels: None
    • Environment: Databricks.com

      Description

      I created a DataFrame. I added a logical column to the DataFrame using:
      sdfAdults <- withColumn(sdfAdults, "IsGT50K", sdfAdults$gt50==" <=50K")
      The resulting column was reported using str() as being of type logical, with values TRUE and FALSE.

      I subsequently ran the command:
      summary(sdfAdults)
      The command failed, reporting that the mean could not be calculated on a column of type logical:

      Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) :
      org.apache.spark.sql.AnalysisException: cannot resolve 'avg(IsGT50K)' due to data type mismatch: function average requires numeric types, not BooleanType;
      at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
      at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:65)
      at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:57)
      at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
      at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
      at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
      at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:334)
      at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
      at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
      at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:281)
      at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
      at scala.collection.Iterator$class.foreach(Iterator.scala:727)
      at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
      at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
      at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
      at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
      at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
      at scala.collection.AbstractIterator.to(Iterator.scala:1157)
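
      The trace shows the failure coming from avg() being applied to a BooleanType column. One possible workaround on affected versions, sketched here and not taken from the original report, is to cast the comparison result to integer so the new column is numeric. The sketch assumes a Spark 1.6-style session in which sqlContext already exists (as on Databricks); the sample rows and the name localAdults are made up for illustration.

```r
library(SparkR)

# Assumption: `sqlContext` is already defined by the environment
# (Spark 1.6-style API). The rows below are illustrative only.
localAdults <- data.frame(age = c(39, 50, 38),
                          gt50 = c(" <=50K", " >50K", " <=50K"),
                          stringsAsFactors = FALSE)
sdfAdults <- createDataFrame(sqlContext, localAdults)

# Cast the logical comparison to integer (TRUE -> 1, FALSE -> 0),
# so the column is numeric instead of BooleanType.
sdfAdults <- withColumn(sdfAdults, "IsGT50K",
                        cast(sdfAdults$gt50 == " <=50K", "integer"))

# summary() should no longer hit the BooleanType mismatch in avg().
collect(summary(sdfAdults))
```

      With this cast, the mean of IsGT50K is the fraction of rows where the comparison held. The trade-off is that the column no longer reports as logical in str().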

            People

            • Assignee: Dongjoon Hyun (dongjoon)
            • Reporter: Neil Dewar (neil@dewar-us.com)
