[SPARK-19317] UnsupportedOperationException: empty.reduceLeft in LinearSeqOptimized


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 2.1.0
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      I wish I had a simpler reproducible case to give, but I got the exception below while selecting null values in one of the columns of a DataFrame.
      My client code that failed was
      df.filter(filterExp).count()
      where the filter expression was something like "someCall.isin() || someCall.isNull". Based on the stack trace, I think the problem may be the isin() in the expression, since there are no values to match in it.
      There were 412 nulls out of 716,000 total rows for the column being filtered.
      The exception seems to indicate that Spark is trying to do reduceLeft on an empty list, but the dataset is not empty.

      java.lang.UnsupportedOperationException: empty.reduceLeft
        scala.collection.LinearSeqOptimized$class.reduceLeft(LinearSeqOptimized.scala:137)
        scala.collection.immutable.List.reduceLeft(List.scala:84)
        scala.collection.TraversableOnce$class.reduce(TraversableOnce.scala:208)
        scala.collection.AbstractTraversable.reduce(Traversable.scala:104)
        org.apache.spark.sql.execution.columnar.InMemoryTableScanExec$$anonfun$1.applyOrElse(InMemoryTableScanExec.scala:90)
        org.apache.spark.sql.execution.columnar.InMemoryTableScanExec$$anonfun$1.applyOrElse(InMemoryTableScanExec.scala:54)
        scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
        org.apache.spark.sql.execution.columnar.InMemoryTableScanExec$$anonfun$1.applyOrElse(InMemoryTableScanExec.scala:61)
        org.apache.spark.sql.execution.columnar.InMemoryTableScanExec$$anonfun$1.applyOrElse(InMemoryTableScanExec.scala:54)
        scala.PartialFunction$Lifted.apply(PartialFunction.scala:223)
        scala.PartialFunction$Lifted.apply(PartialFunction.scala:219)
        org.apache.spark.sql.execution.columnar.InMemoryTableScanExec$$anonfun$2.apply(InMemoryTableScanExec.scala:95)
        org.apache.spark.sql.execution.columnar.InMemoryTableScanExec$$anonfun$2.apply(InMemoryTableScanExec.scala:94)
        scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
        scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
        scala.collection.immutable.List.foreach(List.scala:381)
        scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
        scala.collection.immutable.List.flatMap(List.scala:344)
        org.apache.spark.sql.execution.columnar.InMemoryTableScanExec.<init>(InMemoryTableScanExec.scala:94)
        org.apache.spark.sql.execution.SparkStrategies$InMemoryScans$$anonfun$6.apply(SparkStrategies.scala:306)
        org.apache.spark.sql.execution.SparkStrategies$InMemoryScans$$anonfun$6.apply(SparkStrategies.scala:306)
        org.apache.spark.sql.execution.SparkPlanner.pruneFilterProject(SparkPlanner.scala:96)
        org.apache.spark.sql.execution.SparkStrategies$InMemoryScans$.apply(SparkStrategies.scala:302)
        org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:62)
        org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:62)
        scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
        scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
        scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
        org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:92)
        org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:77)
        org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:74)
        scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
        scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
        scala.collection.Iterator$class.foreach(Iterator.scala:893)
        scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
        scala.collection.AbstractIterator.foldLeft(Iterator.scala:1336)
        org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:74)
        org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:66)
        scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
        scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
        org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:92)
        org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:77)
        org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:74)
        scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
        scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
        scala.collection.Iterator$class.foreach(Iterator.scala:893)
        scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
        scala.collection.AbstractIterator.foldLeft(Iterator.scala:1336)
        org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:74)
        org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:66)
        scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
        scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
        org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:92)
        org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:79)
        org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:75)
        org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:84)
        org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:84)
        org.apache.spark.sql.Dataset.withCallback(Dataset.scala:2774)
        org.apache.spark.sql.Dataset.count(Dataset.scala:2404)
        mypackage.Selection(Selection.scala:34)
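
      For context, here is a minimal sketch of the shape of the failing query. The column name and toy data are made up for illustration (the real job had 412 nulls out of roughly 716,000 rows), and this sketch has not been verified to reproduce the planner failure on 2.1.0:

      import org.apache.spark.sql.SparkSession
      import org.apache.spark.sql.functions.col

      object EmptyIsinSketch {
        def main(args: Array[String]): Unit = {
          val spark = SparkSession.builder()
            .appName("SPARK-19317 sketch")
            .master("local[*]")
            .getOrCreate()
          import spark.implicits._

          // Toy stand-in for the real dataset: a single nullable string column.
          val df = Seq("a", "b", null).toDF("someCol").cache()
          df.count() // materialize the cached (InMemoryRelation) plan

          // Filter shape from the report: an isin() with no values OR'd with isNull.
          // Planning the scan over the cached relation is where the
          // empty.reduceLeft was reported to surface.
          val filtered = df.filter(col("someCol").isin() || col("someCol").isNull)
          println(filtered.count())

          spark.stop()
        }
      }

      If the empty isin() is indeed the trigger, a possible client-side workaround is to drop the isin() clause (or substitute lit(false)) whenever the value list is empty, so the cached-scan planner never sees an In expression with no values.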
      


              People

                Assignee: Unassigned
                Reporter: Barry Becker (barrybecker4)
                Votes: 1
                Watchers: 2
