SPARK-44040

Incorrect result after count distinct

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.3.2, 3.4.0
    • Fix Version/s: 3.3.3, 3.4.1, 3.5.0
    • Component/s: Spark Core
    • Labels: None

    Description

      When I call count after distinct on a Decimal field whose value is null, Spark returns an incorrect result, starting from Spark 3.4.0.
      A minimal example to reproduce (run in spark-shell):

      import org.apache.spark.sql.{DataFrame, Row}
      import org.apache.spark.sql.functions.{lit, sum}
      import org.apache.spark.sql.types.{DecimalType, StringType, StructField, StructType}

      // Two nullable columns; "money" is the Decimal field involved in the bug.
      val schema = StructType(Array(
        StructField("money", DecimalType(38, 6), true),
        StructField("reference_id", StringType, true)
      ))

      // Empty input, so sum("money") evaluates to null in both aggregates below.
      val payDf = spark.createDataFrame(sc.emptyRDD[Row], schema)

      val aggDf = payDf.agg(sum("money").as("money")).withColumn("name", lit("df1"))
      val aggDf1 = payDf.agg(sum("money").as("money")).withColumn("name", lit("df2"))
      val unionDF: DataFrame = aggDf.union(aggDf1)
      unionDF.select("money").distinct.show // returns the correct result
      unionDF.select("money").distinct.count // returns 2 instead of 1
      unionDF.select("money").distinct.count == 1 // returns false

      This code first raises an assertion error and then returns an incorrect count. In Spark 3.2.1 everything works fine and I get the correct result, 1:

      scala> unionDF.select("money").distinct.show // return correct result
      java.lang.AssertionError: assertion failed:
      Decimal$DecimalIsFractional
      while compiling: <console>
      during phase: globalPhase=terminal, enteringPhase=jvm
      library version: version 2.12.17
      compiler version: version 2.12.17
      reconstructed args: -classpath /Users/aleksandrov/.ivy2/jars/org.apache.spark_spark-connect_2.12-3.4.0.jar:/Users/aleksandrov/.ivy2/jars/io.delta_delta-core_2.12-2.4.0.jar:/Users/aleksandrov/.ivy2/jars/io.delta_delta-storage-2.4.0.jar:/Users/aleksandrov/.ivy2/jars/org.spark-project.spark_unused-1.0.0.jar:/Users/aleksandrov/.ivy2/jars/org.antlr_antlr4-runtime-4.9.3.jar -Yrepl-class-based -Yrepl-outdir /private/var/folders/qj/_dn4xbp14jn37qmdk7ylyfwc0000gr/T/spark-f37bb154-75f3-4db7-aea8-3c4363377bd8/repl-350f37a1-1df1-4816-bd62-97929c60a6c1

      last tree to typer: TypeTree(class Byte)
      tree position: line 6 of <console>
      tree tpe: Byte
      symbol: (final abstract) class Byte in package scala
      symbol definition: final abstract class Byte extends (a ClassSymbol)
      symbol package: scala
      symbol owners: class Byte
      call site: constructor $eval in object $eval in package $line19

      == Source file context for tree position ==

      3
      4object $eval {
      5lazyval $result = $line19.$read.INSTANCE.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.res0
      6lazyval $print: root.java.lang.String = {
      7 $line19.$read.INSTANCE.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw
      8
      9""
      at scala.reflect.internal.SymbolTable.throwAssertionError(SymbolTable.scala:185)
      at scala.reflect.internal.Symbols$Symbol.completeInfo(Symbols.scala:1525)
      at scala.reflect.internal.Symbols$Symbol.info(Symbols.scala:1514)
      at scala.reflect.internal.Symbols$Symbol.flatOwnerInfo(Symbols.scala:2353)
      at scala.reflect.internal.Symbols$ClassSymbol.companionModule0(Symbols.scala:3346)
      at scala.reflect.internal.Symbols$ClassSymbol.companionModule(Symbols.scala:3348)
      at scala.reflect.internal.Symbols$ModuleClassSymbol.sourceModule(Symbols.scala:3487)
      at scala.reflect.internal.Symbols.$anonfun$forEachRelevantSymbols$1$adapted(Symbols.scala:3802)
      at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
      at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
      at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)
      at scala.reflect.internal.Symbols.markFlagsCompleted(Symbols.scala:3799)
      at scala.reflect.internal.Symbols.markFlagsCompleted$(Symbols.scala:3805)
      at scala.reflect.internal.SymbolTable.markFlagsCompleted(SymbolTable.scala:28)
      at scala.reflect.internal.pickling.UnPickler$Scan.finishSym$1(UnPickler.scala:324)
      at scala.reflect.internal.pickling.UnPickler$Scan.readSymbol(UnPickler.scala:342)
      at scala.reflect.internal.pickling.UnPickler$Scan.readSymbolRef(UnPickler.scala:645)
      at scala.reflect.internal.pickling.UnPickler$Scan.readType(UnPickler.scala:413)
      at scala.reflect.internal.pickling.UnPickler$Scan.$anonfun$readSymbol$10(UnPickler.scala:357)
      at scala.reflect.internal.pickling.UnPickler$Scan.at(UnPickler.scala:188)
      at scala.reflect.internal.pickling.UnPickler$Scan.readSymbol(UnPickler.scala:357)
      at scala.reflect.internal.pickling.UnPickler$Scan.$anonfun$run$1(UnPickler.scala:96)
      at scala.reflect.internal.pickling.UnPickler$Scan.run(UnPickler.scala:88)
      at scala.reflect.internal.pickling.UnPickler.unpickle(UnPickler.scala:47)
      at scala.tools.nsc.symtab.classfile.ClassfileParser.unpickleOrParseInnerClasses(ClassfileParser.scala:1173)
      at scala.tools.nsc.symtab.classfile.ClassfileParser.parseClass(ClassfileParser.scala:467)
      at scala.tools.nsc.symtab.classfile.ClassfileParser.$anonfun$parse$2(ClassfileParser.scala:160)
      at scala.tools.nsc.symtab.classfile.ClassfileParser.$anonfun$parse$1(ClassfileParser.scala:146)
      at scala.tools.nsc.symtab.classfile.ClassfileParser.parse(ClassfileParser.scala:129)
      at scala.tools.nsc.symtab.SymbolLoaders$ClassfileLoader.doComplete(SymbolLoaders.scala:343)
      at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.complete(SymbolLoaders.scala:250)
      at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.load(SymbolLoaders.scala:269)
      at scala.reflect.internal.Symbols$Symbol.exists(Symbols.scala:1104)
      at scala.reflect.internal.Symbols$Symbol.toOption(Symbols.scala:2609)
      at scala.tools.nsc.interpreter.IMain.translateSimpleResource(IMain.scala:340)
      at scala.tools.nsc.interpreter.IMain$TranslatingClassLoader.findAbstractFile(IMain.scala:354)
      at scala.reflect.internal.util.AbstractFileClassLoader.findResource(AbstractFileClassLoader.scala:76)
      at java.base/java.lang.ClassLoader.getResource(ClassLoader.java:1401)
      at java.base/java.lang.ClassLoader.getResourceAsStream(ClassLoader.java:1737)
      at scala.reflect.internal.util.RichClassLoader$.classAsStream$extension(ScalaClassLoader.scala:89)
      at scala.reflect.internal.util.RichClassLoader$.classBytes$extension(ScalaClassLoader.scala:81)
      at scala.reflect.internal.util.ScalaClassLoader.classBytes(ScalaClassLoader.scala:131)
      at scala.reflect.internal.util.ScalaClassLoader.classBytes$(ScalaClassLoader.scala:131)
      at scala.reflect.internal.util.AbstractFileClassLoader.classBytes(AbstractFileClassLoader.scala:41)
      at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:70)
      at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
      at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:576)
      at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.java:40)
      at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
      at org.codehaus.janino.ClassLoaderIClassLoader.findIClass(ClassLoaderIClassLoader.java:75)
      at org.codehaus.janino.IClassLoader.loadIClass(IClassLoader.java:317)
      at org.codehaus.janino.UnitCompiler.findTypeByName(UnitCompiler.java:8895)
      at org.codehaus.janino.UnitCompiler.reclassifyName(UnitCompiler.java:9115)
      at org.codehaus.janino.UnitCompiler.reclassifyName(UnitCompiler.java:8806)
      at org.codehaus.janino.UnitCompiler.reclassify(UnitCompiler.java:8667)
      at org.codehaus.janino.UnitCompiler.getType2(UnitCompiler.java:7194)
      at org.codehaus.janino.UnitCompiler.access$18100(UnitCompiler.java:236)
      at org.codehaus.janino.UnitCompiler$26.visitAmbiguousName(UnitCompiler.java:6785)
      at org.codehaus.janino.UnitCompiler$26.visitAmbiguousName(UnitCompiler.java:6784)
      at org.codehaus.janino.Java$AmbiguousName.accept(Java.java:4603)
      at org.codehaus.janino.UnitCompiler.getType(UnitCompiler.java:6784)
      at org.codehaus.janino.UnitCompiler.access$15100(UnitCompiler.java:236)
      at org.codehaus.janino.UnitCompiler$25.visitLvalue(UnitCompiler.java:6745)
      at org.codehaus.janino.UnitCompiler$25.visitLvalue(UnitCompiler.java:6742)
      at org.codehaus.janino.Java$Lvalue.accept(Java.java:4528)
      at org.codehaus.janino.UnitCompiler.getType(UnitCompiler.java:6742)
      at org.codehaus.janino.UnitCompiler.access$14400(UnitCompiler.java:236)
      at org.codehaus.janino.UnitCompiler$23.visitRvalue(UnitCompiler.java:6690)
      at org.codehaus.janino.UnitCompiler$23.visitRvalue(UnitCompiler.java:6681)
      at org.codehaus.janino.Java$Rvalue.accept(Java.java:4495)
      at org.codehaus.janino.UnitCompiler.getType(UnitCompiler.java:6681)
      at org.codehaus.janino.UnitCompiler.findIMethod(UnitCompiler.java:9392)
      at org.codehaus.janino.UnitCompiler.getType2(UnitCompiler.java:7486)
      at org.codehaus.janino.UnitCompiler.access$16100(UnitCompiler.java:236)
      at org.codehaus.janino.UnitCompiler$25.visitMethodInvocation(UnitCompiler.java:6756)
      at org.codehaus.janino.UnitCompiler$25.visitMethodInvocation(UnitCompiler.java:6742)
      at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:5470)
      at org.codehaus.janino.UnitCompiler.getType(UnitCompiler.java:6742)
      at org.codehaus.janino.UnitCompiler.findMostSpecificIInvocable(UnitCompiler.java:9590)
      at org.codehaus.janino.UnitCompiler.findIMethod(UnitCompiler.java:9475)
      at org.codehaus.janino.UnitCompiler.findIMethod(UnitCompiler.java:9391)
      at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:5232)
      at org.codehaus.janino.UnitCompiler.access$9300(UnitCompiler.java:236)
      at org.codehaus.janino.UnitCompiler$16.visitMethodInvocation(UnitCompiler.java:4735)
      at org.codehaus.janino.UnitCompiler$16.visitMethodInvocation(UnitCompiler.java:4711)
      at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:5470)
      at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:4711)
      at org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:5854)
      at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:4101)
      at org.codehaus.janino.UnitCompiler.access$6300(UnitCompiler.java:236)
      at org.codehaus.janino.UnitCompiler$13.visitAssignment(UnitCompiler.java:4057)
      at org.codehaus.janino.UnitCompiler$13.visitAssignment(UnitCompiler.java:4040)
      at org.codehaus.janino.Java$Assignment.accept(Java.java:4864)
      at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:4040)
      at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2523)
      at org.codehaus.janino.UnitCompiler.access$1800(UnitCompiler.java:236)
      at org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1580)
      at org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1575)
      at org.codehaus.janino.Java$ExpressionStatement.accept(Java.java:3209)
      at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1575)
      at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1661)
      at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1646)
      at org.codehaus.janino.UnitCompiler.access$1700(UnitCompiler.java:236)
      at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1579)
      at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1575)
      at org.codehaus.janino.Java$Block.accept(Java.java:3115)
      at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1575)
      at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2659)
      at org.codehaus.janino.UnitCompiler.access$1900(UnitCompiler.java:236)
      at org.codehaus.janino.UnitCompiler$6.visitIfStatement(UnitCompiler.java:1581)
      at org.codehaus.janino.UnitCompiler$6.visitIfStatement(UnitCompiler.java:1575)
      at org.codehaus.janino.Java$IfStatement.accept(Java.java:3284)
      at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1575)
      at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1661)
      at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1646)
      at org.codehaus.janino.UnitCompiler.access$1700(UnitCompiler.java:236)
      at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1579)
      at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1575)
      at org.codehaus.janino.Java$Block.accept(Java.java:3115)
      at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1575)
      at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2637)
      at org.codehaus.janino.UnitCompiler.access$1900(UnitCompiler.java:236)
      at org.codehaus.janino.UnitCompiler$6.visitIfStatement(UnitCompiler.java:1581)
      at org.codehaus.janino.UnitCompiler$6.visitIfStatement(UnitCompiler.java:1575)
      at org.codehaus.janino.Java$IfStatement.accept(Java.java:3284)
      at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1575)
      at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1661)
      at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1646)
      at org.codehaus.janino.UnitCompiler.access$1700(UnitCompiler.java:236)
      at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1579)
      at org.codehaus.janino.UnitCompiler$6.visitBlock(UnitCompiler.java:1575)
      at org.codehaus.janino.Java$Block.accept(Java.java:3115)
      at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1575)
      at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2001)
      at org.codehaus.janino.UnitCompiler.access$2200(UnitCompiler.java:236)
      at org.codehaus.janino.UnitCompiler$6.visitWhileStatement(UnitCompiler.java:1584)
      at org.codehaus.janino.UnitCompiler$6.visitWhileStatement(UnitCompiler.java:1575)
      at org.codehaus.janino.Java$WhileStatement.accept(Java.java:3389)
      at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1575)
      at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1661)
      at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:3658)
      at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:3329)
      at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1447)
      at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1420)
      at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:829)
      at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1026)
      at org.codehaus.janino.UnitCompiler.access$700(UnitCompiler.java:236)
      at org.codehaus.janino.UnitCompiler$3.visitMemberClassDeclaration(UnitCompiler.java:425)
      at org.codehaus.janino.UnitCompiler$3.visitMemberClassDeclaration(UnitCompiler.java:418)
      at org.codehaus.janino.Java$MemberClassDeclaration.accept(Java.java:1533)
      at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:418)
      at org.codehaus.janino.UnitCompiler.compileDeclaredMemberTypes(UnitCompiler.java:1397)
      at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:864)
      at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:442)
      at org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:236)
      at org.codehaus.janino.UnitCompiler$3.visitPackageMemberClassDeclaration(UnitCompiler.java:422)
      at org.codehaus.janino.UnitCompiler$3.visitPackageMemberClassDeclaration(UnitCompiler.java:418)
      at org.codehaus.janino.Java$PackageMemberClassDeclaration.accept(Java.java:1688)
      at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:418)
      at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:392)
      at org.codehaus.janino.UnitCompiler.access$000(UnitCompiler.java:236)
      at org.codehaus.janino.UnitCompiler$2.visitCompilationUnit(UnitCompiler.java:363)
      at org.codehaus.janino.UnitCompiler$2.visitCompilationUnit(UnitCompiler.java:361)
      at org.codehaus.janino.Java$CompilationUnit.accept(Java.java:371)
      at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:361)
      at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:264)
      at org.codehaus.janino.ClassBodyEvaluator.cook(ClassBodyEvaluator.java:294)
      at org.codehaus.janino.ClassBodyEvaluator.cook(ClassBodyEvaluator.java:288)
      at org.codehaus.janino.ClassBodyEvaluator.cook(ClassBodyEvaluator.java:267)
      at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:82)
      at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:1496)
      at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1586)
      at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1583)
      at org.sparkproject.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
      at org.sparkproject.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
      at org.sparkproject.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
      at org.sparkproject.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
      at org.sparkproject.guava.cache.LocalCache.get(LocalCache.java:4000)
      at org.sparkproject.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
      at org.sparkproject.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
      at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:1443)
      at org.apache.spark.sql.execution.WholeStageCodegenExec.liftedTree1$1(WholeStageCodegenExec.scala:726)
      at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:725)
      at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:195)
      at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:246)
      at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
      at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:243)
      at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:191)
      at org.apache.spark.sql.execution.UnionExec.$anonfun$doExecute$5(basicPhysicalOperators.scala:699)
      at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
      at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
      at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
      at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
      at scala.collection.TraversableLike.map(TraversableLike.scala:286)
      at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
      at scala.collection.AbstractTraversable.map(Traversable.scala:108)
      at org.apache.spark.sql.execution.UnionExec.doExecute(basicPhysicalOperators.scala:699)
      at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:195)
      at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:246)
      at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
      at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:243)
      at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:191)
      at org.apache.spark.sql.execution.InputAdapter.inputRDD(WholeStageCodegenExec.scala:527)
      at org.apache.spark.sql.execution.InputRDDCodegen.inputRDDs(WholeStageCodegenExec.scala:455)
      at org.apache.spark.sql.execution.InputRDDCodegen.inputRDDs$(WholeStageCodegenExec.scala:454)
      at org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:498)
      at org.apache.spark.sql.execution.aggregate.AggregateCodegenSupport.inputRDDs(AggregateCodegenSupport.scala:89)
      at org.apache.spark.sql.execution.aggregate.AggregateCodegenSupport.inputRDDs$(AggregateCodegenSupport.scala:88)
      at org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:47)
      at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:751)
      at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:195)
      at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:246)
      at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
      at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:243)
      at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:191)
      at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.inputRDD$lzycompute(ShuffleExchangeExec.scala:135)
      at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.inputRDD(ShuffleExchangeExec.scala:135)
      at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.mapOutputStatisticsFuture$lzycompute(ShuffleExchangeExec.scala:140)
      at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.mapOutputStatisticsFuture(ShuffleExchangeExec.scala:139)
      at org.apache.spark.sql.execution.exchange.ShuffleExchangeLike.$anonfun$submitShuffleJob$1(ShuffleExchangeExec.scala:68)
      at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:246)
      at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
      at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:243)
      at org.apache.spark.sql.execution.exchange.ShuffleExchangeLike.submitShuffleJob(ShuffleExchangeExec.scala:68)
      at org.apache.spark.sql.execution.exchange.ShuffleExchangeLike.submitShuffleJob$(ShuffleExchangeExec.scala:67)
      at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.submitShuffleJob(ShuffleExchangeExec.scala:115)
      at org.apache.spark.sql.execution.adaptive.ShuffleQueryStageExec.shuffleFuture$lzycompute(QueryStageExec.scala:181)
      at org.apache.spark.sql.execution.adaptive.ShuffleQueryStageExec.shuffleFuture(QueryStageExec.scala:181)
      at org.apache.spark.sql.execution.adaptive.ShuffleQueryStageExec.doMaterialize(QueryStageExec.scala:183)
      at org.apache.spark.sql.execution.adaptive.QueryStageExec.materialize(QueryStageExec.scala:82)
      at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.$anonfun$getFinalPhysicalPlan$5(AdaptiveSparkPlanExec.scala:266)
      at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.$anonfun$getFinalPhysicalPlan$5$adapted(AdaptiveSparkPlanExec.scala:264)
      at scala.collection.Iterator.foreach(Iterator.scala:943)
      at scala.collection.Iterator.foreach$(Iterator.scala:943)
      at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
      at scala.collection.IterableLike.foreach(IterableLike.scala:74)
      at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
      at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
      at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.$anonfun$getFinalPhysicalPlan$1(AdaptiveSparkPlanExec.scala:264)
      at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
      at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.getFinalPhysicalPlan(AdaptiveSparkPlanExec.scala:236)
      at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.withFinalPlanUpdate(AdaptiveSparkPlanExec.scala:381)
      at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.executeCollect(AdaptiveSparkPlanExec.scala:354)
      at org.apache.spark.sql.Dataset.collectFromPlan(Dataset.scala:4177)
      at org.apache.spark.sql.Dataset.$anonfun$head$1(Dataset.scala:3161)
      at org.apache.spark.sql.Dataset.$anonfun$withAction$2(Dataset.scala:4167)
      at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:526)
      at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:4165)
      at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:118)
      at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:195)
      at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:103)
      at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
      at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
      at org.apache.spark.sql.Dataset.withAction(Dataset.scala:4165)
      at org.apache.spark.sql.Dataset.head(Dataset.scala:3161)
      at org.apache.spark.sql.Dataset.take(Dataset.scala:3382)
      at org.apache.spark.sql.Dataset.getRows(Dataset.scala:284)
      at org.apache.spark.sql.Dataset.showString(Dataset.scala:323)
      at org.apache.spark.sql.Dataset.show(Dataset.scala:809)
      at org.apache.spark.sql.Dataset.show(Dataset.scala:768)
      at org.apache.spark.sql.Dataset.show(Dataset.scala:777)
      at $line19.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:29)
      at $line19.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:33)
      at $line19.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:35)
      at $line19.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:37)
      at $line19.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:39)
      at $line19.$read$$iw$$iw$$iw$$iw$$iw.<init>(<console>:41)
      at $line19.$read$$iw$$iw$$iw$$iw.<init>(<console>:43)
      at $line19.$read$$iw$$iw$$iw.<init>(<console>:45)
      at $line19.$read$$iw$$iw.<init>(<console>:47)
      at $line19.$read$$iw.<init>(<console>:49)
      at $line19.$read.<init>(<console>:51)
      at $line19.$read$.<init>(<console>:55)
      at $line19.$read$.<clinit>(<console>)
      at $line19.$eval$.$print$lzycompute(<console>:7)
      at $line19.$eval$.$print(<console>:6)
      at $line19.$eval.$print(<console>)
      at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.base/java.lang.reflect.Method.invoke(Method.java:566)
      at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:747)
      at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1020)
      at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:568)
      at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:36)
      at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:116)
      at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:41)
      at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:567)
      at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:594)
      at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:564)
      at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:865)
      at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:733)
      at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:435)
      at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:456)
      at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:239)
      at org.apache.spark.repl.Main$.doMain(Main.scala:78)
      at org.apache.spark.repl.Main$.main(Main.scala:58)
      at org.apache.spark.repl.Main.main(Main.scala)
      at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.base/java.lang.reflect.Method.invoke(Method.java:566)
      at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
      at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1020)
      at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
      at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
      at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
      at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1111)
      at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120)
      at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
      error: error while loading Decimal, class file '/Users/aleksandrov/Projects/apache/spark-3.4.0-bin-hadoop3/jars/spark-catalyst_2.12-3.4.0.jar(org/apache/spark/sql/types/Decimal.class)' is broken
      (class java.lang.RuntimeException/error reading Scala signature of Decimal.class: assertion failed:
      Decimal$DecimalIsFractional
      while compiling: <console>
      during phase: globalPhase=terminal, enteringPhase=jvm
      library version: version 2.12.17
      compiler version: version 2.12.17
      reconstructed args: -classpath /Users/aleksandrov/.ivy2/jars/org.apache.spark_spark-connect_2.12-3.4.0.jar:/Users/aleksandrov/.ivy2/jars/io.delta_delta-core_2.12-2.4.0.jar:/Users/aleksandrov/.ivy2/jars/io.delta_delta-storage-2.4.0.jar:/Users/aleksandrov/.ivy2/jars/org.spark-project.spark_unused-1.0.0.jar:/Users/aleksandrov/.ivy2/jars/org.antlr_antlr4-runtime-4.9.3.jar -Yrepl-class-based -Yrepl-outdir /private/var/folders/qj/_dn4xbp14jn37qmdk7ylyfwc0000gr/T/spark-f37bb154-75f3-4db7-aea8-3c4363377bd8/repl-350f37a1-1df1-4816-bd62-97929c60a6c1

      last tree to typer: TypeTree(class Byte)
      tree position: line 6 of <console>
      tree tpe: Byte
      symbol: (final abstract) class Byte in package scala
      symbol definition: final abstract class Byte extends (a ClassSymbol)
      symbol package: scala
      symbol owners: class Byte
      call site: constructor $eval in object $eval in package $line19

      == Source file context for tree position ==

      3
      4object $eval {
      5lazyval $result = res0
      6lazyval $print: root.java.lang.String = {
      7 $iw
      8
      9"" )
      +-----+
      |money|
      +-----+
      | null|
      +-----+

      scala> unionDF.select("money").distinct.count // return 2 instead of 1
      res1: Long = 2 

      scala> unionDF.select("money").distinct.count == 1 // return False
      res2: Boolean = false
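
      The expected value of 1 can be illustrated with a plain-Scala model of the reproduction. This is only a sketch of the semantics, not Spark code: the object name ExpectedDistinctCount and the helper sqlSum are hypothetical, NULL is modelled as None, and the model assumes the usual SQL behaviour that a sum over no rows is NULL and that DISTINCT treats NULLs as equal to one another.

```scala
// Plain-Scala model of the reproduction; no Spark required.
// All names here are illustrative and do not come from Spark.
object ExpectedDistinctCount {
  // SQL-style sum: an empty input yields NULL, modelled as None.
  def sqlSum(values: Seq[BigDecimal]): Option[BigDecimal] =
    if (values.isEmpty) None else Some(values.sum)

  def main(args: Array[String]): Unit = {
    val money = Seq.empty[BigDecimal]        // the empty "money" column
    val aggDf = Seq(sqlSum(money))           // global aggregate: one row, NULL
    val aggDf1 = Seq(sqlSum(money))          // second aggregate: one row, NULL
    val union = aggDf ++ aggDf1              // two rows, both NULL
    val distinctCount = union.distinct.size  // NULLs collapse into one group

    assert(union.size == 2)
    assert(distinctCount == 1)
    println(s"distinct count = $distinctCount")
  }
}
```

      In this model the union holds two NULL rows and distinct reduces them to one, which matches the single null row printed by show above and is why count should return 1 rather than 2.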

People

    • Assignee: Yuming Wang (yumwang)
    • Reporter: Aleksandr Aleksandrov (boltonidze)