SPARK-34515

Fix NPE if InSet contains null value during getPartitionsByFilter


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.1.2, 3.2.0
    • Fix Version/s: 3.1.2, 3.2.0
    • Component/s: SQL
    • Labels: None

    Description

      During partition pruning, Spark converts an InSet to a `>= and <=` range filter when its value count exceeds `spark.sql.hive.metastorePartitionPruningInSetThreshold`. In that case, if the values contain a null, we get the following exception (a standalone sketch of the failure and the fix follows the stack trace):

       

      java.lang.NullPointerException
       at org.apache.spark.unsafe.types.UTF8String.compareTo(UTF8String.java:1389)
       at org.apache.spark.unsafe.types.UTF8String.compareTo(UTF8String.java:50)
       at scala.math.LowPriorityOrderingImplicits$$anon$3.compare(Ordering.scala:153)
       at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
       at java.util.TimSort.sort(TimSort.java:220)
       at java.util.Arrays.sort(Arrays.java:1438)
       at scala.collection.SeqLike.sorted(SeqLike.scala:659)
       at scala.collection.SeqLike.sorted$(SeqLike.scala:647)
       at scala.collection.AbstractSeq.sorted(Seq.scala:45)
       at org.apache.spark.sql.hive.client.Shim_v0_13.convert$1(HiveShim.scala:772)
       at org.apache.spark.sql.hive.client.Shim_v0_13.$anonfun$convertFilters$4(HiveShim.scala:826)
       at scala.collection.immutable.Stream.flatMap(Stream.scala:489)
       at org.apache.spark.sql.hive.client.Shim_v0_13.convertFilters(HiveShim.scala:826)
       at org.apache.spark.sql.hive.client.Shim_v0_13.getPartitionsByFilter(HiveShim.scala:848)
       at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$getPartitionsByFilter$1(HiveClientImpl.scala:750)
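
      A minimal standalone Scala sketch of the failure mode and one way to avoid it. This is not the actual Shim_v0_13.convert code; the object name InSetNullRepro and the column name p are made up for illustration. Sorting a value list that contains null throws a NullPointerException exactly as in the trace above, and since a null literal can never equal a partition value, filtering nulls out before deriving the >= / <= bounds is safe.

      // Hypothetical reproduction; the real conversion lives in
      // org.apache.spark.sql.hive.client.Shim_v0_13 (HiveShim.scala).
      object InSetNullRepro {
        def main(args: Array[String]): Unit = {
          val values: Seq[String] = Seq("b", null, "a")

          // Mirrors the failing path: Ordering[String] delegates to
          // String.compareTo, which throws NPE on the null element.
          try values.sorted
          catch { case e: NullPointerException => println(s"sorted throws: $e") }

          // A null literal never matches a partition value, so it can be
          // dropped before computing the min/max of the remaining values.
          val nonNull = values.filter(_ != null)
          if (nonNull.nonEmpty) {
            val sorted = nonNull.sorted
            println(s"(p >= '${sorted.head}' and p <= '${sorted.last}')")
          }
        }
      }

      With the null filtered out, the pushed-down metastore filter is simply built from the remaining values, and an all-null InSet produces no range filter at all.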
      



          People

            Assignee: XiDuo You (ulysses)
            Reporter: XiDuo You (ulysses)
