Spark / SPARK-34584

When inserting into a partitioned table with an illegal partition value, DSv2 behaves differently from other data sources


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.1
    • Fix Version/s: 3.1.2, 3.2.0
    • Component/s: SQL
    • Labels: None

    Description

      With the unit test below:

        test("test insert to partition with wrong value") {
          withTable("t") {
            val binaryStr = "Spark SQL"
            val binaryHexStr = Hex.hex(UTF8String.fromString(binaryStr).getBytes).toString
            sql("CREATE TABLE t(name STRING, part DATE) USING PARQUET PARTITIONED BY (part)")
            sql(s"INSERT INTO t PARTITION(part = X'$binaryHexStr') VALUES('a')")
            sql("SELECT * FROM t").show()
          }
        }
      

      Result:

      [info] DSV2SQLInsertTestSuite:
      21:35:32.369 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      +----+----+
      |name|part|
      +----+----+
      |   a|null|
      +----+----+
      [info] - test insert to partition with wrong value (4 seconds, 41 milliseconds)
      21:35:37.639 WARN org.apache.spark.sql.DSV2SQLInsertTestSuite:===== POSSIBLE THREAD LEAK IN SUITE o.a.s.sql.DSV2SQLInsertTestSuite, thread names: rpc-boss-3-1, shuffle-boss-6-1 =====
      [info] FileSourceSQLInsertTestSuite:
      [info] - test insert to partition with wrong value *** FAILED *** (200 milliseconds)
      [info]   java.time.DateTimeException: Cannot cast X'537061726B2053514C' to DateType.
      [info]   at org.apache.spark.sql.catalyst.util.DateTimeUtils$.$anonfun$stringToDateAnsi$1(DateTimeUtils.scala:471)
      [info]   at scala.Option.getOrElse(Option.scala:189)
      [info]   at org.apache.spark.sql.catalyst.util.DateTimeUtils$.stringToDateAnsi(DateTimeUtils.scala:471)
      [info]   at org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToDate$2(Cast.scala:509)
      [info]   at org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToDate$2$adapted(Cast.scala:509)
      [info]   at org.apache.spark.sql.catalyst.expressions.CastBase.buildCast(Cast.scala:301)
      [info]   at org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToDate$1(Cast.scala:509)
      [info]   at org.apache.spark.sql.catalyst.expressions.CastBase.nullSafeEval(Cast.scala:850)
      [info]   at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:482)
      [info]   at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1$$anonfun$applyOrElse$1.applyOrElse(expressions.scala:66)
      [info]   at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1$$anonfun$applyOrElse$1.applyOrElse(expressions.scala:54)
      [info]   at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDown$1(TreeNode.scala:316)
      [info]   at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:72)
      [info]   at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:316)
      [info]   at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDown$3(TreeNode.scala:321)
      [info]   at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$mapChildren$1(TreeNode.scala:406)
      [info]   at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:242)
      [info]   at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:404)
      [info]   at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:357)
      [info]   at
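
      The divergence comes down to the cast policy applied to the partition value: the V1 file-source path rejects the unparsable value with a DateTimeException, while the DSv2 path silently turned it into null. A minimal, Spark-free Scala sketch of the two policies (the function names here are illustrative, not Spark APIs):

        import java.time.LocalDate
        import scala.util.{Try, Success, Failure}

        // Lenient cast: an unparsable value silently becomes null
        // (the DSv2 behavior observed above).
        def castToDateLenient(s: String): Option[LocalDate] =
          Try(LocalDate.parse(s)).toOption

        // Strict, ANSI-style cast: an unparsable value raises an error
        // (the V1 file-source behavior observed above).
        def castToDateStrict(s: String): LocalDate =
          Try(LocalDate.parse(s)) match {
            case Success(d) => d
            case Failure(_) =>
              throw new java.time.DateTimeException(s"Cannot cast '$s' to DateType.")
          }

        // "Spark SQL" (the decoded binary literal) is not a valid date.
        println(castToDateLenient("Spark SQL"))               // None -> row inserted with null
        println(Try(castToDateStrict("Spark SQL")).isFailure) // true -> query fails

      The fix aligns the two paths so an illegal partition value is rejected consistently instead of being swallowed as null.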
      

          People

            Assignee: Wenchen Fan (cloud_fan)
            Reporter: angerszhu (angerszhuuu)
            Votes: 0
            Watchers: 3
