Spark / SPARK-21739

timestamp partition would fail in v2.2.0


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.1, 2.3.0
    • Component/s: SQL
    • Labels:
      None

      Description

      Spark v2.2.0 introduced TimeZoneAwareExpression, which causes a failure when selecting data from a table with timestamp partitions.
      Steps to reproduce:

      spark.sql("create table test (foo string) partitioned by (ts timestamp)")
      spark.sql("insert into table test partition(ts = 1) values('hi')")
      spark.table("test").show()
      

      The root cause is that TableReader.scala#230 tries to cast the partition value string to a timestamp regardless of whether the timeZone is set.

      Here is the error stack trace:

      java.util.NoSuchElementException: None.get
        at scala.None$.get(Option.scala:347)
        at scala.None$.get(Option.scala:345)
        at org.apache.spark.sql.catalyst.expressions.TimeZoneAwareExpression$class.timeZone(datetimeExpressions.scala:46)
        at org.apache.spark.sql.catalyst.expressions.Cast.timeZone$lzycompute(Cast.scala:172)
        at org.apache.spark.sql.catalyst.expressions.Cast.timeZone(Cast.scala:172)
        at org.apache.spark.sql.catalyst.expressions.Cast$$anonfun$castToTimestamp$1$$anonfun$apply$24.apply(Cast.scala:253)
        at org.apache.spark.sql.catalyst.expressions.Cast$$anonfun$castToTimestamp$1$$anonfun$apply$24.apply(Cast.scala:253)
        at org.apache.spark.sql.catalyst.expressions.Cast.org$apache$spark$sql$catalyst$expressions$Cast$$buildCast(Cast.scala:201)
        at org.apache.spark.sql.catalyst.expressions.Cast$$anonfun$castToTimestamp$1.apply(Cast.scala:253)
        at org.apache.spark.sql.catalyst.expressions.Cast.nullSafeEval(Cast.scala:533)
        at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:327)
        at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$5$$anonfun$fillPartitionKeys$1$1.apply(TableReader.scala:230)
        at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$5$$anonfun$fillPartitionKeys$1$1.apply(TableReader.scala:228)
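
      The failure pattern described above can be illustrated with a minimal, Spark-free sketch. It assumes only that the cast keeps its time zone id in an Option and calls .get on it during evaluation, as the trace suggests. SketchCast and SketchCastDemo are hypothetical names used for illustration; they are not Spark's Cast class and not the actual fix.

```scala
import java.util.TimeZone
import scala.util.Try

// Hypothetical stand-in for a time-zone-aware cast; NOT Spark's Cast class.
final case class SketchCast(raw: String, timeZoneId: Option[String] = None) {
  // Mirrors the failing pattern: .get on an unset Option throws
  // java.util.NoSuchElementException.
  lazy val timeZone: TimeZone = TimeZone.getTimeZone(timeZoneId.get)

  // Evaluation touches the time zone, so it fails when timeZoneId is None.
  def eval(): String = s"$raw@${timeZone.getID}"
}

object SketchCastDemo {
  // No time zone id set: evaluation fails, as in the reported trace.
  val unresolved: Try[String] = Try(SketchCast("1").eval())

  // Time zone id supplied before evaluation: the cast can be evaluated.
  val resolved: Try[String] = Try(SketchCast("1", Some("UTC")).eval())

  def main(args: Array[String]): Unit = {
    println(s"without time zone id: failure = ${unresolved.isFailure}")
    println(s"with time zone id:    success = ${resolved.isSuccess}")
  }
}
```

      The sketch suggests that any code path building such a cast directly, outside the analyzer, would need to supply a time zone id before calling eval.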
      

            People

            • Assignee: Feng Zhu (donnyzone)
            • Reporter: wangzhihao (zhihao)
            • Votes: 0
            • Watchers: 3