
HUDI-2899: Fix DateFormatter usages removed in Spark 3.2


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0
    • Component/s: spark
    • Labels: None

    Description

       

      Trying to read a Hudi table that is partitioned on a string field ("product_category") rather than by date leads to a `NoSuchMethodError`:

      scala> val readDf: DataFrame =
           |   spark.read.option(DataSourceReadOptions.ENABLE_DATA_SKIPPING.key(), "false").format("hudi").load(outputPath)
      java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.util.DateFormatter$.apply(Ljava/time/ZoneId;)Lorg/apache/spark/sql/catalyst/util/DateFormatter;
        at org.apache.spark.sql.execution.datasources.Spark3ParsePartitionUtil.parsePartition(Spark3ParsePartitionUtil.scala:32)
        at org.apache.hudi.HoodieFileIndex.$anonfun$getAllQueryPartitionPaths$3(HoodieFileIndex.scala:559)
        at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at scala.collection.TraversableLike.map(TraversableLike.scala:286)
        at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
        at scala.collection.AbstractTraversable.map(Traversable.scala:108)
        at org.apache.hudi.HoodieFileIndex.getAllQueryPartitionPaths(HoodieFileIndex.scala:511)
        at org.apache.hudi.HoodieFileIndex.loadPartitionPathFiles(HoodieFileIndex.scala:575)
        at org.apache.hudi.HoodieFileIndex.refresh0(HoodieFileIndex.scala:360)
        at org.apache.hudi.HoodieFileIndex.<init>(HoodieFileIndex.scala:157)
        at org.apache.hudi.DefaultSource.getBaseFileOnlyView(DefaultSource.scala:199)
        at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:119)
        at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:69)
        at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:350)
        at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:274)
        at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:245)
        at scala.Option.getOrElse(Option.scala:189)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:245)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:188)
        ... 68 elided 
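      The `DateFormatter.apply(ZoneId)` factory that `Spark3ParsePartitionUtil` calls exists in Spark 3.1 but was removed in Spark 3.2, so a Hudi artifact built against 3.1 fails at runtime on 3.2 as shown above. Below is a minimal sketch of one way to make the call site version-aware; it assumes Spark 3.2 still provides a zero-argument `DateFormatter.apply()` factory and that the helper sits in a package from which the catalyst class is visible (as Hudi's `Spark3ParsePartitionUtil` does). `DateFormatterCompat` is a hypothetical name, and this is an illustration rather than the actual HUDI-2899 patch.

      import java.time.ZoneId

      import org.apache.spark.sql.catalyst.util.DateFormatter

      // Hypothetical compatibility helper (an illustration, not the actual
      // HUDI-2899 change): resolve the DateFormatter factory reflectively so one
      // artifact can run against both Spark 3.1 (apply(ZoneId)) and Spark 3.2
      // (where that overload was removed).
      object DateFormatterCompat {

        def dateFormatter(zoneId: ZoneId): DateFormatter = {
          val module = DateFormatter.getClass
          try {
            // Spark 3.1.x exposes DateFormatter.apply(zoneId: ZoneId)
            module.getMethod("apply", classOf[ZoneId])
              .invoke(DateFormatter, zoneId)
              .asInstanceOf[DateFormatter]
          } catch {
            case _: NoSuchMethodException =>
              // Spark 3.2.x dropped the ZoneId overload; assume a zero-arg apply()
              module.getMethod("apply")
                .invoke(DateFormatter)
                .asInstanceOf[DateFormatter]
          }
        }
      }

      An alternative to reflection is to ship per-Spark-version adapter modules and select the right one at build time; either way, the point is that the `DateFormatter` call can no longer be resolved against a single Spark 3.x API.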

       

       


            People

              Assignee: Yann Byron (biyan900116@gmail.com)
              Reporter: Alexey Kudinkin (alexey.kudinkin)
              Raymond Xu
              Votes: 0
              Watchers: 3


                Time Tracking

                  Original Estimate: 0.5h
                  Remaining Estimate: 0.5h
                  Time Spent: Not Specified