  1. Apache Hudi
  2. HUDI-8724 Bug fixes - Phase 1
  3. HUDI-7952

Incorrect partition pruning when TimestampBasedKeyGenerator is used in partition column


Details

    • Sub-task
    • Status: Open
    • Critical
    • Resolution: Unresolved
    • None
    • 1.0.1
    • None
    • None

    Description

      The fix for the ClassCastException in https://issues.apache.org/jira/browse/HUDI-7709, which uses nulls as partition column values, could lead to empty query results.

      HoodieFileIndex.listFiles() would return a Seq of PartitionDirectory entries with null values.

      There is also a separate problem with range filters on the partition column.

      For instance, suppose column ts holds UNIX timestamps.

      The table is also partitioned by ts with

      hoodie.keygen.timebased.output.dateformat = "yyyy-MM-dd HH"

      For a query like:

      SELECT ... WHERE ts BETWEEN 1078016000 AND 1718953003 ...

      it's not possible to filter rows properly: the partition values are written in the "yyyy-MM-dd HH" format, while the filter bounds are UNIX timestamps.
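
The mismatch can be illustrated with a minimal Python sketch. It is not Hudi code: the helper names partition_value and partition_in_range are hypothetical, the timestamps are assumed to be epoch seconds, and the output timezone is assumed to be UTC. It formats the two bounds from the query above with the same pattern as the partition values, then compares them as strings, which works only because the pattern is zero-padded and ordered from year down to hour.

```python
from datetime import datetime, timezone

# strftime equivalent of the Java pattern "yyyy-MM-dd HH" (assumption: UTC output).
OUTPUT_FORMAT = "%Y-%m-%d %H"


def partition_value(epoch_seconds: int) -> str:
    """Format epoch seconds the way the partition value is assumed to be written."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime(OUTPUT_FORMAT)


def partition_in_range(partition: str, lower: int, upper: int) -> bool:
    """Check a formatted partition value against numeric bounds.

    The bounds are converted to the partition format first, so the
    comparison is string against string rather than string against number.
    """
    return partition_value(lower) <= partition <= partition_value(upper)


lower, upper = 1078016000, 1718953003
print(partition_value(lower))  # formatted lower bound, truncated to the hour
print(partition_value(upper))  # formatted upper bound, truncated to the hour
print(partition_in_range("2010-05-17 13", lower, upper))
print(partition_in_range("2003-12-31 23", lower, upper))
```

Because formatting truncates to the hour, this check can only keep or drop whole partitions. Rows inside the first and last matching partitions would still need the original numeric filter on ts.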

       


          People

            danny0405 Danny Chen
            geserdugarov Geser Dugarov
