  1. Apache Hudi
  2. HUDI-1428

Cleaning old file slices has no effect

    Details

      Description

      Steps to reproduce:

      Create a table of type MERGE_ON_READ and set hoodie.cleaner.commits.retained=2.

      Insert into the table three times, always into the same partition.

      This creates the following parquet files:

      a2c57b73-5ba9-4744-9027-640075a179ec-0_0-213-1740_20201201200149.parquet

      a2c57b73-5ba9-4744-9027-640075a179ec-0_0-176-1702_20201201195638.parquet

      a2c57b73-5ba9-4744-9027-640075a179ec-0_0-139-1664_20201201195219.parquet

      a2c57b73-5ba9-4744-9027-640075a179ec-0_0-103-1632_20201201193835.parquet

      The file with instant time 20201201200149 is the newest.
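
      To make it easier to see that the files listed here are versions of one file group, the following is a minimal Java sketch that splits each name into its parts. It assumes the base file naming pattern <fileId>_<writeToken>_<instantTime>.parquet; the class BaseFileNameSketch and its methods are hypothetical helpers written for this illustration and are not part of Hudi.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not Hudi code.
// Assumes base files are named <fileId>_<writeToken>_<instantTime>.parquet.
class BaseFileNameSketch {

    static final class Parts {
        final String fileId;
        final String writeToken;
        final String instantTime;

        Parts(String fileId, String writeToken, String instantTime) {
            this.fileId = fileId;
            this.writeToken = writeToken;
            this.instantTime = instantTime;
        }
    }

    // The four file names quoted in this issue.
    static final List<String> FILES = List.of(
        "a2c57b73-5ba9-4744-9027-640075a179ec-0_0-213-1740_20201201200149.parquet",
        "a2c57b73-5ba9-4744-9027-640075a179ec-0_0-176-1702_20201201195638.parquet",
        "a2c57b73-5ba9-4744-9027-640075a179ec-0_0-139-1664_20201201195219.parquet",
        "a2c57b73-5ba9-4744-9027-640075a179ec-0_0-103-1632_20201201193835.parquet");

    // Split a file name into file id, write token and instant time.
    static Parts parse(String name) {
        String suffix = ".parquet";
        if (!name.endsWith(suffix)) {
            throw new IllegalArgumentException("not a parquet file: " + name);
        }
        String[] p = name.substring(0, name.length() - suffix.length()).split("_");
        if (p.length != 3) {
            throw new IllegalArgumentException("unexpected file name: " + name);
        }
        return new Parts(p[0], p[1], p[2]);
    }

    // Instant times are fixed-width digit strings, so string order is time order.
    static String newestInstant(List<String> names) {
        return names.stream()
            .map(n -> parse(n).instantTime)
            .max(Comparator.naturalOrder())
            .orElseThrow(() -> new IllegalArgumentException("no files"));
    }

    public static void main(String[] args) {
        for (String f : FILES) {
            Parts p = parse(f);
            System.out.println(p.fileId + " @ " + p.instantTime);
        }
        System.out.println("newest: " + newestInstant(FILES));
    }
}
```

      All four names share the same file id and differ only in write token and instant time, which is why the three older files would be candidates for cleaning.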

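      The old parquet files are not deleted when client.clean() is called.

      The reason is that CleanPlanner initializes its commitTimeline with hoodieTable.getCompletedCommitTimeline().

      The getCompletedCommitTimeline() method does not include DELTA_COMMIT_ACTION instants, so the delta commits written to a MERGE_ON_READ table are not visible to the cleaner.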
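
      The effect of the missing action can be illustrated with a small, self-contained Java model. This is a sketch under stated assumptions, not the Hudi implementation: the classes and methods below are hypothetical, the action names "commit" and "deltacommit" are assumed, all four writes are assumed to be delta commits, and the retention rule (nothing is cleaned unless the timeline holds more instants than the retained count) is a simplification.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

// Simplified, hypothetical model of a cleaner timeline. Not Hudi code.
class CleanTimelineSketch {

    static final class Instant {
        final String time;
        final String action;

        Instant(String time, String action) {
            this.time = time;
            this.action = action;
        }
    }

    // Keep only the instants whose action is in the given set.
    static List<Instant> filterByActions(List<Instant> timeline, Set<String> actions) {
        return timeline.stream()
            .filter(i -> actions.contains(i.action))
            .collect(Collectors.toList());
    }

    // Assumed rule: with `retained` instants to keep, the earliest instant to
    // retain exists only if the timeline holds more than `retained` instants.
    // An empty result means the cleaner has nothing to delete.
    static Optional<String> earliestToRetain(List<Instant> timeline, int retained) {
        if (timeline.size() <= retained) {
            return Optional.empty();
        }
        return Optional.of(timeline.get(timeline.size() - retained).time);
    }

    // Instant times taken from the file names in this issue, oldest first.
    // Treating every write as a delta commit is an assumption.
    static List<Instant> sampleTimeline() {
        return List.of(
            new Instant("20201201193835", "deltacommit"),
            new Instant("20201201195219", "deltacommit"),
            new Instant("20201201195638", "deltacommit"),
            new Instant("20201201200149", "deltacommit"));
    }

    public static void main(String[] args) {
        List<Instant> all = sampleTimeline();
        int retained = 2; // mirrors hoodie.cleaner.commits.retained=2

        // Timeline that ignores delta commits: it is empty, so nothing is cleaned.
        System.out.println("commit only: "
            + earliestToRetain(filterByActions(all, Set.of("commit")), retained));

        // Timeline that includes delta commits: older file slices become eligible.
        System.out.println("commit + deltacommit: "
            + earliestToRetain(filterByActions(all, Set.of("commit", "deltacommit")), retained));
    }
}
```

      In this model the commit-only timeline yields no earliest instant to retain, which matches the reported symptom, while the timeline that also includes delta commits yields 20201201195638, so file slices older than that instant could be cleaned.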

              People

              • Assignee: Unassigned
              • Reporter: henryz steven zhang
              • Votes: 0
              • Watchers: 1
