Apache Hudi / HUDI-1717

Metadata Table reader does not show correct view of the metadata

    Details

      Description

      Dataset timeline: C1 C2 C3 Compaction.inflight C4 C5

      Metadata timeline: DC1 DC2 DC3 (DC = deltaCommit)

      Assume the dataset timeline has completed commits C1 through C5 and an async compaction still in progress (Compaction.inflight, which sits between C3 and C4). Also assume the metadata table has been synced only up to C3.

      The MetadataTableWriter will not sync any more instants to the Metadata Table, because the next instant on the dataset timeline (Compaction.inflight) is incomplete.

      The MetadataReader uses the same sync logic to perform its in-memory merge of the timeline. Hence, the reader also ignores C4 and C5 and returns an incorrect, stale view of the FileSlices and FileGroups.

      Any future ingestion into this table MAY insert data into older versions of the FileSlices, which will surface as data loss when the table is queried.
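
      The scenario above can be reproduced with a small, self-contained sketch. It is not Hudi code: the Instant class and both methods are hypothetical stand-ins, and the "stop at the first incomplete instant" rule is modelled only as the description states it. It shows that nothing after C3 is eligible for sync, even though C4 and C5 are complete on the dataset timeline.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TimelineSyncSketch {

    /** Hypothetical, simplified timeline instant; not a Hudi class. */
    public static final class Instant {
        final String name;
        final boolean completed;

        public Instant(String name, boolean completed) {
            this.name = name;
            this.completed = completed;
        }
    }

    /**
     * Models the sync rule from the description: walk the dataset timeline
     * after the last synced instant and stop at the first incomplete one.
     */
    public static List<String> instantsToSync(List<Instant> timeline, String lastSynced) {
        List<String> result = new ArrayList<>();
        boolean pastLastSynced = false;
        for (Instant instant : timeline) {
            if (!pastLastSynced) {
                pastLastSynced = instant.name.equals(lastSynced);
                continue;
            }
            if (!instant.completed) {
                break; // an incomplete instant blocks everything after it
            }
            result.add(instant.name);
        }
        return result;
    }

    /** All completed instants after the last synced one, regardless of gaps. */
    public static List<String> completedInstantsAfter(List<Instant> timeline, String lastSynced) {
        List<String> result = new ArrayList<>();
        boolean pastLastSynced = false;
        for (Instant instant : timeline) {
            if (!pastLastSynced) {
                pastLastSynced = instant.name.equals(lastSynced);
            } else if (instant.completed) {
                result.add(instant.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Instant> timeline = Arrays.asList(
            new Instant("C1", true), new Instant("C2", true), new Instant("C3", true),
            new Instant("Compaction", false), // Compaction.inflight
            new Instant("C4", true), new Instant("C5", true));

        System.out.println(instantsToSync(timeline, "C3"));         // prints []
        System.out.println(completedInstantsAfter(timeline, "C3")); // prints [C4, C5]
    }
}
```

      The difference between the two lists (C4 and C5) is exactly what the reader's merged view is missing in this scenario.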

              People

              • Assignee: Prashant Wason (pwason)
              • Reporter: Prashant Wason (pwason)