Apache Hudi / HUDI-4994

DatahubSyncTool does not correctly re-ingest soft-deleted entities

Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: meta-sync

    Description

      Datahub has a notion of soft deletes: the entity still exists in the database, but with its status set to removed:true. Such an entity can be re-ingested later with new properties, so that the older properties are overwritten. The current implementation of DatahubSyncTool does not handle this scenario. It does not reset the status flag to removed:false during ingest, so the re-ingested entity does not appear in the Datahub UI at all.

      Ref: See sections on Soft Delete and Hard Delete in the Datahub docs: https://datahubproject.io/docs/how/delete-metadata/#soft-delete-the-default
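
      The scenario can be illustrated with a small toy model. This is only a sketch of the soft-delete semantics described above, not DatahubSyncTool or the Datahub API: ToyMetadataStore, Entity, the reset_status parameter, and the example URN are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Toy stand-in for an entity: some properties plus a status flag."""
    properties: dict = field(default_factory=dict)
    removed: bool = False  # plays the role of status removed:true / removed:false


class ToyMetadataStore:
    """Hypothetical in-memory store; it does not talk to Datahub."""

    def __init__(self):
        self.entities = {}

    def soft_delete(self, urn):
        # A soft delete keeps the entity and only flips the flag.
        self.entities[urn].removed = True

    def ingest(self, urn, properties, reset_status=False):
        entity = self.entities.setdefault(urn, Entity())
        # Newer properties overwrite the older ones.
        entity.properties = dict(properties)
        if reset_status:
            # The step the issue reports as missing during re-ingest.
            entity.removed = False

    def visible_urns(self):
        # Only entities that are not marked removed would surface in a UI.
        return sorted(urn for urn, e in self.entities.items() if not e.removed)


if __name__ == "__main__":
    store = ToyMetadataStore()
    urn = "urn:li:dataset:(urn:li:dataPlatform:hudi,db.tbl,PROD)"  # illustrative URN

    store.ingest(urn, {"schema_version": 1})
    store.soft_delete(urn)

    # Re-ingest without touching the status: properties change, entity stays hidden.
    store.ingest(urn, {"schema_version": 2})
    print(store.visible_urns())

    # Re-ingest that also resets the flag: the entity is visible again.
    store.ingest(urn, {"schema_version": 3}, reset_status=True)
    print(store.visible_urns())
```

      In this model the first re-ingest reproduces the reported behavior (updated properties, entity still hidden), and the second shows the expected outcome once the status is reset as part of the same ingest.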

            People

              Assignee: Unassigned
              Reporter: Pramod Biligiri (pramodbiligiri)