Apache Hudi / HUDI-603

HoodieDeltaStreamer should periodically fetch table schema update


Details

    Description

      HoodieDeltaStreamer creates a SchemaProvider instance and delegates to DeltaSync for periodic syncs. However, the default implementation of SchemaProvider does not refresh the schema, which can change due to schema evolution. DeltaSync snapshots the schema when it creates the writeClient, taking it either from the SchemaProvider instance or from the source, and the writeClient's schema is not refreshed during the sync loop.

      I think this needs to be addressed to support schema evolution fully.
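
      One possible shape for a fix, as a minimal sketch only: re-read the schema at the start of every sync round and rebuild the writer when it has changed. The names below (`SchemaRefreshSketch`, `Writer`, `RefreshingSync`, `syncOnce`) are hypothetical stand-ins, not Hudi classes, and the schema is modeled as a plain string for brevity.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Objects;
import java.util.function.Supplier;

public class SchemaRefreshSketch {

    /** Hypothetical stand-in for a write client that is bound to one schema. */
    public static final class Writer {
        public final String schema;

        public Writer(String schema) {
            this.schema = schema;
        }
    }

    /** Hypothetical sync loop that re-fetches the schema on every round. */
    public static final class RefreshingSync {
        private final Supplier<String> schemaSupplier;
        private Writer writer;
        private int rebuilds = 0;

        public RefreshingSync(Supplier<String> schemaSupplier) {
            this.schemaSupplier = schemaSupplier;
            // Initial snapshot, taken once when the writer is first created.
            this.writer = new Writer(schemaSupplier.get());
        }

        /** Runs one round and returns the schema the writer used for it. */
        public String syncOnce() {
            // Ask for the latest schema instead of trusting the snapshot.
            String latest = schemaSupplier.get();
            if (!Objects.equals(latest, writer.schema)) {
                // Schema evolved: rebuild the writer with the new schema.
                writer = new Writer(latest);
                rebuilds++;
            }
            return writer.schema;
        }

        public int rebuilds() {
            return rebuilds;
        }
    }

    public static void main(String[] args) {
        // Simulated schema versions returned by successive fetches.
        Iterator<String> versions = List.of("v1", "v1", "v2").iterator();
        RefreshingSync sync = new RefreshingSync(versions::next);
        System.out.println(sync.syncOnce());
        System.out.println(sync.syncOnce());
        System.out.println("rebuilds=" + sync.rebuilds());
    }
}
```

      Comparing before rebuilding keeps the common case (no schema change) cheap; whether a real fix should recreate the writeClient or update it in place is left open here.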

          People

            Pratyaksh Sharma (Pratyaksh)
            Yixue Zhu (yx3zhu)
            Votes: 0
            Watchers: 3
