Apache Hudi / HUDI-5324

Spark SQL MERGE INTO statement should always do upsert if there's a matching update clause


Details

    • Type: Improvement
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Components: index, spark-sql

    Description

      UPDATED

      The issue originally reported here was actually the result of a misconfigured MERGE INTO (MIT) statement: MIT was using the "insert" operation instead of "upsert".

      The real issue, though, is that MIT implicitly makes its use of the "upsert" operation conditional on whether the "preCombine" config is set. Instead, it should always set the operation to "upsert", since MIT allows update semantics to be specified without requiring the presence of a "preCombine" field.
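
      The difference between the behavior described above and the proposed one can be sketched as a small decision function. This is a minimal illustration in Python, not Hudi's actual implementation: the function names and the `has_update_clause` parameter are hypothetical, and the fallback used when no update clause is present is an assumption the ticket does not spell out. Only the two config keys are taken from Hudi's write configuration.

```python
from typing import Mapping

# Hudi write config keys (the surrounding logic below is a hypothetical model).
OPERATION_KEY = "hoodie.datasource.write.operation"
PRECOMBINE_KEY = "hoodie.datasource.write.precombine.field"


def _has_precombine(config: Mapping[str, str]) -> bool:
    """True when a non-empty preCombine field is configured."""
    return bool(config.get(PRECOMBINE_KEY, "").strip())


def operation_as_described(config: Mapping[str, str], has_update_clause: bool) -> str:
    """Model of the behavior the ticket describes: the operation depends
    only on whether a preCombine field is set; the update clause is ignored."""
    return "upsert" if _has_precombine(config) else "insert"


def operation_as_proposed(config: Mapping[str, str], has_update_clause: bool) -> str:
    """Model of the proposed behavior: a matching update clause always
    yields "upsert", with or without a preCombine field."""
    if has_update_clause:
        return "upsert"
    # Assumption: without an update clause, keep the existing rule.
    return operation_as_described(config, has_update_clause)


def write_options(config: Mapping[str, str], has_update_clause: bool) -> dict:
    """Return a copy of the config with the proposed operation filled in."""
    options = dict(config)
    options[OPERATION_KEY] = operation_as_proposed(config, has_update_clause)
    return options
```

      Under this model, a table with no preCombine field and a MERGE INTO statement that contains a matching update clause resolves to "insert" under the described behavior and to "upsert" under the proposed one, which is the gap the ticket asks to close.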

          People

            Alexey Kudinkin (alexey.kudinkin)
            Ethan Guo (guoyihua)
