[KUDU-2250] Document odd interaction between upserts and Spark Datasets

Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.0
    • Fix Version/s: 1.6.0
    • Component/s: spark

    Description

      We need to document a behavior of Spark Datasets that contradicts Kudu's own upsert semantics.

      Say a table has three columns, k, x, and y, where k is the primary key.

      You first insert the row "k=1, x=2, y=3".

      Then you upsert "k=1, y=4".

      With any other Kudu API, the full row would now be "k=1, x=2, y=4", but with Datasets it is "k=1, x=NULL, y=4". In other words, Datasets write a null value for every column that is not specified, overwriting whatever was stored there.
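
The difference can be illustrated with a small in-memory model. This is only a sketch of the two behaviors described above, not the Kudu client or the kudu-spark integration: the function names `kudu_style_upsert` and `dataset_style_upsert` are hypothetical, and the assumption that a Dataset row always carries every column of a fixed schema (so unspecified columns arrive as null) is one plausible reading of the description.

```python
# Toy in-memory model of the two upsert behaviors; NOT the Kudu or Spark API.
COLUMNS = ("k", "x", "y")  # k is the primary key


def kudu_style_upsert(table, row):
    """Set only the columns present in `row`; insert if the key is new."""
    existing = table.get(row["k"], {c: None for c in COLUMNS})
    table[row["k"]] = {**existing, **row}


def dataset_style_upsert(table, row):
    """Assumption: a fixed-schema row pads unspecified columns with None,
    and those None values are then written like any other value."""
    full_row = {c: row.get(c) for c in COLUMNS}
    kudu_style_upsert(table, full_row)


def run(upsert):
    """Replay the scenario from the description with the given upsert."""
    table = {}
    kudu_style_upsert(table, {"k": 1, "x": 2, "y": 3})  # initial insert
    upsert(table, {"k": 1, "y": 4})                     # partial upsert
    return table[1]


print(run(kudu_style_upsert))     # {'k': 1, 'x': 2, 'y': 4}
print(run(dataset_style_upsert))  # {'k': 1, 'x': None, 'y': 4}
```

In this model the only difference is the padding step: once the missing column is materialized as an explicit null, the upsert cannot distinguish "not specified" from "set to NULL".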

            People

              Assignee: fwang29 Fengling Wang
              Reporter: jdcryans Jean-Daniel Cryans
