[KUDU-2250] Document odd interaction between upserts and Spark Datasets - ASF JIRA

Details

Type: Task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 1.6.0
Fix Version/s: 1.6.0
Component/s: spark
Labels:
- newbie

Target Version/s: Backlog

Description

We need to document a specific behavior of Spark Datasets that runs contrary to how Kudu works.

Say you have 3 columns "k, x, y" where k is the primary key.

You run a first insert on a row "k=1, x=2, y=3".

Now you upsert "k=1, y=4".

With any Kudu API, the full row would now be "k=1, x=2, y=4", but with Datasets it is "k=1, x=NULL, y=4". In other words, Datasets write NULL for every column that isn't specified, overwriting the value already stored in the row.
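
The difference can be sketched with a toy model in plain Python. This is not the Kudu or kudu-spark API: dicts stand in for the table, and the names `kudu_upsert` and `dataset_upsert` are hypothetical helpers that only mimic the two behaviors described above.

```python
# Toy model only: a dict keyed by primary key stands in for a Kudu table.
COLUMNS = ("k", "x", "y")


def kudu_upsert(table, row):
    """Mimic a Kudu API upsert: columns absent from `row` keep their stored value."""
    existing = table.get(row["k"], {c: None for c in COLUMNS})
    table[row["k"]] = {**existing, **row}


def dataset_upsert(table, row):
    """Mimic the Dataset behavior: columns absent from `row` are written as NULL (None)."""
    table[row["k"]] = {c: row.get(c) for c in COLUMNS}


kudu_table = {}
kudu_upsert(kudu_table, {"k": 1, "x": 2, "y": 3})  # first insert
kudu_upsert(kudu_table, {"k": 1, "y": 4})          # upsert without x

dataset_table = {}
dataset_upsert(dataset_table, {"k": 1, "x": 2, "y": 3})  # first insert
dataset_upsert(dataset_table, {"k": 1, "y": 4})          # upsert without x

print(kudu_table[1])     # x is preserved
print(dataset_table[1])  # x has been replaced by None
```

In this model the only difference is whether the upsert merges into the stored row or replaces it with a row padded with NULLs, which is the behavior this issue asks to document.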

Issue Links

is superseded by

KUDU-2371 Allow Kudu-Spark upsert API to ignore NULL column values

Resolved

People

Assignee: Fengling Wang

Reporter: Jean-Daniel Cryans

Votes: 0

Watchers: 4

Dates

Created: 28/Dec/17 20:39

Updated: 30/Apr/18 23:32

Resolved: 30/Apr/18 23:32