Details
Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
Description
As we know, Hudi uses the Spark DataSource API to upsert data. For example, to update a row we first have to read the old row, modify it, and write it back with the upsert operation.
However, there is another common situation where someone only wants to update a single column, which in SQL would be: update table set col1 = X where col2 = Y. Hudi cannot handle this directly at present; we can only read all of the affected rows into a dataset, merge the change in, and then upsert the result.
So I think we could create a new subproject to process batch data in an SQL-like way. For example:
val hudiTable = new HudiTable(path)
hudiTable.update.set("col1 = X").where("col2 = Y")
hudiTable.delete.where("col3 = Z")
hudiTable.commit
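To make the proposal concrete, here is a minimal, self-contained sketch of what such a fluent API could look like. This is purely illustrative: the names HudiTable, update, delete, and commit come from the example above, but the internals (a queue of pending operations, with commit returning them) are hypothetical stand-ins for what would really be a read-merge-upsert pass over the affected file groups.

```scala
// Hypothetical sketch of the proposed SQL-like batch API.
// A real implementation would translate the queued operations into
// Spark DataSource reads/upserts; here commit just drains the queue
// so the builder semantics can be demonstrated without a cluster.

sealed trait PendingOp
final case class Update(setExpr: String, whereExpr: String) extends PendingOp
final case class Delete(whereExpr: String) extends PendingOp

final class HudiTable(val path: String) {
  private val pending = scala.collection.mutable.ListBuffer.empty[PendingOp]

  def update: UpdateBuilder = new UpdateBuilder
  def delete: DeleteBuilder = new DeleteBuilder

  // Drain and return the queued operations; a real commit would
  // apply them to the table at `path` and write a new commit instant.
  def commit: List[PendingOp] = {
    val ops = pending.toList
    pending.clear()
    ops
  }

  final class UpdateBuilder {
    private var setExpr: String = ""
    def set(expr: String): UpdateBuilder = { setExpr = expr; this }
    def where(expr: String): HudiTable = {
      pending += Update(setExpr, expr)
      HudiTable.this
    }
  }

  final class DeleteBuilder {
    def where(expr: String): HudiTable = {
      pending += Delete(expr)
      HudiTable.this
    }
  }
}
```

Usage would mirror the example in the description: each update/delete call queues an operation, and nothing touches storage until commit, which keeps the API compatible with Hudi's atomic-commit model.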
We could also extend this functionality to support JDBC-like schemes, such as the incremental puller proposed in RFC-14: https://cwiki.apache.org/confluence/display/HUDI/RFC+-+14+%3A+JDBC+incremental+puller
I hope everyone can share suggestions on whether this plan is feasible.