[KUDU-1945] Support generation of surrogate primary keys (or tables with no PK) - ASF JIRA

Details

Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: client, master, tablet
Labels:
- roadmap-candidate

Target Version/s: Backlog

Description

Many use cases involve data that has no "natural" primary key. For example, a web log use case mostly cares about partitioning rather than precise sorting by timestamp, and timestamps themselves are not necessarily unique. Rather than forcing users to come up with their own surrogate primary keys, Kudu should support some kind of "auto_increment" equivalent that generates primary keys on insertion. Alternatively, Kudu could support tables that are partitioned but not internally sorted.

The advantages would be:

- Kudu can pick primary keys on insertion to guarantee that no compaction is required on the table (e.g. by always assigning a new key higher than any existing key in the local tablet). This can improve write throughput substantially, especially compared to naive PK generation schemes that a user might pick, such as UUIDs, which generate a uniform random-insert workload (the worst case for performance).
- Kudu becomes easier to use for such use cases (no extra client code necessary).
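
To illustrate the first advantage, here is a minimal Python sketch, not Kudu code. It assumes a hypothetical per-tablet counter (`TabletKeyAllocator`) that always hands out a key higher than any it has issued before, and compares it with client-generated UUID keys. The helper `count_out_of_order` is also hypothetical; it counts inserts that land below the current maximum key, used here as a rough proxy for writes that would leave overlapping key ranges to compact.

```python
import uuid


class TabletKeyAllocator:
    """Hypothetical per-tablet counter: each key is higher than all previous ones."""

    def __init__(self, start=0):
        self._next = start

    def next_key(self):
        key = self._next
        self._next += 1
        return key


def count_out_of_order(keys):
    """Count inserts whose key is lower than the highest key seen so far."""
    out_of_order = 0
    current_max = None
    for key in keys:
        if current_max is not None and key < current_max:
            out_of_order += 1
        else:
            current_max = key
    return out_of_order


allocator = TabletKeyAllocator()
sequential_keys = [allocator.next_key() for _ in range(1000)]
uuid_keys = [uuid.uuid4().hex for _ in range(1000)]

print("sequential out-of-order inserts:", count_out_of_order(sequential_keys))
print("UUID out-of-order inserts:", count_out_of_order(uuid_keys))
```

With the sequential allocator every insert is an append at the end of the key space, while most UUID inserts fall somewhere in the middle of the existing keys.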

Issue Links

relates to:
- KUDU-1879 Support table without a primary key (Open)
- IMPALA-11809 Support non unique primary key for Kudu (Resolved)
- IMPALA-11906 Impala Doc: Support non unique primary key for Kudu table (Resolved)

People

Assignee: Unassigned
Reporter: Todd Lipcon
Votes: 3
Watchers: 8

Dates

Created: 17/Mar/17 07:05
Updated: 21/Jul/23 05:28