[KUDU-2485] Enhance Kudu-Spark docs - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.7.1
Fix Version/s: None
Component/s: documentation, spark
Labels: None
Target Version/s: 1.8.0

Description

Users are often confused about the right way to use the Kudu-Spark integration. The most common dangerous result is that they create multiple Kudu clients, sometimes even one per task. This can easily overwhelm the master, e.g., in a Spark Streaming job with a 2-second batch window and a client per task. We should take our current minimal Spark docs and provide better examples and bigger, louder, redder warnings about making extra Kudu clients. Users should be directed to use the KuduContext exclusively. When a client is needed, the client instance inside the KuduContext should be used.
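
As a rough illustration of the pattern being asked for, here is a minimal sketch, not text from the existing docs. It assumes the kudu-spark artifact is on the classpath and a reachable Kudu cluster. The master addresses, table name, and column names are hypothetical placeholders.

```scala
import org.apache.kudu.client.KuduClient
import org.apache.kudu.spark.kudu.KuduContext
import org.apache.spark.sql.SparkSession

object SharedKuduContextSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("shared-kudu-context-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical values; replace with your own masters and table.
    val kuduMasters = "kudu-master-1:7051,kudu-master-2:7051"
    val tableName = "example_table"

    // Create ONE KuduContext per application, on the driver.
    val kuduContext = new KuduContext(kuduMasters, spark.sparkContext)

    // When a client is needed, reuse the one inside the KuduContext
    // instead of building a new one. It is shared, so this sketch
    // does not close it.
    val client: KuduClient = kuduContext.syncClient

    // Hypothetical rows; the columns must match the real table schema.
    val df = Seq((1, "a"), (2, "b")).toDF("id", "name")

    if (client.tableExists(tableName)) {
      // Writes go through the KuduContext, not through ad hoc clients.
      kuduContext.upsertRows(df, tableName)
    }

    // Anti-pattern to avoid: building a client inside each task, e.g.
    //   df.foreachPartition { _ =>
    //     new KuduClient.KuduClientBuilder(kuduMasters).build() // one client per task
    //   }

    spark.stop()
  }
}
```

In a streaming job the same idea would apply: create the KuduContext once outside the per-batch logic and reuse it in every batch, rather than constructing clients per batch or per task.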

People

Assignee: Unassigned
Reporter: William Berkeley
Votes: 0
Watchers: 3

Dates

Created: 25/Jun/18 22:47
Updated: 03/Jun/20 03:03