[KUDU-3060] Add a tool to identify potential performance bottlenecks - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: CLI, perf, ui
Labels:
- hackathon
- roadmap-candidate

Description

When users ask why their workloads are slower than expected, the same questions tend to come up. It'd be great if we had a single tool (or a single webpage) that aggregated and displayed useful information for a specific tablet or table. For example, for a specific table:

- How many partitions and replicas exist for the table.
- For those replicas, how they are distributed across tablet servers.
- For those tablet servers, what the block cache configuration is, and what the current block cache stats (hit ratio, evictions, etc.) are.
- For those tablet servers, which tablets have been written to recently.
- For those tablet servers, which tablets within the target table have been written to recently.
- For those tablet servers, how many active and non-expired scanners exist.
- For those tablet servers, which tablets within the target table have been read from recently.
- For those tablet servers, how many ongoing tablet copies there are, both to and from the server.
- For those tablet servers, how many data directories there are.
- For the data directories on those tablet servers, how many replicas store data in each directory, how many blocks there are in each, and how much space is available in each.
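
As a rough illustration of the first few items, the aggregation could look something like the sketch below. It is only a sketch under stated assumptions: the `Replica` record, the tablet server names, and the hit/miss counters are hypothetical placeholders, not Kudu APIs or real metric names, and a real tool would have to fetch this data from the masters and tablet servers.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Replica:
    """Hypothetical record: one replica of a tablet hosted on a tablet server."""
    tablet_id: str
    tserver: str


def replicas_per_tserver(replicas: Iterable[Replica]) -> Counter:
    """Count how many replicas of the target table each tablet server hosts."""
    return Counter(r.tserver for r in replicas)


def hit_ratio(hits: int, misses: int) -> Optional[float]:
    """Block cache hit ratio, or None if there have been no lookups yet."""
    total = hits + misses
    return hits / total if total else None


# Made-up input: two tablets, three replicas each, spread over four servers.
replicas = [
    Replica("tablet-a", "ts-1"), Replica("tablet-a", "ts-2"), Replica("tablet-a", "ts-3"),
    Replica("tablet-b", "ts-1"), Replica("tablet-b", "ts-2"), Replica("tablet-b", "ts-4"),
]

for tserver, count in sorted(replicas_per_tserver(replicas).items()):
    print(f"{tserver}: {count} replica(s)")
```

Keeping the aggregation separate from whatever collects the data would let the same summary back either a CLI tool or a webpage.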

The list could go on and on. It probably makes sense to break the diagnostics into different phases or goals, maybe along the lines of 1) identifying hotspots of workloads and lag across tablet servers (e.g. a ton of writes going to a single tserver), and 2) digging into a single tablet server to understand how it's provisioned and whether that provisioning is sufficient.
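
For the first phase, one possible way to flag a write hotspot is to compare each tablet server's recent write count against an even split of the total. This is a minimal sketch, assuming the per-server counts have already been collected somehow; the function name, the input shape, and the `factor` threshold are all hypothetical and would need tuning against real workloads.

```python
from typing import Dict, List


def find_write_hotspots(rows_written: Dict[str, int], factor: float = 2.0) -> List[str]:
    """Return tablet servers whose recent writes exceed `factor` times the even share.

    `rows_written` maps a tablet server name to the number of rows it
    received recently (hypothetical input, not a real Kudu metric name).
    """
    total = sum(rows_written.values())
    if total == 0:
        # No servers or no writes at all: nothing can be a hotspot.
        return []
    even_share = total / len(rows_written)
    return sorted(ts for ts, n in rows_written.items() if n > factor * even_share)


# Made-up counts: nearly all writes land on a single tablet server.
print(find_write_hotspots({"ts-1": 9000, "ts-2": 500, "ts-3": 500}))
```

A relative threshold like this avoids hard-coding an absolute write rate, at the cost of missing the case where every server is equally overloaded; that case belongs to the second phase of digging into a single tablet server.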

Issue Links

relates to:
- KUDU-2643 Add tools to do basic analysis on the metrics logs (Open)
- KUDU-2782 Implement distributed tracing support in Kudu (Open)

People

Assignee: Unassigned
Reporter: Andrew Wong
Votes: 0
Watchers: 5

Dates

Created: 20/Feb/20 23:03
Updated: 02/Nov/22 11:51