[DRILL-4276] Need a way to check on status of drillbits - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Resolved
Affects Version/s: None
Fix Version/s: 1.14.0
Component/s: Execution - Monitoring
Labels: None

Description

I ran into a situation where a cluster started with 8 nodes and 2 of them later went down for some reason.

As a user, my only ways to detect this situation are:

a query fails because something started to execute on a node and failed when that node went down (and I have to comb through the logs to find a warning)
my queries are extremely slow, because they started to execute after the node went down and was deregistered from ZooKeeper
somebody just stopped the drillbit on a particular node

Since there is no central place (apart from ZooKeeper) where information on participating nodes is kept, when I queried sys.drillbits I got 6 nodes, as if the other 2 had never existed. There is beauty in flexibility, but in a real-life situation with more than 20 nodes, things can get out of control quickly.

Since ZooKeeper has this information in the first place, can we enhance the sys.drillbits table to show each drillbit's status as ZooKeeper sees it?

This would also help with testing, in particular with automating test cases that exercise failure conditions like this one.
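Until such a status is exposed, an automated test could approximate the check by comparing a known list of expected nodes against the rows returned by sys.drillbits. The following is only a minimal sketch under stated assumptions: it assumes each row is available as a mapping with a `hostname` key, the function name `missing_drillbits` and the `node*.example.com` hostnames are hypothetical, and fetching the rows from Drill is deliberately left out.

```python
from typing import Iterable, Mapping, Set


def missing_drillbits(expected_hosts: Iterable[str],
                      drillbit_rows: Iterable[Mapping[str, str]]) -> Set[str]:
    """Return the expected hosts that do not appear in the sys.drillbits rows.

    Assumption: each row is a mapping with a "hostname" key, as it might
    look after fetching the result of a query against sys.drillbits.
    Hostnames are compared case-insensitively.
    """
    registered = {row["hostname"].lower() for row in drillbit_rows}
    return {host for host in expected_hosts if host.lower() not in registered}


if __name__ == "__main__":
    # Hypothetical 8-node cluster in which only 6 drillbits are still registered.
    expected = [f"node{i}.example.com" for i in range(1, 9)]
    rows = [{"hostname": f"node{i}.example.com"} for i in range(1, 7)]
    print(sorted(missing_drillbits(expected, rows)))
    # prints ['node7.example.com', 'node8.example.com']
```

This only works when the expected node list is maintained somewhere outside Drill, which is exactly the gap described above: sys.drillbits alone cannot tell a node that went down from one that never existed.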

Issue Links

is required by

DRILL-6289 Cluster view should show more relevant information

Resolved

relates to

DRILL-6289 Cluster view should show more relevant information

Resolved

People

Assignee: Kunal Khatua

Reporter: Victoria Markman

Votes: 0

Watchers: 2

Dates

Created: 15/Jan/16 18:48

Updated: 20/Jun/18 22:56

Resolved: 04/Jun/18 18:53