Details
Type: Epic
Status: Open
Priority: Major
Resolution: Unresolved
Version: Impala 3.4.0
Epic Name: Improve data cache metrics
Description
Currently, the data cache has the following metrics:
- impala-server.io-mgr.remote-data-cache-hit-bytes
- impala-server.io-mgr.remote-data-cache-miss-bytes
- impala-server.io-mgr.remote-data-cache-total-bytes
- impala-server.io-mgr.remote-data-cache-dropped-bytes
These metrics leave several questions unanswered, especially once we start considering changes to the eviction algorithm. Here are some questions that we may want to be able to answer:
- How much memory is being used to track metadata?
- What is the distribution of size of entries in the cache?
- How many entries are in the cache?
- What are the hit/miss counts (as opposed to the hit bytes)?
- What is the actual disk usage (as seen by the OS)?
This is an epic to track adding metrics to answer these questions (and other similar questions).
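To make the questions above concrete, here is a minimal sketch of the kind of bookkeeping that could answer some of them: hit/miss counts alongside hit/miss bytes, the number of entries, and a power-of-two histogram of entry sizes. It is written in Python for brevity and is not Impala code. The class `CacheStats`, its methods, and the helper `disk_usage_bytes` are hypothetical names invented for illustration. The disk usage helper assumes a POSIX system where `st_blocks` counts 512-byte blocks. Metadata memory tracking is not covered here.

```python
import os
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CacheStats:
    """Hypothetical counters for illustration; not part of Impala."""
    hits: int = 0
    misses: int = 0
    hit_bytes: int = 0
    miss_bytes: int = 0
    num_entries: int = 0
    # Histogram of entry sizes, keyed by the power-of-two bucket upper bound.
    size_buckets: Counter = field(default_factory=Counter)

    @staticmethod
    def bucket(size: int) -> int:
        """Return the smallest power of two >= size (minimum bucket is 1)."""
        return 1 << max(size - 1, 0).bit_length()

    def record_lookup(self, hit: bool, nbytes: int) -> None:
        """Count one lookup and the bytes it covered."""
        if hit:
            self.hits += 1
            self.hit_bytes += nbytes
        else:
            self.misses += 1
            self.miss_bytes += nbytes

    def record_insert(self, size: int) -> None:
        """Account for a new entry of the given size."""
        self.num_entries += 1
        self.size_buckets[self.bucket(size)] += 1

    def record_evict(self, size: int) -> None:
        """Account for the removal of an entry of the given size."""
        self.num_entries -= 1
        self.size_buckets[self.bucket(size)] -= 1

    def hit_ratio(self) -> float:
        """Fraction of lookups that hit, or 0.0 if there were none."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def disk_usage_bytes(path: str) -> int:
    """Bytes the OS has allocated for path (POSIX only).

    This can differ from the logical file size (st_size) for sparse files.
    """
    return os.stat(path).st_blocks * 512
```

Keeping counts separate from bytes matters because many small misses and a few large misses look the same in a bytes-only metric, yet they may respond very differently to a change in eviction policy.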