In IMPALA-8542, we added an access trace to the data cache. It outputs a JSON entry with information about each hit/miss. Currently it is controlled by the data_cache_enable_tracing startup parameter, and it traces all accesses to a single file. There are a few enhancements that would make this easier to enable:
- Limit the number of access trace entries stored on disk to avoid unlimited disk usage. This can be done by writing the trace through be/src/util/simple-logger.h rather than to a single file. The number of retained entries should be configurable.
- Implement a way to trace a subset of accesses (e.g. 5%)
- Optionally provide a way to start/stop logging without restart (e.g. via the WebUI)
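
As a rough illustration of how the three items could fit together, here is a minimal, self-contained C++ sketch. It is not Impala code: the class name SampledTrace, its methods, and its constructor parameters are all hypothetical, and an in-memory deque stands in for the on-disk log so the example stays runnable without assuming anything about the simple-logger.h API. It shows a runtime on/off switch, probabilistic sampling of accesses, and a cap on the number of retained entries.

```cpp
#include <atomic>
#include <cstddef>
#include <deque>
#include <mutex>
#include <random>
#include <string>
#include <utility>

// Hypothetical helper (not part of Impala). Decides whether an access is
// traced and retains only the most recent 'max_entries' trace entries.
class SampledTrace {
 public:
  // 'sample_fraction' is in [0.0, 1.0]; e.g. 0.05 traces roughly 5% of accesses.
  SampledTrace(double sample_fraction, std::size_t max_entries, unsigned seed = 42)
    : sample_fraction_(sample_fraction), max_entries_(max_entries), rng_(seed) {}

  // Tracing starts disabled. This can be flipped at runtime without a restart,
  // for example from a web page handler.
  void SetEnabled(bool enabled) { enabled_.store(enabled, std::memory_order_relaxed); }
  bool enabled() const { return enabled_.load(std::memory_order_relaxed); }

  // Samples 'json_entry' with probability 'sample_fraction' while tracing is
  // enabled. Returns true if the entry was sampled.
  bool MaybeTrace(std::string json_entry) {
    if (!enabled()) return false;
    std::lock_guard<std::mutex> l(lock_);
    // dist_ yields values in [0, 1), so a fraction of 1.0 samples every access
    // and a fraction of 0.0 samples none.
    if (dist_(rng_) >= sample_fraction_) return false;
    entries_.push_back(std::move(json_entry));
    // Evict the oldest entries once the retention limit is exceeded.
    while (entries_.size() > max_entries_) entries_.pop_front();
    return true;
  }

  // Number of entries currently retained.
  std::size_t size() const {
    std::lock_guard<std::mutex> l(lock_);
    return entries_.size();
  }

 private:
  const double sample_fraction_;
  const std::size_t max_entries_;
  std::atomic<bool> enabled_{false};
  mutable std::mutex lock_;
  std::mt19937 rng_;
  std::uniform_real_distribution<double> dist_{0.0, 1.0};
  std::deque<std::string> entries_;
};
```

In a real implementation the deque would be replaced by the rotating on-disk logger, and the sampling fraction and retention limit would presumably be exposed as startup parameters alongside data_cache_enable_tracing. A single mutex is used here for simplicity; a hot path like the data cache would likely want a cheaper sampling decision, such as a per-thread random number generator.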