Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
This issue was found during a sanity test run: the sum of the row counts across all guideposts did not match the actual number of rows in the table.
The DefaultStatisticsCollector#collectStatistics() method iterates over a list of cells and tracks the accumulated size of the KeyValues. When that size exceeds the guidepost width, it adds an entry to GuidePostsInfo through GuidePostsInfoBuilder#addGuidePostOnCollection().
However, for the last batch of rows, which does not cross the GUIDE_POSTS_WIDTH threshold, the code never creates an entry through the builder, so those rows are missing from the guidepost totals. Ideally, we would cover this case by adding a small final guidepost with the corresponding row key and the size of that guidepost, since both can be persisted to the SYSTEM.STATS table. This is reasonable because GUIDE_POSTS_WIDTH is only an estimate (best effort) for how data is distributed.
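The mismatch described above can be reproduced with a simplified model. The sketch below is not Phoenix code: GuidePostSketch, GuidePost, collect, totalRows, and the flushTrailing flag are hypothetical names invented for illustration, and row sizes are passed in directly instead of being derived from cells. It only shows how rows accumulated after the last threshold crossing drop out of the totals unless a final, smaller guidepost is emitted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of guidepost collection (not the Phoenix API).
class GuidePostSketch {

    /** One guidepost: the row key that closes the chunk, plus the bytes and rows it covers. */
    static final class GuidePost {
        final String rowKey;
        final long byteCount;
        final long rowCount;

        GuidePost(String rowKey, long byteCount, long rowCount) {
            this.rowKey = rowKey;
            this.byteCount = byteCount;
            this.rowCount = rowCount;
        }
    }

    /**
     * Walks the rows in order and emits a guidepost whenever the accumulated
     * size exceeds guidePostWidth. If flushTrailing is true, the leftover rows
     * that never crossed the threshold get a final, smaller guidepost.
     */
    static List<GuidePost> collect(List<String> rowKeys, List<Long> rowSizes,
                                   long guidePostWidth, boolean flushTrailing) {
        List<GuidePost> guidePosts = new ArrayList<>();
        long bytes = 0;
        long rows = 0;
        String lastKey = null;
        for (int i = 0; i < rowKeys.size(); i++) {
            bytes += rowSizes.get(i);
            rows++;
            lastKey = rowKeys.get(i);
            if (bytes > guidePostWidth) {
                guidePosts.add(new GuidePost(lastKey, bytes, rows));
                bytes = 0;
                rows = 0;
            }
        }
        // Proposed behavior: cover the last batch with a small guidepost.
        if (flushTrailing && rows > 0) {
            guidePosts.add(new GuidePost(lastKey, bytes, rows));
        }
        return guidePosts;
    }

    /** Sums the row counts recorded across all guideposts. */
    static long totalRows(List<GuidePost> guidePosts) {
        long total = 0;
        for (GuidePost gp : guidePosts) {
            total += gp.rowCount;
        }
        return total;
    }
}
```

With five rows of 40 bytes each and a width of 100 bytes, the first guidepost closes after the third row. Without the trailing flush, the guideposts account for only three of the five rows; with it, a second guidepost covering the remaining two rows (80 bytes) brings the total back to five.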