Details
- Improvement
- Status: Resolved
- Critical
- Resolution: Fixed
- 0.5.0
- None
Description
Currently, we calculate the scan size from a pseudo-estimated value that does not take the actual size of the data into account.
HBase provides a way to retrieve the actual data size of each region through org.apache.hadoop.hbase.client.HBaseAdmin.getClusterStatus(). We can use this to approximate the size of a scan and to improve scan parallelization.
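As a rough illustration of the idea, the sketch below sums the sizes of the regions whose key ranges overlap a scan's [start, stop) range. It is not the actual Drill change: `ScanSizeEstimator`, `RegionSize`, and `estimateScanSizeMB` are hypothetical names, keys are modeled as Java strings rather than byte arrays, and the per-region sizes are assumed to have been collected beforehand (in the HBase client API of that era this would plausibly mean walking the `ClusterStatus` returned by `getClusterStatus()` down to each region's store file size, but that step is left out here).

```java
import java.util.List;

// Hypothetical helper: estimates a scan's size from per-region sizes.
final class ScanSizeEstimator {

    // Hypothetical stand-in for one region: key range [startKey, endKey) and its size.
    // An empty startKey means "start of table"; an empty endKey means "end of table".
    static final class RegionSize {
        final String startKey;
        final String endKey;
        final long sizeMB;

        RegionSize(String startKey, String endKey, long sizeMB) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.sizeMB = sizeMB;
        }
    }

    // True if the region's key range intersects the scan range [scanStart, scanStop).
    // Empty scan bounds are treated as unbounded, mirroring HBase's convention.
    static boolean overlaps(RegionSize r, String scanStart, String scanStop) {
        boolean regionEndsAfterScanStart =
            r.endKey.isEmpty() || scanStart.isEmpty() || r.endKey.compareTo(scanStart) > 0;
        boolean regionStartsBeforeScanStop =
            scanStop.isEmpty() || r.startKey.compareTo(scanStop) < 0;
        return regionEndsAfterScanStart && regionStartsBeforeScanStop;
    }

    // Upper-bound estimate: a partially covered region is counted in full.
    static long estimateScanSizeMB(List<RegionSize> regions, String scanStart, String scanStop) {
        long total = 0;
        for (RegionSize r : regions) {
            if (overlaps(r, scanStart, scanStop)) {
                total += r.sizeMB;
            }
        }
        return total;
    }
}
```

Counting a partially covered region in full keeps the estimate simple and errs on the high side; a finer estimate would need per-region key distribution, which the cluster status does not provide. The per-region sizes that feed the total could also serve as weights when assigning scan ranges to parallel fragments.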
Attachments
Issue Links
- is duplicated by
- DRILL-535 HbaseGroupScan should return more accurate size information.
- Resolved