Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Description
Bulk load flushes rows to HFile in batches of HBASE_ROWSET_VSBB_SIZE rows. The default value for this CQD is 1024. A flush size of 1024 rows is small, particularly for narrow tables like TPC-H lineitem (~150 bytes per row), where it amounts to only about 150 KB per flush.
Increasing the flush size to 10,000 or 20,000 rows improved performance by more than 100%. Please have the code determine a better flush size for a given table, rather than relying on a single fixed default.
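One possible way to derive a per-table flush size is to work from a byte budget instead of a fixed row count. The sketch below is illustrative only and is not taken from the actual fix: the function name `choose_flush_rows`, the 2 MiB budget, and the clamping bounds are all assumptions, chosen so that a ~150-byte row lands inside the 10,000 to 20,000 row range reported above.

```python
# Hypothetical sketch: pick a bulk-load flush size (in rows) from the row width.
# None of these names or constants come from the actual code base.

DEFAULT_FLUSH_ROWS = 1024              # current HBASE_ROWSET_VSBB_SIZE default (from the description)
MAX_FLUSH_ROWS = 20_000                # upper end of the range tried in the description
TARGET_FLUSH_BYTES = 2 * 1024 * 1024   # assumed per-flush byte budget (2 MiB)


def choose_flush_rows(avg_row_bytes: int) -> int:
    """Return a flush size in rows for a table with the given average row width."""
    if avg_row_bytes <= 0:
        raise ValueError("avg_row_bytes must be positive")
    # Rows that fit in the byte budget, clamped to [DEFAULT_FLUSH_ROWS, MAX_FLUSH_ROWS].
    rows = TARGET_FLUSH_BYTES // avg_row_bytes
    return max(DEFAULT_FLUSH_ROWS, min(MAX_FLUSH_ROWS, rows))


if __name__ == "__main__":
    for width in (50, 150, 4096):
        print(width, choose_flush_rows(width))
```

With these assumed constants, narrow tables get larger batches and wide tables fall back to the existing default, so no table flushes in smaller batches than it does today. The right budget and bounds would need to be measured against memory use on the loading process.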