Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Versions: 0.90.0, 0.90.1, 0.92.0
Description
We ran into an issue today while using HFileOutputFormat to perform an incremental load on an already-large cluster. Because bulk-loaded files don't have a sequence ID, they are placed at the front of the StoreFile list. This resulted in the following StoreFile ordering:
2GB (bulk) => 25GB => 2GB => ...
This triggered a 30+GB major compaction for every single region. Ideally, bulk-imported files would be placed in the compaction list according to their time of insertion, so that the resulting compaction is much smaller and major compactions are triggered by StoreFile age instead.
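To make the ordering effect concrete, here is a minimal Python sketch of the scenario described above. It is not HBase code: the `StoreFile` tuple, the helper names, the ratio of 1.2, and the minimum of 2 files are all assumptions chosen for illustration, and the selection rule is a simplified ratio-based model (skip the oldest file while it is larger than the ratio times the total size of all newer files).

```python
from typing import List, NamedTuple, Optional


class StoreFile(NamedTuple):
    name: str
    size_gb: float
    seq_id: Optional[int]  # None models a bulk-loaded file with no sequence ID


def order_by_seq_id(files: List[StoreFile]) -> List[StoreFile]:
    # Files without a sequence ID sort as -1, i.e. ahead of every flushed file.
    return sorted(files, key=lambda f: -1 if f.seq_id is None else f.seq_id)


def select_for_compaction(ordered: List[StoreFile],
                          ratio: float = 1.2,
                          min_files: int = 2) -> List[StoreFile]:
    # Simplified model (assumption, not the HBase implementation):
    # walk from the front of the list and skip a file while it is larger
    # than ratio * (total size of all files after it).
    start = 0
    while len(ordered) - start >= min_files:
        newer = sum(f.size_gb for f in ordered[start + 1:])
        if ordered[start].size_gb <= ratio * newer:
            break
        start += 1
    selected = ordered[start:]
    return selected if len(selected) >= min_files else []


# Case 1: the bulk file has no sequence ID, so it lands at the front.
reported = order_by_seq_id([
    StoreFile("big", 25, 100),
    StoreFile("small", 2, 200),
    StoreFile("bulk", 2, None),
])
reported_pick = select_for_compaction(reported)

# Case 2: the bulk file is ordered by insertion time (hypothetical seq_id 300).
proposed = order_by_seq_id([
    StoreFile("big", 25, 100),
    StoreFile("small", 2, 200),
    StoreFile("bulk", 2, 300),
])
proposed_pick = select_for_compaction(proposed)

print([f.name for f in reported_pick], sum(f.size_gb for f in reported_pick))
print([f.name for f in proposed_pick], sum(f.size_gb for f in proposed_pick))
```

In this model, the small bulk file at the front never exceeds the ratio threshold, so nothing is skipped and every file in the store is selected, which amounts to a major compaction. With the bulk file ordered by insertion time, the 25GB file is skipped and only the two small files are compacted.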
Issue Links
- is related to: HBASE-3690 Option to Exclude Bulk Import Files from Minor Compaction (Closed)
- relates to: HBASE-1923 Bulk incremental load into an existing table (Closed)
- requires: HBASE-2856 TestAcidGuarantee broken on trunk (Closed)