Details
- Type: Sub-task
- Status: Closed
- Priority: Major
- Resolution: Not A Problem
Description
The mapreduce package provides two Reducer implementations, KeyValueSortReducer and PutSortReducer, which are used by Import, ImportTsv, and WALPlayer in conjunction with HFileOutputFormat. Both implementations collect all values for a key into a TreeSet in order to sort them, so the whole row must fit in the reducer's heap. As a result, these reducers can run out of memory (OOM) when rows are large.
A better solution would be to implement a secondary sort of the values. That way Hadoop sorts the records itself, spilling to disk when necessary.
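The secondary-sort idea can be sketched in plain Java, without any Hadoop or HBase classes. This is only an illustration under stated assumptions, not the reducers' actual code and not a proposed patch. `CellKey`, `SORT_ORDER`, `sameGroup`, and `run` are hypothetical names. The in-memory `List.sort` call stands in for the framework's shuffle sort, which is the part that can spill to disk in a real job. In an actual MapReduce job the same roles would typically be filled by a composite key type plus `Job.setSortComparatorClass`, `Job.setGroupingComparatorClass`, and a partitioner that hashes only the row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Plain-Java simulation of secondary sort; no Hadoop classes are used. */
class SecondarySortSketch {

    /** Hypothetical composite key: the row plus the field to sort by within the row. */
    static final class CellKey {
        final String row;
        final String qualifier;

        CellKey(String row, String qualifier) {
            this.row = row;
            this.qualifier = qualifier;
        }
    }

    /** Sort order: by row first, then by qualifier within a row. */
    static final Comparator<CellKey> SORT_ORDER =
            Comparator.comparing((CellKey k) -> k.row)
                      .thenComparing((CellKey k) -> k.qualifier);

    /** Grouping rule: keys with the same row belong to the same reduce call. */
    static boolean sameGroup(CellKey a, CellKey b) {
        return a.row.equals(b.row);
    }

    /**
     * Sorts the map output by the composite key, then streams it once.
     * The "reducer" part (the loop) keeps only the previous key, so its
     * memory use does not grow with the number of cells in a row.
     */
    static List<String> run(List<CellKey> mapOutput) {
        List<CellKey> sorted = new ArrayList<>(mapOutput);
        sorted.sort(SORT_ORDER); // stand-in for the framework's shuffle sort

        List<String> written = new ArrayList<>();
        CellKey previous = null;
        for (CellKey key : sorted) {
            if (previous == null || !sameGroup(previous, key)) {
                written.add("BEGIN " + key.row); // a new row (reduce group) starts
            }
            written.add(key.row + ":" + key.qualifier); // cells arrive already ordered
            previous = key;
        }
        return written;
    }

    public static void main(String[] args) {
        List<CellKey> mapOutput = Arrays.asList(
                new CellKey("r2", "a"),
                new CellKey("r1", "b"),
                new CellKey("r1", "a"));
        for (String line : run(mapOutput)) {
            System.out.println(line);
        }
    }
}
```

The design point is that the ordering work moves from a per-row TreeSet inside the reducer into the key comparator, so the reducer only has to write out what it receives.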
Issue Links
- is duplicated by
  - HBASE-14339 HBase Bulk Load and super wide rows (Closed)
- is related to
  - HBASE-13897 OOM may occur when Import imports a row with too many KeyValues (Closed)
  - HBASE-14833 HFileOutputFormat2 should allow for custom reducer logic (Closed)