The mapreduce package provides two Reducer implementations, KeyValueSortReducer and PutSortReducer, which are used by Import, ImportTsv, and WALPlayer in conjunction with the HFileOutputFormat. Both of these implementations make use of a TreeSet to sort values matching a key. This reducer will OOM when rows are large.
A better solution would be to implement secondary sort of the values. That way hadoop sorts the records, spilling to disk when necessary.