It's similar to HBASE-8768. The only difference is that we need to write a custom mapper and reducer in Phoenix. In the map phase we just need to build the row key from the primary key columns and write the full text of the line as the value, as usual (keying on the row key ensures sorting). In the reducer we get the actual key values by running the upsert query.
This basically reduces the amount of map output that has to be written to disk and the amount of data that has to be transferred over the network.
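
To make the idea concrete, here is a rough sketch of the data flow in plain Java, without the Hadoop or Phoenix classes. Everything in it is made up for illustration: the class and method names, the comma-separated input, the "|" separator between primary key parts, and the "rowKey:COLUMN=value" strings that stand in for real key values. In the actual job the reduce step would run the upsert query instead of splitting the line by hand.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch only: not Phoenix or Hadoop API.
public class RowKeyTextSketch {

    // Map side: build the row key from the primary key columns and
    // pass the whole input line through unchanged as the value.
    static Map.Entry<String, String> map(String line, int[] pkIndexes) {
        String[] fields = line.split(",", -1);
        StringBuilder key = new StringBuilder();
        for (int i = 0; i < pkIndexes.length; i++) {
            if (i > 0) {
                key.append('|'); // placeholder separator (assumption)
            }
            key.append(fields[pkIndexes[i]]);
        }
        return new SimpleEntry<>(key.toString(), line);
    }

    // Stand-in for the shuffle: group the map output by row key, sorted.
    // Only one small (rowKey, line) pair per input line travels here.
    static TreeMap<String, List<String>> shuffle(List<String> lines, int[] pkIndexes) {
        TreeMap<String, List<String>> sorted = new TreeMap<>();
        for (String line : lines) {
            Map.Entry<String, String> e = map(line, pkIndexes);
            sorted.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        return sorted;
    }

    // Reduce side: only here is the line expanded into one cell per column.
    // In Phoenix this is where the upsert would produce the real key values.
    static List<String> reduce(String rowKey, String line, String[] columnNames) {
        String[] fields = line.split(",", -1);
        List<String> cells = new ArrayList<>();
        for (int i = 0; i < columnNames.length && i < fields.length; i++) {
            cells.add(rowKey + ":" + columnNames[i] + "=" + fields[i]);
        }
        return cells;
    }

    public static void main(String[] args) {
        String[] columns = {"ID", "NAME", "STATE"};
        int[] pk = {0};
        List<String> input = List.of("2,bob,NY", "1,amy,CA");
        for (Map.Entry<String, List<String>> e : shuffle(input, pk).entrySet()) {
            for (String line : e.getValue()) {
                System.out.println(reduce(e.getKey(), line, columns));
            }
        }
    }
}
```

The point of the sketch is that the per-column expansion happens after the shuffle, so the intermediate data is one text line per row rather than one key value per column.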