Details
- Type: New Feature
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- 3.1.0
- None
- None
- Reviewed
Description
Currently the ACID compactor is implemented as a generated MR job (CompactorMR.java).
It could also be expressed as a Hive query that reads from a given partition and writes the data back to the same partition, merging the deltas and applying the delete events. The simplest approach would be to use INSERT OVERWRITE, but that would change all ROW__IDs, which we don't want.
We need to implement this in a way that preserves ROW__IDs and creates a new base_x directory to handle major compaction.
Minor compaction will be investigated separately.
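The intended semantics of major compaction described above can be illustrated with a small in-memory model. This is only a sketch and not Hive's implementation: the `RowId` type, the `major_compact` function, and the string payloads are hypothetical names chosen for illustration, and real compaction operates on ORC files in base and delta directories rather than Python lists. It shows the one property the description asks for: surviving rows keep their original ROW__IDs, while rows named by delete events are dropped.

```python
from typing import Iterable, List, NamedTuple, Tuple


class RowId(NamedTuple):
    """Hypothetical stand-in for Hive's ROW__ID (write id, bucket, row id)."""
    write_id: int
    bucket: int
    row_id: int


# A row is (ROW__ID, payload); the payload stands in for the user columns.
Row = Tuple[RowId, str]


def major_compact(
    base: Iterable[Row],
    insert_deltas: Iterable[Iterable[Row]],
    delete_deltas: Iterable[Iterable[RowId]],
) -> List[Row]:
    """Merge base and insert deltas, then apply delete events.

    The output models the contents of a new base directory: every
    surviving row carries the same ROW__ID it had before compaction.
    """
    # Collect every ROW__ID that some delete event refers to.
    deleted = {rid for delta in delete_deltas for rid in delta}

    # Merge the existing base with all insert deltas.
    merged: List[Row] = list(base)
    for delta in insert_deltas:
        merged.extend(delta)

    # Drop deleted rows without rewriting the identifiers of the rest.
    survivors = [(rid, payload) for rid, payload in merged if rid not in deleted]

    # Sort by ROW__ID so the result is deterministic (an assumption of this sketch).
    survivors.sort(key=lambda row: row[0])
    return survivors


base = [(RowId(1, 0, 0), "a"), (RowId(1, 0, 1), "b")]
insert_deltas = [[(RowId(2, 0, 0), "c")]]
delete_deltas = [[RowId(1, 0, 1)]]

new_base = major_compact(base, insert_deltas, delete_deltas)
```

By contrast, a plain INSERT OVERWRITE would correspond to assigning fresh identifiers to every surviving row, which is exactly what this approach avoids.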
Attachments
Issue Links
- depends upon
  - HIVE-20738 Enable Delete Event filtering in VectorizedOrcAcidRowBatchReader (Closed)
- is blocked by
  - ORC-437 Make acid schema checks case insensitive (Closed)
- is related to
  - HIVE-20823 Make Compactor run in a transaction (Closed)
- is required by
  - HIVE-21165 ACID: pass query hint to the writers to write hive.acid.key.index (Open)
  - HIVE-21164 ACID: explore how we can avoid a move step during inserts/compaction (Closed)
- relates to
  - HIVE-20934 ACID: Query based compactor for minor compaction (Closed)
- links to