Details
Type: New Feature
Status: In Progress
Priority: Major
Resolution: Unresolved
Description
Currently, Tajo does not manage partitions directly. In Tajo, each partition is just a directory. For each query, the logical planner traverses the directories in HDFS that match the partition predicates.
This approach is inefficient, especially in environments where the number of partitions is very large. It also makes partition management hard.
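To make the cost concrete, here is a minimal sketch of directory-based partition discovery. It is not Tajo's planner code: it uses the local filesystem as a stand-in for HDFS, assumes partition directories are named `key=value` (a Hive-style layout not stated in this issue), and the function names are hypothetical. The point is that every directory under the table root is visited before any predicate can be applied.

```python
import os
import tempfile


def parse_partition_path(rel_path):
    """Turn 'key1=a/key2=b' into {'key1': 'a', 'key2': 'b'}.

    Assumes the hypothetical key=value directory naming described above.
    """
    spec = {}
    for part in rel_path.split(os.sep):
        if "=" in part:
            key, value = part.split("=", 1)
            spec[key] = value
    return spec


def find_matching_partitions(table_root, predicate):
    """Walk the whole table directory and keep leaf partitions that match.

    The walk touches every directory regardless of how selective the
    predicate is, which is the inefficiency the description points at.
    """
    matches = []
    for dirpath, dirnames, _ in os.walk(table_root):
        if dirnames:
            continue  # only leaf directories are complete partitions
        rel = os.path.relpath(dirpath, table_root)
        spec = parse_partition_path(rel)
        if spec and predicate(spec):
            matches.append(rel)
    return sorted(matches)


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    for p in ("year=2013/month=12", "year=2014/month=01", "year=2014/month=02"):
        os.makedirs(os.path.join(root, *p.split("/")))
    print(find_matching_partitions(root, lambda s: s["year"] == "2014"))
```

With thousands of partitions, the walk itself dominates planning time even when the predicate selects a single directory.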
Tajo should manage partitions directly by using ALTER TABLE ADD/DROP PARTITION statements. Partition entries should be stored in the underlying database that the catalog uses.
Synopsis of ALTER TABLE ADD/DROP PARTITION
ALTER TABLE table_name [IF NOT EXISTS] ADD COLUMN PARTITION (key1 = 'val1', key2 = 'val2', ...) WITH ('prop_key' = 'prop_val', ...) LOCATION '...';
ALTER TABLE table_name [IF EXISTS] DROP COLUMN PARTITION (key1 [=|<|<=|>|>=|!=] 'val1', key2 ..., ...);
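As a rough illustration of the proposal, the sketch below stores partition entries in a database and resolves predicates with a query instead of a directory walk. Everything in it is an assumption made for illustration: the two-table schema, the function names, the use of SQLite in place of whatever database the catalog is backed by, and the example locations. It is not Tajo's catalog schema. Key values are compared as text here.

```python
import sqlite3

# Comparison operators accepted by the DROP synopsis above.
ALLOWED_OPS = {"=", "<", "<=", ">", ">=", "!="}


def create_catalog():
    """Create a hypothetical in-memory catalog with partition tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE partitions ("
        " table_name TEXT, partition_name TEXT, location TEXT,"
        " PRIMARY KEY (table_name, partition_name))"
    )
    conn.execute(
        "CREATE TABLE partition_keys ("
        " table_name TEXT, partition_name TEXT,"
        " key_name TEXT, key_value TEXT)"
    )
    return conn


def add_partition(conn, table, spec, location, if_not_exists=False):
    """Roughly what ADD PARTITION would record: one entry plus its keys."""
    name = "/".join(f"{k}={v}" for k, v in spec.items())
    exists = conn.execute(
        "SELECT 1 FROM partitions WHERE table_name = ? AND partition_name = ?",
        (table, name),
    ).fetchone()
    if exists:
        if if_not_exists:
            return name
        raise ValueError(f"partition already exists: {name}")
    conn.execute("INSERT INTO partitions VALUES (?, ?, ?)", (table, name, location))
    conn.executemany(
        "INSERT INTO partition_keys VALUES (?, ?, ?, ?)",
        [(table, name, k, str(v)) for k, v in spec.items()],
    )
    return name


def find_partitions(conn, table, key, op, value):
    """Resolve a single-key predicate with a catalog query."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operator: {op}")
    return conn.execute(
        "SELECT p.partition_name, p.location FROM partitions p"
        " JOIN partition_keys k ON p.table_name = k.table_name"
        " AND p.partition_name = k.partition_name"
        f" WHERE p.table_name = ? AND k.key_name = ? AND k.key_value {op} ?"
        " ORDER BY p.partition_name",
        (table, key, str(value)),
    ).fetchall()


def drop_partitions(conn, table, key, op, value):
    """Roughly what DROP PARTITION would do: remove all matching entries."""
    names = [name for name, _ in find_partitions(conn, table, key, op, value)]
    for name in names:
        args = (table, name)
        conn.execute(
            "DELETE FROM partition_keys WHERE table_name = ? AND partition_name = ?", args
        )
        conn.execute(
            "DELETE FROM partitions WHERE table_name = ? AND partition_name = ?", args
        )
    return names
```

In this sketch a predicate lookup costs one indexed query rather than a traversal of every partition directory, and adding or dropping a partition is an explicit catalog operation rather than a side effect of the directory layout.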