Details
Improvement | Status: Open | Minor | Resolution: Unresolved | 0.10.0 | None | None
Description
Creating an HCatalog table with create table followed by alter table add partition is the most commonly used approach. However, incoming files sometimes arrive with a header row containing the column names.
In such cases it would be a good feature to be able to skip the header row(s).
Suggestions below:
hcat "alter table rawevents add partition (ds='20100819') location 'hdfs://data/rawevents/20100819/data' -skip header"
hcat "alter table rawevents add partition (ds='20100819') location 'hdfs://data/rawevents/20100819/data' -skip [n]"
hcat "alter table rawevents add partition (ds='20100819') location 'hdfs://data/rawevents/20100819/data'" -DskipRow=1
Alternatively, a bounded row range could be used to select which rows go into the table:
hcat "alter table rawevents add partition (ds='20100819') location 'hdfs://data/rawevents/20100819/data' -rows[2:]" // from the first data row through the end of the file
hcat "alter table rawevents add partition (ds='20100819') location 'hdfs://data/rawevents/20100819/data' -rows[2:100]" // from the first data row through row 100
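To make the intended semantics of the suggested options concrete, here is a minimal sketch in Python. It assumes row numbers are 1-based and that the upper bound is inclusive, which the suggestions above do not actually specify. The name `select_rows` and the sample data are hypothetical and are not part of hcat or Hive.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional


def select_rows(lines: Iterable[str], start: int = 1,
                stop: Optional[int] = None) -> Iterator[str]:
    """Yield rows start..stop (1-based, inclusive); stop=None means to the end.

    Hypothetical helper illustrating the proposed -skip / -rows options.
    """
    if start < 1:
        raise ValueError("start must be >= 1")
    # islice is 0-based with an exclusive stop, so shift start down by one;
    # a 1-based inclusive stop is the same number as a 0-based exclusive stop.
    return islice(lines, start - 1, stop)


# Sample file contents: one header row followed by three data rows.
sample = ["id,name", "1,alpha", "2,beta", "3,gamma"]

# Equivalent of "-skip header", "-DskipRow=1", or "-rows[2:]".
skip_header = list(select_rows(sample, start=2))

# Equivalent of a bounded range such as "-rows[2:3]".
bounded = list(select_rows(sample, start=2, stop=3))
```

Under this reading, "-skip [n]" is the same as "-rows[n+1:]", so a single row-range option would cover all of the suggested variants.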
Is the correct place for this feature Hive or HCatalog? Or could it be handled in hcat with a -D option?
Thanks
Rekha