Details
Type: New Feature
Status: Resolved
Priority: Major
Resolution: Won't Fix
Description
INSERT OVERWRITE LOCAL DIRECTORY is great at what it does, but the output is never instantly useful. Why? Because it is a whole bunch of gzipped files.
It would be lovely if I could tell Hive what to do with its output when inserting into a local directory, for example by automatically piping that output through a command. Really, I want to run:
gunzip -c *.gz | perl -p -e 's/\cA/\t/g' > filename
...but I would settle for piping every reducer's output through gunzip -c | perl -p -e 's/\cA/\t/g' and then having Hive save the result to whatever it would have used for a filename but without the .gz.
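Until something like this exists, the same result can be had by post-processing the directory by hand. The sketch below is only an illustration of the two variants described above, not a Hive feature. It assumes the part files end in .gz and use Ctrl-A (\001) as the field delimiter, as in the description. The function names flatten_hive_output and convert_each are hypothetical, and tr stands in for the perl substitution.

```shell
#!/bin/sh
# Hypothetical helpers for post-processing a Hive local output directory.
# Assumes gzipped part files (*.gz) with Ctrl-A (\001) field delimiters.

# Variant 1: concatenate all part files into one tab-separated file.
# usage: flatten_hive_output DIR OUTFILE
flatten_hive_output() {
    dir=$1
    out=$2
    # Decompress every part file in glob order, turn Ctrl-A into tab.
    gunzip -c "$dir"/*.gz | tr '\001' '\t' > "$out"
}

# Variant 2: convert each part file separately, writing the result
# next to it under the same name minus the .gz suffix.
# usage: convert_each DIR
convert_each() {
    dir=$1
    for f in "$dir"/*.gz; do
        # Skip the literal pattern if no .gz files matched.
        [ -e "$f" ] || continue
        gunzip -c "$f" | tr '\001' '\t' > "${f%.gz}"
    done
}
```

For example, flatten_hive_output /tmp/hive_out /tmp/result.tsv would produce a single tab-separated file, while convert_each /tmp/hive_out leaves one uncompressed file per reducer beside the originals (both paths are placeholders).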