Description
ORC (Optimized Row Columnar) is a popular open source columnar file format that is adopted by
several major components of the Hadoop ecosystem and is widely used. Supporting ORC storage in
HAWQ has two advantages. First, it makes HAWQ more Hadoop native, so it can interact with other
components more easily. Second, ORC stores metadata (such as column statistics) that can be used
for query optimization, so once it is available it could potentially outperform the two native
formats (AO and Parquet).
The implementation can be based on the framework proposed in HAWQ-786.
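
To illustrate why the metadata stored by ORC can help query optimization, here is a minimal sketch of range-based pruning: a reader consults per-stripe minimum and maximum values for a column and skips stripes that cannot match a filter. The names `StripeStats` and `stripes_to_read` are hypothetical and used only for illustration; they are not HAWQ or ORC library APIs, and the statistics are made-up values rather than data read from a real file.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StripeStats:
    """Hypothetical min/max statistics for one integer column in one stripe."""
    stripe_id: int
    min_value: int
    max_value: int


def stripes_to_read(stats: List[StripeStats], lower: int, upper: int) -> List[int]:
    """Return ids of stripes whose [min, max] range overlaps [lower, upper].

    Stripes whose range does not overlap the filter cannot contain matching
    rows, so a reader could skip them without decoding their data.
    """
    return [
        s.stripe_id
        for s in stats
        if s.max_value >= lower and s.min_value <= upper
    ]


# Made-up statistics for three stripes of a single column.
stats = [
    StripeStats(0, 1, 100),
    StripeStats(1, 101, 200),
    StripeStats(2, 201, 300),
]

# Filter equivalent to: WHERE col BETWEEN 150 AND 180
selected = stripes_to_read(stats, 150, 180)
print(selected)
```

With these assumed statistics, only one of the three stripes overlaps the filter range, so the other two would not need to be read at all.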