Description
This JIRA adds ORC (Optimized Row Columnar) file format support to Crunch. Three modes are supported for ORC serialization/deserialization:
1) Orcs.orcs(): uses OrcStruct objects as the deserialized form, for high performance
2) Orcs.reflects(): uses Java reflection so that POJOs can serve as the deserialized objects
3) Orcs.tuples(): uses Crunch Tuples as the deserialized objects, to balance performance and user-friendliness
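
To make the reflection mode more concrete, here is a minimal standalone sketch of the general idea behind it: flattening a POJO into an ordered row of column values and rebuilding it through java.lang.reflect. It does not use Crunch or ORC classes and is not how Orcs.reflects() is implemented. The names Person and ReflectRowSketch are hypothetical, and ordering the columns by field name is an assumption made only so that the example is deterministic.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical POJO used only for this illustration.
class Person {
    String name;
    int age;

    Person() {}  // no-arg constructor so the object can be rebuilt reflectively

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

// Standalone sketch of reflection-based row mapping (not Crunch's code).
final class ReflectRowSketch {
    private ReflectRowSketch() {}

    // Collect the instance fields that act as columns. They are sorted by
    // name (an assumption of this sketch) because getDeclaredFields() does
    // not guarantee any particular order.
    static List<Field> columns(Class<?> clazz) {
        List<Field> fields = new ArrayList<>();
        for (Field f : clazz.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue;
            }
            f.setAccessible(true);
            fields.add(f);
        }
        fields.sort(Comparator.comparing(Field::getName));
        return fields;
    }

    // Flatten a POJO into one value per column.
    static Object[] toRow(Object pojo) {
        List<Field> cols = columns(pojo.getClass());
        Object[] row = new Object[cols.size()];
        try {
            for (int i = 0; i < row.length; i++) {
                row[i] = cols.get(i).get(pojo);
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return row;
    }

    // Rebuild a POJO from a row produced by toRow().
    static <T> T fromRow(Class<T> clazz, Object[] row) {
        List<Field> cols = columns(clazz);
        try {
            Constructor<T> ctor = clazz.getDeclaredConstructor();
            ctor.setAccessible(true);
            T obj = ctor.newInstance();
            for (int i = 0; i < row.length; i++) {
                cols.get(i).set(obj, row[i]);  // boxed values are unwrapped for primitive fields
            }
            return obj;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling ReflectRowSketch.toRow(new Person("Ann", 30)) yields a two-element row, and fromRow(Person.class, row) rebuilds an equivalent Person. The per-field reflective access in both directions is the kind of overhead that plausibly explains why the description pairs the OrcStruct mode, rather than the reflection mode, with high performance.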