Details
Description
Perhaps I'm doing something wrong or is actually working as expected.
Putting 1 million constant int32 values produces an ORC file of 1MB. Surprisingly, 1 million consecutive ints produces a much smaller file.
Code and FileDump attached.
ObjectInspector inspector = ObjectInspectorFactory.getReflectionObjectInspector( Integer.class, ObjectInspectorFactory.ObjectInspectorOptions.JAVA); Writer w = OrcFile.createWriter(new Path("/tmp/my.orc"), OrcFile.writerOptions(new Configuration()) .compress(CompressionKind.NONE) .inspector(inspector) .encodingStrategy(OrcFile.EncodingStrategy.COMPRESSION) .version(OrcFile.Version.V_0_12) ); for (int i = 0; i < 1000000; ++i) { w.addRow(123); } w.close();