1. Big data test. Take a look at ql/src/test/queries/clientpositive/groupby_bigdata.q to see how we generate big data sets.
Done. Added rcfile_bigdata.q and added some test code to TestRCFile.

2. Complex column types: take a look at ql/src/test/queries/clientpositive/input_lazyserde.q.
Done. Added input_columnarserde.q.

3. ObjectInspectorFactory.getColumnarStructObjectInspector: I think you don't need the byte separator and the boolean lastColumnTakesRest. Just remove them.
Done.

4. ColumnarStruct.init: can you cache/reuse the ByteArrayRef instead of doing ByteArrayRef br = new ByteArrayRef() every time?
Done.

5. ColumnarStruct: the comments should mention that the difference from LazyStruct is that it reads its data through init(BytesRefArrayWritable cols).
Done.

6. Can you put all the changes to the serde2.lazy package into a new package called serde2.columnar?
Done.

7. There seems to be a lot of code shared between LazySimpleSerDe and ColumnarSerDe, e.g. much of the functionality in init and serialize. Can you refactor LazySimpleSerDe and move the common functionality into public static methods that ColumnarSerDe can call directly? You might also want to put the configuration of LazySimpleSerDe (nullString, separators, etc.) into a public static class that those static methods return.
Done.

8. RCFile.readFields is not very efficient (see below). I think we should lazily decompress the stream instead of decompressing all of it and returning the decompressor. The reason is that the decompressed data can be very big and can easily run out of memory (if we assume a 1:10 or higher compression ratio).
Done, by adding a new plain (uncompressed) column length field to the file, which lets us change
while (deflatFilter.available() > 0) valBuf.write(valueIn, 1);
to something like
valBuf.write(valueIn, columnPlainLen);

9. Lazy deserialization/decompression.
Not done, because lazy decompression appears to be much less efficient than the bulk decompression we currently use.

10. Column hinting.
Not done.
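The refactoring asked for in item 7 can be sketched as below. This is a minimal illustration, not the actual patch: the names SerDeParameters and initSerdeParams, the property keys, and the defaults are assumptions made for the example.

```java
import java.util.Properties;

// Sketch: shared SerDe configuration pulled into a public static class,
// built by a public static method, so both LazySimpleSerDe and
// ColumnarSerDe can reuse the same init logic.
public class LazySerDeSketch {

  // Public static holder for the configuration shared by both SerDes.
  public static class SerDeParameters {
    String nullString;
    byte[] separators;
    boolean lastColumnTakesRest;
  }

  // Public static factory: each SerDe calls this from its own init().
  public static SerDeParameters initSerdeParams(Properties tbl) {
    SerDeParameters p = new SerDeParameters();
    // Property names and defaults here are illustrative assumptions.
    p.nullString = tbl.getProperty("serialization.null.format", "\\N");
    String sep = tbl.getProperty("field.delim", "\001");
    p.separators = new byte[] { (byte) sep.charAt(0) };
    p.lastColumnTakesRest = "true".equalsIgnoreCase(
        tbl.getProperty("serialization.last.column.takes.rest", "false"));
    return p;
  }

  public static void main(String[] args) {
    SerDeParameters p = initSerdeParams(new Properties());
    System.out.println(p.nullString);
  }
}
```

The point of the design is that the configuration object is built once and passed to the shared static serialize/deserialize helpers, so ColumnarSerDe carries no copy of the parsing logic.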
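The fix in item 8 can be illustrated with plain java.util.zip streams: once the column's uncompressed length is stored in the file, the reader can issue one bulk read of exactly that many bytes instead of polling available() and copying a byte at a time. The class and method names below are hypothetical; only the idea (length-prefixed bulk read) comes from the review.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

public class ColumnReadSketch {

  // New style: the column's plain length is known up front, so we can
  // fill a buffer of exactly columnPlainLen bytes with bulk reads.
  static byte[] readByPlainLength(InputStream valueIn, int columnPlainLen)
      throws IOException {
    byte[] buf = new byte[columnPlainLen];
    int off = 0;
    while (off < columnPlainLen) {
      int n = valueIn.read(buf, off, columnPlainLen - off);
      if (n < 0) throw new EOFException("column truncated");
      off += n;
    }
    return buf;
  }

  public static void main(String[] args) throws IOException {
    // Compress a small payload, then read it back using the stored length.
    byte[] plain = "hello columnar world".getBytes("UTF-8");
    ByteArrayOutputStream compressed = new ByteArrayOutputStream();
    DeflaterOutputStream dos = new DeflaterOutputStream(compressed);
    dos.write(plain);
    dos.close();
    InputStream deflatFilter = new InflaterInputStream(
        new ByteArrayInputStream(compressed.toByteArray()));
    byte[] out = readByPlainLength(deflatFilter, plain.length);
    System.out.println(new String(out, "UTF-8")); // hello columnar world
  }
}
```

Beyond avoiding byte-at-a-time copies, knowing the plain length also lets the reader size the destination buffer exactly, which is what keeps very large decompressed columns from forcing repeated buffer growth.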
11. Compress each column directly, i.e. keep one codec per column and write data straight to that column's compression stream. Currently RCFile buffers all the data first; when the buffered data grows beyond a configured size, it compresses each column separately and flushes them out. The direct compression strategy can increase the compression ratio. (This is not related to the severe read performance degradation problem.)
Done.

12. Merge consecutive skips into a single skip; that increases the number of bytes skipped per call and increases the probability of not executing the statements in the above if block.
Done.
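The per-column compression in item 11 can be sketched with one java.util.zip stream per column; the real RCFile writer would use a Hadoop CompressionCodec per column, so the class below is only an assumed stand-in for the idea.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;

// Sketch of item 11: keep one compression stream per column and write
// each value straight into its column's stream, instead of buffering
// plain bytes and compressing whole columns at flush time.
public class PerColumnCompressSketch {
  private final ByteArrayOutputStream[] colBufs;
  private final DeflaterOutputStream[] colStreams;

  PerColumnCompressSketch(int numColumns) {
    colBufs = new ByteArrayOutputStream[numColumns];
    colStreams = new DeflaterOutputStream[numColumns];
    for (int i = 0; i < numColumns; i++) {
      colBufs[i] = new ByteArrayOutputStream();
      colStreams[i] = new DeflaterOutputStream(colBufs[i]);
    }
  }

  // Each value goes directly into its column's codec stream.
  void append(int column, byte[] value) throws IOException {
    colStreams[column].write(value);
  }

  // On flush, finish every column stream and emit the compressed bytes.
  byte[][] flush() throws IOException {
    byte[][] out = new byte[colStreams.length][];
    for (int i = 0; i < colStreams.length; i++) {
      colStreams[i].finish();
      out[i] = colBufs[i].toByteArray();
    }
    return out;
  }
}
```

Because each codec sees the whole column's byte run as one continuous stream, its dictionary/window is never reset between values, which is where the improved compression ratio comes from.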
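The skip coalescing in item 12 can be sketched generically: rather than issuing one skip() per unread column, accumulate the lengths of a run of unread columns and issue a single larger skip when the next needed column is reached. The method and array names are hypothetical; the "if block" the review refers to is in the actual RCFile reader and is not reproduced here.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SkipCoalesceSketch {
  // colLens[i] is the stored byte length of column i in the current row
  // group; needed[i] says whether the query actually reads that column.
  static byte[][] readColumns(InputStream in, int[] colLens, boolean[] needed)
      throws IOException {
    byte[][] result = new byte[colLens.length][];
    long pendingSkip = 0; // accumulated length of a contiguous run of skips
    for (int i = 0; i < colLens.length; i++) {
      if (!needed[i]) {
        pendingSkip += colLens[i]; // coalesce: don't skip yet
        continue;
      }
      while (pendingSkip > 0) {    // one bulk skip for the whole run
        pendingSkip -= in.skip(pendingSkip);
      }
      byte[] buf = new byte[colLens[i]];
      new DataInputStream(in).readFully(buf);
      result[i] = buf;
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = {1, 2, 3, 4, 5, 6};
    byte[][] r = readColumns(new ByteArrayInputStream(data),
        new int[] {2, 2, 2}, new boolean[] {false, false, true});
    System.out.println(r[2][0] + "," + r[2][1]); // 5,6
  }
}
```

With coalescing, skipping ten adjacent unread columns costs one skip() call instead of ten, and the larger skip distance makes it more likely the underlying stream can reposition without reading.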