Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Version/s: 3.2.0, 4.0.0
Description
Per gopalv's recommendation, I tried running streaming ingest with and without ZLIB compression. The numbers are as follows:
Compression: NONE
Total rows committed: 93800000
Throughput: 1563333 rows/second
$ hdfs dfs -du -s -h /apps/hive/warehouse/prasanth.db/culvert
14.1 G /apps/hive/warehouse/prasanth.db/culvert
Compression: ZLIB
Total rows committed: 92100000
Throughput: 1535000 rows/second
$ hdfs dfs -du -s -h /apps/hive/warehouse/prasanth.db/culvert
7.4 G /apps/hive/warehouse/prasanth.db/culvert
ZLIB gets us roughly 2x compression (14.1 G vs. 7.4 G on disk) at a cost of only about 2% lower throughput. We should enable ZLIB by default for streaming ingest.
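For anyone who wants to check the ratios, here is a quick sketch in Python using only the figures from the two runs above. The names `runs` and `compare` are illustrative and not part of Hive. Because the two runs committed slightly different row counts, the sketch also computes a per-row size ratio, which is arguably the fairer comparison.

```python
# Figures copied from the two runs reported in this issue.
runs = {
    "NONE": {"rows": 93_800_000, "rows_per_sec": 1_563_333, "size_g": 14.1},
    "ZLIB": {"rows": 92_100_000, "rows_per_sec": 1_535_000, "size_g": 7.4},
}


def compare(baseline, candidate):
    """Return (size ratio, per-row size ratio, throughput drop in percent)."""
    # Raw on-disk ratio, as reported by `hdfs dfs -du -s -h`.
    size_ratio = baseline["size_g"] / candidate["size_g"]
    # Normalize by rows committed, since the runs wrote different row counts.
    per_row_ratio = (baseline["size_g"] / baseline["rows"]) / (
        candidate["size_g"] / candidate["rows"]
    )
    # Relative throughput loss of the candidate versus the baseline.
    drop_pct = 100 * (1 - candidate["rows_per_sec"] / baseline["rows_per_sec"])
    return size_ratio, per_row_ratio, drop_pct


size_ratio, per_row_ratio, drop_pct = compare(runs["NONE"], runs["ZLIB"])
print(f"on-disk size ratio:  {size_ratio:.2f}x")
print(f"per-row size ratio:  {per_row_ratio:.2f}x")
print(f"throughput drop:     {drop_pct:.1f}%")
```

This works out to a little over 1.9x smaller on disk (slightly less when normalized per row) and a throughput drop just under 2%, consistent with the summary above.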