Description
Here are results of DateTimeRebaseBenchmark on the current master branch:
Save timestamps to ORC: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------ after 1582 59877 59877 0 1.7 598.8 0.0X before 1582 61361 61361 0 1.6 613.6 0.0X Load timestamps from ORC: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------ after 1582, vec off 48197 48288 118 2.1 482.0 1.0X after 1582, vec on 38247 38351 128 2.6 382.5 1.3X before 1582, vec off 53179 53359 249 1.9 531.8 0.9X before 1582, vec on 44076 44268 269 2.3 440.8 1.1X
The results of the same benchmark on Spark 2.4.6-SNAPSHOT:
Save timestamps to ORC: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------ after 1582 18858 18858 0 5.3 188.6 1.0X before 1582 18508 18508 0 5.4 185.1 1.0X Load timestamps from ORC: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------ after 1582, vec off 14063 14177 143 7.1 140.6 1.0X after 1582, vec on 5955 6029 100 16.8 59.5 2.4X before 1582, vec off 14119 14126 7 7.1 141.2 1.0X before 1582, vec on 5991 6007 25 16.7 59.9 2.3X
Here is the PR with DateTimeRebaseBenchmark backported to 2.4: https://github.com/MaxGekk/spark/pull/27
Attachments
Issue Links
- links to