This issue is highly intermittent and seems to occur only with the Spark engine when the query has a GROUP BY clause. The following is the test case.
The query returns incorrect results once in a while. Observations:
1) Not reproducible with HoMR.
2) Not an issue when running from spark-shell.
3) Not reproducible when the column data type is string or double; only reproducible with decimal data types. It also works fine for decimal columns if you cast the decimal to string on read and cast it back to decimal on select.
4) Occurs with both Parquet and text file formats (haven't tried other formats).
5) Occurs whether the table data is inside or outside an encryption zone.
6) Even on clusters where this is reproducible, it occurs roughly once in 20 or more runs.
7) Occurs with both Beeline and Hive CLI.
8) Reproducible only when there is a GROUP BY clause.
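
Because the wrong result shows up only about once in 20 or more runs, a small harness that repeats the same query and flags any run that disagrees with the majority can help confirm a reproduction. The following is a minimal sketch, not part of the original test case. The names `find_inconsistent_runs` and `run_query` are hypothetical; `run_query` is assumed to be any callable that executes the GROUP BY query (for example through Beeline or a Hive client) and returns the rows as a list of tuples with no NULL values.

```python
from collections import Counter


def find_inconsistent_runs(run_query, attempts=100):
    """Run the same query repeatedly and return the runs whose result
    differs from the most common result.

    run_query is a hypothetical callable supplied by the caller; it is
    assumed to return a list of rows, each row a tuple of comparable values.
    """
    results = []
    for _ in range(attempts):
        rows = run_query()
        # Sort the rows so that row order, which GROUP BY does not
        # guarantee, is not mistaken for a wrong result.
        results.append(tuple(sorted(tuple(row) for row in rows)))

    # Treat the most frequent result as the expected one (an assumption:
    # the incorrect result is the rare case, as reported above).
    majority, _count = Counter(results).most_common(1)[0]
    return [(i, r) for i, r in enumerate(results) if r != majority]
```

An empty return value only means that no mismatch was seen within `attempts` runs; given the failure rate described above, a larger number of attempts may be needed before concluding that a configuration is unaffected.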