Description
SqlTestUtils.stripSparkFilter needs to make copies of the UTF8Strings, e.g., with FromUnsafeProjection, to avoid returning duplicates of the same row (see SPARK-9459).
Right now, this isn't causing any problems, since the Parquet string predicate pushdown is turned off (see SPARK-11153). However, I ran into this while trying to get the predicate pushdown to work with a different version of Parquet. Without this fix, there were errors like:
[info] !== Correct Answer - 4 ==   == Spark Answer - 4 ==
[info] ![1]                        [2]
[info]  [2]                        [2]
[info] ![3]                        [4]
[info]  [4]                        [4] (QueryTest.scala:127)
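The underlying failure mode can be illustrated without Spark. Below is a minimal sketch in Python, not the actual Spark code: `ReusedRowReader`, `collect_without_copy`, and `collect_with_copy` are hypothetical names invented for this illustration. It assumes a reader that reuses one mutable buffer for every row, so collecting rows without copying leaves every collected entry pointing at the same object. The exact pattern of duplicates in the real failure depends on how the reader reuses its buffers, so this toy version does not reproduce the output above row for row.

```python
class ReusedRowReader:
    """Hypothetical reader that writes every row into one shared buffer."""

    def __init__(self, values):
        self._values = values
        self._row = [None]  # single mutable buffer, reused for each row

    def __iter__(self):
        for value in self._values:
            self._row[0] = value  # overwrite the buffer in place
            yield self._row       # hand out the same object every time


def collect_without_copy(reader):
    # Buggy: stores references to the shared buffer, not the row contents.
    return [row for row in reader]


def collect_with_copy(reader):
    # Fixed: copies each row before the reader overwrites the buffer.
    return [list(row) for row in reader]


values = ["1", "2", "3", "4"]
aliased = collect_without_copy(ReusedRowReader(values))
copied = collect_with_copy(ReusedRowReader(values))
```

In this sketch, `aliased` ends up holding four references to one buffer, so every entry shows the last value written, while `copied` keeps the four distinct rows. Copying each row before collecting it plays the role that the description above assigns to FromUnsafeProjection.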
I figure it's worth making this change now, since I have already run into it. PR coming shortly.