Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
0.9.6
None
None
Description
Currently, query results that are persistent bags (such as Spark RDDs or Flink DataSets) are coerced to in-memory bags before they are printed. The same coercion needs to happen wherever a persistent bag appears in the query result, not only when the result as a whole is one. The problem surfaced when we ran matrix factorization (query/factorization.mrql), whose output is a tuple containing the two matrix factors (datasets). It was reported by Ahmed Ulde. It will be fixed by adding a new case for tuple projection in the query evaluator that coerces persistent datasets to bags when they appear in the query output.
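The idea behind the fix can be illustrated with a minimal sketch. This is not MRQL's evaluator code: `PersistentBag`, `collect`, and `materialize` are hypothetical names, and a list of partitions stands in for a distributed dataset. It only shows how a result is walked so that datasets nested inside a tuple are coerced, not just a top-level one.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class PersistentBag:
    """Hypothetical stand-in for a distributed dataset (e.g. an RDD or DataSet)."""
    partitions: List[List[Any]]

    def collect(self) -> List[Any]:
        # Gather every partition into a single in-memory list.
        return [x for part in self.partitions for x in part]


def materialize(value: Any) -> Any:
    """Coerce persistent bags to in-memory lists wherever they occur in value."""
    if isinstance(value, PersistentBag):
        # Elements may themselves contain persistent bags, so recurse.
        return [materialize(x) for x in value.collect()]
    if isinstance(value, tuple):
        # The tuple case: each component is coerced independently.
        return tuple(materialize(v) for v in value)
    if isinstance(value, list):
        return [materialize(v) for v in value]
    return value


# A factorization-like result (assumed shape): a tuple of two datasets.
result = (PersistentBag([[1, 2], [3]]), PersistentBag([[4], [5, 6]]))
printable = materialize(result)
print(printable)
```

Without the tuple case, `materialize` would return the tuple unchanged and the two datasets inside it would reach the printer still in persistent form, which is the behavior described in this issue.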