Details
- Improvement
- Status: Closed
- Major
- Resolution: Fixed
- 1.6.1, 1.7.0, 1.8.0
- None
Description
Once DATAFU-167 is merged, datafu-spark will support Spark versions up to 2.4.5. However, our implementation of collectLimitedList extends Spark's Collect class, whose interface changed in 2.4.6, so datafu-spark no longer compiles against that version.
Here is the relevant line from collectLimitedList: https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/spark/utils/overwrites/SparkOverwriteUDAFs.scala#L104
Here is the compilation error:
/Users/eyal/git/datafu/datafu-spark/src/main/scala/spark/utils/overwrites/SparkOverwriteUDAFs.scala:104: class CollectLimitedList needs to be abstract, since:
it has 3 unimplemented members.
/** As seen from class CollectLimitedList, the missing signatures are as follows.
 *  For convenience, these are usable as stub implementations.
 */
  // Members declared in org.apache.spark.sql.catalyst.expressions.aggregate.Collect
  protected val bufferElementType: org.apache.spark.sql.types.DataType = ???
  protected def convertToBufferElement(value: Any): Any = ???

  // Members declared in org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate
  def eval(buffer: scala.collection.mutable.ArrayBuffer[Any]): Any = ???

case class CollectLimitedList(child: Expression,
           ^
one error found

FAILURE: Build failed with an exception.
We need to either 1) update our implementation and drop support for older Spark versions (releasing the change in our version 1.8.0), or 2) copy the code in a backwards-compatible way.
Please note that you can replicate this compilation error on the master branch even without merging DATAFU-167 by running:
./gradlew :datafu-spark:test -PscalaVersion=2.11 -PsparkVersion=2.4.6 --tests "DataFrame*"
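To make option 2 more concrete, here is a minimal sketch of the buffer logic a limited-list aggregate needs if it is written on its own instead of inheriting it from Spark's Collect. It is plain Scala with no Spark dependency. The class name LimitedListBuffer, its methods, and the choice to skip nulls are assumptions made for illustration; they are not taken from the DataFu or Spark sources.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for the aggregate's buffer handling.
// None of these names come from Spark or DataFu.
final class LimitedListBuffer(limit: Int) {
  require(limit >= 0, "limit must be non-negative")

  private val buffer = ArrayBuffer.empty[Any]

  // Per-row update: keep the value only while there is room.
  // Skipping nulls is an assumption of this sketch.
  def update(value: Any): this.type = {
    if (value != null && buffer.size < limit) buffer += value
    this
  }

  // Merge of two partial buffers: take only as many elements as still fit.
  def merge(other: LimitedListBuffer): this.type = {
    buffer ++= other.buffer.take(limit - buffer.size)
    this
  }

  // Final result, returned as an immutable copy.
  def eval: List[Any] = buffer.toList
}

object LimitedListDemo {
  def main(args: Array[String]): Unit = {
    val left = new LimitedListBuffer(3).update("a").update("b")
    val right = new LimitedListBuffer(3).update("c").update("d")
    println(left.merge(right).eval) // prints List(a, b, c)
  }
}
```

Because this logic does not depend on any Spark base class, a change to Collect's abstract members between Spark releases would not affect it. A real fix would still have to wrap it in whatever aggregate interface each supported Spark version expects.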
Attachments
Issue Links
- is blocked by
  - DATAFU-167 Fix Scala Python Bridge support in Spark 2 minor version updates (Resolved)
- is required by
  - DATAFU-169 Support Spark 3.x (Closed)