Apache Sedona / SEDONA-319

RS_AddBandFromArray does not always produce serializable rasters

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.0

    Description

      RS_AddBandFromArray sometimes produces rasters that cannot be serialized. As far as we know, adding a new band to a raster with UInt8 pixel values always produces a non-serializable result. The following snippet reproduces the problem using a GeoTiff image from core/src/test/resources/:

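      // Load the test GeoTiff as raw bytes
      var df = sparkSession.read.format("binaryFile").load(resourceFolder + "raster/test3.tif")
      // Decode the raster and extract band 1 as an array
      df = df.selectExpr("RS_FromGeoTiff(content) as raster", "RS_BandAsArray(RS_FromGeoTiff(content), 1) as band")
      // Append the array as band 2; serializing the resulting raster fails
      df = df.selectExpr("RS_AddBandFromArray(raster, band, 2)")
      df.collect()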
      

      The stack trace is as follows:

      Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (bogon executor driver): java.lang.IllegalArgumentException: No Serializers available for the ColorModel.
      	at javax.media.jai.remote.SerializableRenderedImage.<init>(SerializableRenderedImage.java:507)
      	at javax.media.jai.remote.SerializableRenderedImage.<init>(SerializableRenderedImage.java:390)
      	at org.apache.sedona.common.raster.Serde.serialize(Serde.java:35)
      	at org.apache.spark.sql.sedona_sql.expressions.raster.RS_AddBandFromArray.$anonfun$eval$1(MapAlgebra.scala:814)
      	at scala.Option.map(Option.scala:230)
      	at org.apache.spark.sql.sedona_sql.expressions.raster.RS_AddBandFromArray.eval(MapAlgebra.scala:814)
      	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
      	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
      	at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
      	at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:365)
      	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890)
      	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890)
      	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
      	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
      	at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
      	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
      	at org.apache.spark.scheduler.Task.run(Task.scala:136)
      	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
      	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
      	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:750)
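
The trace shows the failure inside SerializableRenderedImage, which could not find a serializer for the image's ColorModel. One way to narrow this down might be to print the concrete ColorModel class of the raster's image. The following is a minimal diagnostic sketch, not part of Sedona: `ColorModelCheck` and `describe` are hypothetical names, and the idea that a color model class outside `java.awt.image` is the likely culprit is an assumption, not something confirmed by this report. It uses only JDK classes and a synthetic 8-bit image; for a real raster, the image would presumably come from something like GeoTools' `GridCoverage2D.getRenderedImage`.

```scala
import java.awt.image.{BufferedImage, RenderedImage}

// Hypothetical helper for diagnosis only; not part of Sedona or JAI.
object ColorModelCheck {
  // Reports the concrete ColorModel class of an image and whether that
  // class is one of the JDK's own java.awt.image classes.
  // Assumption: a class from outside java.awt.image is a natural suspect
  // when JAI reports that no serializer is available.
  def describe(image: RenderedImage): String = {
    val cm = image.getColorModel
    if (cm == null) {
      "no ColorModel"
    } else {
      val name = cm.getClass.getName
      val builtIn = name.startsWith("java.awt.image.")
      s"$name (JDK built-in: $builtIn)"
    }
  }

  def main(args: Array[String]): Unit = {
    // Synthetic 8-bit single-band image standing in for a UInt8 raster.
    val gray = new BufferedImage(4, 4, BufferedImage.TYPE_BYTE_GRAY)
    println(describe(gray))
  }
}
```

Running `describe` on the image before and after RS_AddBandFromArray would show whether the added band changes the color model class.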
      

            People

              Assignee: Unassigned
              Reporter: Kristin Cowalcijk (kontinuation)

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20m