Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Version: 1.6.0
Description
This problem was reported by users on Discord: RS_ZonalStats does not work with a raster tile in EPSG:4326. With the attached data, most of the computed zonal stats are NaN:
    # Assumes `spark` and `sedona` are already-created Spark/Sedona sessions.
    from sedona.core.formatMapper.shapefileParser import ShapefileReader
    from sedona.utils.adapter import Adapter

    # Load the GeoTIFF tiles as binary files and decode them into rasters.
    rawDf = spark.read.format("binaryFile").option("pathGlobFilter", "*.tiff").load("zonal_stats_issue/data_andalusia")
    rawDf.createOrReplaceTempView("rawdf")
    rasterDf = sedona.sql("""
        SELECT RS_FromGeoTiff(content) as tile, path FROM rawdf
    """)
    rasterDf.createOrReplaceTempView("l8imgs")

    # Load the parcel geometries from the shapefile.
    parcels = ShapefileReader.readToGeometryRDD(sedona, "zonal_stats_issue/parcelas")
    parcles_df = Adapter.toDf(parcels, sedona)
    parcles_df.createOrReplaceTempView("parcels")

    # Match each parcel to its tile and compute the mean of band 1.
    features_df = sedona.sql("""
        WITH matched_tile AS (
            SELECT path, tile, geometry, idPanel
            FROM l8imgs, parcels
            WHERE ST_Intersects(RS_Envelope(tile), parcels.geometry)
               OR ST_Within(RS_Envelope(tile), parcels.geometry)
        )
        SELECT path, idPanel, RS_ZonalStats(tile, geometry, 1, 'mean') as stats_mean
        FROM matched_tile
    """)
    features_df.show(1000, False)  # <-- Lots of NaN here.
Output:
+----------------------------------------------------+--------------------+------------------+
|path                                                |idPanel             |stats_mean        |
+----------------------------------------------------+--------------------+------------------+
|zonal_stats_issue/data_andalusia/s2_20240604_01.tiff|14:38:0:0:14:9002:2 |NaN               |
|zonal_stats_issue/data_andalusia/s2_20240604_01.tiff|14:38:0:0:14:32:4   |NaN               |
|zonal_stats_issue/data_andalusia/s2_20240604_01.tiff|14:38:0:0:14:32:3   |NaN               |
|zonal_stats_issue/data_andalusia/s2_20240604_01.tiff|14:38:0:0:14:30:2   |NaN               |
|zonal_stats_issue/data_andalusia/s2_20240604_01.tiff|14:38:0:0:14:26:3   |NaN               |
|zonal_stats_issue/data_andalusia/s2_20240604_01.tiff|14:38:0:0:14:27:4   |NaN               |
...
The cause is incorrect rasterization of the parcel geometries when the reference raster has a scaleX/scaleY smaller than 1. The code that computes the extent of the rasterization result casts double values to int, which is:
1. Unnecessary when we're using the extent of the reference raster
2. Problematic when handling rasters with non-integral scaleX or scaleY values
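The effect of such a cast can be illustrated with a small standalone sketch. This is not Sedona's actual code: the function names `grid_width_truncated` and `grid_width_exact` and the sample coordinates are hypothetical, chosen only to show why truncating extent bounds to integers is harmless for metre-based grids but collapses the grid when the pixel size is a fraction of a degree.

```python
def grid_width_truncated(min_x: float, max_x: float, scale_x: float) -> int:
    """Hypothetical buggy variant: extent bounds are cast to int first."""
    x0 = int(min_x)  # int() truncates toward zero, e.g. -4.75 -> -4
    x1 = int(max_x)
    return int((x1 - x0) / scale_x)


def grid_width_exact(min_x: float, max_x: float, scale_x: float) -> int:
    """Keeps the extent as doubles and only rounds the final pixel count."""
    return round((max_x - min_x) / scale_x)


# Projected CRS in metres (illustrative values), 10 m pixels: both agree.
utm_truncated = grid_width_truncated(500000.0, 500100.0, 10.0)
utm_exact = grid_width_exact(500000.0, 500100.0, 10.0)

# Geographic CRS such as EPSG:4326 (illustrative values), 0.125 degree pixels:
# both bounds truncate to -4, so the rasterized extent has zero width.
deg_truncated = grid_width_truncated(-4.75, -4.25, 0.125)
deg_exact = grid_width_exact(-4.75, -4.25, 0.125)

print(utm_truncated, utm_exact)
print(deg_truncated, deg_exact)
```

A zero-width or misplaced rasterized extent would leave no valid pixels under the parcel, which is consistent with the NaN means shown above.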
This bug affects the following RS functions:
- RS_AsRaster
- RS_ZonalStats
- RS_ZonalStatsAll
Attachments