Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
None
Description
For complex polygons, using prepared geometries can improve query performance by an order of magnitude.
In a test with 1M points and 5k polygons, a simple broadcast join and count with ST_Contains went from 40s down to 10s (a 4x improvement).
points.join(broadcast(polygons), expr("ST_Contains(polygon, point)")).count()
As the ratio of points to polygons increases, the speedup improves. For
points.union(points).join(broadcast(polygons), expr("ST_Contains(polygon, point)")).count()
it is about 6x (70s -> 12s).
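The gain comes from doing the per-polygon setup work once and reusing it for every point tested against that polygon, which is why it grows with the number of points per polygon. A minimal Python sketch of that idea follows. PreparedPolygon and count_contained are hypothetical names, and the cached bounding box plus ray casting is a simplified stand-in for illustration, not the JTS or Sedona implementation.

```python
# Illustrative sketch only: a simplified stand-in for a prepared geometry.
# Names and structure are assumptions, not the JTS/Sedona implementation.


class PreparedPolygon:
    """Caches per-polygon data once so repeated contains() calls are cheaper."""

    def __init__(self, ring):
        # ring: list of (x, y) vertices of a simple polygon, listed in order,
        # without repeating the first vertex at the end.
        self.edges = list(zip(ring, ring[1:] + ring[:1]))
        xs = [x for x, _ in ring]
        ys = [y for _, y in ring]
        self.bbox = (min(xs), min(ys), max(xs), max(ys))

    def contains(self, point):
        x, y = point
        min_x, min_y, max_x, max_y = self.bbox
        # Cheap rejection using the cached bounding box.
        if x < min_x or x > max_x or y < min_y or y > max_y:
            return False
        # Ray casting (even-odd rule) over the cached edge list.
        # Points exactly on the boundary are not handled specially here.
        inside = False
        for (x1, y1), (x2, y2) in self.edges:
            if (y1 > y) != (y2 > y):
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x < x_cross:
                    inside = not inside
        return inside


def count_contained(polygons, points):
    # Prepare each polygon once, then reuse it for every point.
    prepared = [PreparedPolygon(ring) for ring in polygons]
    return sum(1 for p in points for poly in prepared if poly.contains(p))
```

The preparation cost is paid once per polygon, so it is amortized over more contains() calls as the number of points grows.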