Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Apache Sedona allows configuring the dominant side of spatial partitioning and the side on which it builds spatial indices (see Apache Sedona - parameters). However, the way Apache Sedona defines the left and right sides of a join is counterintuitive.
For queries such as SELECT * FROM A JOIN B ON ST_*(left, right), or df_a.join(df_b, expr("ST_*(left, right)")), the left-side and right-side relations are determined entirely by the join condition ST_*(left, right): the relation that left refers to is the left-side relation, and the relation that right refers to is the right-side relation.
For example, the left-side relation of the following query is `df_msb`, even though it appears on the right side of the JOIN:
SELECT * FROM df_pickup JOIN df_msb ON ST_Contains(df_msb.geom, df_pickup.pickup)
If we replace the join condition with ST_Within(df_pickup.pickup, df_msb.geom), df_pickup becomes the left-side relation.
A more intuitive approach is to always treat df_pickup as the left-side relation, because it appears on the left side of the JOIN, regardless of how the join condition is written.
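The difference between the two rules can be illustrated with a small toy model. This is only a sketch of the rule described above, not Sedona's planner code: the names SpatialJoin, sides_by_predicate and sides_by_join_order are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpatialJoin:
    """Hypothetical description of a spatial join (not a Sedona class)."""
    join_left: str    # relation written before the JOIN keyword
    join_right: str   # relation written after the JOIN keyword
    pred_first: str   # relation referenced by the predicate's first argument
    pred_second: str  # relation referenced by the predicate's second argument


def sides_by_predicate(join: SpatialJoin) -> tuple:
    """Behavior described in this issue: sides follow the predicate arguments."""
    return (join.pred_first, join.pred_second)


def sides_by_join_order(join: SpatialJoin) -> tuple:
    """Proposed behavior: sides follow the position around the JOIN keyword."""
    return (join.join_left, join.join_right)


# df_pickup JOIN df_msb ON ST_Contains(df_msb.geom, df_pickup.pickup)
contains_join = SpatialJoin("df_pickup", "df_msb", "df_msb", "df_pickup")

# df_pickup JOIN df_msb ON ST_Within(df_pickup.pickup, df_msb.geom)
within_join = SpatialJoin("df_pickup", "df_msb", "df_pickup", "df_msb")

for name, join in [("ST_Contains", contains_join), ("ST_Within", within_join)]:
    print(name, "by predicate:", sides_by_predicate(join),
          "by join order:", sides_by_join_order(join))
```

Under the predicate-based rule, the two equivalent queries disagree about which relation is the left side; under the join-order rule, df_pickup is the left side in both.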