  1. Apache Sedona
  2. SEDONA-261

Distance join fails under broadcast index join when the distance expression references attributes from the right-side relation

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.1
    • Fix Version/s: 1.4.0

    Description

      The following distance join query fails when it is executed as a broadcast index join:

      SELECT * FROM df1 JOIN df2 ON ST_Distance(df1.geom, df2.geom) < df2.dist
      

      The exception raised by Sedona is as follows:

      Couldn't find dist#8638 in [id#8583,geom#8589]
      java.lang.IllegalStateException: Couldn't find dist#8638 in [id#8583,geom#8589]
      	at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1.applyOrElse(BoundAttribute.scala:80)
      	at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1.applyOrElse(BoundAttribute.scala:73)
      	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:584)
      	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:176)
      	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:584)
      	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:560)
      	at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:528)
      	at org.apache.spark.sql.catalyst.expressions.BindReferences$.bindReference(BoundAttribute.scala:73)
      	at org.apache.spark.sql.sedona_sql.strategy.join.SpatialIndexExec.doExecuteBroadcast(SpatialIndexExec.scala:54)
      

      If the distance expression instead references an attribute from the left-side relation, the same join runs without error as a broadcast index join:

      SELECT * FROM df1 JOIN df2 ON ST_Distance(df1.geom, df2.geom) < df1.dist
      

      The space-partitioned distance join does not have this problem.
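
The stack trace suggests that the distance expression is bound against the output of a single relation (`[id#8583,geom#8589]`), which does not contain `dist#8638`. The sketch below is a toy model of that lookup in plain Python, not Spark's or Sedona's actual code: `Attribute` and `bind_reference` are hypothetical names, and the only assumption is that binding resolves an attribute to its position in a given attribute list by expression ID.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Attribute:
    """Hypothetical stand-in for a query-plan attribute: a name plus a unique ID."""
    name: str
    expr_id: int

    def __str__(self) -> str:
        return f"{self.name}#{self.expr_id}"


def bind_reference(attr: Attribute, input_attrs: Sequence[Attribute]) -> int:
    """Return the position of `attr` in `input_attrs`, matching on expression ID."""
    for ordinal, candidate in enumerate(input_attrs):
        if candidate.expr_id == attr.expr_id:
            return ordinal
    listing = ",".join(str(a) for a in input_attrs)
    raise LookupError(f"Couldn't find {attr} in [{listing}]")


# Attribute list taken from the exception message; it has no `dist` column.
bound_against = [Attribute("id", 8583), Attribute("geom", 8589)]
dist = Attribute("dist", 8638)

try:
    bind_reference(dist, bound_against)
    message = None
except LookupError as err:
    message = str(err)

# Binding succeeds once the attribute list actually contains `dist`.
ordinal_when_present = bind_reference(dist, bound_against + [dist])
```

In this model the lookup fails only because `dist` is missing from the list it is resolved against, which is consistent with the report that the same query works when the distance comes from the other relation.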

            People

              Assignee: Unassigned
              Reporter: Kristin Cowalcijk (kontinuation)

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20m