Apache Sedona / SEDONA-19

Global indexing does not work with SQL joins


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.1

    Description

      According to the documentation, global indexing can be used with SQL joins and is enabled by default. However, the code path here:

      https://github.com/apache/incubator-sedona/blob/master/sql/src/main/scala/org/apache/spark/sql/sedona_sql/strategy/join/TraitJoinQueryExec.scala#L125

      calls the JoinParams constructor here:

      https://github.com/apache/incubator-sedona/blob/master/core/src/main/java/org/apache/sedona/core/spatialOperator/JoinQuery.java#L434

      which always sets useIndex to false. As a result, indexing cannot be used via SQL queries, and the non-indexed join does not work well with large datasets. (As a separate issue, the non-indexed join loads all of the non-window objects of each partition into memory at once and quickly runs out of memory.)

      The Python adapter also passes the arguments incorrectly here:

      https://github.com/apache/incubator-sedona/blob/master/python-adapter/src/main/scala/org.apache.sedona.python.wrapper/adapters/JoinParamsAdapter.scala#L29

      The signature of JoinParams needs to be updated, probably by just adding a constructor that takes all four parameters.
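
      To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern described above. It is not the Sedona source: only the names JoinParams and useIndex come from this report. The class JoinParamsSketch, the two enums, and the other three parameter names are hypothetical stand-ins, and the assumption that the existing overload takes three arguments is inferred from the "all four parameters" remark.

```java
// Illustrative sketch only; NOT the actual Sedona code.
// Everything except the names JoinParams and useIndex is an assumption.
final class JoinParamsSketch {

    // Hypothetical stand-ins for the library's index-type and build-side options.
    enum IndexType { RTREE, QUADTREE }
    enum BuildSide { LEFT, RIGHT }

    static final class JoinParams {
        final boolean useIndex;
        final boolean considerBoundaryIntersection; // assumed parameter name
        final IndexType indexType;                  // assumed parameter name
        final BuildSide buildSide;                  // assumed parameter name

        // Models the reported behaviour: this overload has no useIndex
        // argument, so it always ends up with useIndex == false.
        JoinParams(boolean considerBoundaryIntersection,
                   IndexType indexType,
                   BuildSide buildSide) {
            this(false, considerBoundaryIntersection, indexType, buildSide);
        }

        // Possible shape of the suggested fix: the caller supplies all four
        // values, so a SQL code path could pass its own index setting through.
        JoinParams(boolean useIndex,
                   boolean considerBoundaryIntersection,
                   IndexType indexType,
                   BuildSide buildSide) {
            this.useIndex = useIndex;
            this.considerBoundaryIntersection = considerBoundaryIntersection;
            this.indexType = indexType;
            this.buildSide = buildSide;
        }
    }

    public static void main(String[] args) {
        // The caller wants indexing, but the three-argument overload cannot express it.
        JoinParams viaThree = new JoinParams(true, IndexType.RTREE, BuildSide.LEFT);
        // With the four-argument overload the request is honoured.
        JoinParams viaFour = new JoinParams(true, true, IndexType.RTREE, BuildSide.LEFT);

        System.out.println("three-argument useIndex = " + viaThree.useIndex);
        System.out.println("four-argument useIndex = " + viaFour.useIndex);
    }
}
```

      In this sketch the three-argument overload delegates to the four-argument one, so existing callers would keep compiling while new callers can opt in to indexing. Whether the real fix keeps or removes the old overload is not stated in this report.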

          People

            Assignee: Unassigned
            Reporter: Adam Binford (kimahriman)