  1. Apache Sedona
  2. SEDONA-8

ST_Transform slow due to lock contention


Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 1.0.0

    Description

      This issue was reported by GitHub user devyn as GitHub issue 456. See the attachment.

       

      ST_Transform should manually cache the results of the CRS calls (decode, findMathTransform) in a thread-local cache, so that threads do not wait on the locks guarding the caches internal to CRS. GeoSpark looks up the same spatial reference information for every single row across many threads, a usage pattern that I don't think the GeoTools authors anticipated.
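
      A minimal sketch of the thread-local caching idea, using only the Java standard library. The class name `ThreadLocalCache` and its `loader` parameter are hypothetical and not part of Sedona or GeoTools; in a real fix the loader would presumably wrap calls such as `CRS.decode` or `CRS.findMathTransform`, which is an assumption about how the fix could look, not a description of the actual patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical helper: memoizes an expensive lookup separately for each thread.
 * Each thread owns its own HashMap, so a cache hit never touches a shared lock.
 */
final class ThreadLocalCache<K, V> {
    // The expensive lookup to memoize (assumed to be something like CRS.decode).
    private final Function<K, V> loader;

    // One private map per thread; created lazily on first access from that thread.
    private final ThreadLocal<Map<K, V>> perThread =
            ThreadLocal.withInitial(HashMap::new);

    ThreadLocalCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value for this thread, loading it on the first request. */
    V get(K key) {
        Map<K, V> cache = perThread.get();
        V value = cache.get(key);
        if (value == null) {
            // Cache miss: only here do we pay for the (possibly synchronized) lookup.
            // Note that a null result from the loader is not cached.
            value = loader.apply(key);
            cache.put(key, value);
        }
        return value;
    }
}
```

      With a cache like this, each thread would pay for the contended lookup once per distinct key instead of once per row. The trade-off is that every thread holds its own copy of the cached values, which should be acceptable when only a handful of CRS codes are in use.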

       

      The synchronization inside the caches that the GeoTools CRS utility singleton eventually references means that the vast majority of ST_Transform work ends up single-threaded within each executor.

       

      To reproduce: run ST_Transform on a large data set with a single executor and watch thread execution (either by CPU usage or with VisualVM). Threads end up waiting their turn for access to the cache in CRS.

       

      Attachments

        Activity


          People

            Unassigned
            jiayu Jia Yu