  1. Apache Sedona
  2. SEDONA-8

ST_Transform slow due to lock contention


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0

    Description

      This issue was reported by GitHub user devyn as GitHub issue 456. See the attachment.

       

      ST_Transform should manually cache the values it gets from CRS (decode, findMathTransform) in a thread-local cache, so that threads do not wait on the locks guarding the caches internal to CRS. GeoSpark uses the CRS utilities in a way that I don't think the authors of GeoTools anticipated: it looks up the same spatial referencing information for every single row, across many threads.
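A minimal sketch of the thread-local caching idea, assuming nothing about how the actual fix was implemented. The class name `ThreadLocalCacheSketch`, the nested `Cache` type, and the loader function are hypothetical; in ST_Transform the loader would presumably wrap the CRS lookups (decode, findMathTransform), but here it is a plain stand-in so the example runs with the JDK alone.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class ThreadLocalCacheSketch {

    /**
     * Per-thread memoizing cache (hypothetical helper). Each thread owns its
     * own HashMap, so repeated lookups never touch a shared lock.
     */
    static final class Cache<K, V> {
        private final Function<K, V> loader;
        private final ThreadLocal<Map<K, V>> perThread =
                ThreadLocal.withInitial(HashMap::new);

        Cache(Function<K, V> loader) {
            this.loader = loader;
        }

        V get(K key) {
            Map<K, V> map = perThread.get();
            V value = map.get(key);
            if (value == null) {
                // First lookup of this key on this thread: pay the expensive
                // (possibly lock-guarded) call once. Null results are not cached.
                value = loader.apply(key);
                map.put(key, value);
            }
            return value;
        }
    }

    public static void main(String[] args) {
        AtomicInteger expensiveCalls = new AtomicInteger();

        // Stand-in for an expensive lookup keyed by a CRS code (assumption).
        Cache<String, String> cache = new Cache<>(code -> {
            expensiveCalls.incrementAndGet();
            return "resolved:" + code;
        });

        // Simulate one lookup per row, all for the same code.
        for (int row = 0; row < 1_000_000; row++) {
            cache.get("EPSG:4326");
        }
        System.out.println(expensiveCalls.get()); // prints 1
    }
}
```

With this shape, each thread calls the underlying lookup once per distinct key, so the cost of any shared lock is paid once per thread rather than once per row. The trade-off is that every thread holds its own copy of the cached values.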

       

      The synchronization inside the caches that the GeoTools CRS utility singleton eventually references means that the vast majority of ST_Transform work ends up single-threaded within each executor.
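The effect can be illustrated with a small, JDK-only sketch. It does not use GeoTools: `lockedLookup` is a hypothetical stand-in for any lookup guarded by one shared lock, and the counters simply record how many threads are ever inside it at the same time.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SharedLockSketch {
    private static final AtomicInteger inside = new AtomicInteger();
    private static final AtomicInteger maxInside = new AtomicInteger();

    // Hypothetical stand-in for a cache lookup guarded by a single global lock.
    static synchronized void lockedLookup() {
        int now = inside.incrementAndGet();
        maxInside.accumulateAndGet(now, Math::max);
        inside.decrementAndGet();
    }

    // Runs the lookup from several threads and reports the highest number of
    // threads observed inside the lookup at the same moment.
    static int maxThreadsInsideLookup(int threads, int callsPerThread)
            throws InterruptedException {
        inside.set(0);
        maxInside.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    lockedLookup();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return maxInside.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Eight worker threads, but the lock admits only one at a time.
        System.out.println(maxThreadsInsideLookup(8, 100_000)); // prints 1
    }
}
```

However many threads the pool has, only one is ever inside the guarded lookup, which is the per-row pattern described above when every row goes through the same synchronized cache.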

       

      To reproduce, run ST_Transform on a large set of data with a single executor and watch thread execution (either by CPU usage or with VisualVM): the threads end up waiting their turn for access to the cache in CRS.

       

      Attachments

        1. geospark-issues.csv
          381 kB
          Jia Yu


          People

            Assignee: Unassigned
            Reporter: Jia Yu (jiayu)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved: