Apache HAWQ / HAWQ-1030

User hang due to poor spin-lock/LWLock performance under high concurrency

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.0.0-incubating
    • Component/s: Core
    • Labels: None

      Description

      Several clients have recently reported apparent hangs in their applications. The symptoms were the same in every case:

      • All sessions appear to be hung in LWLockAcquire or LWLockRelease, specifically in s_lock.
      • The number of concurrent sessions is high (close to 100).
      • The system is not actually hung. Processing normally resumes after some time, once all sessions have completed their locking work.

      The PostgreSQL developer community has found several performance problems under high concurrency (more than 32 sessions) in the spin-lock mechanism that HAWQ inherited. These were ultimately corrected in PostgreSQL 9.5, which replaced the spin-lock mechanism and appears to provide a significant boost to query performance.

      The actual fix is in commit: ab5194e6f617a9a9e7aadb3dd1cee948a42d0755
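
      To illustrate what "replacing the spin-lock mechanism" can look like, here is a minimal sketch of a lock whose state is kept in one atomic word and updated by compare-and-swap, so acquirers do not first have to take a spinlock. This is not the code from the commit above. The names (sketch_lwlock, SKETCH_EXCLUSIVE, and the sketch_* functions) are hypothetical, and wait queues and sleeping are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: the top bit marks an exclusive holder,
 * the low bits count shared holders. */
#define SKETCH_EXCLUSIVE ((uint32_t)1 << 31)

typedef struct {
    _Atomic uint32_t state;
} sketch_lwlock;

static void sketch_lwlock_init(sketch_lwlock *l)
{
    atomic_init(&l->state, 0);
}

/* Try to take the lock in shared mode without any spinlock:
 * retry the compare-and-swap until it succeeds or an exclusive
 * holder is observed. */
static bool sketch_try_shared(sketch_lwlock *l)
{
    uint32_t old = atomic_load(&l->state);
    while ((old & SKETCH_EXCLUSIVE) == 0) {
        /* On failure, 'old' is refreshed with the current state. */
        if (atomic_compare_exchange_weak(&l->state, &old, old + 1))
            return true;
    }
    return false;
}

/* Exclusive mode succeeds only when nobody holds the lock at all. */
static bool sketch_try_exclusive(sketch_lwlock *l)
{
    uint32_t expected = 0;
    return atomic_compare_exchange_strong(&l->state, &expected,
                                          SKETCH_EXCLUSIVE);
}

static void sketch_release_shared(sketch_lwlock *l)
{
    uint32_t prev = atomic_fetch_sub(&l->state, 1);
    assert((prev & ~SKETCH_EXCLUSIVE) > 0);  /* there was a shared holder */
}

static void sketch_release_exclusive(sketch_lwlock *l)
{
    uint32_t prev = atomic_fetch_and(&l->state, ~SKETCH_EXCLUSIVE);
    assert(prev & SKETCH_EXCLUSIVE);         /* it was held exclusively */
}
```

      In a design like this, an uncontended acquire or release costs a single atomic instruction rather than a spinlock round trip, which is the kind of cost that grows quickly with close to 100 sessions.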

      A separate one-line commit to s_lock.c could also help address this and would be easy to cherry-pick: b03d196be055450c7260749f17347c2d066b4254
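
      The description does not say what the one-line change does. One well-known small change that reduces spin-lock contention is to read the lock word with a plain load before attempting the atomic test-and-set ("test, then test-and-set"). The sketch below shows that technique with C11 atomics, on the assumption that the referenced commit is of this kind. The sketch_spin* names are hypothetical and this is not HAWQ or PostgreSQL code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_bool locked;
} sketch_spinlock;

static void sketch_spin_init(sketch_spinlock *s)
{
    atomic_init(&s->locked, false);
}

/* One acquisition attempt. Returns true if the lock was obtained. */
static bool sketch_spin_try(sketch_spinlock *s)
{
    /* Non-locking initial test: a plain load that does not need
     * exclusive ownership of the cache line, so waiters spinning
     * here do not keep stealing the line from the holder. */
    if (atomic_load_explicit(&s->locked, memory_order_relaxed))
        return false;

    /* Only now issue the atomic exchange (the test-and-set).
     * It returns the previous value; false means we got the lock. */
    return !atomic_exchange_explicit(&s->locked, true,
                                     memory_order_acquire);
}

static void sketch_spin_lock(sketch_spinlock *s)
{
    /* A real implementation would add a pause instruction, backoff,
     * and eventually sleep; this sketch simply retries. */
    while (!sketch_spin_try(s))
        ;
}

static void sketch_spin_unlock(sketch_spinlock *s)
{
    assert(atomic_load_explicit(&s->locked, memory_order_relaxed));
    atomic_store_explicit(&s->locked, false, memory_order_release);
}
```

      With a bare test-and-set loop, every spinning session issues a locked instruction on each iteration; with the initial plain load, sessions spin on a locally cached copy until the lock is actually released.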

              People

              • Assignee: Ming LI (mli)
              • Reporter: Ming LI (mli)