OpenJPA / OPENJPA-825

slices: hangs with multithreaded true

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-M2
    • Fix Version/s: 2.0.0-M1
    • Component/s: slice
    • Labels:
      None

      Description

      When I turned on openjpa.Multithreaded as a possible fix for another bug, the system hung. I am attaching a log file and a jstack dump showing how the system hung on the very first query. (It did execute a few find() operations, but those are not executed via ParallelExecutor.)

      1. hang-multithread-2.jstack
        40 kB
        Fernando Padilla
      2. hang-multithread.txt
        177 kB
        Fernando Padilla
      3. hang-multithread.jstack
        35 kB
        Fernando Padilla

          Activity

          Pinaki Poddar added a comment -

          Two primary objectives, or hard constraints:
          1. Slice must execute database operations in parallel. Otherwise the basic purpose of working with distributed data efficiently is defeated.
          2. OpenJPA's threading model should not be altered. A threading model is hard to retrofit, and OpenJPA's current threading model is battle-tested. Any fundamental alteration is a risk not worth taking.

          One approach that meets the above criteria and addresses this difficult issue is to introduce a variation on the threading model for the Slice module. Let each slice run on a specialized SliceThread that acts like a child of a parent 'user' thread. The parent thread is the thread that invoked the Broker/EntityManager operation. When control reaches Slice to execute a database operation, let Slice spawn the specialized SliceThreads. Let a SliceThread use its parent's lock, but only for Broker and not for Query (which has its own reentrant lock).
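
          A rough sketch of the "child thread uses its parent's lock" idea, in plain Java. This is one possible reading of the proposal, not the code that went into Slice: the class names SliceThreadSketch and ParentAwareLock, the method parentThread(), and the demo() helper are all made up for illustration. The toy lock identifies a SliceThread by its parent, so a child can re-enter a lock its parent already holds while an unrelated thread cannot. Note that this gives no mutual exclusion between sibling slice threads.

```java
// Illustrative only: hypothetical names, not OpenJPA classes.
class SliceThreadSketch {

    /** Hypothetical child thread that remembers the user thread that spawned it. */
    static class SliceThread extends Thread {
        private final Thread parent;

        SliceThread(Thread parent, Runnable work) {
            super(work);
            this.parent = parent;
        }

        Thread parentThread() {
            return parent;
        }
    }

    /** Toy reentrant lock whose owner identity is the parent thread for SliceThreads. */
    static class ParentAwareLock {
        private Thread owner;
        private int holds;

        // A SliceThread locks "as" its parent; any other thread locks as itself.
        private static Thread lockIdentity() {
            Thread t = Thread.currentThread();
            return (t instanceof SliceThread) ? ((SliceThread) t).parentThread() : t;
        }

        synchronized boolean tryLock() {
            Thread me = lockIdentity();
            if (owner == null || owner == me) {
                owner = me;
                holds++;
                return true;
            }
            return false;
        }

        synchronized void unlock() {
            if (owner != lockIdentity()) {
                throw new IllegalMonitorStateException("not the lock owner");
            }
            if (--holds == 0) {
                owner = null;
            }
        }
    }

    /** Returns {child could lock, unrelated thread could lock} while the parent holds the lock. */
    static boolean[] demo() {
        ParentAwareLock lock = new ParentAwareLock();
        boolean[] result = new boolean[2];
        lock.tryLock(); // the parent (user) thread takes the lock
        try {
            Thread child = new SliceThread(Thread.currentThread(), () -> {
                result[0] = lock.tryLock();
                if (result[0]) lock.unlock();
            });
            Thread stranger = new Thread(() -> {
                result[1] = lock.tryLock();
                if (result[1]) lock.unlock();
            });
            child.start();
            stranger.start();
            child.join();
            stranger.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            lock.unlock();
        }
        return result;
    }
}
```

          In this sketch the child thread gets through and the unrelated thread is refused, which is the behaviour the proposal needs for Broker. Query would keep its own lock.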

          Fernando Padilla added a comment -

          Another jstack of OpenJPA Slice hanging, this time within the StateManager.

          1) I have applied the patch from OPENJPA-826, which removes the first hang spot, in QueryImpl.isUnique
          (this was the hang shown in the first stack trace I attached).

          2) I have disabled the current workaround for this bug applied by Pinaki
          (the workaround currently turns off ParallelExecutor when Multithreaded=true).
          I disabled it in order to test parallel execution and to find and fix hangs.

          Fernando Padilla added a comment -

          I have to add that Slices executes the Query using multiple threads (ST1, ST2, ST3 in Pinaki's analysis), so it is explicitly accessing the EM in a multithreaded manner, breaking the assumption that only one thread accesses the EM.

          THUS:

          Slices REQUIRES openjpa.Multithreaded=true !!!!!!!!!!!!

          Conceptually, if you execute the EM with multiple threads, you have to use openjpa.Multithreaded=true. If you don't, you will see corruption and weird behavior. Another issue (OPENJPA-820) shows exactly that: Slices with openjpa.Multithreaded=false can behave erratically and return invalid results for queries. So if you run Slices with openjpa.Multithreaded=false, any query you run could be returning invalid values!

          I just wanted to go on record stating that this bug (OPENJPA-825) is an either/or proposition, or only an issue if you run with openjpa.Multithreaded=true, because currently Slices requires openjpa.Multithreaded=true. Although #1 is the easy solution, it is the least performant of the three, so it is great only as a short-term solution. #3 is useless because deadlocks ARE GUARANTEED to occur (OPENJPA-820): most of the code uses a global lock, so there is no way around deadlocks. #2 is the only long-term solution...

          Pinaki Poddar added a comment -

          Multithreaded mode of OpenJPA kernel and parallel database operation in Slice
          ===============================================================

          Background
          a) OpenJPA allows multiple threads to execute a single instance of EntityManager. By default, OpenJPA assumes that a single thread is invoking operations on a particular instance of EntityManager.
          The openjpa.Multithreaded configuration property can be set to true to signal multithreaded access. In multithreaded mode, OpenJPA kernel classes acquire an instance-level reentrant lock at the start of almost every method and release it in the method's finally block.

          b) Slice executes most of the frequent database operations (query, flush) on each individual slice in a separate thread.

          These two threading models conflict and give rise to a classic deadlock scenario as follows:

          1. Assume an EntityManager instance em, a Query instance Q created by em, openjpa.Multithreaded=true, a user thread UT, and three slices S1, S2, S3.

          2. UT calls em.x() or Q.y().

          3. em/Q acquires a reentrant lock L on thread UT and invokes a lower-layer method that eventually invokes Slice operations.

          4. Slice spawns three threads ST1, ST2, ST3 and on each of them invokes the identical operation S.z().

          5. If S.z() on ST1 invokes any operation of em/Q, then ST1 cannot acquire L, because L was acquired on UT in step 3 and has not yet been released. The architecture of Slice makes it typical for S.z() to invoke one or more methods on em or Q.

          6. em.x()/Q.y() cannot release L until S.z() finishes.

          7. S.z() cannot finish because ST1 waits for L to be released by UT.
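
          The steps above can be reproduced in miniature with a plain java.util.concurrent.locks.ReentrantLock. This is a minimal sketch, not OpenJPA code: SliceDeadlockSketch and its methods are hypothetical names, and tryLock with a timeout stands in for the blocking lock() call so that the example terminates instead of hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only: hypothetical stand-in for the scenario, not OpenJPA code.
class SliceDeadlockSketch {

    /** Returns true if slice thread ST1 managed to acquire L while UT still held it. */
    static boolean sliceThreadCanAcquire(long waitMillis) {
        ReentrantLock l = new ReentrantLock();   // lock L guarding em/Q
        boolean[] acquired = new boolean[1];

        l.lock();                                // step 3: UT acquires L
        try {
            Thread st1 = new Thread(() -> {      // step 4: Slice spawns ST1
                try {
                    // Step 5: S.z() calls back into em/Q and therefore needs L.
                    // A real lock() would block forever here; the timeout is
                    // only there so the sketch can report the outcome.
                    if (l.tryLock(waitMillis, TimeUnit.MILLISECONDS)) {
                        acquired[0] = true;
                        l.unlock();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "ST1");
            st1.start();
            st1.join();                          // step 6: UT waits for S.z() while holding L
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            l.unlock();
        }
        return acquired[0];
    }

    /** Reentrancy only helps the owning thread: UT itself can take L a second time. */
    static boolean userThreadCanReenter() {
        ReentrantLock l = new ReentrantLock();
        l.lock();
        try {
            boolean ok = l.tryLock();            // same thread, so this succeeds
            if (ok) {
                l.unlock();
            }
            return ok;
        } finally {
            l.unlock();
        }
    }
}
```

          The point of the second method is that a reentrant lock is reentrant per thread. Work handed to ST1 does not inherit UT's ownership of L, which is why steps 5 to 7 form a cycle.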

          Observations
          ==========
          a) openjpa.Multithreaded is a non-default option, and single-threaded operation on em is more prevalent. Note that single-threaded access does not imply that em can be invoked only on UT. It is perfectly permissible, under the default mode of openjpa.Multithreaded=false, to start a transaction of em on some thread UT, commit the transaction, and then start another transaction on the same em on a different thread UT2.

          b) Executing common database operations on each slice S1, S2,... in parallel has a definite performance benefit and should be the default choice. This is consistent with the default choice of openjpa.Multithreaded=false.

          Possible solutions
          ===============
          1. Under openjpa.Multithreaded=true, execute the database operation on all slices on the same user thread UT. Otherwise, execute the database operation on each slice on a separate thread.
          2. Modify the OpenJPA kernel's threading model to make it more fine-grained and read/write sensitive.
          3. Detect the deadlock and throw an exception when ST1 waits on L.

          My preferred solution is (1).
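
          A minimal sketch of what solution (1) could look like, assuming the choice is made from a single boolean flag. SliceExecutionSketch and runOnSlices are hypothetical names, not the actual Slice executor. Each per-slice task simply reports the thread it ran on, to make the difference between the two modes visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: hypothetical names, not the actual Slice executor.
class SliceExecutionSketch {

    /** Runs one task per slice and returns the name of the thread each task ran on. */
    static List<String> runOnSlices(boolean multithreaded, int sliceCount) {
        List<Callable<String>> tasks = new ArrayList<>();
        for (int i = 0; i < sliceCount; i++) {
            tasks.add(() -> Thread.currentThread().getName());
        }

        List<String> threadNames = new ArrayList<>();
        if (multithreaded) {
            // Multithreaded=true: stay on the user thread, which already holds
            // the lock, and visit the slices one after another.
            for (Callable<String> task : tasks) {
                try {
                    threadNames.add(task.call());
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
            }
            return threadNames;
        }

        // Multithreaded=false: no kernel lock is held, so run each slice on its own thread.
        ExecutorService pool = Executors.newFixedThreadPool(sliceCount);
        try {
            for (Future<String> f : pool.invokeAll(tasks)) {
                threadNames.add(f.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
        return threadNames;
    }
}
```

          The trade-off is the one already stated: the serial branch cannot deadlock on L, but it gives up parallel execution across slices whenever openjpa.Multithreaded=true.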

          Fernando Padilla added a comment -

          the log file to show how the system hung on a query...

          Fernando Padilla added a comment -

          jstack, showing where the query hangs..


            People

            • Assignee:
              Pinaki Poddar
              Reporter:
              Fernando Padilla