Apache Tez / TEZ-3967

DAGImpl: dag lock is unfair and can starve the writers


Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved

    Description

      Found while debugging HIVE-20103: a reader that arrives while another reader is still active can keep a waiting writer from obtaining the write lock.

      This is fundamentally bad for DAGImpl, because useful progress can only happen while the writeLock is held.

        public void handle(DAGEvent event) {
          ...
          try {
            writeLock.lock();

      The dispatcher thread stays parked in writeLock.lock():

         java.lang.Thread.State: WAITING (parking)
              at sun.misc.Unsafe.park(Native Method)
              - parking to wait for  <0x00007efb02246f40> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
              at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
              at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
              at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:1162)
              at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:149)
              at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:2251)
              at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:2242)
              at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:180)
              at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:115)
              at java.lang.Thread.run(Thread.java:745)
      

      while the read lock is passed around between overlapping

              at org.apache.tez.dag.app.dag.impl.DAGImpl.getDAGStatus(DAGImpl.java:901)
              at org.apache.tez.dag.app.dag.impl.DAGImpl.getDAGStatus(DAGImpl.java:940)
              at org.apache.tez.dag.api.client.DAGClientHandler.getDAGStatus(DAGClientHandler.java:73)
      

      calls.
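
The thread dump above shows the lock is a ReentrantReadWriteLock in its default non-fair mode (NonfairSync). As an illustration only, and not necessarily what the attached patches do, the sketch below builds the lock with the fair policy and checks that a reader arriving after a writer has queued does not obtain the read lock ahead of it. The class FairLockSketch and the method lateReaderOvertakesWriter are hypothetical names and are not part of Tez; only the fair mode is exercised here.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; it is not part of Tez.
class FairLockSketch {

    // Returns true if a reader that arrives after a writer has queued
    // still obtains the read lock ahead of that writer.
    static boolean lateReaderOvertakesWriter(ReentrantReadWriteLock lock) {
        AtomicBoolean overtook = new AtomicBoolean(false);
        lock.readLock().lock(); // an active reader, e.g. a status query in progress
        Thread writer = new Thread(() -> {
            lock.writeLock().lock(); // queues behind the active reader
            lock.writeLock().unlock();
        });
        Thread lateReader = new Thread(() -> {
            try {
                // The timed tryLock honors the fairness policy;
                // the untimed tryLock() would barge regardless.
                if (lock.readLock().tryLock(0, TimeUnit.MILLISECONDS)) {
                    overtook.set(true);
                    lock.readLock().unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            writer.start();
            while (!lock.hasQueuedThread(writer)) {
                Thread.sleep(1); // wait until the writer is in the wait queue
            }
            lateReader.start();
            lateReader.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            lock.readLock().unlock(); // lets the queued writer proceed
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return overtook.get();
    }

    public static void main(String[] args) {
        // Passing true selects the fair ordering policy.
        ReentrantReadWriteLock fair = new ReentrantReadWriteLock(true);
        System.out.println("fair=" + fair.isFair()
            + " lateReaderOvertakesWriter=" + lateReaderOvertakesWriter(fair));
    }
}
```

The trade-off is that the JDK documents fair locks as generally having lower throughput than the default non-fair ones, so whether this is acceptable for the DAGImpl event path would need measuring.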

      Attachments

        1. TEZ-3967.03.patch
          29 kB
          László Bodor
        2. TEZ-3967.02.patch
          17 kB
          László Bodor
        3. TEZ-3967.01.patch
          8 kB
          László Bodor


          People

            abstractdog László Bodor
            gopalv Gopal Vijayaraghavan
