Hadoop HDFS / HDFS-13799

TestEditLogTailer#testTriggersLogRollsForAllStandbyNN fails due to missing synchronization between rollEditsRpcExecutor and tailerThread shutdown

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.2.0, 3.1.2
    • Component/s: ha
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      The TestEditLogTailer#testTriggersLogRollsForAllStandbyNN unit test is failing in our internal environment with the following error:

      java.lang.AssertionError: Test resulted in an unexpected exit
      	at org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testTriggersLogRollsForAllStandbyNN(TestEditLogTailer.java:245)

      The test fails because of the following error during shutdown of the MiniDFSCluster:

      2018-07-31 21:59:27,806 [main] INFO  hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1965)) - Shutting down the Mini HDFS Cluster
      2018-07-31 21:59:27,806 [main] FATAL hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1968)) - Test resulted in an unexpected exit
      1: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@1ce1d2b6 rejected from java.util.concurrent.ThreadPoolExecutor@12263f5a[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
      	at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:265)
      	at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:441)
      	at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380)
      	at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397)
      	at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:482)
      	at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:393)
      Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@1ce1d2b6 rejected from java.util.concurrent.ThreadPoolExecutor@12263f5a[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
      	at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
      	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
      	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
      	at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134)
      	at java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:681)
      	at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:351)
      	at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:411)
      	... 4 more

      It looks like the EditLogTailer class does not handle shutdown correctly. Specifically, the EditLogTailer#stop() method shuts down the rollEditsRpcExecutor executor service before setting the tailerThread#shouldRun flag. This is a race condition: the tailerThread can still try to submit a new task to an executor service that has already been asked to shut down. When that happens, it receives an unexpected RejectedExecutionException, which is escalated through ExitUtil.terminate (see the stack trace above) and causes the test failure. The solution should be to properly synchronize the shutdown of tailerThread with that of rollEditsRpcExecutor, for example by shutting down the executor only after the tailer thread has stopped.
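
The shutdown ordering described above can be illustrated with a small, self-contained Java sketch. This is not the attached patch and not the real EditLogTailer code: the class TailerShutdownSketch, its fields, and its methods are hypothetical stand-ins that only mirror the roles of tailerThread, shouldRun, and rollEditsRpcExecutor. It shows that submitting to an executor after shutdown() is rejected, and one possible stop() ordering (clear the flag, interrupt and join the worker, then shut down the executor) that avoids the rejection.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical stand-in for the tailer/executor pair; names are illustrative only.
class TailerShutdownSketch {
    // Plays the role of rollEditsRpcExecutor.
    private final ExecutorService rollExecutor = Executors.newSingleThreadExecutor();
    // Plays the role of the tailer thread's shouldRun flag.
    private volatile boolean shouldRun = true;
    // Records an unexpected exception seen by the worker loop, if any.
    private volatile Throwable failure;
    // Plays the role of tailerThread.
    private final Thread tailer = new Thread(this::doWork, "tailer-sketch");

    private void doWork() {
        Runnable noop = () -> { };
        while (shouldRun) {
            try {
                // Analogous to triggering a log roll through the executor.
                rollExecutor.submit(noop).get();
                Thread.sleep(1);
            } catch (InterruptedException e) {
                // Interrupted by stop(); the loop condition is re-checked.
            } catch (RejectedExecutionException | ExecutionException e) {
                failure = e;
                return;
            }
        }
    }

    void start() {
        tailer.start();
    }

    // Assumed-safe ordering: stop the submitter first, then the executor.
    void stop() throws InterruptedException {
        shouldRun = false;
        tailer.interrupt();
        tailer.join();
        rollExecutor.shutdown();
    }

    Throwable failure() {
        return failure;
    }

    // Demonstrates the underlying hazard: submit() after shutdown() is rejected.
    static boolean rejectsAfterShutdown() {
        ExecutorService ex = Executors.newSingleThreadExecutor();
        ex.shutdown();
        try {
            ex.submit(() -> { });
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("rejected after shutdown: " + rejectsAfterShutdown());
        TailerShutdownSketch sketch = new TailerShutdownSketch();
        sketch.start();
        Thread.sleep(50);
        sketch.stop();
        System.out.println("worker failure: " + sketch.failure());
    }
}
```

If stop() instead called rollExecutor.shutdown() first, the worker loop could reach submit() before observing the cleared flag, which is the window the description refers to.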

      Attachments

        1. HDFS-13799-001.patch (1 kB, Hrishikesh Gadre)

            People

              Assignee: Hrishikesh Gadre (hgadre)
              Reporter: Hrishikesh Gadre (hgadre)
              Votes: 0
              Watchers: 4
