  1. HBase
  2. HBASE-22867

The ForkJoinPool in CleanerChore spawns thousands of threads in our cluster with thousands of tables


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0, 2.3.0, 2.2.1, 2.1.6
    • Component/s: master
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Replace the ForkJoinPool in CleanerChore with a ThreadPoolExecutor, which limits the number of spawned threads and avoids frequent GC in the master. The replacement is internal to CleanerChore, so no config key changes; users can upgrade the HBase master without any other change.
    • Tags:
      master

      Description

      The thousands of spawned threads make a safepoint cost 80+ s in our Master JVM process.

      2019-08-15,19:35:35,861 INFO [main-SendThread(zjy-hadoop-prc-zk02.bj:11000)] org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 82260ms for sessionid 0x1691332e2d3aae5, closing socket connection and attempting reconnect
      

      The safepoint statistics on the JVM's stdout show 9126 threads and a sync time of 80+ s:

      vmop                    [threads: total initially_running wait_to_block]    [time: spin block sync cleanup vmop] page_trap_count
      32358.859: ForceAsyncSafepoint              [    9126         67            474    ]      [     1    28 86596    87   101    ]  0
      

      We also captured a jstack dump, which shows most of those threads are ForkJoinPool workers:

      $ cat 31162.stack.1  | grep 'ForkJoinPool-1-worker' | wc -l
      8648
      

      This is a dangerous bug, so I am marking it as a blocker.
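
The bounded-pool idea named in the release note can be sketched as follows. This is a minimal illustration and not the actual CleanerChore change: the class name `BoundedCleanerPoolSketch`, the helper methods, the `dir-scan-` thread name prefix, and the pool size of 4 are all hypothetical. It assumes a fixed-size ThreadPoolExecutor backed by an unbounded queue, so excess work waits in the queue instead of causing more threads to be created.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedCleanerPoolSketch {

  /** Builds a pool that never holds more than maxThreads workers (hypothetical helper). */
  static ThreadPoolExecutor newBoundedPool(int maxThreads) {
    AtomicInteger seq = new AtomicInteger();
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        maxThreads, maxThreads,          // core size == max size: a hard upper bound
        60, TimeUnit.SECONDS,            // idle workers are reclaimed after 60 s
        new LinkedBlockingQueue<>(),     // extra tasks queue up instead of spawning threads
        r -> {
          // Hypothetical thread name, chosen so the workers are easy to grep in a jstack.
          Thread t = new Thread(r, "dir-scan-" + seq.incrementAndGet());
          t.setDaemon(true);
          return t;
        });
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }

  /** Runs `tasks` trivial tasks and reports the largest number of threads the pool ever had. */
  static int runAndGetLargestPoolSize(int maxThreads, int tasks) throws InterruptedException {
    ThreadPoolExecutor pool = newBoundedPool(maxThreads);
    CountDownLatch done = new CountDownLatch(tasks);
    for (int i = 0; i < tasks; i++) {
      pool.execute(done::countDown);     // stand-in for scanning one directory
    }
    done.await();
    pool.shutdown();
    pool.awaitTermination(10, TimeUnit.SECONDS);
    return pool.getLargestPoolSize();
  }

  public static void main(String[] args) throws InterruptedException {
    // Even with 10,000 submitted tasks, the pool stays within its 4-thread limit.
    int largest = runAndGetLargestPoolSize(4, 10_000);
    System.out.println("largest pool size: " + largest);
  }
}
```

One caveat with this design: in a fixed-size pool, a task that blocks while waiting for child tasks submitted to the same pool can exhaust the workers and deadlock, so recursive directory traversal would need to avoid blocking joins (for example by chaining completions). The sketch does not cover that part.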

        Attachments

        1. 191318.stack
          638 kB
          Zheng Hu
        2. 191318.stack.1
          773 kB
          Zheng Hu
        3. 31162.stack.1
          14.84 MB
          Zheng Hu


              People

              • Assignee:
                openinx Zheng Hu
                Reporter:
                openinx Zheng Hu