Hive / HIVE-15879

Fix HiveMetaStoreChecker.checkPartitionDirs method


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.3.0
    • Component/s: Metastore
    • Labels: None

    Description

      HIVE-15803 fixed the msck hang in the HiveMetaStoreChecker.checkPartitionDirs method by adding a check for whether the thread pool has any spare threads. If it does not, the method falls back to single-threaded listing of the files.

          if (pool != null) {
            synchronized (pool) {
              // In case of recursive calls, it is possible to deadlock with TP. Check TP usage here.
              if (pool.getActiveCount() < pool.getMaximumPoolSize()) {
                useThreadPool = true;
              }
      
              if (!useThreadPool) {
                if (LOG.isDebugEnabled()) {
                  LOG.debug("Not using threadPool as active count:" + pool.getActiveCount()
                      + ", max:" + pool.getMaximumPoolSize());
                }
              }
            }
          }
      

      However, the Javadoc for getActiveCount() says:

      Returns the approximate number of threads that are actively executing tasks.

      Because the value is only approximate, it is not guaranteed to match the exact number of active threads. The method is therefore still exposed to the msck hang in rare corner cases.

      We could either:
      1. Use an atomic counter to track exactly how many threads are actively running.
      2. Revisit the method itself to make it much simpler, for example by changing the recursive implementation to an iterative one in which worker threads pick tasks from a queue until the queue is empty.
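
The second option can be sketched roughly as follows. This is a minimal illustration only, not the code from the attached patches: the class IterativeDirLister, the method listDirsAtDepth, and the Task holder are hypothetical names, and it walks a local file system with java.nio.file rather than Hadoop's FileSystem API. A fixed set of workers drains a shared queue, and an exact atomic count of outstanding directories (in the spirit of the first option) tells the workers when to stop, so no task ever waits on another task submitted to the same pool.

```java
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of an iterative, queue-based directory walk. */
final class IterativeDirLister {

  /** A directory still to be listed, with its depth below the root. */
  private static final class Task {
    final Path dir;
    final int depth;
    Task(Path dir, int depth) { this.dir = dir; this.depth = depth; }
  }

  /** Returns all directories exactly maxDepth levels below root. */
  static Set<Path> listDirsAtDepth(Path root, int maxDepth, int numThreads)
      throws InterruptedException, ExecutionException {
    BlockingQueue<Task> queue = new LinkedBlockingQueue<>();
    Set<Path> result = ConcurrentHashMap.newKeySet();
    // Exact number of tasks that are queued or currently being processed.
    AtomicInteger pending = new AtomicInteger(1);
    queue.add(new Task(root, 0));

    Callable<Void> worker = () -> {
      while (pending.get() > 0) {
        Task task = queue.poll(10, TimeUnit.MILLISECONDS);
        if (task == null) {
          continue; // queue momentarily empty; re-check the pending count
        }
        try {
          if (task.depth == maxDepth) {
            result.add(task.dir);
          } else {
            try (DirectoryStream<Path> children = Files.newDirectoryStream(task.dir)) {
              for (Path child : children) {
                if (Files.isDirectory(child)) {
                  // Count the child before enqueueing so pending never
                  // reaches zero while work is still outstanding.
                  pending.incrementAndGet();
                  queue.add(new Task(child, task.depth + 1));
                }
              }
            }
          }
        } finally {
          pending.decrementAndGet();
        }
      }
      return null;
    };

    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      List<Future<Void>> workers = new ArrayList<>();
      for (int i = 0; i < numThreads; i++) {
        workers.add(pool.submit(worker));
      }
      for (Future<Void> f : workers) {
        f.get(); // propagates any failure from a worker
      }
      return result;
    } finally {
      pool.shutdown();
    }
  }
}
```

Because the workers only take tasks from the queue and never block on the result of another pooled task, the walk should complete even with a single thread, which is the case where a recursive submit-and-wait design can deadlock.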

      Attachments

        1. HIVE-15879.01.patch
          27 kB
          Vihang Karajgaonkar
        2. HIVE-15879.02.patch
          27 kB
          Vihang Karajgaonkar
        3. HIVE-15879.03.patch
          28 kB
          Vihang Karajgaonkar
        4. HIVE-15879.04.patch
          28 kB
          Vihang Karajgaonkar

          People

            Assignee: Vihang Karajgaonkar (vihangk1)
            Reporter: Vihang Karajgaonkar (vihangk1)
            Votes: 0
            Watchers: 6

            Dates

              Created:
              Updated:
              Resolved: