Hadoop HDFS / HDFS-11900

Hedged reads thread pool creation not synchronized



    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.0
    • Fix Version/s: 3.2.0, 3.1.1
    • Component/s: hdfs-client


      The non-static synchronized method initThreadsNumForHedgedReads cannot synchronize access to the static class variable HEDGED_READ_THREAD_POOL.

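        // Static: one pool shared by every DFSClient instance in the JVM.
        private static ThreadPoolExecutor HEDGED_READ_THREAD_POOL;
        // Instance-synchronized: the lock is this DFSClient object, not DFSClient.class.
        private synchronized void initThreadsNumForHedgedReads(int num) {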

      Two DFS clients may race to update the same static variable because each call locks its own DFS client object, not the shared DFSClient class object.
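
      The locking problem can be reproduced in isolation. The sketch below is a minimal illustration, not the DFSClient code: HedgedLockDemo, Client, init and bothInside are hypothetical names, and only the shape of the lock (an instance-synchronized method) mirrors the issue. A latch shows that two threads can be inside the synchronized method at the same time when they use different client objects, and cannot when they share one.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class HedgedLockDemo {
    // Hypothetical stand-in for a DFS client; only the locking shape matters here.
    static final class Client {
        // Instance-synchronized: the monitor is `this`, so each Client has its own lock.
        synchronized boolean init(CountDownLatch bothInside) {
            bothInside.countDown();
            try {
                // Returns true only if the other thread also got inside before the timeout.
                return bothInside.await(1, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    // True if both threads were inside init() at the same time.
    static boolean bothInside(Client a, Client b) {
        CountDownLatch latch = new CountDownLatch(2);
        AtomicBoolean peerSawOverlap = new AtomicBoolean(false);
        Thread peer = new Thread(() -> peerSawOverlap.set(b.init(latch)));
        peer.start();
        boolean callerSawOverlap = a.init(latch);
        try {
            peer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return callerSawOverlap && peerSawOverlap.get();
    }

    public static void main(String[] args) {
        System.out.println("distinct clients overlap: " + bothInside(new Client(), new Client()));
        Client shared = new Client();
        System.out.println("same client overlaps: " + bothInside(shared, shared));
    }
}
```

      With distinct clients nothing serializes the two calls, so any static state touched inside the method is unprotected.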

      There are 2 possible fixes:
      1. "Global thread pool": Change initThreadsNumForHedgedReads to static
      2. "Per-client thread pool": Change HEDGED_READ_THREAD_POOL to non-static
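
      As a rough sketch of what the two options could look like, the code below pairs each field with a lock of matching scope. It is not the HDFS-11900 patch: the class names, the newPool helper, the constructor arguments and the initialize-once guard are all assumptions made for illustration.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class HedgedPoolFixes {
    // Hypothetical helper: a pool capped at `num` threads.
    // No thread is started until a task is submitted.
    static ThreadPoolExecutor newPool(int num) {
        return new ThreadPoolExecutor(1, num, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
    }

    // Fix 1, "global thread pool": static field, static synchronized method.
    // The monitor is GlobalPoolClient.class, shared by all callers.
    static final class GlobalPoolClient {
        private static ThreadPoolExecutor pool;

        static synchronized ThreadPoolExecutor initPool(int num) {
            if (num > 0 && pool == null) { // assumed initialize-once guard
                pool = newPool(num);
            }
            return pool;
        }
    }

    // Fix 2, "per-client thread pool": instance field, instance synchronized method.
    // The monitor is `this`, which is exactly the object that owns the field.
    static final class PerClientPoolClient {
        private ThreadPoolExecutor pool;

        synchronized ThreadPoolExecutor initPool(int num) {
            if (num > 0 && pool == null) { // assumed initialize-once guard
                pool = newPool(num);
            }
            return pool;
        }
    }

    public static void main(String[] args) {
        PerClientPoolClient a = new PerClientPoolClient();
        PerClientPoolClient b = new PerClientPoolClient();
        System.out.println("per-client sizes: " + a.initPool(10).getMaximumPoolSize()
                + ", " + b.initPool(5).getMaximumPoolSize());
        System.out.println("global pool shared: "
                + (GlobalPoolClient.initPool(10) == GlobalPoolClient.initPool(5)));
    }
}
```

      In both variants the scope of the lock matches the scope of the field it guards, which is the property the original method lacks.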

      From the description for property dfs.client.hedged.read.threadpool.size:

      to a positive number. The threadpool size is how many threads to dedicate
      to the running of these 'hedged', concurrent reads in your client.

      it seems that the thread pool is meant to be per DFS client.

      Let's assume we go with #1 "Global thread pool". If one DFS client has the property set to 10 in its config and another has it set to 5, what is the size of the global thread pool supposed to be? 5? 10? Or 15?
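
      To make the ambiguity concrete, the sketch below spells out three hypothetical sizing policies a global pool could adopt. None of them is taken from HDFS; GlobalPoolSizing and its methods are illustrative names only. The input list is the sequence of sizes requested by clients, in the order they initialize.

```java
import java.util.Arrays;
import java.util.List;

final class GlobalPoolSizing {
    // Policy A (hypothetical): whichever client initializes first fixes the size.
    static int firstWins(List<Integer> requested) {
        return requested.isEmpty() ? 0 : requested.get(0);
    }

    // Policy B (hypothetical): the pool is as large as the largest request.
    static int largest(List<Integer> requested) {
        int max = 0;
        for (int size : requested) {
            max = Math.max(max, size);
        }
        return max;
    }

    // Policy C (hypothetical): the pool grows by each client's request.
    static int sum(List<Integer> requested) {
        int total = 0;
        for (int size : requested) {
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> requested = Arrays.asList(10, 5);
        System.out.println("first wins: " + firstWins(requested));
        System.out.println("largest:    " + largest(requested));
        System.out.println("sum:        " + sum(requested));
    }
}
```

      Under the first policy the result depends on which client happens to start first, so the same pair of configurations can yield either 10 or 5.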

      The 2nd fix seems more reasonable to me.


        1. HDFS-11900.001.patch
          1 kB
          John Zhuge




              • Assignee:
                jzhuge John Zhuge


                • Created: