  1. Hadoop HDFS
  2. HDFS-5276

FileSystem.Statistics has a performance issue under multi-threaded read/write.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.4-alpha
    • Fix Version/s: 2.3.0
    • Component/s: None
    • Labels:
      None
    • Target Version/s:

      Description

      FileSystem.Statistics is a singleton variable for each FS scheme, so every read/write on HDFS leads to an AtomicLong.getAndAdd() call on the same shared counter. AtomicLong does not perform well under heavy multi-threaded contention (say, more than 30 threads), so this can cause a serious performance issue. In our Spark test profile, with 32 threads reading data from HDFS, about 70% of CPU time was spent in FileSystem.Statistics.incrementBytesRead().
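
      To illustrate the contention pattern described above, here is a minimal sketch that contrasts a single shared AtomicLong with per-thread counters that are summed on read. It is not the committed patch and does not use Hadoop classes; StatsSketch, SharedStats, PerThreadStats, Cell and run are hypothetical names chosen for this example, and the per-thread approach is only assumed to be in the spirit of the attached ThreadLocalStat.patch.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;

// Hypothetical sketch; none of these classes exist in Hadoop.
class StatsSketch {

    /** Baseline: one shared counter, so every thread contends on the same atomic. */
    static final class SharedStats {
        private final AtomicLong bytesRead = new AtomicLong();

        void incrementBytesRead(long n) { bytesRead.getAndAdd(n); }

        long getBytesRead() { return bytesRead.get(); }
    }

    /** Alternative: each thread writes its own cell; readers sum all cells. */
    static final class PerThreadStats {
        private static final class Cell {
            volatile long bytesRead; // written by one thread, read by any
        }

        private final ConcurrentLinkedQueue<Cell> cells = new ConcurrentLinkedQueue<>();

        // A thread registers its cell the first time it touches the statistics.
        private final ThreadLocal<Cell> local = ThreadLocal.withInitial(() -> {
            Cell c = new Cell();
            cells.add(c);
            return c;
        });

        void incrementBytesRead(long n) {
            Cell c = local.get();
            c.bytesRead = c.bytesRead + n; // safe: only the owning thread writes
        }

        long getBytesRead() {
            long sum = 0;
            for (Cell c : cells) {
                sum += c.bytesRead;
            }
            return sum;
        }
    }

    /** Runs `threads` threads, each calling inc.accept(1) `perThread` times. */
    static void run(int threads, int perThread, LongConsumer inc) {
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    inc.accept(1L);
                }
            });
            ts[i].start();
        }
        try {
            for (Thread t : ts) {
                t.join(); // join makes each thread's writes visible to the caller
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        SharedStats shared = new SharedStats();
        PerThreadStats perThread = new PerThreadStats();
        run(32, 100_000, shared::incrementBytesRead);
        run(32, 100_000, perThread::incrementBytesRead);
        System.out.println("shared total:     " + shared.getBytesRead());
        System.out.println("per-thread total: " + perThread.getBytesRead());
    }
}
```

      The trade-off in this sketch is that increments no longer touch a shared atomic, while reads become a sum over all cells and are only approximately current while writers are active. Cells of finished threads are never reclaimed here, which a real implementation would need to handle. On Java 8 and later, java.util.concurrent.atomic.LongAdder offers a similar striped-counter design in the JDK.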

        Attachments

        1. ThreadLocalStat.patch
          3 kB
          Binglin Chang
        2. TestFileSystemStatistics.java
          2 kB
          Binglin Chang
        3. jstack-trace.PNG
          118 kB
          Chengxiang Li
        4. hdfs-test.PNG
          122 kB
          Chengxiang Li
        5. HDFSStatisticTest.java
          2 kB
          Chengxiang Li
        6. HDFS-5276.003.patch
          15 kB
          Colin P. McCabe
        7. HDFS-5276.002.patch
          14 kB
          Colin P. McCabe
        8. HDFS-5276.001.patch
          14 kB
          Colin P. McCabe
        9. DisableFSReadWriteBytesStat.patch
          3 kB
          Binglin Chang

              People

              • Assignee:
                cmccabe Colin P. McCabe
                Reporter:
                chengxiang li Chengxiang Li
              • Votes:
                0
                Watchers:
                9
