HDFS-5276: FileSystem.Statistics has a performance issue on multi-threaded read/write


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.4-alpha
    • Fix Version/s: 2.3.0
    • Component/s: None
    • Labels: None

    Description

      FileSystem.Statistics is a singleton variable for each FS scheme, so every read/write on HDFS leads to an AtomicLong.getAndAdd() call. AtomicLong does not perform well under heavy multi-threaded contention (say, more than 30 threads), so this can cause a serious performance issue. In our Spark test profile, with 32 threads reading data from HDFS, about 70% of CPU time was spent in FileSystem.Statistics.incrementBytesRead().
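
To illustrate the contention described above, here is a minimal Java sketch. It is not the code from any of the attached patches. It contrasts one shared AtomicLong, updated by every thread, with per-thread cells that are only summed when the total is read. All names (StatisticsContentionSketch, SharedCounter, PerThreadCounter, runReaders) are hypothetical, and the sketch assumes Java 8 or later.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;

public class StatisticsContentionSketch {

    /** One shared counter: every thread updates the same memory location. */
    public static final class SharedCounter {
        private final AtomicLong bytesRead = new AtomicLong();
        public void incrementBytesRead(long n) { bytesRead.getAndAdd(n); }
        public long getBytesRead() { return bytesRead.get(); }
    }

    /** One cell per thread: writers never share a cell, readers sum all cells. */
    public static final class PerThreadCounter {
        private static final class Cell { volatile long value; }

        // Cells of finished threads are kept so their counts are not lost.
        private final Queue<Cell> cells = new ConcurrentLinkedQueue<>();
        private final ThreadLocal<Cell> local = ThreadLocal.withInitial(() -> {
            Cell c = new Cell();
            cells.add(c);
            return c;
        });

        public void incrementBytesRead(long n) {
            Cell c = local.get();
            c.value = c.value + n; // safe: only the owning thread writes this cell
        }

        public long getBytesRead() {
            long sum = 0;
            for (Cell c : cells) sum += c.value;
            return sum;
        }
    }

    /** Runs `threads` workers that each call `inc` `readsPerThread` times; returns elapsed nanoseconds. */
    public static long runReaders(int threads, int readsPerThread, LongConsumer inc) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < readsPerThread; i++) inc.accept(1L);
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int threads = 32;            // thread count taken from the report
        int readsPerThread = 1_000_000; // arbitrary workload size for the sketch
        SharedCounter shared = new SharedCounter();
        PerThreadCounter perThread = new PerThreadCounter();
        long sharedNs = runReaders(threads, readsPerThread, shared::incrementBytesRead);
        long localNs = runReaders(threads, readsPerThread, perThread::incrementBytesRead);
        System.out.println("shared AtomicLong: " + sharedNs / 1_000_000 + " ms, total=" + shared.getBytesRead());
        System.out.println("per-thread cells:  " + localNs / 1_000_000 + " ms, total=" + perThread.getBytesRead());
    }
}
```

Both counters report the same total; the difference is only in how many threads write to the same location. Timings depend heavily on the machine and JVM, so this is a way to observe the effect rather than a benchmark.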

      Attachments

        1. DisableFSReadWriteBytesStat.patch
          3 kB
          Binglin Chang
        2. HDFS-5276.001.patch
          14 kB
          Colin McCabe
        3. HDFS-5276.002.patch
          14 kB
          Colin McCabe
        4. HDFS-5276.003.patch
          15 kB
          Colin McCabe
        5. HDFSStatisticTest.java
          2 kB
          Chengxiang Li
        6. hdfs-test.PNG
          122 kB
          Chengxiang Li
        7. jstack-trace.PNG
          118 kB
          Chengxiang Li
        8. TestFileSystemStatistics.java
          2 kB
          Binglin Chang
        9. ThreadLocalStat.patch
          3 kB
          Binglin Chang

            People

              Assignee: Colin McCabe
              Reporter: Chengxiang Li
              Votes: 0
              Watchers: 9

              Dates

                Created:
                Updated:
                Resolved: