Hadoop HDFS / HDFS-5276

FileSystem.Statistics has a performance issue with multi-threaded read/write.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.4-alpha
    • Fix Version/s: 2.3.0
    • Component/s: None
    • Labels:
      None
    • Target Version/s:

      Description

      FileSystem.Statistics is a singleton for each FS scheme, so every read or write on HDFS leads to an AtomicLong.getAndAdd() call. AtomicLong does not perform well under heavy multi-threaded contention (say, more than 30 threads), so this can cause a serious performance issue. In our Spark test profile, with 32 threads reading data from HDFS, about 70% of CPU time was spent in FileSystem.Statistics.incrementBytesRead().
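
      To make the contention pattern concrete, here is a minimal sketch that contrasts a single shared AtomicLong with per-thread counters that are summed on read. It is an illustration only, not the Hadoop code or the attached patches: StatsSketch, BytesReadCounter, SharedCounter, PerThreadCounter and runReaders are hypothetical names, and the per-thread approach is assumed from the name of the ThreadLocalStat.patch attachment rather than taken from its contents.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; none of these names come from Hadoop.
class StatsSketch {
    interface BytesReadCounter {
        void incrementBytesRead(long n);
        long getBytesRead();
    }

    /** Shared counter: every thread calls getAndAdd on the same AtomicLong. */
    static final class SharedCounter implements BytesReadCounter {
        private final AtomicLong bytesRead = new AtomicLong();
        public void incrementBytesRead(long n) { bytesRead.getAndAdd(n); }
        public long getBytesRead() { return bytesRead.get(); }
    }

    /** Per-thread counter: each thread writes its own cell; reads sum all cells. */
    static final class PerThreadCounter implements BytesReadCounter {
        private static final class Cell { volatile long value; }

        // Cells of finished threads are kept so their counts are not lost;
        // a real implementation would need some cleanup strategy.
        private final List<Cell> cells = new CopyOnWriteArrayList<>();
        private final ThreadLocal<Cell> local = ThreadLocal.withInitial(() -> {
            Cell c = new Cell();
            cells.add(c);
            return c;
        });

        public void incrementBytesRead(long n) {
            Cell c = local.get();
            // Only the owning thread writes this cell, so no atomic
            // read-modify-write is needed; volatile makes it visible to readers.
            c.value = c.value + n;
        }

        public long getBytesRead() {
            long sum = 0;
            for (Cell c : cells) {
                sum += c.value;
            }
            return sum;
        }
    }

    /** Starts `threads` workers that each record `readsPerThread` one-byte reads. */
    static long runReaders(BytesReadCounter counter, int threads, int readsPerThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < readsPerThread; j++) {
                    counter.incrementBytesRead(1);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.getBytesRead();
    }
}
```

      Both counters report the same total. The difference is that the per-thread variant keeps the hot increment path free of cross-thread atomic operations and pays the cost on the (rarer) read path instead. Whether that trade-off helps in a given workload would need to be measured.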

      1. HDFS-5276.003.patch
        15 kB
        Colin Patrick McCabe
      2. TestFileSystemStatistics.java
        2 kB
        Binglin Chang
      3. HDFS-5276.002.patch
        14 kB
        Colin Patrick McCabe
      4. HDFS-5276.001.patch
        14 kB
        Colin Patrick McCabe
      5. ThreadLocalStat.patch
        3 kB
        Binglin Chang
      6. DisableFSReadWriteBytesStat.patch
        3 kB
        Binglin Chang
      7. jstack-trace.PNG
        118 kB
        Chengxiang Li
      8. hdfs-test.PNG
        122 kB
        Chengxiang Li
      9. HDFSStatisticTest.java
        2 kB
        Chengxiang Li

          Activity

          Chengxiang Li created issue -
          Chengxiang Li made changes -
          Field Original Value New Value
          Attachment HDFSStatisticTest.java [ 12605850 ]
          Attachment hdfs-test.PNG [ 12605851 ]
          Attachment jstack-trace.PNG [ 12605852 ]
          Chengxiang Li made changes -
          Description: "during our test profile" changed to "during our spark test profile"; the rest of the description is unchanged.
          Binglin Chang made changes -
          Link This issue is related to HADOOP-5318 [ HADOOP-5318 ]
          Binglin Chang made changes -
          Attachment DisableFSReadWriteBytesStat.patch [ 12605866 ]
          Binglin Chang made changes -
          Attachment ThreadLocalStat.patch [ 12606126 ]
          Binglin Chang made changes -
          Attachment ThreadLocalStat.patch [ 12606126 ]
          Binglin Chang made changes -
          Attachment ThreadLocalStat.patch [ 12606127 ]
          Colin Patrick McCabe made changes -
          Assignee Colin Patrick McCabe [ cmccabe ]
          Colin Patrick McCabe made changes -
          Attachment HDFS-5276.001.patch [ 12607862 ]
          Colin Patrick McCabe made changes -
          Status Open [ 1 ] -> Patch Available [ 10002 ]
          Target Version/s 2.3.0 [ 12324588 ]
          Colin Patrick McCabe made changes -
          Attachment HDFS-5276.002.patch [ 12607889 ]
          Binglin Chang made changes -
          Attachment TestFileSystemStatistics.java [ 12608143 ]
          Colin Patrick McCabe made changes -
          Attachment HDFS-5276.003.patch [ 12608402 ]
          Colin Patrick McCabe made changes -
          Status Patch Available [ 10002 ] -> Resolved [ 5 ]
          Fix Version/s 2.3.0 [ 12324588 ]
          Resolution Fixed [ 1 ]
          Arun C Murthy made changes -
          Fix Version/s 2.3.0 [ 12325255 ]
          Fix Version/s 2.4.0 [ 12324588 ]
          Arun C Murthy made changes -
          Status Resolved [ 5 ] -> Closed [ 6 ]

            People

            • Assignee:
              Colin Patrick McCabe
              Reporter:
              Chengxiang Li
            • Votes:
              0
              Watchers:
              10
