Hadoop Map/Reduce: MAPREDUCE-1823

Reduce the number of calls of HarFileSystem.getFileStatus in RaidNode

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.22.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      RaidNode makes many calls to HarFileSystem.getFileStatus. Each call fetches information from a DataNode, so it is slow, and it has become the bottleneck of the RaidNode. It would be nice to make this more efficient.

        Activity

        Scott Chen added a comment -

        Here's the corresponding jstack:

                at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
                at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
                at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
                at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
                - locked <0x00002aaab7e19810> (a sun.nio.ch.Util$1)
                - locked <0x00002aaab7e197f8> (a java.util.Collections$UnmodifiableSet)
                - locked <0x00002aaab7e19468> (a sun.nio.ch.EPollSelectorImpl)
                at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
                at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:332)
                at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
                at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
                at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
                at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
                at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
                - locked <0x00002aaae427a320> (a java.io.BufferedInputStream)
                at java.io.DataInputStream.readShort(DataInputStream.java:295)
                at org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1436)
                at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1698)
                - locked <0x00002aaae4264f38> (a org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
                at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1815)
                - locked <0x00002aaae4264f38> (a org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
                at java.io.DataInputStream.read(DataInputStream.java:83)
                at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
                at org.apache.hadoop.util.LineReader.readLine(LineReader.java:187)
                at org.apache.hadoop.fs.HarFileSystem.fileStatusInIndex(HarFileSystem.java:441)
                at org.apache.hadoop.fs.HarFileSystem.getFileStatus(HarFileSystem.java:616)
                at org.apache.hadoop.raid.RaidNode.getParityFile(RaidNode.java:541)
                at org.apache.hadoop.raid.RaidNode.getParityFile(RaidNode.java:561)
                at org.apache.hadoop.raid.RaidNode.recurse(RaidNode.java:639)
                at org.apache.hadoop.raid.RaidNode.recurse(RaidNode.java:655)
                at org.apache.hadoop.raid.RaidNode.recurse(RaidNode.java:655)
                at org.apache.hadoop.raid.RaidNode.selectFiles(RaidNode.java:594)
                at org.apache.hadoop.raid.RaidNode.access$300(RaidNode.java:63)
                at org.apache.hadoop.raid.RaidNode$TriggerMonitor.doProcess(RaidNode.java:374)
                at org.apache.hadoop.raid.RaidNode$TriggerMonitor.run(RaidNode.java:313)
                at java.lang.Thread.run(Thread.java:619)
        
        Scott Chen added a comment -

        In the patch, where we used to call getFileStatus() while recursing through the policy, we now call listStatus() instead and put the result in a map. This reduces the number of RPCs to the NameNode.

        There is no unit test. This is an optimization and the code path is covered by the original tests: TestRaidNode, TestRaidPurge and TestRaidHar.
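
The idea can be illustrated with a minimal, self-contained sketch. It is not the patch itself and does not use Hadoop classes: `ListingCache`, `Lister`, `exists` and `remoteCalls` are hypothetical names chosen for illustration, and `Lister.list` merely stands in for a directory listing such as listStatus(). The sketch assumes the directory contents do not change while the cache is alive.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration: answer per-file existence checks from one
// cached directory listing instead of issuing one remote call per file.
class ListingCache {
    /** Stand-in for the file system; NOT a Hadoop interface. */
    public interface Lister {
        /** Returns the child names of dir, or null if dir does not exist. */
        List<String> list(String dir);
    }

    private final Lister lister;
    // Directory path -> names of its children, filled on first use.
    private final Map<String, Set<String>> cache = new HashMap<>();
    private int remoteCalls = 0;

    public ListingCache(Lister lister) {
        this.lister = lister;
    }

    /** Checks whether name exists under dir, listing dir at most once. */
    public boolean exists(String dir, String name) {
        Set<String> children = cache.get(dir);
        if (children == null) {
            remoteCalls++; // the only place a (simulated) remote call happens
            List<String> listed = lister.list(dir);
            children = (listed == null)
                ? Collections.<String>emptySet()
                : new HashSet<>(listed);
            cache.put(dir, children);
        }
        return children.contains(name);
    }

    /** Number of listings actually requested from the Lister. */
    public int remoteCalls() {
        return remoteCalls;
    }
}
```

With this shape, checking many files in the same directory costs a single listing, at the price of possibly serving stale results if the directory changes during the scan.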

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12447914/MAPREDUCE-1823.txt
        against trunk revision 957437.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/266/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/266/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/266/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/266/console

        This message is automatically generated.


          People

          • Assignee: Scott Chen
          • Reporter: Scott Chen
          • Votes: 0
          • Watchers: 4
