Hadoop HDFS / HDFS-548

TestFsck takes nearly 10 minutes to run - a quarter of the entire hdfs-test time

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.0
    • Component/s: test
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      run-test-hdfs was run repeatedly (n = 16) and the execution time of each test was recorded. TestFsck was by far the longest test, with a median run time of 564 seconds, nearly 10 minutes. That is about a quarter of the average 40-minute run for all the tests, and much too long for any individual test.

      1. slowFsck.patch
        0.6 kB
        Hairong Kuang

        Activity

        Jakob Homan added a comment -

        Here are the test times: (n = 16, mean=512.4sec, median=564.8sec, stddev=126.07sec), for those who love numbers.

        TestFsck execution time (sec)
        512.39
        564.80
        126.07
        652.04
        628.93
        602.27
        569.37
        465.24
        219.35
        385.75
        486.79
        617.26
        636.08
        586.59
        408.39
        320.59
        588.53
        470.80
        560.22
        Tsz Wo Nicholas Sze added a comment -

        Surprisingly, the min is 126.07, which is outside 3 standard deviations. If we assume a normal distribution, the probability of getting it is < 0.003, but you got it in 1 of 16 runs. What makes it run so fast?
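
The "outside 3 standard deviations" remark can be checked with a one-line z-score. The sketch below takes the three figures exactly as first posted (126.07 as the supposed minimum, mean 512.39, stddev 126.07); the class and method names are made up for illustration.

```java
public class ThreeSigmaCheck {
    // Number of standard deviations by which x lies above (+) or below (-) the mean.
    public static double zScore(double x, double mean, double stddev) {
        return (x - mean) / stddev;
    }

    public static void main(String[] args) {
        // Figures as first posted in the thread.
        double z = zScore(126.07, 512.39, 126.07);
        System.out.println("z = " + z);
        System.out.println("outside 3 standard deviations: " + (Math.abs(z) > 3.0));
    }
}
```

The z-score comes out at roughly -3.06, just past the 3-sigma mark, which is why the value looked suspicious for a sample of only 16 runs.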

        Jakob Homan added a comment -

        Blast. I forgot to remove the mean, median and stddev rows when I copied it. Here is the correct table:

        TestFsck execution time (sec)
        652.04
        628.93
        602.27
        569.37
        465.24
        219.35
        385.75
        486.79
        617.26
        636.08
        586.59
        408.39
        320.59
        588.53
        470.80
        560.22
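
For anyone who wants to re-derive the summary figures from these 16 values, here is a minimal sketch. It assumes the reported stddev is the sample standard deviation (dividing by n - 1); the thread does not say which convention was used. The class and method names are illustrative only.

```java
import java.util.Arrays;

public class TestFsckRunStats {
    // The 16 TestFsck run times (seconds) from the corrected table.
    public static final double[] TIMES = {
        652.04, 628.93, 602.27, 569.37, 465.24, 219.35, 385.75, 486.79,
        617.26, 636.08, 586.59, 408.39, 320.59, 588.53, 470.80, 560.22
    };

    public static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    public static double median(double[] xs) {
        double[] sorted = xs.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);
        int n = sorted.length;
        // For an even count, average the two middle values.
        return (n % 2 == 1) ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    // Sample standard deviation (divides by n - 1); an assumed convention.
    public static double sampleStdDev(double[] xs) {
        double m = mean(xs);
        double sumSq = 0.0;
        for (double x : xs) {
            sumSq += (x - m) * (x - m);
        }
        return Math.sqrt(sumSq / (xs.length - 1));
    }

    public static void main(String[] args) {
        System.out.println("mean   = " + mean(TIMES));
        System.out.println("median = " + median(TIMES));
        System.out.println("stddev = " + sampleStdDev(TIMES));
    }
}
```

With the n - 1 convention the results round to the reported mean, median and stddev (512.39, 564.80, 126.07). The true minimum in the corrected table is 219.35 seconds.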
        Jakob Homan added a comment -

        "Surprisingly, the min is 126.07, which is outside 3 standard deviations. If we assume a normal distribution, the probability of getting it is < 0.003, but you got it in 1 of 16 runs. What makes it run so fast?"

        I messed up the chart, but the standard deviation for the test is high in general.

        Hairong Kuang added a comment -

        TestFsck#testFsckMove corrupts a block file on disk and then depends on block reports to notify the NameNode (NN) of the corrupt blocks. However, since the change in HADOOP-4584, block reports are always sent from the in-memory volumeMap. The volumeMap in turn depends on the block scanner to scan the disk and update it when the on-disk state differs. The test therefore needs a shorter scanner period.
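
The timing effect described here can be illustrated with a toy model. This is not HDFS code: the class, the method names, and the periods used (300 s, 1 s, 10 s) are all hypothetical and are not HDFS defaults. It assumes scans and reports both fire on fixed periods starting at t = 0, and that a report only reflects on-disk corruption once a scan has refreshed the in-memory map.

```java
public class ScannerDelayModel {
    // Smallest multiple of period that is >= t (for t >= 0, period > 0).
    public static long ceilToMultiple(long t, long period) {
        return ((t + period - 1) / period) * period;
    }

    /**
     * Time (seconds) of the first block report that would mention a block
     * corrupted on disk at corruptAt, in this simplified model.
     */
    public static long detectionTime(long corruptAt, long scanPeriod, long reportPeriod) {
        // The in-memory map only learns about the corruption at the next scan.
        long firstScan = ceilToMultiple(corruptAt, scanPeriod);
        // The next report at or after that scan is the first to carry the change.
        return ceilToMultiple(firstScan, reportPeriod);
    }

    public static void main(String[] args) {
        // Hypothetical periods, chosen only to show the effect.
        System.out.println(detectionTime(5, 300, 10)); // long scan period
        System.out.println(detectionTime(5, 1, 10));   // short scan period
    }
}
```

In this model the first call yields 300 and the second yields 10: with a long scan period the wait is dominated by the scanner, not by the report interval, which is the reason shortening the scanner period speeds up the test.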

        Hairong Kuang added a comment -

        With this change, TestFsck runs in around 1 minute.

        Suresh Srinivas added a comment -

        +1. Thanks Hairong for catching this.

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12416801/slowFsck.patch
        against trunk revision 805203.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/70/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/70/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/70/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/70/console

        This message is automatically generated.

        Hairong Kuang added a comment -

        I just committed this!

        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #55 (see http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/55/).
        TestFsck takes nearly 10 minutes to run. Contributed by Hairong Kuang.


          People

          • Assignee:
            Hairong Kuang
            Reporter:
            Jakob Homan
          • Votes:
            0
            Watchers:
            2
