Hadoop HDFS / HDFS-548

TestFsck takes nearly 10 minutes to run - a quarter of the entire hdfs-test time

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.0
    • Component/s: test
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      run-test-hdfs was run repeatedly (n = 16) and the execution time of each test was recorded. TestFsck was by far the longest test, with a median run time of 564 seconds, nearly 10 minutes. That is about a quarter of the average 40-minute run of the whole test suite, and much too long for any individual test.

      1. slowFsck.patch
        0.6 kB
        Hairong Kuang

        Activity

        Jakob Homan created issue -
        Jakob Homan added a comment -

        Here are the test times: (n = 16, mean=512.4sec, median=564.8sec, stddev=126.07sec), for those who love numbers.

        TestFsck execution time (sec)
        512.39
        564.80
        126.07
        652.04
        628.93
        602.27
        569.37
        465.24
        219.35
        385.75
        486.79
        617.26
        636.08
        586.59
        408.39
        320.59
        588.53
        470.80
        560.22
        Tsz Wo Nicholas Sze added a comment -

        Surprisingly, the min is 126.07, which is more than 3 standard deviations below the mean. If we assume a normal distribution, the probability of getting it is < 0.003, but you got it in 1 of 16 runs. What makes it run so fast?

        Jakob Homan added a comment -

        Blast. I forgot to pull the mean, median and stddev from the row when I copied it. Here is the correct table:

        TestFsck execution time (sec)
        652.04
        628.93
        602.27
        569.37
        465.24
        219.35
        385.75
        486.79
        617.26
        636.08
        586.59
        408.39
        320.59
        588.53
        470.80
        560.22
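The summary figures quoted earlier in the thread can be rechecked from the 16 corrected values. The following is a minimal sketch, not part of the issue or the patch: the class and method names are hypothetical, and the n - 1 (sample) form of the standard deviation is an assumption, since the comment does not say which form was used.

```java
import java.util.Arrays;

public class TestFsckTimingStats {
    // The 16 TestFsck run times in seconds, copied from the corrected table.
    static final double[] RUNS_SEC = {
        652.04, 628.93, 602.27, 569.37, 465.24, 219.35, 385.75, 486.79,
        617.26, 636.08, 586.59, 408.39, 320.59, 588.53, 470.80, 560.22
    };

    // Arithmetic mean of the samples.
    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Median; sorts a copy so the caller's array is left untouched.
    static double median(double[] xs) {
        double[] sorted = xs.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return (n % 2 == 1)
            ? sorted[n / 2]
            : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    // Sample standard deviation (divides by n - 1); an assumed convention.
    static double sampleStdDev(double[] xs) {
        double m = mean(xs);
        double sumSq = 0.0;
        for (double x : xs) {
            sumSq += (x - m) * (x - m);
        }
        return Math.sqrt(sumSq / (xs.length - 1));
    }

    public static void main(String[] args) {
        System.out.printf("n=%d mean=%.2f median=%.2f stddev=%.2f%n",
            RUNS_SEC.length, mean(RUNS_SEC), median(RUNS_SEC), sampleStdDev(RUNS_SEC));
    }
}
```

Under that assumption, the computed values agree with the reported mean, median and stddev to within rounding.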
        Jakob Homan added a comment -

        "Surprisingly, the min is 126.07, which is more than 3 standard deviations below the mean. If we assume a normal distribution, the probability of getting it is < 0.003, but you got it in 1 of 16 runs. What makes it run so fast?"

        I messed up the chart, but the standard deviation for the test is high in general.

        Hairong Kuang made changes -
        Field Original Value New Value
        Assignee Hairong Kuang [ hairong ]
        Hairong Kuang added a comment -

        TestFsck#testFsckMove corrupts a block file on disk and then depends on block reports to notify the NameNode (NN) of the corrupt blocks. However, with the change in HADOOP-4584, block reports are always sent from the in-memory volumeMap. The volumeMap depends on the block scanner to scan the disk and update the in-memory volumeMap if there is a difference. The test needs to make the scanner period shorter.
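The effect of the scanner period described above can be illustrated with a toy model. This is a hedged sketch only: it uses no Hadoop classes, the class name, method name and period values are hypothetical, and it simplifies corruption to the block vanishing from a simulated disk. It is not the content of slowFsck.patch.

```java
import java.util.HashSet;
import java.util.Set;

public class ScannerPeriodSketch {
    static final long BLOCK_ID = 1L; // hypothetical block id

    /**
     * Simulates one datanode second by second and returns how many seconds
     * pass between the on-disk corruption and the first block report that
     * no longer lists the block. Reports are built only from the in-memory
     * map, which changes only when the periodic scanner runs.
     */
    static long secondsUntilReported(long scanPeriodSec, long reportPeriodSec,
                                     long corruptAtSec) {
        Set<Long> onDisk = new HashSet<>();
        Set<Long> volumeMap = new HashSet<>();
        onDisk.add(BLOCK_ID);
        volumeMap.add(BLOCK_ID);

        for (long t = 0; ; t++) {
            // The test damages the block on disk; memory is not touched.
            if (t == corruptAtSec) {
                onDisk.remove(BLOCK_ID);
            }
            // The scanner reconciles the in-memory map with the disk.
            if (t % scanPeriodSec == 0) {
                volumeMap.clear();
                volumeMap.addAll(onDisk);
            }
            // A block report reflects whatever the in-memory map holds.
            boolean reportDue = t % reportPeriodSec == 0;
            if (reportDue && t >= corruptAtSec && !volumeMap.contains(BLOCK_ID)) {
                return t - corruptAtSec;
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical periods: a slow scanner versus a one-second scanner.
        System.out.println(secondsUntilReported(600, 10, 5));
        System.out.println(secondsUntilReported(1, 10, 5));
    }
}
```

In this model, with reports every 10 seconds and corruption at second 5, a 600-second scanner period delays the report by 595 seconds, while a 1-second period cuts the delay to 5 seconds. The delay is bounded by the scanner period rather than by the report interval, which is the behaviour the comment attributes to the test.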

        Hairong Kuang added a comment -

        With this change, TestFsck runs in about 1 minute.

        Hairong Kuang made changes -
        Attachment slowFsck.patch [ 12416801 ]
        Hairong Kuang made changes -
        Project Hadoop Common [ 12310240 ] Hadoop HDFS [ 12310942 ]
        Key HADOOP-6083 HDFS-548
        Component/s test [ 12312916 ]
        Component/s test [ 12311440 ]
        Hairong Kuang made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Affects Version/s 0.21.0 [ 12314046 ]
        Fix Version/s 0.21.0 [ 12314046 ]
        Suresh Srinivas added a comment -

        +1. Thanks Hairong for catching this.

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12416801/slowFsck.patch
        against trunk revision 805203.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/70/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/70/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/70/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/70/console

        This message is automatically generated.

        Hairong Kuang added a comment -

        I just committed this!

        Hairong Kuang made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Hadoop Flags [Reviewed]
        Resolution Fixed [ 1 ]
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #55 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/55/)
        HDFS-548. TestFsck takes nearly 10 minutes to run. Contributed by Hairong Kuang.

        Tom White made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Transition                    Time In Source Status   Execution Times   Last Executer    Last Execution Date
        Open -> Patch Available       59d 20h 13m             1                 Hairong Kuang    17/Aug/09 21:15
        Patch Available -> Resolved   19h 6m                  1                 Hairong Kuang    18/Aug/09 16:22
        Resolved -> Closed            371d 4h 26m             1                 Tom White        24/Aug/10 20:48

          People

          • Assignee:
            Hairong Kuang
            Reporter:
            Jakob Homan
          • Votes:
            0
            Watchers:
            2
